Bug 44097 - AWT thread blocked during startup
Status: RESOLVED FIXED
Product: java
Classification: Unclassified
Component: Unsupported
Version: 4.x
Hardware: PC Windows ME/2000
Priority: P2
Target Milestone: 4.x
Assigned To: issues@java
QA Contact: issues@java
Whiteboard: perfawtthread perfpromod
Keywords: PERFORMANCE, REGRESSION, THREAD
Duplicates: 44089 44556
Depends on:
Blocks: 45449
 
Reported: 2004-06-01 14:20 UTC by Antonin Nebuzelsky
Modified: 2007-09-26 09:14 UTC (History)

See Also:
Issue Type: DEFECT


Attachments
Several thread dumps during the time IDE is frozen while J2SE is scanned (75.01 KB, text/plain)
2004-06-01 14:23 UTC, Antonin Nebuzelsky

Description Antonin Nebuzelsky 2004-06-01 14:20:30 UTC
See the attachment for several thread dumps 
during J2SE parsing which was automatically 
started at a new project creation. The thread 
dumps show that AWT thread is blocked waiting in 
NBMDRepositoryImpl.beginTrans for exclusive mutex 
to read information from MDR.

This is unfortunately an MDR design flaw which 
can show up any time time-consuming MDR 
database modifications are being done. See issue 
42479 for another occurrence.

MDR must be able to provide information without 
blocking readers while a writer is holding a 
write mutex to MDR. The exclusive mutex as 
implemented today is evil.
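
One possible reading of that requirement, as a minimal sketch only: the writer prepares a private copy and publishes it atomically, so readers always see the last complete snapshot and never wait. This is not how MDR works. The class name SnapshotStore and the use of a String map in place of repository data are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Hypothetical illustration; not an MDR class. */
final class SnapshotStore {
    // Readers only ever dereference this; they never take a lock.
    private final AtomicReference<Map<String, String>> published =
            new AtomicReference<>(Collections.emptyMap());
    // Writers are still serialized against each other.
    private final Object writeLock = new Object();

    /** Never blocks: returns the value from the last published snapshot. */
    String read(String key) {
        return published.get().get(key);
    }

    /** May take a long time; readers keep seeing the old snapshot meanwhile. */
    void write(Consumer<Map<String, String>> change) {
        synchronized (writeLock) {
            Map<String, String> draft = new HashMap<>(published.get());
            change.accept(draft);
            published.set(Collections.unmodifiableMap(draft));
        }
    }
}
```

The trade-off is that each write copies the whole map and readers may briefly see stale data, which may or may not be acceptable for a metadata repository.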
Comment 1 Antonin Nebuzelsky 2004-06-01 14:22:25 UTC
I would like to stress that this occurrence of the AWT blocking 
is a regression relative to the previous refactoring builds. This shows how 
evil this is. It really can show up somewhere else again at any time.
Comment 2 Antonin Nebuzelsky 2004-06-01 14:23:21 UTC
Created attachment 15397 [details]
Several thread dumps during the time IDE is frozen while J2SE is scanned
Comment 3 Martin Matula 2004-06-01 14:54:09 UTC
I would not be that fatalistic. It can be called a swing/AWT design flaw
as well as an MDR design flaw. This problem was known from the beginning
and discussed with Jesse and other folks when working on build system
integration. The cure to this is not the mutex you suggest (that would
require a lot of code/work to implement and months to stabilize), but
making sure that modules make calls to MDR in event thread only when
they know what they are doing. Moreover the ExclusiveMutex is now able
to recognize that an AWT thread is waiting for it, it gives it a
higher priority when granting access to the mutex and also enables long
running tasks such as scanning to pause their transaction for a moment
(if the nature of the task allows it), let AWT/swing do its job and
reacquire the lock again. We are using this support when scanning, but
there seems to be a regression in this mechanism which we will look at
and try to fix for the next build.
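
The pause-and-reacquire idea described above can be sketched roughly as follows. This is not the real org.netbeans.modules.javacore.ExclusiveMutex, whose source is not shown in this issue. The class YieldingMutex and all of its method names are hypothetical, and the sketch assumes the long-running holder has entered the mutex exactly once when it yields.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch; NOT the real ExclusiveMutex. */
final class YieldingMutex {
    // Fair lock, so a waiter is not starved by a holder that re-locks at once.
    private final ReentrantLock lock = new ReentrantLock(true);
    // Latency-sensitive threads (e.g. the AWT event thread) that have asked
    // for the lock and not yet obtained it.
    private final AtomicInteger urgentWaiters = new AtomicInteger();

    void enter() { lock.lock(); }

    void leave() { lock.unlock(); }

    /** Entry point for the latency-sensitive thread. */
    void enterUrgent() {
        urgentWaiters.incrementAndGet();
        try {
            lock.lock();
        } finally {
            urgentWaiters.decrementAndGet();
        }
    }

    /**
     * Called by a long-running holder between units of work, at a point where
     * its data is consistent. Returns true if the lock was handed over and
     * then reacquired. Assumes the caller holds the lock exactly once.
     */
    boolean yieldIfUrgentWaiting() {
        if (urgentWaiters.get() == 0) {
            return false;
        }
        lock.unlock();
        // The counter drops only after an urgent thread has obtained the lock.
        while (urgentWaiters.get() > 0) {
            Thread.yield();
        }
        lock.lock(); // blocks until the urgent thread calls leave()
        return true;
    }
}
```

A scanner would call enter() once, call yieldIfUrgentWaiting() after each file, and call leave() when done. Whether a given task can safely give up the lock mid-way is exactly the "if the nature of the task allows it" caveat.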
Comment 4 _ tboudreau 2004-06-01 15:34:45 UTC
"The cure to this is not the mutex you suggest (that would require a lot of code/work to 
implement and months to stabilize), but making sure that modules make calls to MDR in 
event thread only when they know what they are doing."

<rant>
I truly do not get why, when designing things in NetBeans, we *start* from the proposition 
that potentially long running operations must be synchronous - it leads to massively deep 
stack traces (with each additional stack frame an additional opportunity for someone to 
cause a deadlock).  Operating systems have been using message-based 
queues for years to decouple such things with quite a bit of success, yet we treat the 
natural case as being synchronous operation for everything, no matter how complex, and 
invokeLater() as a sort of dirty hack you do when trapped in a corner, rather than using 
event/message queues as designed.  This gets us into trouble like this.

This also goes to our typical approach to threading which is, whatever thread asks for a 
thing first wins (consider the java parser).  That's more like a threading circus than a 
threading model.

My point is that, perhaps MDR queries simply *should not be synchronous, period.*  That 
is, when you want some information, you queue a request, and MDR gets back to you 
when that information is ready.  If somebody *really* needs the information 
synchronously, they can do some equivalent of wait(myQuery), and it's their responsibility 
not to do that when asking for something that will likely take a long time, and their 
responsibility to design their code not to expect that all queries take 0 time.

I think no synchronous solution can ever work 100% unless you can prove a maximum 
duration for queries.

"enables long running tasks such as scanning to pause their transaction for a moment (if 
the nature of the task allows it), let AWT/swing do its job"

This is *almost* what I'm talking about, except that this should be the standard case, not 
the exceptional case, and should not involve any kind of black magic or hacks.  Queueing 
work in logical serial units should be the norm, not the exception.
</rant>

To calm down a bit: what I would expect from anything that can take a potentially infinite 
amount of time is:
 - There is a message queue to which queries are posted
 - There is/are dedicated thread(s) which will do the work
 - There is a notification callback when a query's result is available, with failure, 
cancellation and timeout semantics
 - No code should be designed to assume a query is instantaneous.  Code that must block 
until a result is available is thus *forced* to post some UI to indicate that it is waiting for a 
result, but the AWT event queue does not need to be blocked.
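
The four points above could look roughly like this in plain java.util.concurrent terms. It is a sketch under assumptions, not an MDR or JMI API: QueryQueue and post are hypothetical names, and orTimeout requires Java 9 or later.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch; none of these names exist in MDR or JMI. */
final class QueryQueue implements AutoCloseable {
    // One dedicated worker thread: queries run serially, in posting order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "query-worker");
        t.setDaemon(true);
        return t;
    });

    /**
     * Posts a query and returns at once. The future completes with the result,
     * with the query's exception, or with a TimeoutException after the limit.
     * Note: a timeout or cancel() only completes the future; it does not
     * interrupt a query that is already running on the worker.
     */
    <T> CompletableFuture<T> post(Supplier<T> query, long timeout, TimeUnit unit) {
        return CompletableFuture.supplyAsync(query, worker).orTimeout(timeout, unit);
    }

    @Override
    public void close() {
        worker.shutdownNow();
    }
}
```

A caller would attach the notification callback with whenComplete((result, error) -> ...) on the returned future, and a caller that really must block can use join() or get(), which makes the wait explicit at the call site.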
 
After all that, *if* MDR can determine *for sure* that a query can complete in <threshold> 
time *before* it runs, then as an optimization, it may perform the query synchronously, to 
avoid context switching costs, but that's an optimization you don't do until everything else 
is rock solid.

What scares me is that all this seems pretty obvious and basic, and it sounds like we're not 
terribly close to such a design.  Martin, I hope you can correct me.
Comment 5 _ tboudreau 2004-06-01 15:45:58 UTC
For the particular stack trace in this exception, it is trying to populate the combo box in 
the editor toolbar.  There is no particular reason that either:

 - a. the combo's contents must be exactly accurate before it ever appears on the screen - 
populating it could be invoke-latered

 - b. it needs to know its full contents - it's a combo box.  Unless its popup is open, it 
does not need to know the full contents of its list, it only needs to know the selected item 
it should display, and if there is at least one additional item so the popup button should 
be enabled.  So possibly an optimization to request from Explorer - NodeListModel should 
resolve its contents lazily.  What it's doing now is silly.

        at org.openide.explorer.view.NodeListModel$1.run(NodeListModel.java:86)
        at org.openide.util.Mutex.doEvent(Mutex.java:903)
        at org.openide.util.Mutex.readAccess(Mutex.java:227)
        at org.openide.explorer.view.NodeListModel.setNode(NodeListModel.java:70)
        at org.openide.explorer.view.ChoiceView.updateChoice(ChoiceView.java:151)
        at org.openide.explorer.view.ChoiceView.access$200(ChoiceView.java:29)
        at org.openide.explorer.view.ChoiceView$PropertyIL.propertyChange(ChoiceView.java:195)
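
Points a and b can be combined in a small sketch: show the selected item immediately, then load the remaining items on another thread and add them on the event thread. This uses a plain DefaultComboBoxModel rather than NodeListModel or ChoiceView, and LazyComboFiller is a hypothetical helper, not part of org.openide.explorer.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Supplier;
import javax.swing.DefaultComboBoxModel;

/** Hypothetical helper; not part of org.openide.explorer. */
final class LazyComboFiller {
    /**
     * Puts the selected item into the model at once, then runs slowLoader on
     * the background executor and appends the other items via uiThread.
     * In a Swing application uiThread would be SwingUtilities::invokeLater.
     */
    static CompletableFuture<Void> fill(DefaultComboBoxModel<String> model,
                                        String selected,
                                        Supplier<List<String>> slowLoader,
                                        Executor background,
                                        Executor uiThread) {
        model.removeAllElements();
        model.addElement(selected);
        model.setSelectedItem(selected);
        return CompletableFuture.supplyAsync(slowLoader, background)
                .thenAcceptAsync(items -> {
                    for (String item : items) {
                        if (!item.equals(selected)) {
                            model.addElement(item);
                        }
                    }
                }, uiThread);
    }
}
```

The slow call (here the supplier, in the IDE the MDR query) never runs on the event thread, so the toolbar can paint with a one-item combo while the rest is still loading.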
Comment 6 Jan Becicka 2004-06-01 15:53:49 UTC
*** Issue 44089 has been marked as a duplicate of this issue. ***
Comment 7 Martin Matula 2004-06-01 16:35:57 UTC
Blocking of AWT thread during scanning is now fixed:

Checking in src/org/netbeans/modules/javacore/ExclusiveMutex.java;
/cvs/java/javacore/src/org/netbeans/modules/javacore/Attic/ExclusiveMutex.java,v
 <--  ExclusiveMutex.java
new revision: 1.1.2.28.2.18; previous revision: 1.1.2.28.2.17
done
Checking in src/org/netbeans/modules/javacore/FileScanner.java;
/cvs/java/javacore/src/org/netbeans/modules/javacore/Attic/FileScanner.java,v
 <--  FileScanner.java
new revision: 1.1.2.16.2.12; previous revision: 1.1.2.16.2.11
done
Comment 8 Martin Matula 2004-06-01 16:44:21 UTC
Thanks for your summary, Tim. Moving from synchronous to asynchronous
is not possible for javacore since it is based on JMI (all the APIs
are generated from a model) that does not support asynchronous calls.
So what we are doing currently is transforming the clients of the JMI
API to call it asynchronously (or at least make sure it is not called
in the AWT thread). We may provide utility classes/methods to make this
easier for clients in the future, so that a single request processor
could be used to schedule the calls, etc. But purely moving to this
approach without moving the calls to the source hierarchy out of AWT would
not help anyway, for backward compatibility reasons (since the
old src API is synchronous).
So for now we are trying to fix exactly the problems you found
with populating the editor drop-down, by moving the data-collecting
code to a different thread.
Comment 9 _ tboudreau 2004-06-01 18:31:42 UTC
"Moving from synchronous to ansynchronous is not possible for javacore since it is based 
on JMI (all the APIs are generated from a model) that does not support asynchronous 
calls."

This suggests to me that either we are stretching JMI beyond its design limitations, or it is 
simply not well enough designed to actually solve the problem it's supposed to solve.

The utility/helper approach sounds like a very, very good idea - no reason that couldn't be 
wrappered on top of JMI (preferably along with deprecating any other avenues of access to 
metadata).  

Let java/srcmodel be blocking, and deprecated - deadlocks and hangs are good 
encouragement to stop using deprecated calls.
Comment 10 Antonin Nebuzelsky 2004-06-09 18:43:49 UTC
Verified fixed in trunk.
Comment 11 _ tboudreau 2004-06-14 17:23:14 UTC
Just did a cvs update and a clean build, and got a long delay after the first paint of the 
main window, when opening with a userdir which had openide open as a project, and a 
few files open in the editor.  Stack dump looks like exactly the same thing going on as 
before:


"AWT-EventQueue-1" prio=5 tid=0x00587920 nid=0x1ebe200 in Object.wait() 
[f1140000..f1142b20]
        at java.lang.Object.wait(Native Method)
        at org.netbeans.modules.javacore.ExclusiveMutex.enter(ExclusiveMutex.java:83)
        - locked <0x6283ab58> (a org.netbeans.modules.javacore.ExclusiveMutex)
        at org.netbeans.mdr.NBMDRepositoryImpl.beginTrans(NBMDRepositoryImpl.java:218)
        at org.netbeans.modules.java.bridge.MemberElementImpl.getName(MemberElementImpl.java:132)
        at org.openide.src.MemberElement.getName(MemberElement.java:81)
        at org.netbeans.modules.beans.PatternChildren$Listener.<init>(PatternChildren.java:234)
        at org.netbeans.modules.beans.PatternChildren$Listener.<init>(PatternChildren.java:228)
        at org.netbeans.modules.beans.PatternChildren.<init>(PatternChildren.java:39)
        at org.netbeans.modules.beans.PatternChildren.<init>(PatternChildren.java:86)
        at org.netbeans.modules.beans.PatternsBrowserFactory.createClassNode(PatternsBrowserFactory.java:75)
        at org.netbeans.modules.java.ui.nodes.ExFilterFactory.createClassNode(ExFilterFactory.java:64)
        at org.openide.src.nodes.FilterFactory.createClassNode(FilterFactory.java:75)
        at org.netbeans.modules.javadoc.comments.JavaDocPropertySupportFactory.createClassNode(JavaDocPropertySupportFactory.java:57)
        at org.netbeans.modules.java.ui.nodes.ExFilterFactory.createClassNode(ExFilterFactory.java:64)
        at org.openide.src.nodes.FilterFactory.createClassNode(FilterFactory.java:75)
        at org.netbeans.modules.refactoring.ui.RefactoringFilterFactory.createClassNode(RefactoringFilterFactory.java:42)
        at org.netbeans.modules.java.ui.nodes.ExFilterFactory.createClassNode(ExFilterFactory.java:64)
        at org.openide.src.nodes.SourceChildren.createNodes(SourceChildren.java:169)
        at org.openide.nodes.Children$Keys$KE.nodes(Children.java:1986)
        at org.openide.nodes.ChildrenArray.nodesFor(ChildrenArray.java:112)
        at org.openide.nodes.Children$Info.nodes(Children.java:1082)
        at org.openide.nodes.Children.justComputeNodes(Children.java:588)
        at org.openide.nodes.ChildrenArray.nodes(ChildrenArray.java:54)
        at org.openide.nodes.Children.getNodes(Children.java:324)
        at org.openide.nodes.FilterNode$ChildrenAdapter.run(FilterNode.java:1297)
        at org.openide.nodes.FilterNode$Children.updateKeys(FilterNode.java:1253)
        at org.openide.nodes.FilterNode$Children.addNotifyImpl(FilterNode.java:1150)
        at org.openide.nodes.FilterNode$Children.addNotify(FilterNode.java:1142)
        at org.openide.nodes.Children.callAddNotify(Children.java:419)
        at org.openide.nodes.Children.getArray(Children.java:462)
        at org.openide.nodes.Children.getNodes(Children.java:315)
        at org.openide.explorer.view.VisualizerNode.getChildren(VisualizerNode.java:179)
        at org.openide.explorer.view.NodeListModel.findSize(NodeListModel.java:198)
        at org.openide.explorer.view.NodeListModel.getSize(NodeListModel.java:133)
        at org.openide.explorer.view.NodeListModel.addAll(NodeListModel.java:274)
        at org.openide.explorer.view.NodeListModel$1.run(NodeListModel.java:86)
        at org.openide.util.Mutex.doEvent(Mutex.java:903)
        at org.openide.util.Mutex.readAccess(Mutex.java:227)
        at org.openide.explorer.view.NodeListModel.setNode(NodeListModel.java:70)
        at org.openide.explorer.view.ChoiceView.updateChoice(ChoiceView.java:151)
        at org.openide.explorer.view.ChoiceView.access$200(ChoiceView.java:29)
        at org.openide.explorer.view.ChoiceView$PropertyIL.propertyChange(ChoiceView.java:195)
        at java.beans.PropertyChangeSupport.firePropertyChange(PropertyChangeSupport.java:252)
        at org.openide.explorer.ExplorerManager.setExploredContext(ExplorerManager.java:284)
        at org.openide.explorer.ExplorerManager.setRootContext(ExplorerManager.java:411)
        at org.netbeans.modules.java.ui.NavigationView.changeRootContext(NavigationView.java:289)
        at org.netbeans.modules.java.ui.NavigationView.activatedNodeChanged(NavigationView.java:249)
        at org.netbeans.modules.java.ui.NavigationView.addNotify(NavigationView.java:166)
        at java.awt.Container.addNotify(Container.java:2049)
        - locked <0x6227d858> (a java.awt.Component$AWTTreeLock)
        at javax.swing.JComponent.addNotify(JComponent.java:4288)
        at org.netbeans.modules.java.ui.actions.NavigateAction$ToolbarPresenter.addNotify(NavigateAction.java:103)
        at java.awt.Container.addImpl(Container.java:658)
        - locked <0x6227d858> (a java.awt.Component$AWTTreeLock)
        at javax.swing.JToolBar.addImpl(JToolBar.java:583)
        at java.awt.Container.add(Container.java:307)
        at org.netbeans.modules.editor.NbEditorToolBar.addPresenters(NbEditorToolBar.java:373)
        at org.netbeans.modules.editor.NbEditorToolBar.checkPresentersAdded(NbEditorToolBar.java:230)
        at org.netbeans.modules.editor.NbEditorToolBar.access$200(NbEditorToolBar.java:79)
        at org.netbeans.modules.editor.NbEditorToolBar$5.run(NbEditorToolBar.java:218)
        at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:178)
        at java.awt.EventQueue.dispatchEvent(EventQueue.java:477)
        at java.awt.EventDispatchThread.pumpOneEventForHierarchy(EventDispatchThread.java:234)
        at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:184)
        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:178)
        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:170)
        at java.awt.EventDispatchThread.run(EventDispatchThread.java:100)
Comment 12 Martin Matula 2004-06-15 23:55:29 UTC
P1->P2
Comment 13 Martin Matula 2004-06-17 13:11:56 UTC
*** Issue 44556 has been marked as a duplicate of this issue. ***
Comment 14 Tomas Hurka 2004-06-22 08:41:14 UTC
Moved to new subcomponent java/javacore.
Comment 15 Martin Matula 2004-07-13 15:52:12 UTC
Should not happen anymore - should be fixed by the fix for issue 45077 (we
changed the way the files are scanned). There can still be a delay
if you delete the storage files (or the storage files are corrupted)
and you start the IDE with some files open in the editor. But this
should be a very rare case and it should not take too long.
Comment 16 Quality Engineering 2007-09-20 11:52:03 UTC
Reorganization of java component

