[nb winsys](031024), [jdk1.4.2_02]
Steps to reproduce:
- run IDE
- open some java file
- maximize Documents Window
- restart IDE
-> IDE hangs (see attached ide.log file with full thread-dump)
I cannot reproduce this deadlock on Solaris, but on Win2K it is 100% reproducible.
Created attachment 11986 [details] ide.log file with full thread-dump
[Note: this was found in new winsys build] I think this is not a problem of winsys itself. I can't judge from the dump who is making the mistake; it all seems weird to me. The stack traces show several threads dealing with nodes and folder instances (there seems to be no clean separation between them). Passing to nodes first.
pnejedly: please take care of this issue. Thanks
It works if you remove monitor.jar from modules directory :)
Not a nodes problem. First of all, I'm not sure why the DataObject performs node creation from inside a Children.MUTEX.readAccess, but this piece of code has quite an interesting history (DCL, writeAccess, more locks, readAccess with more locks). Then a JavaNode waits for FolderInstance to finish from inside a Node creation. Not nice, especially while holding a systemwide lock (Children.MUTEX). And finally the MonitorAction constructor, called from a FolderInstance processor thread, creates a lot of interesting stuff, but it also creates some Nodes and sets them up, so they need to acquire writeAccess. I'd blame the JavaNode, but it is hard to decide between it and the MonitorAction. In an ideal world, both should take corrective steps.
Note: The problem probably wouldn't arise under the new Nodes threading, as the MonitorAction would initialize all its Nodes lock-free and add them to the UI (if really needed) from invokeLater.
Note2: The problem may be provoked by the WS change because of some startup optimizations (Monitor's node structure is rooted in the Runtime tab, right?)
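To make the shape of the hang described above concrete, here is a minimal sketch. It is not the NetBeans code: a plain `ReentrantReadWriteLock` stands in for Children.MUTEX, a single-thread executor stands in for the FolderInstance processor, and the class and method names are hypothetical. The caller waits for a background task while (optionally) holding the read lock, and the task needs write access. A timed `tryLock` is used so the demo terminates instead of hanging the way the IDE did.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReadLockWaitDemo {
    // Stand-in for Children.MUTEX (assumption: an ordinary read/write lock).
    static final ReentrantReadWriteLock MUTEX = new ReentrantReadWriteLock();

    /**
     * Submits a task that needs write access and waits for its result.
     * Returns true if the task obtained the write lock.
     */
    static boolean processorGetsWriteAccess(boolean callerHoldsReadLock) {
        // Stand-in for the FolderInstance processor thread.
        ExecutorService processor = Executors.newSingleThreadExecutor();
        if (callerHoldsReadLock) {
            MUTEX.readLock().lock();
        }
        try {
            Future<Boolean> task = processor.submit(() -> {
                // Like the node setup in the report, this needs write access.
                // The timeout only exists so the sketch cannot hang.
                if (!MUTEX.writeLock().tryLock(200, TimeUnit.MILLISECONDS)) {
                    return false;
                }
                MUTEX.writeLock().unlock();
                return true;
            });
            // The caller blocks here, possibly still holding the read lock.
            return task.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            if (callerHoldsReadLock) {
                MUTEX.readLock().unlock();
            }
            processor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("caller holds read lock: " + processorGetsWriteAccess(true));
        System.out.println("caller holds no lock:   " + processorGetsWriteAccess(false));
    }
}
```

With an untimed lock in place of `tryLock`, the first call would never return: the worker waits for the read lock to be released, and the reader waits for the worker.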
Oops, UI is really not the best subcomponent...
*** Issue 37282 has been marked as a duplicate of this issue. ***
I am not able to reproduce it with the same configuration, so the priority is changed to P2. Even though I am not convinced the java module is the culprit, I have prepared a patch postponing the FolderInstance task to JavaNode$JavaSourceChildren.addNotify in order to prevent the starvation. Since it is more of a hack than a solution, I have just attached it here. The right solution seems to me to be not to perform node creation inside a Children.MUTEX.readAccess in DataObject, as Petr N. mentioned above. At least I do not see any reason for the mutex there. Reassigned back to openide for further investigation.
Created attachment 12374 [details] patched JavaNode$JavaSourceChildren
OK, it seems I can legally call node construction without the read lock (with proper locking only). It would solve *this particular* deadlock (and maybe some others), but your code may (and frequently will) still get called under the Children.MUTEX.readLock, because that is how nodes are usually created for all lazy Children (e.g. Children.Keys -> FolderChildren).
No readlock for node creation anymore in openide/loaders/src/org/openide/loaders/DataObject.java, v1.13. This fixes this particular deadlock, but there are still potential deadlocks between JavaNode and the web module.
Looks like a bug in MonitorAction to me, though I don't see anything bad about your patch either.
OK, yarda has finally spoken and explained the presence of the read-lock to me. Usually, when your node is about to be displayed in explorer, FolderChildren (as any other Children.Keys) calls the node creation under the readlock:
readLock->getNodeDelegate()->priv.lock->createNodeDelegate()
But the node may be asked for by direct query:
getNodeDelegate()->priv.lock->createNodeDelegate()
This means that now (after my patch), the node creation code must not try to acquire Children.MUTEX. In the light of this, I'm considering rolling my change back.
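The two call paths above can be checked for lock-order consistency with a small single-threaded simulation. This is only a sketch: the lock names are plain strings taken from the comment, no real locks are taken, and the `LockOrderSketch` class and its methods are hypothetical. It records which lock was held when another was acquired, and reports an inversion when the direct-query path takes Children.MUTEX while already holding priv.lock.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

class LockOrderSketch {
    // "A>B" means lock A was held at the moment lock B was acquired.
    private final Set<String> seenOrder = new HashSet<>();
    private final Deque<String> held = new ArrayDeque<>();

    /** Records the acquisition; returns false if it inverts an order seen earlier. */
    boolean acquire(String lock) {
        boolean consistent = true;
        for (String outer : held) {
            if (seenOrder.contains(lock + ">" + outer)) {
                consistent = false; // the opposite order was recorded before
            }
            seenOrder.add(outer + ">" + lock);
        }
        held.push(lock);
        return consistent;
    }

    /** Releases the most recently acquired lock. */
    void release() {
        held.pop();
    }

    /** Replays both paths; returns true if the direct query keeps the lock order. */
    static boolean directQueryIsConsistent(boolean creationTakesMutex) {
        LockOrderSketch order = new LockOrderSketch();

        // Explorer path: readLock -> priv.lock -> createNodeDelegate()
        order.acquire("Children.MUTEX");
        order.acquire("priv.lock");
        order.release();
        order.release();

        // Direct query: priv.lock -> createNodeDelegate()
        boolean ok = order.acquire("priv.lock");
        if (creationTakesMutex) {
            // Node creation code asking for Children.MUTEX under priv.lock.
            ok &= order.acquire("Children.MUTEX");
            order.release();
        }
        order.release();
        return ok;
    }

    public static void main(String[] args) {
        System.out.println("creation takes MUTEX: " + directQueryIsConsistent(true));
        System.out.println("creation lock-free:   " + directQueryIsConsistent(false));
    }
}
```

The inversion only shows up when node creation itself asks for the mutex, which is why the patched version constrains what createNodeDelegate() may do.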
Sorry for not speaking up sooner, I had to grasp the whole story first. Now I do, and I support the rollback. I think there is little value in solving deadlocks just by changing a random piece of code to lock in a different order or to delay some actions. As this example shows, once upon a time I decided to solve the deadlock in issue 11132 by changing the order of locks in getNodeDelegate. That was fine for release 3.2 and the problem was fixed, but now Petr has decided to revert the order again and we could reopen the issue for 3.6. Deadlocks are so easy: after a while everyone learns how to read thread dumps and change the order by modifying a few lines of code. But however tempting this solution is and however immediately it helps, in the long run it is completely useless. The only valuable solution is a JUnit test that reproduces the deadlock and warns anyone who mangles those few necessary lines of code that fixed it. Fighting deadlocks is so hard. I support the rollback and I'd like to ask for the JUnit test next time. And I admit I did not write one for issue 11132 (but we were all 20000 issues younger then). If you want, reopen that issue to me and I can fix my mistake. Better late than never.
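As an illustration of the kind of regression test asked for above, here is a minimal sketch in plain Java with the JUnit harness left out. Nothing in it is NetBeans code: two `ReentrantLock`s stand in for the real locks, a latch forces the problematic interleaving, and the class and method names are hypothetical. The check passes when both threads finish in time and fails when they lock in opposite orders.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class DeadlockRegressionSketch {

    /** Takes 'first', gives the other thread time to take its first lock, then takes 'second'. */
    static Runnable lockBoth(ReentrantLock first, ReentrantLock second, CountDownLatch holdingFirst) {
        return () -> {
            try {
                first.lockInterruptibly();
                try {
                    holdingFirst.countDown();
                    // Bounded wait: with a consistent order the other thread
                    // cannot get its first lock yet, so this simply times out.
                    holdingFirst.await(300, TimeUnit.MILLISECONDS);
                    second.lockInterruptibly();
                    second.unlock();
                } finally {
                    first.unlock();
                }
            } catch (InterruptedException e) {
                // Interrupted by the harness after a detected hang; just exit.
            }
        };
    }

    /** Runs two threads and reports whether both finished within the time limit. */
    static boolean finishesInTime(boolean sameOrder) {
        ReentrantLock a = new ReentrantLock();
        ReentrantLock b = new ReentrantLock();
        CountDownLatch holdingFirst = new CountDownLatch(2);
        Thread t1 = new Thread(lockBoth(a, b, holdingFirst));
        Thread t2 = new Thread(sameOrder ? lockBoth(a, b, holdingFirst)
                                         : lockBoth(b, a, holdingFirst));
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t1.join(1000);
            t2.join(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        boolean finished = !t1.isAlive() && !t2.isAlive();
        // Unblock any thread still stuck so the locks are released.
        t1.interrupt();
        t2.interrupt();
        return finished;
    }

    public static void main(String[] args) {
        System.out.println("same order finishes:     " + finishesInTime(true));
        System.out.println("opposite order finishes: " + finishesInTime(false));
    }
}
```

In a real JUnit test the result of `finishesInTime` would be asserted, so that anyone who reorders the locks again gets a failing test instead of a thread dump from a user.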
I've reverted the change. Now it's up to the web folks to fix the monitor action.
This bug was filed on October 27, and the code that supposedly causes the problem ("createNodeStructure") is no longer invoked from the MonitorAction at startup. The monitor caused deadlocks (see issue 36749, which appears to be a duplicate) after Jesse Glick modified the use of Nodes (to preempt deadlocks) on September 6. This was rolled back on November 14, after which those deadlocks disappeared (this issue was filed between those dates). In the process I also ensured that the monitor will not create any components until the user starts the UI. The startup issue is definitely gone as a result of this, and since we have not had any reports of deadlocks while the monitor is running, it is safer not to attempt to modify that code for now. It's my intention to switch to Looks as soon as it becomes available.
verified in [nb_dev](200402181900)