This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 157872

Summary: Excessive locking of DataObject.getNodeDelegate
Product: platform Reporter: Jesse Glick <jglick>
Component: Data SystemsAssignee: Jiri Skrivanek <jskrivanek>
Status: RESOLVED WONTFIX    
Severity: blocker CC: jtulach, t_h
Priority: P3 Keywords: THREAD
Version: 6.x   
Hardware: All   
OS: All   
Issue Type: DEFECT Exception Reporter:
Bug Depends on:    
Bug Blocks: 36855    
Attachments: Suggested patch
This might work

Description Jesse Glick 2009-02-04 00:40:22 UTC
This method acquires Children.MUTEX for no inherent reason. This is bad because just creating a typical DataNode would
not otherwise need to lock that mutex.

Specifically, try running NetBeans Project Metadata Inspector in recent builds. It fails:

Recommended templates:
  Templates/Classes/Class.java ("Java Class")
  Templates/Classes/Package ("Java Package")
  Templates/Classes/Interface.java ("Java Interface")
  Templates/JUnit/SimpleJUnitTest.java ("Test for Existing Class")
  Templates/NetBeansModuleDevelopment/newAction ("Action")
  Templates/NetBeansModuleDevelopment/emptyLibraryDescriptorjava.lang.IllegalStateException: Should not acquire
Children.MUTEX while holding ProjectManager.mutex()
        at org.openide.nodes.Children$ProjectManagerDeadlockDetector.execute(Children.java:1799)
        at org.openide.util.Mutex.doWrapperAccess(Mutex.java:1320)
        at org.openide.util.Mutex.readAccess(Mutex.java:351)
        at org.openide.loaders.DataObject.getNodeDelegateImpl(DataObject.java:281)
        at org.openide.loaders.DataObject.getNodeDelegate(DataObject.java:273)
        at org.netbeans.modules.apisupport.projectinspector.InspectProjectAction.dump(InspectProjectAction.java:373)
        at org.netbeans.modules.apisupport.projectinspector.InspectProjectAction.access$100(InspectProjectAction.java:95)
        at org.netbeans.modules.apisupport.projectinspector.InspectProjectAction$Impl$1$1.run(InspectProjectAction.java:137)
        at org.openide.util.Mutex.readAccess(Mutex.java:362)
        at org.netbeans.modules.apisupport.projectinspector.InspectProjectAction$Impl$1.run(InspectProjectAction.java:122)
        at org.openide.util.RequestProcessor$Task.run(RequestProcessor.java:573)
        at org.openide.util.RequestProcessor$Processor.run(RequestProcessor.java:1005)

This is because the code is calling getNodeDelegate (on a template DataObject) to get its display name, while holding
PM.mutex. AFAIK this is safe and should not throw an error.
Comment 1 Jesse Glick 2009-02-04 00:43:09 UTC
I tried to write a test case to reproduce jtulach's hypothesized deadlock in issue #36855 which was used to justify the
current behavior. It seems Log.controlFlow can be used to simulate a particular deadlock, but what I am interested in is
rather demonstrating that some calls do _not_ deadlock, and I wasn't sure how to do that. Nor was I sure what the
various output to the "diagnostic" logger meant when the test did deadlock. What I had so far:

final Logger trace = Logger.getLogger(getName());
class MDO extends MultiDataObject {
    MDO(FileObject f, MultiFileLoader loader) throws IOException {
        super(f, loader);
    }
    protected @Override Node createNodeDelegate() {
        trace.info("entering createNodeDelegate");
        Node n;
        if (getPrimaryFile().hasExt("test2")) {
            n = Children.MUTEX.writeAccess(new Mutex.Action<Node>() {
                public Node run() {
                    trace.info("inside mutex for createNodeDelegate");
                    Node n = new DataNode(MDO.this, Children.LEAF);
                    trace.info("finished mutex for createNodeDelegate");
                    return n;
                }
            });
        } else {
            n = new DataNode(MDO.this, Children.LEAF);
        }
        trace.info("exiting createNodeDelegate");
        return n;
    }
}
class TestLoader extends UniFileLoader {
    TestLoader() {
        super(DefaultDataObject.class.getName());
        getExtensions().addExtension("test1");
        getExtensions().addExtension("test2");
    }
    protected @Override MultiDataObject createMultiObject(FileObject f) throws IOException {
        return new MDO(f, this);
    }
}
MockLookup.setInstances(new DataLoaderPool() {
    protected @Override Enumeration<? extends DataLoader> loaders() {
        return Enumerations.singleton(new TestLoader());
    }
});
FileObject f = FileUtil.createMemoryFileSystem().getRoot().createData("f.test2");
final DataObject d = DataObject.find(f);
assertEquals(MDO.class, d.getClass());
Log.controlFlow(trace, Logger.getLogger(DataObjectTest.class.getName()),
        "THREAD:direct MSG:looking for node directly " +
        "THREAD:direct MSG:entering createNodeDelegate " +
        "THREAD:children MSG:looking for node outside mutex " +
        "THREAD:children MSG:looking for node inside mutex " +
        "THREAD:direct MSG:inside mutex for createNodeDelegate " +
        "THREAD:direct MSG:finished mutex for createNodeDelegate " +
        "THREAD:direct MSG:exiting createNodeDelegate " +
        "THREAD:direct MSG:got node directly " +
        "THREAD:children MSG:got node inside mutex " +
        "THREAD:children MSG:got node outside mutex",
        0);
Thread direct = new Thread("direct") {
    public @Override void run() {
        trace.info("looking for node directly");
        d.getNodeDelegate();
        trace.info("got node directly");
    }
};
direct.start();
Thread children = new Thread("children") {
    public @Override void run() {
        trace.info("looking for node outside mutex");
        Children.MUTEX.writeAccess(new Mutex.Action<Node>() {
            public Node run() {
                trace.info("looking for node inside mutex");
                Node n = d.getNodeDelegate();
                trace.info("got node inside mutex");
                return n;
            }
        });
        trace.info("got node outside mutex");
    }
};
children.start();
direct.join();
children.join();
Comment 2 Jesse Glick 2009-02-04 00:46:19 UTC
The change I propose is to not hold _any_ locks while calling createNodeDelegate. In the unlikely event that two threads
try to call getNodeDelegate for the first time at once, they will both create nodes, and one will be chosen. A global
lock is used just to protect actual field accesses and create a memory barrier.
Comment 3 Jesse Glick 2009-02-04 00:46:53 UTC
Created attachment 76524 [details]
Suggested patch
Comment 4 Jesse Glick 2009-02-04 00:48:36 UTC
The patch does make NBPMI work again, by the way.
Comment 5 Jiri Skrivanek 2009-02-04 13:15:09 UTC
Your patch looks good to me. The test cannot deadlock at all in my opinion. There is no callback from
createNodeDelegate, so it will not stuck. Jardo, could you comment what exactly to test?
Comment 6 Jaroslav Tulach 2009-02-05 08:17:42 UTC
1. I'd like to object to your term "jtulach's hypothesized deadlock". It is not mine. And as far as I can tell it is 
not "hypothesized" - both issues, issue 36855 and issue 11132 represent deadlocks that really happened.

2. This is not real fix of your problem. Change your DataNode to use new Children.Array() instead of Children.LEAF and 
I am sure that the "java.lang.IllegalStateException: Should not acquire Children.MUTEX while holding 
ProjectManager.mutex()" will be back.

3. I'd prefer to fix InspectProjectAction (something new, not known yet?), and either:
a) use FileSystem.Status
b) don't hold project mutex while dealing with (data object's) Nodes

4. If you decide to insinuate your patch, you shall either change it to provide compatiblity (by extending the API and 
letting DataObject's or DataLoader decide which behaviour they want) or mark it as semantically incompatible. Code 
which relies on 1:1 data object/node mapping will be randomly negatively impacted. 

Please reconsider #2 before doing #4. This is not fix of your problem.
Comment 7 Jesse Glick 2009-02-11 02:15:04 UTC
To #1 - OK. There would be no deadlock with the suggested patch either, because DO.gND would not be calling external
code with a lock held.

To #2 - the patch _does_ fix my problem. I did not try a variety of other nodes, just the ones actually used. Why
creating certain kinds of Node (not even necessarily part of any hierarchy) would need to acquire a global lock, I don't
know, but that's part of the mystery of openide.nodes threading.

To #4 - the patch does ensure that only one Node is associated with the DataObject. It does not ensure that only one
Node will ever be constructed, though this seems very unlikely to be relied on by anyone, since nodes are views over data.

#3b is possible, apparently easier than fixing openide.loaders to not acquire unnecessary locks.
Comment 8 Jaroslav Tulach 2009-02-11 10:10:27 UTC
Tomáši, can you think about changing new AbstractNode(new Children.Array()) to not acquire any global lock? That would 
solve mine #2 complaint. As far as I know the code in "assignTo" was originally optimized from migrating a node from 
one Children to another Children which is minority (<1%%) usecase these days probably. Rewriting that would make life 
for majority of users simpler.
Comment 9 t_h 2009-02-11 10:43:27 UTC
Created attachment 76839 [details]
This might work
Comment 10 Jesse Glick 2009-02-11 15:28:15 UTC
The Children patch sounds like it could be useful independently of this issue.
Comment 11 t_h 2009-02-13 15:28:04 UTC
ok, it is in b989b198adfe