This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 74204 - Deadlock while creating appclient
Summary: Deadlock while creating appclient
Status: VERIFIED FIXED
Alias: None
Product: projects
Classification: Unclassified
Component: Ant Project (show other bugs)
Version: 5.x
Hardware: All All
: P1 blocker (vote)
Assignee: Jan Lahoda
URL:
Keywords: RANDOM, THREAD
: 75164 (view as bug list)
Depends on:
Blocks: 75635
  Show dependency tree
 
Reported: 2006-03-30 16:32 UTC by Lukas Jungmann
Modified: 2006-08-08 13:03 UTC (History)
8 users (show)

See Also:
Issue Type: DEFECT
Exception Reporter:


Attachments
thread dump (34.04 KB, text/plain)
2006-03-30 16:33 UTC, Lukas Jungmann
Details
simplified-commented-threaddump.txt (22.34 KB, text/plain)
2006-04-05 14:26 UTC, Martin Krauskopf
Details
simplified-commented-threaddump.txt (previsous one was not simplified, just adjusted) (11.22 KB, text/plain)
2006-04-05 14:27 UTC, Martin Krauskopf
Details
Proposed Patch (3.76 KB, patch)
2006-05-02 13:49 UTC, Jan Becicka
Details | Diff
Improved patch. Thanks to Martin K. (3.79 KB, patch)
2006-05-02 15:31 UTC, Jan Becicka
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Lukas Jungmann 2006-03-30 16:32:34 UTC
-create new ear w/ ejb module and appclient
-press finish in the new ear wizard

=> deadlock
Comment 1 Lukas Jungmann 2006-03-30 16:33:43 UTC
Created attachment 29506 [details]
thread dump
Comment 2 Martin Krauskopf 2006-04-05 14:24:49 UTC
It's a deadlock between MergedClassPathImplementation from javacore and
ProjectManager.mutex. Not sure on which side it should be fixed yet. The first
thread is spawned by AppClientProjectGenerator(* - see below) and the second one
by NNListenerManager. See the simplified attached thread dump and search for XXX
comments which shows possibly problematic places.
(*)It can happen also for other project types - possibly also outside of j2ee;
since the code for all project types is the same(**). So it seems that it is
rather problem of j2ee/metadata - NNListenerManager; or deeper (e.g. projects or
javacore).
I think we can get out of the flow of the second thread (ProjectGenerator) but
maybe that NNListenerManager could from its thread with the fork to
ProjectManager.mutex.readAccess -> and thus wait for all pending write-access
calls. But this could wrong from the NNListenerManager vs. JMManager semantics
points of view....
So (temporarily?) reassinging to Martin who seems to be the author of the code
in the NNListenerManager (cvs ann). Please continue in evaluation from the
NNListenerManager's point of view so we can move further.
Comment 3 Martin Krauskopf 2006-04-05 14:26:01 UTC
Created attachment 29628 [details]
simplified-commented-threaddump.txt
Comment 4 Martin Krauskopf 2006-04-05 14:27:51 UTC
Created attachment 29629 [details]
simplified-commented-threaddump.txt (previsous one was not simplified, just adjusted)
Comment 5 Martin Adamek 2006-04-26 08:38:16 UTC
Reassigning to javacore for evaluation.
Comment 6 Martin Krauskopf 2006-04-28 17:24:10 UTC
Also see the dependent issue if you want another threaddump. I think the
dependent issue is the third deadlock between ProjectManager.MUTEX and
MergedClassPathImplementation I saw.

Note that you are calling foreign code under your own lock. But maybe it is
rightful. Please evalutate, so we can move on. Thanks.
Comment 7 Martin Adamek 2006-04-28 17:35:20 UTC
I guess NNListenerManager (now most of the code moved to NNMDRListener) is doing
nothing wrong, but let me investigate more.
Comment 8 Martin Krauskopf 2006-05-02 09:44:15 UTC
NNListenerManager is just one case. Other deadlocks do not contain NNLM in any
stacktrace in a threaddump.
Comment 9 Jan Becicka 2006-05-02 13:49:02 UTC
Created attachment 30164 [details]
Proposed Patch
Comment 10 Jan Becicka 2006-05-02 13:54:53 UTC
Please test attached patch. The deadlock is pretty complicated. There are too
many strange things there. Events are fired under locks, save() is called during
project creation etc...
Comment 11 Martin Krauskopf 2006-05-02 14:23:23 UTC
Mariane could you try the patch. You said that you have 100% reproducible case
(issue 75635). Let me know through the email if you need the built nbm. Thanks.
Comment 12 Martin Krauskopf 2006-05-02 15:01:07 UTC
Ummm, just guessing... Although the call addClassPathResources(cp) in
addClassPath() method escapes the MergedClassPathImplementation lock, it will
finally call the line in the addClassPathResources() method:

entries = cp.entries();

under MCPI lock which seems to be the problem of issue 75635 (see the deadlock
there) since it will try to acquire ProjectManager.MUTEX read access.

Maybe it should be switched with the line below? I.e.

  synchronized(this) {
    ..
    entries = cp.entries();
  }

would be replaced with

  synchronized(this) {
    ..
  }
  entries = cp.entries();

or not? :)

Comment 13 Jan Becicka 2006-05-02 15:31:47 UTC
Created attachment 30168 [details]
Improved patch. Thanks to Martin K.
Comment 14 Martin Krauskopf 2006-05-02 15:39:35 UTC
Then you do not need the local variable at all for entries, just
cp.entries().iterator(). Nitpicking :)
I'll send to Marian a new patch.
Comment 15 Martin Krauskopf 2006-05-02 16:11:51 UTC
Marian is still able to reproduce. See:

http://www.netbeans.org/nonav/issues/showattachment.cgi/30173/exc.txt

which is threaddump with the last patch applied. Problem starts with line 262 in
patched version.
Comment 16 Jan Becicka 2006-05-02 16:58:49 UTC
The real problem, from my point of view is, that project infrastructure fires
events while holding Mutex.
Maybe we can find another workaround for this deadlock, but I'm not sure. It
looks like the real problem probably lies in fact, that it is not a good idea to
fire events while holding some lock.. Any help appreciated...
Comment 17 Petr Jiricka 2006-05-02 17:59:39 UTC
We agreed this is not a showstopper for NB 5.5 beta, but still very important.
Thanks.
Comment 18 Jaroslav Tulach 2006-05-03 13:26:11 UTC
I've seen the deadlock once and even wrote a test for it with Hrebejk and 
Tomas Z. I agree that the real problem is firing changes while holding the 
project's mutex. The fix is simple: Just before entering Mutex.xyzAccess, do 
FileSystem.runAtomicAction, that will prevent the events to be delivered. Imho 
this is the change that shall be done. Of course with having a test for the 
deadlock...
Comment 19 Jan Becicka 2006-05-04 14:27:57 UTC
In this case the fix should be done in projects...
Comment 20 Jan Becicka 2006-05-04 14:55:49 UTC
*** Issue 75164 has been marked as a duplicate of this issue. ***
Comment 21 Jan Lahoda 2006-05-04 15:50:28 UTC
Jarda, do you have the test? I am afraid that runAtomicAction may not help - the
event that caused the deadlock in issue #75635 was not a file event.
Comment 22 Jaroslav Tulach 2006-05-04 16:44:34 UTC
I am not sure if it is really the same, but test for similar problem - e.g. 
fileChangeEvent under writeAccess has been written for issue 50259.

Re. 75635: It is a different deadlock, imho. That is why it needs different 
test ;-)
Comment 23 Jan Lahoda 2006-05-04 17:05:26 UTC
This issue and issue #75635 are different: in this one the problem is caused by
writing into project properties, the second one by write into project.xml. So
fix of one of them will not (probably) fix the other.
Comment 24 Jan Lahoda 2006-05-04 18:04:54 UTC
Re the runAtomicAction: I do not like it very much, as the filesystem events are
not a cause of the problem, they are only intermediate. I could simply write a
piece of code that would deadlock without any filesystem events. So I do not
think this would be a proper fix.

I think it would be more appropriate if the listener
("AppClientProjectClassPathExtender$4.run") would be rewritten not save project
synchronously. I will check with jungi and madamek if this is possible.
Comment 25 Martin Krauskopf 2006-05-04 18:15:52 UTC
Also be aware of web and ejb modules (copy-pasted code). Thus:

$NB_CVS_55/j2ee/clientproject/src/org/netbeans/modules/j2ee/clientproject/classpath/AppClientProjectClassPathExtender.java
$NB_CVS_55/j2ee/ejbjarproject/src/org/netbeans/modules/j2ee/ejbjarproject/classpath/EjbJarProjectClassPathExtender.java
$NB_CVS_55/web/project/src/org/netbeans/modules/web/project/classpath/WebProjectClassPathExtender.java
Comment 26 Jaroslav Tulach 2006-05-04 19:19:31 UTC
Ok, you will write the test (e.g. piece of code) that deadlocks without fs 
events. I'll wait for your fix and then I am going write one that does 
deadlock with due to fs events.
Comment 27 Jan Lahoda 2006-05-04 19:57:46 UTC
Aha, I probably understand what you mean: you propose to wrap the writeAccess in
the ProjectManager.saveProject with runAtomicAction. Yes this makes sense to me,
but would not solve this problem - the fs events would be fired at the end of
saveProject method, and this is still under the writeAccess (acquired in
org.netbeans.spi.project.support.ant.AntProjectHelper.putProperties). To solve
this (particular) deadlock fully with runAtomicAction, one would have to wrap
the writeAccess in putProperties into an atomic action, and this does not make
much sense to me.
Comment 28 Tomas Zezula 2006-05-04 20:36:59 UTC
I doubt that the [J2EE|Web]*ClassPathExtender can be rewritten, it is called
under the Mutex.readAccess, so it cannot perform the Mutex.writeAccess it has to
post the WriteAccess. There are two possible solutions, either not to throw
under the read access or try to minimize the synchronization in the
MergedClassPath. 
Comment 29 Jan Lahoda 2006-05-04 21:17:42 UTC
Aha, I probably understand what you mean: you propose to wrap the writeAccess in
the ProjectManager.saveProject with runAtomicAction. Yes this makes sense to me,
but would not solve this problem - the fs events would be fired at the end of
saveProject method, and this is still under the writeAccess (acquired in
org.netbeans.spi.project.support.ant.AntProjectHelper.putProperties). To solve
this (particular) deadlock fully with runAtomicAction, one would have to wrap
the writeAccess in putProperties into an atomic action, and this does not make
much sense to me.
Comment 30 Jan Lahoda 2006-05-04 21:36:44 UTC
(Sorry for the double-post, some technical problems).

Regarding the [J2EE|Web|AppClient]ClassPathExtender, I do not think it should be
rewritten, but simply not calling PM.saveProject synchronously (ie. rescheduling
it into another thread) would (I hope) solve this problem (with atomic action in
saveProject itself, maybe). Or maybe whole storeLibLocations method can be
rescheduled into another thread? Do you think this is impossible?

Re firing under read access: I do not think it is possible to change it now: it
would be huge semantically incompatible change.
Comment 31 Tomas Zezula 2006-05-05 06:43:37 UTC
Yes, I think that it may be fine to call storeLibraries asynchronously, it is
important to take the libraries to be saved synchronously to avoid race
conditions. I also think that changing the firing is not good idea at least for
now, I've only mentioned it as one possible solution among others.
Comment 32 Jan Lahoda 2006-05-15 07:29:16 UTC
I would like to propose the following solution:
1. For 5.5, post PM.saveProject in [J2EE|Web|AppClient]ClassPathExtender into
another thread. (I think this will not cause any race conditions: if the project
metadata are accessed through AntProjectHelp, they will be always consistent,
and if accessed through files on disk, they will also be consistent.)
2. For trunk, do 1. + wrap content of ProjectManager.saveProject with
runAtomicAction (+test)

Any objections or comments?
Comment 33 Martin Krauskopf 2006-05-15 08:37:45 UTC
> 1. For 5.5, post PM.saveProject in [J2EE|Web|AppClient]ClassPathExtender...

Seems like the ad-hoc workaround for this issue. It will not solve issue 75635
(at least it seems so from that threaddump) and maybe others yet-unrevealed
similar issues. But probably we could workaround them in the similar way if
there is not better solution and use the robust one for trunk as you supposed.
Comment 34 Jan Lahoda 2006-05-16 10:14:35 UTC
>Seems like the ad-hoc workaround for this issue. It will not solve issue 75635
>(at least it seems so from that threaddump) and maybe others yet-unrevealed
>similar issues. But probably we could workaround them in the similar way if
>there is not better solution and use the robust one for trunk as you supposed.

Actually, I do not know about any solution (at least not on the projects side)
which would solve both this issue and issue #75635. Wrapping content of
PM.saveProject into runAtomicAction does not solve this deadlock - though it may
prevent slightly different deadlocks in the future.
Comment 35 Martin Krauskopf 2006-05-16 11:30:46 UTC
After our offline talk, go ahead "probably" ;) Since the postponement, we should
write that test we talked about. Feel free to reassign since the proposed 5.5
solution needs to be done actually on the J2EE side.
Comment 36 Antonin Nebuzelsky 2006-06-06 12:21:29 UTC
Honzo, can you proceed with the 5.5 solution ASAP?
Comment 37 Jan Lahoda 2006-06-06 13:40:45 UTC
I committed the solution for NB5.5 as described above:
Checking in
j2ee/clientproject/src/org/netbeans/modules/j2ee/clientproject/classpath/AppClientProjectClassPathExtender.java;
/cvs/j2ee/clientproject/src/org/netbeans/modules/j2ee/clientproject/classpath/Attic/AppClientProjectClassPathExtender.java,v
 <--  AppClientProjectClassPathExtender.java
new revision: 1.1.4.4; previous revision: 1.1.4.3
done
RCS file:
/cvs/j2ee/clientproject/test/unit/src/org/netbeans/modules/j2ee/clientproject/classpath/Attic/AppClientProjectClassPathExtenderTest.java,v
done
Checking in
j2ee/clientproject/test/unit/src/org/netbeans/modules/j2ee/clientproject/classpath/AppClientProjectClassPathExtenderTest.java;
/cvs/j2ee/clientproject/test/unit/src/org/netbeans/modules/j2ee/clientproject/classpath/Attic/AppClientProjectClassPathExtenderTest.java,v
 <--  AppClientProjectClassPathExtenderTest.java
new revision: 1.1.2.1; previous revision: 1.1
done
Checking in
j2ee/ejbjarproject/src/org/netbeans/modules/j2ee/ejbjarproject/classpath/EjbJarProjectClassPathExtender.java;
/cvs/j2ee/ejbjarproject/src/org/netbeans/modules/j2ee/ejbjarproject/classpath/EjbJarProjectClassPathExtender.java,v
 <--  EjbJarProjectClassPathExtender.java
new revision: 1.6.36.2.2.1; previous revision: 1.6.36.2
done
Checking in
web/project/src/org/netbeans/modules/web/project/classpath/WebProjectClassPathExtender.java;
/cvs/web/project/src/org/netbeans/modules/web/project/classpath/WebProjectClassPathExtender.java,v
 <--  WebProjectClassPathExtender.java
new revision: 1.10.32.3.2.1; previous revision: 1.10.32.3
done
Comment 38 Tomas Zezula 2006-06-09 11:57:43 UTC
The Martin's improved patch is fine and will work fine.
Comment 39 Lukas Jungmann 2006-08-08 13:03:02 UTC
Thank you guys.