This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 191796 - Deadlock caused by double locking in AbstractModel (synchronized methods and transactionSemaphore)
Summary: Deadlock caused by double locking in AbstractModel (synchronized methods and ...
Status: RESOLVED FIXED
Alias: None
Product: xml
Classification: Unclassified
Component: XAM (show other bugs)
Version: 7.0
Hardware: PC Mac OS X
: P1 normal (vote)
Assignee: Antonin Nebuzelsky
URL:
Keywords: THREAD
: 184371 190532 192037 (view as bug list)
Depends on:
Blocks:
 
Reported: 2010-11-10 13:09 UTC by Tomas Danek
Modified: 2013-04-19 09:55 UTC (History)
5 users (show)

See Also:
Issue Type: DEFECT
Exception Reporter:


Attachments
thread dump (16.66 KB, text/plain)
2010-11-10 13:10 UTC, Tomas Danek
Details
modified source file (25.00 KB, text/plain)
2010-11-15 22:30 UTC, Nikita Krjukov
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Tomas Danek 2010-11-10 13:09:43 UTC
Product Version: NetBeans IDE 7.0 Beta (Build 201011082300)
Java: 1.6.0_22; Java HotSpot(TM) 64-Bit Server VM 17.1-b03-307
System: Mac OS X version 10.6.4 running on x86_64; MacRoman; en_US (nb)
Userdir: /Users/tomas/.netbeans/7.0beta
-----------------


do not have exact steps, just edited src level in project properties, pom.xml was opened in editor, swicthed projects view<-->editor, IDE became unresponsive.

Attaching thread dump.
Comment 1 Tomas Danek 2010-11-10 13:10:25 UTC
Created attachment 102883 [details]
thread dump
Comment 2 Tomas Danek 2010-11-10 13:11:10 UTC
Please evaluate for potential beta stopper, thanks.
Comment 3 Antonin Nebuzelsky 2010-11-10 14:35:05 UTC
Deadlock between "StatusProvider" and "AWT-EventQueue-1".

AbstractModel.endTransaction() is waiting to enter its synchronized block and the lock is held by AbstractModel.startTransaction() in the other thread (AWT) which is blocked inside this method in transactionSemaphore.acquireUninterruptibly(). AWT thread will be able to acquire only after the transaction semaphore is released (which would happen in endTransaction if it was not blocked).

Why is there the double locking in AbstractModel?

1. synchronized access to startTransaction() and endTransaction()
2. transactionSemaphore

Reassigning to xml.
Comment 4 Jesse Glick 2010-11-10 19:01:17 UTC
IIRC there is a duplicate of this somewhere in the xml component.
Comment 5 Antonin Nebuzelsky 2010-11-11 15:41:20 UTC
*** Bug 190532 has been marked as a duplicate of this bug. ***
Comment 6 Sergey Lunegov 2010-11-13 09:13:45 UTC
Ok. Looks like but is not constantly reproducible.
However, Nikita please take a look. And yes, it can be related to Maven plugin.
Comment 7 Nikita Krjukov 2010-11-13 16:07:47 UTC
> Why is there the double locking in AbstractModel?
> 
> 1. synchronized access to startTransaction() and endTransaction()
> 2. transactionSemaphore
> 

I've looked the sources and I agree with your explanation. I wonder how it managed to work so far for about 4 years. Maybe because there wasn't many multi-thread usage of XAM model. 

I think it's necessary to get rid of transactionSemaphore by replacing it with wait() & notify() calls. Despite of Semaphore, they are consolidated with synchronized clause. So result should be similar, but it should prevent the deadlock. 

Unfortunately true history of relevant modifications has lost. They have been done about 4 years ago. But I suppose the reason of Semaphore usage is explained partly at the comment at lines 405-408. As I understood, they wanted avoiding reentrant transactions (by the same thread). But now it seems impossible because of the check at the beginning of startTransaction() method. So replacement to wait&notify shouldn't break this feature. 

It seems can be fixed quite quickly. But any change to XAM model is very dangerous because it can affect a lot of other modules. So it requires a lot of time on testing. I'm out of NetBeans development process now (and all my team as well). So I'd like either somebody else does the fix or at least I need get help from somebody about repository, release schedule and so on.
Comment 8 Nikita Krjukov 2010-11-13 17:12:10 UTC
I'm going to prepare changes, which I've mentioned. I suppose they all will be located at single file: AbstractModel.java Then I can test it quickly locally and also can send it (source file) to maven team for testing. I think the issue shouldn't be easy to reproduce like any other problems with threads synchronization. But I hope you'll be able to make any assessment.
Comment 9 Nikita Krjukov 2010-11-15 22:30:27 UTC
Created attachment 102989 [details]
modified source file 

You can find the suggested modifications in the source file. 

I've compiled it (and NetBeans at all), run JUnit tests for several modules and run NetBeans with it. 

I haven't faced any problems except many errors in BPEL Model JUnit tests. But then it turned out that absolutely the same errors happen without my modification. So it's another problem.
Comment 10 Jesse Glick 2010-11-16 17:39:24 UTC
Tonda do you want to test the proposed change?
Comment 11 Antonin Nebuzelsky 2010-11-18 14:55:31 UTC
Thanks for the patch!

It looks correct to me. Integrated. I only changed wait(1000) to wait().

http://hg.netbeans.org/core-main/rev/4f5854f34dd1

This fix is not in 7.0 Beta, to be released soon. It will appear only in daily builds, starting with the next one.
Comment 12 Antonin Nebuzelsky 2010-11-18 14:56:54 UTC
*** Bug 192037 has been marked as a duplicate of this bug. ***
Comment 13 Quality Engineering 2010-11-19 06:16:36 UTC
Integrated into 'main-golden', will be available in build *201011190001* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)
Changeset: http://hg.netbeans.org/main/rev/4f5854f34dd1
User: Antonin Nebuzelsky <anebuzelsky@netbeans.org>
Log: #191796 - Deadlock caused by double locking in AbstractModel (synchronized methods and transactionSemaphore).
Comment 14 Svata Dedic 2013-04-19 09:55:12 UTC
*** Bug 184371 has been marked as a duplicate of this bug. ***