38013 – Deadlock on startup

Status:	VERIFIED FIXED

Alias:	None

Product:	platform
Classification:	Unclassified
Component:	Text
Version:	3.x
Hardware:	PC Windows XP

Importance:	P2 blocker
Assignee:	Petr Nejedly

URL:
Keywords:	RANDOM, THREAD

Depends on:
Blocks:

Reported:	2003-12-11 11:18 UTC by _ tboudreau
Modified:	2008-12-22 19:45 UTC
CC List:	1 user

See Also:
Issue Type:	DEFECT
Exception Reporter:

Attachments
Thread dump (14.63 KB, text/plain) 2003-12-11 11:19 UTC, _ tboudreau
Possible lock order fix (PR.M first) (847 bytes, patch) 2004-01-20 17:17 UTC, Petr Nejedly

Description _ tboudreau 2003-12-11 11:18:48 UTC

Seems to be random - doesn't happen all the time.
I produced this state starting NetBeans normally
on a 2 processor machine, WinXP.

Note that in the attached thread dump, there are three
threads all doing something with an
EditorDocument.  The EQ thread and the Java parsing
thread are both waiting to lock the editor document.

Comment 1 _ tboudreau 2003-12-11 11:19:12 UTC

Created attachment 12524
Thread dump

Comment 2 Dusan Balek 2004-01-14 10:21:42 UTC

After consultation reassigning Petr.

Comment 3 Dusan Balek 2004-01-14 10:22:24 UTC

After consultation reassigning to Petr.

Comment 4 Petr Nejedly 2004-01-20 17:12:54 UTC

The DefaultRP thread is in the document's writeAccess and tries to lock
PositionRef.Manager.

The Java source parsing thread has the PositionRef.Manager already
locked and wants read access to the editor document.

The question is what the correct order is.
I've tried locking PR.M before the writeEnter on the document
and it seems to work, but I'm not sure about the order used
in other parts of the code (I think the opposite order should be used: Doc
first, PR.M second).


I'll attach the diff. Mila, can you please review it?
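
The opposing acquisition described in this comment can be reproduced in isolation. The following is only a minimal sketch, not NetBeans code: two plain ReentrantLocks stand in for PR.M and the document lock (the real document uses read/write access), and all class, method and variable names are hypothetical. tryLock with a timeout is used so the demo terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class CounterLockingDemo {

    /**
     * Runs two threads that take the same two locks in opposite order and
     * returns how many of them timed out waiting for their second lock.
     */
    public static int run() {
        ReentrantLock managerLock = new ReentrantLock();  // stands in for PR.M
        ReentrantLock documentLock = new ReentrantLock(); // stands in for the document lock
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothAttempted = new CountDownLatch(2);
        AtomicInteger timedOut = new AtomicInteger();

        // "parser" takes the manager lock first, "opener" takes the document lock first.
        Thread parser = new Thread(() ->
                takeBoth(managerLock, documentLock, bothHoldFirst, bothAttempted, timedOut), "parser");
        Thread opener = new Thread(() ->
                takeBoth(documentLock, managerLock, bothHoldFirst, bothAttempted, timedOut), "opener");
        parser.start();
        opener.start();
        try {
            parser.join();
            opener.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return timedOut.get();
    }

    private static void takeBoth(ReentrantLock first, ReentrantLock second,
                                 CountDownLatch bothHoldFirst, CountDownLatch bothAttempted,
                                 AtomicInteger timedOut) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();          // wait until the other thread holds its first lock
            if (second.tryLock(100, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                timedOut.incrementAndGet(); // a plain lock() here would block forever
            }
            bothAttempted.countDown();
            bothAttempted.await();          // keep holding `first` until both attempts are over
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("threads blocked on second lock: " + run());
    }
}
```

In this sketch both threads fail to get their second lock, so run() returns 2. If every thread took the two locks in the same order, the cyclic wait could not form, which is the idea behind settling on one order.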

Comment 5 Petr Nejedly 2004-01-20 17:17:17 UTC

Created attachment 12982
Possible lock order fix (PR.M first)

Comment 6 Miloslav Metelka 2004-01-21 09:53:54 UTC

Unfortunately I think we are facing one more problem here. If you look
at the thread dump of the Java parsing thread, it is already inside a
document renderer, so acquiring another lock on the same doc should
normally be a no-op. Unfortunately there is likely a _different_
document that was just loaded (by the second RP thread), so in fact the
locks acquired are like this:

Java parsing thread:
1. orig-doc-read-lock
2. PR.M
and wants new-doc-read-lock

RP:
1. new-doc-write-lock
2. CES.getLock()
and wants PR.M

so it looks like counter-locking of PR.M with new-doc.
As a relatively low-cost experiment we could try to merge PR.M and
CES.getLock() into a single lock. I would not bet on whether it would
introduce other deadlocks or not. I think that it could help in our case,
as just one thread at a time could proceed through CES.getLock() and the
new-doc would not be known to other threads until CES.getLock() gets
released, which should avoid the deadlock.
Another possibility could be to write-lock the orig-doc (if there was
any) before loading the new-doc. Not sure how feasible that is.
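
The hold/want lists in this comment can be checked mechanically as a wait-for graph. The sketch below is only an illustration under simplifying assumptions: each lock has a single holder, read and write access to new-doc are treated as one lock, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class WaitForGraph {
    private final Map<String, String> holderOf = new HashMap<>(); // lock -> thread holding it
    private final Map<String, String> wants = new HashMap<>();    // thread -> lock it waits for

    public void holds(String thread, String lock) {
        holderOf.put(lock, thread);
    }

    public void waitsFor(String thread, String lock) {
        wants.put(thread, lock);
    }

    /** Follows "waits for the holder of" edges from start; true if a thread repeats. */
    public boolean hasCycleFrom(String start) {
        Set<String> seen = new HashSet<>();
        String thread = start;
        while (thread != null && seen.add(thread)) {
            String lock = wants.get(thread);
            thread = (lock == null) ? null : holderOf.get(lock);
        }
        return thread != null; // non-null means we came back to an already visited thread
    }

    /** Builds the graph from the two lock lists given in the comment. */
    public static WaitForGraph fromLockLists() {
        WaitForGraph g = new WaitForGraph();
        g.holds("Java parsing thread", "orig-doc");
        g.holds("Java parsing thread", "PR.M");
        g.waitsFor("Java parsing thread", "new-doc");
        g.holds("RP", "new-doc");
        g.holds("RP", "CES.getLock()");
        g.waitsFor("RP", "PR.M");
        return g;
    }

    public static void main(String[] args) {
        System.out.println("cycle: " + fromLockLists().hasCycleFrom("Java parsing thread"));
    }
}
```

The parsing thread waits for new-doc held by RP, and RP waits for PR.M held by the parsing thread, so the walk returns to its starting thread and reports a cycle.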

Comment 7 Petr Nejedly 2004-01-21 15:31:23 UTC

So it seems we finally know what's going on.
There is no reload in progress, no oldDoc and newDoc, and the Java
parser holds no orig-doc-read-lock while parsing. It may seem strange
but it is OK, as there is no document loaded yet and the Java parser
is pulling directly from the stream.

What happened:
The Java parser was already parsing a stream (no Document anywhere) when
an open request came.
The parser thread was about to create a new PositionRef, found no
document, so it locked the PR.M and tried to proceed w/o a document.
But the document opening thread has just loaded the document and (as
a writer) notified the PR.M.
PR.M saves the document reference and tries to convert all registered
PositionRefs to this new state. To do this, it has to wait to lock
the PR.M.
The parser thread gets another timeslice and continues with creating
the PositionRef, but this time the document is already present
(something the PR.M was not prepared for) and it deadlocks trying to
read-lock it.

Mila's proposed solution could fix the problem so I'll try to
implement it.

Comment 8 Miloslav Metelka 2004-01-21 16:28:19 UTC

Yes, my apologies - I've seen the
PositionRef$Manager$DocumentRenderer.render() in the stacktrace
of java parsing thread and did not realize that it finally went
through the branch for the case when the actual document is null i.e.
through the direct invocation of run() in the renderer.

Comment 9 Petr Nejedly 2004-01-22 15:23:18 UTC

Fixed using the proposed solution (locking on CES's lock instead of
the PR.M private lock).
openide/src/org/openide/text/PositionRef.java ,v1.51
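
The idea behind this fix can be sketched as follows. This is not the actual change in PositionRef.java, only a simplified illustration with hypothetical names: a single shared lock guards both the document reference and the list of registered positions, so a thread creating a position can never see the document appear halfway through.

```java
import java.util.ArrayList;
import java.util.List;

public class SingleLockManager {

    /** Hypothetical stand-in for a PositionRef: either stream-based or bound to a document. */
    static final class Pos {
        boolean boundToDocument;
    }

    private final Object lock = new Object(); // one shared lock (plays the role of CES.getLock())
    private Object document;                  // null until the document is loaded
    private final List<Pos> positions = new ArrayList<>();

    /** Called by the parser: checks for a document and registers under the same lock. */
    Pos createPosition() {
        synchronized (lock) {
            Pos p = new Pos();
            p.boundToDocument = (document != null);
            positions.add(p);
            return p;
        }
    }

    /** Called by the opening thread: publishes the document and converts existing positions. */
    void documentLoaded(Object doc) {
        synchronized (lock) {
            document = doc;
            for (Pos p : positions) {
                p.boundToDocument = true;
            }
        }
    }

    int unboundCount() {
        synchronized (lock) {
            int n = 0;
            for (Pos p : positions) {
                if (!p.boundToDocument) {
                    n++;
                }
            }
            return n;
        }
    }

    /** Races a parser against a document load; returns positions left unconverted. */
    public static int demo() {
        SingleLockManager m = new SingleLockManager();
        Thread parser = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                m.createPosition();
            }
        }, "parser");
        Thread opener = new Thread(() -> m.documentLoaded(new Object()), "opener");
        parser.start();
        opener.start();
        try {
            parser.join();
            opener.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return m.unboundCount();
    }

    public static void main(String[] args) {
        System.out.println("unconverted positions: " + demo());
    }
}
```

Because only one lock is involved, there is no second lock to take in the wrong order, and every position is either converted by documentLoaded or created already bound.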

Comment 10 Marian Mirilovic 2004-02-24 16:44:38 UTC

verified - Tim, reopen if you can reproduce it again