This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 138196 - Deadlock on classloaders with JNLP distribution
Summary: Deadlock on classloaders with JNLP distribution
Status: VERIFIED FIXED
Alias: None
Product: platform
Classification: Unclassified
Component: Module System
Version: 6.x
Hardware: PC Windows XP
Importance: P2 blocker
Assignee: Jesse Glick
URL:
Keywords: RANDOM, THREAD
Depends on:
Blocks:
 
Reported: 2008-06-25 09:25 UTC by toniomack
Modified: 2008-12-23 08:40 UTC
CC List: 1 user

See Also:
Issue Type: DEFECT
Exception Reporter:


Attachments
Stack trace (17.44 KB, text/plain)
2008-06-25 09:28 UTC, toniomack
Thread dump for NB 6.1 platform (20.33 KB, text/plain)
2008-07-18 14:01 UTC, toniomack
file to use with FeedReader sample and NB6.1 (4.11 KB, text/plain)
2008-07-18 14:03 UTC, toniomack
master.jnlp to use with FeedReader sample and NB 6.1 (1.24 KB, text/plain)
2008-07-18 14:05 UTC, toniomack

Description toniomack 2008-06-25 09:25:06 UTC
A JNLP distribution occasionally freezes during start-up after "Done loading module" is displayed in the splash screen.
There is a deadlock (see attachment).
This issue never happened with a ZIP distribution.
Application size is around 60 MB and the JARs are signed.
Reproduced using JWS 1.5 and 1.6.0_02.
Comment 1 toniomack 2008-06-25 09:28:07 UTC
Created attachment 63399
Stack trace
Comment 2 Jesse Glick 2008-06-26 03:15:35 UTC
Hm. Not sure how to solve. Class loading asks a SecurityManager for permission to continue, yet the SM is loaded by the
CL being checked, and it is perhaps not fully resolved.
Comment 3 toniomack 2008-07-04 15:30:18 UTC
When the deadlock occurs we can consistently see that:

1 - in the "main" thread we try to load "org.openide.util.datatransfer.ExClipboard" due to the call
to TopSecurityManager.makeSwingUseSpecialClipboard in class org.netbeans.core.NonGui;
finally the JNLPClassLoader tries to load "java.awt.datatransfer.Clipboard"

2 - in the "AWT-EventQueue-2" thread we try to load "sun.text.resources.CollationData" and the TopSecurityManager tries to check
permission for "accessClassInPackage.sun.text.resources"

Any idea what changes in the app could have caused this issue, since it only recently started to show up?
Comment 4 Jesse Glick 2008-07-08 01:20:43 UTC
Don't know of any recent changes that would have made this a regression.
Comment 5 toniomack 2008-07-11 10:21:54 UTC
Workaround: comment out the call to TopSecurityManager.makeSwingUseSpecialClipboard in org.netbeans.core.NonGui (line 117).
The hang no longer appears, but the ExClipboard functionality is no longer available.
Comment 6 toniomack 2008-07-18 13:56:37 UTC
The workaround does not work; it just decreases the frequency of the freezes.

The freeze can also be reproduced with netbeans-6.1-200805300101-ml:

1 - Create a new project using the sample application FeedReader and copy the attached files master.jnlp and platform.properties
into the project
2 - "Build JNLP" application
3 - Deploy the WAR file
4 - Install and run the application using Web Start
The application will occasionally freeze (thread dump in attached DeadlockTraceNB6.1.TXT)
Comment 7 toniomack 2008-07-18 14:01:04 UTC
Created attachment 64959
Thread dump for NB 6.1 platform
Comment 8 toniomack 2008-07-18 14:03:05 UTC
Created attachment 64960
file to use with FeedReader sample and NB6.1
Comment 9 toniomack 2008-07-18 14:05:31 UTC
Created attachment 64961
master.jnlp to use with FeedReader sample and NB 6.1
Comment 10 Jesse Glick 2008-07-18 19:41:32 UTC
Not sure I would be able to reproduce, but trying to fix based on what I see in the thread dumps. Seems like the loading
of AWTPermission.class is the problem. Can preload this class, as was in fact already being done for several other
classes (though to solve some long-lost deadlock during debugging, not JNLP).

http://hg.netbeans.org/core-main/rev/de1508eb74a1

Since it seems you have already been testing source patches, it would be great if you could test this one. Mark VERIFIED
if it works, else reopen and attach new thread dumps.
Comment 11 Jesse Glick 2008-07-22 15:07:44 UTC
Explanation of what I _think_ was the problem (mainly based on the thread dump) and why the fix might work:

SecurityManagers occupy a special place in class loading because they can be called while loading a class
(SecurityManager.checkPackageAccess). ClassLoaders of course are involved in class loading too. The difference is that
a ClassLoader impl itself is clearly loaded by a "lower" class loader, so there is a strict stratification. By contrast,
the SM is set globally for the VM, so our TSM is called with cPA for every class that is loaded - even those classes in
the same class loader as TSM itself, even classes in the boot classpath. This is arguably a design flaw in Java.

What seems to have been happening here is that the first time TSM.cPA was called (in EQ), the test

  if (perm instanceof AWTPermission)

was encountered, yet AWTPermission was not loaded yet - not a big surprise, probably there was no activity in AWT yet at
all. So the class loader loading this code - JNLPClassLoader - asked to resolve and link in AWTPermission. Now
JNLPClassLoader was already locking its parent AppClassLoader in a different thread (main), in the normal class loader
delegation locking chain. (BTW this locking chain may be changed for JDK 7, mainly to support cyclic class loader
graphs.) Unfortunately EQ was already holding locks in the wrong direction: AppClassLoader first (to load something from
that loader), then JNLPClassLoader just because TSM's code triggered class loading.
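
The lock ordering described above can be illustrated with a small, self-contained sketch. This is not NetBeans or Web Start code: the two ReentrantLock objects are only stand-ins for the monitors of AppClassLoader and JNLPClassLoader, and the class and method names (LockOrderSketch, takeBoth, run) are hypothetical. A timed tryLock is used so the example terminates; real class loader monitors have no timeout, which is why the application hangs instead.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

final class LockOrderSketch {

    // Takes 'first', waits until the other thread also holds its own first lock,
    // then tries 'second' with a timeout. Returns true only if 'second' was obtained.
    static boolean takeBoth(ReentrantLock first, ReentrantLock second,
                            CountDownLatch bothHoldFirst, CountDownLatch bothTried) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            boolean gotSecond = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (gotSecond) {
                second.unlock();
            }
            // Keep holding 'first' until the other thread has finished its attempt,
            // so neither side can slip through after a timeout.
            bothTried.countDown();
            bothTried.await();
            return gotSecond;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            first.unlock();
        }
    }

    // Runs the two threads with opposite lock order and reports who got both locks.
    static boolean[] run() {
        ReentrantLock appLoader = new ReentrantLock();   // stand-in for AppClassLoader
        ReentrantLock jnlpLoader = new ReentrantLock();  // stand-in for JNLPClassLoader
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] acquired = new boolean[2];

        // "main": child loader first, then its parent (normal delegation chain).
        Thread mainLike = new Thread(
            () -> acquired[0] = takeBoth(jnlpLoader, appLoader, bothHoldFirst, bothTried));
        // "EQ": parent first, then the child because the security check triggers loading.
        Thread eqLike = new Thread(
            () -> acquired[1] = takeBoth(appLoader, jnlpLoader, bothHoldFirst, bothTried));

        mainLike.start();
        eqLike.start();
        try {
            mainLike.join();
            eqLike.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acquired;
    }
}
```

In this sketch neither thread obtains its second lock, because each one holds the lock the other is waiting for.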

The fix ensures that AWTPermission is loaded as soon as TSM is initialized, so that the body of checkPermission (called
from super's checkPackageAccess) does not mention any classes which would not already have been loaded. It seems that
someone long ago did similar fixes for other unusual classes used from TSM, though apparently to solve some issue with
running TSM in the debugger, rather than JNLP.
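
A minimal sketch of the preloading idea, assuming it can be reduced to touching the class literal in a static initializer. This is not the contents of the changeset linked above: PreloadSketch, PRELOADED and isAwtPermission are hypothetical names, and the real TopSecurityManager is not reproduced here.

```java
import java.awt.AWTPermission;
import java.security.Permission;

final class PreloadSketch {

    // Referencing the class literal here resolves AWTPermission when PreloadSketch
    // is initialized, rather than on the first permission check.
    // The contents of this list are an assumption made for illustration.
    private static final Class<?>[] PRELOADED = { AWTPermission.class };

    static int preloadedCount() {
        return PRELOADED.length;
    }

    // Stand-in for the instanceof test in the check method; by the time this runs,
    // AWTPermission has already been loaded, so no class loading is triggered here.
    static boolean isAwtPermission(Permission perm) {
        return perm instanceof AWTPermission;
    }
}
```

The point of the design is that the class loading happens at a predictable moment, before any thread can be inside a class loader lock while calling into the security manager.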
Comment 12 toniomack 2008-07-28 17:10:05 UTC
Verified, 
Thanks!!