Bug 205319 - ide hangs after fresh install upon click to expand 'servers'
Status: RESOLVED FIXED
Product: serverplugins
Classification: Unclassified
Component: GlassFish
Version: 7.2
Hardware: PC Windows 7
Priority: P1
Target Milestone: 7.2
Assigned To: TomasKraus
issues@serverplugins
not_in_rc1
: 71_HR_FIX
Duplicates: 199259 209916 210902 212935
Depends on:
Blocks:
Reported: 2011-11-20 04:30 UTC by darkos
Modified: 2012-05-24 09:17 UTC

See Also:
Issue Type: DEFECT

Attachments
startup + thread dump after clicking to expand 'servers' (11.26 KB, application/octet-stream)
2011-11-20 22:57 UTC, darkos
thread dump using latest version. (37.26 KB, text/plain)
2012-04-03 03:18 UTC, pbelbin
thread dump using NetBeans-dev-web-main-7271-on-20120403-full.zip (68.94 KB, text/plain)
2012-04-03 22:25 UTC, pbelbin
dump using ctrl+break using netbeans dev build 201204080400 (39.03 KB, text/plain)
2012-04-08 22:26 UTC, pbelbin
possible fix for original deadlock (6.08 KB, patch)
2012-04-11 09:32 UTC, Petr Hejl

Description darkos 2011-11-20 04:30:29 UTC
[ BUILD # : rc0 ]
[ JDK VERSION : 1.7 ]

I downloaded and set up solaris 11 11/11 in a Virtual Box environment.

as root:

next, I installed jdk 7.0 update 1

next, I installed NetBeans 7.1 rc0 that we've been asked to test.

as myself, I started NetBeans, and the ide came up as expected.  I then
activated the 'ide','java se', 'java web and EE' options, and said 'yes' to the
question about installing junit. (was junit earlier?)

I then clicked on the 'services' tab, and clicked to expand the 'servers', at
which point the cursor became the busy indicator, and the ide became
unresponsive.

I brought up the terminal window from which I started NetBeans and hit CTRL-C.

I then restarted NetBeans, and navigated again to the same place, and got the
same result.
Comment 1 Vince Kraemer 2011-11-20 05:16:58 UTC
please use control-\ to generate a thread dump so we can see where the ide is running into trouble.
Comment 2 darkos 2011-11-20 22:57:02 UTC
Created attachment 113352 [details]
startup + thread dump after clicking to expand 'servers'
Comment 3 Vince Kraemer 2011-11-21 18:43:25 UTC
http://hg.netbeans.org/web-main/rev/3e04a4c2b60e

Please try the next dev build that includes this code change to verify that it addresses this issue so we can push the fix into the release71 branch.
Comment 4 Quality Engineering 2011-11-22 15:50:40 UTC
Integrated into 'main-golden', will be available in build *201111220600* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)
Changeset: http://hg.netbeans.org/main/rev/3e04a4c2b60e
User: vince kraemer <vkraemer@netbeans.org>
Log: #205319 : deadlock when creating a local personal domain on solaris or other platforms where some files in a domain are not readable.
Comment 5 darkos 2011-11-23 03:45:39 UTC
I have tested the fix (in the daily dev build), and it does indeed appear to fix the issue.

thank you!

best regards,
peter.
Comment 6 Vince Kraemer 2011-11-23 16:55:17 UTC
this deadlock is in code that was added to address http://netbeans.org/bugzilla/show_bug.cgi?id=184674
Comment 7 Petr Hejl 2011-11-24 13:38:00 UTC
The fix seems to be good for 7.1. Though I would recommend some more investigation - the GFInstanceProvider code/lifecycle seems to be really deadlock prone.

I'm not sure about the CommonServerSupport lock in the getDomainsRoot method. Is it guarding some state? Perhaps it just makes the domain initialization atomic (or both). If so, rescheduling the work to another thread breaks that guarantee.
Comment 8 David Konecny 2011-11-24 20:57:52 UTC
I reviewed the fix but the code is too complex to give a definitive answer. I agree with what PetrH said - posting domain creation to a separate thread will avoid this deadlock, but I would think it also breaks the contract of getDomainsRoot(), no?

The deadlock is limited to the scenario where the domain being copied was created under root privileges and is unreadable for a regular user, as described in issue 184674. An alternative solution would be to revert the fix for issue 184674 and reopen it - it has 15 duplicates and has existed since 6.9, so the total number of impacted users is small.

On the other hand, because this deadlock is limited to this scenario, the fix Vince is proposing is safe and can be integrated into 7.1 - nobody apart from users in this scenario is impacted if there is an additional issue with it.
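
The contract concern raised in the two comments above can be illustrated with a small sketch. It is not the code from changeset 3e04a4c2b60e, which is not shown in this report; the class, its lock, and the method names are all hypothetical. It only shows the general effect: when a method that holds a lock hands the work to another thread, the method can return before the work is done.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; none of these names come from the GlassFish plugin.
class DeferredCreationSketch {
    private final Object lock = new Object();
    private boolean domainCreated;   // guarded by lock
    private Future<?> pending;       // guarded by lock
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "domain-creator");
        t.setDaemon(true);           // lets the JVM exit without an explicit shutdown
        return t;
    });

    private void createDomain() {
        synchronized (lock) {
            domainCreated = true;
        }
    }

    /** Inline variant: creation has finished when the method returns. */
    boolean getRootInline() {
        synchronized (lock) {
            createDomain();          // monitors are reentrant, so this does not block
            return domainCreated;
        }
    }

    /** Deferred variant: reports whether the domain existed at return time. */
    boolean getRootDeferred() {
        synchronized (lock) {
            Runnable task = this::createDomain;
            pending = worker.submit(task);
            // The worker needs `lock`, which this thread still holds,
            // so the domain cannot exist yet at this point.
            return domainCreated;
        }
    }

    /** Waits for the deferred task, then reports the final state. */
    boolean awaitCreation() {
        Future<?> f;
        synchronized (lock) {
            f = pending;
        }
        try {
            if (f != null) {
                f.get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        synchronized (lock) {
            return domainCreated;
        }
    }

    public static void main(String[] args) {
        System.out.println("inline, created at return:   " + new DeferredCreationSketch().getRootInline());
        DeferredCreationSketch deferred = new DeferredCreationSketch();
        System.out.println("deferred, created at return: " + deferred.getRootDeferred());
        System.out.println("deferred, created later:     " + deferred.awaitCreation());
    }
}
```

Whether this matters for getDomainsRoot() depends on what its callers assume about the returned root, which this report does not settle.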
Comment 9 Marian Mirilovic 2011-11-25 09:09:13 UTC
http://hg.netbeans.org/releases/rev/3197a272e915
Comment 10 Quality Engineering 2011-11-29 08:45:47 UTC
Integrated into 'releases'
Changeset: http://hg.netbeans.org/releases/rev/3197a272e915
User: vince kraemer <vkraemer@netbeans.org>
Log: #205319 : deadlock when creating a local personal domain on solaris or other platforms where some files in a domain are not readable.
Comment 11 Jiri Skrivanek 2011-12-07 08:13:00 UTC
Verified by reporter in build 2011-12-05_11-21-34.
Comment 12 Tomas Pavek 2012-01-27 15:21:43 UTC
*** Bug 199259 has been marked as a duplicate of this bug. ***
Comment 13 pbelbin 2012-03-15 13:25:01 UTC
this is again happening with 7.2.
Comment 14 Vince Kraemer 2012-03-15 15:55:50 UTC
(In reply to comment #13)
> this is again happening with 7.2.

Please provide info about the build where you see this (from the About box) and thread dump if you can.

Please verify that the steps you took match those in the initial description.  If they are different, please provide them...
Comment 15 Vince Kraemer 2012-03-15 15:57:15 UTC
Tomas, please verify that your recent changes for the stackoverflow are not coming into play here.
Comment 16 TomasKraus 2012-03-15 17:38:24 UTC
OK. They were not pushed into main repository yet. But when comparing stack traces this looks completely different.

Is it possible to reproduce this in some less complicated way? I would like to avoid installing VirtualBox + Solaris.
Comment 17 pbelbin 2012-03-15 21:03:00 UTC
the VirtualBox Solaris install was used in the original bug report to verify that the issue was happening on other than Windows, I believe.

I am currently seeing this issue on Windows 7 x64 with JDK 7u3 x64.
Comment 18 TomasKraus 2012-03-19 15:21:26 UTC
I was trying to reproduce it on my trunk build without any success. Please can you test this scenario with last netbeans build from http://bertram-tst.netbeans.org:8080/job/web-main/7126/ ?

This build already contains bug fix that addressed issue in Services :: Servers menu. I wanted to reproduce this myself but it works fine for me. :(
Comment 19 pbelbin 2012-03-20 04:34:17 UTC
I downloaded and tried out the version you indicated.

however: what I got ended up behaving the same.

here is what I did:

1. when asked, accept the junit download.
2. use netbeans64 binary to start netbeans (contained in the zip).
3. once the ide initially comes up, go into the plugins menu and enable the following already installed options: java SE, base IDE, web applications.
4. once they have been installed, open the following sub windows from the main toolbar's 'window' item: projects, files, services.
5. click on the 'servers' item.  it will show that there is nothing there.
6. right click on it, and choose 'add server'
7. proceed to add a local domain instance of glassfish 3.1.2 located somewhere on your machine.
8. at this point, I got an error report dialog that came up, and it got registered as ide slowness.
9. shut down the ide.
10. start the ide.
11. click on 'services'
12. click to expand 'servers'
13. get the never ending hourglass and 'please wait'.

note: I really have no idea which jdk this is using, since it was executed via the contents of the .zip download, and not set up using the installer that is normally used.

and: since there's no installer, how do I clean up the install history stuff so that next time it won't find any remnants of having been running?
Comment 20 pbelbin 2012-03-20 04:36:38 UTC
note: this issue is currently occurring on Windows 7 x64.  I have not tried it using the VirtualBox hosted Solaris x86 environment lately.

I have installed jdk 7u3 x64 on this computer, so I am hopeful this is what it was using when I ran the previously mentioned details.
Comment 21 darkos 2012-03-23 16:10:45 UTC
(In reply to comment #18)
> I was trying to reproduce it on my trunk build without any success. Please can
> you test this scenario with last netbeans build from
> http://bertram-tst.netbeans.org:8080/job/web-main/7126/ ?
> 
> This build already contains bug fix that addressed issue in Services :: Servers
> menu. I wanted to reproduce this myself but it works fine for me. :(

should this be available in daily build 201203230400 ?

I have downloaded and tried this also on a windows 2008 r2 x64 with jdk 7u3 and it is still behaving badly.  ie: expand 'servers' and get the 'Please Wait' forever.
Comment 22 Petr Hejl 2012-03-29 14:48:15 UTC
(In reply to comment #21)
> I have downloaded and tried this also on a windows 2008 r2 x64 with jdk 7u3 and
> it is still behaving badly.  ie: expand 'servers' and get the 'Please Wait'
> forever.

Guys, if you are still able to reproduce the issue, please attach a couple of thread dumps (http://wiki.netbeans.org/GenerateThreadDump) generated when the IDE is in the "waiting forever" state.
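
For readers who want to see what such a dump reveals, the JVM can also report monitor deadlocks programmatically through java.lang.management.ThreadMXBean. The following is a minimal standalone sketch, unrelated to the IDE code: it deliberately deadlocks two daemon threads on two hypothetical monitors and asks the JVM whether it sees the cycle.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockProbe {
    /** Deadlocks two daemon threads on purpose and reports whether the JVM detects it. */
    static boolean detectsDeadlock() {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        Thread t1 = new Thread(() -> grab(lockA, lockB, bothHoldFirst), "probe-1");
        Thread t2 = new Thread(() -> grab(lockB, lockA, bothHoldFirst), "probe-2");
        t1.setDaemon(true);   // daemon threads do not keep the JVM alive
        t2.setDaemon(true);
        t1.start();
        t2.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] ids = mx.findMonitorDeadlockedThreads();   // null when no deadlock is found
            if (ids != null && contains(ids, t1.getId()) && contains(ids, t2.getId())) {
                for (ThreadInfo info : mx.getThreadInfo(ids, 8)) {
                    if (info != null) {
                        System.out.print(info);   // thread name, state, and the lock it waits for
                    }
                }
                return true;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    private static void grab(Object first, Object second, CountDownLatch bothHoldFirst) {
        synchronized (first) {
            bothHoldFirst.countDown();
            try {
                bothHoldFirst.await();   // wait until the other thread holds its first monitor
            } catch (InterruptedException e) {
                return;
            }
            synchronized (second) {
                // never reached: the other thread owns `second` and waits for `first`
            }
        }
    }

    private static boolean contains(long[] ids, long id) {
        for (long candidate : ids) {
            if (candidate == id) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("deadlock detected: " + detectsDeadlock());
    }
}
```

A thread dump taken with control-\ or Ctrl+Break contains the same kind of information for every thread in the process, which is why it was requested here.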
Comment 23 pbelbin 2012-04-03 01:39:58 UTC
when I click on the included link for how to generate the dump, I get a page indicating a MySQL error.
Comment 24 pbelbin 2012-04-03 01:50:58 UTC
Database error

 A database query syntax error has occurred. This may indicate a bug in the software. The last attempted database query was:
 (SQL query hidden)
 from within function "MediaWikiBagOStuff::_doinsert". MySQL returned error "1114: The table 'objectcache' is full (localhost)".
Comment 25 pbelbin 2012-04-03 03:18:47 UTC
Created attachment 117707 [details]
thread dump using latest version.

here is a thread dump of NetBeans 201204021038 exhibiting the same 'waiting forever' behavior.
Comment 26 TomasKraus 2012-04-03 09:06:18 UTC
Thank you for the dump. Analysing it, I see a textbook example of a deadlock:

Thread tid=0x0000000006d16800
-----------------------------
Holds       0x00000000f905bb28
Waiting for 0x00000000f95185e0

Thread tid=0x0000000006d0b800
-----------------------------
Holds       0x00000000f917a900
Holds       0x00000000f99d9908
Holds       0x00000000f95185e0
Holds       0x00000000f9518b60
Holds       0x00000000f9519de8
Waiting for 0x00000000f905bb28

So there is a cycle between the lock requests for 0x00000000f95185e0 and 0x00000000f905bb28.
The cause is again a textbook example of a wrong locking pattern (the two locks are requested in opposite order):
 * 0x00000000f905bb28 first and 0x00000000f95185e0 second in tid 0x0000000006d16800
   - org.netbeans.modules.j2ee.deployment.impl.ServerRegistry.init
     requested 0x00000000f905bb28
   - org.netbeans.modules.glassfish.common.GlassfishInstanceProvider.getPrelude
     requested 0x00000000f95185e0
 * 0x00000000f95185e0 first and 0x00000000f905bb28 second in tid 0x0000000006d0b800
   - org.netbeans.modules.glassfish.common.GlassfishInstanceProvider.getEe6
     requested 0x00000000f95185e0
   - org.netbeans.modules.j2ee.deployment.impl.ServerRegistry.getServerInstance
     requested 0x00000000f905bb28

We have to reverse the order of locking in one of the code paths to fix this issue.
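
The inversion described above can be reproduced outside the IDE. The following is a minimal sketch, not the NetBeans code: `registryLock` and `providerLock` are hypothetical stand-ins for the two monitors in the dump, and a timed tryLock replaces the blocking acquire so that the demo terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class LockOrderDemo {
    // Hypothetical stand-ins for the ServerRegistry and provider monitors.
    static final ReentrantLock registryLock = new ReentrantLock();
    static final ReentrantLock providerLock = new ReentrantLock();

    /** Opposite acquisition order: returns how many threads obtained both locks. */
    static int invertedOrder() {
        AtomicInteger gotBoth = new AtomicInteger();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        runBoth(
            () -> attempt(registryLock, providerLock, bothHoldFirst, bothTried, gotBoth),
            () -> attempt(providerLock, registryLock, bothHoldFirst, bothTried, gotBoth));
        return gotBoth.get();
    }

    /** Same acquisition order in both threads: returns how many obtained both locks. */
    static int consistentOrder() {
        AtomicInteger gotBoth = new AtomicInteger();
        Runnable task = () -> {
            registryLock.lock();
            try {
                providerLock.lock();
                try {
                    gotBoth.incrementAndGet();
                } finally {
                    providerLock.unlock();
                }
            } finally {
                registryLock.unlock();
            }
        };
        runBoth(task, task);
        return gotBoth.get();
    }

    private static void attempt(ReentrantLock first, ReentrantLock second,
                                CountDownLatch bothHoldFirst, CountDownLatch bothTried,
                                AtomicInteger gotBoth) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();          // both threads now hold their first lock
            if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
                try {
                    gotBoth.incrementAndGet();
                } finally {
                    second.unlock();
                }
            }
            bothTried.countDown();
            bothTried.await();              // keep the first lock until both attempts ended
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    private static void runBoth(Runnable r1, Runnable r2) {
        Thread a = new Thread(r1);
        Thread b = new Thread(r2);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("inverted order, threads holding both locks:   " + invertedOrder());
        System.out.println("consistent order, threads holding both locks: " + consistentOrder());
    }
}
```

With a plain blocking acquire in place of the timed tryLock, the inverted variant would wait forever, which matches the "Please Wait" state reported in this bug.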
Comment 27 TomasKraus 2012-04-03 09:46:22 UTC
To fix this issue I modified both getEe6() and getPrelude() in the GlassfishInstanceProvider class to use double-checked locking with two independent locks instead of locking on the whole class.

Method initialized() was also changed to use those two locks instead of locking on the GlassfishInstanceProvider class.
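
For reference, the general shape of double-checked locking with one lock per lazily created instance looks roughly like the sketch below. It is an illustration under assumptions, not the contents of the changeset: the `Provider` class, the lock objects, and the counter are hypothetical, and only the accessor names getEe6() and getPrelude() are taken from the comment above. The fields are volatile, which the Java memory model requires for this idiom to be safe.

```java
import java.util.concurrent.atomic.AtomicInteger;

class DoubleCheckedProviders {
    static final AtomicInteger constructed = new AtomicInteger();

    // Hypothetical placeholder for the real provider type.
    static final class Provider {
        final String flavor;

        Provider(String flavor) {
            this.flavor = flavor;
            constructed.incrementAndGet();
        }
    }

    // One lock per instance, so the two accessors never contend with each other.
    private static final Object EE6_LOCK = new Object();
    private static final Object PRELUDE_LOCK = new Object();
    private static volatile Provider ee6Provider;
    private static volatile Provider preludeProvider;

    static Provider getEe6() {
        Provider p = ee6Provider;              // first check, without the lock
        if (p == null) {
            synchronized (EE6_LOCK) {
                p = ee6Provider;               // second check, under the lock
                if (p == null) {
                    p = new Provider("ee6");
                    ee6Provider = p;
                }
            }
        }
        return p;
    }

    static Provider getPrelude() {
        Provider p = preludeProvider;
        if (p == null) {
            synchronized (PRELUDE_LOCK) {
                p = preludeProvider;
                if (p == null) {
                    p = new Provider("prelude");
                    preludeProvider = p;
                }
            }
        }
        return p;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                getEe6();
                getPrelude();
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("providers constructed: " + constructed.get());
    }
}
```

Separate locks remove contention between the two accessors, but they do not by themselves change the order in which a caller acquires other locks, such as one held by a registry.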
Comment 28 TomasKraus 2012-04-03 09:55:28 UTC
Also added @SuppressWarnings("LeakingThisInConstructor") to constructor to remove this warning. It's annoying and we know what we are doing. :)
Comment 29 TomasKraus 2012-04-03 13:52:09 UTC
changeset:   218701:722614006deb
user:        Tomas Kraus <TomasKraus@netbeans.org>
date:        Tue Apr 03 14:43:54 2012 +0200
files:       glassfish.common/src/org/netbeans/modules/glassfish/common/GlassfishInstanceProvider.java
description: Bug 205319 - IDE hangs after fresh install upon click to expand Servers

pushing to https://TomasKraus:***@hg.netbeans.org/web-main/
remote: added 3 changesets with 1 changes to 1 files

Here is the fix. Please verify it once a build containing it is ready.
Comment 30 TomasKraus 2012-04-03 14:36:22 UTC
NetBeans builds with this fix are available at
http://bertram2.netbeans.org:8080/job/web-main/7271/

Please verify whether the problem is solved. I was unable to reproduce it in my development environment.
Comment 31 pbelbin 2012-04-03 22:25:03 UTC
Created attachment 117763 [details]
thread dump using NetBeans-dev-web-main-7271-on-20120403-full.zip

it's still giving me the endless hourglass....  

try again!

regards,
peter
Comment 32 TomasKraus 2012-04-03 23:25:10 UTC
I removed one issue, but it looks like there is still a problem with GlassfishInstanceProvider.getEe6 itself being called twice. I'll give it one more try.
Comment 33 pbelbin 2012-04-08 22:26:44 UTC
Created attachment 118002 [details]
dump using ctrl+break using netbeans dev build 201204080400

dump using ctrl+break using netbeans dev build 201204080400
Comment 34 Vince Kraemer 2012-04-10 17:01:33 UTC
*** Bug 210902 has been marked as a duplicate of this bug. ***
Comment 35 David Konecny 2012-04-11 00:22:37 UTC
Tomas, I had to revert your changeset as it was causing another P1:

changeset:   219455:1cb303917a69
parent:      218759:722614006deb
user:        David Konecny <dkonecny@netbeans.org>
date:        Wed Apr 11 12:08:01 2012 +1200
summary:     Backed out changeset 722614006deb

It is in web-main now.
Comment 36 David Konecny 2012-04-11 00:42:24 UTC
I forgot to say that the original fix caused issue 210741.
Comment 37 Petr Hejl 2012-04-11 09:32:47 UTC
Created attachment 118114 [details]
possible fix for original deadlock

Tomas, is the synchronization on the init code really needed? I think this change could fix the original issue without introducing the SOE. I quickly tested the basic use cases (register/unregister/deploy/start/stop) and it seems to work OK. Though I'm not a GF plugin expert.
Comment 38 Petr Hejl 2012-04-11 11:56:43 UTC
*** Bug 209916 has been marked as a duplicate of this bug. ***
Comment 39 TomasKraus 2012-04-11 14:44:07 UTC
changeset:   219483:fa7930503972
user:        Tomas Kraus <TomasKraus@netbeans.org>
date:        Wed Apr 11 16:31:57 2012 +0200
files:       glassfish.common/src/org/netbeans/modules/glassfish/common/GlassfishInstanceProvider.java
description:
Bug# 205319 - IDE hangs after fresh install upon click to expand 'servers'
Bug# 210741 - InstanceCreationException caused by bug# 205319 fix.
Removed deadlock condition in
org.netbeans.modules.glassfish.common.GlassfishInstanceProvider
and org.netbeans.modules.j2ee.deployment.impl.ServerRegistry
Comment 40 TomasKraus 2012-04-11 15:23:20 UTC
Peter pointed out a theoretical deadlock condition when using [CountDownLatch].await() to wait for the first thread to finish initialization:

 * thread A: calls ServerRegistry.init() and gets the "ServerRegistry" lock,
             then calls GlassfishInstanceProvider.getEe6(...) but is not first
             and is suspended in await()

 * thread B: calls GlassfishInstanceProvider.getEe6(...) and is the first one,
             so it goes through new GlassfishInstanceProvider(...)
             and ee6Provider.init().
             ee6Provider.init() calls ServerRegistry.getServerInstance(...),
             and ServerRegistry.getServerInstance(...) requests the
             "ServerRegistry" lock. But this lock is held by thread A, which is
             waiting in await() for thread B.

We decided to modify this fix to call <some>Provider.init() outside the synchronized block and to remove the CountDownLatch.
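
The pattern decided on above, creating the instance under the lock but running init() after the lock is released and without anyone awaiting it, can be sketched as follows. This is an assumption-laden illustration, not the content of the changeset: `REGISTRY_LOCK`, `Provider`, and the two latches are hypothetical, and the latches exist only to force the thread A / thread B interleaving from the scenario above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class InitOutsideLockDemo {
    static final Object REGISTRY_LOCK = new Object();   // stands in for the "ServerRegistry" lock
    private static final Object EE6_LOCK = new Object();
    private static Provider ee6Provider;                 // guarded by EE6_LOCK
    static final AtomicInteger initCalls = new AtomicInteger();
    // Demo-only hooks that force the interleaving of threads A and B.
    static final CountDownLatch registryHeld = new CountDownLatch(1);
    static final CountDownLatch providerCreated = new CountDownLatch(1);

    static final class Provider {
        Provider() {
            providerCreated.countDown();
        }

        void init() {
            synchronized (REGISTRY_LOCK) {   // like init() calling back into the registry
                initCalls.incrementAndGet();
            }
        }
    }

    static Provider getEe6() {
        Provider created = null;
        Provider result;
        synchronized (EE6_LOCK) {
            if (ee6Provider == null) {
                created = new Provider();
                ee6Provider = created;
            }
            result = ee6Provider;
        }
        if (created != null) {
            created.init();   // runs with EE6_LOCK released; no other caller waits for it
        }
        return result;
    }

    /** Runs the A/B scenario and returns true if both threads finished. */
    static boolean runScenario() {
        Thread a = new Thread(() -> {
            synchronized (REGISTRY_LOCK) {
                registryHeld.countDown();
                awaitQuietly(providerCreated);
                getEe6();      // not first: returns without waiting for init()
            }
        }, "thread-A");
        Thread b = new Thread(() -> {
            awaitQuietly(registryHeld);
            getEe6();          // first: creates the provider, init() waits for thread A
        }, "thread-B");
        a.setDaemon(true);
        b.setDaemon(true);
        a.start();
        b.start();
        try {
            a.join(5000);
            b.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !a.isAlive() && !b.isAlive();
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("both threads finished: " + runScenario());
        System.out.println("init() calls: " + initCalls.get());
    }
}
```

The trade-off in this sketch is that a caller who is not first may receive a provider whose init() has not finished yet; whether that is acceptable depends on what callers expect from the returned instance.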

changeset:   219600:ae6cb02f6d83
user:        Tomas Kraus <TomasKraus@netbeans.org>
date:        Wed Apr 11 17:08:09 2012 +0200
files:       glassfish.common/src/org/netbeans/modules/glassfish/common/GlassfishInstanceProvider.java
description:
Bug# 205319 - IDE hangs after fresh install upon click to expand 'servers'
Bug# 210741 - InstanceCreationException caused by bug# 205319 fix.
Previos attempt to fix deadlock may not work so using more conservative fix.
Comment 41 TomasKraus 2012-04-11 15:31:35 UTC
New fix attempt is being built on http://bertram2.netbeans.org:8080/job/web-main/7350/ so please test it to see if it resolved your issue.
Comment 42 bdwalker 2012-04-11 16:17:40 UTC
Seems to be resolved. I tested the basic use cases (expand/add/start/stop/delete) with the build artifact from job #7350. I was unable to reproduce the issue, i.e., the server section expanded without any problems.


(In reply to comment #41)
> New fix attempt is being built on
> http://bertram2.netbeans.org:8080/job/web-main/7350/ so please test it to see
> if it resolved your issue.
Comment 43 TomasKraus 2012-04-11 23:53:30 UTC
Thank you for feedback.
Closing this issue.
Comment 44 pbelbin 2012-04-12 03:23:28 UTC
I tried the indicated build, and indeed, it does appear to not suffer the issue for me.

best regards!
Comment 45 Jiri Skrivanek 2012-05-24 09:17:19 UTC
*** Bug 212935 has been marked as a duplicate of this bug. ***

