This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.
(I am filing this bug in the OpenIDE category even though it could also be related to the Java module, the Web module, and/or other modules in the IDE. It is difficult to know which has the primary responsibility in this case.) We are seeing frequent hangs when unmounting a filesystem mounted through our module (a derivative of the Web module). The hangs are complete deadlocks and seem to occur between the Swing thread and one or more threads trying to look up object instances. Note that I do not mean to imply that this is only a problem when working with our module; it's just that testing our module relies on mounting and unmounting a large number of filesystems. This could be a problem when unmounting any filesystem, but we do not test that case on a regular basis. In the attached stack trace, you can see that the RunToCursorAction.enable() method on the Swing thread is doing a lookup of a debugger instance at the same time the Java source parsing thread is looking up a compiler type, while several other threads are accessing the FolderList and other assorted objects with Mutexes. We see similar hangs often, as well as a number of other Studio hangs under other circumstances. In all cases, they seem to have only a cursory relationship to our module (only one method, dispose(), is being invoked on one of our classes in the attached trace). The probability of a hang seems to be increased by having one or more Java and/or JSP files from the filesystem open (or recently opened) when the filesystem is unmounted. When JSP files are open, we frequently see in the thread dump a lookup for a JSP syntax coloring/parsing service. We leave open the possibility that this is somehow caused by our module, but the attached thread dump seems to argue against that, since none of our objects are in the traces except for one of our DataObjects (not a Java or JSP file) being discarded.
Thread dumps from many of these hangs have led us to suspect that either the Open API's Lookup class/infrastructure is not properly threadsafe, or the use of this system from various modules (directly or indirectly) is not properly taking thread safety into account. Another possibility is that the Mutex class is broken and/or being used improperly in various places. To be honest, these are just suppositions--it is baffling trying to analyze this problem. One open question is why the Java parser is being invoked during an unmount of a filesystem. We have seen this as a source of trouble in other cases, though I cannot articulate them at this time. We have also come to believe that there may be a significant problem with zombie FileObjects and DataObjects in the IDE during filesystem unmounts. Could it be that this hang phenomenon as shown in the trace is evidence of that? Our impression of the stability of the Studio has fallen since Studio 4.1, largely due to hangs like this one. We are concerned that users of our module are going to encounter this problem, since they are prone to mount and unmount filesystems more frequently on average. We are also concerned because similar hangs seem to occur upon switching projects, which is certainly something all Studio users do.
Created attachment 11473 [details] Thread dump after hang
My colleague has also mentioned that the same thing occurs with the Find... feature, perhaps even more frequently. I am able to reproduce the problem by doing the following:
1) Open a Java file in the editor by double clicking its node
2) Use the Find... feature to locate the same node in the search window
3) Double click the node in the search window to open it in the editor
In a majority of cases, a second editor tab will appear for the file. I have been able to reproduce this reliably. One note: when the second editor appears, it may be linked to the first editor such that edits in one appear in the other. However, it is definitely possible to get into a pathological state where the editors are not related and changes in one are tracked in the other, resulting in the changes not being saved under certain conditions.
Sorry, ignore last post--wrong issue.
Jarda, could you please look at this issue and suggest a solution, if there is one? You are the only living expert on Datasystems. :-) Todd, yes, threading is a well-known problem which we want to address in future version(s). We are very well aware of this, but unfortunately in the current state it is hard to fix any issue related to deadlocks. Usually any fix in the openide causes a bunch of problems somewhere else. We will try to prepare some workaround, but do not expect anything magic that will solve all these problems. Definitely not before completion of the redesign of threading and Datasystems.
The root of this deadlock is the code inside the "Folder Recognizer" thread. Nearly every other thread is waiting for it to finish, but the recognizer is blocked. A fix should be based on following a priority of resources - e.g. Nodes are allowed to make calls into DataSystems while holding a resource, but DataSystems cannot call Nodes while holding a resource. The "Folder Recognizer" thread is a resource, and here it is held by DataSystems while calling into Nodes. This should not happen. The possible ways of avoiding it:
1. org.openide.loaders.DataNode$PropL.propertyChange(DataNode.java:551) should reschedule to a different thread before calling the Node.super methods.
2. A simple workaround in com.sun.jato.tools.sunone.context.ClassesDataFolder.dispose(ClassesDataFolder.java:54): reschedule super.dispose into a thread other than the Folder Recognizer.
3. The wider problem is why the ClassesDataFolder is being disposed. I am pretty sure that this is related to the unmounting of the FileSystem. Such an action triggers a recheck of ClassDataFolder, and this data object is no longer able to recognize itself (checkConsistency), which is why it invokes dispose. If there would be some way for the ClassesDataFolder to survive we would not get into this trouble.
Generally I think solution 1 is the most appropriate, but it requires changes in the platform - it is not in release35. Moreover, I understand that it is based on an assumption, not written down anywhere, about the "resources hierarchy" described above, which will in any case be violated in tons of other places.
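Options 1 and 2 both come down to handing work to another thread rather than letting the recognizer block on a node-level lock. A minimal standalone sketch of that idea follows, using only java.util.concurrent. Everything in it is an assumption made for illustration: `NODE_LOCK` is a hypothetical stand-in for Children.MUTEX, and the two executors stand in for the Folder Recognizer thread and for whatever thread takes the rescheduled call. It is not the platform's code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class RescheduleSketch {
    // Hypothetical stand-in for Children.MUTEX; not an Open API object.
    static final ReentrantLock NODE_LOCK = new ReentrantLock();

    /** Returns true if the recognizer task finished while the node lock was still held elsewhere. */
    static boolean run() {
        ExecutorService recognizer = Executors.newSingleThreadExecutor(); // plays "Folder Recognizer"
        ExecutorService notifier = Executors.newSingleThreadExecutor();   // receives rescheduled calls
        CountDownLatch recognizerDone = new CountDownLatch(1);
        CountDownLatch nodeUpdated = new CountDownLatch(1);
        try {
            boolean finishedWhileLocked;
            NODE_LOCK.lock(); // the calling thread plays a thread that already holds the node lock
            try {
                recognizer.execute(() -> {
                    // Do not take NODE_LOCK here; hand the node update to another thread instead.
                    notifier.execute(() -> {
                        NODE_LOCK.lock();
                        try {
                            nodeUpdated.countDown();
                        } finally {
                            NODE_LOCK.unlock();
                        }
                    });
                    recognizerDone.countDown(); // the recognizer is free to continue
                });
                finishedWhileLocked = recognizerDone.await(5, TimeUnit.SECONDS);
            } finally {
                NODE_LOCK.unlock();
            }
            // Once the lock is released, the rescheduled update can run.
            boolean updated = nodeUpdated.await(5, TimeUnit.SECONDS);
            return finishedWhileLocked && updated;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            recognizer.shutdown();
            notifier.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("recognizer finished while node lock was held: " + run());
    }
}
```

The trade-off in this sketch is ordering: the node update now happens some time after the recognizer has moved on, so any code relying on the update being complete when the recognizer call returns would need to be revisited.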
Some more facts which may help the analysis of the issue...
1) It was said in the opening issue statement: "we frequently see in the thread dump a lookup for a JSP syntax coloring/parsing service." To be more clear, we are specifically referring to the JspParser (Jasper integration) material and not the syntax analyzer (org.netbeans.modules.web.core.syntax.Jsp11Syntax) which most of the JSP syntax coloring relies on. This is probably unimportant but I just wanted to be clear.
2) It is important to understand how our module currently reacts to unmounting, specifically how our Loaders react to invalid filesystems. Because our module has a lot of tracing that we use for debugging specific functionality, we noticed a lot of unexpected behavior executing upon unmounts of the web app filesystem. For example, we noticed that our primary design artifact DataObjects (Models, Views, Commands, and JSPs) were being recreated during unmount. We would end up having these orphan, or what we called "Zombie", DataObjects around. Our investigation showed that our loaders' findPrimaryFile() was being called for FileObjects which were in fact parented by invalid filesystems. Why FileObject activity would proceed on filesystems which were invalid was a mystery to us. Our decision at that time was to return null from all our findPrimaryFile() calls in our loaders. This seemed to make the Zombie S1AF DataObjects disappear. Question: is there ever a proper situation in which a DataObject should be created for a FileObject of an invalid filesystem? If not, why doesn't the platform assert that a FileObject is in a valid FileSystem before attempting to load its DataObject? The hang occurrences became pronounced two weeks ago (just about every time) as we approached a feature complete build of S1AF for Studio5. The hang thread dumps show that we were hung deep within some of our loaders' handleFindDataObject().
To be consistent with half of our loaders which already had the pattern, we added checks for invalid filesystems to all remaining loaders. The majority of the hangs went away. The hangs that remain are accurately described by Todd in the original issue posting. Hopefully this additional information will help frame the issue. Regarding the comments from jtulach earlier today: Todd will have to evaluate your suggestions #1 and #2. Regarding your comment #3: You assert that ClassesDataFolder.dispose() is being called in reaction to unmount. Question: Should we not expect that DataObject.dispose() will be called on all the live DataObjects for our mounted web app after the app is unmounted? Were you suggesting that dispose() was abnormal during unmount? You state that the unmount action triggers a "recheck" of ClassDataFolder but this fails, so dispose() is called instead. You state "If there would be some way for the ClassesDataFolder to survive we would not get into this trouble." In light of our explanation of how we currently return null from handleFindDataObject() in cases of invalid filesystems... does this help provide insight into our issue?
Hi, we would like to reproduce the hang, so can you provide us with something like a test case, or a reproducible scenario with step-by-step notes? Is the hang reproducible only on Windows, or also on other operating systems? Thanks in advance ...
I will try to provide ASAP (and attach) a small S1AF web app you can mount and unmount to reproduce the hang. I will do this on Solaris since I heard that is what you folks probably use. In the meantime please set up the following test environment to run the S1AF module. The S1AF module is available at http://clue.sfbay/kits/jato/trunk/Build030902/ - please install the NBM. Optionally, if you would like to see some of our debug traffic in ide.log, add the following switches to your ide.cfg:
-J-Dsunone.jato.debug=true
-J-Dsunone.jato.debug.usesystemerr=true
-J-Dsunone.jato.debug.file=/tmp/Debug.properties
and then ensure the contents of your /tmp/Debug.properties are the following (these lines enable the indicated classes and packages to output debug):
----------------------------------
jsp.JatoJspLoader
app.JatoAppLoader
command.CommandDefinitionLoader
mode.ModelDefinitionLoader
view.ViewBeanDefinitionLoader
view.ContainerViewDefinitionLoader
mount
context
zombie
----------------------------------
Correction to the Debug.properties contents (model misspelled):
----------------------------------
jsp.JatoJspLoader
app.JatoAppLoader
command.CommandDefinitionLoader
model.ModelDefinitionLoader
view.ViewBeanDefinitionLoader
view.ContainerViewDefinitionLoader
mount
context
zombie
----------------------------------
Jaroslav-- I do not think you will be able to reproduce this in any reliable way, as it is a race condition and subject to a multitude of factors. We see it frequently enough, but only because we mount and unmount hundreds of apps. Each time it happens, the thread dump is different and involves different threads and different objects. I believe this thread dump was from an unmount attempt (that's usually the case for these hangs). I don't think there is any reason why we want to allow ClassesDataFolder to survive an unmount--that will only lead to zombies and other issues, no? The dispose() call is perfectly reasonable and expected here as far as I know. Also keep in mind that the ClassesDataFolder.dispose() call is just one of many variations. This is only one instance of the hang and every hang is different. The common feature in all thread dumps during these types of hangs is many threads simultaneously accessing FolderRecognizer. I'm worried about your workaround suggestion for a couple of reasons. First, the fact that ClassesDataFolder appears in this thread dump is just coincidence. We have a large number of DataObjects, any of which (or none) could be involved in a hang. In this instance, are you perhaps focusing on ClassesDataFolder as a problem when in fact it may be several of the other threads that are causing the deadlock? If this is the case, then doing something in ClassesDataFolder.dispose() won't have any effect. Second, wouldn't we need to put your suggested workaround in all of our DataObjects, since any one of them could be involved in such a hang? Are you sure that this hang is *caused* by dispose() in ClassesDataFolder, and so you recommend we make similar changes everywhere? What about other modules' DataObjects, like JavaDataObject? Third, I assume you know what you are talking about, but to me your workaround seems radical.
Under normal conditions I would never assume that doing something like this would be a "safe" or recommended operation, as my assumption is that a DataObject's lifecycle is complex and something I don't want to mess with. Is there any danger that this workaround will cause more problems (i.e. hangs) than it fixes, or cause a problem in future releases of NetBeans? Do you have confidence that this will be a workaround we can rely on? If there is any doubt, I might prefer to take our chances with the occasional hang than make our module incompatible with future releases. Is there something else we can do to avoid getting the FolderRecognizer involved? For example, we have enabled filesystem refresh on these Web app filesystems, and we are using an instance of FolderLookup on the mounted app. Could these be factors?
Todd, please attach some other samples of deadlocks.
Is there a DataLoader that would recognize a FileObject and, after an unmount of a FileSystem, would no longer recognize it? From the description above I think there is, probably as a workaround for some other issue. Can this be a problem? Yes, it can. How would one recognize the problem? Probably by a stack trace that involves MultiFileLoader.checkCollision and then DataObject.setValid (false) - a sign that an existing DataObject is no longer recognized by its own loader. Is that faulty behaviour? Yes, it causes a deadlock. Is the problem in data systems? Yes, they are not ready to survive this situation. Or in the loader that is doing that? Actually, is it really necessary to not recognize what has already been recognized? How to fix it? Either make DataSystems more robust (a possible source of other bugs) or improve the recognition (if possible). I will attach a patch that might fix this on the data system side. If it helps, we might start considering whether to apply it or find a less dangerous solution.
Created attachment 11525 [details] Prevents calling to Nodes from Folder Recognizer Thread
Todd, could you please try Yarda's patch and let us know what the results are? The patch itself looks to me too dangerous to be put into an update release, but I would like to know whether it solves the problem or not. If it does, then I would propose including the patch in the main trunk sources and keeping it there for some time to prove that there are no serious regressions. As for your current release, I am afraid that there is no easy solution. If Yarda's patch solves the problem, or at least improves it, I would propose implementing that directly in your ClassesDataFolder and other affected DataObjects. As a side note, I would like to assure you that we are aware of these problems and are working on them. The threading model is being clarified and simplified as much as possible. The Datasystems are also being completely redesigned. But that's the future.
Created attachment 11550 [details] Several other thread dumps from hangs that we suspect are related. Some contain S1AF module classes and some do not.
>Is there a DataLoader that would recognize a FileObject and, after an
>unmount of a FileSystem, would no longer recognize it? From the
>description above I think there is, probably as a workaround for some
>other issue. Can this be a problem? Yes, it can. How would one
>recognize the problem? Probably by a stack trace that involves
>MultiFileLoader.checkCollision and then DataObject.setValid (false) -
>a sign that an existing DataObject is no longer recognized by its own
>loader. Is that faulty behaviour? Yes, it causes a deadlock. Is the
>problem in data systems? Yes, they are not ready to survive this
>situation. Or in the loader that is doing that? Actually, is it really
>necessary to not recognize what has already been recognized? How to
>fix it? Either make DataSystems more robust (a possible source of
>other bugs) or improve the recognition (if possible).
Yes, we have implemented what we call a "zombie check" in several of our loaders because we saw DataObjects being created during unmount and other invalid situations. The creation of these objects often caused the unmount to hang, and these DataObjects remained live in the IDE and seemingly caused many other problems. The zombie checks ensure that DataObjects are not created for FileObjects or FileSystems which are invalid. It usually looks something like this:
-----
protected DataObject handleFindDataObject(
        final FileObject fo, RecognizedFiles rf) throws IOException {
    try {
        if (!fo.getFileSystem().isValid())
            return null;
    } catch (FileStateInvalidException e) {
        // Ignore
        return null;
    }
    ...
-----
Once we implemented checks for zombies, we believe our module's reliability went up considerably. We assumed this was a problem with the Open API in that it was incorrectly causing re-recognition of DataObjects whose FileObjects or FileSystems were invalid. However, are you saying that our zombie check logic might be causing problems such as hangs by not recognizing previously recognized DataObjects?
Is there some other way we could protect against zombie DataObjects being created during unmount situations?
>I will attach a patch that might fix this on the data system side. If
>it helps, we might start considering whether to apply it or find a
>less dangerous solution.
I would try to apply the patch, but it is very difficult to reproduce the problem reliably. I'm not sure I would be able to say anything definitive about what I saw. Also, I have not seen the problem lately after we added more zombie checks to our loaders. Can you please advise on the wisdom of using our zombie checks?
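For readers without the IDE sources at hand, the zombie check described in this comment can be modeled in a standalone sketch. The nested types below (`FileSystem`, `FileObject`, `DataObject`, `FileStateInvalidException`) are hypothetical stand-ins declared inside the example and are not the real Open API classes. The sketch only shows the decision the check makes; it says nothing about how the platform reacts when a loader returns null.

```java
import java.io.IOException;

class ZombieCheckSketch {
    // Hypothetical stand-ins for the Open API types of the same names.
    static class FileStateInvalidException extends IOException {}

    interface FileSystem {
        boolean isValid();
    }

    interface FileObject {
        FileSystem getFileSystem() throws FileStateInvalidException;
    }

    static final class DataObject {
        final FileObject primaryFile;

        DataObject(FileObject primaryFile) {
            this.primaryFile = primaryFile;
        }
    }

    /** Returns a new DataObject, or null when the file or its filesystem is invalid. */
    static DataObject handleFindDataObject(FileObject fo) {
        try {
            if (!fo.getFileSystem().isValid()) {
                return null; // filesystem no longer valid: refuse to create a zombie
            }
        } catch (FileStateInvalidException e) {
            return null; // the file itself is no longer valid
        }
        return new DataObject(fo);
    }

    public static void main(String[] args) {
        FileObject mounted = () -> () -> true;
        FileObject unmounted = () -> () -> false;
        FileObject deleted = () -> { throw new FileStateInvalidException(); };
        System.out.println("mounted recognized: " + (handleFindDataObject(mounted) != null));
        System.out.println("unmounted recognized: " + (handleFindDataObject(unmounted) != null));
        System.out.println("deleted recognized: " + (handleFindDataObject(deleted) != null));
    }
}
```

Note the consequence raised earlier in the thread: a loader written this way stops recognizing a file it recognized before the unmount.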
"Also, I have not seen the problem lately after we added more zombie checks to our loaders." - I'm really glad to hear that you workaround it and that it works. "Can you please advise on the wisdom of using our zombie checks?" - Yarda could you comment this please?
As I wrote in my second comment, the root cause of the deadlock in this issue is the inability of some DataLoader to re-recognize something previously recognized - e.g. the zombie check. If the DataLoader did not behave in such a way, there would be no need for this issue.
The problem was worked around. Closing as WONTFIX.
Can you confirm that the disposition of this issue from the core team is that our module is causing the hang because we have faulty code in our loaders' findPrimaryFile and handleFindDataObject methods? That is, that our code is currently refusing to create new DataObjects for FileObjects from invalid FileSystems. If you confirm the above... what is your recommendation on how we should implement our loaders to deal with this unmount activity which proceeds on the FileObjects for the invalid filesystem? Can you confirm that it is as designed in the core that DataObjects will be re-recognized in the case of unmount? We would greatly appreciate some comment and expert perspective on this scenario. Again, here it is: We designed DataObjects and Loaders for our module. Everything works nicely for mounting a web application. We engage the UNMOUNT action and our DataObjects are disposed, and the loaders recreate the DataObjects all over again, but in this case functionality fails and we hang all over the place. Functionality fails because the new DataObjects find themselves running in a filesystem which is trying to disappear. We hang in the same code paths and mutex locks that we have presented in this case; it's just that we can follow the stack traces to the creation of new DataObjects. Hence, we eliminated the creation of the duplicate DataObjects; those code paths were eliminated and the hangs, most all of them, are gone. I don't remember reading anywhere that there are two conditions in which DataObjects are created: 1) regular cases and 2) pathological cases when the filesystem is invalid. If we are not supposed to balk at creating new DataObjects during unmount (as you said in your last comment), what are we supposed to do? Would you agree that if we proceeded to create the duplicate DataObjects we would have to do something different for the condition of unmount? What condition do we look for and how should our zombie DataObjects behave?
When you comment here please consider that our module is the most stable it's been now that we check for invalid filesystems across the board and deny the creation of duplicate DataObjects.
I reopened the issue so that we get closure on our questions. If you would like to close the issue it's fine by me; I would just like the questions answered and associated with the issue.
Yarda (or others), can you please answer this short list of open questions? We are still confused until we have clear answers to these: 1. Is it normal for our loaders to be asked to recreate DataObjects when FileObjects and/or Filesystems are invalid? 2. Is it normal for a filesystem unmount to cause rerecognition of DataObjects for the unmounted filesystem? 3. We see that turning off zombie checks results in NEW DataObjects being created. Is that expected? Or, would you instead expect a DataObjectAlreadyExists exception to be thrown? 4. Is returning NULL from our zombie checks the best behavior? Would it be better to throw DataObjectAlreadyExists or some other exception? 5. Do you have any other recommendations for us to avoid the problems caused by DataObjects being created for invalid FileObjects and Filesystems? Thank you.
> 1. Is it normal for our loaders to be asked to recreate DataObjects
> when FileObjects and/or Filesystems are invalid?
A DataObject can work on any filesystem, not only those mounted in the repository, and because only filesystems in the repository can be valid, it is OK for the data system to work over invalid filesystems.
> 2. Is it normal for a filesystem unmount to cause rerecognition of
> DataObjects for the unmounted filesystem?
Seems so.
> 3. We see that turning off zombie checks results in NEW DataObjects
> being created. Is that expected? Or, would you instead expect a
> DataObjectAlreadyExists exception to be thrown?
It is not possible to throw DOAExists. It can be thrown only when the constructor of DataObject fails. Trying to create new objects is fine if somebody is interested in them.
> 4. Is returning NULL from our zombie checks the best behavior? Would
> it be better to throw DataObjectAlreadyExists or some other exception?
You can either return null or try to create a new data object (which may result in a DOAExists exception, if it really exists). I am not in a position to know the best behaviour.
> 5. Do you have any other recommendations for us to avoid the problems
> caused by DataObjects being created for invalid FileObjects and
> Filesystems?
My recommendation is to not block the "Folder Recognizer" thread by waiting on Children.MUTEX - e.g. reschedule all possible calls from that thread to another one.
Yarda, yes the best solution is to not block Folder Recognizer and that's what we have to do in the long term. But how should this be solved in the short term? Is the current workaround really unacceptable or dangerous? Could you please answer question 4 from the short term point of view? Ad answer 1: yes, it is possible to have a filesystem which is not in the repository and which is then "invalid" (kind of strange naming), and Datasystems should work on it. But is this common? The API allows it, but is anybody really doing something like that? I do not think so. The threading model of DS is known to be messy, and so any threading problem is hard to solve. So IMHO if the current workaround works and tests prove that it is not causing regressions, I would accept it, live with it and properly document it in the source code.
Oops.. in "Could you please answer question 4 from the short term point of view?" I of course meant question 5.
> Yarda, yes the best solution is to not block Folder Recognizer and > that's what we have to do in the long term. Really? I thought the long term solution was to stop using the datasystems API.
Not really an option for us...<grin> As for the zombie check workaround we have in place, I think we have decided based on our empirical observations of behavior that even if it can potentially cause a hang in the Folder Recognizer, it usually doesn't, and the module is far more stable overall. Therefore, we will continue with it in place. I think the root of our issue is that we, like the Web module, are trying to provide context for DataObjects rather than simply create them on a per-file basis. This fact leads to the unfortunate problem of needing DataObjects that cannot be spuriously recreated if their context is invalid. This seems to be incompatible with current Netbeans assumptions about DataObject lifecycle, so we are basically on our own trying to make this work flawlessly.
PetrJ, sure it is. But current planning and schedules are so unclear that I would rather count on there being one more release with the current DS. For that one we could apply Yarda's patch. Todd, could you close the issue then? :-)
Workaround in place. Closing as WONTFIX.
I don't know much about the semantics of Datasystems in this case (not sure anyone does, actually), so that might be the "primary cause". However re. the threading here: Agreed that one contributing evil factor is that DataNode.fireChange is receiving an event from the folder recognizer thread - called with an implied lock, i.e. the recognition task - and then refiring an event (here, nodeDestroyed) which will surely need to acquire Children.MUTEX in a write lock, which is IMHO illegal. (Nodes/Children are close to the GUI and may block on low-level structures like Datasystems, assuming the blockage is expected to be short-lived. But not vice-versa.) No need to worry about the trunk - this deadlock should be made impossible as a result of issue #35833. (Not to say that some other problem might not arise, but at least you would not have this EQ <-> FolRec deadlock.)
Sorry to drag this open again, but I noted this interesting comment in the Netbeans FolderLookup class:
postCreationTask()
protected final Task postCreationTask(Runnable run)
Starts the creation of the object in the Folder recognizer thread. Doing all the lookup stuff in one thread should prevent deadlocks, but because we call unknown data loaders, they obviously must be implemented in correct way.
Note that this seems to fit the profile of what we are seeing on unmount--we use FolderLookup in our module, we direct it at folders that have our DataObjects in them, the hang in response to the DataObject rerecognition is a problem with the Folder Recognizer thread, and the hang happens on unmount, which is when we commonly see the FolderLookup become active and "fight" the unmounting filesystem by trying to rerecognize invalid objects. Is it possible that our use of FolderLookup is the source of (at least some of) the zombies we are seeing, and could FolderLookup's insistence on running on the Folder Recognizer thread be the problem here? If this is the case, or could be the case, we can override postCreationTask() to run in a different thread--does anyone have any recommendations for a better thread?
Argh. I *could* change FolderLookup to run postCreationTask in a different thread, to at least test my theory. That is, if it weren't marked final. This isn't the first time we've been stymied by the liberal and constraining use of final methods in the Open API. Very frustrating. I hope final methods are not part of the plan for the datasystems rewrite...
This IDE hang seems to happen often on Solaris. Stripes build 030922 on Solaris 9. To reproduce (not 100%, but repeat the steps a few times and you will get it):
1. Switch to Sun ONE Application Framework.
2. Mount the sample application, which can be obtained by unpacking the attached war file.
3. Expand the Jato Sample node, the Settings & Configuration node, and the Application Classes|jatosample|module1 node.
4. Double click the AddValuesViewBean node, ConceptIndexTiledView node, ConceptIndexViewBean node, CustomersModel node, and E0120Command node.
5. Double click to open the ConceptIndex jsp node under ConceptIndexViewBean|JSP Pages.
6. Unmount the application with Jato Sample|Unmount Application.
I sometimes saw the following NPE without an IDE hang:
java.lang.NullPointerException
at org.netbeans.modules.java.ParserAnnotation.attachToLineSet(ParserAnnotation.java:134)
at org.netbeans.modules.java.JavaEditor.processAnnotations(JavaEditor.java:449)
at org.netbeans.modules.java.JavaEditor.access$300(JavaEditor.java:77)
[catch] at org.netbeans.modules.java.JavaEditor$2.run(JavaEditor.java:297)
at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:178)
at java.awt.EventQueue.dispatchEvent(EventQueue.java:448)
at java.awt.EventDispatchThread.pumpOneEventForHierarchy(EventDispatchThread.java:197)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:150)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:144)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:136)
at java.awt.EventDispatchThread.run(EventDispatchThread.java:99)
Created attachment 11696 [details] sample war file
Created attachment 11697 [details] stack trace
This hang is identical to the other hangs attached here, which is seemingly caused by dispose() called on the Folder recognizer thread. The NPE is just a side effect and is not relevant. I never heard any response from Yarda regarding my previous comments on use of FolderLookup--a response would help us know where to go with this issue. I am attempting a workaround in the JATO module code. Can someone on the Netbeans team please confirm (or deny) whether this workaround may help (or hurt):
if (Thread.currentThread().getName().equals(
        "Folder recognizer")) // NOI18N
{
    // Do not call dispose in the folder recognizer thread.
    RequestProcessor.getDefault().post(
        new Runnable()
        {
            public void run()
            {
                dispose();
            }
        });
    return;
}
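The guard above can be exercised outside the IDE with a small standalone model. In this sketch, `WORKER` is a plain single-thread executor used as a hypothetical stand-in for RequestProcessor.getDefault(), and the `disposedOn` future exists only so the example can report which thread finally ran the disposal. Both are assumptions for illustration, not part of the module or the Open API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class DisposeGuardSketch {
    static final String RECOGNIZER = "Folder recognizer";

    // Hypothetical stand-in for RequestProcessor.getDefault(): one daemon worker thread.
    static final ExecutorService WORKER = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "Worker");
        t.setDaemon(true);
        return t;
    });

    // Records the name of the thread that actually performed the disposal.
    final CompletableFuture<String> disposedOn = new CompletableFuture<>();

    void dispose() {
        if (RECOGNIZER.equals(Thread.currentThread().getName())) {
            // Do not dispose on the recognizer thread; repost and return immediately.
            WORKER.execute(this::dispose);
            return;
        }
        // The real disposal work would go here.
        disposedOn.complete(Thread.currentThread().getName());
    }

    /** Calls dispose() from a thread named like the recognizer and reports where it really ran. */
    static String demo() {
        DisposeGuardSketch obj = new DisposeGuardSketch();
        new Thread(obj::dispose, RECOGNIZER).start();
        try {
            return obj.disposedOn.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("dispose ran on: " + demo());
    }
}
```

Because the reposted call re-enters dispose() on a different thread, the guard is skipped the second time and the disposal proceeds. Matching on a thread name is fragile by nature, so this should be read as a model of the workaround rather than a recommended pattern.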
Re. the NPE in ParserAnnotation.attachToLineSet - please file separately, for Java module. Re. use of final methods - to the contrary, any future API will have *more* things final and not subclassable or overridable. Permitting promiscuous subclassing in a public API leads (in our experience) to horrible backwards compatibility and API evolution problems, as well as a confusing API. If you want to test your theory about some code, get the source, patch for testing, and run with a patch turned on: http://nbbuild.netbeans.org/patching.html No need to support this kind of thing in the production application. Anyway this is off-topic for Issuezilla; bring it up on nbdev if you want. General principles listed here: http://openide.netbeans.org/tutorial/api-design.html#design.less.final
"Re. the NPE in ParserAnnotation.attachToLineSet" - already filed as 36032. "Can someone on the Netbeans team please confirm (or deny) whether this workaround may help (or hurt)" - I think it is OK. It is exactly what Yarda suggested in his first reply.
Thanks David. I just wanted to ask because the workaround code is in our DataObject's dispose() method; a slight variation of Yarda's suggestion. The good news is that the workaround does seem to prevent the hangs our QA team was seeing, and so far doesn't appear to have any problematic side effects (it does cause a little odd behavior during unmount, but nothing problematic).
Unfortunately, we are still seeing occasional hangs. Please see the latest thread dump, 1.txt. The interesting thing about it is that Folder recognizer thread is calling dispose() on the JavaDataObject and seemingly doing the same thing Yarda said was a problem in our DataObjects. Our module does not do anything to specialize JavaDataObjects or loaders--these are the standard objects that ship with Studio. Again, I have to ask: is this an unusual situation? Is something different about our module that the Folder recognizer thread is calling dispose() on DataObjects? Is it our use of FolderLookup that is causing this?
Created attachment 11701 [details] Hang apparently in JavaDataObject.dispose()
QA, are you able to reproduce it in our labs?
Ok, QA/I will look at it tomorrow....
Hi, I tried to reproduce the problem and was not successful. I was using Nevada with JATO installed from nbm. I tested versions 030910, 030912, 030922, 030924, and 030925. I used two machines: single processor & Solaris 8 and double processor & Solaris 9, both with JDK 1.4.2 (not all combinations of JATO version and machine were used, but all 03092* were tested on the double processor and 030925 was tested on the single processor). I used the example attached on 2003-09-23. I had the following difficulties:
1. The example does not contain WEB-INFO/jatoapp.xml, so I added one.
2. I was unable to find: "5. double click to open ConceptIndex jsp node under ComceptIndexViewBean|JSP Pages." and used "Documents/jatosample/module1/ConceptIndex.jps".
I am probably doing something wrong. If this is still a problem, could you please write more precise "steps to reproduce"? Or is it a completely random problem?
Here are the steps to reproduce the problem:
1. Save the war file (attached), or get one from the JATO installation by clicking Help|Sun ONE Application Framework(JATO) Technical Documentation, which will open a browser. Select the Sample Application link and it will show you a page where you can save the sample application war file. You may want to try the second approach, as a war file from an early build may have a different JATO library file from the build you are using.
2. Mount the directory containing the war file.
3. Unpack the war file by using the war file node's popup menu action.
4. The sample application should be mounted.
5. Switch to the Sun ONE Application Framework tab.
6. Continue with the other steps (see the previous comments).
I tried to reproduce the described behaviour, but I wasn't successful. I used a single processor machine with 1GB memory, Solaris 8, j2sdk1.4.2, S1S 5 build 030904 (030922). I encountered only the previously mentioned java.lang.NullPointerException.
Hi, I was able to reproduce the deadlock in build 030922. The thread dumps are attached to the issue for reference. I have used two processor machine, Solaris 9 and JDK1.4.2. I will try 030929. (The way of uncompressing the archive was the most important piece of the puzzle.)
Created attachment 11747 [details] Two full thread dumps of the deadlock. Build 090322.
Created attachment 11766 [details] Thread dump showing unmount hang with no JATO stack frames
I've added a thread dump attachment that shows an unmount hang that occurred without any involvement from the S1AF/JATO module code. The problem appears to be simultaneous access to the lookup and/or the FolderList.getChildrenList() method.
Yarda, please take over this issue. David K is out of ideas. This seems pretty hairy. Thanks
Ok, I was chosen to solve the issue, but I was not following its life for three weeks. Before I do that, I'd like to know if anybody tried to reproduce the issue with the patch I provided here on 2003-09-04. If anybody reproduced the deadlock with my patch applied, please write it here. Otherwise I am going to apply that patch and mark the issue as fixed. Thanks.
We have included a workaround based on your patch in the JATO module for JATO DataObjects only. The latest hang was taken from a build that included that workaround, but the thread dump does not include any JATO stack frames. We have seen thread dumps from similar hangs that did include JATO stack frames, but otherwise they look very similar to this hang. This makes me think that our workaround based on your patch has fixed any hangs caused by JATO DataObjects.
The last deadlock has been reported as separate issue 36449, as it is different. All the others (including the original report) have been fixed:
/cvs/openide/test/unit/src/org/openide/loaders/Deadlock35847Test.java,v
initial revision: 1.1
/cvs/openide/loaders/src/org/openide/loaders/DataNode.java,v <-- revision: 1.6;
Fixed also in Nevada Patch 1 and in Arrow.
verified -> todd.fast 2003-10-05