Bug 131739 - Cancel running task hangs IDE
Status: VERIFIED FIXED
Product: cnd
Classification: Unclassified
Component: Project
6.x
All Solaris
: P2 (vote)
: 6.x
Assigned To: _ gordonp
issues@cnd
61fixes1-fixed
:
Depends on:
Blocks:
Reported: 2008-04-01 16:28 UTC by Alexander Pepin
Modified: 2008-06-04 08:36 UTC (History)

See Also:
Issue Type: DEFECT
:


Attachments
messages.log zip (187.37 KB, application/octet-stream)
2008-04-01 20:42 UTC, Alexander Pepin
jvm stack (78.44 KB, text/plain)
2008-04-01 20:43 UTC, Alexander Pepin
screenshot (137.42 KB, image/jpeg)
2008-04-01 20:44 UTC, Alexander Pepin

Description Alexander Pepin 2008-04-01 16:28:33 UTC
Steps to reproduce:
- create Quote sample project
- build the project
- set in project properties "Console Type" to "Output Window"
- run the project
- while the project is running, try to cancel it via the Quote_1(Build, Run) progress bar
Result: Multiple warning windows appear without content then IDE hangs.
Comment 1 Thomas Preisler 2008-04-01 18:34:49 UTC
I'm having trouble reproducing this one.

I tried Quote and IO (both interactive apps) running in output window on both Windows XP and Windows Vista (and Mac OS X) and didn't see the hang. 
Everything worked as expected.

During development of this feature I tested interactive apps running in the output window many times and never experienced any hangs.

It doesn't mean that there isn't a problem though, just that I cannot reproduce it.

Can you please try again (after rebooting your machine) and attach stack trace and perhaps screenshot of the warnings. Who outputs them?

Can you also try on other XP machines to see if the problem is local to your machine or if it happens on other machines as well.
Comment 2 Alexander Pepin 2008-04-01 20:28:28 UTC
Indeed, when I tried to reproduce it again, I could do so only on the second attempt on my laptop and could not
reproduce it at all on another desktop. I found the following in messages.log:
msg
Caused: java.nio.channels.ClosedChannelException
	at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:91)
	at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:352)
	at org.netbeans.core.output2.FileMapStorage.flush(FileMapStorage.java:351)
[catch] at org.netbeans.core.output2.OutWriter.flush(OutWriter.java:419)
	at java.io.PrintWriter.flush(PrintWriter.java:276)
	at org.netbeans.modules.cnd.execution.NativeExecution$OutputReaderThread.run(NativeExecution.java:210)
ALL [null]: Problem writing to output file
SEVERE [global]


messages.log and a screenshot are attached.
Comment 3 Alexander Pepin 2008-04-01 20:42:45 UTC
Created attachment 59510 [details]
messages.log zip
Comment 4 Alexander Pepin 2008-04-01 20:43:23 UTC
Created attachment 59511 [details]
jvm stack
Comment 5 Alexander Pepin 2008-04-01 20:44:06 UTC
Created attachment 59512 [details]
screenshot
Comment 6 Thomas Preisler 2008-04-02 04:30:02 UTC
I can now reproduce in a debug environment on Windows Vista.

Here is what I think is happening:

CND reads output from the application process character by character until the stream is empty or there is an I/O error. For each character it calls NB's output.write() and output.flush():

OutWriter (NetBeans)
        public synchronized void flush() {
            if (checkError()) {
                return;
            }
            try {
                getStorage().flush();
                if (lines != null) {
                    lines.fire();
                }
            } catch (IOException e) {
                handleException (e);
            }
        }

NativeExecution (CND)
        public void run() {
            try {
                int read;
                
                while ((read = err.read()) != (-1)) {
                    if (read == 10)
                        output.write("\n"); // NOI18N
                    else
                        output.write((char) read);
                    output.flush();
                }
                output.flush();
            } catch (IOException e) {
                ErrorManager.getDefault().notify(ErrorManager.INFORMATIONAL, e);
            }
        }

After canceling a process, writing to or flushing the output may no longer work, but output.flush() silently handles the IOException (in handleException()), displays a dialog, and never rethrows 
the exception, so the loop keeps going.

What the root cause is is difficult to say. OutWriter should rethrow the exception. It would solve the problem. But I don't think CND is stopping all threads and processes and closing all 
streams in the right order which would probably also solve the problem.

I tried many different things in CND but most of my attempts made the situation worse and I don't have a safe and simple solution.

Instead I have implemented a flag that gets set if the user cancels the execution and causes the loop to break. It is very safe and does solve the issue with the hang and the multiple 
dialogs, but it doesn't solve the deeper problem. I will have QA test it before committing the change.
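
For readers following along, the flag approach described above can be sketched roughly as follows. This is only an illustration under assumptions, not the committed change: the class name CancellableReader, its cancel() method, and the use of a plain java.io.Writer in place of the NetBeans output writer are all hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Writer;

/** Hypothetical stand-in for the CND reader thread; not the real NativeExecution code. */
class CancellableReader implements Runnable {
    private final InputStream in;
    private final Writer output;
    // volatile so that a cancel() issued from another thread is seen by the loop
    private volatile boolean cancelled;

    CancellableReader(InputStream in, Writer output) {
        this.in = in;
        this.output = output;
    }

    /** Called when the user cancels the run (assumed entry point). */
    void cancel() {
        cancelled = true;
    }

    @Override
    public void run() {
        try {
            int read;
            // The flag is tested before every read, so the loop stops copying
            // characters once cancellation has been requested.
            while (!cancelled && (read = in.read()) != -1) {
                output.write(read);
                output.flush();
            }
        } catch (IOException e) {
            // The stream was closed underneath us; nothing more to copy.
        }
    }
}
```

Note that the flag is only examined between reads: a read() that is already blocked returns only when data arrives or the stream is closed, so the flag by itself does not interrupt a blocked thread.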
Comment 7 Jesse Grodnik 2008-04-02 15:26:50 UTC
Escalated to P1.
Comment 8 Thomas Preisler 2008-04-02 16:59:07 UTC
Alexander:
I tried your fix and it seems to be working fine. At least I did not observe either a hang or warnings with both IO and Quote. I believe your fix should be 
integrated into NB6.1.
Comment 9 Thomas Preisler 2008-04-03 03:17:18 UTC
changeset 596e59a7beea in main
details: http://hg.netbeans.org/main?cmd=changeset;node=596e59a7beea
description:
	131739 Cancel running task hangs IDE on Windows

changeset ec7f20533a93 in main
details: http://hg.netbeans.org/main?cmd=changeset;node=ec7f20533a93
description:
	131739 Cancel running task hangs IDE on Windows
Comment 10 Alexander Pepin 2008-04-03 16:45:26 UTC
I missed this while testing the patch because it was not highly visible, but I have now realized that Quote processes are
not killed when the user cancels a run. They are shown as running in "Task Manager", and we can even attach to them to
debug from the IDE. The latter can mislead a user who tries to debug Quote and runs it once again. Moreover, these
processes sometimes keep running after the IDE is closed, and then they take up to 50% of processor time (probably because the IDE
is trying to kill them; that is how I noticed the problem). So I think this issue needs additional investigation
before committing a fix into the 6.1 branch.
Comment 11 Thomas Preisler 2008-04-03 20:57:36 UTC
Changes pushed to 6.1.

I see your comment just now.....

As I said in an earlier comment, this fix is not a fix for all problems with canceling a running process. It is only meant to prevent the hang and to avoid the multiple 
dialogs which was what this bug was about.

Looking at the code the other day, I saw that not all threads are correctly stopped and not all streams are correctly closed, and it is also possible that the running process is not 
correctly stopped. I tried many things, but all attempts failed and several caused hangs, so I concluded that now is not the time to try to fix this. A fix would affect 
every run of a project and would require extensive testing on many different platforms. The 'fix' I made for this issue (131739) affects only the code path taken when you 
cancel, so it is very low risk.

I suggest you file a new (p2) bug regarding the still-running process and I can take another look. Please provide enough info so I can reproduce it.

I will close this IZ as Fixed.
Comment 12 Alexander Pepin 2008-04-07 15:49:34 UTC
verified in NB6.1 build 200804040802
Comment 13 Maria Tishkova 2008-04-22 16:01:35 UTC
I can still reproduce  the problem with canceling on Solaris.

Here is messages.log:
ALL [null]: Problem writing to output file
SEVERE [global]

msg
Caused: java.nio.channels.ClosedChannelException
        at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:89)
        at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:350)
        at org.netbeans.core.output2.FileMapStorage.flush(FileMapStorage.java:351)
[catch] at org.netbeans.core.output2.OutWriter.flush(OutWriter.java:419)
        at java.io.PrintWriter.flush(PrintWriter.java:270)
        at org.netbeans.modules.cnd.execution.NativeExecution$OutputReaderThread.run(NativeExecution.java:215)
ALL [null]: Problem writing to output file
SEVERE [global]


I think the problem is that Thomas's fix is not quite complete:

It should also check the cancel variable after the while() loop, because flush() is invoked once more there and can throw (as I can see
now) an IOException, which can lead to a hang on my Solaris machine.
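
A minimal sketch of the suggestion above, guarding the trailing flush() with the same cancel flag. The names GuardedFlushReader and cancel() are hypothetical, and a plain java.io.Writer stands in for the NetBeans output writer; this is not the code from the actual changeset.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Writer;

/** Hypothetical reader that skips the final flush once it has been cancelled. */
class GuardedFlushReader implements Runnable {
    private final InputStream in;
    private final Writer output;
    private volatile boolean cancelled; // set from the cancelling thread

    GuardedFlushReader(InputStream in, Writer output) {
        this.in = in;
        this.output = output;
    }

    void cancel() {
        cancelled = true;
    }

    @Override
    public void run() {
        try {
            int read;
            while (!cancelled && (read = in.read()) != -1) {
                output.write(read);
                output.flush();
            }
            // The trailing flush is the call singled out above: once the run
            // has been cancelled the output may already be closed, so skip it.
            if (!cancelled) {
                output.flush();
            }
        } catch (IOException e) {
            // Output or input was closed; stop quietly.
        }
    }
}
```

The check and the flush are still two separate steps, so a cancel that arrives between them is not covered by this sketch.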
Comment 14 _ gordonp 2008-04-23 00:46:44 UTC
Possible fix with http://hg.netbeans.org/main/rev/d43b7e961a15. The IOE shouldn't happen but
I don't know about the hang as neither Thomas nor I have been able to duplicate that part.

I've sent an updated jar file to Maria for her to test in Sun Studio. If she can no longer
get it to hang I'll mark the bug fixed.
Comment 15 _ gordonp 2008-04-23 15:48:59 UTC
The fix above appears to have fixed both the IOE (expected) and hang (hoped for).

Per email from Maria:
> Thanks a lot for the fix! I have tested it  on my Solaris machine and it looks
> that now I can cancel the running process without hang and exceptions.

Closing as fixed.
Comment 16 Maria Tishkova 2008-04-23 17:05:51 UTC
Gordon, I was able to reproduce this again with the latest fix you sent me.
It looks like we have the following situation:
inside the while loop we are not yet canceled; then we are canceled, and flush() is still invoked.
It looks like some synchronization problems remain.


the same message as it was before in messages.log:

msg
Caused: java.nio.channels.ClosedChannelException
        at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:89)
        at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:350)
        at org.netbeans.core.output2.FileMapStorage.flush(FileMapStorage.java:351)
[catch] at org.netbeans.core.output2.OutWriter.flush(OutWriter.java:419)
        at java.io.PrintWriter.flush(PrintWriter.java:270)
        at org.netbeans.modules.cnd.execution.NativeExecution$OutputReaderThread.run(NativeExecution.java:215)
ALL [null]: Problem writing to output file
SEVERE [global]
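
The situation described in this comment is a check-then-act race: the flag is tested, the cancel happens, and only then does flush() run. One possible way to close that window, shown purely as a sketch and not as what the eventual fix did, is to perform the check and the flush under the same lock that the cancelling thread takes. GuardedOutput and both of its methods are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hypothetical wrapper that makes "check cancelled, then flush" one atomic step. */
class GuardedOutput {
    private final Object lock = new Object();
    private final Writer output;
    private boolean cancelled; // guarded by lock

    GuardedOutput(Writer output) {
        this.output = output;
    }

    /** Marks the output as cancelled and closes it while holding the lock. */
    void cancel() {
        synchronized (lock) {
            cancelled = true;
            try {
                output.close();
            } catch (IOException e) {
                // Best-effort close; the output is being discarded anyway.
            }
        }
    }

    /** Flushes unless cancelled; returns false when the flush was skipped. */
    boolean flushIfActive() {
        synchronized (lock) {
            if (cancelled) {
                return false;
            }
            try {
                output.flush();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return true;
        }
    }
}
```

The trade-off is that cancel() has to wait for a flush that is already in progress, so this only helps if flush() itself cannot block indefinitely.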
Comment 17 _ gordonp 2008-04-23 17:11:32 UTC
Maria, was this still on Solaris? If so, was it S10 or an older version? Also, single or multi-CPU?
Comment 18 Maria Tishkova 2008-04-23 17:18:52 UTC
Solaris:

uname -a
SunOS masha 5.11 snv_79a i86pc i386 i86pc

1 proc
Comment 19 _ gordonp 2008-04-24 17:00:11 UTC
Marking as fixed per email from Maria and Alexander.
Comment 20 Alexander Pepin 2008-04-24 17:46:56 UTC
verified in sstrunk build based on NB6.1 with provided jar file.
Comment 21 jinb 2008-05-03 16:09:26 UTC
fix backported into release61_fixes branch
changeset:   77533:be7dcc3ac528

