Bug 226051 - RequestProcessor awaitTermination() and isTerminated() wrongly return true.
RequestProcessor awaitTermination() and isTerminated() wrongly return true.
Status: STARTED
Product: platform
Classification: Unclassified
Component: -- Other --
7.2.1
PC Linux
: P3 with 1 vote (vote)
: 8.0
Assigned To: Jaroslav Tulach
issues@platform
:
Depends on: 238643
Blocks:
  Show dependency treegraph
 
Reported: 2013-02-13 10:45 UTC by AxelVoitier
Modified: 2014-01-14 11:49 UTC (History)
2 users (show)

See Also:
Issue Type: DEFECT
:


Attachments
Patch with test (5.31 KB, patch)
2013-07-23 00:43 UTC, _ tboudreau
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description AxelVoitier 2013-02-13 10:45:02 UTC
Hello,

I am trying to find a way to wait all the task I pushed to a RequestProcessor finish their job.

The RequestProcessor is created with a throughput corresponding to the number of CPU available (8 in my case).

Over several thousands tasks are sent to execution using the post(Runnable run) method.

According to the documentation of RequestProcessor and ExecutorService, I should be able to make this wait like this:

> myRequestProcessor.shutdown();
> while(!myRequestProcessor.awaitTermination(100, TimeUnit.MILLISECONDS)) {
>     /* Check for cancellation */
> }

However this does not work. It seems awaitTermination() (and isTerminated() as well actually) always promptly (< 1 millisec) return true.


I had a look in the code of the RequestProcessor class. I found out these functions return true because the list of Processors they get from the collectProcessors() method is always empty.
Going through the lifetime of the processors field of the RequestProcessor class, this internal list of processors does get populated. However, they do not belong to calling RequestProcessor at the time collectProcessors() is called. Actually, they belongs to nothing (source field of these Processors object is null).

The source field of these Processors object is correctly set to the RequestProcessor thanks to a call to proc.attachTo(this) in the enqueue() method.
But this source field is later on set to null by the run() method of the Processor class. And I think this is why the Processors are no longer found belonging to the RequestProcessor. And so the collectProcessors() returns an empty list. And so the awaitTermination() and isTerminated() promptly return true while there are still a lot of task being processed.


Is the shutdown/termination logic of ExecutorService followed and supported in your implementation? In that case there is a bug here.
Otherwise, is there another way to wait a RequestProcessor finish to execute all its tasks? (I would prefer not using an ugly while(checkCondition()) Thread.sleep(100); )


Checked on a 7.2.1 / rev 36bb0f394df1 of the RequestProcessor file.


Cheers,
Axel Voitier
Comment 1 AxelVoitier 2013-02-13 11:01:10 UTC
In addition, this logic of belongsTo(this) is spread everywhere in awaitTermination() and isTerminated().

If I comment out these checks, the methods do their correct job. And the waiting logic I described in my first comment is working fine.

But I guess if these checks where here it was for a reason. It is just that in my case the processors field only contains the Processors object my tasks are using.
Byu the way, how could this processors field contains Processors that are not belonging to the current RequestProcessor object?!


Cheers,
Axel Voitier
Comment 2 Ondrej Vrabec 2013-06-28 11:11:23 UTC
jardo, can you please take a look?
Comment 3 Jaroslav Tulach 2013-07-22 13:59:47 UTC
This part of RequestProcessor was written by Tim. Tim, can you look at the issue, evaluate and either fix or assign back. Thanks.
Comment 4 _ tboudreau 2013-07-23 00:38:23 UTC
The bug is that RequestProcessor.Processor.run() sets the source field to null too early, before the current work has run.

Moving the
  source = null
line to the bottom of the outer for(;;) loop fixes it.
Comment 5 _ tboudreau 2013-07-23 00:43:15 UTC
Created attachment 137591 [details]
Patch with test

The attached patch fixes the problem;  all of the org.openide.util tests pass.

I attempted to run commit-validation, but commit-validation hangs indefinitely for me with or without the changes.  So, giving this back to Jarda rather than pushing the change.

This probably *should* be tested very exhaustively - it alters the behavior of one of the most frequently run sections of code in all of the platform.

If any code was relying on it appearing that there are more request processor threads available then are actually available (because the source field being null is used to imply that the thread is available for new work when it is still busy), it will probably break.
Comment 6 _ tboudreau 2013-07-23 00:43:53 UTC
Passing back to you, Jarda.
Comment 7 Jaroslav Tulach 2013-07-23 07:45:26 UTC
The patch looks promising, thanks Tim!
Comment 8 Jaroslav Tulach 2013-07-23 08:10:36 UTC
I think the patch is OK. Applied as ergonomics#22a514df137c - we'll see what the ergonomics builder says after running all its tests.
Comment 9 Jaroslav Tulach 2013-08-06 06:34:27 UTC
Looks like the fix was not safe. It seems to cause a lot of failures due to starvation in FolderInstance and FolderLookup. In period Jul 24 to Aug 5 only one from all ergonomics build succeeded. Usually waiting on a RequestProcessor.Task at FolderInstance.java:332, while the "Folder Recognizer" thread is empty and waits for something to do.

E.g. the previous fix somehow messes with finished state of a task and contains race condition. I need to revert it (53759f2bbb5e) and disable the test for now: 61d48098070b
Comment 10 Quality Engineering 2013-08-08 02:29:50 UTC
Integrated into 'main-silver', will be available in build *201308072300* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)

Changeset: http://hg.netbeans.org/main-silver/rev/22a514df137c
User: Jaroslav Tulach <jtulach@netbeans.org>
Log: #226051: Applying Tim's patch for awaitTermination. Thanks for the donation.
Comment 11 _ tboudreau 2013-08-08 03:15:19 UTC
If you can create a simple test which fails with the patch but passes without it, I'm happy to try to refine the solution.

It sounds like something is depending on a side effect of the source field being nulled before the method has run to completion.  It did appear that the null test is used to decide if new work should queued for that thread.

Might be able to work around it by storing the source field in another field before nulling it, and using that to test if it is completed (doing that also could easily create a memory leak if the new field isn't nulled when possible).  Really, doing that might just move the race condition with awaitTerminated() to any work that gets queued for the running thread.

I'd be tempted to rewrite RequestProcessor from scratch, wrappered over an ExecutorService and using some other helpful classes from java.util.concurrent which did not exist when it was written.  It would probably be 10% of the current code size.  But at the same time, other than this issue, it is well-tested and works.
Comment 12 Jaroslav Tulach 2013-12-18 16:31:25 UTC
There was a race condition discovered recently by Martin. Maybe it affected behavior with this patch as well. Will try to re-integrate it again and see if the build gets flaky over Christmas. Making P2 so I don't forget.
Comment 13 Jaroslav Tulach 2014-01-14 11:49:53 UTC
I guess I missed the bug anyway during the Christmas time frame. Back to P3.


By use of this website, you agree to the NetBeans Policies and Terms of Use. © 2012, Oracle Corporation and/or its affiliates. Sponsored by Oracle logo