Please use the Apache issue tracking system for new NetBeans issues (https://issues.apache.org/jira/projects/NETBEANS0/issues) !!
Bug 249305 - Slow evaluation of Watch variables during debug
Slow evaluation of Watch variables during debug
Status: STARTED
Product: debugger
Classification: Unclassified
Component: Java
8.0.2
PC Windows 7
: P2 with 4 votes (vote)
: Dev
Assigned To: Martin Entlicher
issues@debugger
: PERFORMANCE
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2014-12-11 22:21 UTC by NukemBy
Modified: 2016-11-23 03:18 UTC (History)
2 users (show)

See Also:
Issue Type: DEFECT
:


Attachments
Watch Vars performance (netbeans snapshot) (385.74 KB, application/octet-stream)
2014-12-11 22:21 UTC, NukemBy
Details
Sample standalone test case (2.65 KB, application/octet-stream)
2014-12-11 22:22 UTC, NukemBy
Details
Video showing that evaluation in debugging session works fine in 8.1 development build. (6.15 MB, video/ogg)
2014-12-12 10:20 UTC, Jiri Kovalsky
Details
PID-7768 HotThreads.zip (58.55 KB, application/x-zip-compressed)
2014-12-15 17:18 UTC, NukemBy
Details
PerfOfVarEval.zip (2.60 KB, application/x-zip-compressed)
2015-02-24 17:59 UTC, NukemBy
Details
This is video of how it works for me in real project, not fine :( (123.08 KB, video/x-matroska)
2015-03-03 13:46 UTC, NukemBy
Details
selfsampler-evaluation-of-vars for NB 20150905 (92.51 KB, application/octet-stream)
2015-09-06 12:38 UTC, NukemBy
Details
non-cached-String-class.png (62.28 KB, image/png)
2016-10-19 16:32 UTC, NukemBy
Details
evaluation-of-breakpoint-expression-in-loop.png (41.68 KB, image/png)
2016-10-19 16:33 UTC, NukemBy
Details
getTreeCellRendererComponent-getDeclaredField (83.26 KB, image/png)
2016-10-22 14:26 UTC, NukemBy
Details
HtmlLabelUI-getPreferredSize (42.28 KB, image/png)
2016-10-22 14:28 UTC, NukemBy
Details
jpda-debugger-perf-hacks.zip (16.53 KB, application/x-zip-compressed)
2016-10-30 17:28 UTC, NukemBy
Details
Newly added files missing in patch (4.58 KB, application/x-zip-compressed)
2016-10-30 22:32 UTC, NukemBy
Details
LruCache vs. LinkedHashMap as LRU (8.01 KB, application/x-zip-compressed)
2016-11-21 22:45 UTC, NukemBy
Details

Note You need to log in before you can comment on or make changes to this bug.
Description NukemBy 2014-12-11 22:21:28 UTC
Created attachment 151053 [details]
Watch Vars performance (netbeans snapshot)

Netbeans is awfully slow with evaluation of variables during debug - in my production project it takes around 1+ second to evaluate 1 entry in the Hashmap - so introspection of hashmap of 20 entries takes 20+ seconds of wait time :(

I had that level of performance all the time with all versions of NetBeans - just can't stand it anymore.

I've checked absolutely same scenario in current version of IDEA Community Edition (detached NetBeans from the process and attached IDEA to the same one) - IDEA shows values almost instantly - so the problem is not in target JVM or running java process.

Unfortunately I do not have standalone test environment to reproduce that severe problem, but I've attached self-profiler report for reference. It took 37 seconds to render grid with 16 lines of HashMap entries - 2 seconds per line :(  

During evaluation most of the CPU time goes to the process being debugged (it actually completely consumes one of 4 CPU cores - TaskManager shows load around 25-30%) and of course JVM is in the paused state at that time.

The standalone sample is also attached, but it does not show so drastic difference - expanding the "map" variable in Netbeans takes around 0.5 sec ("String Value" column is also displayed in my configuration), IDEA - works in the flick of an eye. I can't tell what is the difference in real project - of course it is muuuuch bigger, eats 2GB of RAM while running and so on ... but how that would affect performance of evaluation of the Watch variable!? You can get high CPU usage in NB in that dummy case too - by expanding "map" and scrolling down - newly visible values will be evaluated on the fly with rather significant delay and rather high CPU load.

PS: of course custom variable formatters are enabled - Tools -> Options -> Java -> Java Debugger -> Variable Formatters : Default Map.Entry formatter looks like this
   getKey()+" => "+getValue(

However after disabling all custom formatters raw member variables are also displayed slowly.
Comment 1 NukemBy 2014-12-11 22:22:22 UTC
Created attachment 151054 [details]
Sample standalone test case
Comment 2 Jiri Kovalsky 2014-12-12 10:20:21 UTC
Created attachment 151069 [details]
Video showing that evaluation in debugging session works fine in 8.1 development build.

Product Version: NetBeans IDE Dev (Build 201412111033)
Java: 1.8.0_25; Java HotSpot(TM) 64-Bit Server VM 25.25-b02
Runtime: Java(TM) SE Runtime Environment 1.8.0_25-b17
System: Linux version 3.13.0-24-generic running on amd64; UTF-8; en_US (nb)

Hi Sergey, I have tried to reproduce your problems, but it all worked like a charm in my Linux Mint 17 - see attached video.

This is Intel Dual Core i5 @ 2.67 GHz with 4 GB RAM
Comment 3 NukemBy 2014-12-12 11:06:33 UTC
My environment is Window Server 2008 R2 (== Windows 7) and Java 7 (1.7.0_71-b14), 
but it seems that performance in your environment is similar to mine - i have i5-3470 (which is 4-core - http://www.cpu-world.com/CPUs/Core_i5/Intel-Core%20i5-3470.html).

I see in your video that values are evaluated with significant latency too - try expanding [0..99] and scrolling down to bottom (using scroll thumb, not mouse wheel), you'll notice that last values will be displayed with significant delay - check the CPU load at that time, I bet CPU will quite saturated - can you tell what can cause 100% usage of one core for just enumerating the Map<> and getting toString() of its entries. IDEA does the same with no CPU load at all ..

I mentioned in the original message that this sample demo indeed does not have that level of poor performance i have in real project (which I of course cannot send to you), but it would be quite enough to check were the CPU usage goes from.
Comment 4 Jiri Kovalsky 2014-12-12 11:48:06 UTC
Yes, when I expand any of the hundreds e.g. 0-99 and quickly scroll down to the last items 89 - 99, it displays in blue "Evaluating...", CPU takes 100% for a second and then all entries are sequentially updated. The reason is that NetBeans reads values one after the other so even if you only have displayed say 89 - 99, preceding values 0 - 88 are first processed and then 89 - 99 are updated too.

Martine, is this operation that expensive to occupy all CPU resources? And couldn't we only update displayed values?
Comment 5 NukemBy 2014-12-15 17:18:22 UTC
I've run a HotThreads detector tool (mentioned in https://netbeans.org/bugzilla/show_bug.cgi?id=229076) against NetBeans during slow evaluation of vars - it says that all CPU time goes to

1. 25% -> 'AWT-EventQueue-0' thread - hot points are below this methods

javax.swing.RepaintManager.prePaintDirtyRegions(RepaintManager.java:718)
and
org.netbeans.modules.debugger.ui.views.debugging.DebuggingViewComponent$ViewRefresher.run(DebuggingViewComponent.java:1017)


2. 45% -> N * 'JPDA Debugger' threads - hot points are below this method

org.netbeans.modules.debugger.jpda.ui.models.VariablesFormatterFilter.getValueAt(VariablesFormatterFilter.java:467)


Additional snapshot is attached.
Comment 6 NukemBy 2014-12-15 17:18:50 UTC
Created attachment 151117 [details]
PID-7768 HotThreads.zip
Comment 7 NukemBy 2014-12-15 21:03:08 UTC
And one more, very sad thing - rather regularly debugging sessions ends up with crash of the target JVM (I connect via socket), following is logged to console

JDWP exit error JVMTI_ERROR_INVALID_CLASS(21): on checking for an array class [../../../src/share/back/util.c:1297]
FATAL ERROR in native method: JDWP on checking for an array class, jvmtiError=JVMTI_ERROR_INVALID_CLASS(21)

I fill it is caused by evaluation of the variables and ... IDEA is able to debug that same process endlessly without the crash, so ... yes, there is most probably some bug in JVM (it shouldn't crash) but it seems there is the way to debug without crashes ...
Comment 8 NukemBy 2015-02-24 17:59:18 UTC
Created attachment 152189 [details]
PerfOfVarEval.zip
Comment 9 NukemBy 2015-02-24 18:01:20 UTC
I’ve digged more details on this issue. I’ve measured performance of evaluation of vars in Neabeans vs. IDEA and here is the result ...

The way of measurements is – set conditional breakpoint in the cycle, which is always ‘false’. It will be re-evaluated by IDE with each cycle. Performance is:

Condition: map.size()  < 0 
NetBeans	- 207.64 /sec
IntelliJ IDEA	- 894.45/sec, ~4.4 times faster

Condition: map.entrySet().iterator().next().getValue() == null
NetBeans	- 155.98/sec
IntelliJ IDEA	- 358.81/sec, ~2.2 times faster

Condition: "xxx".equals(map.entrySet().iterator().next().getValue().toString())
NetBeans	- 111.96/sec
IntelliJ IDEA	- 258.60/sec, ~2.1 times faster

Condition: "xxx".equals(map.entrySet().iterator().next().getValue().toString()) --> at stack depth = 50
NetBeans	- 102.91/sec
IntelliJ IDEA	- 255.30/sec, ~2.5 times faster

I have no idea how this is implemented inside, but …  
- IDEA is significantly faster on simple expressions (~4.4 times) 
- difference is significant but not so dramatic on complex expressions (~2.3 times).
- it appears that stack depth has somewhat negative impact only on NetBeans.

I’ve attached MAVEN project for above mentioned test (PerfOfVarEval.zip).

However the major thing, which unfortunatelly cannot be covered by this test – I have major issue in NetBeans with inspection of content of the Maps and List, i.e. viewing multiple values at ones. It is obvious that NB evaluates values one-by-one whereas IDEA seems to be evaluating multiple values in batch – i.e. 100 records at once or so.
Comment 10 NukemBy 2015-03-03 13:46:07 UTC
Created attachment 152365 [details]
This is video of how it works for me in real project, not fine :(
Comment 11 Martin Entlicher 2015-09-03 12:44:35 UTC
I've identified one problem with waiting for AWT event queue when searching for a property editor.
It's cached now in changeset:   291582:c030042309c7
http://hg.netbeans.org/core-main/rev/c030042309c7
Comment 12 Martin Entlicher 2015-09-03 16:06:56 UTC
Some time is spent in retrieval of the formatter from the class type.
I've improved it by caching the formatters: changeset:   291587:eec8ad3f2fca
http://hg.netbeans.org/core-main/rev/eec8ad3f2fca
Comment 13 Martin Entlicher 2015-09-03 16:36:55 UTC
Evaluation is performed by multiple threads. In case of method calls, they need to be synchronized and therefore some threads are always waiting for others. But this might not worsen much the overall performance.

A lot of time is spent in painting, it might be worth to analyse the table refreshes and their impact.
Comment 14 Martin Entlicher 2015-09-03 16:48:22 UTC
In both condition evaluation and variables evaluation, significant time is spent in expressions parsing. Some caching of parser result may help.
Comment 15 Quality Engineering 2015-09-05 03:03:40 UTC
Integrated into 'main-silver', will be available in build *201509050002* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)

Changeset: http://hg.netbeans.org/main-silver/rev/c030042309c7
User: mentlicher@netbeans.org
Log: #249305: Cache the presence of property editor for individual classes.
Comment 16 NukemBy 2015-09-06 12:28:11 UTC
I've checked the result on my project - rendering of 8 records from Hashmap, actually table of 8 records * 'type' + 'value' + 'string value, because i visually see that each cell in output is calculated and rendered separately:

NB 20150831
Comment 17 NukemBy 2015-09-06 12:37:47 UTC
I've checked the result on my project - rendering of 8 records from Hashmap, actually table of 8 records * 'type' + 'value' + 'string value, because i visually see that each cell in output is calculated and rendered separately:

  - NB 20150831 - 17 sec
  - NB 20150905 - 4 sec

Performance improvement is really good, but ... i've checked exactly the same case in IDEA 14.1.1. Community Edition and it renders more records (around 20 of them fits in window) in something around 0.2 sec (really quick, i can't measure that with reasonable precision). So ... there is still something wrong.

I'm attaching current selfprofiler. 

I've checked CPU usage during rendering and i see following (approximatelly):

  - target process  - 23%
  - NetBeans        - 10%
Comment 18 NukemBy 2015-09-06 12:38:51 UTC
Created attachment 155923 [details]
selfsampler-evaluation-of-vars for NB 20150905
Comment 19 Martin Entlicher 2015-09-23 13:47:29 UTC
Thanks for testing the improvements done so far.
Further performance tuning would require bigger changes, which are risky before the approaching code-freeze. I hope that for 8.1 it's in acceptable limits, we'll work on further improvements into the next version.
Comment 20 NukemBy 2016-10-19 16:32:36 UTC
Created attachment 162557 [details]
non-cached-String-class.png
Comment 21 NukemBy 2016-10-19 16:33:45 UTC
Created attachment 162558 [details]
evaluation-of-breakpoint-expression-in-loop.png
Comment 22 NukemBy 2016-10-19 17:03:38 UTC
I see fresh new 8.2 was released with this bug still in-place :(

I've self-profiled NetBeans more and I fell there is not so much needs to be done to fix it.

Problem #1:
http://hg.netbeans.org/main/file/63c43da97b48/debugger.jpda/src/org/netbeans/modules/debugger/jpda/expr/EvaluatorVisitor.java#l2373
https://bugzilla-attachments-249305.netbeans.org/bugzilla/attachment.cgi?id=162557

        if (value instanceof String) {
            StringReference str = createStringMirrorWithDisabledCollection((String) value, vm, evaluationContext);
            ClassType strClass = (ClassType) vm.classesByName("java.lang.String").get(0);
            try {
                List<Value> args = Collections.emptyList();
                return invokeMethod(arg0, strClass.methodsByName("intern").get(0),
                                    false, strClass, str, args, evaluationContext, false);
            } catch (Exception ex) {
                return str;
            }
        }

    it looks like someone expects that class "java.lang.String" will be frequetly refresdhed in target VM and therefore rescans target VM for each evaluation of string ...
    I believe "String" class will not go from any other place than JDK (i.e. it is once and for ever) and anyway - instance #0 is always used unconditionally.
    
    ... so, i think 'strClass' should be cached for entired lifetime of debugger connection to target VM. 
    
    Additionally, i'm not sure ... if 'intern' is needed here at all?
    
    
Problem #2:
http://hg.netbeans.org/main/file/63c43da97b48/debugger.jpda/src/org/netbeans/modules/debugger/jpda/expr/EvaluatorVisitor.java#l2224
https://bugzilla-attachments-249305.netbeans.org/bugzilla/attachment.cgi?id=162558

        private ReferenceType findEnclosingType(ReferenceType type, String name) {
            if (type.name().equals(name)) {
                return type;
            }
            List<ReferenceType> classes = type.virtualMachine().classesByName(name);
            if (classes.size() == 1) {
                return classes.get(0);
            }
            for (ReferenceType clazz : classes) {
                if (isNestedOf(clazz, type)) {
                    return clazz;
                }
            }
            return null;
        }

    There is cool feature in NetBeans - http://wiki.netbeans.org/FaqDebuggingPrints, but it becomes useless when one needs to log some frequently happening operations - because throughput which Netbeans is able to perform is around 2 msg/sec (performance somehow depends on set of opened projects). Having 5000 messages to be logged for analysis - i have to go for lunch while NetBeans is doing this :(

    Steps to reproduce: see this issue https://netbeans.org/bugzilla/show_bug.cgi?id=250470
      - Start 2 instances of NetBeans (via 'netbeans.exe "--userdir [Folder Path]"')
      
      - Start instance #2 with debug enabled: -J-Xdebug "-J-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5006" 
      
      - Instance #1 - attach as debugger for instance #3 on port 5006, open sources of NetBeans project "java.source.base"
        and set a breapoint on this line
        http://hg.netbeans.org/main/file/63c43da97b48/java.source.base/src/org/netbeans/modules/java/source/parsing/FolderArchive.java#l143
        
        Suspend:    no thread = Continue
        Print Text: getFile - f: {= folderName }, e: {= entry == null ? null : entry.getURL().toString() }, k: {= kinds.toString() }
        
      - In instance #2, right click in BreakPoints window -> new BreakPoint -> Exception: start typing anything in field "Exception Class Name"
      
      - See what is happening in "Output - Debugger Console" of Instance #1. Self-profiler can also be started there.
    
    Most probably the root cause for this is again - lack of caching. What concerns this particular use-case - unlike Watch variables, expression to be evaluated in context of breakpoint always occurs in the same line - this means - in the same context, and therefore almost everyting can be cached there - including the pre-parsed expression to be evaluated (to avoid unneded 5000 times re-parsing of the same staff).
    

I guess both caches should be cleaned after command "Debug -> Apply Code Changes".

Setting priority to P2 hoping that this really annoying issue will get fixed ealier than in 1 year.
Comment 23 NukemBy 2016-10-20 15:37:17 UTC
Re: Having 5000 messages to be logged for analysis - i have to go for lunch while NetBeans is doing this :(

There is another NetBeans' feature which is also hardly usable - conditional breakpoints. When I need to check what is happening in the loop when specific condition occurs I setup a conditional beakpoint, but ... because evaluation of condition is so slow (because of the same above-mentioned problem) - target program almost stops. In most of the cases I prefer hardcoding condition into program code and setting unconditional breakpoint like this one 
'if( check condition ) { 
  do some use-less action to set a breakpoint here
}'
Comment 24 NukemBy 2016-10-22 14:26:14 UTC
Created attachment 162597 [details]
getTreeCellRendererComponent-getDeclaredField
Comment 25 NukemBy 2016-10-22 14:28:00 UTC
Created attachment 162598 [details]
HtmlLabelUI-getPreferredSize
Comment 26 NukemBy 2016-10-22 14:36:24 UTC
Unfortunatelly, above mentioned problems are only 1/3 of wasted CPU performance.

I've implemented quick&dirty fixes in local copy of NetBeans for the most obvious cases, but it is far from enough.
I've benchmarked the result i'm having so far on the following use-case in my actuall application.

Code context: text parser, parsing text of around 7K in size.
Use-case: set a breakpoint in the middle if the cycle with following conditions
  - Condition:   text.nextWord().contains("something")
  - Suspend:     no threads
  - Print text:  i'm here
i.e. "do not break", "just evaluate condition and continue, print something to output"
 
The results are:

    NetBeans (orig)  249 sec  /* 4:09 min */
    NetBeans (my)    151 sec  /* 2:30 min */
    EDEA (16.2)       71 sec  /* 1:11 min */
    -- no debug --    20msec  /* evaluation of condition is hardcoded */

I suspect IDEA is rather close to best possible performance - according to metrics I see in NetNeans' self profiler, rather much time is the actual evaluation of expression in target VM.

One of the major CPU consuming re-evaluations which can't be cached in current JPDA architecture in NetBeans is ParseExpressionTask in
    debugger.jpda.projects\src\org\netbeans\modules\debugger\jpda\projects\PreferredCCParser.java
    
    ... nevertheless it is executed in 'static' context (i.e. evaluationContext.getContextObject() == null) - it requires some pre-initailziation happening few stack frames ealier and internally generated results are bound to this initialization step, therefore are 'pseudo-dynamic' and can't be right-away reused in next evaluation - so nevertheless result of parsing is "always the same" in terms of syntax tree, actual instances are always new & different. 
        

So ... i have to admit that 'quick' solution is only a 'partial' success. To make it acceptably good - significant refactoring of all JPDA staff is needed. Please do it sooner than later. This performance issue is really pain in the @#$% already for many years :(

Please take following use-cases into consideration when refactoring:
---------------------------------------------------------------------

    1. Inspection of huge Map or List instances (BTW - it seems like 'quick fixes' already provide not 'goot', but 'acceptable enough' performance). 

    As far as I noticed performance significantly depends on source file. NetBeans likes to re-evaluate everything and therefore evaluation of debug var within file with plenty of external imports will be significantly slower than in a dummy mock test without imports and with simple Map filled by just 10000 of String->String entries.

    2. Evaluation of 'Condition' and 'Print Text' expressions for breakpoints happening in the loop.

    Of course this anyway will not work acceptably quick for cases of millions of iterations, but current performance is not acceptable even for few thousands.


I've also found some strange code, which also significantly affect performance of UI staff, but is not releated to JPDA directly:

1. Havy usage of reflection - it is really slow, when talking about thousand of invocations (eats ~3-5% time). 
   Especially I see no sense in it when one NetBeans' class is used that way from another NetBeans' class ...

    Occurences (multiple per each file):
    http://hg.netbeans.org/main/file/9df997191fc8/debugger.jpda.projects/src/org/netbeans/modules/debugger/jpda/projects/PreferredCCParser.java#l668
    http://hg.netbeans.org/main/file/9df997191fc8/debugger.jpda.ui/src/org/netbeans/modules/debugger/jpda/ui/SourcePath.java#l773
    http://hg.netbeans.org/main/file/9df997191fc8/debugger.jpda/src/org/netbeans/modules/debugger/jpda/EditorContextBridge.java#l165
    http://hg.netbeans.org/main/file/9df997191fc8/spi.debugger.ui/src/org/netbeans/modules/debugger/ui/views/debugging/DebugTreeView.java#l172
    
    Why just not use 'proper' interfaces and module dependencies?

2. 'Lookups' - there are plenty of them and they typically go deep into other coder, which is in rather common case is synchronized and waiting for something else ...
    So - result better be cached when possible.
    
    For example this method is invoked rather frequently, is pretty heavy inside, but there is not much sense to refresh it more frequently than once per second (not 1000/sec)
    http://hg.netbeans.org/main/file/9df997191fc8/java.source.base/src/org/netbeans/api/java/source/SourceUtils.java#l724
    
2. getTreeCellRendererComponent-getDeclaredField - affects performance of scrolling throgh Maps/Lists
    See screentshot https://netbeans.org/bugzilla/attachment.cgi?id=162597 
    
3. HtmlLabelUI-getPreferredSize - the same
    See screentshot https://netbeans.org/bugzilla/attachment.cgi?id=162598

4. Not a performance issue
    http://hg.netbeans.org/main/file/9df997191fc8/debugger.jpda/src/org/netbeans/modules/debugger/jpda/SourcePath.java#l100
    http://hg.netbeans.org/main/file/9df997191fc8/debugger.jpda/src/org/netbeans/modules/debugger/jpda/SourcePath.java#l355

    why not implement via more obvious iteration over internal List<SourcePathProvider> and avoid code-duplication-redundancy like here?
    http://hg.netbeans.org/main/file/9df997191fc8/debugger.jpda/src/org/netbeans/modules/debugger/jpda/SourcePath.java#l367
    http://hg.netbeans.org/main/file/9df997191fc8/debugger.jpda/src/org/netbeans/modules/debugger/jpda/SourcePath.java#l391

5. These 2 classes (and few others) look like copy/paste - i.e. changing behaviour in one place most probably will require identical change 'somehere there in other file' and at some time will be 'lost in space'
   
   http://hg.netbeans.org/main/file/9df997191fc8/debugger.jpda/src/org/netbeans/modules/debugger/jpda/SourcePath.java 
   http://hg.netbeans.org/main/file/9df997191fc8/debugger.jpda.ui/src/org/netbeans/modules/debugger/jpda/ui/SourcePath.java
Comment 27 Martin Entlicher 2016-10-26 13:43:53 UTC
Thanks a lot for your analysis, NukemBy.
I'll go through your findings...
Comment 28 NukemBy 2016-10-26 14:17:42 UTC
It seems my hacks work pretty stable in my local build of NetBeans. I hope I'll get some time during weekend to make patch out of them and share it here.
Comment 29 Martin Entlicher 2016-10-27 16:28:35 UTC
I've added caching of some of Java platform classes. It addresses the problem of repetitive loading of platform classes:
http://hg.netbeans.org/core-main/rev/60ca17bd45d4
The intern() is necessary for the correct result, the String constants are interned in the VM.
Comment 30 Quality Engineering 2016-10-29 02:20:12 UTC
Integrated into 'main-silver', will be available in build *201610290002* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)

Changeset: http://hg.netbeans.org/main-silver/rev/60ca17bd45d4
User: mentlicher@netbeans.org
Log: #249305: Added caching of some of Java platform classes in the VM.
Comment 31 NukemBy 2016-10-30 17:28:46 UTC
Created attachment 162692 [details]
jpda-debugger-perf-hacks.zip
Comment 32 NukemBy 2016-10-30 17:31:37 UTC
I've attached my local hacks as patch - https://netbeans.org/bugzilla/attachment.cgi?id=162692

Patch is made on-top of this revision http://hg.netbeans.org/main/rev/10cc566d91aa
Which i beleive is last commit for Release 8.2.

There is LruCache provided as separate file. I use it to implement cache. It is the same file already included into patch I've provided here: https://netbeans.org/bugzilla/show_bug.cgi?id=268744
Comment 33 NukemBy 2016-10-30 22:32:52 UTC
Created attachment 162695 [details]
Newly added files missing in patch
Comment 34 Martin Entlicher 2016-10-31 14:09:14 UTC
Thanks a lot for the patch.
I will need to be more conservative with the API changes, but will try to do the best...
Comment 35 Martin Entlicher 2016-11-11 13:18:09 UTC
I've added caching of enclosing types:
changeset:   302161:d0353878b325
http://hg.netbeans.org/core-main/rev/d0353878b325
Comment 36 Quality Engineering 2016-11-13 02:52:47 UTC
Integrated into 'main-silver', will be available in build *201611130001* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)

Changeset: http://hg.netbeans.org/main-silver/rev/d0353878b325
User: mentlicher@netbeans.org
Log: #249305: Cache of eclosing types that are expensive to find.
Comment 37 Martin Entlicher 2016-11-15 13:11:15 UTC
Caching of field line numbers is added:
http://hg.netbeans.org/core-main/rev/a2d71001ddc7
It needs to listen on FileObject changes and the data life-time is dependent on the FileObject as well.

The cache of enclosing types changed to a simple LRU cache based on LinkedHashMap.
Since LinkedHashMap provides the functionality of an LRU cache, we do not need a special LruCache implementation.
Comment 38 NukemBy 2016-11-16 09:50:02 UTC
Re: "LinkedHashMap provides the functionality of an LRU cache"

I would argue about this. LinkedHashMap by itself can implement only LRA (Last Recently Added) cache - via overwritten removeEldestEntry() method. Of course it is better than nothing, but efficiency of such approach is close to zero in cases when many new items are added to cache along with small percentage of really frequently used elements - they will be evicted and re-added (re-calculated) all the time. 

Another arguable approach I see in use across NB's code is caching of WeakReference instead of actual objects. Such way cache item can be 'destroyed' by GC at any time - thus brining efficiency close to zero once again (item has to be recacluated once again), especially in conditions when memory usage is close to JVM-configured Xmx. I notice that NB is significantly generally slower when it uses 1000MB out of allowed 1200MB than when memory usage is around 500MB of 1200MB.

And one note regarding FieldLNCache - it uses 'synchronized()' and 'synchronizedSet()' - this will be slight bottlneck in multithreaded environment. LruCache uses ConcurrentHashMap. I actually do not expect much difference in that particular cases because, as far as I understand, evaluation of watch variables is anyway performed only in single thread.
Comment 39 Quality Engineering 2016-11-17 02:50:05 UTC
Integrated into 'main-silver', will be available in build *201611170001* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)

Changeset: http://hg.netbeans.org/main-silver/rev/a2d71001ddc7
User: mentlicher@netbeans.org
Log: #249305: A cache of field line numbers.
Comment 40 NukemBy 2016-11-17 22:04:27 UTC
One more (not really fresh) point to this issue (flickering "Debugging view" most probably adds some latency to evaluation)
https://netbeans.org/bugzilla/show_bug.cgi?id=269018. 

Bugzilla suggested that something similar was fixed long time ago - https://netbeans.org/bugzilla/show_bug.cgi?id=146321 - i was not using NB at that time.
Comment 41 Martin Entlicher 2016-11-19 16:02:16 UTC
To the LinkedHashMap and LRU: LinkedHashMap constructor has an accessOrder parameter, which I'm using to update the order when get() is called.

We use WeakReferences a lot to automatically handle the memory management by GC. If WeakReferences would not be used, the memory consumption would be higher.

When the IDE consumes a lot of memory, GC starts to be more expensive, because it needs to perform full GC and scan the whole used heap. A large memory consumption is an indication of a memory leak. If you submit a heap dump when this happens, we could analyze it to see if there is something to fix...

Evaluation is single-threaded for a single target thread, but there could run two evaluations in two different target threads. It's not expected for such synchronization to cause a significant bottleneck.
Comment 42 Martin Entlicher 2016-11-19 16:03:48 UTC
I've added caching of the parsed data:
http://hg.netbeans.org/jet-main/rev/80cb94d96687
This has the biggest effect on conditional breakpoints.
Comment 43 NukemBy 2016-11-21 22:44:49 UTC
A bit of off-topic - once again regarding "LinkedHashMap and LRU" ...

I didn't know about that parameter in LinkedHashMap constructor, therefore did not try to use LinkedHashMap as LRU cache. It seems that in some conditions such approach is rather appropriate, in other conditions - it will be a bottleneck.

I've tried to simulate few typical use-cases:
- test is executed with MAVEN_OPTS=-Xmx1150m, out of that heap 1GB is taken by static garbage
- in all use-cases there is a background thread which generates random non-static garbage (so that GC also has something to do at this time :)) 
- all tests are performed via 10 parallel "payload" threads
- each test is performed with 'solid' and 'weak' value references

1) "LHM singletonValue=true", 'singleton' here means that two different cache values cannot be used at the same time, i.e. Map access is completely synchronous while new value is evaluated, time to evaluate each value is 2ms
LHM solid  5000 * 2 ms/op ->  9018 ms. created:  2608 garbage: 486736658
LHM weak   5000 * 2 ms/op -> 10210 ms. created:  3071 garbage: 575648878

1.1) evaluation time = 20ms
LHM solid 5000 * 20 ms/op -> 60719 ms. created: 2606 garbage: 3447591015
LHM weak  5000 * 20 ms/op -> 104105 ms. created: 4495 garbage: 5983834561

2) some as previous, but value-evaluation time is close to zero 
LHM solid 20000 * 0 ms/op -> 10836 ms. created:  9833 garbage: 614495667
LHM weak  20000 * 0 ms/op -> 11221 ms. created: 10067 garbage: 643359179

3) "LHM singletonValue=false", i.e. different values may be calculated in different threads, no matter which instance will be stored
LHM solid  5000 * 2 ms/op ->   913 ms. created:  2614 garbage: 48863721
LHM weak   5000 * 2 ms/op ->   903 ms. created:  2616 garbage: 45939101

4) the same with value-evaluation time is close to zero 
LHM solid 20000 * 0 ms/op ->  1404 ms. created: 10117 garbage: 76448761
LHM weak  20000 * 0 ms/op ->  1448 ms. created: 10223 garbage: 79047073

5) Special LruCache class which is natively thread-safe
LRU solid  5000 * 2 ms/op ->  952 ms. created:  2609 garbage: 45055227
LRU weak   5000 * 2 ms/op -> 1166 ms. created:  2709 garbage: 54004176

LRU solid 20000 * 0 ms/op -> 1211 ms. created: 10109 garbage: 70550979
LRU weak  20000 * 0 ms/op -> 1494 ms. created: 10209 garbage: 65949384

Out of these experiments I see that 
- in such conditions caching 'weak' refs is 10-20% slower than solid refs - this is the time required to re-evaluate prematurely garbage-collected cached entries. 
In this test GC-pressure is higher than usual, but time to evaluate single value is rather small. Many real-life cases from NB take 20..200ms for evaluation. So 

- degradation will be different depending on many other factors.

- in heavily-multithreaded environment when real singleton cached values are needed - LinkedHashMap is really not appropriate

- for non-singleton values - overhead of LruCache over simple HashMap is around 4%

And one note regarding "a large memory consumption is an indication of a memory leak." Another case is - you are trying to "eat too large project with too small amount of memory". E.g. when I in my case trying to load all 300+ modules into NB at same - it should consume rather much memory, yes - WeakRefs will allow to survive longer, but performance will be in-acceptable (because of above mentioned reason). At the same time there will be no clear indication to end user (me) about the reason of slowness. Most people just think that someone just do not known how to code quick software ....

Out of memory happening under heavy load is 'clear' indication of small "-Xmx" for that project size. OOM on small projects - of course clear indication of 'real memory leak'.

IMHO, WeakRefs bring more harm than benefits. Application should decide on its own for how long value needs to be cached, not GC. BTW, Time-to-live is embedded into LruCache.

Test project is attached for reference.

PS: thanks for enhancements I see mentioned here. Hope to see 8.2patch1 much quicker than 8.2 release :)
Comment 44 NukemBy 2016-11-21 22:45:47 UTC
Created attachment 162972 [details]
LruCache vs. LinkedHashMap as LRU
Comment 45 Quality Engineering 2016-11-23 03:18:50 UTC
Integrated into 'main-silver', will be available in build *201611230001* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)

Changeset: http://hg.netbeans.org/main-silver/rev/80cb94d96687
User: mentlicher@netbeans.org
Log: #249305: Do cache the parsed data with the evaluated expression. This has the biggest effect for conditional breakpoints.


By use of this website, you agree to the NetBeans Policies and Terms of Use. © 2014, Oracle Corporation and/or its affiliates. Sponsored by Oracle logo