Bug 197458 - slow update of html.navigator [was:evaluate bottlenecks when opening large files]
slow update of html.navigator [was:evaluate bottlenecks when opening large fi...
Status: RESOLVED FIXED
Product: web
Classification: Unclassified
Component: HTML Navigator
7.3
All All
: P2 (vote)
: 7.3
Assigned To: Jan Becicka
Vladimir Riha
: PERFORMANCE, REGRESSION
Depends on: 197640
Blocks:
  Show dependency treegraph
 
Reported: 2011-04-05 18:08 UTC by David Konecny
Modified: 2013-01-08 02:23 UTC (History)
5 users (show)

See Also:
Issue Type: DEFECT
:


Attachments
zipped HTML document (361.17 KB, application/x-compressed)
2011-04-05 18:08 UTC, David Konecny
Details
Patch improving updateRecursively (4.12 KB, patch)
2013-01-04 08:59 UTC, Jan Becicka
Details | Diff
nps snapshot (203.46 KB, application/octet-stream)
2013-01-04 09:01 UTC, Jan Becicka
Details

Note You need to log in before you can comment on or make changes to this bug.
Description David Konecny 2011-04-05 18:08:59 UTC
Created attachment 107521 [details]
zipped HTML document

Let's have a look at bottlenecks when opening big HTML documents (attached HTML file is 2MB HTML document with 25000 lines). I do not think users edit huge files in IDE regularly and so it is certainly is not our priority. But if it would be easy to remove some bottlenecks and handle big files more gracefully then let's do it.

* navigator window causing periodic freeze of the IDE for about 50seconds (http://netbeans.org/bugzilla/show_bug.cgi?id=197416) Disabling it makes huge improvement

* indentation/formatter - I would be tempted to basically disable any smart feature which may be impacted by document size and switch editor to some very basic editing mode.

* there seems to be 5 seconds freeze during document opening
Comment 1 Marek Fukala 2011-04-06 14:05:09 UTC
web-main#8445662e0bfe ... ~15MB of memory for the sample file
Comment 2 Marek Fukala 2011-04-06 14:17:08 UTC
web-main#5b17f36b6046 ... ~2,5MB
summary:     #197458 - do not create many TokenPropertyProvider instancies
Comment 3 Marek Fukala 2011-04-06 14:25:39 UTC
The bug "197323 - IllegalStateException: index 281300 is out of boundaries 1" happens after each enter in the sample file. I think it should be P2, not P3 ... David will probably evaluate deeper and possibly increase the priority.

I haven't spot any problem with the navigator, opening, viewing, scrolling is quite smooth. Disabling navigator doesn't help much to me.

However, each keystroke in the file triggers about 150MB of memory allocation. This memory is freed later, but anyway it seems to be too much even with respect to the huge number of embeddings inside the file.

Holding a key for a while causes the ide repaint very sluggish and the heap size increases almost to the limit (at lease +300MB for me). I've make some heapdumps during the interval but haven't found anything apparent and flagrant. The self profiler snapshots shows the editor's view hierarchy painting almost all the time. The question is whether there's a bug in the painting or lexing or it is so slow just because someone else allocates too much memory and the GC tries to clean up some heavily.

I'll keep working on the heapdumps and snapshots evaluation for some time and possibly add more info here.
Comment 4 David Konecny 2011-04-07 00:21:56 UTC
(In reply to comment #3)
> The bug "197323 - IllegalStateException: index 281300 is out of boundaries 1"
> happens after each enter in the sample file. I think it should be P2, not P3
> ... David will probably evaluate deeper and possibly increase the priority.

I'm looking at it right now. I agree it is P2 as apart from that HTML editor work pretty well.

> I haven't spot any problem with the navigator, opening, viewing, scrolling is
> quite smooth. Disabling navigator doesn't help much to me.

I cannot reproduce it either which make me very suspicious! I will see if I can reproduce it on older dev builds. A friend who is running NB 6.9.1 confirmed to me that closing Outline solved his problem as well.
Comment 5 Marek Fukala 2011-04-07 08:16:15 UTC
It's great you are looking into it. I've got a question: What is the factor causing the issue to show up? Just file size? Or it depends on the structure? Embedding? The point is if this happen in relatively common situations it should be possibly fixed in 7.0. OTOH no one else reported the issue AFAIK. It's up to you, I just wanted the be sure this is not a stopper. Thanks
Comment 6 Marek Fukala 2011-04-07 08:17:40 UTC
(In reply to comment #5)
The comment was supposed to be placed into issue 197323, not here, but since you will likely read it here as well I won't duplicate it.
Comment 7 Quality Engineering 2011-04-07 08:43:12 UTC
Integrated into 'main-golden', will be available in build *201104070400* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)
Changeset: http://hg.netbeans.org/main/rev/8445662e0bfe
User: Marek Fukala <mfukala@netbeans.org>
Log: #197458 - html hints don't hold SyntaxParserResult
Comment 8 David Konecny 2011-04-07 09:37:13 UTC
(In reply to comment #5)
> It's great you are looking into it. I've got a question: What is the factor
> causing the issue to show up? Just file size? Or it depends on the structure?

Number of tokens and therefore file size. The issue has been there since I rewrote indenter (NB 6.7??) so I do not think it is a stopper. But I would like to resolve it first before making any judgements. Of course if we could fix it earlier the better. I think that fix might have few parts and some parts might be safe for 7.0 and some may wait for 7.0.1.
Comment 9 David Konecny 2011-04-12 03:13:04 UTC
I filed sub issue 197640 for a few problems I'm looking at.
Comment 10 Marek Fukala 2011-04-12 12:08:59 UTC
Bug 197659 - do not parse file in EDT during each enter key press
Comment 11 David Konecny 2011-04-13 02:14:17 UTC
While fixing issue 197640 I did also two things (b72b49a81af4):

#1) added some performance debugging. If you want to enable it then run IDE with "-J-DAbstractIndenter.PERF.level=400" and whenever you reformat file or press enter you should see something like:

[IndPer] calculateLinePairs (total pairs=1) took 0 ms
[IndPer] processLanguage (text/html) took 1776 ms
[IndPer] calculateLinePairs (total pairs=9740) took 101 ms
[IndPer] processLanguage (text/html/text/x-css) took 364 ms
[IndPer] recalculateLineIndexes took 61 ms
[IndPer] mergeIndentedLines took 146 ms
[IndPer] calculateLineIndent took 112 ms
[IndPer] updateIndentationForPreservedLines took 8 ms
[IndPer] generateBlockIndentsForForeignLanguage took 6 ms
[IndPer] storeIndentsForOtherFormatters took 0 ms
[IndPer] modifyDocument took 4193 ms
[IndPer] Total time 7913 ms

You may find it cryptic but I understand it. :-) The above one is from reformat of the whole big test file and as you can see 1776 was spent on HTML indentation and 364 on CSS indentation but modification of document took 4193ms which is a lot. Marek, any idea how to make it faster? I tried a few things but without success. I'm using editor.indent.spi.Context.modifyIndent. Petr Pisl mentioned recently that the method is slow because it calls preferences to retrieve tab size but even if I patched it there was no improvement. I noticed that each line change triggers lots of events so I tried to wrap it all in document.runAtomic but that did not help either. I could extract the document content into separate buffer, indent buffer and place text back into document - that would be faster. Should I go that way or am I missing something?

Btw. when you press enter in editor then you get a few extra lines like:

[IndPer] Locating FormatStableStart took 342 ms
[IndPer] Current line index is: 17067
[IndPer] FormatStableStart line starts at index: 303
[IndPer] Number of ranges to ignore: 1122

which tells you what current line index is (17067) and on what line a stable format starts (line 303 in this case; a <div> which encloses current position) and how long it took to find it (342ms). In order to indent current line the indenter will have to process everything in between (lines 303 - 17067). This is another indicator that some less reliable method might be needed to find an indent if the gap is too big.

#2) other issue which I hardcoded for now is that maximum indentation column is 200. If indent grows beyond than well sorry. How do you think Marek I should handle this?? If test document is defined as "<!DOCTYPE html>" then everything is OK but if it is "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">" then indent keeps growing and growing (14000 characters for last document line). Ideally I would pass indentation numbers to some algorithm which would tell me "hold on, indentation keeps growing and something is likely wrong. if XHTML is strict then ignore it and switch document to relaxed".
Comment 12 Marek Fukala 2011-04-13 12:31:40 UTC
(In reply to comment #11)
> of the whole big test file and as you can see 1776 was spent on HTML
> indentation and 364 on CSS indentation but modification of document took 4193ms
> which is a lot. Marek, any idea how to make it faster? I tried a few things but
> without success. I'm using editor.indent.spi.Context.modifyIndent. Petr Pisl
> mentioned recently that the method is slow because it calls preferences to
> retrieve tab size but even if I patched it there was no improvement. I noticed
> that each line change triggers lots of events so I tried to wrap it all in
> document.runAtomic but that did not help either. I could extract the document
> content into separate buffer, indent buffer and place text back into document 
> that would be faster. Should I go that way or am I missing something?
Replacing the whole document content would be super fast, but would cause all the editor "widgets" attached to positions to be collapsed to the offset zero, which I reckon you do not want to do :-).

In my snapshot the whole file reformat took ~22seconds, where about 19 seconds is the application of the modifications.

From the document listeners triggered by the inserts/removes there are two most significant time consumers:

1) lexer document listener (o.n.l.l.inc.DocumentInput) ~4.2+1 (Inserts+Removes) seconds

and 

2) the spellchecker (o.n.m.spellchecker.ComponentPeer) ~3.8+1.2 (I+R)

AFAIR the lexer document listener should be disabled somehow automatically if there's too much changes in a sequence under atomic lock. This doesn't seem to happen in this case.

As for the second one - it schedules a RequestProcessor event after each modification which causes RP$SlowItem-s to be created which is quite slow :-) 

In general is these two are optimized it will save 5.2+5=10.2 seconds from the 19, which is more than 50%.

3) why the DocumentContent$Edit.undoOrRedo is so slow if it runs under one atomicLock? It takes almost 3 seconds.

4) CloneableEditorSupport.getDocument() is doing many lookup searches when looking for the redirect. I tooks almost 1.5 seconds. Triggered from o.o.text.LineListener

> #2) other issue which I hardcoded for now is that maximum indentation column is
> 200. If indent grows beyond than well sorry. How do you think Marek I should
> handle this?? If test document is defined as "<!DOCTYPE html>" then everything
> is OK but if it is "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">" then indent keeps growing

One think which would help is to use the last parser result for the document structure evaluation. Since the new line indent will cause a simple non-semantic document change we can easily use it to find the context.

As for the wrong doctype - I believe adding some simple heuristic to the html parser result (like is there more than 20 unclosed tags in xhtml .. ok, do not consider it as xhtml) so a feature may check if the document can be considered as eligible for further processing.
Comment 13 David Konecny 2011-04-13 20:08:00 UTC
(In reply to comment #12)
> [...] if there's too much changes in a sequence under atomic lock.

Marek, you mentioned atomic lock a few times. AbstractIndenter.modifyDocument method at the moment does not execute modifications under atomic lock. I'm not sure if a layer above indentation (whoever calls indenter) does that or not. I wrapped AbstractIndenter.modifyDocument in doc.runAtomic but because it did not make any difference I have not commit it and asked here instead. Just FYI.

And yes, I noticed that UNDO after whole document reformat takes quite a lot of time too.
Comment 14 Quality Engineering 2011-04-14 08:35:45 UTC
Integrated into 'main-golden', will be available in build *201104140401* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)
Changeset: http://hg.netbeans.org/main/rev/b72b49a81af4
User: David Konecny <dkonecny@netbeans.org>
Log: #197458 - evaluate bottlenecks when opening large files
Comment 15 Jan Lahoda 2011-04-14 13:50:33 UTC
I have disabled the RP's stack trace grabbing in Spellchecker:
http://hg.netbeans.org/jet-main/rev/eb9bcdc022ee
Hope that will help a bit.
Comment 16 Quality Engineering 2011-04-15 08:38:31 UTC
Integrated into 'main-golden', will be available in build *201104150401* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)
Changeset: http://hg.netbeans.org/main/rev/eb9bcdc022ee
User: Jan Lahoda <jlahoda@netbeans.org>
Log: #197687, #197458: spellchecker's RequestProcessor should not keep sender's stack trace, using only one RP for all ComponentPeers.
Comment 17 Marek Fukala 2011-04-18 14:10:53 UTC
(In reply to comment #13)
> (In reply to comment #12)
> > [...] if there's too much changes in a sequence under atomic lock.
> 
> Marek, you mentioned atomic lock a few times. AbstractIndenter.modifyDocument
> method at the moment does not execute modifications under atomic lock.

I see both org.netbeans.editor.ActionFactory$FormatAction$1.run(ActionFactory.java:1660) and  org.netbeans.modules.css.formatting.api.support.AbstractIndenter.applyIndents(AbstractIndenter.java:1580)

running its code under atomic locks.
Comment 18 Marek Fukala 2013-01-02 10:17:44 UTC
I've tried the behaviour of the editor with the attached David's file in 7.3 and I realized the IDE is completely stuck by html.navigator upon just opening the file.

I've killed the JVM after more than 2 minutes of one CPU's core being fully occupied by:

"org.netbeans.modules.html.navigator.HtmlNavigatorPanelUI" daemon prio=1 tid=00000000042fe000 nid=0xb5be3000 runnable [00000000b5be2000]
   java.lang.Thread.State: RUNNABLE
	at java.lang.String.equals(String.java:1017)
	at org.netbeans.modules.html.navigator.Diff.equals(Diff.java:157)
	at org.netbeans.modules.html.navigator.Diff.getIndexInParent(Diff.java:262)
	at org.netbeans.modules.html.navigator.Diff.equals(Diff.java:195)
	at org.netbeans.modules.html.navigator.Diff.equals(Diff.java:143)
	at org.netbeans.modules.html.navigator.HtmlElementDescription.equals(HtmlElementDescription.java:135)
	at java.util.ArrayList.indexOf(ArrayList.java:216)
	at java.util.ArrayList.contains(ArrayList.java:199)
	at java.util.AbstractSet.removeAll(AbstractSet.java:157)
	at org.netbeans.modules.html.navigator.HtmlElementNode.updateRecursively(HtmlElementNode.java:562)
	at org.netbeans.modules.html.navigator.HtmlElementNode.updateRecursively(HtmlElementNode.java:581)
	at org.netbeans.modules.html.navigator.HtmlElementNode.updateRecursively(HtmlElementNode.java:581)
	at org.netbeans.modules.html.navigator.HtmlElementNode.updateRecursively(HtmlElementNode.java:581)
	at org.netbeans.modules.html.navigator.HtmlElementNode.updateRecursively(HtmlElementNode.java:581)
	at org.netbeans.modules.html.navigator.HtmlElementNode.updateRecursively(HtmlElementNode.java:581)
	at org.netbeans.modules.html.navigator.HtmlElementNode.updateRecursively(HtmlElementNode.java:581)
	at org.netbeans.modules.html.navigator.HtmlElementNode.updateRecursively(HtmlElementNode.java:581)
	at org.netbeans.modules.html.navigator.HtmlElementNode.setDescription(HtmlElementNode.java:205)
	at org.netbeans.modules.html.navigator.HtmlNavigatorPanelUI$12.run(HtmlNavigatorPanelUI.java:559)
	at org.openide.util.RequestProcessor$Task.run(RequestProcessor.java:1477)
	at org.openide.util.RequestProcessor$Processor.run(RequestProcessor.java:2071)


Please evaluate the impact of the bottleneck also on files of more reasonable size, but anyway this seems to me to be P2 bug which should be fixed in 7.3
Comment 19 Jan Becicka 2013-01-03 11:14:42 UTC
I'm afraid, that Marek's algorithm has non linear complexity. Maybe O(n^2), where n is number of children of each node - maybe even worse. And in this case it is probably too many nodes for this algoritm to compute in acceptable time.
Comment 20 Marek Fukala 2013-01-03 13:57:39 UTC
1. my original implementation was a prototype not intended for serious usage
2. if getIndexInParent() result is computed at once for all children and cached somehow then the performance can be much better
3. do you want me to take a look at the issue?
Comment 21 Jan Becicka 2013-01-03 14:08:46 UTC
(In reply to comment #20)
> 1. my original implementation was a prototype not intended for serious usage
> 2. if getIndexInParent() result is computed at once for all children and cached
> somehow then the performance can be much better
> 3. do you want me to take a look at the issue?

I will try to tune it up and we will see.
Comment 22 Jan Becicka 2013-01-04 08:56:46 UTC
Tried to measure updateRecursively, but I gave up after 1 hour. I tried to implement different equals and hashCode and now updateRecursively takes 22s on my machine. Although it sounds like good achievement, this is not fix at all.

1. 22s is still slow for navigator update
2. IDE is not responsive even if updateRecursive is finished.
Comment 23 Jan Becicka 2013-01-04 08:59:23 UTC
Created attachment 129877 [details]
Patch improving updateRecursively
Comment 24 Jan Becicka 2013-01-04 09:01:31 UTC
Created attachment 129878 [details]
nps snapshot

snapshot after updateRecursive is finished
Comment 25 Marek Fukala 2013-01-04 09:55:06 UTC
the patch looks ok ... As the "index in parent" is computed and cached already in the "element path" this is a great improvement.

If you wanted to make it a bit faster you could add "getIndexInParent()" in the Description interface and make the "element path" computation lazy. As the Diff is the only client, it would speed up the Descriptions tree creation quite a lot and keep the Diff performance.

As for the subsequent "nodes update problem", I believe the principal issue is that we recursively expand the whole tree so since we recursively set the children keys on all the descriptions it leads to "merge" of all the nodes which is slow. I believe we can stick with the original 7.2 behaviour - expand just the first level (at least for bigger files).
Comment 26 Jan Becicka 2013-01-04 10:39:05 UTC
Changeset: 9ff64227c434
Author:    Jan Becicka <jbecicka@netbeans.org>
Date:      2013-01-04 11:26
Message:   Performance improvement
Comment 27 Quality Engineering 2013-01-05 02:30:41 UTC
Integrated into 'main-golden', will be available in build *201301050001* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)
Changeset: http://hg.netbeans.org/main-golden/rev/9ff64227c434
User: Jan Becicka <jbecicka@netbeans.org>
Log: Performance improvement
Issue #197458 - slow update of html.navigator [was:evaluate bottlenecks when opening large files]
Comment 28 Jan Becicka 2013-01-07 13:21:36 UTC
> As for the subsequent "nodes update problem", I believe the principal issue is
> that we recursively expand the whole tree so since we recursively set the
> children keys on all the descriptions it leads to "merge" of all the nodes
> which is slow. I believe we can stick with the original 7.2 behaviour - expand
> just the first level (at least for bigger files).

That's it. For files bigger than 100k (my constant :) we will not expand all nodes.
http://hg.netbeans.org/web-main/rev/aa739e8261ba
Comment 29 Quality Engineering 2013-01-08 02:23:20 UTC
Integrated into 'main-golden', will be available in build *201301080001* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)
Changeset: http://hg.netbeans.org/main-golden/rev/aa739e8261ba
User: Jan Becicka <jbecicka@netbeans.org>
Log: #197458 - slow update of html.navigator


By use of this website, you agree to the NetBeans Policies and Terms of Use. © 2014, Oracle Corporation and/or its affiliates. Sponsored by Oracle logo