This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.
Steps to reproduce:
Follow the instructions on the Chromium site: http://dev.chromium.org/Home
1. Download sources
2. Build chromium, provide CFLAGS="-g3 -gdwarf-2" and CXXFLAGS="-g3 -gdwarf-2".
3. Start IDE with 600Mb of heap and create C/C++ project from existing code. Do not select build, select manual configuring code assistance.
4. Switch off Code assistance "Project popup menu->Code Assistance->C/C++ code Assistance"
5. Exit from IDE.
6. Clear user dir.
7. Start IDE and open created project.

At this point the project does not have C/C++ support. Only 20000 C/C++ data objects (11Mb) are in memory. The rest of the memory is consumed by the IDE platform.
Memory increases constantly (up to 440Mb) while scanning is in progress (284Mb of heap that cannot be GCed).
At the end of scanning the IDE holds 234Mb of heap that cannot be GCed.

The biggest objects:
88Mb org.netbeans.modules.masterfs.filebasedfs.fileobjects.FolderObj
78Mb org.netbeans.modules.masterfs.filebasedfs.children.ChildrenSupport
67Mb org.netbeans.modules.masterfs.filebasedfs.naming.FileName
64Mb org.netbeans.modules.masterfs.filebasedfs.fileobjects.FileObjectKeeper

Size of strings: 84Mb. I see a lot of duplicated strings. For example:
682 "14.3-b01 (Sun Microsystems Inc.)"
690 "false"
580 "Java > 1.6"
778 "1.0"
70 "1.7"
13 "org.netbeans.modules.cnd.debugger.common.resources.Bundle"
13 "SeparatorAfterFormat.instance"
3 "/net/elif/export1/sside/av202691/chromium-trunk/src/tools/traceline/svgui" (1 instance located in the pool)
2 "org-netbeans-modules-html-editor-coloring-EmbeddingHighlightsLayerFactory.instance"
+ path names of project folders/files have 3-5 duplicated string instances

To fix BZ#171672 (Cannot create project for Chrome sources with -Xmx512m) we need help from the IDE => could you please think about how to reduce platform memory consumption by 100Mb?
Can you generate a heap dump and send me a link where I can download it? That will speed up my evaluation. Otherwise this is related to addRecursiveListener. It needs to keep all objects representing folders under source roots in memory. 88MB of FolderObj is definitely a lot. That shall be made smaller. Btw. can you count the # of folders in the project? I cannot help you with your string memory issue, however, unless you provide a GC root path for some of the strings. Report it as a separate issue (I do not think such strings represent file names) if you believe it is worth the effort.
Answers for all questions are in the memory snapshots:
/net/elif.russia.sun.com/export1/sside/av202691/ChromiumMemoryProfiling/
The folder contains 4 intermediate snapshots (taken while scanning) and a last one taken after scanning finished:
Main-2010-03-09.snapshot
Main-2010-03-09(1).snapshot
Main-2010-03-09(2).snapshot
Main-2010-03-09(3).snapshot
Main-2010-03-09(4).snapshot
The snapshots also have information about object allocations (every 10th).
The snapshots were taken by http://www.yourkit.com/ version 8.0.23. YourKit license server is endif.russia.sun.com.
You can also use the built sources and the Chromium project in the folder /net/elif.russia.sun.com/export1/sside/av202691:
- chromium-trunk
- chromas
But I am not sure that CND can correctly understand the full server name /net/elif.russia.sun.com/... because the root /net/elif/... was used for building and project creation.
Are these resources available from your network?
The chromium sources seem to have a lot of directories, but many of them are SVN ones:

av202691/chromium-trunk$ find . -type d | wc -l
32822
av202691/chromium-trunk$ find . -type d | grep -v .svn | wc -l
4990

Five thousand directories is still quite a lot, but significantly fewer than 33 thousand. As the .svn ones are hidden anyway, the question is whether masterfs's recursive listener shall observe their changes or not. CCing Ondřej so he knows that I am considering not listening on .svn directories and their content.
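The same count can be reproduced from Java, which may be handy when find is not available. This is only a minimal sketch: the class DirCounter and its method are hypothetical names, not part of masterfs, and it prunes .svn subtrees in the same spirit as the grep -v filter above.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical helper: counts directories under a root, optionally pruning .svn subtrees. */
final class DirCounter {

    /** Counts the root and every directory below it; with skipSvn, .svn folders and their children are ignored. */
    static int countDirs(Path root, final boolean skipSvn) throws IOException {
        final int[] count = {0};
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                Path name = dir.getFileName();
                if (skipSvn && name != null && name.toString().equals(".svn")) {
                    return FileVisitResult.SKIP_SUBTREE; // do not descend into SVN metadata
                }
                count[0]++;
                return FileVisitResult.CONTINUE;
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("all directories:      " + countDirs(root, false));
        System.out.println("without .svn folders: " + countDirs(root, true));
    }
}
```

Like find, the sketch counts the root directory itself, so the two numbers should be comparable with the ones quoted above.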
I shall also point out for Víťa that (as soon as bug 180523 is implemented) there can be a hard limit on the size of subdirectories for a source root, and the parsing API can disable the addRecursiveListener completely, asking the user to do a manual refresh via Sources/Scan for External Changes.
Integrated into 'main-golden', will be available in build *201003120200* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress) Changeset: http://hg.netbeans.org/main/rev/414a5a698cea User: Jaroslav Tulach <jtulach@netbeans.org> Log: Doing something with #181684: One Object and one field less per each FolderObj
We don't handle FS events on svn metadata in any special way now, so it'll be all right if the recursive listener is not added to .svn folders or any of its children.
I did another change in core-main#bb1368f4507a. I failed to download the snapshots (2h was not enough to download a single one of them). Please remeasure on your side. Provide the list of biggest objects as you did, and please also include the # of their instances. Thanks.
Created attachment 95231 [details] memory snapshot at scanning time
Size of not GCed memory is 398Mb.
After scanning finished:
- memory is reduced to 216Mb
- the biggest objects, Linked Lists, go away
- the numbers of FolderObj and FileName objects are the same.
By the way, does the IDE really need to store time stamps as strings (see org.netbeans.modules.parsing.impl.indexing.TimeStamps)? Each string consumes 64 bytes. A Long consumes only 16 bytes. The best optimization is:
- use a specialized hash map that keeps the long in the entry. It allows reducing the values to 8 bytes. For example, see org.netbeans.modules.cnd.repository.util.LongHashMap.
It allows reducing heap memory by 4Mb. (From CND it is low-hanging fruit.)
It seems the constructor allocates a lot of empty ArrayLists with capacity 10:

    FileObject.ED(FCLSupport.Op op, Enumeration<FileChangeListener> en, FileEvent fe) {
        ..
        fsList = (fsll != null) ? new ArrayList<FileChangeListener>(fsll.getAllListeners()) : new ArrayList<FileChangeListener>();
        repList = (repll != null) ? new ArrayList<FileChangeListener>(repll.getAllListeners()) : new ArrayList<FileChangeListener>();
    }

I do not see any modifications of fsList & repList, so it can be rewritten:

    fsList = (fsll != null) ? new ArrayList<FileChangeListener>(fsll.getAllListeners()) : Collections.<FileChangeListener>emptyList();
    repList = (repll != null) ? new ArrayList<FileChangeListener>(repll.getAllListeners()) : Collections.<FileChangeListener>emptyList();

IMHO it saves more than 10Mb.
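The idea behind this rewrite can be shown in isolation. The following is a minimal sketch, not the masterfs code: ListenerSnapshots and snapshot are hypothetical names, and treating an empty source collection like a missing one is an extra assumption of the sketch. It is only safe because the resulting lists are never modified afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper illustrating the "no listeners, no allocation" idea. */
final class ListenerSnapshots {

    /** Copies the listeners, or returns the immutable empty list when there is nothing to copy. */
    static <T> List<T> snapshot(Collection<? extends T> listeners) {
        if (listeners == null || listeners.isEmpty()) {
            // Collections.emptyList() needs no backing array, unlike a new ArrayList
            // on JVMs that allocate the default capacity eagerly.
            return Collections.<T>emptyList();
        }
        return new ArrayList<T>(listeners);
    }

    public static void main(String[] args) {
        List<String> none = snapshot(null);
        List<String> some = snapshot(Arrays.asList("fsListener", "repListener"));
        System.out.println(none.size() + " / " + some.size());
    }
}
```

Callers that only iterate over the result cannot tell the difference, which is why the change keeps the semantics while removing one array per event object.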
FileObject.ED are short time living objects. Are you saying they were present in significant amount in your memory snapshots?
(In reply to comment #12)
> FileObject.ED are short time living objects. Are you saying they were present
> in significant amount in your memory snapshots?

According to memory snapshot size of FileObject.ED is 103Mb non-GCed memory.
(In reply to comment #13)
> According to memory snapshot size of FileObject.ED is 103Mb non-GCed memory.

Oops. That is quite a lot. I can imagine ED exists when some kind of refresh is in progress, but then it shall be discarded. If you give me a path from some ED to a GC root, I'll try to break that chain somehow.
Created attachment 95596 [details] patch for reducing memory
The proposed patch reduces memory without changing semantics. + I've made the class static => reduced by 4 bytes as well, but the object is still 32 bytes because of alignment. There is a possibility to reduce the object to 24 bytes by using one collection for listeners, if that is possible.
Integrated as core-main#1a92fa20fe36 (hopefully correct and without indentation changes). I would still like to understand why these objects are held in memory, however (if they are held for a long time).
I think Alexander provided a snapshot... But you can run it yourself.
Btw, can you confirm, for:

FileSystem fs = fe.getFile().getFileSystem();
Repository rep = fs.getRepository();

is the following always true?

(fs == null) == (rep == null)

If yes => use one listeners list and it would save 8 bytes per object.
Oops, fs can not be null, otherwise NPE
Jarda, can you enhance the patch so that the logic checks not only for null but, as in the patch, also hasListeners():

+ if (fsll != null && fsll.hasListeners()) {
+     fsList = new ArrayList<FileChangeListener>(fsll.getAllListeners());
+ } else {
+     fsList = Collections.<FileChangeListener>emptyList();
+ }

because the ListenerList constructor always creates a list:

ListenerList() { listenerList = new ArrayList<T>(); }
core-main#7d09172f8700, can we now close this bug as fixed?
IMHO it is the first fix in a long chain of fixes. The mentioned fix improves only peak memory consumption. See all the problems in the attached memory snapshot.
Alexander, the subject of the issue is very abstract. Maybe it's worth using this as an umbrella task and filing each problem as a separate issue? Yarda, what do you think?
I will continue then with ignoring .svn subfolders. That will reduce the number of FolderObj to 20%. If there are other things to do, feel free to report them separately.
I would suggest the following criterion for "bug is fixed":
- The IDE with C/C++ code assistance switched off can open the project, finish scanning and open one file in the editor without an out of memory exception in 200Mb of heap.
Do you agree?
Before we get to goals, can you "find .svn | xargs rm -rf" and remeasure the current memory requirements without subversion being on?
Created attachment 95855 [details] memory snapshot without .svn folders
Size of not GCed memory is 220Mb. After scanning finished memory is reduced to 113Mb.
Vladimir, could you suggest a patch for removing strings from org.netbeans.modules.parsing.impl.indexing.TimeStamps? In fact clients set longs that are stored as strings. The class stores strings because that allows it to use the standard Properties load and store methods. IMHO it is a weak reason to keep strings in memory. Could you also suggest moving the CND LongHashMap class into the NB utilities?
Created attachment 95923 [details] Reduce memory on 20Mb patch Patch that reduces FileName size from 27Mb to 7Mb. If you agree with patch, what do you think about moving org.netbeans.modules.cnd.utils.cache.CharSequenceKey in NB utilities API?
Interesting patch, having efficient compacted string storage could improve many places where we store long time existing strings (module system and apisupport come to my mind). If you want to donate this, I suggest to put the API into org.openide.util.CharSequences (just few static methods, right?). Please start the API review for that in separate issue. When finished, I'll "just" use it in masterfs.
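To illustrate what "compacted string storage" can mean, here is a minimal sketch that stores ASCII-only text in one byte per character. It is not the CND CharSequenceKey implementation; CompactAscii and compact are hypothetical names. The saving assumes a JVM where String keeps two bytes per character, which was the case when this issue was filed.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical compact CharSequence: one byte per character for ASCII-only text. */
final class CompactAscii implements CharSequence {
    private final byte[] bytes;

    private CompactAscii(byte[] bytes) {
        this.bytes = bytes;
    }

    /** Returns a compact copy for pure-ASCII input, or a plain String otherwise. */
    static CharSequence compact(CharSequence s) {
        byte[] b = new byte[s.length()];
        for (int i = 0; i < b.length; i++) {
            char c = s.charAt(i);
            if (c > 0x7F) {
                return s.toString(); // not ASCII: fall back to a regular String
            }
            b[i] = (byte) c;
        }
        return new CompactAscii(b);
    }

    @Override public int length() { return bytes.length; }

    @Override public char charAt(int index) { return (char) bytes[index]; }

    @Override public CharSequence subSequence(int start, int end) {
        return new CompactAscii(Arrays.copyOfRange(bytes, start, end));
    }

    @Override public String toString() {
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    @Override public boolean equals(Object o) {
        return o instanceof CompactAscii && Arrays.equals(bytes, ((CompactAscii) o).bytes);
    }

    @Override public int hashCode() { return Arrays.hashCode(bytes); }

    public static void main(String[] args) {
        CharSequence name = compact("FolderObj.java");
        System.out.println(name + " stored in " + name.length() + " bytes");
    }
}
```

A few static factory methods like compact are all an API such as the proposed org.openide.util.CharSequences would need to expose; the storage class itself can stay private.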
(In reply to comment #0)
> 3. Start IDE with 600Mb of heap and create C/C++ project from existing
> code. Do not select build, select manual configuring code assistance.

How exactly do I do this? The New Project wizard wants either makefile or a 'configure' script capable of generating makefile. I have neither of those...
(In reply to comment #32)
> (In reply to comment #0)
> > 3. Start IDE with 600Mb of heap and create C/C++ project from existing
> > code. Do not select build, select manual configuring code assistance.
>
> How exactly do I do this? The New Project wizard wants either makefile or a
> 'configure' script capable of generating makefile. I have neither of those...

Ok, I created a fake makefile (an empty one) and managed to create C/C++ project somehow. I selected chromium/src as the sources folder. Clicked OK in the New Project wizard and have been waiting since then for the project to open... No scanning, just opening the project, which seems to be stuck in MakeConfigurationDescriptor and collecting files. I'll attach the stacktrace.
Created attachment 96321 [details] Stacktrace showing the ProjectOpenedHook activity
(In reply to comment #34)
> Created an attachment (id=96321) [details]
> Stacktrace showing the ProjectOpenedHook activity

It is ordinary C/C++ project creation. It consumes a lot of time because it performs a recursive ls.
I changed TimeStamps to use LongHashMap. Thanks
http://hg.netbeans.org/jet-main/rev/a3f97829146c

On the other hand, it seems to me that there are much bigger problems in the C/C++ infrastructure (or platform) itself that prevent using such large projects like chromium. When I tested it following the steps in the first post here the project did not even open.

<crying>
I remember that in 6.7 we struggled to get the IDE open the ACE project, which contained ~60k files. The main problem at that time was files crawling and mime types recognition. This time we are asked to open a project with ~300k files in ~35k folders. The physics may be the limit this time...
</crying>

On the constructive note I'd suggest to temporarily turn off registering source path from the C/C++ project's open-hook-impl. This should effectively turn off indexing (there will be no source roots to scan). If the IDE can start, open and work with Chromium project within a reasonable memory heap and be reasonably responsive we can then look at how much harm is done by indexing and either improve it or avoid using it for C/C++ projects.
It seems we are close to the target. Two improvements allow parsing and scanning the project in 512Mb:
- See Comment #27
- See Comment #30
This project has a performance problem in scanning:
- some of the files under the root have a "parser error" from the point of view of the html, css, js, .. parsers. Such parsers have bad error recovery algorithms and spend a lot of time on throwing, catching and logging exceptions.
(In reply to comment #36)
> I changed TimeStamps to use LongHashMap. Thanks
> http://hg.netbeans.org/jet-main/rev/a3f97829146c

I think, we need to move it into org.openide.util close to WeakSet.
Jarda, what do you think?
Don't like duplication of such huge code.

>
> On the constructive note I'd suggest to temporarily turn off registering source
> path from the C/C++ project's open-hook-impl. This should effectively turn off
> indexing (there will be no source roots to scan). If the IDE can start, open
> and work with Chromium project within a reasonable memory heap and be
> reasonably responsive we can then look at how much harm is done by indexing and
> either improve it or avoid using it for C/C++ projects.

We are thinking about removing the usage of the indexing API for C++ in 6.9 (issue #182884),
+ we are open to providing all our optimized structures to the NB Platform
+ coming back to using the Indexing API in the next release.
> > http://hg.netbeans.org/jet-main/rev/a3f97829146c
> I think, we need to move it into org.openide.util close to WeakSet.
> Jarda, what do you think?
> Don't like duplication of such huge code.

I like such huge amount of code in openide.util neither. But if you can simplify the API to something like:

public static <K> Map<Long,K> createLongMap(capacity, factor);

then creating org.openide.util.Maps is probably appropriate.
(In reply to comment #39)
> > > http://hg.netbeans.org/jet-main/rev/a3f97829146c
> > I think, we need to move it into org.openide.util close to WeakSet.
> > Jarda, what do you think?
> > Don't like duplication of such huge code.
>
> I like such huge amount of code in openide.util neither. But if you can
> simplify the API to something like:
>
> public static <K> Map<Long,K> createLongMap(capacity, factor);

It is not Map<Long, K>, it is "Map<K, long>", which is not possible to declare in Java, because a primitive type cannot be a type parameter of a generic class. The purpose of this class is to prevent boxing/unboxing + to have a memory-efficient Entry impl.
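What a "Map<K, long>" looks like in practice can be sketched with two parallel arrays, one for keys and a long[] for values, so that no Long and no Entry object is allocated per mapping. This is only an illustration under stated assumptions, not the CND LongHashMap: ObjectLongMap, NO_VALUE and the sample key are hypothetical, null keys and removal are not supported, and open addressing with linear probing is simply one convenient way to avoid entry objects.

```java
/** Hypothetical sketch of a map from object keys to primitive long values. */
final class ObjectLongMap<K> {
    /** Returned by get() when the key is absent (an assumption of this sketch). */
    static final long NO_VALUE = Long.MIN_VALUE;

    private Object[] keys = new Object[16]; // length is always a power of two
    private long[] values = new long[16];   // values[i] belongs to keys[i]
    private int size;

    void put(K key, long value) {
        if ((size + 1) * 2 > keys.length) {
            grow(); // keep the table at most half full so probing always finds a free slot
        }
        int i = slotFor(keys, key);
        if (keys[i] == null) {
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    long get(K key) {
        int i = slotFor(keys, key);
        return keys[i] == null ? NO_VALUE : values[i];
    }

    int size() {
        return size;
    }

    /** Linear probing: returns the slot holding the key, or the first free slot for it. */
    private static int slotFor(Object[] table, Object key) {
        int h = key.hashCode();
        int i = (h ^ (h >>> 16)) & (table.length - 1);
        while (table[i] != null && !table[i].equals(key)) {
            i = (i + 1) & (table.length - 1);
        }
        return i;
    }

    private void grow() {
        Object[] oldKeys = keys;
        long[] oldValues = values;
        keys = new Object[oldKeys.length * 2];
        values = new long[oldKeys.length * 2];
        for (int j = 0; j < oldKeys.length; j++) {
            if (oldKeys[j] != null) {
                int i = slotFor(keys, oldKeys[j]);
                keys[i] = oldKeys[j];
                values[i] = oldValues[j];
            }
        }
    }

    public static void main(String[] args) {
        ObjectLongMap<String> stamps = new ObjectLongMap<String>();
        stamps.put("src/base/logging.cc", 1268123456000L); // sample key and timestamp
        System.out.println(stamps.get("src/base/logging.cc"));
    }
}
```

Because the value type is a primitive, such a class cannot implement java.util.Map, which is the point made above about the proposed createLongMap signature.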
Integrated into 'main-golden', will be available in build *201004020200* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress) Changeset: http://hg.netbeans.org/main/rev/a3f97829146c User: Vita Stejskal <vstejskal@netbeans.org> Log: #181684: using LongHashMap for timestamps
use optimized char sequence impl (27->7 Mb) http://hg.netbeans.org/cnd-main/rev/2f368f1909c3
what's the progress with remaining SVN issue? Do we need separate blocker bug?
Integrated into 'main-golden', will be available in build *201004070201* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress) Changeset: http://hg.netbeans.org/main/rev/2f368f1909c3 User: Vladimir Voskresensky <vv159170@netbeans.org> Log: fixing #181684 - NB platform consumes a lot of memory on big projects. (use optimized CharSequences)
Created attachment 96857 [details] Enhancement in masterfs to allow versioning systems to skip .svn folders & etc.
Let's review this small addition to ProvidedExtensions class. First and foremost Ondra needs to modify versioning and check that this API change allows SVN to skip .svn folders (and detect change in entries file), Mercurial to watch time stamp of file inside .hg that changes when one does checkin and in general remove any need for FileChangeListener in the versioning implementations.
Jarda, your patch seems not to work: refreshRecursively is not delegated to the ProvidedExtensions implementation in versioning. I tried the patch and overrode refreshRecursively in versioning.FilesystemInterceptor, yet the default implementation in masterfs is still executed. Unless I am mistaken, you probably also need to override refreshRecursively in org.netbeans.modules.masterfs.ProvidedExtensionsProxy, which seems to be the class that finally delegates all ProvidedExtensions methods to versioning. See also org.netbeans.modules.masterfs.filebasedfs.FileBasedFileSystem.StatusImpl.getExtensions(); it returns masterfs.ProvidedExtensionsProxy instead of versioning.FilesystemInterceptor.
Created attachment 97027 [details] New patch with test This is what one gets when thinking "I'll write a test later...". Nothing obviously works then.
Extended versioning SPI - delegating refreshRecursively to particular versioning systems
Created attachment 97111 [details] versioning spi changes
Created attachment 97114 [details] versioning spi test
[OV01] IMO masterfs.ProvidedExtensions.refreshRecursively() should return "-1" (instead of "0") as default. masterfs.ProvidedExtensionsProxy.refreshRecursively() iterates through all registered implementations of ProvidedExtensions until the first one is found that returns a value other than "-1" (so e.g. "0"). Currently versioning is probably the only implementor of ProvidedExtensions so everything works just fine, however what if there was another simple implementation which did not override the refreshRecursively method and thus always returning 0? When this happens and the new implementation precedes versioning in the iterator, it will effectively suppress the versioning implementation even without wanting to handle recursive listening in a specific way.
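The effect of the default return value can be shown with a small model of the delegation. This is a sketch under assumptions, not the masterfs API: RefreshExtension, RefreshProxy, NOT_HANDLED and the simplified method signature are all hypothetical stand-ins.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for an extension point; not the real ProvidedExtensions class. */
abstract class RefreshExtension {
    static final long NOT_HANDLED = -1;

    /** Default implementation declines, so extensions later in the chain are still asked. */
    long refreshRecursively(String dir, long lastTimeStamp) {
        return NOT_HANDLED;
    }
}

/** Hypothetical proxy that asks each registered extension in turn. */
final class RefreshProxy extends RefreshExtension {
    private final List<RefreshExtension> delegates;

    RefreshProxy(RefreshExtension... delegates) {
        this.delegates = Arrays.asList(delegates);
    }

    @Override
    long refreshRecursively(String dir, long lastTimeStamp) {
        for (RefreshExtension extension : delegates) {
            long result = extension.refreshRecursively(dir, lastTimeStamp);
            if (result != NOT_HANDLED) {
                return result; // the first extension that handles the request wins
            }
        }
        return NOT_HANDLED; // nobody handled it; the caller falls back to the default refresh
    }
}
```

With a default of 0 instead of NOT_HANDLED, an extension that never overrides the method would end the loop as soon as it is reached, which is the suppression scenario described in OV01.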
Thanks for review. I'll implement OV01 and integrate my part tomorrow. Then I assign back to Ondřej.
My part done in core-main#3da55e68c8b5, passing to Ondřej.
Integrated into 'main-golden', will be available in build *201004170515* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress) Changeset: http://hg.netbeans.org/main/rev/3da55e68c8b5 User: Jaroslav Tulach <jtulach@netbeans.org> Log: #181684: Giving ProvidedExtensions chance to optimize behavior of addRecursiveListener
versioning.spi changes: http://hg.netbeans.org/cdev/rev/d88f0ca0d0d6
changes in mercurial, subversion and cvs: http://hg.netbeans.org/cdev/rev/18eaa28d766c
fixed in versioning, reassigning back to jarda, so he can make any additional changes, test all changes and finally close the issue
Your change looks meaningful, however all I can say is that testing will happen as part of verification and is up to Alex.
Integrated into 'main-golden', will be available in build *201004200200* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress) Changeset: http://hg.netbeans.org/main/rev/d88f0ca0d0d6 User: Ondrej Vrabec <ovrabec@netbeans.org> Log: Issue #181684 - NB platform consumes a lot of memory on big projects.
Verified. Chromium project can be opened in 512 Mb.
Profiling results, consumption of non GCed memory:
-CND parsing -Indexing Memory 80Mb
+CND parsing -Indexing Memory 207Mb
-CND parsing +Indexing Memory 119Mb
+CND parsing +Indexing Memory 218Mb