47870 – [perf] Classpath scanning keeps too many objects in memory resulting in OOME

Status:	CLOSED FIXED

Alias:	None

Product:	java
Classification:	Unclassified
Component:	Unsupported
Version:	4.x
Hardware:	PC All

Importance:	P2 blocker
Assignee:	issues@java

URL:
Keywords:	PERFORMANCE

Depends on:
Blocks:

Reported:	2004-08-25 08:55 UTC by Antonin Nebuzelsky
Modified:	2007-09-26 09:14 UTC
CC List:	1 user

See Also:
Issue Type:	DEFECT
Exception Reporter:

Description Antonin Nebuzelsky 2004-08-25 08:55:48 UTC

Classpath scanning keeps too many objects in
memory and, with a lot of files on the classpath,
ends up with an OOME.

With 100 jars, each with 100 very simple Java
files (10,000 Java files in total) on the classpath,
with -Xmx128m the initial scan took 1:25 min and
the check after restart took 0:28 min.

With 900 jars, each with 30 very simple Java files
(27,000 Java files in total) on the classpath, I was
not able to finish the scanning at all, not even
with -Xmx256m. The heap usage was soon right below
its maximum, with a lot of GCs before the OOME, which
of course slowed down the scanning.

I don't think there is a reason to keep all the
scanned data in memory, and the created index
cannot be that big to eat all the heap.

Comment 1 Martin Matula 2004-08-25 09:49:53 UTC

The only things we keep in memory are the indexes. But if there are
many CP roots on the classpath, the problem may be the MDR cache, since
it grows with every CP element. The solution could be a static cache
shared by all CP roots. I have implemented this but haven't committed
it to CVS yet. I will provide (off-line) a build containing this
change so that you can measure whether it gets better with it.
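
For illustration only, the idea of a single static cache shared by all
CP roots could look roughly like the sketch below. This is not the MDR
cache and not the change described in this comment; the class name
SharedRootCache, the MAX_ENTRIES limit and the "root!key" key format
are assumptions made up for the example. The point it shows is that
the total number of cached entries stays bounded no matter how many
classpath roots contribute to it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch: one bounded LRU cache shared by every classpath
 * root, instead of a separate cache that grows with each root.
 */
final class SharedRootCache {
    // Assumed limit, kept tiny so the eviction is easy to observe.
    private static final int MAX_ENTRIES = 3;

    // Access-ordered LinkedHashMap: the least recently used entry is
    // evicted once the shared limit is exceeded.
    private static final Map<String, Object> CACHE =
        new LinkedHashMap<String, Object>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Object> eldest) {
                return size() > MAX_ENTRIES;
            }
        };

    private SharedRootCache() {}

    // Entries from all roots land in the same map, keyed by root and name.
    static synchronized void put(String root, String key, Object value) {
        CACHE.put(root + "!" + key, value);
    }

    static synchronized Object get(String root, String key) {
        return CACHE.get(root + "!" + key);
    }

    static synchronized int size() {
        return CACHE.size();
    }
}
```

With this shape, adding the 900th jar evicts older entries instead of
enlarging the heap footprint, at the cost of re-reading evicted data
when it is needed again.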

Comment 2 meliandra 2004-08-25 10:37:06 UTC

My colleague told me that if he adds the jars and not the folder
(which contains the classes), it's faster.

Does the number of classes have no influence on the duration of the
scanning?

There's another speciality. The folder contains the classes, and the
same classes are also in the jars. It could be that there are version
differences between a class in the folder and the same class in the jar.

Comment 3 Martin Matula 2004-08-25 10:46:58 UTC

Melinda,
yes, jars are faster because they are loaded into memory at once and,
in the case of a subsequent scan, only the timestamp of the whole jar
is checked, whereas for a directory we need to go through every single
file and check its timestamp. There is nothing we can do about it
(unfortunately, on most systems it is not enough just to check the
timestamp of the root directory).
I don't understand what you mean by the speciality. We are able to
handle several classpath elements with the same classes in them (even
different versions of them). Or have you encountered any problems?
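
To make the jar versus directory difference concrete, here is a
minimal sketch of the two up-to-date checks. It is not the IDE's
actual code; the class RootFreshness, its method names and the
lastScanMillis parameter are assumptions for the example. A jar needs
one timestamp lookup, while a directory root needs a walk over every
file beneath it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

/** Hypothetical sketch of the two rescan checks described above. */
final class RootFreshness {
    private RootFreshness() {}

    // Jar root: a single timestamp comparison for the whole archive.
    static boolean jarChangedSince(Path jar, long lastScanMillis) {
        return modifiedMillis(jar) > lastScanMillis;
    }

    // Directory root: every regular file has to be visited, because the
    // root directory's own timestamp does not reflect nested changes.
    static boolean directoryChangedSince(Path dir, long lastScanMillis) {
        try (Stream<Path> files = Files.walk(dir)) {
            return files.filter(Files::isRegularFile)
                        .anyMatch(p -> modifiedMillis(p) > lastScanMillis);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static long modifiedMillis(Path p) {
        try {
            return Files.getLastModifiedTime(p).toMillis();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The cost of directoryChangedSince grows with the number of files under
the root, which matches the slower rescans reported for folders.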

Comment 4 Antonin Nebuzelsky 2004-08-25 12:51:02 UTC

> I will provide (off-line) a build containing this
> change so that you could measure whether it gets better with it.

Still the same problem. And the behaviour is even worse with your
build, because the time between clicking OK in the project properties
and scanning start is ten times longer than it was before. Definitely
something strange in your build...

Comment 5 lleland 2004-08-25 15:08:26 UTC

The problem with this process is that, when you specify a folder as a
classpath root, every descendant folder and file is scanned. The
filesystem folder tree may contain many libraries of class files, and
your project may just need to reference a few. Not everyone uses jar
files for everything, especially during development and prototyping.

What is needed is a refinement to the process of specifying a folder
classpath root that allows you to choose which folders, sub-folders,
and files are to be included in that library. Only those folders and
files would be scanned for that library, and the rest skipped. So,
even if the folder holds 100s of files and folders, only those that
relate to that library are scanned.

This can easily be done with a standard filesystem tree view with some
extra icons. For files, you would need an icon for enable and disable
scan. For folders, you would need an icon for all, some, and no sub
folders and files scanned. The root folder would default to enable
all. Context menus for folders, files, and selections of either or
both allow you to enable all or disable scanning. Double-clicking
files toggles enable/disable.

I used this kind of process for a class association mapper, and it
works quite well.
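
The selection model proposed above (enable or disable individual
folders and files, with the root defaulting to "enable all") can be
sketched as follows. This is only an illustration of the proposal, not
an existing NetBeans API; the class ScanSelection and its methods are
invented for the example. The rule set on the nearest enclosing path
decides whether a file is scanned, which also gives the "some" state
for a folder that mixes enabled and disabled children.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a per-root scan selection. */
final class ScanSelection {
    // Explicit choices, keyed by path relative to the classpath root.
    private final Map<Path, Boolean> rules = new HashMap<>();

    void enable(String relativePath) {
        rules.put(Paths.get(relativePath).normalize(), Boolean.TRUE);
    }

    void disable(String relativePath) {
        rules.put(Paths.get(relativePath).normalize(), Boolean.FALSE);
    }

    // Walk from the path up through its parents; the closest explicit
    // rule wins. With no rule at all, the root default applies.
    boolean isScanned(String relativePath) {
        for (Path p = Paths.get(relativePath).normalize(); p != null; p = p.getParent()) {
            Boolean rule = rules.get(p);
            if (rule != null) {
                return rule;
            }
        }
        return true; // root folder defaults to "enable all"
    }
}
```

For example, after disable("vendor") and enable("vendor/needed"), a
scanner consulting isScanned would skip everything under vendor except
the vendor/needed subtree.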

Comment 6 Antonin Nebuzelsky 2004-08-31 13:12:36 UTC

With Martin's improvements (mentioned in issue 43258) it got better. I
am now able to finish the scanning for 900 jars (27,000 Java files in
total). It took 30 minutes with -Xmx160m. The check after restart took
3:20 min.

Comment 7 Martin Matula 2004-09-02 13:26:47 UTC

The heap usage got to a reasonable level after the fixes described
in issue 43258, and the IDE now scales much better with an increasing
number of classpath elements.
I think it is fair enough if users need to increase the max heap size
for huge projects containing 900 or so jars.

Comment 8 Antonin Nebuzelsky 2004-09-02 13:30:46 UTC

Verified.

Comment 9 Quality Engineering 2007-09-20 09:55:48 UTC

Reorganization of java component