This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 27190 - Do not serialize entire loader pool
Summary: Do not serialize entire loader pool
Status: VERIFIED FIXED
Alias: None
Product: platform
Classification: Unclassified
Component: -- Other -- (show other bugs)
Version: 3.x
Hardware: All All
: P3 blocker (vote)
Assignee: Jesse Glick
URL:
Keywords: PERFORMANCE
Depends on:
Blocks: 21675
  Show dependency tree
 
Reported: 2002-09-10 20:36 UTC by Jesse Glick
Modified: 2008-12-22 18:58 UTC (History)
4 users (show)

See Also:
Issue Type: TASK
Exception Reporter:


Attachments
Patch (7.84 KB, patch)
2002-09-13 06:56 UTC, Jesse Glick
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jesse Glick 2002-09-10 20:36:20 UTC
Currently all loaders are serialized in bulk to
loaders.ser. This is wasteful since most users
never make any customizations to any loaders at
all! Looks like around 6% of main thread. The
loader pool could instead use PCL's to detect if
any changes are actually made to loaders, and if
so, serialize only those (to userdir, under e.g.
DataLoaders/org-nb-mod-foo-FooDataLoader.ser). No
API change.

I know the new DS API is doing something similar
anyway, but we probably cannot wait for the
performance boost.
Comment 1 Petr Nejedly 2002-09-11 09:19:35 UTC
I see the biggest problem in their deserialization ;-)
Actually, loaders are one of the worst scalable parts of the module
system and their deserialization is about half of the problem
and as you have stated quite easily solvable.
See http://core.netbeans.org/ModuleSystem.html

For now (before new DS impl) I'd suggest to treat the order
as a separate object instead of using FS ordering mechanism
to allow loader reordering without biger performance impact.

Comment 2 Jaroslav Tulach 2002-09-11 14:23:08 UTC
What about serializing the loaders during first startup and module
install into loaderstemplate.ser and when saving compare the
ByteArrayOutputStream (ObjectOutputStream (loader)) with its template
in loaderstemplate.ser and safe only when there is a change, otherwise
just put the name of the class there. 

Btw. 6% is a lot, but how much you can save if you do not deserialize
but just instantiate, 2%?
Comment 3 Petr Nejedly 2002-09-11 14:59:12 UTC
Yarda,
What's wrong with separate most-of-the-time-not-present files and PCL?
I do not like inital-serializing we already use in some places
and I do not want to add another one..
Moreover, the serialized streams are different each time although
there was no customization to the loaders.
I guess it is caused by storing instance IDs into the stream
for backward object references in the stream.

> Btw. 6% is a lot, but how much you can save if you do not
> deserialize but just instantiate, 2%?
They are already instantiated and available in the LoaderPool
by that time so not deserializing into them (the are SCOs, thus
deserialization just updates existing instances) means No-Op and
saving full 6% not two-some.

You can experiment with this by deleting the
$nbuser/system/loaders.ser bofore each startup or patch the core to
not save the loader pool at all.
Comment 4 Jesse Glick 2002-09-11 15:45:05 UTC
Note that ModuleInstall's, if present, are indeed serialized into
memory and saved if changed. However this was done for two reasons:

1. People did not consistently fire change events from
ModuleInstall's. It is documented for DataLoader that you should -
after all, it is displayed with a BeanNode.

2. Most ModuleInstall's had no writeExternal methods; the module
system code optimizes for this case (using reflection) and doesn't
bother trying to serialize them. DataLoader's *normally* have
externalizable data, on the other hand (e.g. action list), so this
optimization would not work.

3. I had no problems with unpredictable .ser binary contents with
ModuleInstall. Perhaps ExtensionList is nondeterministic?

So I would prefer to save .ser only if there have been fired changes.

Re. saving one big fake .ser file vs. folder of .settings (e.g.): for
now I'll probably stick to saving one big .ser for the sake of
possible overhead with .settings, since other timing results hint that
parsing these is probably 10-15ms apiece (?). Folder ordering is very
likely insignificant (I have never seen this occupy a substantial
amount of time in startup profiles). There are various semantic
advantages to saving a folder of .settings, though:

- by making LoaderPoolNode a FolderNode, the current Options window
automatically permits project-specific loader configs, or later on
session-specific configs

- resorting code can be discarded in favor of the standard mechanism;
besides simplifying the code, this means addition of new loaders would
not disturb existing loader ordering unnecessarily (a bug in the
current impl)

- users would be permitted to "delete" loaders if they really wanted
to; the deletion can be reverted by just removing the mask from the
user dir

- would help prototype new DS API

I will consider it, if it can be implemented with no significant
overhead and without much work. For example, it would probably not be
hard to write special-case optimized code in the .settings parser to
recognize FixedFileSystem-generated unmodified settings, and extract
just the class name. This would leave only overhead for creating a few
extra FileObject's and DataObject's, which is probably not large. Easy
enough to measure per-module overhead.
Comment 5 Petr Nejedly 2002-09-11 17:26:04 UTC
Ad 1, 2: I was aiming at SharedClasSobject and its reset() method ;-)
Ad 3) ExtensionList serializes a TreeSet, right? That explains it!

I was not thinking about (new-DS alike) .settings files at all.

So one (not that)big *sparse* loaders.ser file and PCLs
is the way to go for now?
Comment 6 Jesse Glick 2002-09-12 16:51:27 UTC
Yeah, I will probably try a big sparse loaders.ser + PCLs for now,
unless I can see an easy way to make the .settings approach really fast.
Comment 7 Jesse Glick 2002-09-13 02:32:21 UTC
Well, .settings + FolderInstance is not really fast enough, though it
was not as bad as I would have guessed. Try
core/test/perf/src/org/netbeans/core/projects/XMLSettingsTest.java.
(Classpath should have core.jar, openide.jar, performance/src/,
probably XML parser, etc.) It makes a folder containing some number of
unmodified .settings files - using <instance class="..."/> - and some
number of modified .settings files - using <serialdata
class="...">...</serialdata>.

Very roughly, on my P1200 w/ JDK 1.4.1, each .settings file costs
about 3ms, whether modified or not. I haven't yet tried running a
profiler on this stuff to see what is taking so long.

Anyway, this is too much for saving loaders. (Think about this for the
new DS API, too.) For now I will go with a manual .ser file.
Comment 8 Jesse Glick 2002-09-13 06:51:58 UTC
OK, I just tweaked the loader pool logic and get an improvement of
2.38% on stock NB startup, more than two standard deviations.
loaders.ser is much smaller this way. For unmodified loaders, only the
class name is saved (to preserve the ordering). Still need to check
compatibility with the old format (ought to be OK).
Comment 9 Jesse Glick 2002-09-13 06:56:22 UTC
Created attachment 7407 [details]
Patch
Comment 10 Jesse Glick 2002-09-13 17:37:56 UTC
committed     Up-To-Date  1.60       
core/src/org/netbeans/core/LoaderPoolNode.java
Comment 11 Petr Nejedly 2002-09-16 21:24:19 UTC
Nice improvement.
I've watched the new loaders.ser and the storage process and there are
two loaders actually serialized
each time: DefaultLoader (which is modified from the
text module's ModuleInstall consistently each time)
and ScriptLoader (which fires (null,null,null) on
the startup Lookup(ScriptType) change)
I'll make some measurements tomorow to see the impact
of the change and the impact of (potentional) no-change patch.
Comment 12 Jesse Glick 2002-09-20 03:22:52 UTC
Yeah, I know about DefaultLoader and ScriptLoader. Didn't see anything
obvious to do to fix these - DefaultLoader is just a case of needing
Looks, and re. ScriptLoader, well perhaps it could avoid firing
changes unless lookup actually changes (not just gets initialized);
but the API does not support distinguishing between "instrinsic" and
"extrinsic" changes to a loader (i.e. normal changes to its config,
and changes to other things which affect its recognition).
Comment 13 Petr Nejedly 2002-09-23 09:58:09 UTC
Re ScriptLoader: it fires an event just to tell the LoaderPool
it may possibly changed its recognition criteria, while none
of its properties have actually changed.
The LoaderPool reacts to any property change, right?
There is no property really changed in the Loader, right?

It may be possible  to introduce some "property" that will
be ignored by the LoaderPool persistence if it will be worth it.

It remainds me that I forget to post the numbers:
The LoaderPool deserialization time have changed from ~1300ms
to ~250ms on my machine by this fix.
Comment 14 Jesse Glick 2002-11-29 16:32:11 UTC
"It may be possible  to introduce some "property" that will be ignored
by the LoaderPool persistence if it will be worth it." - yes, this
would be possible, but I am not sure whether it is worth the bother...
anyway ScriptLoader should behave differently under new Datasystems:
should probably have one loader instance per script type, I guess.
Comment 15 Petr Nejedly 2002-12-02 11:02:37 UTC
And the DefaultLoader will not be touched by the text module
under new DS as well.
Nothing more to do now.