I had trouble with NB hanging for a long time during
startup. I traced it to trying to get a listing of
/net, which is all the hostnames within a LAN. This
takes quite a while.
This seems to be happening as the java module
tries to add this to the classpath (or something
the code in the masterfs seems to blindly want to
get a listing of /net, which should be unnecessary
in this case.
Created attachment 16169
Stack trace at the time of the hang
True, the java/j2seplatform module is only (indirectly) calling
URLMapper.findFileObject(URL). Is it even possible to call this on a
remote filesystem without creating all intermediate sibling files?
As a hack it could at least never try to get children of "/net" during
As an aside to Ivan, I'm not sure loading your JDK from a remote disk
is really a good idea to begin with... likely to slow down many
operations. You wouldn't put libc.so for a C program on an NFS server
One should be able to open a file or directory w/o having to
open all the siblings above it. Most applications manage to
access things over /net w/o incurring such problems.
As for loading JDKs from /net:
I wasn't aware that it was happening. I'm in the process
of upgrading machines. I switch my VM all the time, have more
than one place to control where I get my VM's from (over
/net when doing "official" builds and testsuite because it's
guaranteed to exist on all machines).
So ... don't blame the user!
The problem is that NB filesystem impls currently seem to create all
siblings whenever a FileObject is requested for one child of a parent.
This is not required by the API; it should be fixable.
Re. loading JDKs from remote disks - I wasn't claiming this was a user
error, just pointing out that your IDE performance may suffer if you
This can happen even if your jdk is not on a remote disk.
It's not unreasonable to put /net/<your-own-machine>/<etc>
into various paths because that way when you rlogin elsewhere you
don't have to futz with your paths. On your own
machine /net/<your-own-machine>/ is a noop to unix.
AFAIK filesystems don't blindly create any children. According to the
attached stack trace, the method children is called, but this is the impl.
of the AbstractFileSystem.List interface, which can't easily be omitted
without breaking backward compatibility. Moreover, this method doesn't
create any children (neither Files nor FileObjects) but provides the list
of children to initialize the FileObject that is the parent folder. I
don't see why it should be a perf. problem to call something like new
File ("/net").list (), which returns just Strings. Naturally, if any of
these children is requested, then all following calls will suffer from
perf. problems, which can't be prevented in any way.
Reopen if you think that I'm wrong.
I think it is exactly calling
that you never want to do, since it makes network accesses. And there
is no logical reason for it to be necessary; it is purely an artifact
of our filesystem implementations that File.list is called here, even
though only children of /net/dbx-fs/ are ever going to be used.
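To make the difference concrete, here is a minimal sketch (plain java.io.File, not the actual masterfs code) contrasting the two ways of answering "does this child exist?". The class and method names are made up for illustration. On an automounted folder the first variant is the one that enumerates every entry; the second only touches the one path that is actually needed.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class ChildProbe {
    /** Lists the whole parent folder and searches the result (the costly pattern). */
    public static boolean existsViaListing(File parent, String name) {
        String[] names = parent.list(); // null if parent is not a readable directory
        if (names == null) {
            return false;
        }
        for (String n : names) {
            if (n.equals(name)) {
                return true;
            }
        }
        return false;
    }

    /** Probes only the requested child; the parent is never enumerated. */
    public static boolean existsDirectly(File parent, String name) {
        return new File(parent, name).exists();
    }

    public static void main(String[] args) throws IOException {
        // A temporary folder stands in for /net so the demo runs anywhere.
        File parent = Files.createTempDirectory("probe").toFile();
        new File(parent, "dbx-fs").mkdir();
        System.out.println(existsViaListing(parent, "dbx-fs"));
        System.out.println(existsDirectly(parent, "dbx-fs"));
    }
}
```

Both calls give the same answer on a local disk; the point is only which one forces a full listing of the parent.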
Considering my comment of 2004-07-28 this can happen for
a variety of reasons. I'd urge you to consider addressing this
for promo-E as it is a generic issue.
What happened there was that I executed something like
The /net/djomolungma somehow ended up being recorded in the classpath
NB uses to load its own clusters and it hung for a long time trying
to load these clusters.
IMO this should be fixed ASAP.
Just executing nb with a full path that includes /net will cause
inexplicable, mystifying hangs for the user. So now
we have two cases: /net in the classpath and /net as part of
the executable path. Might there not be other places where an
even more valid use of /net will get people frustrated?
Seems quite serious for users who need to share some parts of their
projects via shared network drives. ->P2. Can this be fixed/worked
around for 4.0?
This is also a good candidate for a release note ...
provide a justification for release noting the bug including a
description of user impact and the workaround, if any.
thanks in advance
I think it can be fixed for 4.0.
Summary of the problem (for beta2):
There is poor performance and responsiveness for projects with
resources located on remote disks within a LAN (JDK, libraries,
sources etc.), and especially for resources accessed via the /net folder.
Currently there isn't any easy workaround except putting all resources
on a local disk if the performance seems insufficient.
Responsiveness depends on many factors: LAN throughput, project
configuration, user's workflow etc.
There is no easy fix that solves this issue properly.
MasterFileSystem is designed as a facade that delegates to either
LocalFileSystem or VCSFileSystem, both of which are subclasses of
1/ Add a new subinterface of the AbstractFileSystem.List interface. Something
like AbstractFileSystem.ExList with a method boolean existChild(String
name) - then both LocalFileSystem and VCSFileSystem must implement this
new interface. The disadvantage is that many significant
changes must also be made (rethought and checked) in the filesystems package
(AbstractFolder, AbstractFileObject) because e.g. events are fired only
when children were requested, the same is true for refreshFolder and
so on. This is extraordinarily risky for NB4.0 because it affects not
only net folders and the fixes must go into relatively fragile parts of the code.
2/ Implement somehow in MasterFileSystem for /net path. This is less
risky, it will affect only /net folder but will work only without cvs
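For illustration only, option 1/ could look roughly like the sketch below. ExList and existChild do not exist in NetBeans; they are the hypothetical names from the proposal above. ChildList is a local stand-in for AbstractFileSystem.List so that the sketch compiles without the NetBeans jars, and treating the argument of existChild as a resource path relative to the root is an assumption.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class ExListSketch {
    /** Local stand-in for AbstractFileSystem.List (assumption: same children(String) shape). */
    public interface ChildList {
        String[] children(String folder);
    }

    /** Hypothetical extension from option 1/; not an existing NetBeans API. */
    public interface ExList extends ChildList {
        /** name is assumed to be a resource path relative to the filesystem root. */
        boolean existChild(String name);
    }

    /** Sketch of how a java.io.File-backed filesystem might implement it. */
    public static final class LocalExList implements ExList {
        private final File root;

        public LocalExList(File root) {
            this.root = root;
        }

        @Override
        public String[] children(String folder) {
            String[] names = new File(root, folder).list(); // full listing, may be slow
            return names == null ? new String[0] : names;
        }

        @Override
        public boolean existChild(String name) {
            return new File(root, name).exists(); // single probe, no listing
        }
    }

    public static void main(String[] args) throws IOException {
        File root = Files.createTempDirectory("exlist").toFile();
        new File(root, "net/some.thing").mkdirs();
        ExList list = new LocalExList(root);
        System.out.println(list.existChild("net/some.thing"));
        System.out.println(list.existChild("net/other.host"));
    }
}
```

The interface itself is the trivial part. The risk described above is in the callers (AbstractFolder, AbstractFileObject, event firing, refreshFolder), which this sketch does not touch.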
I mentioned that it can be fixed in NB4.0, but I mapped promo-E
to NB4.0 by mistake, which naturally isn't right.
So, I think that target milestone promo-E was right and this issue
should be waived for NB4.0.
First, a /net-specific workaround won't do.
This can hit /home or any other customer-site WAN automount map.
Then, there are some puzzling things here.
a) Why is this only affecting NB4.0 and not older NB's?
In other words, is this a change in APIs and/or implementation
causing a regression? Or is this behaviour just
triggered much more often?
b) When I first posted this I had cited 'java' as coming over /net.
I got a lecture about this being a bad idea, but our QA gets 'java'
from /net. We have so many machines it's not worth installing
java on all of them.
So another puzzle. How come your QA doesn't run into this?
With the initial QA handoff, based on 4.0 3 weeks away I count
accept this symptom persisting.
c) Note that this applies not just to where java comes from but
apparently also where 'runide' comes from.
It could rear its head elsewhere.
It seems to me that any customer who does network installs can
potentially run afoul of this.
So another puzzle. Do all beta customers install locally?
Probably yes. Will FCS customers? You really want to take the
risk that they all install locally?
Re. Ivan's point (a) - the same problem existed in the 3.x filesystem
implementation. The difference is that in 3.x, you would not be likely
to create a filesystem whose "mount point" (root) was *above* an
autofs map point. E.g. you might have mounted
"/net/some.thing/some/path" and all filesystem operations would be
inside that. However in 4.0 with the switch to masterfs, it is
equivalent to having mounted just "/" (on Unix at least) and then
expanding subfolders "net", "some.thing", "some", and "path", before
getting to the rest; FileObjects are created for these intermediate
Now of course in practice it is likely that the only thing you will do
is map a URL such as "file:/net/some.thing/some/path/" back onto its
FileObject; and use that FO's children (recursively), and occasionally
walk back up its parent list to "/" without checking for siblings. So
what we want is that the intermediate folders like "/net" not be asked
for their children list unless something actually requests that
However fixing this would apparently require API changes (*) and
require a nontrivial and risky patch in core filesystem code,
specifically folder refresh logic, which is already overly complex. I
feel that risk might be justified in this case because the bug
symptoms are so severe.
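As a sketch of the access pattern we actually need (map a file: URL to a path, then walk up the parent chain to the root without ever asking a folder for its children), the following uses only java.io.File. The class and method names are made up for illustration. None of these calls touches the disk: getParentFile works purely on the path string.

```java
import java.io.File;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class ParentWalk {
    /** Returns the path for the URI followed by each ancestor, ending at the root. */
    public static List<String> ancestors(URI fileUri) {
        List<String> result = new ArrayList<>();
        // getParentFile() is pure string manipulation; no folder is listed.
        for (File f = new File(fileUri); f != null; f = f.getParentFile()) {
            result.add(f.getPath());
        }
        return result;
    }

    public static void main(String[] args) {
        URI uri = URI.create("file:/net/some.thing/some/path/");
        for (String path : ancestors(uri)) {
            System.out.println(path);
        }
    }
}
```

A FileObject hierarchy built this way would only need File.list() for a folder when some client explicitly asks for that folder's children.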
Note that Windows UNC paths don't work, either. However in that case
there are apparently JRE problems that also block a solution (e.g. in
(*) I.e.: you have some FileObject for "/net" and you ask
fo.getFileObject("some.thing"). If the FileSystem is an
AbstractFileSystem, the only way it can know whether that folder
actually exists is to call fsList.children("net") and check if
"some.thing" is among the returned values. Perhaps there is some
short-term fix we could use that would work only for LocalFileSystem
(in D) and would not introduce a new public API.
Re. Ivan's point (b) - because NetBeans QA tests the IDE the way it's
most likely to be used by the users. Sorry, I don't consider using
'java' over /net to be a typical usage scenario and would never have
thought that "it's not worth installing java" on our test machines.
Anyway, we also tried to test various scenarios such as 'java' over
/net and project files and sources over /net, and the results were not
as bad as you report. I think it depends on how your /net environment is
configured, how fast the network is, what /net/<some-machine> you're
connecting to, etc.
Re. Ivan's point (c) - we must carefully weigh the pros and cons and
compare risks here, and from my point of view it is obvious that
the mentioned fix affects the whole IDE including MasterFileSystem,
SystemFileSystem and all features based on it.
(*) - we can avoid API change but we can't avoid all the risky changes
in core filesystems impl.
There seems to be a specific workaround that should help. Create
and use symlinks pointing to the existing files or folders inside /net
(or elsewhere, depending on configuration) that caused these problems.
This workaround simulates NB3.6 behaviour.
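A minimal sketch of creating such a symlink, using java.nio.file so it stays in Java (on the command line a plain ln -s does the same). The class name and all paths are placeholders: a temporary folder stands in for the real /net location, and whether the IDE keeps using the link path rather than resolving it is an assumption taken from the workaround description above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SymlinkWorkaround {
    /** Creates a symbolic link at linkLocation that points to target. */
    public static Path link(Path linkLocation, Path target) throws IOException {
        return Files.createSymbolicLink(linkLocation, target);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for something like /net/<some-machine>/<etc>.
        Path base = Files.createTempDirectory("symlink-demo");
        Path target = Files.createDirectories(base.resolve("remote/some/path"));

        // The local path the IDE would be configured with instead of the /net path.
        Path link = link(base.resolve("local-link"), target);
        System.out.println(link + " -> " + Files.readSymbolicLink(link));
    }
}
```

The parent folders of the link are then all local, so walking up from it never reaches /net.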
There is one more option for fixing this issue (basically a
modification of the /net-specific workaround). The most important point of
this solution is to isolate all changes in the masterfs module.
A VCS filesystem will probably never have its "mount point" *above* an
autofs map point. If MasterFileSystem doesn't use LocalFileSystem either,
then we will overcome this problem.
There are 2 variants:
1/ LocalFileSystem might be replaced by a new implementation of
FileSystem in masterfs
2/ MasterFileSystem might be reimplemented to delegate directly to
java.io.File in case there isn't any suitable mounted VCS filesystem.
This reimplementation could also take into account the Windows-specific
problem with UNC paths, anticipate the future redesign of the VCS
filesystem, and simplify the MasterFileSystem impl. down to
minimal functionality ensuring the mapping of java.io.File to
FileObject. I expect this solution to lead to better performance
and to minimize memory consumption.
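To give an idea of what delegating directly to java.io.File could mean, here is a rough sketch, not the proposed masterfs implementation. LazyFileNode and its methods are hypothetical names. A child is resolved with a single existence check, the folder is listed only when the caller explicitly asks for its children, and a counter records how often that happens.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.concurrent.atomic.AtomicInteger;

public class LazyFileNode {
    /** Counts File.list() calls so the laziness can be observed. */
    public static final AtomicInteger LIST_CALLS = new AtomicInteger();

    private final File file;

    public LazyFileNode(File file) {
        this.file = file;
    }

    /** Resolves one child by name without enumerating the folder. */
    public LazyFileNode getChild(String name) {
        File child = new File(file, name);
        return child.exists() ? new LazyFileNode(child) : null;
    }

    /** Lists the folder; only called when children are explicitly requested. */
    public String[] childNames() {
        LIST_CALLS.incrementAndGet();
        String[] names = file.list();
        return names == null ? new String[0] : names;
    }

    /** Walks down from root one segment at a time; returns null if a segment is missing. */
    public static LazyFileNode resolve(File root, String... segments) {
        LazyFileNode node = new LazyFileNode(root);
        for (String segment : segments) {
            if (node == null) {
                return null;
            }
            node = node.getChild(segment);
        }
        return node;
    }

    public static void main(String[] args) throws IOException {
        // A temporary folder stands in for "/" so the demo runs anywhere.
        File root = Files.createTempDirectory("lazy").toFile();
        new File(root, "net/some.thing/some/path").mkdirs();
        LazyFileNode node = resolve(root, "net", "some.thing", "some", "path");
        System.out.println("resolved: " + (node != null));
        System.out.println("list calls: " + LIST_CALLS.get());
    }
}
```

Event firing and refresh, which are the fragile parts mentioned earlier, are left out of this sketch entirely.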
The advantage of both solutions is that the changes will be isolated and a
new version of the masterfs module will be enough. So, I suggest starting
the implementation (based on some sort of design review) on a branch. As
soon as the implementation is stable it will be possible to
use NB4.0 and only replace the old version of the masterfs module with the
new one to get rid of these problems.
I think that this solution might make waiving of this bug for NB4.0
Waiver approved for 4.0.
Just a data point.
Since I couldn't find a UI for mounted FSes and since my old
code still uses mounting, I had to blow away <userdir>/config/Mount
to reset stuff for debugging purposes.
When I restarted, the MenuWarmupTask did a MasterFileSystem.refresh()
and I've been stuck on stats of /net for the past 2 hours!
workaround (solaris/maybe linux):
man automount(1m) will tell you that there is an option, nobrowse,
that will defeat the default fetching of all mountable points of,
say, /net. Browsing is on by default, so "ls /net" will take
forever just like in the issue being discussed here.
The workaround is to alter your /etc/auto_master to this:
/net -hosts -nosuid,nobrowse
/home auto_home -nobrowse
- nobrowse is added to /net
- /net is put before +auto_master because +auto_master contains
its own /net line with browsing enabled and a subsequent nobrowse
apparently doesn't override it.
Re. -nobrowse: this is probably highly desirable anyway, even if this
issue gets fixed. Consider opening a project on a remote disk. You
will bring up a file chooser and go through /net on your way. Of
course when you enter /net it will try to show subfolders, and hang
for a long time. This is not NB's fault; it is either the system's
fault for exposing a file abstraction it cannot implement efficiently,
or perhaps the file chooser's fault for trying to show children (or
showing them synchronously).
Already fixed by merging mastersfs51551 in trunk (Date: 05/01/07).