Bug 45963 - masterfs needlessly gets children
Status: RESOLVED FIXED
Product: platform
Classification: Unclassified
Component: Filesystems
Version: 4.x
Hardware: Sun SunOS
Priority: P2
Target Milestone: 4.x
Assigned To: rmatous
QA Contact: issues@platform
Keywords: PERFORMANCE, RELNOTE
Depends on:
Blocks:
Reported: 2004-07-09 00:30 UTC by ivan
Modified: 2008-12-23 00:16 UTC (History)

See Also:
Issue Type: DEFECT


Attachments
Stack trace at the time of the hang (5.42 KB, text/plain)
2004-07-09 00:31 UTC, ivan
Description ivan 2004-07-09 00:30:18 UTC
I had trouble with NB hanging for a long time during
startup. I traced it to an attempt to get a listing of
/net, which is all the hostnames within a LAN. This
takes quite a while.

This seems to happen as the java module
tries to add this to the classpath (or something
like that):
"file:/net/dbx-fs/export/fs2/tools/j2sdk1.4.2_02/sparc-S2/src.zip

The code in masterfs seems to blindly want to
get a listing of /net, which should be unnecessary
in this case.
Comment 1 ivan 2004-07-09 00:31:56 UTC
Created attachment 16169
Stack trace at the time of the hang
Comment 2 Jesse Glick 2004-07-11 17:11:03 UTC
True, the java/j2seplatform module is only (indirectly) calling
URLMapper.findFileObject(URL). Is it even possible to call this on a
remote filesystem without creating all intermediate sibling files?

As a hack it could at least never try to get children of "/net" during
a refresh.

As an aside to Ivan, I'm not sure loading your JDK from a remote disk
is really a good idea to begin with... likely to slow down many
operations. You wouldn't put libc.so for a C program on an NFS server
I guess.
Comment 3 ivan 2004-07-12 23:29:23 UTC
One should be able to open a file or directory w/o having to
open all the siblings above it. Most applications manage to
access things over /net w/o incurring such problems.

As for loading JDKs from /net:
I wasn't aware that it was happening. I'm in the process
of upgrading machines. I switch my VM all the time, have more
than one place to control where I get my VM's from (over
/net when doing "official" builds and testsuite because it's
guaranteed to exist on all machines).

So ... don't blame the user! 



Comment 4 Jesse Glick 2004-07-12 23:39:52 UTC
The problem is that NB filesystem impls currently seem to create all
siblings whenever a FileObject is requested for one child of a parent.
This is not required by the API; it should be fixable.

Re. loading JDKs from remote disks - I wasn't claiming this was a user
error, just pointing out that your IDE performance may suffer if you
do this.
Comment 5 ivan 2004-07-28 22:17:47 UTC
This can happen even if your jdk is not on a remote disk.
It's not unreasonable to put /net/<your-own-machine>/<etc>
into various paths because that way when you rlogin elsewhere you 
don't have to futz with your paths. On your own
machine /net/<your-own-machine>/ is a no-op to Unix.
Comment 6 rmatous 2004-07-29 11:32:46 UTC
AFAIK the filesystems don't blindly create any children. According to
the attached stack trace, the method children is called, but this is the
impl. of the AbstractFileSystem.List interface, which can't easily be
omitted without breaking backward compatibility. Moreover, this method
doesn't create any children (neither Files nor FileObjects); it only
provides the list of children to initialize the FileObject that is the
parent folder. I don't see why it should be a perf. problem to call
something like new File ("/net").list (), which returns just a String[].
Naturally, if any of these children is requested, then all following
calls will suffer from perf. problems, which can't be prevented in any
way.

Reopen if you think that I'm wrong.
Comment 7 Jesse Glick 2004-07-29 21:03:03 UTC
I think it is exactly calling

  new File("/net").list()

that you never want to do, since it makes network accesses. And there
is no logical reason for it to be necessary; it is purely an artifact
of our filesystem implementations that File.list is called here, even
though only children of /net/dbx-fs/ are ever going to be used.
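
To illustrate the distinction, here is a minimal sketch in plain Java
(not NetBeans code; the class and method names are hypothetical). It
contrasts finding one child by listing the parent with testing the
child's own path. On an automounted directory such as /net, the first
form is the one that would enumerate every host, while the second
touches only the named entry.

```java
import java.io.File;

final class DirectLookup {
    /** Checks for a child by listing the whole parent directory. */
    static boolean existsViaListing(File parent, String name) {
        String[] names = parent.list(); // null if parent is not a readable directory
        if (names == null) {
            return false;
        }
        for (String n : names) {
            if (n.equals(name)) {
                return true;
            }
        }
        return false;
    }

    /** Checks for a child by testing its own path, without listing the parent. */
    static boolean existsDirectly(File parent, String name) {
        return new File(parent, name).exists();
    }

    public static void main(String[] args) {
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        System.out.println(existsViaListing(tmp, "no-such-entry"));
        System.out.println(existsDirectly(tmp, "no-such-entry"));
    }
}
```

Both methods answer the same question; only the first pays the cost of
a full directory listing.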
Comment 8 ivan 2004-07-30 02:46:15 UTC
Considering my comment of 2004-07-28 this can happen for
a variety of reasons. I'd urge you to consider addressing this 
for promo-E as it is a generic issue.

What happened there was that I executed something like
    /net/djomolungma/nb-install/bin/runide
The /net/djomolungma somehow ended up being recorded in the classpath
NB uses to load its own clusters, and it hung for a long time trying
to load these clusters.


Comment 9 ivan 2004-08-05 20:25:04 UTC
IMO this should be fixed ASAP.
Just executing NB with a full path that includes /net will cause
inexplicable, mystifying hangs for the user. So now
we have two cases: /net in the classpath and /net as part of
the executable path. Might there not be other places where an
even more valid use of /net will get people frustrated?
Comment 10 Jan Chalupa 2004-09-17 12:25:03 UTC
Seems quite serious for users who need to share some parts of their
projects via shared network drives. ->P2. Can this be fixed/worked
around for 4.0?
Comment 11 Marian Mirilovic 2004-09-17 14:04:43 UTC
this is also good candidate for release note ...

Radek, 
provide a justification for release noting the bug including a
description of user impact and the workaround, if any. 
thanks in advance
Comment 12 rmatous 2004-09-17 16:28:55 UTC
I think it can be fixed for 4.0.

Summary of the problem (for beta2):
There is poor performance and responsiveness for projects with
resources located on remote disks within a LAN (JDK, libraries,
sources etc.) and especially for resources accessed over the /net folder.

Currently there isn't any easy workaround except putting all resources
on a local disk if the performance seems insufficient.

Responsiveness depends on many factors: LAN throughput, project
configuration, user's workflow etc. 



Comment 13 rmatous 2004-10-22 13:53:35 UTC
This issue isn't easy to solve properly with a simple fix.

MasterFileSystem is designed as a facade that delegates to either
LocalFileSystem or VCSFileSystem, both of which are subclasses of
AbstractFileSystem.

Options:
1/ Add a new subinterface of the AbstractFileSystem.List interface,
something like AbstractFileSystem.ExList with a method boolean
existChild(String name); both LocalFileSystem and VCSFileSystem would
then have to implement this new interface. The disadvantage is that many
significant changes must also be made (rethought and checked) in the
filesystems package (AbstractFolder, AbstractFileObject) because, e.g.,
events are fired only when children were requested, the same is true for
refreshFolder, and so on. This is extraordinarily risky for NB4.0
because it affects not only net folders, and the fixes must go into
relatively fragile parts of the code.

2/ Implement it somehow in MasterFileSystem for the /net path. This is
less risky and will affect only the /net folder, but it will work only
without CVS support.

I mentioned that it can be fixed in NB4.0, but I mistakenly mapped
promo-E onto NB4.0, which naturally isn't true.

So I think that the target milestone promo-E was right and this issue
should be waived for NB4.0.
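
For readers who want to picture option 1, here is a rough sketch in
plain Java. It is not the NetBeans API: ChildList is a local stand-in
for the shape of AbstractFileSystem.List, ExList and existChild follow
the names proposed above, and LocalExList plus the interpretation of
the argument as a path relative to the root are assumptions made only
for illustration.

```java
import java.io.File;

/** Stand-in for a listing interface: returns the names in a folder. */
interface ChildList {
    String[] children(String folder);
}

/** Hypothetical extension: answers an existence query without a listing. */
interface ExList extends ChildList {
    boolean existChild(String name);
}

/** Illustrative implementation backed by java.io.File under a fixed root. */
final class LocalExList implements ExList {
    private final File root;

    LocalExList(File root) {
        this.root = root;
    }

    @Override
    public String[] children(String folder) {
        // Full listing; may be slow on automounted folders. May return null.
        return new File(root, folder).list();
    }

    @Override
    public boolean existChild(String name) {
        // Assumption: name is a path relative to the root, e.g. "net/some.thing".
        return new File(root, name).exists();
    }

    public static void main(String[] args) {
        ExList list = new LocalExList(new File(System.getProperty("java.io.tmpdir")));
        System.out.println(list.existChild("no-such-entry"));
    }
}
```

The sketch only shows the interface shape. It does not address the
event firing and refresh logic that the comment identifies as the risky
part of the change.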
Comment 14 ivan 2004-10-22 21:38:02 UTC
First, a /net-specific workaround won't do. 
This can hit /home or any other customer-site WAN automount map.

Then, there are some puzzling things here.

a) Why is this only affecting NB4.0 and not older NBs?
In other words, is this a change in APIs and/or implementation
causing a regression? Or is this behaviour just
triggered much more often?

b) When I first posted this I had cited 'java' as coming over /net.
I got a lecture about this being a bad idea, but our QA gets 'java'
from /net. We have so many machines it's not worth installing
java on all of them.
So another puzzle. How come your QA doesn't run into this?
With the initial QA handoff, based on 4.0, 3 weeks away, I can't
accept this symptom persisting.

c) Note that this applies not just to where java comes from but
apparently also to where 'runide' comes from.
It could rear its head elsewhere.
It seems to me that any customer who does network installs can
potentially run afoul of this.
So another puzzle. Do all beta customers install locally?
Probably yes. Will FCS customers? You really want to take the
risk that they all install locally?
Comment 15 Jesse Glick 2004-10-22 22:02:08 UTC
Re. Ivan's point (a) - the same problem existed in the 3.x filesystem
implementation. The difference is that in 3.x, you would not be likely
to create a filesystem whose "mount point" (root) was *above* an
autofs map point. E.g. you might have mounted
"/net/some.thing/some/path" and all filesystem operations would be
inside that. However in 4.0 with the switch to masterfs, it is
equivalent to having mounted just "/" (on Unix at least) and then
expanding subfolders "net", "some.thing", "some", and "path", before
getting to the rest; FileObjects are created for these intermediate
folders.

Now of course in practice it is likely that the only thing you will do
is map a URL such as "file:/net/some.thing/some/path/" back onto its
FileObject; and use that FO's children (recursively), and occasionally
walk back up its parent list to "/" without checking for siblings. So
what we want is that the intermediate folders like "/net" not be asked
for their children list unless something actually requests that
information.

However fixing this would apparently require API changes (*) and
require a nontrivial and risky patch in core filesystem code,
specifically folder refresh logic, which is already overly complex. I
feel that risk might be justified in this case because the bug
symptoms are so severe.

Note that Windows UNC paths don't work, either. However in that case
there are apparently JRE problems that also block a solution (e.g. in
URL handling).

(*) I.e.: you have some FileObject for "/net" and you ask
fo.getFileObject("some.thing"). If the FileSystem is an
AbstractFileSystem, the only way it can know whether that folder
actually exists is to call fsList.children("net") and check if
"some.thing" is among the returned values. Perhaps there is some
short-term fix we could use that would work only for LocalFileSystem
(in D) and would not introduce a new public API.
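
The desired behaviour described above (intermediate folders are not
asked for their children unless something requests them) could look
roughly like the following sketch. This is plain Java, not masterfs
code; LazyFolder, resolve and the LISTINGS counter are hypothetical
names used only to make the idea concrete.

```java
import java.io.File;
import java.util.concurrent.atomic.AtomicInteger;

final class LazyFolder {
    /** Counts real directory listings, to show when they happen. */
    static final AtomicInteger LISTINGS = new AtomicInteger();

    private final LazyFolder parent;
    private final File file;
    private String[] childNames; // stays null until children() is called

    private LazyFolder(LazyFolder parent, File file) {
        this.parent = parent;
        this.file = file;
    }

    /** Looks up one child by testing its own path; the folder is not listed. */
    LazyFolder child(String name) {
        File f = new File(file, name);
        return f.exists() ? new LazyFolder(this, f) : null;
    }

    /** Lists the folder on first request only, then serves the cached names. */
    String[] children() {
        if (childNames == null) {
            LISTINGS.incrementAndGet();
            String[] names = file.list();
            childNames = (names != null) ? names : new String[0];
        }
        return childNames.clone();
    }

    LazyFolder getParent() {
        return parent;
    }

    File getFile() {
        return file;
    }

    /** Walks from root through the given segments without listing any folder. */
    static LazyFolder resolve(File root, String... segments) {
        LazyFolder current = new LazyFolder(null, root);
        for (String segment : segments) {
            current = current.child(segment);
            if (current == null) {
                return null;
            }
        }
        return current;
    }
}
```

With this shape, something like LazyFolder.resolve(new File("/"), "net",
"some.thing", "some", "path") would create the intermediate objects and
allow walking back up via getParent(), while only an explicit call to
children() on a given folder triggers a listing of that folder.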
Comment 16 Jan Chalupa 2004-10-23 01:23:18 UTC
Re. Ivan's point (b) - because NetBeans QA tests the IDE the way it's
most likely to be used by the users. Sorry, I don't consider using
'java' over /net to be a typical usage scenario and would never have
thought that "it's not worth installing java" on our test machines.

Anyway, we also tried to test various scenarios such as 'java' over
/net and project files and sources over /net, and the results were not
as bad as you report. I think it depends on how your /net environment is
configured, how fast the network is, what /net/<some-machine> you're
connecting to, etc.

Comment 17 rmatous 2004-10-25 10:52:44 UTC
Re. Ivan's point (c) - we must carefully weigh the pros and cons and
compare risks here, and from my point of view it is obvious that
the mentioned fix affects the whole IDE, including MasterFileSystem,
SystemFileSystem and all features based on them.

(*) - we can avoid API change but we can't avoid all the risky changes
in core filesystems impl.

Comment 18 rmatous 2004-10-26 11:08:58 UTC
WORKAROUND:
There seems to be a specific workaround that should help. Create
and use symlinks pointing to the existing files or folders inside /net
(or elsewhere, according to configuration) that caused these problems.
This workaround simulates NB3.6 behaviour.


ALTERNATIVE SOLUTION:
There is one more option for fixing this issue (basically a
modification of the /net-specific workaround). The most important aspect
of this solution is isolating all changes in the masterfs module.

A VCS filesystem will probably never have its "mount point" *above* an
autofs map point. If MasterFileSystem stops using LocalFileSystem as
well, then we will overcome this problem.
There are 2 variants:
1/ LocalFileSystem might be replaced by a new implementation of
FileSystem in masterfs.

2/ MasterFileSystem might be reimplemented to delegate directly to
java.io.File in case there isn't any suitable mounted VCS filesystem.
This reimplementation could also take into account the Windows-specific
problem with UNC paths, anticipate the future redesign of the VCS
filesystem, and simplify the MasterFileSystem impl., keeping only the
minimal functionality that ensures mapping java.io.File to
FileObject. I expect this solution to lead to better performance
and to minimize memory consumption.

The advantage of both variants is that the changes will be isolated and
a new version of the masterfs module will be enough. So I suggest
starting the implementation (based on some sort of design review) on a
branch. As soon as the implementation is stable, it will be possible to
use NB4.0 and only replace the old version of the masterfs module with
the new one to get rid of these problems.

I think that this solution might make waiving this bug for NB4.0
more acceptable.




Comment 19 Jan Chalupa 2004-11-01 07:57:24 UTC
Waiver approved for 4.0.
Comment 20 ivan 2004-11-05 01:07:25 UTC
Just a data point.
Since I couldn't find a UI for mounted FSes and since my old
code still uses mounting, I had to blow away <userdir>/config/Mount
to reset stuff for debugging purposes.
When I restarted, the MenuWarmupTask did a MasterFileSystem.refresh()
and I've been stuck on stats of /net for the past 2 hours!
Comment 21 ivan 2004-11-09 01:30:25 UTC
workaround (solaris/maybe linux):

man automount(1M) will tell you that there is an option, nobrowse,
that will defeat the default fetching of all mountable points of,
say, /net. Browsing is on by default, so "ls /net" will take
forever just like in the issue being discussed here.

The workaround is to alter your /etc/auto_master to this:

/net            -hosts          -nosuid,nobrowse
+auto_master
/home           auto_home       -nobrowse
/xfn            -xfn

- nobrowse is added to /net 
- /net is put before +auto_master because +auto_master contains
  its own /net line with browsing enabled and a subsequent nobrowse
  apparently doesn't override it.
Comment 22 Jesse Glick 2004-11-09 23:24:40 UTC
Re. -nobrowse: this is probably highly desirable anyway, even if this
issue gets fixed. Consider opening a project on a remote disk. You
will bring up a file chooser and go through /net on your way. Of
course when you enter /net it will try to show subfolders, and hang
for a long time. This is not NB's fault; it is either the system's
fault for exposing a file abstraction it cannot implement efficiently,
or perhaps the file chooser's fault for trying to show children (or
showing them synchronously).
Comment 23 rmatous 2005-01-14 10:28:21 UTC
Already fixed by merging mastersfs51551 in trunk (Date: 05/01/07).

