I had trouble with NB hanging for a long time during startup. I traced it to an attempt to get a listing of /net, which contains all the hostnames within a LAN. This takes quite a while. It seems to happen as the java module tries to add this to the classpath (or something like that): "file:/net/dbx-fs/export/fs2/tools/j2sdk1.4.2_02/sparc-S2/src.zip". The code in masterfs seems to blindly want a listing of /net, which should be unnecessary in this case.
Created attachment 16169 [details] Stack trace at the time of the hang
True, the java/j2seplatform module is only (indirectly) calling URLMapper.findFileObject(URL). Is it even possible to call this on a remote filesystem without creating all the sibling files of the intermediate folders? As a hack it could at least never try to get children of "/net" during a refresh. As an aside to Ivan, I'm not sure loading your JDK from a remote disk is really a good idea to begin with... likely to slow down many operations. You wouldn't put libc.so for a C program on an NFS server I guess.
One should be able to open a file or directory w/o having to open all the siblings above it. Most applications manage to access things over /net w/o incurring such problems. As for loading JDKs from /net, I wasn't aware that it was happening. I'm in the process of upgrading machines. I switch my VM all the time and have more than one place to control where I get my VMs from (over /net when doing "official" builds and testsuite because it's guaranteed to exist on all machines). So ... don't blame the user!
The problem is that NB filesystem impls currently seem to create all siblings whenever a FileObject is requested for one child of a parent. This is not required by the API; it should be fixable. Re. loading JDKs from remote disks - I wasn't claiming this was a user error, just pointing out that your IDE performance may suffer if you do this.
This can happen even if your jdk is not on a remote disk. It's not unreasonable to put /net/<your-own-machine>/<etc> into various paths because that way when you rlogin elsewhere you don't have to futz with your paths. On your own machine /net/<your-own-machine>/ is a no-op to unix.
AFAIK filesystems don't blindly create any children. According to the attached stack trace, the method children is called, but this is the impl. of the AbstractFileSystem.List interface, which can't easily be omitted without breaking backward compatibility. Moreover, this method doesn't create any children (neither Files nor FileObjects) but provides the list of children as initialization of the FileObject that is the parent folder. I don't see why it should be a perf. problem to call something like new File ("/net").list (), which returns just String[]. Naturally, if any of these children is requested then all following calls will suffer from perf. problems, which can't be prevented in any way. Reopen if you think that I'm wrong.
I think it is exactly calling new File("/net").list() that you never want to do, since it makes network accesses. And there is no logical reason for it to be necessary; it is purely an artifact of our filesystem implementations that File.list is called here, even though only children of /net/dbx-fs/ are ever going to be used.
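To illustrate the difference being argued here, the following is a minimal sketch in plain java.io, not NetBeans code. The class and method names (ChildLookup, existsByListing, existsByStat) are made up for this example. It contrasts enumerating a parent directory with checking one named child directly; on an automounted /net only the first form has to enumerate every host.

```java
import java.io.File;
import java.util.Arrays;

/** Hypothetical helper: two ways to answer "does parent have a child called name?". */
final class ChildLookup {

    /** Enumerates the whole parent directory, then searches the result for the name. */
    static boolean existsByListing(File parent, String name) {
        String[] names = parent.list(); // null if parent is not a readable directory
        return names != null && Arrays.asList(names).contains(name);
    }

    /** Checks only the one child path; the parent directory is never enumerated. */
    static boolean existsByStat(File parent, String name) {
        return new File(parent, name).exists();
    }

    public static void main(String[] args) {
        // Defaults are arbitrary; pass e.g. "/net" and a host name to try it on an automount.
        File parent = new File(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        String name = args.length > 1 ? args[1] : "no-such-child";
        System.out.println("by listing: " + existsByListing(parent, name));
        System.out.println("by stat:    " + existsByStat(parent, name));
    }
}
```

The two methods can disagree in edge cases (for example on case-insensitive file systems, or when the parent is not readable but the child is), so this is only meant to show where the network cost comes from.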
Considering my comment of 2004-07-28, this can happen for a variety of reasons. I'd urge you to consider addressing this for promo-E as it is a generic issue. What happened there was that I executed something like /net/djomolungma/nb-install/bin/runide. The /net/djomolungma somehow ended up being recorded in the classpath NB uses to load its own clusters, and it hung for a long time trying to load these clusters.
IMO this should be fixed ASAP. Just executing nb with a full path that includes /net will cause inexplicable, mystifying hangs for the user. So now we have two cases: /net in the classpath and /net as part of the executable path. Might there not be other places where an even more valid use of /net will get people frustrated?
Seems quite serious for users who need to share some parts of their projects via shared network drives. ->P2. Can this be fixed/worked around for 4.0?
this is also good candidate for release note ... Radek, provide a justification for release noting the bug including a description of user impact and the workaround, if any. thanks in advance
I think it can be fixed for 4.0. Summary of the problem (for beta2): There is poor performance and responsiveness for projects with resources located on remote disks within a LAN (JDK, libraries, sources etc.), and especially for resources accessed over the /net folder. Currently there isn't any easy workaround except putting all resources on a local disk if the performance seems insufficient. Responsiveness depends on many factors: LAN throughput, project configuration, user's workflow etc.
This issue is not easy to solve properly with a simple fix. MasterFileSystem is designed as a facade that delegates to either LocalFileSystem or VCSFileSystem, both of which are subclasses of AbstractFileSystem. Options:
1/ Add a new extension of the AbstractFileSystem.List interface, something like AbstractFileSystem.ExList with a method boolean existChild(String name). Both LocalFileSystem and VCSFileSystem must then implement this new interface. The disadvantage is that many significant changes must also be made (rethought and checked) in the filesystems package (AbstractFolder, AbstractFileObject), because e.g. events are fired only when children were requested, the same is true for refreshFolder, and so on. This is extraordinarily risky for NB4.0 because it affects not only net folders, and the fixes must go into relatively fragile parts of the code.
2/ Implement something in MasterFileSystem specifically for the /net path. This is less risky and will affect only the /net folder, but it will work only without CVS support.
I mentioned that it can be fixed in NB4.0, but I mapped promo-E onto NB4.0 by mistake, which of course isn't right. So I think that target milestone promo-E was right and this issue should be waived for NB4.0.
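Option 1 could look roughly like the sketch below. It is self-contained and does not use the real NetBeans classes: ChildList is a stand-in for the existing listing contract, ExChildList plays the role of the proposed ExList, and LocalExList is a hypothetical java.io.File-backed implementation. Only the existChild(String name) method comes from the proposal; everything else is an assumption made for illustration.

```java
import java.io.File;

/** Stand-in for the existing listing contract: all child names of a folder. */
interface ChildList {
    String[] children(String folder);
}

/** Hypothetical extension: answer a single-name query without a full listing. */
interface ExChildList extends ChildList {
    /** @param name resource path relative to the filesystem root, e.g. "net/some.thing" */
    boolean existChild(String name);
}

/** Hypothetical local implementation backed directly by java.io.File. */
final class LocalExList implements ExChildList {
    private final File root;

    LocalExList(File root) {
        this.root = root;
    }

    @Override
    public String[] children(String folder) {
        String[] names = new File(root, folder).list(); // enumerates the folder
        return names != null ? names : new String[0];
    }

    @Override
    public boolean existChild(String name) {
        return new File(root, name).exists(); // checks one path only
    }
}
```

As the comment above notes, the interface itself is the small part; the risk lies in the folder, event, and refresh logic that currently assumes a full children list is available.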
First, a /net-specific workaround won't do. This can hit /home or any other customer-site WAN automount map. Then, there are some puzzling things here. a) Why is this only affecting NB4.0 and not older NBs? In other words, is this a change in APIs and/or implementation causing a regression? Or is this behaviour just triggered much more often? b) When I first posted this I had cited 'java' as coming over /net. I got a lecture about this being a bad idea, but our QA gets 'java' from /net. We have so many machines it's not worth installing java on all of them. So another puzzle. How come your QA doesn't run into this? With the initial QA handoff, based on 4.0 3 weeks away I count accept this symptom persisting. c) Note that this applies not just to where java comes from but apparently also to where 'runide' comes from. It could rear its head elsewhere. It seems to me that any customer who does network installs can potentially run afoul of this. So another puzzle. Do all beta customers install locally? Probably yes. Will FCS customers? Do you really want to take the risk that they all install locally?
Re. Ivan's point (a) - the same problem existed in the 3.x filesystem implementation. The difference is that in 3.x, you would not be likely to create a filesystem whose "mount point" (root) was *above* an autofs map point. E.g. you might have mounted "/net/some.thing/some/path" and all filesystem operations would be inside that. However in 4.0 with the switch to masterfs, it is equivalent to having mounted just "/" (on Unix at least) and then expanding subfolders "net", "some.thing", "some", and "path", before getting to the rest; FileObjects are created for these intermediate folders. Now of course in practice it is likely that the only thing you will do is map a URL such as "file:/net/some.thing/some/path/" back onto its FileObject, use that FO's children (recursively), and occasionally walk back up its parent list to "/" without checking for siblings. So what we want is that the intermediate folders like "/net" not be asked for their children list unless something actually requests that information. However fixing this would apparently require API changes (*) and require a nontrivial and risky patch in core filesystem code, specifically folder refresh logic, which is already overly complex. I feel that risk might be justified in this case because the bug symptoms are so severe. Note that Windows UNC paths don't work, either. However in that case there are apparently JRE problems that also block a solution (e.g. in URL handling). (*) I.e.: you have some FileObject for "/net" and you ask fo.getFileObject("some.thing"). If the FileSystem is an AbstractFileSystem, the only way it can know whether that folder actually exists is to call fsList.children("net") and check if "some.thing" is among the returned values. Perhaps there is some short-term fix we could use that would work only for LocalFileSystem (in D) and would not introduce a new public API.
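The desired behaviour, where intermediate folders are not asked for their children unless something requests them, can be sketched as follows. This is not the masterfs implementation; LazyFolder and its members are hypothetical names, and the sketch ignores events, refresh, and caching of negative results, which is where the real difficulty lies.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical folder node that lists its directory only on explicit request. */
final class LazyFolder {
    private final File file;
    private final Map<String, LazyFolder> known = new HashMap<>();
    private String[] listing; // filled in only when children() is called
    int listCalls;            // counts real directory enumerations, for demonstration

    LazyFolder(File file) {
        this.file = file;
    }

    /** Looks up one child by name with a single existence check; siblings are not enumerated. */
    LazyFolder getChild(String name) {
        LazyFolder child = known.get(name);
        if (child == null) {
            File f = new File(file, name);
            if (!f.exists()) {
                return null;
            }
            child = new LazyFolder(f);
            known.put(name, child);
        }
        return child;
    }

    /** Full listing, computed on the first explicit request and cached afterwards. */
    String[] children() {
        if (listing == null) {
            listCalls++;
            String[] names = file.list();
            listing = names != null ? names : new String[0];
        }
        return listing.clone();
    }

    /** Walks a slash-separated relative path using getChild only, so no folder is listed. */
    LazyFolder resolve(String relativePath) {
        LazyFolder current = this;
        for (String part : relativePath.split("/")) {
            if (part.isEmpty()) {
                continue;
            }
            current = current.getChild(part);
            if (current == null) {
                return null;
            }
        }
        return current;
    }
}
```

With a root node for "/", a call such as resolve("net/some.thing/some/path") would touch only the four named folders, and "/net" would be enumerated only if something later called children() on it.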
Re. Ivan's point (b) - because NetBeans QA tests the IDE the way it's most likely to be used by the users. Sorry, I don't consider using 'java' over /net to be a typical usage scenario and would never have thought that "it's not worth installing java" on our test machines. Anyway, we also tried to test various scenarios such as 'java' over /net and project files and sources over /net, and the results were not as bad as you report. I think it depends on how your /net environment is configured, how fast the network is, what /net/<some-machine> you're connecting to, etc.
Re. Ivan's point (c) - we must carefully weigh the pros and cons and compare risks here, and from my point of view it is obvious that the mentioned fix affects the whole IDE, including MasterFileSystem, SystemFileSystem and all features based on them. (*) - we can avoid an API change but we can't avoid all the risky changes in the core filesystems impl.
WORKAROUND: There seems to be a specific workaround that should help. Create and use symlinks pointing to the existing files or folders inside /net (or elsewhere, according to configuration) that caused these problems. This workaround simulates NB3.6 behaviour. ALTERNATIVE SOLUTION: There is one more option for fixing this issue (basically a modification of the /net-specific workaround). The most important point of this solution is to isolate all changes in the masterfs module. The VCS filesystem will probably never have a "mount point" *above* an autofs map point. If MasterFileSystem doesn't use LocalFileSystem at all, we will overcome this problem. There are 2 variants: 1/ LocalFileSystem might be replaced by a new implementation of FileSystem in masterfs. 2/ MasterFileSystem might be reimplemented to delegate directly to java.io.File when there isn't any suitable mounted VCS filesystem. This reimplementation could also take into account the Windows-specific problem with UNC paths, anticipate the future redesign of the VCS filesystem, and simplify the MasterFileSystem impl. down to minimal functionality that ensures mapping java.io.File to FileObject. I expect this solution should lead to better performance and should minimize memory consumption. The advantage of both variants is that the changes will be isolated and a new version of the masterfs module will be enough. So, I suggest starting the implementation (based on some sort of design review) on a branch. As soon as the implementation is stable, it will be possible to use NB4.0 and only replace the old version of the masterfs module with the new one to get rid of these problems. I think that this solution might make waiving this bug for NB4.0 more acceptable.
Waiver approved for 4.0.
Just a data point. Since I couldn't find a UI for mounted FSes and since my old code still uses mounting, I had to blow away <userdir>/config/Mount to reset stuff for debugging purposes. When I restarted, the MenuWarmupTask did a MasterFileSystem.refresh() and I've been stuck on stats of /net for the past 2 hours!
workaround (solaris/maybe linux): man automount(1M) will tell you that there is an option, nobrowse, that will defeat the default fetching of all mountable points of, say, /net. Browsing is on by default, so "ls /net" will take forever just like in the issue being discussed here. The workaround is to alter your /etc/auto_master to this:
    /net -hosts -nosuid,nobrowse
    +auto_master
    /home auto_home -nobrowse
    /xfn -xfn
- nobrowse is added to /net
- /net is put before +auto_master because +auto_master contains its own /net line with browsing enabled and a subsequent nobrowse apparently doesn't override it.
Re. -nobrowse: this is probably highly desirable anyway, even if this issue gets fixed. Consider opening a project on a remote disk. You will bring up a file chooser and go through /net on your way. Of course when you enter /net it will try to show subfolders, and hang for a long time. This is not NB's fault; it is either the system's fault for exposing a file abstraction it cannot implement efficiently, or perhaps the file chooser's fault for trying to show children (or showing them synchronously).
Already fixed by merging mastersfs51551 in trunk (Date: 05/01/07).