This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 25661 - There should not be two DataObject for the same java.io.File
Status: RESOLVED FIXED
Alias: None
Product: platform
Classification: Unclassified
Component: Filesystems
Version: 3.x
Hardware: PC Linux
Importance: P1 blocker
Assignee: rmatous
URL:
Keywords:
Depends on:
Blocks: 26744 26921 27693 27815 30700
Reported: 2002-07-16 09:32 UTC by Jaroslav Tulach
Modified: 2008-12-22 17:43 UTC

See Also:
Issue Type: ENHANCEMENT
Exception Reporter:


Description Jaroslav Tulach 2002-07-16 09:32:03 UTC
Based on Tor's comment after losing data when he
opened two editors each for one DataObject, both
representing the same file.

In platform build (and probably in 4.0) we
automatically mount the root and home directory
filesystems. This simplifies access to files, but
causes a problem, because one java.io.File can be
represented by two FileObjects. Each of them can
have its own DataObject and as such this breaks
the "exclusive ownership that a data object should
have over a resource".

Something needs to be done about it. For
example mount just root and prevent other
instantiations of LocalFileSystem (for the user) +
possibility to "mount" CVS filesystem into certain
places in the RootLFS...
Comment 1 rmatous 2002-07-16 12:37:19 UTC
I understand that this problem may be unpleasant. But there is
definitely no reason to mark it as a filesystems P1 defect. Some
infrastructure that could help solve this problem is missing (it might
be called e.g. RootFs). So, changed DEFECT -> RFE.
Comment 2 Torbjorn Norbye 2002-07-16 14:00:32 UTC
This is loss of user data. How is that not a P1?
The bug has been there in previous releases as well
(when mounting contained hierarchies, such as
/foo/nb_all/module and /foo/nb_all/module/src),
but in 4.0 we're mounting both / and /home/user by
default, which only makes this problem more severe.

I agree that this is a usability bug - and Sun's
bugtraq lets you distinguish between bugs, rfe's (enhancements)
and usability bugs (eous). Note however that ease of
use bugs are treated like bugs, not rfes when it comes
to metrics and release criteria.  In other words, if
there is some user interface behavior which makes it
very easy for the user to make a fatal mistake, that's
a bug, not a feature which could be enhanced. I don't
think Yarda fully explained the scenario I ran into,
so I'll describe it here:
   I opened a file by browsing to it in the explorer.
   I made a series of edits (15 minutes worth of work.)
   Then I went to a different source file, following
   call chains by doing alt-g in the editor to go to
   the definition of the method call I was looking at.
   Suddenly I found myself back in the first file I had
   been editing - but of course not in the same area
   of the file as I was working in earlier.  I made one
   more edit and hit save. What I had -not- noticed was
   that when I was editing the original file, it was a 
   "new" version of the file, found in a different
   filesystem than the one I had started from - so I had
   two tabs for the same file in the editor - and my
   changes in the first file were not present.  When I
   came back later (after exiting the IDE) I found that
   my first changes were gone (and it had been a while
   so I had forgotten exactly what I did).
I hope this helps shed some light on the problem. I don't
really have a strong opinion on the exact severity of this
bug as long as it's fixed before the next major release. I
realize it may be too late for 3.4 if you can't think of
an easy way to fix it. I have some ideas for how it may
work, but I may have overlooked some implications - and I
think Yarda has various ideas as well.
Comment 3 rmatous 2002-07-16 15:40:30 UTC
So, I don't doubt that it is a bug. My motivation: this bug was
filed against component openide-filesystems. I don't think that the
problem appeared because of a wrong impl. inside openide-filesystems,
an insufficient API, a regression .... So, from my point of view (as
maintainer) it is an RFE. But I definitely don't want to say that I'm
not interested in solving this problem or that I want to discard it in
any way (but it is definitely not for release 3.4).

But (more than issue type, priority, severity) I'm interested in
ideas for how it may work. I can imagine Jarda's RootFs idea (with
filesystems plugged or mounted into it). In principle the right solution
is that one java.io.File (or generally one resource) corresponds to
one FileObject in the Repository. So, I'm eager to hear other ideas (it
is not important if implications were overlooked).
Comment 4 Marek Grummich 2002-07-22 11:27:55 UTC
Set target milestone to TBD
Comment 5 Marek Grummich 2002-07-22 11:29:39 UTC
Set target milestone to TBD
Comment 6 Martin Entlicher 2002-07-29 15:16:20 UTC
I also like the proposed idea of a Root filesystem into which
others can "mount".
IMHO the RootFS can be a MultiFileSystem with two LFS mounted (one at
"/" and a second at ${HOME}). When I have overlapping mounts with
treefs, it works correctly. I get two "same" files opened in the
editor, but the changes are correctly propagated from one into the
other.
The CVS filesystems will then just plug into the RootFS instead of
mounting into the Repository.
Comment 7 Jesse Glick 2002-08-13 16:52:39 UTC
So more concretely I would propose this:

1. There is a special MultiFileSystem mounted in Repository.getDefault
at all times for each local drive (as reported by java.io.File: just /
on Unix, each letter on Win). I think we can drop the mount of $HOME,
it is useless IMHO, just a piece of UI which no one uses anyway.

2. Each RootFS always contains (as MultiFileSystem.delegates) a
LocalFileSystem (perhaps subclass such as ExLocalFileSystem) for the
root of the drive itself. It may also contain zero or more other
filesystems, provided these map to areas of local disk: e.g.
FileUtil.toFile(fs.getRoot()) is non-null, which excludes remote FSs,
JAR/ZIP FSs, etc. (These delegates are *not* mounted in
Repository.default.) Client code sees only FileObject's produced by
RootFS; these delegate to the most specific (lowest) physical
filesystem impl. I.e. just like Unix overlapping mounts.

3. One or the other of these:

3a. There is a second Repository instance, accessible via some other
call than getDefault(). (JNDI naming, an added API method somewhere,
...) It can only hold filesystems mapped to local disk. The RootFS
filesystem(s) automatically delegate to its contents. All modules
which might mount local filesystems (incl. VCS modules) must be
modified to mount to this repository rather than the default one.

3b. Repository.getDefault returns a Repository which transparently
takes calls to add(FileSystem) and either mounts them directly as now,
or redirects the call to someRootFS.setDelegates(...), according to
whether the root folder is detected to be a local disk folder or not.
May involve API changes, since currently add(FS) is final, and I think
this impl should live outside org.openide.filesystems to avoid clutter.

4. The UI of the Filesystems tab could either show all the physical
local disk mounts, or just show the mounted root drives and all else
would be available "in-place". Filesystems Settings needs to show the
physical local disk mounts so they can be customized, e.g. for VCS
parameters.

5. Interaction with projects & Classpath API not clear...
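
The "delegate to the most specific (lowest) physical filesystem" rule from item 2 can be sketched as a longest-prefix lookup over the mount points. This is only an illustration, not the proposed RootFS implementation: MountTable, mount and resolve are hypothetical names, it uses java.nio.file (which is newer than this discussion), and it compares paths as written without touching the disk.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical mount table: maps a path to the deepest mount point that contains it. */
public class MountTable {
    private final List<Path> mounts = new ArrayList<>();

    /** Registers a mount point; the path is normalized but not resolved against the disk. */
    public void mount(String mountPoint) {
        mounts.add(Paths.get(mountPoint).normalize());
    }

    /** Returns the most specific mount containing the path, or null if none does. */
    public Path resolve(String file) {
        Path target = Paths.get(file).normalize();
        Path best = null;
        for (Path m : mounts) {
            // Path.startsWith compares whole name elements, so
            // /home/me/sources does not match /home/me/sources2.
            if (target.startsWith(m)
                    && (best == null || m.getNameCount() > best.getNameCount())) {
                best = m;
            }
        }
        return best;
    }
}
```

With "/" and "/home/me/sources" mounted, a file below /home/me/sources would be served by the deeper mount and everything else by "/", which is the Unix-style overlapping-mount behaviour described above.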
Comment 8 Torbjorn Norbye 2002-08-13 19:02:23 UTC
I'd like to add one more "complication". To fix this bug we need to
discover if a newly mounted directory is a subdirectory of an already
mounted directory (or conversely, if it is a parent directory of an
already mounted directory).

A naive approach is just to look for directory string prefixes; e.g.
"/home/tor/projects/foo" is a subdirectory of "/home/tor" because the
second is a prefix of the first.

However, on Unix this will often fail to find real matches. Symlinks
are frequently used, so if I have 
   /home/tor/work -> /home/tor/projects
we need to know that /home/tor/work/foo is below /home/tor/projects
for example.

Thus, at a minimum you have to do two searches: first the above, then
one where you check the canonicalized paths. The cpp module
has a class, CppUtils, whose method findFileObject() performs roughly
the above algorithm (looking up a mounted filesystem and then a file
object once a proper filesystem is found).
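
A minimal sketch of the two-pass containment test described above, assuming plain java.io.File inputs. It is not the CppUtils code; MountContainment and its method names are hypothetical, and java.nio.file.Path is used only for element-wise prefix comparison. Note that a directory counts as being "under" itself here.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;

/** Hypothetical helper: is one directory located at or below another? */
public class MountContainment {

    /** First pass: compares the paths as written, so symlinked spellings are missed. */
    public static boolean isUnderLexically(File dir, File candidateParent) {
        Path child = dir.toPath().toAbsolutePath().normalize();
        Path parent = candidateParent.toPath().toAbsolutePath().normalize();
        return child.startsWith(parent);
    }

    /** Second pass: compares canonical paths, which resolves symlinks the JVM can see. */
    public static boolean isUnderCanonically(File dir, File candidateParent) {
        try {
            Path child = dir.getCanonicalFile().toPath();
            Path parent = candidateParent.getCanonicalFile().toPath();
            return child.startsWith(parent);
        } catch (IOException e) {
            // Wrapped so callers need no checked exception handling in this sketch.
            throw new UncheckedIOException(e);
        }
    }

    /** Runs the cheap lexical test first, then the canonical one. */
    public static boolean isUnder(File dir, File candidateParent) {
        return isUnderLexically(dir, candidateParent)
                || isUnderCanonically(dir, candidateParent);
    }
}
```

If /home/tor/work really were a symlink to /home/tor/projects, the lexical pass would miss that /home/tor/work/foo lies below /home/tor/projects, and only the canonical pass could report it.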

However, it gets even trickier.  It's not unusual to use
"automounters" in the Solaris world (and in the Linux world too I
think). Therefore, "/home/tor" might really resolve to
"/net/nfsserver1/homedirs/export4/home/tor" - and the File class'
getCanonicalPath will -not- be able to perform these translations. A
more common example of this is the equivalence of "/export" and
"/net/foo/export" on host foo. Discovering these equivalences is
trickier; it can't be done from Java.  The way you check for file
equality is to stat(2) the two files you want to check, and see if
their inode and device numbers are identical.  In ifdef I wrapped the
CppUtils.findFileObject call in a IfdefUtils.findFileObject call which
first did the simple check, and then did a stat() check. I'll be happy
to share the code. Unfortunately, it used JNI code which seems
unpleasant for the core.  So instead I suggest that you do something
simple. I think ls -d -i will give you the inode numbers for a given
set of files; perhaps there is something similar to get the device
numbers as well. If there is no way to get the device number, you can
check for inode equality; when you find that it's quite likely the
directories are the same and you can either use some other heuristic
(like checking if they contain the same files, have the same mtime
attribute, have the same parent, etc.) to determine equality, or
simply ask the user.
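
For readers on a current JDK: Java 7, which postdates this discussion, added java.nio.file.Files.isSameFile, a file-identity test that needs no JNI. On Unix-like systems the default provider typically decides this by comparing device and inode numbers, much like the stat(2) check above. A minimal sketch follows; SameFileCheck and sameFile are hypothetical names, and whether two automounter paths report the same device number depends on the setup and is not verified here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper around Files.isSameFile (requires Java 7 or later). */
public class SameFileCheck {

    /** True if both paths lead to the same existing file or directory. */
    public static boolean sameFile(Path a, Path b) {
        try {
            // Throws if the paths differ and one of them does not exist.
            return Files.isSameFile(a, b);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```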

Doing the getCanonicalPath will get you 95% there, but it's not
enough. I had this problem in WorkShop where users would complain that
they set breakpoints but they wouldn't show up in the editor; it was
because they had opened the editor file through one known path and the
debugger had opened it up through another; the system never realized
the files were the same. There were other similar problems, so fixing
this is important from a robustness perspective.

---

I'd also like to address the UI. Since we're no longer using the
mounted filesystems as the classpath, I think the right behavior would
be for "related" filesystems (two filesystems where one is mounted at
a root below the other filesystems' root) to be nested; e.g. if I have
mounted / and /home/tor (which is what 4.0 does for me), /home/tor
would be listed below /, not separately. (In fact, it might not even
be called out separately but just have a filesystems node instead of a
folder node when I expand the contents of /home/.)
Comment 9 Jesse Glick 2002-08-13 19:58:37 UTC
Agreed re. the UI, that sounds sensible.

Re. symlinks: OK, checking for matches in canonical path makes sense.
Generally the RootFS before serving any FileObject (or only folders??)
could check its canonicalPath, and look for matches against the
(cached) canonicalPath's of the mount points. One problem is that
File.getCanonicalPath might not be as efficient as we would like -
need to check.

Re. automounters: ixnay. If you set up an automounter this way, and
access files via more than one path on it, IMHO you deserve whatever
trouble you get - I would not expect a tool to save me from this.
Perhaps some future JDK impl of java.io.File will grok automounters
and take them into account in getCanonicalPath(), but I don't think we
should go there.
Comment 10 Torbjorn Norbye 2002-08-13 21:25:06 UTC
Why do you want to look up canonical paths whenever FileObjects are
served?  I was thinking you only have to do this containment test
once: when a filesystem is mounted (or when the root directory of a
filesystem is changed). Thus, it's probably okay for it to be an
"expensive" operation; mounting is an infrequent operation.

Regarding automounting (I'm not sure what "ixnay" stands for.) "If you
set up an automounter this way" - the /net mount is automatically
configured that way on default Solaris systems. Accessing files by both
paths also happens frequently because you often work in /export to
have everything locally, but occasionally you need to access the files
from /net/host/export because you're on a different system, so your
"PATH" (e.g. mounted filesystems) involves both.  

Having said that, I think it would be fine to only do the canonical
path check in 4.0, and leave more accurate directory equivalence tests
for community contribution on a per platform basis. (The next time the
duplicate data object bug bites me because of this I might go and
write it:)
Comment 11 Jesse Glick 2002-08-14 00:24:59 UTC
Re. looking up canon paths for every file: if
/home/jglick/buildall.xml is a symlink to
/space/src/nb_all/nbbuild/build.xml, even if just / is mounted, I
would not want to make edits from one path and have them clobber edits
from another. Not sure if that can be solved using RootFS, though.
Probably can in the case of directory symlinks.

Re. automounter: if Solaris is configured this way by default, that
would be a stronger argument for supporting it. Still seems like a
potentially messy thing to try to solve though.
Comment 12 rmatous 2002-09-11 10:21:31 UTC
I doubt that MultiFileSystem is suitable:
1/ MultiFileObject uses delegation, which is suitable for our RootFS.
But it delegates to another FileObject, which means that it duplicates
all FileObjects that exist. If the user looks for a folder or file
(e.g. deep in the explorer) in the hierarchy, then a huge number of
FileObjects will be created and kept in memory.

2/ MultiFileObject also merges the children from individual
filesystems. Here I think merging is not appropriate behaviour,
because e.g. LocalFileSystem keeps the CVS folder visible and the CVS
filesystem hides it. Merging means that the CVS folder is visible. So,
in this case after the CVS FS is plugged in, the CVS folder must
disappear and fileDeleted must be fired.

The problem in 1/ can't be easily solved. Jarda once suggested a
principle similar to AbstractFileObject, which seems to be much more
suitable, though. All AbstractFileObjects delegate to one instance of
AbstractFileSystem (or better, to the interfaces implemented by
AbstractFS) instead of a number of FileObject delegates. The
disadvantage is obvious: all filesystems that are pluggable must
implement the same special interface.


Comment 13 Jaroslav Tulach 2002-09-11 14:50:53 UTC
It seems to me that the MultiFileSystem is too complex for this task.
Better to write a new FS from scratch. Just create
DelegatingFileObject that contains a reference to one FileObject and
nothing else (no reference to parent for example). All such
information can be derived from the file object one delegates to.
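
A rough sketch of that idea, under loud assumptions: Resource is a hypothetical, heavily reduced stand-in for org.openide.filesystems.FileObject, DelegatingResource plays the role of the proposed DelegatingFileObject, and MemResource exists only to exercise the wrapper. The wrapper holds its delegate and nothing else, so parent and children are derived on demand and equality has to come from the delegate.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, heavily reduced stand-in for a file object. */
interface Resource {
    String getName();
    Resource getParent();
    List<Resource> getChildren();
}

/** Wrapper that stores only its delegate; everything else is derived on demand. */
final class DelegatingResource implements Resource {
    private final Resource delegate;

    DelegatingResource(Resource delegate) { this.delegate = delegate; }

    public String getName() { return delegate.getName(); }

    public Resource getParent() {
        Resource p = delegate.getParent();
        return p == null ? null : new DelegatingResource(p);
    }

    public List<Resource> getChildren() {
        List<Resource> out = new ArrayList<>();
        for (Resource c : delegate.getChildren()) {
            out.add(new DelegatingResource(c));
        }
        return out;
    }

    // Wrappers are created on demand, so identity must come from the delegate.
    @Override public boolean equals(Object o) {
        return o instanceof DelegatingResource
                && ((DelegatingResource) o).delegate.equals(delegate);
    }

    @Override public int hashCode() { return delegate.hashCode(); }
}

/** Simple in-memory tree used only to exercise the wrapper. */
final class MemResource implements Resource {
    private final String name;
    private final MemResource parent;
    private final List<Resource> children = new ArrayList<>();

    MemResource(String name, MemResource parent) {
        this.name = name;
        this.parent = parent;
        if (parent != null) parent.children.add(this);
    }

    public String getName() { return name; }
    public Resource getParent() { return parent; }
    public List<Resource> getChildren() { return children; }
}
```

The trade-off in this sketch is that every call allocates fresh wrappers instead of caching them, which keeps the per-object state minimal at the cost of extra short-lived objects.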
Comment 14 Jesse Glick 2002-09-11 16:29:46 UTC
Re. #1 - I would guess that creating two FileObject's is not
*inherently* expensive, it mainly depends on how nicely designed your
implementation of FileObject is - especially the cache mechanism and
how it works with the garbage collector. For example, we already
create a nodeDelegate for every DataObject you explore, but never
display it - we make a FilterNode to wrap it! This is not necessarily
a problem - depends on the memory footprint of FilterNode and what
kinds of expensive resources it uses (e.g. weak listeners and other
things that require reference registration that probably do not work
well with the GC nursery). Also you can optimize for the common case:
if the pluggable FS is an AbstractFileSystem, never create the
delegated AFO, just store its path and call
AFS.{Attr,List,Info,Change} methods directly.

Re. #2 - ??? If you have a LFS "/" and a CVSFS "/home/me/sources/",
then of course the RootFS would delegate to the LFS for all files
under "/" except "/home/me/sources/". It would *not* merge; it would
never ask the LFS for any FileObject's under home/me/sources (unless
the CVSFS was only mounted later). You just need to override
findResourceOn(LFS,"home/me/sources/**") to return null - right? (If
the CVSFS is dynamically mounted, I guess this would have the effect
of firing fileDeleted for CVS/ folders. If the mount is already
present at startup, there should be no such overhead.)
Comment 15 Torbjorn Norbye 2002-09-30 22:46:53 UTC
I've run into another problem related to this. See #27693 for the gory
details, but essentially, the projects module is trying to resolve
URLs such as nbprjres://ProjectFileLocation/../../../../Foo.txt
In doing so, it starts with the FileObject for the directory
where the Project File is located, and then for each "../"
it does a FileObject.getParent().

The problem is that if the Project File is located in let's
say /home/tor/foo, after doing a single "../" I may be at the
root of the "/home/tor" filesystem. 

So, I think it would be cool if this "composite filesystem"
that this issue is dealing with would handle this transparently;
if we "mount" /home/tor below the "/home" node on the / filesystem,
then doing a getParent() on /home/tor should return the fileobject
for /home, not null.

Does that make sense, or is there some reason why it's useful
to have getParent() return null when you're at the top of a
filesystem?
Comment 16 Jaroslav Tulach 2002-10-02 09:17:44 UTC
Definitely. For each fileobject in the filesystem the following should
be true:

fileobject.equals (fileobject.getChildren ()[index].getParent ())

for each reasonable index.
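
A small, generic sketch of how this invariant could be checked over a whole tree. It does not use the FileObject API: ParentInvariant and holds are hypothetical names, and the caller supplies the "children" and "parent" lookups as functions, so the same check could be pointed at any tree-like structure.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical checker for: node.equals(child.getParent()) for every child. */
public class ParentInvariant {

    /** Walks the tree below node and returns false at the first violation. */
    public static <T> boolean holds(T node,
                                    Function<T, List<T>> children,
                                    Function<T, T> parent) {
        for (T child : children.apply(node)) {
            // A child whose parent is null or a different node breaks the invariant.
            if (!node.equals(parent.apply(child))) {
                return false;
            }
            if (!holds(child, children, parent)) {
                return false;
            }
        }
        return true;
    }
}
```

In the split-mount situation from the previous comment, where the root of the /home/tor filesystem is listed under /home but returns null from getParent(), such a check would report a violation.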
Comment 17 rmatous 2002-10-23 10:01:04 UTC
This issue was already started on localfsbranch, so status changed to
STARTED. Sources can be found in ../core/localfs/src in package
org.netbeans.modules.localfs. 
Comment 18 Martin Entlicher 2002-11-08 15:19:58 UTC
Does it make sense to add this module (localfs) to the projects build
after it's stabilized? IMHO this should be tested in projects builds
rather than current dev builds.
Comment 19 Vitezslav Stejskal 2003-02-06 13:54:50 UTC
The projects build has used the localfs module for a couple of months
and it seems to work pretty well. What's the status of development on
this module? Work in progress? Finished?
Comment 20 rmatous 2003-02-06 14:21:22 UTC
You are right, I think this issue can be closed. If problems appear,
RFEs or defects can be filed.