This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 42928 - [2004-05-17] Automatically detect text files
Summary: [2004-05-17] Automatically detect text files
Status: CLOSED FIXED
Alias: None
Product: platform
Classification: Unclassified
Component: Filesystems (show other bugs)
Version: 4.x
Hardware: All All
: P3 blocker (vote)
Assignee: Jaroslav Tulach
URL:
Keywords: UI
Depends on:
Blocks:
 
Reported: 2004-05-07 08:57 UTC by _ ttran
Modified: 2008-12-22 17:46 UTC (History)
3 users (show)

See Also:
Issue Type: ENHANCEMENT
Exception Reporter:


Attachments
Files without extension are content/unknown without a check (598 bytes, patch)
2004-05-07 09:43 UTC, Jaroslav Tulach
Details | Diff
DefaultDataObject (23.24 KB, patch)
2004-05-10 07:25 UTC, Jaroslav Tulach
Details | Diff
Boring, abitionless proposal. (26.64 KB, patch)
2004-05-14 14:56 UTC, Jaroslav Tulach
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description _ ttran 2004-05-07 08:57:56 UTC
See also the discussion in issue 42259.

We should just provide an editor cookie for
DefaultDataObject and kill the text module, or
something.  Treat all unknown files as text.  If
we want to do better  we can detect the binary
files (any '\0' in the first 2KB) and refuse to 
open them.  Those can be edited/viewed in a hex
editor in a distant future.

Besides improving the UI a bit, this also kills
one use case for .nbattrs
Comment 1 _ ttran 2004-05-07 09:08:30 UTC
Yarda just checked in the code in to CVS. His patch does something
different than what I proposed.  Instead of treating all unknown files
as text, he open those files and, read the first block, check for less
than \x09 chars.  All this happens during MIME recognizing process
which I fear may have impact on expanding folders with a lot of such files
Comment 2 _ ttran 2004-05-07 09:17:05 UTC
this is indeed very bad.  Yarda, go to All files in trunk build, and
expand /dev to see for yourself.  Compare it with the equivalent in 3.6.

Please revert your patch and solve it differently.  We have to
postpone reading from a file to the moment when the user tries to open
it, not during the time of file type recognition
Comment 3 Jaroslav Tulach 2004-05-07 09:27:18 UTC
This is right. The read of initial block is done in
CachedFileObject.getMimeType(...) which is optimized for usage in
MimeResolvers - e.g. it reads the content of the file once regardless
of how many resolvers are there. So in case that there is a
mimeresolver that is checking the content of the file (and there is
one MimeResolver registered in core) we should not see any slowdown.
Also if there is a MimeResolver that recognizes the mime type or if an
extension is registered as "FileUtil.setMimeType(ext,mimetype)" no
slowdown shall occur as my behaviour is added only as a fallback. 

I've added tests to cover the requested behaviour on known extension,
unknown extension, no extension. We can extend them if we find problem
with some particular form of file. In case we find that some perf
problems are there and are unfixable, here is the log for simpler revert:

Date: 2004/05/07 07:00:38
Author: jtulach
Log:
Trying to differenciate between binary and text files as part of the
mime type recognition.

Members:
        src/org/openide/filesystems/AbstractFileObject.java:1.90->1.91
        src/org/openide/filesystems/FileObject.java:1.84->1.85
        src/org/openide/filesystems/FileUtil.java:1.90->1.91
        src/org/openide/filesystems/MIMESupport.java:1.28->1.29
       
test/unit/src/org/openide/filesystems/FileObjectHosts.txt:INITIAL->1.1
       
test/unit/src/org/openide/filesystems/FileObjectTestHid.java:1.30->1.31
       
test/unit/src/org/openide/filesystems/FileObjecttextplain.txt:INITIAL->1.1
Comment 4 _ ttran 2004-05-07 09:33:15 UTC
> In case we find that some perf problems are there and are unfixable, 
> here is the log for simpler revert:

good. please go revert it.  My IDE is busy trying to read files in
/dev already for 20 minutes.  The whole window is gray because the
Folder recognizer thead is blocked and the IDE is not able to read the
.setting file for editor toolbar

Comment 5 Jaroslav Tulach 2004-05-07 09:38:55 UTC
/dev case exhibits the worst combination: a lot of files without
extension and without any other other resolver that would read them.
If this was a special case (something like /net for search for
nbproject), I could hardcode it to contain "content/unknown". Another
simple fix would be to scan only files with non-empty extension.

If none of the above workarounds work I have to revert. I'll attach
simple patch to evaluate the "non-empty extension" before doing that.
Comment 6 Jaroslav Tulach 2004-05-07 09:43:04 UTC
Created attachment 14756 [details]
Files without extension are content/unknown without a check
Comment 7 Jaroslav Tulach 2004-05-07 14:45:22 UTC
Commented out.

Checking in src/org/openide/filesystems/MIMESupport.java;
/cvs/openide/src/org/openide/filesystems/MIMESupport.java,v  <-- 
MIMESupport.java
new revision: 1.30; previous revision: 1.29
done
Processing log script arguments...
More commits to come...
Checking in test/unit/src/org/openide/filesystems/FileObjectTestHid.java;
/cvs/openide/test/unit/src/org/openide/filesystems/FileObjectTestHid.java,v
 <--  FileObjectTestHid.java
new revision: 1.32; previous revision: 1.31
Comment 8 Jesse Glick 2004-05-07 15:09:07 UTC
"So in case that there is a mimeresolver that is checking the content
of the file (and there is one MimeResolver registered in core) we
should not see any slowdown." - fine for *.xml and a couple of other
filename patterns that were already doing content-based checks. But
until now no one was doing content-based checks for random files, so
your patch was probably adding this overhead for the first time.

Also most files have no particular MIME type normally; they are
recognized by UniFileLoader. Someday this will change, but not now.
Comment 9 Jaroslav Tulach 2004-05-10 07:24:11 UTC
"But
until now no one was doing content-based checks for random files, so
your patch was probably adding this overhead for the first time." 

There is a list of predefined extension to mimetype mappings in
FileUtil (including java and class). For them the read would also be
skipped. So it is unlikely there would be any slowdown on folders we
usually display in IDE.

Anyway here is another patch which changes only DefaultDataObject. It
adds Open to all default data objects and either opens CES directly or
asks question about what to do. The content of the dialog is
pluggable, based on DataLoader.getActions(). That way it immediatelly
without any changes in other modules offers to open the file as text,
which delegates to text module. It could be a reasonable entrypoint
for Tim's contrib/hexedit to display the file in binary editor.
Comment 10 Jaroslav Tulach 2004-05-10 07:25:11 UTC
Created attachment 14776 [details]
DefaultDataObject
Comment 11 _ tboudreau 2004-05-10 08:03:29 UTC
But Jardo, wouldn't you like to use my hex editor in contrib/hexedit for non-text files? :-)  
Comment 12 _ tboudreau 2004-05-10 08:06:43 UTC
Also, more practical comment for promo D:  Why not...

don't do the read check when recognizing the file, do the read check when *opening* the 
file, and if it's binary, instead of opening it, pop up a "this is a binary file" dialog?  This is 
more or less what things like Notepad & other text editors do, and wouldn't surprise 
anybody.
Comment 13 _ ttran 2004-05-10 08:37:51 UTC
>don't do the read check when recognizing the file, do the read check
when *opening* the
> file, and if it's binary, instead of opening it, pop up a "this is a
binary file" dialog?  This is
> more or less what things like Notepad & other text editors do, and
wouldn't surprise
> anybody.

this is the orginal request, but Yarda chose to implement it differently 
Comment 14 Jaroslav Tulach 2004-05-10 08:41:02 UTC
Tim you got it wrong. Your module can add Open In HexEdit action to
DefaultDataObject as text module does and everything starts to work.
The openide/loaders is not going to know and directly refer to text
module, hexedit module or any other module.
Comment 15 Jaroslav Tulach 2004-05-14 14:56:23 UTC
Created attachment 14869 [details]
Boring, abitionless proposal.
Comment 16 Jesse Glick 2004-05-14 20:53:54 UTC
Removing performance keyword; should be no performance impact with new
patch, which looks much more reasonable.


Not sure why you are moving around createNodeDelegate... should be
excluded from this patch.


Not clear on what this does:

HashSet old = new HashSet ();
MultiFileLoader l = DataLoaderPool.getDefaultFileLoader ();
old.addAll (java.util.Arrays.asList (l.defaultActions()));


English edits:

MSG_BinaryFileQuestion=\
    This file appears to contain binary data. \
    Are you sure you want to open it in the text editor?
MSG_BinaryFileWarning=Binary File Detected


Maybe we could supply the text icon for DefaultDataObject? Then we
could kill the text data loader completely, leaving just the stupid
new file wizard item (which is also planned to be killed soon in favor
of a different UI, at which point we can finally kill the whole module).


testHasEditorCookieForResonableContentOfFiles - typo.
Comment 17 Jesse Glick 2004-05-17 05:01:45 UTC
Note that the current behavior could plausibly be considered a bug.
E.g. I cannot open files of unknown extension inside a JAR file using
a current dev build because TXTDataObject cannot mark it as its own
(JFS is r/o).
Comment 18 Jaroslav Tulach 2004-05-17 14:57:12 UTC
Addressed: english, remove of dead code, and commited.

/cvs/openide/loaders/src/org/openide/loaders/Bundle.properties,v  <--
 Bundle.properties
new revision: 1.8; previous revision: 1.7
done
Checking in openide/loaders/src/org/openide/loaders/DataLoaderPool.java;
/cvs/openide/loaders/src/org/openide/loaders/DataLoaderPool.java,v 
<--  DataLoaderPool.java
new revision: 1.10; previous revision: 1.9
done
Checking in
openide/loaders/src/org/openide/loaders/DefaultDataObject.java;
/cvs/openide/loaders/src/org/openide/loaders/DefaultDataObject.java,v
 <--  DefaultDataObject.java
new revision: 1.3; previous revision: 1.2
done
RCS file: /cvs/openide/loaders/src/org/openide/loaders/DefaultES.java,v
done
Checking in openide/loaders/src/org/openide/loaders/DefaultES.java;
/cvs/openide/loaders/src/org/openide/loaders/DefaultES.java,v  <-- 
DefaultES.java
initial revision: 1.1
done
Processing log script arguments...
More commits to come...
RCS file:
/cvs/openide/test/unit/src/org/openide/loaders/DefautDataObjectHasOpenTest.java,v
done
Checking in
openide/test/unit/src/org/openide/loaders/DefautDataObjectHasOpenTest.java;
/cvs/openide/test/unit/src/org/openide/loaders/DefautDataObjectHasOpenTest.java,v
 <--  DefautDataObjectHasOpenTest.java
initial revision: 1.1
done
Processing log script arguments...
More commits to come...
Checking in text/manifest.mf;
/cvs/text/manifest.mf,v  <--  manifest.mf
new revision: 1.44; previous revision: 1.43
done
Processing log script arguments...
Mailing the commit message to cvs@openide.netbeans.org,
cvs@text.netbeans.org (from jtulach@netbeans.org)
Checking in
text/src/org/netbeans/modules/text/ConvertToDefaultAction.java;
/cvs/text/src/org/netbeans/modules/text/ConvertToDefaultAction.java,v
 <--  ConvertToDefaultAction.java
new revision: 1.6; previous revision: 1.5
done
Checking in text/src/org/netbeans/modules/text/ConvertToTextAction.java;
/cvs/text/src/org/netbeans/modules/text/ConvertToTextAction.java,v 
<--  ConvertToTextAction.java
new revision: 1.13; previous revision: 1.12
done
Removing text/src/org/netbeans/modules/text/TXTModule.java;
/cvs/text/src/org/netbeans/modules/text/TXTModule.java,v  <-- 
TXTModule.java
new revision: delete; previous revision: 1.15
Comment 19 Jesse Glick 2004-05-17 15:21:55 UTC
OK, is there is any issue filed for complete removal of the text
module? We shouldn't stop halfway IMHO.
Comment 20 Miloslav Metelka 2004-05-20 10:24:32 UTC
Jesse, will it still be possible to use text/editor directory after
that? I would like to use the text/editor directory for the plain text
editor part after the split of the editor module (I need to extract
java, html and plain text support from the editor's core).
Comment 21 Jesse Glick 2004-05-20 14:54:16 UTC
Mila - sure, I think that's unrelated. I was talking about the NB
module; don't care about whether the CVS space is used. Though it
seems to me that keeping plain text support in a subdir of
editor.netbeans.org would be natural too.
Comment 22 Marian Mirilovic 2005-12-20 15:51:30 UTC
This issue was solved long time ago. Because nobody has reopened it neither
added comments, we are verifying/closing it now. 
If you are still able to reproduce the problem, please reopen. Thanks in advance.