This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 157362 - Simplify reading of FileObject's content
Summary: Simplify reading of FileObject's content
Status: RESOLVED FIXED
Alias: None
Product: platform
Classification: Unclassified
Component: Filesystems
Version: 6.x
Hardware: All All
Importance: P3 blocker
Assignee: Jaroslav Tulach
URL:
Keywords: API_REVIEW_FAST
Depends on:
Blocks:
 
Reported: 2009-01-23 14:07 UTC by Jaroslav Tulach
Modified: 2009-02-19 22:53 UTC

See Also:
Issue Type: ENHANCEMENT
Exception Reporter:


Attachments
Three new methods in FileObject (9.02 KB, patch)
2009-01-23 14:12 UTC, Jaroslav Tulach
Improved patch (no arg methods, buffering of small files, using encoding) (15.68 KB, patch)
2009-02-02 12:03 UTC, Jaroslav Tulach
List<String> asLines() as Jesse requested (21.00 KB, patch)
2009-02-04 11:51 UTC, Jaroslav Tulach

Description Jaroslav Tulach 2009-01-23 14:07:36 UTC
Many people complain about the complexity of reading the content of a file in Java. I've even dedicated one section of
the Practical API Design book to this problem:
http://wiki.apidesign.org/wiki/Extreme_Advice_Considered_Harmful
Everyone is waiting for the JDK guys to finally realize the problem and fix it. However, NetBeans does not need to wait!
We have our FileObjects, and we can and shall fix the problem ourselves.
Comment 1 Jaroslav Tulach 2009-01-23 14:12:15 UTC
Created attachment 76183
Three new methods in FileObject
Comment 2 Vince Kraemer 2009-01-23 14:41:54 UTC
A couple of nits...

vbk01: The link in the apichange doc for asLines looks like it points to asText.

vbk02: Having versions of these methods that use the default encoding might be nice... but that is because I am lazy.
Comment 3 Jiri Skrivanek 2009-01-23 20:20:02 UTC
JS01: The asLines() method is questionable: the InputStream may never be closed. You could use the Scanner class
instead:
        Scanner scanner = new Scanner(fo.getInputStream());
        // or Scanner scanner = new Scanner(fo.asText("UTF-8"));
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine();
            ...
        }
        scanner.close();
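
The concern in JS01 about the stream being left open can be illustrated with a minimal sketch. It assumes Java 7 or later (try-with-resources postdates this discussion) and uses a plain InputStream as a stand-in for fo.getInputStream(). The ScannerLines class and its readLines method are hypothetical names, not part of the attached patch.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

/** Hypothetical helper for illustration; not the code from the patch. */
class ScannerLines {
    /** Reads all lines from the stream and closes it, even if reading fails. */
    static List<String> readLines(InputStream in, String charsetName) {
        List<String> lines = new ArrayList<>();
        // Closing the Scanner also closes the underlying InputStream.
        try (Scanner scanner = new Scanner(in, charsetName)) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a file's input stream.
        InputStream in = new ByteArrayInputStream(
                "first\nsecond\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readLines(in, "UTF-8"));
    }
}
```

The trade-off is that the whole file is collected into memory before the caller sees any line, which is what makes it safe to close the stream inside the method.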
Comment 4 Jesse Glick 2009-02-01 17:55:28 UTC
[JG01] Consider removing TestFileUtils.read and replacing existing usages with the new method. At the least, deprecate
the old method.


[JG02] Consider looking for code which could use the new methods and updating it to do so. I'm sure there is plenty.


[JG03] asLines seems to ignore the encoding.
Comment 5 Jaroslav Tulach 2009-02-02 12:03:55 UTC
Created attachment 76455
Improved patch (no arg methods, buffering of small files, using encoding)
Comment 6 Jaroslav Tulach 2009-02-02 12:05:17 UTC
Unless I hear objections, I'd like to integrate the new patch tomorrow.
Comment 7 Petr Hejl 2009-02-02 12:32:34 UTC
PH01: I would rather see Charset than String (as a parameter of asText and asLines).
Comment 8 Jesse Glick 2009-02-02 14:40:22 UTC
[JG04] Consider making asLines return List<String>, which would be more convenient in certain situations. Otherwise you
need to write

List<String> lines = new ArrayList<String>();
for (String line : file.asLines("UTF-8")) {
  lines.add(line);
}

which is a bit annoying. The iterator could still be lazy, of course, but any other List methods would force the whole
file to be read if it was not already.
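
The behaviour described in JG04 (a lazy iterator, with the other List methods forcing the rest of the file to be read) could look roughly like the sketch below. This is an assumption-laden illustration, not the implementation from the later patch: LazyLines and loadedCount are hypothetical names, a BufferedReader stands in for the file's content, and the code assumes Java 8 for UncheckedIOException.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical lazy list of lines; not the class from the attached patch. */
class LazyLines extends AbstractList<String> {
    private final BufferedReader reader;
    private final List<String> loaded = new ArrayList<>();
    private boolean exhausted;

    LazyLines(BufferedReader reader) {
        this.reader = reader;
    }

    /** Reads one more line into the cache; returns false at end of input. */
    private boolean loadNext() {
        if (exhausted) {
            return false;
        }
        try {
            String line = reader.readLine();
            if (line == null) {
                exhausted = true;
                reader.close(); // nothing left to read, release the source
                return false;
            }
            loaded.add(line);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Number of lines read so far; exposed only to demonstrate laziness. */
    int loadedCount() {
        return loaded.size();
    }

    @Override public String get(int index) {
        // Read only as far as the requested index.
        while (loaded.size() <= index && loadNext()) { }
        return loaded.get(index); // IndexOutOfBoundsException if out of range
    }

    @Override public int size() {
        // The size is unknown until the whole input has been read.
        while (loadNext()) { }
        return loaded.size();
    }

    @Override public Iterator<String> iterator() {
        return new Iterator<String>() {
            private int next;

            @Override public boolean hasNext() {
                return next < loaded.size() || loadNext();
            }

            @Override public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return loaded.get(next++);
            }
        };
    }
}
```

With this shape, a for-each loop reads one line at a time, while a call such as size() or new ArrayList<String>(lines) reads the remainder of the input. One caveat of the sketch: the reader is only closed once the end of input is reached, so a caller that abandons iteration early leaves it open.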
Comment 9 Jesse Glick 2009-02-02 14:46:39 UTC
PH01 seems unnecessary to me:

- the cases where you would need a live Charset (e.g. loading an XML file of unknown encoding, but not using an XML
parser) seem rare, and in such a case you could easily write the file loading code the old way: new String(f.asBytes(),
...) or create a BufferedReader and loop

- java.io.* APIs dealing with character sets (e.g. new InputStreamReader(...)) always took a String, only adding Charset
as an option in 1.4

- for the common case that you want to load in UTF-8 or ISO-8859-1, you would need an extra call to Charset.forName,
making the convenience API less convenient
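
To make the trade-off between the two parameter types concrete, here is a small sketch using only the JDK's String constructors. EncodingArgs and its two methods are hypothetical names chosen for illustration; they are not the FileObject methods under review.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

/** Hypothetical helper contrasting the two parameter styles. */
class EncodingArgs {
    /** Decodes bytes given an encoding name, as a String-based API would. */
    static String decodeByName(byte[] bytes, String encoding)
            throws UnsupportedEncodingException {
        return new String(bytes, encoding);
    }

    /** Decodes bytes given a Charset object, as PH01 proposes. */
    static String decodeByCharset(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        byte[] bytes = "caf\u00e9".getBytes("UTF-8");
        // String variant: one call, but a checked exception for unknown names.
        String byName = decodeByName(bytes, "UTF-8");
        // Charset variant: needs the extra Charset.forName lookup.
        String byCharset = decodeByCharset(bytes, Charset.forName("UTF-8"));
        System.out.println(byName.equals(byCharset));
    }
}
```

The String variant saves the Charset.forName call at the price of a checked UnsupportedEncodingException. On Java 7 and later, the constants in java.nio.charset.StandardCharsets remove the need for the lookup, but they did not exist when this issue was discussed.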
Comment 10 Jaroslav Tulach 2009-02-04 11:51:48 UTC
Created attachment 76542
List<String> asLines() as Jesse requested
Comment 11 Jaroslav Tulach 2009-02-04 11:57:42 UTC
New patch addresses almost all comments:
JS01: For small files the stream is always read at once and closed. Content is cached.
JG01&02: I'll fix something at the time of integration.
JG03: Fixed.
PH01: I guess that most users just want to read in the "UTF-8" encoding. For them the current state is simpler. If we
need a Charset argument, we can add it later.
JG04: Changed to return List<String>. The only fast operation is to get an iterator and call next on it, though. Tests
ensure the behaviour is at least correct, so there is room for making things faster in the future, if necessary.
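
The "read at once and closed" handling of small files mentioned under JS01 might be sketched as follows. This is only an illustration under stated assumptions: EagerLines and readAndClose are hypothetical names, a plain InputStream stands in for the file, and the patch's actual size threshold and caching are not reproduced here.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper; not the code from the attached patch. */
class EagerLines {
    /** Reads the whole stream, closes it, then splits the text into lines. */
    static List<String> readAndClose(InputStream in, String encoding)
            throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        } finally {
            in.close(); // closed whether or not reading succeeded
        }
        // Decoding happens after the stream is closed; an unknown encoding
        // name results in an UnsupportedEncodingException.
        String text = buffer.toString(encoding);
        List<String> lines = new ArrayList<String>();
        BufferedReader reader = new BufferedReader(new StringReader(text));
        String line;
        while ((line = reader.readLine()) != null) {
            lines.add(line);
        }
        return Collections.unmodifiableList(lines);
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("one\ntwo\n".getBytes("UTF-8"));
        System.out.println(readAndClose(in, "UTF-8"));
    }
}
```

Because the returned list is fully materialized, every List operation is cheap and no stream stays open, at the cost of holding the whole content in memory.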

Let's start the next 24h objection period now.
Comment 12 Jesse Glick 2009-02-04 15:23:03 UTC
Looks good to me.
Comment 13 Jaroslav Tulach 2009-02-05 13:01:35 UTC
core-main#af5ee74674fa
Comment 14 Quality Engineering 2009-02-06 07:49:34 UTC
Integrated into 'main-golden', will be available in build *200902060201* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)
Changeset: http://hg.netbeans.org/main/rev/af5ee74674fa
User: Jaroslav Tulach <jtulach@netbeans.org>
Log: #157362: Simplify reading of FileObject's content