This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 134371 - I18N: search is not working for .properties files containing Unicode sequences (\uxxxx)
Summary: I18N: search is not working for .properties files containing Unicode sequence...
Status: VERIFIED FIXED
Alias: None
Product: utilities
Classification: Unclassified
Component: Search (show other bugs)
Version: 6.x
Hardware: All All
: P3 blocker (vote)
Assignee: Marian Petras
URL:
Keywords: I18N
Depends on:
Blocks:
 
Reported: 2008-05-02 06:55 UTC by Masaki Katakai
Modified: 2008-10-06 17:53 UTC (History)
3 users (show)

See Also:
Issue Type: DEFECT
Exception Reporter:


Attachments
screenshot (84.45 KB, image/png)
2008-05-02 06:57 UTC, Masaki Katakai
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Masaki Katakai 2008-05-02 06:55:29 UTC
This bug from bug 123631. Search is not working properly for localized
.properties files because it's not implemented yet.
Comment 1 Masaki Katakai 2008-05-02 06:57:29 UTC
Created attachment 60952 [details]
screenshot
Comment 2 Marian Petras 2008-05-02 08:51:01 UTC
Fix of this bug requires one of the following:

a) make the JDK developers fix their bug #4744247 (i.e. do not fix this, just wait until the bug is fixed)
b) make an ugly hack in the search routine such that decoding of .properties files is done in a special way
Comment 3 Marian Petras 2008-07-22 13:06:51 UTC
Fixed.

I made a bridge module between "Resource Bundles" (properties) and "User Utilities" (utilities), named "Searching in
Properties Files" (properties.search). This modules defines a special routine for obtaining CharsetDecoder for a given
file. The routine is special in that it calls a two-argument constructor of PropertiesEncoding.PropCharsetDecoder. The
second argument passes information about length of the file that is about to be decoded such that the decoder knows when
to flush the buffer, allowing it to workaround the JDK bug.

Changeset Id:
3b6c198e12df
(http://hg.netbeans.org/main/rev/3b6c198e12df)
Comment 4 Jesse Glick 2008-07-22 15:38:49 UTC
Wouldn't it have sufficed for PropertiesEncoding.getEncoding(FileObject) to return a (Prop)Charset which retained the
file length as an instance field, passing it in turn to (Prop)CharsetDecoder?
Comment 5 Marian Petras 2008-07-24 15:09:37 UTC
It would be a good solution if there was a guarantee that the encoding (Charset) provided by method
PropertiesEncoding.getEncoding(FileObject) is always used immediately after it is obtained. To be accurate, the
condition is that the FileObject's length must not change since the Charset is obtained until it is used. I can only
guarantee this for the Utilities module, not for other modules.
Comment 6 Jesse Glick 2008-07-24 15:37:23 UTC
Perhaps the PropCharset could retain the FileObject itself as an instance field, only checking the file's length when
actual decoding begins.
Comment 7 Marian Petras 2008-07-24 16:18:15 UTC
Yes, it should work. And the reference to the FileObject could be weak.
Comment 8 Jesse Glick 2008-07-24 16:21:58 UTC
Yes you could use a weak reference, though I cannot think of a situation where you would hold on to the Charset of a
file without wanting to holding on to the file itself.
Comment 9 Marian Petras 2008-07-24 16:40:42 UTC
"... I cannot think of a situation where you would hold on to the Charset of a file without wanting to holding on to the
file itself."

That is also the reason why it is safe to use weak references. If such situation happens (by accident), method
newDecoder() will throw an IllegalStateException("trying to use a decoder for a non-existing FileObject"). But it may
happen that someone keeps a reference to the PropCharset even though it is not necessary any more. In this case, the
weak reference will not prevent the FileObject from being garbage-collected.

If I understand correctly, you would like me to remove the bridge module and use the described solution instead, right?
Comment 10 Jesse Glick 2008-07-24 16:49:33 UTC
True, and you could even keep both

private long size;
private /*Weak*/Reference<FileObject> file;

to cover all possibilities.

If the implementation I suggested really works, then of course it would be nicer - just much simpler code, one less
module, etc.
Comment 11 Marian Petras 2008-07-24 16:56:05 UTC
Why keep the "size" field? When and how could I use it?
Comment 12 Jesse Glick 2008-07-24 17:21:10 UTC
In case the FileObject was collected, then you could use the stored size rather than throwing ISE. (The file may well
still exist on disk - the FO can be collected and then recreated.)
Comment 13 Marian Petras 2008-07-24 17:41:09 UTC
I would prefer not using the cached size. If the FileObject was garbage-collected during the period since initialization
of the Charset until the decoding begins and the corresponding file gets smaller during that period, the decoder might
drop some bytes from input. I believe this is a real danger.

Even if the ISE is called, the same situation might happen during the period since initialization of the decoder, but I
consider this much less probable.
Comment 14 Jesse Glick 2008-07-24 17:45:41 UTC
You could keep a URL for the FileObject. This poses no memory leak threat.
Comment 15 Marian Petras 2008-07-24 18:45:36 UTC
That is a good idea.

So I will use a combination of URL and a weak reference to a FileObject. The weak reference will be used in the first
place. If the FileObject cannot be obtained from the weak reference, the URLMapper.findFileObject(URL) will be used. If
it fails, too, then an ISE will be thrown. All of this will happen in method PropCharset.newDecoder().

I realize that to make it correct to the maximum extent, I could attach a FileChangeListener and update the URL each
time the FileObject is renamed. But I am not sure it is worth doing it.
Comment 16 Marian Petras 2008-07-28 17:29:59 UTC
I have reworked the fix as explained in the above discussion. The bridge module is removed.

Changeset Id:
49ebdda2090e
(http://hg.netbeans.org/main/rev/49ebdda2090e)
Comment 17 Jesse Glick 2008-07-28 19:32:46 UTC
Just reopening for an uncompilable test:

http://deadlock.netbeans.org/hudson/job/test-compilation/975/testReport/org.netbeans.nbbuild/SubAntJUnitReport/_hudson_workdir_jobs_test_compilation_workspace_properties/

/hudson/workdir/jobs/test-compilation/workspace/properties/test/unit/src/org/netbeans/modules/properties/PropertiesEncodingTest.java:63:
PropCharset() has private access in org.netbeans.modules.properties.PropertiesEncoding.PropCharset
                = new PropCharsetEncoder(new PropCharset());
                                         ^
Comment 18 Marian Petras 2008-07-28 20:09:33 UTC
Fixed at Jul 28 18:20 +0000 2008.
Comment 19 Quality Engineering 2008-07-29 03:54:26 UTC
Integrated into 'main-golden', available in NB_Trunk_Production #352 build
Changeset: http://hg.netbeans.org/main/rev/49ebdda2090e
User: Marian Petras <mpetras@netbeans.org>
Log: simplified fix of bug #134371 - now without any bridge module
Comment 20 Ken Frank 2008-08-15 19:55:05 UTC
does this still depend on the jdk issue being fixed also or is this fix the workaround
discussed to not need that issue to be fixed ?


and where in ide does this fix apply to, since 123631 was just about editor ?

ken.frank@sun.com
Comment 21 Marian Petras 2008-08-27 11:00:02 UTC
The change that introduced the bridge module (changeset 3b6c198e12df) added a workaround for searching files.

The subsequent change (changeset 49ebdda2090e) simplified the fix (also removed the bridge module) and made it work
across the IDE such that there is no need for special patches for the editor, searching etc. It should be working
correctly in 99.99% of cases but still it might happen that it would not work in some rare cases. Of course, the fix in
JDK would eliminate the need for the workaround, resulting in 100% reliability and reduced complexity.
Comment 22 Ken Frank 2008-10-06 17:53:31 UTC
v