134371 – I18N: search is not working for .properties files containing Unicode sequences (\uxxxx)

This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 134371 - I18N: search is not working for .properties files containing Unicode sequences (\uxxxx)

Summary: I18N: search is not working for .properties files containing Unicode sequence...

Status:	VERIFIED FIXED

Alias:	None

Product:	utilities
Classification:	Unclassified
Component:	Search (show other bugs)
Version:	6.x
Hardware:	All All

Importance:	P3 blocker (vote)
Assignee:	Marian Petras

URL:
Keywords:	I18N

Depends on:
Blocks:

Reported:	2008-05-02 06:55 UTC by Masaki Katakai
Modified:	2008-10-06 17:53 UTC (History)
CC List:	3 users (show)

See Also:
Issue Type:	DEFECT
Exception Reporter:

Attachments
screenshot (84.45 KB, image/png) 2008-05-02 06:57 UTC, Masaki Katakai	Details
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Masaki Katakai 2008-05-02 06:55:29 UTC

This bug from bug 123631. Search is not working properly for localized
.properties files because it's not implemented yet.

Comment 1 Masaki Katakai 2008-05-02 06:57:29 UTC

Created attachment 60952 [details]
screenshot

Comment 2 Marian Petras 2008-05-02 08:51:01 UTC

Fix of this bug requires one of the following:

a) make the JDK developers fix their bug #4744247 (i.e. do not fix this, just wait until the bug is fixed)
b) make an ugly hack in the search routine such that decoding of .properties files is done in a special way

Comment 3 Marian Petras 2008-07-22 13:06:51 UTC

Fixed.

I made a bridge module between "Resource Bundles" (properties) and "User Utilities" (utilities), named "Searching in
Properties Files" (properties.search). This modules defines a special routine for obtaining CharsetDecoder for a given
file. The routine is special in that it calls a two-argument constructor of PropertiesEncoding.PropCharsetDecoder. The
second argument passes information about length of the file that is about to be decoded such that the decoder knows when
to flush the buffer, allowing it to workaround the JDK bug.

Changeset Id:
3b6c198e12df
(http://hg.netbeans.org/main/rev/3b6c198e12df)

Comment 4 Jesse Glick 2008-07-22 15:38:49 UTC

Wouldn't it have sufficed for PropertiesEncoding.getEncoding(FileObject) to return a (Prop)Charset which retained the
file length as an instance field, passing it in turn to (Prop)CharsetDecoder?

Comment 5 Marian Petras 2008-07-24 15:09:37 UTC

It would be a good solution if there was a guarantee that the encoding (Charset) provided by method
PropertiesEncoding.getEncoding(FileObject) is always used immediately after it is obtained. To be accurate, the
condition is that the FileObject's length must not change since the Charset is obtained until it is used. I can only
guarantee this for the Utilities module, not for other modules.

Comment 6 Jesse Glick 2008-07-24 15:37:23 UTC

Perhaps the PropCharset could retain the FileObject itself as an instance field, only checking the file's length when
actual decoding begins.

Comment 7 Marian Petras 2008-07-24 16:18:15 UTC

Yes, it should work. And the reference to the FileObject could be weak.

Comment 8 Jesse Glick 2008-07-24 16:21:58 UTC

Yes you could use a weak reference, though I cannot think of a situation where you would hold on to the Charset of a
file without wanting to holding on to the file itself.

Comment 9 Marian Petras 2008-07-24 16:40:42 UTC

"... I cannot think of a situation where you would hold on to the Charset of a file without wanting to holding on to the
file itself."

That is also the reason why it is safe to use weak references. If such situation happens (by accident), method
newDecoder() will throw an IllegalStateException("trying to use a decoder for a non-existing FileObject"). But it may
happen that someone keeps a reference to the PropCharset even though it is not necessary any more. In this case, the
weak reference will not prevent the FileObject from being garbage-collected.

If I understand correctly, you would like me to remove the bridge module and use the described solution instead, right?

Comment 10 Jesse Glick 2008-07-24 16:49:33 UTC

True, and you could even keep both

private long size;
private /*Weak*/Reference<FileObject> file;

to cover all possibilities.

If the implementation I suggested really works, then of course it would be nicer - just much simpler code, one less
module, etc.

Comment 11 Marian Petras 2008-07-24 16:56:05 UTC

Why keep the "size" field? When and how could I use it?

Comment 12 Jesse Glick 2008-07-24 17:21:10 UTC

In case the FileObject was collected, then you could use the stored size rather than throwing ISE. (The file may well
still exist on disk - the FO can be collected and then recreated.)

Comment 13 Marian Petras 2008-07-24 17:41:09 UTC

I would prefer not using the cached size. If the FileObject was garbage-collected during the period since initialization
of the Charset until the decoding begins and the corresponding file gets smaller during that period, the decoder might
drop some bytes from input. I believe this is a real danger.

Even if the ISE is called, the same situation might happen during the period since initialization of the decoder, but I
consider this much less probable.

Comment 14 Jesse Glick 2008-07-24 17:45:41 UTC

You could keep a URL for the FileObject. This poses no memory leak threat.

Comment 15 Marian Petras 2008-07-24 18:45:36 UTC

That is a good idea.

So I will use a combination of URL and a weak reference to a FileObject. The weak reference will be used in the first
place. If the FileObject cannot be obtained from the weak reference, the URLMapper.findFileObject(URL) will be used. If
it fails, too, then an ISE will be thrown. All of this will happen in method PropCharset.newDecoder().

I realize that to make it correct to the maximum extent, I could attach a FileChangeListener and update the URL each
time the FileObject is renamed. But I am not sure it is worth doing it.

Comment 16 Marian Petras 2008-07-28 17:29:59 UTC

I have reworked the fix as explained in the above discussion. The bridge module is removed.

Changeset Id:
49ebdda2090e
(http://hg.netbeans.org/main/rev/49ebdda2090e)

Comment 17 Jesse Glick 2008-07-28 19:32:46 UTC

Just reopening for an uncompilable test:

http://deadlock.netbeans.org/hudson/job/test-compilation/975/testReport/org.netbeans.nbbuild/SubAntJUnitReport/_hudson_workdir_jobs_test_compilation_workspace_properties/

/hudson/workdir/jobs/test-compilation/workspace/properties/test/unit/src/org/netbeans/modules/properties/PropertiesEncodingTest.java:63:
PropCharset() has private access in org.netbeans.modules.properties.PropertiesEncoding.PropCharset
                = new PropCharsetEncoder(new PropCharset());
                                         ^

Comment 18 Marian Petras 2008-07-28 20:09:33 UTC

Fixed at Jul 28 18:20 +0000 2008.

Comment 19 Quality Engineering 2008-07-29 03:54:26 UTC

Integrated into 'main-golden', available in NB_Trunk_Production #352 build
Changeset: http://hg.netbeans.org/main/rev/49ebdda2090e
User: Marian Petras <mpetras@netbeans.org>
Log: simplified fix of bug #134371 - now without any bridge module

Comment 20 Ken Frank 2008-08-15 19:55:05 UTC

does this still depend on the jdk issue being fixed also or is this fix the workaround
discussed to not need that issue to be fixed ?


and where in ide does this fix apply to, since 123631 was just about editor ?

ken.frank@sun.com

Comment 21 Marian Petras 2008-08-27 11:00:02 UTC

The change that introduced the bridge module (changeset 3b6c198e12df) added a workaround for searching files.

The subsequent change (changeset 49ebdda2090e) simplified the fix (also removed the bridge module) and made it work
across the IDE such that there is no need for special patches for the editor, searching etc. It should be working
correctly in 99.99% of cases but still it might happen that it would not work in some rare cases. Of course, the fix in
JDK would eliminate the need for the workaround, resulting in 100% reliability and reduced complexity.

Comment 22 Ken Frank 2008-10-06 17:53:31 UTC