
Bug 33338 - I18N - Patch application corrupts localized text if written in encoding different from the system one.
Status: VERIFIED FIXED
Alias: None
Product: utilities
Classification: Unclassified
Component: Diff
Version: -S1S-
Hardware: PC Windows XP
Importance: P1 blocker
Assignee: Martin Entlicher
URL:
Keywords: I18N
Depends on:
Blocks:
 
Reported: 2003-05-01 14:38 UTC by Jiri Kovalsky
Modified: 2003-07-14 14:06 UTC

See Also:
Issue Type: DEFECT
Exception Reporter:


Attachments
The binary patch that fixes this problem. Put it into the <NB-install>/modules/patches/org-netbeans-modules-diff/ folder. (6.95 KB, application/octet-stream)
2003-05-06 14:15 UTC, Martin Entlicher
The contextual diff of the fix. (2.60 KB, patch)
2003-05-06 14:17 UTC, Martin Entlicher

Description Jiri Kovalsky 2003-05-01 14:38:26 UTC
RC1 build #030428_1 of Sun ONE Studio 5.0
Windows XP with JDK 1.4.1 build #02

Description:
============
Applying a patch to a file that contains localized
text replaces the localized characters with
question marks (?). This is very annoying because
the diff is then almost unreadable, and it is in
fact data loss.

Steps to reproduce:
===================
1. Have a [Locally Modified] file under a CVS
filesystem.
2. Invoke "CVS|Diff Textual" on the file.
3. Right click the text area and "Save To File".
4. Delete the file and "CVS|Update" it back.
5. Add some localized text, e.g. "český text".
6. Invoke "Tools|Apply Patch..." on the file and
select the patch created in step 3.
7. See the differences. All Czech characters will be
replaced by ? signs.
Comment 1 Martin Entlicher 2003-05-02 11:50:33 UTC
The patch algorithm uses Reader and Writer to read and write the
patches and files.
Is the encoding of your system different from the encoding of the
patched file? I guess it is so.

Thus the problem is:
Patch application corrupts the text if it's written in an encoding that
is different from the current system encoding.
A similar bug was solved in javacvs for I/O operations (issue #27547).
The solution would be the same -- do not use Reader and Writer at all,
use pure input and output streams.

This must be fixed at least in NB 4.0. I'm not sure whether this is a
showstopper for NB 3.5. Can this be considered a P1?
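
The kind of corruption described in this comment can be reproduced in isolation. The following is a minimal sketch, not the Diff module's code: the class and method names are hypothetical, and US-ASCII merely stands in for "a system encoding that cannot represent the file's bytes" (the reporter's actual system encoding is not stated in this report). The input bytes are "český" as stored in ISO-8859-2 or windows-1250.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of NetBeans.
final class LossyRoundTrip {
    // Decode bytes to characters and encode them back with the same charset,
    // which is what a Reader followed by a Writer effectively does.
    static byte[] roundTrip(byte[] input, Charset charset) {
        return new String(input, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // "český" in ISO-8859-2 / windows-1250: 0xE8 is 'č', 0xFD is 'ý'.
        byte[] original = {(byte) 0xE8, 'e', 's', 'k', (byte) 0xFD};

        // Assumption: US-ASCII plays the role of the mismatched system encoding.
        byte[] result = roundTrip(original, StandardCharsets.US_ASCII);

        // Bytes the decoder cannot map become U+FFFD, which the encoder
        // then writes as '?'.
        System.out.println(new String(result, StandardCharsets.US_ASCII)); // prints "?esk?"
    }
}
```

Copying the bytes through plain input and output streams, as proposed above, avoids the decode and encode steps altogether.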
Comment 2 Jiri Kovalsky 2003-05-05 14:27:46 UTC
This is completely reproducible in one session of the IDE. The opinion of QA
is that this is a showstopper and should be fixed in the FCS of Sun ONE
Studio 5.0. I am therefore raising the priority accordingly.
Comment 3 Martin Entlicher 2003-05-05 14:44:26 UTC
Starting to work on it...
Comment 4 Martin Entlicher 2003-05-06 09:11:01 UTC
It seems like this is caused by
http://developer.java.sun.com/developer/bugParade/bugs/4415511.html
and
http://developer.java.sun.com/developer/bugParade/bugs/4227538.html

From these "bugs" it's clear why on Windows the conversion from bytes
to characters and back to bytes does not provide the original bytes.
The "ISO-8859-1" encoding does not have these "leaks" and converts all
bytes to characters and back. So the solution seems to be to use the
"ISO-8859-1" encoding for all I/O operations.
Comment 5 Martin Entlicher 2003-05-06 14:14:34 UTC
The problem is fixed in the main trunk. ISO-8859-1 encoding is used
for reading the file and the patch. This ensures that all bytes
can be converted to characters and back.
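
The round-trip property claimed here can be checked directly. This is a minimal sketch with hypothetical names, not code from PatchAction.java: it decodes every possible byte value with a given charset, encodes the result back, and compares it with the input. US-ASCII is included only as a contrasting example of a charset that loses bytes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of NetBeans.
final class Latin1RoundTrip {
    // Returns true if all 256 byte values survive bytes -> chars -> bytes.
    static boolean survivesRoundTrip(Charset charset) {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        byte[] back = new String(all, charset).getBytes(charset);
        return Arrays.equals(all, back);
    }

    public static void main(String[] args) {
        // ISO-8859-1 maps each byte to the character with the same code point.
        System.out.println("ISO-8859-1: " + survivesRoundTrip(StandardCharsets.ISO_8859_1)); // ISO-8859-1: true
        // US-ASCII cannot decode bytes 0x80-0xFF, so they come back as '?'.
        System.out.println("US-ASCII: " + survivesRoundTrip(StandardCharsets.US_ASCII)); // US-ASCII: false
    }
}
```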

/cvs/diff/src/org/netbeans/modules/diff/PatchAction.java,v  <-- 
PatchAction.java
new revision: 1.12; previous revision: 1.11
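
To illustrate why reading both the file and the patch as ISO-8859-1 keeps unrelated bytes intact, here is a simplified sketch. It is not the change made in revision 1.12 (that is in the attached contextual diff); the class and the `replaceText` helper are hypothetical, and a plain text replacement stands in for real patch application.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not the actual PatchAction code.
final class ByteSafeEdit {
    // Treat the content as ISO-8859-1 so every byte becomes exactly one char,
    // replace all occurrences of oldText, and convert back to bytes.
    // Assumption: oldText and newText were themselves decoded as ISO-8859-1,
    // so they contain only characters up to U+00FF.
    static byte[] replaceText(byte[] content, String oldText, String newText) {
        String text = new String(content, StandardCharsets.ISO_8859_1);
        return text.replace(oldText, newText).getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // 0xE8 and 0xFD are 'č' and 'ý' in ISO-8859-2 / windows-1250; the
        // code never needs to know which encoding the file really uses.
        byte[] file = {(byte) 0xE8, '\n', 'f', 'o', 'o', '\n', (byte) 0xFD};
        byte[] patched = replaceText(file, "foo", "bar");
        System.out.println(Arrays.toString(patched));
    }
}
```

Only the replaced line changes; the bytes of the localized characters pass through untouched, whatever encoding the file was written in.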
Comment 6 Martin Entlicher 2003-05-06 14:15:52 UTC
Created attachment 10228
The binary patch that fixes this problem. Put it into the <NB-install>/modules/patches/org-netbeans-modules-diff/ folder.
Comment 7 Martin Entlicher 2003-05-06 14:17:59 UTC
Created attachment 10229
The contextual diff of the fix.
Comment 8 Jiri Kovalsky 2003-05-06 14:33:31 UTC
I tested it on Czech Windows 98 and English Windows 2000 with
appropriate encoding set on the java data object and it works okay.
Patch is always applied correctly without user data being corrupted.
NetBeans 3.5 RC1 build #200304222350 with Patch33338.jar.
Comment 9 Richard Gregor 2003-05-06 14:38:34 UTC
Code reviewed without objections.
Comment 10 Martin Entlicher 2003-05-06 15:08:46 UTC
I've tested it myself on Solaris with a Japanese locale (NB RC1 with the
patch) and it also worked without problems.
Comment 11 _ ttran 2003-05-06 15:15:13 UTC
approved for 3.5
Comment 12 Martin Entlicher 2003-05-06 15:29:39 UTC
Thanks for the verification, review and approval. The bug is fixed in
release35 branch:

/cvs/diff/src/org/netbeans/modules/diff/PatchAction.java,v  <-- 
PatchAction.java
new revision: 1.8.2.3; previous revision: 1.8.2.2
Comment 13 hiroshiy 2003-06-13 10:55:49 UTC
Hello Jiri,

I've verified the fix in the following environment:

    Solaris 9 Japanese, j2sdk1.4.1_02,
    Nevada RC7 Build, ja locale.

I think this fix is also valid in the Czech locale.
However, I don't have Czech Windows in my office.
Could you try to verify this?

Hiroshi
Comment 14 Jiri Kovalsky 2003-07-14 14:06:49 UTC
Excellent, I am pleased to verify the fix in Sun ONE
Studio 5.0 Standard Edition build #030528 using Windows
98, J2SDK 1.4.1_02 and Czech locale.
Thanks, Hiroshi, for your cooperation.