132268 – I18N - utf-16 encoding interaction between wordpad.exe and netbeans ide

This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Status:	RESOLVED INVALID

Alias:	None

Product:	java
Classification:	Unclassified
Component:	Editor
Version:	6.x
Hardware:	PC Windows XP

Importance:	P3 blocker
Assignee:	Jan Lahoda

URL:
Keywords:	I18N

Depends on:
Blocks:

Reported:	2008-04-08 00:41 UTC by vy0123
Modified:	2008-04-25 12:47 UTC
CC List:	2 users

See Also:
Issue Type:	DEFECT
Exception Reporter:

Description vy0123 2008-04-08 00:41:42 UTC

A file is created with wordpad.exe, a single comment line is added, and the file is saved. The file is then opened in
the NetBeans IDE, another comment line is added, and the file is saved again. When the file is reopened in wordpad.exe,
the content appears garbled. Hexdumps of the file before and after the NetBeans edit follow:

#before: file created by wordpad.exe
$ hexdump -c Server2.java
0000000 377 376   /  \0   *  \0      \0   F  \0   i  \0   l  \0   e  \0
0000010   n  \0   a  \0   m  \0   e  \0   :  \0      \0   S  \0   e  \0
0000020   r  \0   v  \0   e  \0   r  \0   2  \0   .  \0   j  \0   a  \0
0000030   v  \0   a  \0      \0   *  \0   /  \0
000003a

#after: file changed by netbeans ide
$ hexdump -c Server2.java
0000000 376 377  \0   /  \0   *  \0      \0   F  \0   i  \0   l  \0   e
0000010  \0   n  \0   a  \0   m  \0   e  \0   :  \0      \0   S  \0   e
0000020  \0   r  \0   v  \0   e  \0   r  \0   2  \0   .  \0   j  \0   a
0000030  \0   v  \0   a  \0      \0   *  \0   /  \0  \r  \0  \n  \0   /
0000040  \0   *  \0      \0   D  \0   e  \0   s  \0   c  \0   r  \0   i
0000050  \0   p  \0   t  \0   i  \0   o  \0   n  \0   :  \0      \0   *
0000060  \0   /
0000062

Note the order of the first two bytes: 377 376 (0xFF 0xFE) before, 376 377 (0xFE 0xFF) after.
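The first two bytes in the dumps above can be checked programmatically. The following is a minimal sketch, not part of the original report; the class name BomSniffer and the method detectUtf16ByteOrder are hypothetical names chosen for illustration. It only looks at the two-byte UTF-16 byte order mark and ignores other encodings (for example, a UTF-32 little-endian file also starts with FF FE).

```java
public class BomSniffer {
    /**
     * Hypothetical helper: reports the UTF-16 byte order implied by the
     * first two bytes of a file, or "unknown" if no UTF-16 BOM is present.
     */
    static String detectUtf16ByteOrder(byte[] head) {
        if (head == null || head.length < 2) {
            return "unknown";
        }
        int b0 = head[0] & 0xFF; // mask to treat the signed byte as unsigned
        int b1 = head[1] & 0xFF;
        if (b0 == 0xFF && b1 == 0xFE) {
            return "UTF-16LE"; // BOM U+FEFF stored low byte first
        }
        if (b0 == 0xFE && b1 == 0xFF) {
            return "UTF-16BE"; // BOM U+FEFF stored high byte first
        }
        return "unknown";
    }

    public static void main(String[] args) {
        // Leading bytes copied from the hexdumps, written as octal literals.
        byte[] before = {(byte) 0377, (byte) 0376};
        byte[] after = {(byte) 0376, (byte) 0377};
        System.out.println("before: " + detectUtf16ByteOrder(before));
        System.out.println("after:  " + detectUtf16ByteOrder(after));
    }
}
```

Under these assumptions the "before" file is classified as little endian and the "after" file as big endian.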

My environment is as follows:

Product Version: NetBeans IDE 6.1 Beta (Build 200803050202)
Java: 1.6.0_03; Java HotSpot(TM) Client VM 1.6.0_03-b05
System: Windows XP version 5.1 running on x86; Cp1252; en_AU (nb)
Userdir: C:\Documents and Settings\user0\.netbeans\6.1beta

Comment 1 Ken Frank 2008-04-09 18:57:31 UTC

To issue evaluators: can this be placed in the applicable cat/subcat?

ken.frank@sun.com

Comment 2 Jana Maleckova 2008-04-11 11:03:43 UTC

Reassigning to the Java editor for evaluation.

Comment 3 Jan Lahoda 2008-04-11 14:46:18 UTC

AFAICT, a file encoded in UTF-16 can be either big endian or little endian. Endianness is detected from the first two
bytes (the byte order mark). The original file uses little endian and the second big endian, but the second file also
looks correct to me. So this looks like a problem in the application, which cannot read the big-endian UTF-16 file
correctly. Please also note that the encoding is done by the JDK itself.
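The JDK behaviour referred to here can be reproduced outside the IDE. The sketch below is an illustration only, assuming Java 7 or later for java.nio.charset.StandardCharsets; the class Utf16RoundTrip and its method names are hypothetical. It shows that the JDK's "UTF-16" charset writes a big-endian byte order mark when encoding, that its decoder accepts either byte order when a mark is present, and one possible way to produce little-endian output with a mark by hand.

```java
import java.nio.charset.StandardCharsets;

public class Utf16RoundTrip {
    /** Encodes with the JDK "UTF-16" charset, which emits FE FF and big-endian code units. */
    static byte[] encodeDefaultUtf16(String text) {
        return text.getBytes(StandardCharsets.UTF_16);
    }

    /** Decodes with the JDK "UTF-16" charset, which reads the BOM to pick the byte order. */
    static String decodeUtf16(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16);
    }

    /** UTF-16LE writes no BOM on its own, so FF FE is prepended manually. */
    static byte[] encodeLittleEndianWithBom(String text) {
        byte[] body = text.getBytes(StandardCharsets.UTF_16LE);
        byte[] out = new byte[body.length + 2];
        out[0] = (byte) 0xFF;
        out[1] = (byte) 0xFE;
        System.arraycopy(body, 0, out, 2, body.length);
        return out;
    }

    public static void main(String[] args) {
        String text = "/* Filename: Server2.java */";
        byte[] bigEndian = encodeDefaultUtf16(text);
        byte[] littleEndian = encodeLittleEndianWithBom(text);
        System.out.printf("UTF-16 starts with:   %02X %02X%n", bigEndian[0], bigEndian[1]);
        System.out.printf("LE + BOM starts with: %02X %02X%n", littleEndian[0], littleEndian[1]);
        // Both byte orders decode to the same text when read as "UTF-16".
        System.out.println(decodeUtf16(bigEndian).equals(decodeUtf16(littleEndian)));
    }
}
```

Whether a given application reads both forms is up to that application; the sketch only demonstrates what the JDK itself writes and accepts.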