Bug 103549 - I18N - Editor does not support UTF-16 encoding
Summary: I18N - Editor does not support UTF-16 encoding
Status: RESOLVED FIXED
Alias: None
Product: web
Classification: Unclassified
Component: HTML Editor
Version: 6.x
Hardware: All All
Importance: P3 blocker
Assignee: Marek Fukala
URL:
Keywords: I18N
Depends on:
Blocks: 120529
Reported: 2007-05-09 11:01 UTC by Martin Schovanek
Modified: 2009-05-18 10:47 UTC

See Also:
Issue Type: DEFECT
Exception Reporter:


Description Martin Schovanek 2007-05-09 11:01:31 UTC
[#200705040000, jdk1.5.0]

to reproduce:
-------------
1) open a .html file
2) change the file encoding to UTF-16 (or other which is not compatible with
default encoding) by:
Comment 1 Marek Fukala 2007-05-09 12:42:54 UTC
reproducible
Comment 2 Ken Frank 2007-05-30 04:59:58 UTC
What problem do you see? Are the characters displayed incorrectly in the editor,
or when the HTML is run in a browser?

Also, I don't know whether the upcoming FEQ changes for web or HTML might affect
this situation (allowing a project to have an encoding property).

ken.frank@sun.com
Comment 3 Martin Schovanek 2007-06-06 14:49:52 UTC
Partially fixed: it works for UTF-16 now, but still does not work for e.g. UTF-16LE,
because there is no BOM by default. Downgrading to P3.

The problem appears when you put the following line into the HTML head section and
reopen the document.

   <meta http-equiv="Content-Type" content="text/html; charset=UTF-16LE">

The encoding property may serve as a workaround for this.
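
The "no BOM by default" point can be checked with the JDK's own encoders. The following is only an illustrative sketch (the class name BomDemo and the helper hex are made up for this example, not NetBeans code): Java's UTF-16 encoder writes a byte order mark, while UTF-16LE and UTF-16BE do not.

```java
import java.nio.charset.StandardCharsets;

public class BomDemo {
    // Hypothetical helper: renders bytes as space-separated upper-case hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "<";
        // "UTF-16" encodes big-endian and prepends the BOM FE FF.
        System.out.println("UTF-16:   " + hex(text.getBytes(StandardCharsets.UTF_16)));
        // The endian-specific charsets write no BOM at all.
        System.out.println("UTF-16LE: " + hex(text.getBytes(StandardCharsets.UTF_16LE)));
        System.out.println("UTF-16BE: " + hex(text.getBytes(StandardCharsets.UTF_16BE)));
    }
}
```

A file saved as UTF-16LE this way therefore gives a reader nothing at the start of the stream to identify its encoding.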
Comment 4 Ken Frank 2007-10-03 18:08:38 UTC
To dev: do the FEQ implementations for file and project solve this issue?

ken.frank@sun.com
Comment 5 Marek Fukala 2007-10-22 15:35:54 UTC
I really do not know how I can fix that without setting a special file encoding property. In the FEQ implementation I
need to read the stream to find the meta tag. However, the string created from the input stream is incorrect, since I do
not know the encoding and use the default. As a result, I do not find the meta tag and do not return the encoding; the
FEQ infrastructure then uses the project encoding, which causes the file to be loaded incorrectly. BTW, how do other
editors handle this? Isn't it a generic problem? Has anyone already solved this?
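
To illustrate why the meta tag is not found, here is a minimal sketch (the class WrongDecodeDemo and the method findsCharset are hypothetical names, and ISO-8859-1 and UTF-8 only stand in for whatever the platform default happens to be). When UTF-16LE bytes are decoded with a single-byte or UTF-8 decoder, every ASCII character is followed by a NUL character, so a plain search for "charset=" fails.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WrongDecodeDemo {
    // Hypothetical helper: decodes the bytes with the guessed charset and
    // reports whether the literal text "charset=" is visible in the result.
    static boolean findsCharset(byte[] fileBytes, Charset guess) {
        String decoded = new String(fileBytes, guess);
        return decoded.contains("charset=");
    }

    public static void main(String[] args) {
        String meta = "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-16LE\">";
        byte[] onDisk = meta.getBytes(StandardCharsets.UTF_16LE); // no BOM written

        // Wrong guesses: the text decodes as 'c', NUL, 'h', NUL, ... and the search fails.
        System.out.println(findsCharset(onDisk, StandardCharsets.ISO_8859_1));
        System.out.println(findsCharset(onDisk, StandardCharsets.UTF_8));
        // Correct guess: the tag is readable.
        System.out.println(findsCharset(onDisk, StandardCharsets.UTF_16LE));
    }
}
```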
Comment 6 Tomas Zezula 2007-10-22 16:54:01 UTC
I don't know how it is in HTML, but it should probably be the same as in XML: a UTF-16 file starts with the UTF-16 mark
(BOM); otherwise the head has to be in UTF-8 or ISO Latin 1, I am not sure which one. Look into EncodingUtil in XML/Core.
Comment 7 Marek Fukala 2007-10-22 16:58:45 UTC
Tomas, we do the same as in XML: we look for a BOM, but it seems UTF-16LE doesn't have one.
Comment 8 Vitezslav Stejskal 2007-10-29 16:30:31 UTC
IMO the first 128 characters are the same in UTF-8 and ISO Latin 1. So I would say that if there is no UTF-16 mark, just
fall back on UTF-8 for reading the header.
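
The suggestion above could look roughly like the following sketch. The class HeaderCharset and the method charsetForHeader are names invented for this illustration; this is not the code in XML/Core EncodingUtil or HtmlDataObject. It checks the well-known BOM byte sequences and otherwise falls back to UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class HeaderCharset {
    // Hypothetical helper: picks a charset for reading the document head.
    // A BOM decides the encoding; without one, UTF-8 is used as the fallback.
    static Charset charsetForHeader(byte[] head) {
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;      // UTF-8 BOM
        }
        if (head.length >= 2) {
            int b0 = head[0] & 0xFF;
            int b1 = head[1] & 0xFF;
            if (b0 == 0xFE && b1 == 0xFF) {
                return StandardCharsets.UTF_16BE; // big-endian BOM
            }
            if (b0 == 0xFF && b1 == 0xFE) {
                return StandardCharsets.UTF_16LE; // little-endian BOM
            }
        }
        return StandardCharsets.UTF_8;          // no BOM: fall back
    }

    public static void main(String[] args) {
        byte[] withBom = "<html>".getBytes(StandardCharsets.UTF_16); // starts with FE FF
        byte[] plain = "<html>".getBytes(StandardCharsets.UTF_8);
        System.out.println(charsetForHeader(withBom));
        System.out.println(charsetForHeader(plain));
    }
}
```

Note that a UTF-16LE file written without a BOM also takes the fallback branch here, so the head is still decoded with the wrong charset in that case.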
Comment 9 Marek Fukala 2007-10-31 09:02:53 UTC
Fixed. Martin, please verify ASAP.

Checking in HtmlDataObject.java;
/cvs/html/src/org/netbeans/modules/html/HtmlDataObject.java,v  <--  HtmlDataObject.java
new revision: 1.32; previous revision: 1.31
done
Comment 10 Martin Schovanek 2007-10-31 13:13:22 UTC
I can still reproduce it; reopening.
Comment 11 Marek Fukala 2007-10-31 14:39:10 UTC
Fixed. The problem with the previous fix was that it assumed a UTF-16LE encoded stream has a BOM. I extended the logic so
that the code tries to read the file and find the meta tag using, in turn, the default encoding (or the one found from
the BOM), UTF-16LE, and UTF-16BE.

Checking in HtmlDataObject.java;
/cvs/html/src/org/netbeans/modules/html/HtmlDataObject.java,v  <--  HtmlDataObject.java
new revision: 1.33; previous revision: 1.32
done
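
The approach described in this comment can be sketched as follows. This is an assumption-laden illustration, not the actual HtmlDataObject change: the class MetaCharsetSniffer, the method sniff, and the regular expression are all invented here. The head of the file is decoded with each candidate charset in order, and the first decoding in which a charset declaration is readable wins.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MetaCharsetSniffer {
    // Simplified pattern for charset=NAME; a real HTML parser would be stricter.
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*[\"']?([\\w.:-]+)", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: 'first' is the default encoding or the one implied by a BOM.
    // Returns the declared charset name, or null if no candidate reveals a declaration.
    static String sniff(byte[] head, Charset first) {
        Charset[] candidates = {
            first, StandardCharsets.UTF_16LE, StandardCharsets.UTF_16BE
        };
        for (Charset candidate : candidates) {
            Matcher m = CHARSET.matcher(new String(head, candidate));
            if (m.find()) {
                return m.group(1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String meta = "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-16LE\">";
        byte[] onDisk = meta.getBytes(StandardCharsets.UTF_16LE); // BOM-less file
        System.out.println(sniff(onDisk, StandardCharsets.UTF_8));
    }
}
```

Trying the first candidate before the UTF-16 variants keeps the common case (an ASCII-compatible file) to a single decoding pass.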
Comment 12 Ken Frank 2007-11-04 20:03:07 UTC
Martin,

Can you see if it's OK now?

ken.frank@sun.com