This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 26943 - Does not parse UTF-8 XML with UTF-8 marker byte
Status: RESOLVED DUPLICATE of bug 83321
Alias: None
Product: xml
Classification: Unclassified
Component: Code
Version: 3.x
Hardware: PC Windows ME/2000
Importance: P3 blocker
Assignee: _ lkramolis
URL: http://nagoya.apache.org/bugzilla/bug...
Keywords:
Depends on:
Blocks:
 
Reported: 2002-09-02 08:11 UTC by tboerkel
Modified: 2008-02-15 18:37 UTC
CC List: 0 users

See Also:
Issue Type: DEFECT
Exception Reporter:


Attachments
An example unparsable UTF-8 XML. (55 bytes, application/octet-stream)
2002-09-02 08:14 UTC, tboerkel

Description tboerkel 2002-09-02 08:11:24 UTC
If I want to edit an XML document in UTF-8 encoding with 
UTF-8 marker byte at the beginning of the file (generated 
with "Save As" UTF-8 format in Windows 2000 Notepad), then 
NetBeans cannot parse the document. If I remove the 
leading UTF-8 marker byte, then it works.
Using JDK 1.4.1 RC. Attaching an example XML.
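
Editorial illustration of the manual workaround described above (removing the leading marker before parsing). The UTF-8 marker, or byte order mark (BOM), is the three-byte sequence EF BB BF. This is a minimal sketch under stated assumptions, not NetBeans or Xerces code: the class name BomStripper and both method names are hypothetical, it assumes the whole file fits in memory, and it uses a current JDK rather than the JDK 1.4.1 RC mentioned in the report.

```java
import java.util.Arrays;

// Hypothetical helper, not part of NetBeans or Xerces.
class BomStripper {

    /** Returns true if the data starts with the UTF-8 BOM bytes EF BB BF. */
    static boolean hasUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    /** Returns a copy of the data without a leading UTF-8 BOM; other input is returned unchanged. */
    static byte[] stripUtf8Bom(byte[] data) {
        if (hasUtf8Bom(data)) {
            return Arrays.copyOfRange(data, 3, data.length);
        }
        return data;
    }
}
```

The stripped bytes could then be wrapped in a java.io.ByteArrayInputStream and handed to whichever XML parser is in use. Note that this drops the marker, so a tool that saves the file afterwards would have to write it back if it must be preserved.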
Comment 1 tboerkel 2002-09-02 08:14:31 UTC
Created attachment 7273
An example unparsable UTF-8 XML.
Comment 2 Martin Schovanek 2002-09-03 10:11:36 UTC
It is a bug in the JDK; you can vote for it.
http://developer.java.sun.com/developer/bugParade/bugs/4508058.html
Comment 3 tboerkel 2002-09-03 10:19:14 UTC
OK, I voted for it, but I don't think Sun will change this 
in the near future. But this problem renders the XML 
features of NetBeans useless (at least for us). A 
workaround should be implemented in NetBeans. BTW, Xerces 
can load such files without problems.
Comment 4 _ pkuzel 2002-09-03 10:28:56 UTC
Uh, XML modules use Xerces (2.0.0 beta 4).
Have you tried an alternative JDK such as IBM's? Does it work with that?

Text editing works, but I assume that the BOM is destroyed on save (is
it OK?).
Comment 5 tboerkel 2002-09-03 12:06:22 UTC
We are loading this XML without problems into our 
application with Xerces 2.0.1.
Text editing works, that's right and the marker is not 
destroyed on save (if I don't press DEL at the beginning 
of the file).
But we need the tree editor and the schema validation to 
work.
Comment 6 tboerkel 2002-09-03 12:07:30 UTC
Forgot to answer the JDK question:
We are only using Sun's JDK.
Comment 7 Martin Schovanek 2002-09-03 13:15:00 UTC
It looks like this is fixed in the latest Xerces releases; the module's Xerces should be
updated.

Comment 8 Martin Schovanek 2002-09-03 13:16:53 UTC
Later.
Comment 9 tboerkel 2002-09-03 13:53:13 UTC
OK, but do not use 2.0.2 or 2.1.0, use 2.0.1. The newer 
ones have severe problems with spaces in directory paths.

BTW: Can I switch NB 3.4 to Xerces 2.0.1 myself?
Comment 10 _ pkuzel 2002-09-03 14:03:10 UTC
NetBeans 3.4 uses Xerces 2.0.1, except for parsing XML file into
internal model (Xerces 2.0.0 beta 4 used).

You can port it; we use an XNI-based builder. Unfortunately XNI has
changed (and further changes are announced), so you need to map
old XNI calls to new XNI calls. See class org.netbeans.tax.io.XNIBuilder.
Comment 11 _ pkuzel 2002-09-03 14:10:01 UTC
Thomas, have you reported the space-in-path bug to Apache? Either no such bug is
reported or it is already fixed (see URL field).
Comment 12 tboerkel 2002-09-03 14:43:32 UTC
I just filed the bug (discovered yesterday), but your 
query does not find it. Use:
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=12257
Comment 13 tboerkel 2002-09-03 14:50:25 UTC
About porting to Xerces 2.0.1:
Sorry, I am not familiar with XNI and don't have the time 
at the moment.
Comment 14 _ pkuzel 2002-09-03 14:55:07 UTC
More details in
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=10633.
Comment 15 _ pkuzel 2002-09-03 15:00:06 UTC
URL update, Xerces contains 78 open bugs.
Comment 16 Mikhail Matveev 2008-02-15 18:13:45 UTC
5.5 years later the bug still exists... Tested with JDK 1.6.
Comment 17 Samaresh Panda 2008-02-15 18:37:22 UTC

*** This issue has been marked as a duplicate of 83321 ***