Bug 127612

Summary: I18N - Check that all document characters can be loaded by selected encoding
Product: platform
Component: Text
Status: NEW
Severity: blocker
Priority: P3
Version: 6.x
Hardware: All
OS: All
Issue Type: ENHANCEMENT
Reporter: Vitezslav Stejskal <vstejskal>
Assignee: issues@editor <issues>
CC: kfrank
Keywords: I18N

Description Vitezslav Stejskal 2008-02-18 10:50:40 UTC
It would be nice if the NetBeans editor could detect that some characters in a file that a user is trying to open could
not be decoded using the character encoding set for the file (i.e. usually the encoding of the file's owning project) and
warn the user that some characters might not have been loaded correctly.

Please see issue #126992 for details on what may happen when users are not careful about their files and the IDE does
not warn them. In short, a Spanish team using NetBeans 5.x had their files encoded in ISO-8859-1 and stored in CVS. Their
code was in plain English, but the Javadoc was in Spanish with some special characters. One of them started using
NetBeans 6, which automatically uses UTF-8 as the file encoding. Since the first 128 characters (the ASCII range) are
encoded identically in UTF-8 and ISO-8859-1, and the Java code in their files used just those characters, all seemed to
work fine. The problem was that their Javadoc got corrupted without anyone noticing. Then this NetBeans 6 team member
saved his files and committed them to CVS, damaging the Javadoc for all the others too.

Having a warning from the editor when he first opened an ISO-8859-1 encoded file might have prevented this situation.
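
To make the failure mode above concrete, here is a minimal sketch in plain Java. It is not NetBeans code: the class and method names (`EncodingRoundTrip`, `loadAsUtf8`, `saveAsUtf8`) are made up for illustration, and it assumes the file is loaded with a lenient decoder like the one behind `new String(bytes, charset)`, which replaces malformed input with U+FFFD instead of reporting it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingRoundTrip {
    // Lenient decoding, as done by new String(bytes, charset): malformed
    // input is silently replaced with U+FFFD instead of raising an error.
    static String loadAsUtf8(byte[] onDisk) {
        return new String(onDisk, StandardCharsets.UTF_8);
    }

    // What a later "save" would write back to disk.
    static byte[] saveAsUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A Javadoc comment containing n-tilde, stored as ISO-8859-1
        // (n-tilde is the single byte 0xF1 in that encoding).
        byte[] original = "/** a\u00f1o */ int x;".getBytes(StandardCharsets.ISO_8859_1);

        String loaded = loadAsUtf8(original);
        byte[] saved = saveAsUtf8(loaded);

        // The ASCII part (the Java code) survives; the accented character does not.
        System.out.println("contains U+FFFD: " + loaded.contains("\uFFFD"));
        System.out.println("bytes unchanged: " + Arrays.equals(original, saved));
    }
}
```

The ASCII-only Java code round-trips intact, which is why nothing looked wrong until the Javadoc was inspected.
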
Comment 1 Ken Frank 2008-02-18 16:26:43 UTC
Is this RFE about just Java files, or other kinds of files as well?
For example, various text files, such as plain text, Ruby and others, don't have encoding
tags (or defaults when there is no tag).

I think a feature like this, related to auto-detection of encoding, has been requested for a long
time, especially for files that don't have encoding tags, but it did not happen or the
algorithm to do it was not applicable. There used to be an encoding property for Java files, but
with the new FEQ (FileEncodingQuery) and the project encoding property, it is not used anymore.

OK, I see: this is not about doing more with the detected encoding, but about giving the kind of warning
that is already shown for XML, HTML and JSP files in similar situations?
(That is, rather than assuming a detected encoding for the file instead of the project encoding, which applies
to all files in the project except those with encoding tags that can override it?)

ken.frank@sun.com
Comment 2 Vitezslav Stejskal 2008-02-18 17:12:09 UTC
> Is this RFE about just Java files, or other kinds of files as well?

All text files I would say.


> I think a feature like this, related to auto-detection of encoding, has been requested for a long
> time, especially for files that don't have encoding tags, but it did not happen or the
> algorithm to do it was not applicable. There used to be an encoding property for Java files, but
> with the new FEQ (FileEncodingQuery) and the project encoding property, it is not used anymore.

I'm sorry, the subject of this issue was misleading, so I changed it. I'm not asking for autodetection of a file's
encoding, which I believe is generally not possible. All I'm asking for is that when the IDE loads a file using 'some'
encoding (e.g. the encoding can come from project properties or the file itself or maybe somewhere else, ... whatever the
current system is), it should check that all characters from the file can be loaded using that encoding. This check
obviously can't make sure that the encoding used for loading is the same as the one that was used for saving the file,
but it should catch the most disastrous cases.
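
As a rough illustration of the check I have in mind, here is a sketch using the standard `java.nio.charset` API. It is not a proposal for where this would live in the IDE: the class `DecodeCheck` and the method `decodesCleanly` are hypothetical names, and the sketch assumes the whole file content is already available as a byte array.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class DecodeCheck {
    // Returns true if the bytes decode in the given charset without any
    // malformed or unmappable sequence; false if the decoder reports one.
    static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // n-tilde stored as ISO-8859-1 is the lone byte 0xF1, which is not valid UTF-8.
        byte[] latin1 = "a\u00f1o".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(decodesCleanly(latin1, StandardCharsets.UTF_8));
        System.out.println(decodesCleanly(latin1, StandardCharsets.ISO_8859_1));
    }
}
```

The limitation mentioned above shows up in the opposite direction: any byte sequence decodes "cleanly" as ISO-8859-1, so a UTF-8 file opened with that encoding would pass the check even though its non-ASCII characters come out wrong.
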
Comment 3 Antonin Nebuzelsky 2008-04-17 15:13:44 UTC
Reassigning to new module owner mslama.