I found that JSP source files already have an encoding capability. However, when
I change the encoding, for example from ISO-8859-1 to UTF-8, the file content is
not converted from one to the other; the encoding attribute is simply set.
As for Java source files: yes, the Java compiler only accepts ASCII by default,
and it is very hard to enter Unicode escape characters by hand.
If Java files had the same encoding capability as JSP files, plus a menu command
for converting a file's encoding (native2ascii could be used), it would be much
easier and more comfortable to use.
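For readers unfamiliar with native2ascii: it rewrites every character outside 7-bit ASCII as a \uXXXX escape. A minimal sketch of that conversion (illustrative only, not the actual tool's code):

```java
public class Native2Ascii {
    // Sketch of the escaping performed by the JDK's native2ascii tool:
    // characters outside 7-bit ASCII become \uXXXX escapes,
    // while plain ASCII passes through unchanged.
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 128) {
                sb.append(c);
            } else {
                // cast to int: %x does not accept a char argument
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }
}
```

A menu command could run this (or its `-reverse` counterpart) over the file in place.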
I think there should be some general mechanism for handling file
encodings/conversions. IMHO the IDE currently uses only the JVM's default
encoding.
When reading a file, the editor kit is only given an input stream, so it falls
back to the default encoding. The editor kit must be given a Reader (with the
proper byte-to-char converter) instead.
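The difference is easy to demonstrate: the same bytes decoded with an explicit charset versus the platform default. A minimal sketch (the method name is mine, not the openide API; in the editor this would mean wrapping the InputStream in `new InputStreamReader(in, charset)`):

```java
import java.nio.charset.Charset;

public class EncodedRead {
    // Decode file bytes with an explicit per-file charset rather than
    // letting java.io fall back to the platform default encoding.
    public static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }
}
```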
Reassigning to core, but openide should be involved too, I guess.
Passing to Peter.
In addition, I would like to be able to write my comments and generate
documentation in my own language.
Target milestone was changed from '3.4' to TBD.
I have a feeling this is a duplicate of something.
By the way, Isomchai - try my little experimental module.
It at least makes it easier to insert (but not read) escapes - mostly
for alphabetic/syllabic languages; beyond those it is too clumsy to be useful.
*** Issue 25191 has been marked as a duplicate of this issue. ***
Such an API has been proposed and discussed in various forms on
several occasions on the list throughout the past couple of years.
Suggestions that I remember have included:
- an EncodingCookie which supplies the encoding of a file
- cause EditorCookie to automatically decode/encode the file according
to a locale property associated with it
Definitely needs a complete proposal and discussion; the issue is
pretty complicated when you consider:
- How much should be public API vs. hidden implementation?
- usage of platform default encoding vs. a standard encoding like UTF-8
- Unix vs. Win vs. Mac line endings - should the same mechanism solve
this problem?
- external processes like javac may need to know file encoding, so
encoding cannot be completely hidden in implementation
- UI to present the choice? prop ed needed (issue #20259); per-file
selection? per-file-type? per-filesystem (issue #25189)? global default?
- input methods: is the OS's keyboard support and JRE's input method
framework sufficient for users to enter international text in the
editor, or do we need any more support?
- escape vs. raw: for XML, HTML, .properties, and .java, there are
standardized Unicode escape syntaxes. Should the Editor window display
the raw characters, the escapes, or should you be able to choose on
the fly (a question for editor.netbeans.org probably)? Should the file
saved to disk contain the raw characters (encoded suitably), the
escapes (encoding irrelevant), or should this be a choice (i.e.
"escaped" is a special kind of "encoding")?
Issue 25191 is a duplicate of this issue, so I am marking this one as a defect,
after consultation with NB QA and comments from NB strategy that some i18n RFEs
could actually be viewed as defects.
Let me know if more details are needed.
Also, 20259 will be marked as a defect on the same grounds.
Finally, would 27240 be a duplicate of this also? If so, I can mark
it as such.
To the previous note from Jesse:
IMO the editor should display the content that was obtained from the
java.io.Reader without changes, i.e. if there is a "raw" Unicode char,
that char should be displayed, and if there was '\\' 'u' ..., then that
text should be displayed.
IMHO additional tweaking of the characters, such as expanding them to
escapes, should be handled by pluggable filters. In general there
could be several cascaded filters.
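One such filter could expand \uXXXX escapes into raw characters. A hedged sketch (assuming well-formed escapes; a real editor filter would wrap a java.io.Reader, and cascading several filters would then be plain composition):

```java
public class EscapeFilter {
    // Expand \uXXXX escapes into the raw characters they denote.
    // Malformed escapes are passed through unattempted; a production
    // filter would need a policy for them.
    public static String expandEscapes(String s) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            if (s.charAt(i) == '\\' && i + 5 < s.length() && s.charAt(i + 1) == 'u') {
                // parse the four hex digits after "\u"
                sb.append((char) Integer.parseInt(s.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                sb.append(s.charAt(i));
                i++;
            }
        }
        return sb.toString();
    }
}
```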
We should discuss whether input methods are sufficient for entering
the characters. I have no informed opinion on that, because I don't
use input methods.
*** Issue 27240 has been marked as a duplicate of this issue. ***
Issue #27240 also suggests per-file-type encoding defaults in some
uniform way. But I think we need per-file encodings anyway.
Jesse Glick raises several interesting questions, which I'd
like to address. For example:
>> - an EncodingCookie which supplies the encoding of a file
>> - cause EditorCookie to automatically decode/encode the
>> file according to a locale property associated with it
I'm not quite sure what these two mean, but there exists a
current mechanism for specifying the encoding for .java
files. The encoding gets saved in the directory's .nbattrs
file. This works well.
>> - How much for API vs. hidden implementation?
The current mechanism to specify the encoding of .java
files works well, and I feel it should be applied to all
files. It shouldn't be hidden, because the user needs some
control to specify which files use which encodings.
>> - usage of platform default encoding vs. a standard
>> encoding like UTF-8
The platform default should be the default encoding, but
the user needs to be able to override it for specific files
or file types.
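That policy is trivial to express; the sketch below just makes the fallback order explicit (where the override comes from — file attribute, file type, filesystem — is exactly what this issue is about, and is omitted here):

```java
import java.nio.charset.Charset;

public class EncodingPolicy {
    // A per-file (or per-type) override wins when present;
    // otherwise fall back to the platform default encoding.
    public static Charset effective(Charset override) {
        return override != null ? override : Charset.defaultCharset();
    }
}
```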
>> - Unix vs. Win vs. Mac line endings - should the same
>> mechanism solve this problem?
This is an interesting idea, but I suspect it would cause
more problems than it would solve. Line endings aren't an
encoding issue. This should be seen as a separate issue,
probably an editor issue. (Personally, I feel users should
be allowed to specify a default line-ending, which should
be used when saving files, but any standard line-ending
should end a line when reading files.)
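The "accept any line ending on read" half of that policy is in fact what java.io already does: BufferedReader.readLine treats \n, \r and \r\n alike. A small illustration:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class AnyLineEnding {
    // Split text into lines, accepting \n, \r or \r\n as a line
    // terminator, exactly as BufferedReader.readLine specifies.
    public static List<String> lines(String text) {
        List<String> out = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            for (String line; (line = br.readLine()) != null; ) {
                out.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e); // cannot happen for StringReader
        }
        return out;
    }
}
```

Only the "write with one configured ending" half would need new IDE support.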
>> - external processes like javac may need to know file
>> encoding, so encoding cannot be completely hidden in implementation
If the file is always loaded using the specified encoding,
the external processes shouldn't have any problems.
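The failure mode when an external tool guesses wrong is easy to show: the same bytes decoded under a different charset produce mojibake. A small illustration of what javac would "see" if a UTF-8 source file were read as ISO-8859-1:

```java
import java.nio.charset.StandardCharsets;

public class Mojibake {
    // Encode with the charset the IDE actually used, then decode with
    // the charset an uninformed external tool would assume.
    public static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }
}
```

This is why the chosen encoding must be exposed (e.g. passed to javac's `-encoding` option), not buried in the editor implementation.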
>> - UI to present the choice? prop ed needed (issue
>> #20259); per-file selection? per-file-type? per-
>> filesystem (issue #25189)? global default?
A property editor would be a good idea. It's a separate
issue, though, and should be considered separately. I'd
also like to be able to specify the encoding by file type, but this
shouldn't be seen as a substitute for specifying it for specific files.
>> - input methods: is the OS's keyboard support and JRE's
>> input method framework sufficient for users to enter
>> international text in the editor, or do we need any more support?
Input methods are a separate issue. (In my experience, they
are perfectly adequate, and we shouldn't have to worry about them.)
>> - escape vs. raw: for XML, HTML, .properties, and .java,
>> there are standardized Unicode escape syntaxes. Should
>> the Editor window display the raw characters, the
>> escapes, or should you be able to choose on the fly (a
>> question for editor.netbeans.org probably)? Should the
>> file saved to disk contain the raw characters (encoded
>> suitably), the escapes (encoding irrelevant), or should
>> this be a choice (i.e. "escaped" is a special kind
>> of "encoding")?
Again, this isn't an encoding issue, but it raises an
interesting question: what happens if a user enters
characters that aren't supported by the file's encoding?
However, currently, the java.io package already has a
policy to handle unsupported data. (For ISO 8859-1,
unsupported characters are converted to question marks.)
Users may want the editor to highlight the unsupported data
somehow. But this is an editor issue, not an encoding
issue. (Properties files use escaped characters because
Java requires them to be in the ISO 8859-1 encoding, so
they can be cross-platform. Again, this is an editor issue,
not an encoding issue, although there is certainly some
overlap.) I like Miloslav Metelka's suggestion.
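The substitution policy mentioned above can be seen directly: encoding a character the target charset cannot represent yields '?', because String.getBytes uses the charset's replacement byte rather than failing:

```java
import java.nio.charset.StandardCharsets;

public class LossyEncode {
    // Round-trip through ISO 8859-1; characters the charset cannot
    // represent come back as '?', the default replacement byte.
    public static String roundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }
}
```

An editor that wants to warn the user instead would have to use a CharsetEncoder with CodingErrorAction.REPORT rather than this silent default.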
However we decide this, we should keep in mind that, for
multi-platform/multi-Locale projects, there's a lot of
transferring files from one user to another, so there's no
telling what the encoding should be for any file. So the
user needs to be given the maximum possible control.
Personally, I'd be happy to see all files get a text tab in
their properties view, just like .java files do. This
wouldn't let me specify encodings for specific file types,
but gives me the flexibility I need to solve this problem.
And it could be done quickly--the code already exists.
Here's my (wacky) workaround. Currently, I need all .sql
and .utx files encoded with UTF-8. So, in Tools:Options, I select
Java Source Objects
and I set the "File Extensions" property to
java, sql, utx
Then, for my sql and utx files, I set the compiler to (do nothing).
Ken, I don't understand why you have marked this as a defect. It is a pure
enhancement. I also don't understand how an enhancement could be viewed as a defect.
After talking with QA, I am changing this back to a feature.
And also, since it is not a must-have feature, I am decreasing the priority back.
If the feature is important, it should be pushed through planning along with
other features. The resources are limited and not all
features can be must-haves.
Reassigning to David K., new owner of editor.
*** Issue 32028 has been marked as a duplicate of this issue. ***
1. To tell the truth, it seems very strange to me that
the issue is marked as an RFE rather than a DEFECT. When it is
impossible to do some everyday work (like editing a
text file, for example), the module (the text module in
my case) has a P1 bug.
2. As the issue has had a rather long life, I think a
simple palliative step could be taken:
- introduce a global system property saying how to interpret bytes, or
- introduce such a property for the text editor only (the Java
editor already has one; the XML and HTML editors are clever
enough to infer the encoding from the appropriate language constructs).
I think such a little step demands an hour of effort
from an NB guru. On the other hand, a significant part of users'
problems would be resolved by it (I see that it is
not a solution for _all_ users' problems).
I'm afraid to incur the NB developers' anger :-), so I leave
the issue priority and type as is.
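The palliative could indeed be as small as honoring one system property when the plain text editor opens a file. A sketch, with a made-up property name (no such property exists in the IDE today):

```java
import java.nio.charset.Charset;

public class GlobalEncodingProperty {
    // Hypothetical property name -- purely illustrative.
    static final String PROP = "netbeans.file.encoding";

    // Use the globally configured charset if set, otherwise
    // fall back to the platform default as the IDE does today.
    public static Charset editorCharset() {
        String name = System.getProperty(PROP);
        return name != null ? Charset.forName(name) : Charset.defaultCharset();
    }
}
```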
I agree that, as a short-term solution, this should be fixed in the plain
text editor similarly to the Java editor. I would suggest filing an
issue against the text module asking for this.
Frankly speaking, I'm not planning to properly fix this issue soon.
First, it is not trivial; second, I do not have the resources for it.
Somebody will have to contribute this. :-)
That "short term solution" you describe sounds fine to me.
I suspect that's all people are really looking for. I'm not
sure why a new bug should be filed against text module.
Can't this bug report just be reassigned?
When I opened issue 27240 (now closed as dup of this), all
I was concerned with is that the editor read the file in
the proper encoding, and convert to Unicode. Once I start
editing, I already have everything I need. If I need IMEs,
I have them. Just make NetBeans read and write the files
with the proper encoding. Thanks.
If this bug report has a larger scope than 27240, please
reopen 27240 and assign it to the text module.
To Miguel: please don't change the version field. The bug was first
logged against FFJ 3.0 and since it's still open, it's understood that
it applies to all subsequent versions of NB, FFJ and S1S.
Version: 3.5 -> FFJ 3.0.
Yes, I think this issue is asking for a proper solution with per-file
granularity, etc. That's why I want to keep it open. I reopened
issue 32028, which was closed as a duplicate of this one. Yours has a
larger scope; it asks for setting this property for all files.
See issue 42638 which proposes simple File Encoding API.
Cf. issue #6050 ("Faster alternative to EditorCookie"), which
recommends a Reader and Writer interface to a file rather than only
a Document.
To the NB dev team - have any of the things discussed in this issue
been implemented already?
Any in progress?
Any that should have a separate RFE filed?
To Ken: no; no; and probably no. This stuff should be solved in a
reasonably complete proposal to overhaul file encoding in the IDE. No
one has worked seriously on such a proposal yet.
*** Issue 51672 has been marked as a duplicate of this issue. ***
*** Issue 55751 has been marked as a duplicate of this issue. ***
*** Issue 55739 has been marked as a duplicate of this issue. ***
*** Issue 56597 has been marked as a duplicate of this issue. ***
Any chance this issue gets solved? The new CVS Diff is facing problems due to the
lack of encoding support. If you have a file with Latin characters and change one
line, all lines containing Latin characters are marked as different.
In CVS we have file caches. Files in the cache do not have their original
extension, to avoid confusing tools that recursively process directory content
by extension.
- the API could take an InputStreamProvider and a String (the original file
name) to address this; maybe also the original MIME type
- wait for JRE 6.0, which allows setting the file hidden flag (and rewrite all
tools to check it...)
- the CVS cache could use the working-directory file's encoding (but that rests
on the invalid assumption that an encoding cannot change over time)
To misterm - your comments about CVS and Latin chars - can you
elaborate a little and tell which locale you are in when
running the IDE; in the file, are there characters in an encoding or charset
other than the one that is default for your locale;
are the issues also about filenames that contain extended-ASCII
or multibyte characters?
>------- Additional comments from kfrank Wed Oct 26 17:34:05 +0000 2005 -------
> To misterm - your comments about cvs and latin chars - can you
> elaborate a little and tell which locale you are in when
> running ide;
pt-BR on one machine and en-US on the other, using the Windows default encoding
(cp1252, I guess).
> in the file, are there characters in encoding or charset
> other than the one that is default for the locale you are in;
No, just regular characters for my locale such as ç, ã, á etc.
> are the issues also about filenames that have characters of
> extended ascii or multibyte ?
NB CVS support used to have problems with it, but I haven't tested it lately.
As Jesse mentions, it would help to have an overall proposal
and solution; how could that happen? I've seen this kind of question
about the need for encoding capability arise over time. That's why
I'm changing this to P2.
Just restoring original version field.
Reassigning to new module owner mslama.
This issue had *6 votes* before the move to the platform component.
*** Bug 168265 has been marked as a duplicate of this bug. ***
*** Bug 55738 has been marked as a duplicate of this bug. ***
*** Bug 177714 has been marked as a duplicate of this bug. ***
Also see issue #114123 and http://wiki.netbeans.org/TextEncodingFOW.