JDK 1.6.0 b81 supports saving of properties files in various encodings (i.e. not
just ISO-8859-1). Allow users to choose their preferred encoding of properties
files (at least in projects based on JDK 1.6.0 and newer).
For more information, see JDK issue #6204853.
As I mentioned in my comments to the JDK bug, the new Reader constructor is a
bit half-baked; there is no way for the IDE to tell that a given .properties
file is in a different encoding.
This could be done in two phases:
1) Support for ISO-8859-1 and for the system's default encoding.
If the .properties file cannot be read using ISO-8859-1, try loading it
using the system's default encoding. Hold the information about
which encoding was used for loading the file and use the same encoding
when saving the modified content.
2) Add support for other encodings once issue #42638
("Provide support for File Encoding") is resolved.
If issue #42638 is resolved in NB 6.0 M7 (as is currently planned), I would
skip the first phase.
Issue #42638 is planned for 6.0 M7, so I am planning this for M8 - some time is
needed to adapt to the API introduced by the fix of #42638.
"If the .properties file cannot be read using ISO-8859-1, try loading it using
the system's default encoding." - this cannot work, I think; all byte sequences
are valid in ISO-8859-1, IIRC.
I would actually detect any characters greater than 0x7e, as there should not be
any in files written by Properties.store(OutputStream, String). I am not sure it
is a good idea, though.
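The two points above (strict ISO-8859-1 decoding never fails, so a fallback cannot be triggered by it, while a scan for bytes above 0x7e can) may be easier to see in a small sketch. The class and method names are hypothetical and are not part of NetBeans or the JDK; this only illustrates the idea and is not a proposed implementation.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class Latin1Check {

    /** Strictly decodes the bytes as ISO-8859-1; returns true if no error is reported. */
    static boolean decodesAsLatin1(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.ISO_8859_1.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Returns true if any byte value is greater than 0x7e (the check suggested above). */
    static boolean hasByteAbove7e(byte[] data) {
        for (byte b : data) {
            if ((b & 0xff) > 0x7e) {
                return true;
            }
        }
        return false;
    }
}
```

Since every one of the 256 byte values maps to a character in ISO-8859-1, `decodesAsLatin1` returns true for any input, which is why only the second check can carry information.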
*** Issue 93636 has been marked as a duplicate of this issue. ***
Now that issue #32392 ("Edit Text Rather than Escape Sequences") is fixed, I do
not plan to support other encodings. Any text typed in the editor is saved to
a pure ASCII file, with non-ASCII characters encoded in the form of \uxxxx sequences.
See also issue #97861 ("Update properties data object to use FileEncodingQuery").
I just want to be clear for testing: can the user enter text with characters
of the locale they are currently in OR with characters of the currently set
project encoding property (once that is implemented for all project types),
and will the editor then change it to show the escaped ASCII?
- Is this for editing the property file as the file itself, or also for the view
where one gets to input keys and values?
- Does it mean that as they type multibyte characters into a property file,
for example, these are automatically converted into escaped ASCII sequences?
All properties are saved with encoding ISO-8859-1 (ISO Latin 1) - there is no
change in this. The change is that, when saving the file, characters that are
not part of the ISO-8859-1 character table are not silently replaced with a
question mark (as it used to work) but they are silently replaced with
corresponding \uxxxx sequences as specified in method
java.util.Properties.store(...) - see
The user can enter any characters they want (including multibyte), characters
having Unicode value less than 20h or greater than 1eh will be saved as \uxxxx
sequences, where 'xxxx' is a Unicode value of the corresponding character
expressed with hexadecimal digits. When the file is opened (loaded) in NetBeans,
these sequences will be decoded and corresponding characters will be displayed
in the editor instead of the sequences. The user is still allowed to enter
\uxxxx sequences in the editor - these sequences will not be modified during
saving but they will be decoded when the file is later loaded.
The above mechanism is independent of the locale settings of the IDE and of the
project's or file's settings.
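A minimal sketch of the save/load translation described above, assuming the bound of 7eh (the "1eh" in the text is a typo that the author corrects in a later comment). The class and method names are hypothetical and this is not the NetBeans code. It works on a single line of text without line terminators, and the decoder ignores the case of an escaped backslash in front of the sequence.

```java
// Hypothetical helper, for illustration only; not the NetBeans implementation.
final class UnicodeEscapes {

    /** Replaces every character below 0x20 or above 0x7e with a backslash-u escape. */
    static String escape(String line) {
        StringBuilder out = new StringBuilder(line.length());
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c < 0x20 || c > 0x7e) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c); // includes escapes the user typed literally
            }
        }
        return out.toString();
    }

    /** Decodes backslash-u escapes (four hex digits) back into characters. */
    static String unescape(String line) {
        StringBuilder out = new StringBuilder(line.length());
        int i = 0;
        while (i < line.length()) {
            if (line.startsWith("\\u", i) && i + 6 <= line.length() && isHex(line, i + 2, i + 6)) {
                out.append((char) Integer.parseInt(line.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(line.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    private static boolean isHex(String s, int from, int to) {
        for (int i = from; i < to; i++) {
            if (Character.digit(s.charAt(i), 16) < 0) {
                return false;
            }
        }
        return true;
    }
}
```

Because a literal backslash is itself a plain ASCII character, escapes typed by the user pass through `escape` unchanged and are decoded on the next load, matching the behaviour described above.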
The view where one gets to input keys and values is unchanged - it has always
allowed entering any characters and has translated them to \uxxxx sequences as
necessary. There is one remaining issue connected with it - when the user edits
the .properties file using the table view and also has the editor view
for the same file open, non-ASCII characters entered in the table view are
propagated to the editor view as \uxxxx escape sequences. This is no longer
necessary and I just filed issue #102699 for it.
Correction: instead of
"less than 20h or greater than 1eh"
there should be
"less than 20h or greater than 7eh"
Reopened so that status can be changed.
*** Issue 99231 has been marked as a duplicate of this issue. ***
It should be possible to semi-detect the encoding of a .properties file by checking its content. If it contains any
character of value 0xff or larger, then the file:
- either has not been modified from NetBeans yet
- or it is a file encoded using UTF-8
In such a case, the file could be checked whether it could be decoded using the UTF-8 encoding. If it could, ask the
user (something like "Seems to be a UTF-8 encoded file. Right?") and let him/her decide whether it should be loaded
using ISO-8859-1 or UTF-8. If it could not, then use the ISO-8859-1 encoding.
In the question dialogue about encoding (ISO-8859-1 vs. UTF-8), the user could specify to always use the selected
encoding for all .properties files in the project.
The mechanism described in my previous comment could be also generalized so the algorithm would be:
1) Scan the file - search for non-ASCII characters.
2) If there are no non-ASCII characters found, use the default encoding/decoding used for .properties files
(ISO-8859-1 with \uxxxx sequences translated to corresponding characters).
3) If there are some non-ASCII characters found, try to detect encoding, UTF-8 in the first place.
If some encoding is detected, ask the user for confirmation. If no encoding is detected,
let the user specify which encoding to use.
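The three steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class, enum and method names are hypothetical, only UTF-8 is tried in step 3, and the user-facing confirmation dialogue is left out (the caller would show it based on the returned value).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the proposed detection steps; not NetBeans code.
final class PropertiesEncodingGuess {

    enum Guess {
        DEFAULT_ENCODING,   // step 2: ISO-8859-1 with escape sequences
        PROBABLY_UTF8,      // step 3: ask the user to confirm UTF-8
        ASK_USER            // step 3: nothing detected, let the user choose
    }

    static Guess guess(byte[] data) {
        if (!containsNonAscii(data)) {
            return Guess.DEFAULT_ENCODING;
        }
        return isValidUtf8(data) ? Guess.PROBABLY_UTF8 : Guess.ASK_USER;
    }

    /** Step 1: scan for any byte outside the 7-bit ASCII range. */
    static boolean containsNonAscii(byte[] data) {
        for (byte b : data) {
            if ((b & 0x80) != 0) {
                return true;
            }
        }
        return false;
    }

    /** Strict UTF-8 decode that reports malformed input instead of replacing it. */
    static boolean isValidUtf8(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

A file that passes the strict UTF-8 decode may still have been written as ISO-8859-1 by coincidence, which is why the proposal asks the user for confirmation rather than switching silently.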
*** Issue 114462 has been marked as a duplicate of this issue. ***
Could you comment here on the problems seen in
125875, how this enhancement is the same or related, and the impact
on your team? Perhaps it can be raised in priority.
On the cc list here
are developers who could reply to the comments.
The Grails team needs this RFE as well. Grails message-bundles are exclusively UTF-8 encoded:
"The files must be saved in UTF-8 encoding if you wish to use non-ascii characters, which is contrary to standard Java
properties files which use the native Java VM encoding."
At the moment we cannot use NB6.1 to deal with Grails message-bundles. Since we are working on a full-featured
Groovy/Grails integration into NB6.1, we need to be able to work with these files.
*** Issue 125875 has been marked as a duplicate of this issue. ***
Based on all the discussion above, I think this is a much higher priority than P4 - bumping to P2. Can this be
considered for the next release?
Yes, it will be considered. When planning the next release, I will take priority of enhancements and feature requests
PHP users also use .properties files when log4php is being used. It's a minor case, but it would be helpful to have an option to
save .properties files in the system default encoding, so that non-ASCII characters used in comment lines can be saved.
*** Bug 155934 has been marked as a duplicate of this bug. ***
*** Bug 198631 has been marked as a duplicate of this bug. ***
*** Bug 210088 has been marked as a duplicate of this bug. ***
This bug is biting me as well. I'm writing a Firefox extension, and .properties files should be UTF-8 according to Mozilla. The file was created externally by another developer as UTF-8; when I opened it, it seems NetBeans tried to open it as ISO-8859-1 and all Chinese characters broke.
If NetBeans treated this file like any other project file when creating/opening/saving, that should solve the problem, since all project files are UTF-8.
I have the same problem. Working with properties files in UTF-8 works fine as long as nobody opens them with NetBeans IDE in which case each UTF-8 character is replaced with two strange characters :(
Two of my last projects have utf8 encoded .properties files which are used for localization.
Please, at least add an option to somehow suppress the current behaviour. We need to work with utf8 .properties file every day.
Same problem here.
Project types such as Grails or PHP which are known to mandate UTF-8 *.properties should simply provide a FileEncodingQueryImplementation saying so.
For other cases it would be trivial to write a plugin which lets you specify which *.properties should be treated as UTF-8: none (default config), all, based on file path regexp, etc. The downside is the need for manual configuration, especially if you also use standard *.properties files at times.
I am not sure it is possible to reliably detect UTF-8-encoded files, as such files would be loadable in ISO-8859-1 as well and plenty of *.properties in the field include raw accented European characters. There may be some libraries out there which can detect characteristic patterns of UTF-8 misinterpreted as ISO-8859-1, such as improbable punctuation or character sequences (¡å, Ä«). A plugin using juniversalchardet to sniff file contents might be very handy, for example. It may suffice to look for a high percentage of bytes in the 0x80–0x9F range, which are almost never used in ISO-8859-1 documents but frequent in UTF-8. One limitation of all such approaches is that they can only work for an existing *.properties file which has a significant amount of non-ASCII text in it, so an IDE user writing a new file would probably see it treated as UTF-8.
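As a rough illustration of the byte-range heuristic mentioned above, the following sketch computes what fraction of the non-ASCII bytes fall in 0x80–0x9F. The class and method names are hypothetical, measuring the ratio against non-ASCII bytes only (rather than all bytes) is an assumption, and no threshold is suggested here since none is given above.

```java
// Hypothetical heuristic sketch; the choice of denominator is an assumption.
final class C1ByteRatio {

    /** Fraction of non-ASCII bytes that lie in 0x80..0x9F; 0.0 if the input is pure ASCII. */
    static double ratio(byte[] data) {
        int nonAscii = 0;
        int inRange = 0;
        for (byte b : data) {
            int v = b & 0xff;
            if (v >= 0x80) {
                nonAscii++;
                if (v <= 0x9f) {
                    inRange++;
                }
            }
        }
        return nonAscii == 0 ? 0.0 : (double) inRange / nonAscii;
    }
}
```

Note that not all UTF-8 text produces bytes in this range (for example, the UTF-8 encoding of é is C3 A9), so a low ratio does not rule UTF-8 out.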
The Java team declined to mandate that UTF-8 *.properties start with a BOM, which would have solved the problem cleanly (at least for JVM-based projects; not PHP). NetBeans (or a NB plugin) could adopt Emacs’ convention, that files may start with a header comment specifying the encoding:
# -*- coding: UTF-8 -*-
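Reading such a header could look roughly like the sketch below. It is an assumption-laden illustration, not an existing NetBeans feature: the class and method names are hypothetical, only the first line is inspected, and that line must be a .properties comment (starting with # or !).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of reading an Emacs-style coding header.
final class CodingHeader {

    // Matches a first-line comment such as "# -*- coding: UTF-8 -*-".
    // The charset name must start with a letter or digit, as Charset requires.
    private static final Pattern HEADER = Pattern.compile(
            "^\\s*[#!].*?-\\*-.*?coding:\\s*([A-Za-z0-9][A-Za-z0-9._:+-]*).*?-\\*-");

    /** Returns the charset named in the first line, if present and supported by this JVM. */
    static Optional<Charset> declaredCharset(byte[] data) {
        int end = 0;
        while (end < data.length && data[end] != '\n' && data[end] != '\r') {
            end++;
        }
        // The header itself is ASCII, so ISO-8859-1 decodes the first line without loss.
        String firstLine = new String(data, 0, end, StandardCharsets.ISO_8859_1);
        Matcher m = HEADER.matcher(firstLine);
        if (m.find() && Charset.isSupported(m.group(1))) {
            return Optional.of(Charset.forName(m.group(1)));
        }
        return Optional.empty();
    }
}
```

A caller would fall back to the default ISO-8859-1 handling when the result is empty.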
Actually, there already exists a global project encoding setting. Additionally, the general recommendation usually is to use UTF-8 and UTF-8 only. Using anything other than UTF-8 is bad behavior, as can be discovered every day anew when inexperienced programmers forget to define file encodings in build.xml files etc. and simple builds fail because one uses a UTF-8 console environment. We should really enforce best practices. Maybe Java should be changed to use UTF-8 by default. Always. (There are many other crappy defaults out there, like JDBC timezone handling etc., which should be killed once and for all.)
From what I know, .properties files are not used in PHP in general, so I guess there are some PHP frameworks which use these files, right? If so, please name these frameworks so that I can better understand the situation.
In what projects other than PHP do you have problems with .properties file encoding?
Google Web Toolkit also requires UTF-8 encoding of properties files:
... You must also ensure that all relevant source and .properties files are set to be in the UTF-8 charset in your IDE. ...
*** Bug 228196 has been marked as a duplicate of this bug. ***
Would it not suffice to just obey the project's encoding property?
If my project is set to UTF-8, then assume properties files are UTF-8. For me, this is all I need.
If the encoding does need to be detected, just check the Notepad++ algorithm; it works almost perfectly IMHO.
Please implement it in the next release (maybe 7.4 or 7.5), so I can use NetBeans again.
It seems so easy for you guys. But if you can't do it right now, I encourage someone to make a patch and send it for review.
Why has this been open since 2006 without any "fix", guys?
Just give us some news about it (from some of the main developers, perhaps). It's not WONTFIX, so what is it that keeps you from doing it?
Anyway, thanks for your work and for your attention.
(In reply to comment #36)
> Please implement it in the next release (maybe 7.4 or 7.5), so I can use
> NetBeans again.
> It seems so easy for you guys. But if you can't do it right now, I encourage
> someone to make a patch and send it for review.
> Why has this been open since 2006 without any "fix", guys?
> Just give us some news about it (from some of the main developers, perhaps).
> It's not WONTFIX, so what is it that keeps you from doing it?
> Anyway, thanks for your work and for your attention.
As I've mentioned before, I need to know in which project types you are using properties files with a different encoding (not ISO 8859-1).
PHP was mentioned, but PHP support developers would like to know particular use-cases of using .properties files in PHP (in what framework, etc.). In case of GWT it has to be fixed on side of the 3rd party plugin.
So let me know what particular problem you have.
Please clarify what you mean when stating that "In case
of GWT it has to be fixed on side of the 3rd party plugin".
Are you not basically saying that this will never be supported for java projects then?
(In reply to comment #38)
> Please clarify what you mean when stating that "In case
> of GWT it has to be fixed on side of the 3rd party plugin".
> Are you not basically saying that this will never be supported for java
> projects then?
Well, as far as I know, the standard Java API is designed to use ISO 8859-1 encoding for properties files, but maybe I'm just missing something - what is the problem with .properties files in Java projects?
Properties files were originally ISO 8859-1, but since Java 1.6 they can also be read and written using a Reader or Writer in any encoding. It is a *long* time since ISO 8859-1 was the required encoding. It is a surprise that NetBeans, as the reference IDE, does not support this.
We use UTF-8 property files in our Java project to hold i18n translations. This doesn't seem unreasonable. Clearly I dare not open these files in NetBeans, and it does make it hard to try and convince others to change away from Eclipse.
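For readers unfamiliar with the two loading paths, here is a small sketch of the difference. `Properties.load(InputStream)` assumes ISO 8859-1, while `Properties.load(Reader)` uses whatever charset the caller gives the reader. The class and method names are hypothetical and the sample key is made up for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.Properties;

// Hypothetical demo class; only the java.util.Properties calls are real API.
final class PropertiesLoadDemo {

    /** Loads the bytes through a Reader, so the caller chooses the charset. */
    static Properties loadWithReader(byte[] data, Charset charset) {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Loads the bytes through an InputStream, which is read as ISO 8859-1. */
    static Properties loadAsStream(byte[] data) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(data));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }
}
```

Loading UTF-8 bytes through `loadAsStream` yields one character per byte, which is the kind of corruption reported earlier in this thread when such a file is then saved again.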
You are not missing anything - the java spec do indeed say so for the
However, some frameworks/libs (such as GWT) seem to also use the load(Reader)
method of the Properties class, for which it is not specified what encoding
is used by the underlying input stream.
I don't want to get into what seems to be a religious argument here, so I will
just describe my problem:
I use GWT for my web front end (this is all coded in java, so it is a java
The localized files I keep for GWT to read must be UTF-8.
Therefore I must set NetBeans to open these files as plain text so that it
does not convert the files to ISO-8859-1.
So my issue is that I cannot use the properties file editor.
IMHO we can't automatically assume that project encoding should be used for every properties file - standard Java project sources can be encoded in UTF-8 but resource bundles still have to be ISO-8859-1.
What if there were a check box in project/properties, "use project encoding for .properties files", and according to this check box the encoding of the properties files would be either the default (ISO-8859-1) or project-specific (e.g. UTF-8)?
Would that solve your problems with it?
That'll do the trick for me.
It's fine that it is something that has to be actively selected for the individual project, as I agree that the standard should still be ISO-8859-1 for properties files in Java projects.
(In reply to comment #42)
> IMHO we can't automatically assume that project encoding should be used for
> every properties file - standard Java project sources can be encoded in UTF-8
> but resource bundles still has to be ISO-8859-1.
> What if there would be a check box in project/properties "use project encoding
> for .properties files" and according to this check box the encoding of the
> properties files would be default (ISO-8859-1) or project specific (e.g.
> Would that solve your problems with it?
For me, it is a perfect workaround.
The perfect solution would make NetBeans detect each file encoding prior to opening (such algorithm would be very complex), but this seems to be overwhelming.
(In reply to comment #44)
> The perfect solution would make NetBeans detect each file encoding prior to
> opening (such algorithm would be very complex)
See my comment #29 if you did not already.
(In reply to comment #45)
> (In reply to comment #44)
> > The perfect solution would make NetBeans detect each file encoding prior to
> > opening (such algorithm would be very complex)
> See my comment #29 if you did not already.
Yes, I was just reinforcing my point that the proposed workaround would work for me, but my humble opinion is that a complete solution would work better across all of NB (not only for .properties files).
As of today, my preferred editor with multi-encoding support is Notepad++, which does a terrific job of identifying file encodings as well as converting from one encoding to another.
It would be nice to have such features in NetBeans and not need external editors for that.
Notepad++ is open source, and its algorithm might (or might not) be easy to adapt to Java (I really don't know how easy it would be).
I'll discuss it with the Java and PHP guys. Is there any other project type which has this problem?
After a discussion with Tomas Zezula (Java Project support), we have agreed on a slightly different approach in this case.
The user can check the "use project encoding" property on a .properties file itself, not on its project. I know that it won't be very effective for projects with multiple properties files, but on the other hand you can use this feature in any project.
I've added the property to the property sheet of a properties file and I'll integrate it today, so please take a look at that.
Integrated into 'main-golden', will be available in build *201304272301* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)
User: Jan Peska <JPESKA@netbeans.org>
Log: Issue #75906 - I18N - Add support for other encodings (other than ISO-8859-1)
Support usage of project encoding for .properties files
Confirmed: with the fix, properties editor is working as expected.
I'm very happy that this issue is getting attention and has a solution now. I work in projects that have quite a lot of properties files to store internationalized content, so unfortunately checking off each file individually will be a somewhat tedious process. Is there any way this can be done at a project or folder level?
Again, thanks for looking at this!
(In reply to comment #52)
> I'm very happy that this issue is getting attention and has a solution now. I
> work in projects that have quite a lot of properties files to store
> internationalized content, so unfortunately checking off each file individually
> will be a somewhat tedious process. Is there any way this can be done at a
> project or folder level?
> Again, thanks for looking at this!
No, unfortunately you can't specify it at the project (or folder) level; it is a property on a properties file itself. I can evaluate the possibility of selecting multiple properties files and then setting the property for all of them at once, if that would help you...
I also really appreciate that this is getting attention.
In my current projects we have 80+ property files per language that all have to be in UTF-8, so I would appreciate the possibility to set this on multiple files at once.
> No, unfortunately you can't specify it at project (or folder) level, it is a
> property on a properties file itself.
So why couldn't we have both? If an encoding is specified for an individual file, use this encoding. If it's not specified, use whatever is specified at the project level. If it's not specified at the project level, use the default ISO-8859-1. Would that work?
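The fallback order proposed in this comment can be written down in a few lines. This sketch only illustrates the commenter's suggestion, not what was actually integrated; the class and method names are hypothetical, and a null argument stands for "not specified".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the proposed file -> project -> default fallback.
final class EncodingFallback {

    /**
     * Picks the file-level encoding if set, otherwise the project-level
     * encoding if set, otherwise ISO-8859-1. Null means "not specified".
     */
    static Charset resolve(Charset fileLevel, Charset projectLevel) {
        if (fileLevel != null) {
            return fileLevel;
        }
        if (projectLevel != null) {
            return projectLevel;
        }
        return StandardCharsets.ISO_8859_1;
    }
}
```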
I would like to keep it as simple as possible. I've checked it and it works just fine if you select multiple files at once and set the property.
Thumbs up from Denmark then :) 10 points.
Just a reminder that anyone with a need for a more general fix (e.g. sniffing encodings, looking for Emacs-style -*- mode headers, loading folder or project properties, etc.) can implement other strategies in plugins, which could be quite small (one class) using very limited bits of the NetBeans API.
Was this fix included in 7.3.1 ?
Having to select the files will be a pain in some legacy projects where the .properties files have been placed where needed and not grouped in some way, and I do happen to work on such a project with NetBeans and Eclipse. This should really be at the project level, especially for imported Eclipse projects that use a different encoding. This causes NetBeans to create files in the wrong encoding in the project.
Where is this flag at the end???
I do not find it in release 7.2.1 nor 7.3.1.
So please could you give me some clarifications about that dumb issue.
This fix is a part of 7.4 - you can try it in 7.4 beta (https://netbeans.org/community/releases/74/)
(In reply to omeurice from comment #61)
> Where is this flag at the end???
> I do not find it in release 7.2.1 nor 7.3.1.
> I did not create a Java project, I just opened a maven project for a
> IDE converts them automatically at save time. That's definitely not what I
> So please could you give me some clarifications about that dumb issue.