This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.
[ BUILD # : 200512152030 ] [ JDK VERSION : 1.5.0_06 ] Hello, I'm working in NetBeans with the option -J-Dfile.encoding=UTF-8 because I have some txt files in UTF-8. When I create a file named ZażółćGęśląJaźń.txt and add it to CVS, the file created on the server is: zaĹĽĂłĹ,Ä++GÄ^(TM)Ĺ>lÄ...jaĹşĹ".txt When I remove -J-Dfile.encoding=UTF-8 from the config, CVS -> Show changes shows that there is a new Remote file. It may cause data loss --> P1
The "-J-Dfile.encoding" switch is used for file content, not for file names. Command-line cvs treats this kind of file the same way as the javacvs library. I'm afraid we can't do anything about this.
-J-Dfile.encoding is supposed to work as you described. But if I don't run NB with this switch, the file is added to the repository properly.
Let's suppose that I have two files: Zażółć.txt and Gęślą.txt. When I add the first one with the switch, it is added as: zaĹĽĂłĹ,Ä+.txt When I add the second one without -J-Dfile.encoding=UTF-8, it is added as: Gęślą.txt Is this normal?
CVS as such supports only ASCII filenames! Non-ASCII file names can work only if the server and the client use the same encoding. Please check the encoding on your server side and align with it, or stick with ASCII names. BTW, -J-Dfile.encoding sets the default used by the new InputStreamReader(in) and new OutputStreamWriter(out) constructors. The CVS library uses these, so the user can align the local environment with the server setup. Well, an extra property, e.g. cvs.filename.encoding, could be better. INVALID because it's likely a user setup problem. Is it?
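To illustrate the point about the reader and writer constructors, here is a minimal sketch (not the javacvs code). It assumes Java 7 or later; the class name ReaderCharsetDemo and the helper decode are hypothetical. It shows that the two-argument InputStreamReader constructor pins the charset, while the one-argument form falls back to the platform default that -Dfile.encoding influences.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReaderCharsetDemo {

    // Decode bytes through an InputStreamReader with an explicit charset,
    // so the result does not depend on the platform default encoding.
    public static String decode(byte[] bytes, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "Gęślą.txt" written with Unicode escapes so the source file encoding does not matter.
        String name = "G\u0119\u015Bl\u0105.txt";
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);

        // The charset that new InputStreamReader(in) would use implicitly.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // Same bytes, two explicit decodings: only the matching charset restores the name.
        System.out.println("UTF-8 round trip ok: " + decode(utf8, StandardCharsets.UTF_8).equals(name));
        System.out.println("ISO-8859-1 round trip ok: " + decode(utf8, StandardCharsets.ISO_8859_1).equals(name));
    }
}
```

A dedicated setting such as the cvs.filename.encoding property suggested above would amount to passing an explicit charset like this for file names only.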
> INVALID because it's likely user's setup problem. Is it? No, it isn't a user setup problem, because the behavior of CVS depends on this setting. See my comment --> Fri Dec 16 18:01:41 +0000 2005
> Lets suppose that I have two files: > > Zażółć.txt > > and > > Gęślą.txt These two words contain different characters. I think the second name will be added correctly in both cases (with the switch on or off). Am I right?
It is only an example :) I've tested it with: ZażółćGęśląJaźń.txt (with -J-Dfile.encoding=UTF-8) ZażółćGęśląJaźń_1.txt (without -J-Dfile.encoding=UTF-8) Results were: zaĹĽĂłĹ,Ä++GÄ^(TM)Ĺ>lÄ...jaĹşĹ".txt ZażółćGęśląJaźń_1.txt
I played with it a bit and think this is not an issue, but one has to properly understand what is going on. First off, the CVS server does not understand or care about different encodings. For filenames, this means that it takes the name as a series of raw bytes as they come and sends the same bytes back to clients. This works well for plain 7-bit ASCII characters. Now back to your case. You started NetBeans with -J-Dfile.encoding=UTF-8, so you are telling Java to use UTF-8 as the default system (platform) encoding for this session. Then you created a file whose name contains special characters, and _those chars have different byte representations depending on the encoding in use_. CVS has to pick one when communicating with the server. It is natural that it picks the default system encoding, this time UTF-8. In this encoding, each of these special characters is encoded with 2 bytes, hence the longer filenames. The CVS server takes this and stores it as you sent it. Later, when you do an update, checkout or any other CVS operation, everything works perfectly, because the server sends you filenames in UTF-8 and you expect them to be in this encoding. However, once you remove the -J-D switch, your platform encoding becomes whatever_it_is and things will break, because the server does not care and you now expect all filenames coming from the server to be in whatever_it_is. To conclude, I would suggest you either name your files using safe ASCII only OR use the same encoding every time.
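The byte-level behavior described above can be sketched as follows. This is an illustration only, not what the CVS client actually runs; the class FilenameBytesDemo and its helpers hex and roundTrip are hypothetical names, and ISO-8859-1 merely stands in for whatever_it_is because it is one of the charsets every Java runtime provides.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class FilenameBytesDemo {

    // Render bytes as lowercase hex pairs separated by spaces.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", bytes[i] & 0xff));
        }
        return sb.toString();
    }

    // Simulate a server that stores raw bytes: the client encodes the name with
    // one charset when adding the file and decodes with another when updating.
    public static String roundTrip(String name, Charset usedOnAdd, Charset usedOnUpdate) {
        byte[] storedOnServer = name.getBytes(usedOnAdd);
        return new String(storedOnServer, usedOnUpdate);
    }

    public static void main(String[] args) {
        String z = "\u017C"; // the letter "ż"
        // Same character, different byte sequences depending on the encoding.
        System.out.println("UTF-8:    " + hex(z.getBytes(StandardCharsets.UTF_8)));
        System.out.println("UTF-16BE: " + hex(z.getBytes(StandardCharsets.UTF_16BE)));

        String name = "Za\u017C\u00F3\u0142\u0107.txt"; // "Zażółć.txt"
        // Same encoding on both sides: the name survives.
        System.out.println(roundTrip(name, StandardCharsets.UTF_8, StandardCharsets.UTF_8).equals(name));
        // Encoding changed between add and update: the name comes back longer and garbled.
        String garbled = roundTrip(name, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(name.length() + " chars became " + garbled.length());
    }
}
```

The exact garbled characters depend on which charset the reading side uses, so the output will not necessarily match the names reported in this issue.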
Oops. Thanks for clearing things up :) So -J-Dfile.encoding=UTF-8 is NOT ONLY for the content of files, right? If I am right, how can I tell NB that some of my text files are UTF-8 encoded?
From the CVS spec:
Conventions regarding transmission of file names
In most contexts, `/' is used to separate directory and file names in filenames, and any use of other conventions (for example, that the user might type on the command line) is converted to that form. The only exceptions might be a few cases in which the server provides a magic cookie which the client then repeats verbatim, but as the server has not yet been ported beyond unix, the two rules provide the same answer (and what to do if future server ports are operating on a repository like e:/foo or CVS_ROOT:[FOO.BAR] has not been carefully thought out).
Characters outside the invariant ISO 646 character set should be avoided in filenames. This restriction may need to be relaxed to allow for characters such as `[' and `]' (see above about non-unix servers); this has not been carefully considered (and currently implementations probably use whatever character sets that the operating systems they are running on allow, and/or that users specify). Of course the most portable practice is to restrict oneself further, to the POSIX portable filename character set as specified in POSIX.1.
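For anyone who wants to follow the "most portable practice" mentioned in the spec, a check along these lines could be run before adding files. It is a sketch under stated assumptions: the class PortableNames and the method isPortable are hypothetical, and the rule encoded here (letters A-Z and a-z, digits, period, underscore and hyphen, with no leading hyphen) reflects my reading of the POSIX portable filename character set rather than anything CVS or NetBeans enforces.

```java
import java.util.regex.Pattern;

public class PortableNames {

    // Assumed POSIX portable filename character set: A-Z a-z 0-9 . _ -
    private static final Pattern PORTABLE = Pattern.compile("[A-Za-z0-9._-]+");

    // True if the name uses only portable characters and does not start with a hyphen.
    public static boolean isPortable(String name) {
        return name != null
                && PORTABLE.matcher(name).matches()
                && !name.startsWith("-");
    }

    public static void main(String[] args) {
        String[] names = {
            "notes_1.txt",
            "Za\u017C\u00F3\u0142\u0107.txt", // "Zażółć.txt"
            "-leading-hyphen.txt"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isPortable(name) ? "portable" : "not portable"));
        }
    }
}
```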