This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 119431 - Chinese characters display "??" on project folder
Status: VERIFIED FIXED
Alias: None
Product: cnd
Classification: Unclassified
Component: Project
Version: 6.x
Hardware: Sun All
Importance: P3 blocker
Assignee: Thomas Preisler
URL:
Keywords: I18N
Depends on:
Blocks:
 
Reported: 2007-10-19 04:13 UTC by Will Zhang
Modified: 2010-02-24 10:49 UTC

See Also:
Issue Type: DEFECT
Exception Reporter:


Attachments
screenshot (15.09 KB, application/octet-stream)
2007-10-19 04:13 UTC, Will Zhang
screenshot in fixed build (6.10 KB, image/png)
2010-02-23 23:20 UTC, Keiichi Oono

Description Will Zhang 2007-10-19 04:13:11 UTC
Platform: Solaris10u4(sparc/x86/x64)
Locale: zh
1. launch nb60beta1
2. create cnd project
result: Chinese characters in the folder names "Header Files" and "Resource Files" are displayed as "??".
"Source Files" and "Important Files" are fine.
This issue doesn't exist in the zh_CN.UTF8 locale or on other platforms.
This issue doesn't exist in the ja locale.
A screenshot is attached.
Comment 1 Will Zhang 2007-10-19 04:13:57 UTC
Created attachment 51260 [details]
screenshot
Comment 2 Thomas Preisler 2007-10-19 19:24:56 UTC
Are they localization issues? Header Files and Resource Files are not handled any differently from Source Files.
Comment 3 Will Zhang 2007-10-20 03:45:20 UTC
      As we know, this issue only exists in the Solaris zh locale. It doesn't exist in Solaris zh_CN.UTF8 or on Windows/Linux.
      If it were an l10n issue, all platforms would have it.
      So I investigated "Header Files":
#1. the source is:
makeproject/src/org/netbeans/modules/cnd/makeproject/api/configurations/Bundle.properties
en: HeaderFilesTxt=Header Files
zh: HeaderFilesTxt=头文件
ascii: HeaderFilesTxt=\u5934\u6587\u4ef6
头 => \u5934
文件 => \u6587\u4ef6
UI display: ?文件
      result: "头" (Unicode escape \u5934) is displayed as "?"
#2. Launch the IDE in the Solaris zh locale and create a C++ header file; please see the screenshot in the attachment.
      result: "头" is displayed correctly. I checked the source; it is also \u5934.

      Maybe the code displays the folder name and the create-file menu differently, so it seems to be a coding issue.
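
A note on the escaped bundle entry quoted above: java.util.Properties decodes backslash-u escapes when it loads a .properties file, so the escaped zh entry should yield the same three characters in every locale. The sketch below is a minimal check of that under stated assumptions; the class name and helper are hypothetical and are not part of the cnd sources.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class BundleEscapeCheck {
    /** Loads one key from .properties-formatted text; escapes are decoded by Properties. */
    static String load(String propertiesText, String key) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(propertiesText));
        return props.getProperty(key);
    }

    public static void main(String[] args) throws IOException {
        // Same content as the escaped zh bundle line quoted in the comment above.
        // The doubled backslashes keep the escapes literal in this Java string,
        // so it is Properties, not the compiler, that decodes them.
        String text = "HeaderFilesTxt=\\u5934\\u6587\\u4ef6\n";
        String value = load(text, "HeaderFilesTxt");
        // Print each decoded character as a code point.
        for (int i = 0; i < value.length(); i++) {
            System.out.printf("U+%04X%n", (int) value.charAt(i));
        }
    }
}
```

If the decoded value is correct here, the bundle itself is an unlikely source of the "?", which is consistent with the same string displaying correctly in the create-file menu.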
Comment 4 Thomas Preisler 2007-10-20 17:00:24 UTC
Ken Frank:
more experiments:

I set the zh locale on Solaris, ran dumpcs, and found that char at cdb7
(I don't know if it relates to the \u5934 Will mentions).

So I created a Java project whose name uses the character one to the left of that
char and the one to the right, and it shows OK,

but using that char gives the same ? in the explorer and an exception:
java.lang.IllegalArgumentException: URI has a query component
    at java.io.File.<init>(File.java:372)
    at org.netbeans.modules.java.source.usages.RepositoryUpdater$CompileWorker$1.run(RepositoryUpdater.java:1372)
    at org.netbeans.modules.java.source.usages.RepositoryUpdater$CompileWorker$1.run(RepositoryUpdater.java:1122)
    at org.netbeans.modules.java.source.usages.ClassIndexManager.writeLock(ClassIndexManager.java:100)
    at org.netbeans.modules.java.source.usages.RepositoryUpdater$CompileWorker.run(RepositoryUpdater.java:1119)
    at org.netbeans.modules.java.source.usages.RepositoryUpdater$CompileWorker.run(RepositoryUpdater.java:1092)
    at org.netbeans.api.java.source.JavaSource$CompilationJob.run(JavaSource.java:1439)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
[catch] at java.lang.Thread.run(Thread.java:619)
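
For context on the exception above: the java.io.File constructor that takes a URI rejects any URI with a query component. If an unencodable character in a file name has already been replaced by "?", that "?" would be parsed as the start of a query. The sketch below only illustrates that behavior; the class name, helper, and paths are hypothetical, and whether this is what happened inside RepositoryUpdater is an assumption.

```java
import java.io.File;
import java.net.URI;

public class QueryComponentDemo {
    /** Returns the exception message from new File(URI), or null if construction succeeds. */
    static String tryFile(String uriText) {
        try {
            new File(URI.create(uriText));
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A plain file: URI is accepted.
        System.out.println(tryFile("file:/tmp/demo/Ab.java"));
        // A "?" in the name splits the URI into path "/tmp/demo/" and query "b.java",
        // which File(URI) refuses.
        System.out.println(tryFile("file:/tmp/demo/?b.java"));
    }
}
```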


Ken Frank wrote:
> not sure its a cnd thing; i create java project and use that string  Will uses as part of name  of
> java class
>
> (first i am in ja locale; euc-jp proj encoding)
>
> at first the mbyte shows ok, then it changes and the first letter is shown as ?
>
> i change project encoding to gb18030 or gbk, then create a new class again
> and same situation, and same using utf-8 project encoding
>
> thus some interpretation is happening between when it first shows in explorer ok
> and one second later when it shows the ?
>
> its true i have started nb from ja locale but i think project encoding value can emulate
> what is being seen by Will.
>
> cnd does not have project encoding property so it uses (or should use) the encoding
> of the locale the user is in, just as was done for pre nb6 projects.
>
> Will, is that character with ? perhaps not valid in some zh encodings ?
> or in some range not part of some zh encodings ?
>
> no, that can't be it; i see the char ok in editor, in members view, in refactor rename
> so seems like something about explorer, and why it looks ok at first and then the ?
>
>
> Thanks - Ken
>
>
> Will Zhang wrote:
>> Hi  Thomas,
>>
>>       I also felt strange.
>>       As we know, this issue only exists on Solaris zh locale. It doesn't exist on Solaris zh_CN.UTF8 or windows/linux...
>>       If it's l10n issue, all platform should have this issue.
>>       So i did the investigation for "Header Files",
>> #1. the source is:
>> makeproject/src/org/netbeans/modules/cnd/makeproject/api/configurations/Bundle.properties
>> en: HeaderFilesTxt=Header Files
>> zh: HeaderFilesTxt=头文件
>> ascii: HeaderFilesTxt=\u5934\u6587\u4ef6
>> 头 => \u5934
>> 文件 => \u6587\u4ef6
>> UI display:?文件
>>       result: "头"(ascii code is \u5934)  displayed "?"
>> #2. launch the IDE on Solaris zh locale, create a C++ header file, please see screen shot in the attachment.
>>       result: "头" display well. I checked the source, it's also \u5934
>>
>>       Maybe the display method in the code are different between displaying folder name and create file menu, so it seemed coding issue.
>>       Since i didn't meet this situation before, maybe i'm wrong.
>>       Hope it will be helpful.
Comment 5 Thomas Preisler 2007-10-22 18:22:51 UTC
Ken:

I'm glad it was OK for you on Solaris sparc; perhaps I saw the problem since I was running
nb in the ja locale, but if so, I wonder why I saw the problem just with that character,
and it's the same char you saw the problem with. (I think it is, anyway; is the one I used
from dumpcs, \u5934, the same value as the char you used that caused the ? to appear?)

or is it that on solaris sparc you see problem just in cnd area and not with java project ?

and is there any exception in log or window ?

also, when you see the problem in explorer, does the character look ok for one brief
moment, and then become the ?

Thanks - Ken


Will Zhang wrote:
> Hi  Ken,
>
>        Thank you for evaluation.
>        I tried to reproduce the issue you mentioned on Solaris sparc gb18030 and gbk locales with nb6beta1 + l10n jars, but i can't see ?
>        screenshot was attached.
>        I created two Java projects named "头文件" and "头文件1", then created two Java classes in project "头文件" named "头文件" and "头文件1".
>        Chinese characters display well in the explorer; there is no ?
>        Are the steps above wrong? If so, please let me know in detail how to reproduce it.
>
> thank you,
> Will
Comment 6 Thomas Preisler 2007-10-23 05:42:06 UTC
downgrading to original priority p3.
Comment 7 Thomas Preisler 2008-08-24 19:08:19 UTC
What is the status on this? Do you still see a problem?
Comment 8 Will Zhang 2008-08-25 02:29:40 UTC
This issue can still be reproduced in netbeans-trunk-nightly-200808241401-ml-cpp-solaris-sparc.sh
Comment 9 Thomas Preisler 2008-08-25 03:23:47 UTC
ok
Comment 10 Thomas Preisler 2009-07-22 20:18:10 UTC
See also CR 6863000.
Comment 11 Keiichi Oono 2010-01-24 23:20:30 UTC
I've checked the generated project files and realized that an incorrect encoding is used in <project_directory>/nbproject/configurations.xml

zhcn@gimli[195] $ head -3 configurations.xml 
<?xml version="1.0" encoding="EUC-JP"?>
<configurationDescriptor version="62">
  <logicalFolder name="root" displayName="root" projectFiles="true">

"EUC-JP" is a Japanese encoding. "UTF-8" (UNICODE) or "GB2312" (Chinese encoding) should be used.

I would suggest using the "UTF-8" encoding in all environments. I guess the following source fragment is causing this issue:

cnd/src/org/netbeans/modules/cnd/api/xml/XMLDocWriter.java
-----
     82     protected String encoding() {
     83         String lang = System.getenv("LANG");    // NOI18N
     84         String encoding = "UTF-8";              // NOI18N
     85         if (lang != null) {
     86             if (lang.equals("zh") ||            // NOI18N
     87                 lang.equals("zh.GBK") ||        // NOI18N
     88                 lang.equals("zh_CN.EUC") ||     // NOI18N
     89                 lang.equals("zh_CN.GB18030") || // NOI18N
     90                 lang.equals("zh_CN") ||         // NOI18N
     91                 lang.equals("zh_CN.GBK")) {     // NOI18N
     92 
     93                 encoding = "EUC-JP";            // NOI18N
     94 
     95             } else if (lang.equals("ja") ||     // NOI18N
     96                        lang.equals("ja_JP.eucJP")) { // NOI18N
     97 
     98                 encoding = "EUC-JP";            // NOI18N
     99             } else {
    100                 encoding = "UTF-8";             // NOI18N
    101             }
    102         }
    103         return encoding;
    104     }
-----

The "EUC-JP" in line #93 needs to be fixed. Strictly speaking, the returned encoding name should be "GB2312" when 'lang' is one of the "zh*" locales.
However, I would suggest that the encoding() method always return "UTF-8", because "UTF-8" is the preferred XML encoding for all language environments.
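
To illustrate the effect described above: when a string is encoded with a charset that cannot represent one of its characters, Java's String.getBytes substitutes the charset's replacement byte, which is "?" for most charsets. The sketch below is a minimal demonstration with a hypothetical class and helper. It uses US-ASCII for the guaranteed case and only prints, without assuming, which of the three characters EUC-JP can encode, since EUC-JP may not be installed in every JRE.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingRoundTrip {
    /** Encodes text with the given charset and decodes it again with the same charset. */
    static String roundTrip(String text, Charset cs) {
        byte[] bytes = text.getBytes(cs); // unmappable characters become the replacement byte
        return new String(bytes, cs);
    }

    public static void main(String[] args) {
        // The "Header Files" display name from the zh bundle, written with escapes.
        String header = "\u5934\u6587\u4ef6";

        // UTF-8 can represent every character, so the round trip is lossless.
        System.out.println(roundTrip(header, StandardCharsets.UTF_8).equals(header));

        // US-ASCII can represent none of them, so each one is replaced by '?'.
        System.out.println(roundTrip(header, StandardCharsets.US_ASCII));

        // EUC-JP is an optional charset; report per character whether it is encodable.
        if (Charset.isSupported("EUC-JP")) {
            Charset eucJp = Charset.forName("EUC-JP");
            for (char c : header.toCharArray()) {
                System.out.printf("U+%04X encodable in EUC-JP: %b%n",
                        (int) c, eucJp.newEncoder().canEncode(c));
            }
        }
    }
}
```

A character reported as not encodable in the last loop would be written as "?" into a file saved as EUC-JP, which would match the "?文件" display reported earlier in this thread.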
Comment 12 Thomas Preisler 2010-01-25 01:29:24 UTC
Thanks for the evaluation. Keiichi, can you please confirm the following change:

 82     protected String encoding() {
 83 //      String lang = System.getenv("LANG");    // NOI18N
 84         String encoding = "UTF-8";              // NOI18N
 85 //      if (lang != null) {
 86 //          if (lang.equals("zh") ||            // NOI18N
 87 //              lang.equals("zh.GBK") ||        // NOI18N
 88 //              lang.equals("zh_CN.EUC") ||     // NOI18N
 89 //              lang.equals("zh_CN.GB18030") || // NOI18N
 90 //              lang.equals("zh_CN") ||         // NOI18N
 91 //              lang.equals("zh_CN.GBK")) {     // NOI18N
 92 //
 93 //              encoding = "EUC-JP";            // NOI18N
 94 //
 95 //          } else if (lang.equals("ja") ||     // NOI18N
 96 //                     lang.equals("ja_JP.eucJP")) { // NOI18N
 97 //
 98 //              encoding = "EUC-JP";            // NOI18N
 99 //          } else {
100 //              encoding = "UTF-8";             // NOI18N
101 //          }
102 //      }
103         return encoding;
104     }
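
As a rough illustration of why a fixed "UTF-8" is safe here: what matters is that the encoding named in the XML declaration matches the encoding actually used to write the bytes. The sketch below is not the XMLDocWriter code; the class, helpers, and the "HeaderFiles" name value are hypothetical, and only the element and attribute names are borrowed from the configurations.xml excerpt earlier in this thread. It writes a one-element document as UTF-8 and parses it back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class Utf8XmlSketch {
    /** Writes a tiny XML document whose declaration and byte encoding are both UTF-8. */
    static byte[] write(String displayName) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            w.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
            // No attribute escaping is done here; the sketch assumes the value
            // contains no XML special characters.
            w.write("<logicalFolder name=\"HeaderFiles\" displayName=\"" + displayName + "\"/>\n");
        }
        return out.toByteArray();
    }

    /** Parses the bytes with the JDK's XML parser and returns the displayName attribute. */
    static String readDisplayName(byte[] xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
        return doc.getDocumentElement().getAttribute("displayName");
    }

    public static void main(String[] args) throws Exception {
        String header = "\u5934\u6587\u4ef6";
        String parsed = readDisplayName(write(header));
        System.out.println(parsed.equals(header));
    }
}
```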
Comment 13 Thomas Preisler 2010-01-25 01:51:46 UTC
Fixed.
Comment 14 Keiichi Oono 2010-01-25 01:59:06 UTC
Thank you, Thomas, for your prompt action. The fix looks OK. I will verify it in the development build.
Comment 15 Quality Engineering 2010-01-27 13:06:25 UTC
Integrated into 'main-golden', will be available in build *201001271614* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)
Changeset: http://hg.netbeans.org/main/rev/41d1c384b499
User: Thomas Preisler <thp@netbeans.org>
Log: #119431 - Chinese characters display "??" on project folder
Comment 16 Keiichi Oono 2010-02-23 23:20:59 UTC
Created attachment 94454 [details]
screenshot in fixed build

Verified in build 201002230200. Thank you for the fix.
I've also confirmed the saved XML encoding is now "UTF-8".
Sorry for my late verification.
Comment 17 Keiichi Oono 2010-02-23 23:21:32 UTC
Change status to VERIFIED
Comment 18 Thomas Preisler 2010-02-24 10:49:01 UTC
Yeah.... Thanks.