It seems that the current code assumes only iso-2022-jp, sjis and euc-jp as Javadoc encodings. Only these encodings are accepted as Japanese Javadoc, which means that Javadoc written in UTF-8 with Japanese content is not accepted.

Jdk12SearchType_japan.java:

    if ("jisautodetect".equals(acceptedEncoding)) { //NOI18N
        return "iso-2022-jp".equals (encoding) || //NOI18N
               "sjis".equals (encoding) || //NOI18N
               "euc-jp".equals (encoding); //NOI18N
               // || "utf-".equals (encoding); XXX Probably not, UTF-8 can be anything ????
    }

The JDK changed the encoding of its Japanese Javadoc from euc-jp to UTF-8 in JDK 6: jdk-6-doc-ja.zip (can be downloaded from http://java.sun.com/javase/ja/6/download.html). So the search currently does not work with this Javadoc zip. We need to find a better way to accept such UTF-8 based Javadoc as Japanese Javadoc.
In the current implementation, accepts() checks only the encoding: it accepts the possible Japanese encodings, e.g. sjis, euc-jp and iso-2022-jp. UTF-8 is now widely used and we should accept it, so we need an additional way to determine whether the Javadoc contains Japanese keywords.
- If the encoding is Japanese -> accept (return true)
- If the encoding is utf-8 -> check the contents; if Japanese keywords are included -> accept (return true)
Any ideas about other conditions? Should we check the NetBeans locale?
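
The two rules above could be sketched roughly as follows. This is only an illustration, not the actual Jdk12SearchType_japan code: the class name, the method signatures and the idea of passing in an already decoded text sample are assumptions. It also substitutes a simpler test for the "Japanese keywords" check: it looks for any hiragana or katakana character, because kanji alone cannot distinguish Japanese from Chinese.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper; not the real NetBeans implementation. */
final class JapaneseJavadocAcceptSketch {

    // Encodings the report names as unambiguously Japanese.
    private static final Set<String> JAPANESE_ENCODINGS =
            Set.of("iso-2022-jp", "sjis", "euc-jp");

    /**
     * @param encoding   charset name reported for the Javadoc, may be null
     * @param sampleText text already decoded from one of the Javadoc pages
     */
    static boolean accepts(String encoding, String sampleText) {
        if (encoding == null) {
            return false;
        }
        String enc = encoding.toLowerCase(Locale.ROOT);
        if (JAPANESE_ENCODINGS.contains(enc)) {
            return true;                         // rule 1: Japanese encoding
        }
        if ("utf-8".equals(enc)) {
            return containsKana(sampleText);     // rule 2: inspect the contents
        }
        return false;
    }

    /** True if the text contains at least one hiragana or katakana character. */
    static boolean containsKana(String text) {
        if (text == null) {
            return false;
        }
        return text.codePoints().anyMatch(cp -> {
            Character.UnicodeScript script = Character.UnicodeScript.of(cp);
            return script == Character.UnicodeScript.HIRAGANA
                    || script == Character.UnicodeScript.KATAKANA;
        });
    }
}
```

A kana check is looser than matching specific keywords, but it does not depend on the exact wording of any particular Javadoc translation.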
We have the same situation for Simplified Chinese. It seems that only GB2312, GB18030, GBK are accepted, but not UTF-8.
I think it's reasonable to include the ja or zh_CN UTF-8 encoding as a search parameter, since at least Solaris and Linux have ja and zh_CN UTF-8 locales. Usually the assumption is that the encoding in use is that of the user's locale, but the user can also change the encoding of a given Java file through its properties. So perhaps the encoding of each searched file could be used as well, although that may not be possible, since the user enters the search term while in a particular locale and encoding. ken.frank@sun.com
Created attachment 40469: proposed patch, check Japanese encoding first. If the encoding is utf-8, we need to check the contents for localized strings. The JISAutodetect encoding should not be used. Using the allclasses-frame.html file would be reasonable. This will not break the English build. Will
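
As a rough illustration of the content check described in this comment (not the attached patch itself), one could read allclasses-frame.html as UTF-8 and look for a localized string. The class name, the method name and in particular the marker string are assumptions: the marker below is a Japanese rendering of "All Classes", and the text that actually appears in the JDK 6 Japanese Javadoc should be confirmed before relying on it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical probe; names and marker text are illustrative only. */
final class AllClassesFrameProbe {

    // Assumed localized marker ("All Classes" in Japanese), written with
    // Unicode escapes so the source file encoding does not matter.
    static final String ASSUMED_MARKER =
            "\u3059\u3079\u3066\u306e\u30af\u30e9\u30b9";

    /** Returns true if the Javadoc root has an allclasses-frame.html containing the marker. */
    static boolean looksJapanese(Path javadocRoot) throws IOException {
        Path frame = javadocRoot.resolve("allclasses-frame.html");
        if (!Files.isRegularFile(frame)) {
            return false;                        // nothing to inspect
        }
        // Decode as UTF-8 because this check only runs for UTF-8 Javadoc.
        String html = new String(Files.readAllBytes(frame), StandardCharsets.UTF_8);
        return html.contains(ASSUMED_MARKER);
    }
}
```

English Javadoc never contains the marker, so a check of this kind leaves the English case unaffected.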
Can anyone review the patch?
I just do not see any reason to read "JDK12_ALLCLASSES_JA" from Bundle.properties file. Otherwise it looks OK. Feel free to integrate it. Thanks for the patch!
fixed in /cvs/javadoc/src/org/netbeans/modules/javadoc/search/Jdk12SearchType_japan.java,v <-- Jdk12SearchType_japan.java new revision: 1.11; previous revision: 1.10
/cvs/javadoc/src/org/netbeans/modules/javadoc/search/Jdk12SearchType_japan.java,v <-- Jdk12SearchType_japan.java new revision: 1.12; previous revision: 1.11
*** Issue 108492 has been marked as a duplicate of this issue. ***
What are the user scenarios here? I'd like to verify. Please specify both the locale the user is in and the project encoding properties that might be used. For example, let's use the ja locale and a project in the default UTF-8 or euc-jp project encoding on Solaris. (And please confirm: is this about viewing Japanese Javadoc, or about users generating Javadoc for their own project?) Also, there are currently some issues with Javadoc not displaying correctly in Firefox if non-ASCII characters are used in places like the project name, path, class name, package name, etc. (I know this is not related, but FYI since it is about Javadoc.) See 118174 for J2SE projects; issues for web and J2EE projects will be filed.
To verify this issue quickly, you can use the Japanese JDK 5 and JDK 6 Javadoc and then try a search. We should get the same results as with the English Javadoc.
- JDK 5 Japanese Javadoc: EUC encoding
- JDK 6 Japanese Javadoc: UTF-8 encoding
In both cases it should work, and I actually got the same results. (But I found a small issue: the icons in the search results are not correct. I'll open a new bug for this.)
> also, there are some issues now with javadoc not appearing ok in firefox if non ascii
> is used in places like project name, path, class name, pkg name, etc.
> (I know not related to this but FYI since about javadoc)
Yes, I think the reason is described in bug 118174: by default, javadoc generates its output in the native encoding and does not add a charset meta tag. So sometimes the pages do not display correctly until we change the browser's character encoding. I'll try the new build to see how the fix for 118174 works.
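
To make the missing charset meta tag easier to spot when verifying, a small check like the one below could report whether a generated page declares its encoding at all. It is only a sketch: the class and method names are made up for illustration, and the regular expression is a simplification rather than a full HTML parser.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for inspecting generated Javadoc pages. */
final class DeclaredCharsetSketch {

    // Matches both <meta charset="..."> and the older
    // <meta http-equiv="Content-Type" content="text/html; charset=..."> form.
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta[^>]*charset\\s*=\\s*[\"']?([\\w.:-]+)",
            Pattern.CASE_INSENSITIVE);

    /** Returns the charset named in a meta tag, or empty if the page declares none. */
    static Optional<String> declaredCharset(String html) {
        Matcher m = META_CHARSET.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

An empty result corresponds to the case described above, where the browser has to guess the encoding.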
Why do the bundle files need specific translated words for Japanese related to Javadoc viewing? I don't know whether that's really related to this issue, but I saw bundle files mentioned here and then saw some separate Japanese words in the bundle files. ken.frank@sun.com
Jan already fixed that issue. Thank you, Jan. We don't need any specific Japanese words in the bundle file. (The old implementation used some Japanese words from the Bundle files.)
Is it safe to remove them from the current bundles, or is it better to wait until after NB 6? ken.frank@sun.com
Yes, it's safe. We can remove them now; in fact, they have already been removed in the fix for bug 118488. This means that displaying and searching Japanese Javadoc works without a Japanese Bundle.properties.