This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 118174

Summary: I18N - javadoc not found on solaris, maybe other platforms, in certain cases wrt multibyte
Product: java Reporter: Ken Frank <kfrank>
Component: Project Assignee: Tomas Zezula <tzezula>
Status: RESOLVED FIXED    
Severity: blocker CC: jf4jbug, jpokorsky, masaki, mmirilovic, tmysik
Priority: P3 Keywords: I18N, RELNOTE
Version: 6.x   
Hardware: Sun   
OS: All   
Issue Type: DEFECT Exception Reporter:
Attachments: case 1 - utf8 pronect encoding
euc-jp project encoding

Description Ken Frank 2007-10-09 03:02:04 UTC
this is seen on solaris; am guessing it would also happen on linux or windows if using firefox; this initial
report is for solaris.

assumption is that user can use multibyte or non ascii in project path, name, java file name, package name, methods, etc.

there is now a concept of project encoding property which will be referred to below.

in both cases running in solaris ja locale.

attached gifs will show more details.

1. utf8 proj encoding prop

multibyte in proj name, path, file name, class name, package name

solaris  and firefox as browser

generate javadoc and the browser appears with the left and right sides

left side does not show mbyte ok, right side states it can't find
the file needed, see gif

in this message,  the mbyte that shows ok is the proj path and project
name but not pkg name or file  name

also, browser encoding is set to western, yet no browser was open
before and am running ide in ja locale.

click on left side item,  same problem as above

change browser encoding to utf-8:
left side ok, not right side

click on left - now right side shows ok

2. euc-jp proj encoding prop

same setup as above

left side not ok, right side can't find file message
browser again set to western encoding

this time, the right side shows the mbyte for the pkg and file name
but not the project path and project name.

change browser to euc encoding - nothing changes:
pkg and file name ok but not proj path and proj name

change browser to utf8 encoding - does not help





Comment 1 Ken Frank 2007-10-09 03:03:25 UTC
Created attachment 50458 [details]
case 1 - utf8 pronect encoding
Comment 2 Ken Frank 2007-10-09 03:04:16 UTC
Created attachment 50459 [details]
euc-jp project encoding
Comment 3 Ken Frank 2007-10-09 03:35:18 UTC
on windows, using firefox, case 1 is ok, case 2 is not, same kind of problems as described below

IE is ok for both cases.

ken.frank@sun.com
Comment 4 Jan Pokorsky 2007-10-09 10:40:38 UTC
IMO it is caused by misconfigured javadoc target. It should set -charset, -encoding (-docencoding) options properly.

Reassigning to java/j2seproject.
Comment 5 Jan Pokorsky 2007-10-09 10:53:11 UTC
Ken, try to add proper options to the project customizer Build/Documenting/Additional options e.g. -encoding UTF-8
-charset UTF-8 -docencoding UTF-8. It works for me then.
Comment 6 Ken Frank 2007-10-09 15:27:01 UTC
Jan, 1 question and 1 comment

- comment - user should not need to add encoding option to get this to work ok IMO; since there is now
project/file encoding things, it seems that that's all user should need to do; en users don't have to do it so users
in other locales should not have to either.
And in past I don't think that was a requirement for javadoc.

- question - is separate issue needed for all project types/components in which javadoc can be generated or
is this one enough ?

ken.frank@sun.com
Comment 7 Ken Frank 2007-10-09 15:51:41 UTC
also, issues were filed on samples to be regenerated to be able to use the new feq things;
some of these have been done; some not yet; would those done need to be regenerated again
if this is fixed ?

ken.frank@sun.com
Comment 8 Jan Pokorsky 2007-10-09 15:55:22 UTC
Ken, I have asked you to add options to verify it works even for you. The fix should patch build-impl.xml the same way. I
did not close the bug saying a workaround exists. I agree it should work automatically.

As for the 'all project types' question it has to be answered by j2seproject guys.
Comment 9 Tomas Zezula 2007-10-09 16:13:53 UTC
Checking in org/netbeans/modules/java/j2seproject/resources/build-impl.xsl;
/cvs/java/j2seproject/src/org/netbeans/modules/java/j2seproject/resources/build-impl.xsl,v  <--  build-impl.xsl
new revision: 1.104; previous revision: 1.103
done
Comment 10 Tomas Zezula 2007-10-09 16:44:26 UTC
Fix only for j2seproject, fixes also j2seproject samples.
Comment 11 Masaki Katakai 2007-10-11 05:10:57 UTC
I think it's a good thing that we specify the default encoding of javadoc and charset tag in generated javadoc by default.

but I have one question - can the user override such encoding options when users specify own options into "Additional
Javadoc Options"? If someone wants to generate javadoc in different encoding, the user will customize the option. Will
this work?

Anyway, I'll try when new build ready.
Comment 12 Ken Frank 2007-10-15 21:19:58 UTC
reopening

it's ok in ja locale on solaris, using utf-8 default project encoding.

However, having a new project with euc-jp encoding, it still does not work as per
original issue, it can't find the path to the project since project name and also
nb project dir has mbyte in its name.

Even using the suggested javadoc options mentioned below using EUC-JP as the value does not
help.

ken.frank@sun.com
Comment 13 Tomas Zezula 2007-10-16 08:43:19 UTC
You don't need to add these options, it's part of the fix.
It seems to be a file system encoding vs content encoding problem, probably nothing we can fix.
Comment 14 Tomas Zezula 2007-10-16 10:48:32 UTC
The problem is that part of the URL is UTF-8 and part is EUC-JP: the javadoc tool stores the URL in the same encoding in which it
generates the web page, but the right way is to generate the text in the requested encoding (in your case EUC-JP) and URLs
in UTF-8. The only workaround for this problem is to force the javadoc to always be generated in UTF-8.
Is it acceptable?
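
To make the mismatch described in this comment concrete, here is a minimal sketch (not taken from the javadoc tool or the NetBeans sources) that percent-escapes the same multibyte name in UTF-8 and in EUC-JP. The class name `UrlEncodingMismatch`, the helper `escape`, and the sample name are hypothetical; it assumes Java 10 or later and a JDK that ships the EUC-JP charset. `URLEncoder` implements form encoding rather than path escaping, which is close enough for a name without spaces.

```java
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class UrlEncodingMismatch {

    // Percent-escape a name using the bytes of the given charset.
    static String escape(String name, Charset charset) {
        return URLEncoder.encode(name, charset);
    }

    public static void main(String[] args) {
        // Hypothetical package or project name containing two kanji.
        String name = "\u65e5\u672c";

        String utf8 = escape(name, StandardCharsets.UTF_8);
        // Assumption: the EUC-JP charset is available in this JDK.
        String eucJp = escape(name, Charset.forName("EUC-JP"));

        System.out.println("UTF-8 : " + utf8);
        System.out.println("EUC-JP: " + eucJp);
        System.out.println("same escapes? " + utf8.equals(eucJp));
    }
}
```

The two escaped forms differ, so a link written in one encoding will not resolve when the browser or file system expects the other.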
Comment 15 Ken Frank 2007-10-16 16:04:51 UTC
I think the problem with using utf-8 is that the opposite problem can happen, there will still
be 2 encodings active and thus instead of path/project dir not found, it can be that the
pkg or class file (that is named with non ascii) will not be found (or vice versa).

ken.frank@sun.com
Comment 16 Tomas Zezula 2007-10-16 16:20:18 UTC
OK, in this case it's a WONTFIX, since it's not fixable in the IDE.
Comment 17 Ken Frank 2007-10-16 16:23:26 UTC
I added relnote keyword so perhaps this can be mentioned, and possible workaround
might be to have browser go directly to the needed pages, though even that might
not work since still multiple encodings in this case.

ken.frank@sun.com
Comment 18 Tomas Zezula 2007-10-16 16:56:43 UTC
But I still think that UTF-8 encoded javadoc is much better since probably all OSes have a UTF-8 encoded file system,
the problem will disappear for 99% of users.
Comment 19 Ken Frank 2007-10-16 17:08:35 UTC
It still might be more common for users to use non utf-8, since windows, at least for ja and zh,
does not have locales/reg settings that have utf-8 as their default encoding.
and even those running on unix, that does have utf8 locales, might need to use legacy encodings
(like are found in solaris ja locale).

but if it's about using utf-8 as the project encoding in nb, and the fix is for that case,
then I agree that most users will use the default encoding of utf-8.


ken.frank@sun.com
Comment 20 Tomas Zezula 2007-10-16 18:16:21 UTC
It would be nice if someone who has Japanese windows can try it. On Solaris I cannot say since the file system is UTF-8.
Comment 21 Masaki Katakai 2007-10-17 14:17:29 UTC
I tried the latest daily build on Japanese Windows. In both cases windows-31j and UTF-8
project encoding, it's working fine.
(If I use multibytes in project name and project path, the problem appears. But let's ignore such cases.)

I agree with Tomas, let's use UTF-8 encoding for javadoc encoding, I believe it can help more cases e.g.

- Create NB project in EUC-JP encoding on Windows Japanese
- Create NB project in windows-31j encoding on Solaris EUC locale

These are not common cases but with current implementation, it's not working when class and package name include multibyte.
If we could generate javadoc always in UTF-8, I think it will work.
Comment 22 Tomas Zezula 2007-10-17 14:28:38 UTC
OK, I will change it to generate UTF-8 javadoc.
Comment 23 Tomas Zezula 2007-10-17 15:52:24 UTC
Changed to generate javadoc's HTML pages in UTF-8.

Checking in org/netbeans/modules/java/j2seproject/resources/build-impl.xsl;
/cvs/java/j2seproject/src/org/netbeans/modules/java/j2seproject/resources/build-impl.xsl,v  <--  build-impl.xsl
new revision: 1.106; previous revision: 1.105
done
Comment 24 Ken Frank 2007-10-17 16:54:05 UTC
as to comment
(If I use multibytes in project name and project path, the problem appears. But let's ignore such cases.)

-  but they are valid cases also.

I am ok with the fix as to utf-8, but needed to clarify about that point.

ken.frank@sun.com
Comment 25 Ken Frank 2008-03-06 20:33:44 UTC
it's a long issue - Tomas, could you summarize the fix; and what cannot be
expected to work now as to if user has non ascii in some project names or paths while using
either utf-8 or some other project encodings ?

Thanks - Ken

ken.frank@sun.com
Comment 26 Tomas Zezula 2008-03-07 07:37:49 UTC
As far as I remember the fix works in the following way:
1) The javadoc reads the source in the project's encoding - set in project customizer
2) The javadoc generates HTML pages always in the UTF-8 encoding - needed because browsers expect URLs encoded in UTF-8, but javadoc
generates the URLs in the same encoding as the HTML, which caused links not to work.

When the project is set up correctly - all source roots are in the project encoding - the javadoc should be generated correctly in UTF-8.
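
As a rough illustration of the two steps in this comment, the sketch below builds the encoding-related arguments one might pass to the `javadoc` tool: `-encoding` for reading sources, and `-docencoding` and `-charset` for the generated HTML. The three options are standard javadoc options mentioned earlier in this issue. The class `JavadocEncodingArgs` and the method `encodingArgs` are hypothetical, and whether build-impl.xsl sets exactly these three options is an assumption, not something confirmed in this thread. It assumes Java 9 or later for `List.of`.

```java
import java.util.List;

public class JavadocEncodingArgs {

    // Sources are read in the project encoding; output HTML is always UTF-8.
    static List<String> encodingArgs(String projectEncoding) {
        return List.of(
                "-encoding", projectEncoding, // encoding of the source files
                "-docencoding", "UTF-8",      // encoding of the generated HTML files
                "-charset", "UTF-8");         // charset declared inside the HTML
    }

    public static void main(String[] args) {
        // Example: a project whose sources are stored in EUC-JP.
        System.out.println(String.join(" ", encodingArgs("EUC-JP")));
    }
}
```

With a project encoding of EUC-JP this prints the options `-encoding EUC-JP -docencoding UTF-8 -charset UTF-8`.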
Comment 27 Ken Frank 2008-07-28 21:49:23 UTC
Tomas, I want to finally verify this one, but first a summary question:

1. from Tomas comments 
As far as I remember the fix works in the following way:
1) The javadoc reads the source in the project's encoding - set in project customizer
2) The javadoc generates HTML pages always in the UTF-8 encoding - needed because browsers expect URLs encoded in UTF-8,
but javadoc
generates the URLs in the same encoding as the HTML, which caused links not to work.

When the project is set up correctly - all source roots are in the project encoding - the javadoc should be generated
correctly in UTF-8.


2. from Ken's questions:

but I'd seen that if user used another encoding for project, it would not work:

The problem is that part of the URL is UTF-8 and part is EUC-JP: the javadoc tool stores the URL in the same encoding in which it
generates the web page, but the right way is to generate the text in the requested encoding (in your case EUC-JP) and URLs
in UTF-8. The only workaround for this problem is to force the javadoc to always be generated in UTF-8.
Is it acceptable?

------- Additional comments from kfrank Tue Oct 16 15:04:51 +0000 2007 -------

I think the problem with using utf-8 is that the opposite problem can happen, there will still
be 2 encodings active and thus instead of path/project dir not found, it can be that the
pkg or class file (that is named with non ascii) will not be found (or vice versa).

3. and Tomas's reply:

------- Additional comments from tzezula Tue Oct 16 15:20:18 +0000 2007 -------

OK, in this case it's a WONTFIX, since it's not fixable in the IDE.

===> thus the fix is only if project using utf-8 project encoding ?
(and we are saying wontfix for other cases of it)

We can still verify it, just want to be clear.

ken.frank@sun.com


Comment 28 Tomas Zezula 2008-08-13 19:04:21 UTC
Hi Ken,
the final state is:
Sources are read in the project encoding and javadoc is always generated in UTF-8. This shouldn't be a problem since it's HTML, so it contains an encoding declaration.
Problem 3 (wontfix) can happen when the filesystem is not able to store or resolve UTF-8 file names, but it's up to the user to fix that kind of
problem.
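
One way a user might check for the remaining wontfix case is to test whether a generated file name can be represented in the charset the file system uses for names. This is only a sketch under that assumption: the class `FileNameEncodingCheck`, the method `representable`, and the sample file name are hypothetical, and the file-system charset has to be supplied by the caller because Java offers no portable way to query it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class FileNameEncodingCheck {

    // True if every character of the file name can be encoded in the given charset.
    static boolean representable(String fileName, Charset fileSystemCharset) {
        return fileSystemCharset.newEncoder().canEncode(fileName);
    }

    public static void main(String[] args) {
        // Hypothetical javadoc output file named after a class with a multibyte name.
        String name = "\u65e5\u672c\u8a9e.html";

        System.out.println("UTF-8   : " + representable(name, StandardCharsets.UTF_8));
        System.out.println("US-ASCII: " + representable(name, StandardCharsets.US_ASCII));
    }
}
```

A name that is not representable in the file system's charset is the kind of case this comment leaves to the user to fix.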