Bug 121959

Summary: I18N - jsp or servlet with multibyte name does not run ok
Product: javaee Reporter: Ken Frank <kfrank>
Component: Code Assignee: Tomas Mysik <tmysik>
Status: VERIFIED FIXED    
Severity: blocker CC: dkonecny, kaa, phejl, pjiricka, rroska, tmysik
Priority: P2 Keywords: I18N, RELNOTE
Version: 6.x   
Hardware: Sun   
OS: All   
Issue Type: DEFECT Exception Reporter:
Attachments: using underscores in browser
mbyte name was converted by wizard using underscores
using underscores in browser
image

Description Ken Frank 2007-11-15 06:20:57 UTC
this can be separated into 2 issues if needed, since it discusses both jsp and
servlet, but since a jsp is compiled to a servlet, I thought it ok to discuss both here.

in ja locale, on solaris, using default utf-8 project encoding or euc-jp
encoding.

1. create web project - in this case project name has multibyte but
probably not important for this situation.

2. create a servlet and have its name have multibyte; by default
the uri does also.

2a. uncomment the commented out part of servlet code.

2b. compile servlet - no errors seen

3. run the servlet - and leave the uri as is in the uri popup window - the browser shows 
HTTP Status 404 -

type Status report

message

description The requested resource () is not available.

a. on windows using ie browser and with default utf-8 project encoding,
besides above msg the multibyte that represents servlet uri does not show ok.

b. in gf, the servlets are shown as deployed.

c. servlet code has utf-8 encoding value in it; don't know if that value
should be seeded with encoding of the project as is done for jsp files ?
but even with that section uncommented, it does not run ok.

4. change the uri of servlet to not have mbyte, but leave servlet
name as having mbyte - it runs ok and shows in browser ok.


5. I don't know if this relates to not replacing the mbyte with underscores
for the uri, as is done with context root of web app projects.

6. same situation for jsp, if jsp file name has mbyte in it.
but in jsp case, there are some errors in the compilation of the created
servlet, in this case for euc-jp project encoding on solaris

Compiling 1 source file to /home/proj/と粮Jot粮WebApplication4粤ろ/build/generated/classes
/home/proj/と粮Jot粮WebApplication4粤ろ/build/generated/src/org/apache/jsp/と粮jspeuc_jsp.java:7: warning: unmappable character for encoding EUC-JP
public final class ??荻jspeuc_jsp extends org.apache.jasper.runtime.HttpJspBase
/home/proj/と粮Jot粮WebApplication4粤ろ/build/generated/src/org/apache/jsp/と粮jspeuc_jsp.java:7: warning: unmappable character for encoding EUC-JP
public final class ??荻jspeuc_jsp extends org.apache.jasper.runtime.HttpJspBase
/home/proj/と粮Jot粮WebApplication4粤ろ/build/generated/src/org/apache/jsp/と粮jspeuc_jsp.java:7: illegal character: \65533
public final class ??荻jspeuc_jsp extends org.apache.jasper.runtime.HttpJspBase
/home/proj/と粮Jot粮WebApplication4粤ろ/build/generated/src/org/apache/jsp/と粮jspeuc_jsp.java:7: illegal character: \65533
public final class ??荻jspeuc_jsp extends org.apache.jasper.runtime.HttpJspBase

--> the EUC-JP part is probably a msg about an unsupported encoding (the compiler
warning says a character cannot be mapped to EUC-JP), and in ??荻jspeuc_jsp the
question marks show incorrectly rendered multibyte, so this part could be an encoding situation.

for windows, similar compile msgs as above but not problems with EUC-JP encoding mentioned.

in utf-8 project encoding there is not this error but still the same browser
error msg as above for servlet.


7. not related directly to this - but the popup that asks for the uri
for the servlet says it can be changed later using tools->set new uri 
but I don't see that choice on tools menu of servlet, web project or ide.
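
A note on the \65533 in the compiler output in step 6: 65533 is the decimal value of U+FFFD, the Unicode replacement character, which a decoder emits for bytes it cannot interpret. The sketch below shows how decoding with a mismatched charset produces it. The class and method names are hypothetical, and the 0xFF input is only a stand-in, not the actual bytes of the generated file in this project.

```java
import java.nio.charset.StandardCharsets;

final class ReplacementCharDemo {

    /** Decodes raw bytes as UTF-8; malformed input is replaced with U+FFFD. */
    static String decodeAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xFF can never appear in valid UTF-8, so it stands in for
        // "bytes written in one encoding and read back in another".
        byte[] invalid = {(byte) 0xFF};
        String decoded = decodeAsUtf8(invalid);
        System.out.println((int) decoded.charAt(0)); // prints 65533
    }
}
```

If the generated servlet source was written in one encoding and compiled with another, this kind of substitution would be one plausible explanation for the illegal character error, but the thread does not confirm it.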
Comment 1 Ken Frank 2007-11-15 06:28:59 UTC
if the jsp with mbyte name is built and run as part of running the project,
then it does show ok in browser - that is, run the project
and get in browser a dir listing of the jsp files in it
then click the jsp file and it shows ok.

perhaps there is difference in what happens and how things processed
when project built and run vs compile/run a jsp or servlet by itself.

ken.frank@sun.com
Comment 2 Petr Hejl 2007-11-16 09:42:21 UTC
I tested this on Linux and every thing works quite fine:

1) created web project on glassfish
2) created servlet with name あだだ.java
3) compile on servlet
4) run on servlet
5) left the text in the popup as is
6) browser (firefox) opened the servlet url

With compile followed by run on jsp it works too.

I will try windows.
Comment 3 Petr Hejl 2007-11-16 10:19:06 UTC
I can reproduce on windows. It seems to me that urls are escaped incorrectly.
Netbeans build log:
http://localhost:8080/WebApplication18/アダダ
Firefox displayed url:
http://localhost:8080/WebApplication18/%C3%A3%E2%80%9A%C2%A2%C3%A3%C6%92%E2%82%AC%C3%A3%C6%92%E2%82%AC
Right url:
http://localhost:8080/WebApplication18/%E3%82%A2%E3%83%80%E3%83%80
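
The two escaped forms above can be reproduced with a small sketch. It assumes the wrong URL arises when the UTF-8 bytes of the name are misread as windows-1252 and then escaped again as UTF-8; that is a guess that happens to match the Firefox URL byte for byte, not a confirmed diagnosis. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class UrlEscapeDemo {

    /** Percent-encodes the UTF-8 bytes of a path segment, keeping unreserved ASCII. */
    static String percentEncodeUtf8(String segment) {
        StringBuilder sb = new StringBuilder();
        for (byte b : segment.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            boolean unreserved = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9') || v == '-' || v == '_' || v == '.' || v == '~';
            if (unreserved) {
                sb.append((char) v);
            } else {
                sb.append(String.format("%%%02X", v));
            }
        }
        return sb.toString();
    }

    /** Assumed failure mode: UTF-8 bytes misread as windows-1252, then escaped again. */
    static String escapeMisdecoded(String segment) {
        byte[] utf8 = segment.getBytes(StandardCharsets.UTF_8);
        String misread = new String(utf8, Charset.forName("windows-1252"));
        return percentEncodeUtf8(misread);
    }

    public static void main(String[] args) {
        String name = "\u30A2\u30C0\u30C0"; // the katakana servlet name from the build log
        System.out.println(percentEncodeUtf8(name)); // the "right url" form
        System.out.println(escapeMisdecoded(name));  // the form Firefox displayed
    }
}
```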
Comment 4 Petr Jiricka 2007-11-16 13:17:43 UTC
Not a stopper for 6.0, setting target milestone to Dev.
Comment 5 Petr Hejl 2007-11-16 14:04:52 UTC
Specification is unclear on url encoding with multibyte characters: http://www.rfc-editor.org/rfc/rfc2396.txt.
Comment 6 Ken Frank 2007-11-16 15:55:54 UTC
I've seen same situation using IE on windows.

as to the uri/url spec and mbyte, I know in other parts of nb its
handled, ie if path/project dir or file name has path with mbyte,
that the nb code and/or browser (not sure) does whats needed so that
the page can be found, although firefox can have some problems with this.

thats one reason I wondered if the mbyte needed to be replaced with underscores
for the individual servlet or jsp names for their part of the uri,
just as web apps does for the project context root itself.

BTW, could this issue be related to 121933 in the sense of uri/urls handling ?

ken.frank@sun.com
Comment 7 Ken Frank 2007-11-16 17:24:25 UTC
should this be release noted ? added that keywd in case, please remove
if it should not.

since web apps do handle url/uri ok, and it is legal, as far as I know,
for url/uri to have non ascii, then perhaps relnote can clarify for this
case ?

ken.frank@sun.com
Comment 8 Petr Hejl 2007-11-17 21:29:55 UTC
I can't agree with desc7. When I create web app WebApplication19ア with context-root WebApplication19ア, then browsing
does not work correctly either (both ie7 and firefox). Application is deployed and accessible by hand, but when the
browser is invoked by nb it shows the wrong url.

I've tested workaround with passing already escaped url (with utf8 charset - recommended by html) - this works.
Unfortunately no specification is clear on this - there is no mandatory mechanism afaik. In fact it also depends on the
server (for example tomcat's URIEncoding parameter).

Maybe the described workaround could solve most of the problems, but I'm afraid that in such complicated environment
(server, netbeans passing argument to external process, os locale, browser settings) we can't guarantee 100% success.

If anyone can provide some normative documentation which I missed, please let me know.
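
The workaround described in this comment (handing the browser an already escaped url, UTF-8 based) could look roughly like the sketch below. It uses java.net.URI from the JDK; the class and method names are hypothetical and this is not the code that was tested in NetBeans.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class EscapedUrlDemo {

    /** Builds an http url and returns it with non-ASCII characters percent-encoded as UTF-8. */
    static String toBrowserUrl(String host, int port, String path) {
        try {
            // The multi-argument constructor quotes illegal characters;
            // toASCIIString() then escapes the remaining non-ASCII ones.
            return new URI("http", null, host, port, path, null, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Context root from this comment, written with a Unicode escape.
        System.out.println(toBrowserUrl("localhost", 8080, "/WebApplication19\u30A2"));
        // prints http://localhost:8080/WebApplication19%E3%82%A2
    }
}
```

Passing a pure ASCII string to the external browser process avoids depending on how the OS locale converts the command line, but the server still has to decode the escapes as UTF-8.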
Comment 9 Ken Frank 2007-11-18 01:07:57 UTC
agree, don't think every case could be covered,
also some limitations of firefox, at least on solaris, as to parsing
of non ascii might apply (this might be just when project encoding is
not utf8 but for example euc-jp if user is in solaris ja locale).

But perhaps the most common scenarios could be fixed ie
- works with either project encoding of utf-8 or that of the default
encoding of the locale the are running nb in - I think one big
reason for the feq implementation was to allow use utf-8 encoding,
but that for windows at least there is no locale for which that is
default encoding -- but we also need to support those who still need
to use default encoding of the locale they are in.

we can cover case for where the browser desired language is the first
item in the browser preferences since user can be running nb in one locale
but browser prefs can be another.

as to the workaround wrt url formatting, is that something that could
work for the use of underscores in the web app context root itself ?

or is the use of underscores a way to solve this issue with separate 
servlets or jsp ?

ken.frank@sun.com
Comment 10 Petr Hejl 2007-11-19 11:08:49 UTC
Well, to make it clear - I don't think this issue is related to project encoding, feq or anything similar. Basically, it
is the issue with passing java string (with unicode chars) to the external process (browser) and issue with the way how
such string is handled by the browser (if passed correctly). Another issue is whether the server will understand the uri
encoding sent by the browser (tomcat - for example - with default setting will not understand any non ISO-8859-1
characters in url, although the application will deploy).

The underscores could avoid (not to solve) the problem partially - user can always configure the context-root or servlet
path with multibyte characters. 
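
The server side of this can be illustrated with a sketch of what happens when UTF-8 escapes are decoded as ISO-8859-1. This only models the decoding step; it is not Tomcat's code, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class UriDecodingDemo {

    /** Turns %XX escapes back into bytes and decodes them with the given charset (ASCII input assumed). */
    static String decode(String escaped, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < escaped.length(); i++) {
            char c = escaped.charAt(i);
            if (c == '%') {
                out.write(Integer.parseInt(escaped.substring(i + 1, i + 3), 16));
                i += 2; // skip the two hex digits
            } else {
                out.write(c);
            }
        }
        return new String(out.toByteArray(), charset);
    }

    public static void main(String[] args) {
        String escaped = "%E3%82%A2"; // UTF-8 escapes for one katakana character
        String asUtf8 = decode(escaped, StandardCharsets.UTF_8);
        String asLatin1 = decode(escaped, StandardCharsets.ISO_8859_1);
        System.out.println(asUtf8.length());   // 1: the original character
        System.out.println(asLatin1.length()); // 3: one char per byte, no longer the servlet name
    }
}
```

With ISO-8859-1 decoding the path no longer equals the mapped name, so the lookup would fail even though the application deployed.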
Comment 11 Ken Frank 2007-11-30 02:26:38 UTC
can team see if it can be fixed for some upcoming patch or 6.0.1 ?
since we allow in nb use of non ascii in project names and paths,
fixing it here would allow consistency with that.

ken.frank@sun.com
Comment 12 Ken Frank 2007-12-03 16:05:55 UTC
added release60_fixes_candidate2 to status whiteboard
since that is process from sustaining to see if this
can be fixed for patch 2.

Can team see if its a fix that can be done; the common case
as discussed below that is.

I can point team to info on the alias and internal sustaining
process about it.

ken.frank@sun.com
Comment 13 Petr Hejl 2007-12-11 11:09:22 UTC
It is not fixable for all cases. We can agree on replacing multibytes with underscores for example or passing URL UTF-8
encoded.
Comment 14 Ken Frank 2007-12-11 16:19:41 UTC
I think its reasonable that the common cases can be fixed without needing to
fix all the cases.  If this can be done in trunk then we can test the fix
so that we have met requirements for it being able to be in patch 2 in January.

ken.frank@sun.com
Comment 15 Ken Frank 2007-12-13 17:52:42 UTC
I found out that for it to be fixed in patch2, it needs to be fixed in trunk
first and then verified - thus please let us know when the fix will be in
trunk and we will verify immediately.

ken.frank@sun.com
Comment 16 Petr Hejl 2008-01-07 11:05:38 UTC
Simple solution is to escape default url mapping with underscores.

Posting the simple url check methods (will discuss the placement of these to j2ee core utilities):

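    // Requires: import java.util.regex.Pattern;

    // RFC 2396 reserved and unreserved characters ('%' escapes are not accepted).
    private static final Pattern VALID_URL_PATTERN = 
            Pattern.compile("[-_.!~*'();/?:@&=+$,a-zA-Z0-9]+"); // NOI18N

    // Returns true if the url consists only of the characters above.
    private static boolean isRFC2396Url(String url) {
        return VALID_URL_PATTERN.matcher(url).matches();
    }

    // Returns the url with every other character replaced by an underscore.
    private static String getRFC2396Url(String url) {
        if (isRFC2396Url(url)) {
            return url;
        }
        StringBuilder sb = new StringBuilder(url);
        // Check one char at a time; the length never changes, so indexes stay valid.
        for (int i = 0; i < sb.length(); i++) {
            if (!isRFC2396Url(sb.substring(i, i + 1))) {
                sb.replace(i, i + 1, "_"); // NOI18N
            }
        }
        return sb.toString();
    }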
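
For illustration, a self-contained sketch of the same underscore replacement applied to a default url mapping. The character class is the one posted above; the class and method names are hypothetical and this is not the code that was later checked in.

```java
import java.util.regex.Pattern;

final class UrlPatternSanitizer {

    // Same allowed characters as in the methods above, matched one char at a time.
    private static final Pattern VALID_CHAR =
            Pattern.compile("[-_.!~*'();/?:@&=+$,a-zA-Z0-9]");

    /** Replaces every char outside the allowed set with an underscore. */
    static String sanitize(String urlPattern) {
        StringBuilder sb = new StringBuilder(urlPattern.length());
        for (int i = 0; i < urlPattern.length(); i++) {
            String ch = urlPattern.substring(i, i + 1);
            sb.append(VALID_CHAR.matcher(ch).matches() ? ch : "_");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(sanitize("/\u3042\u3060\u3060")); // prints /___
        System.out.println(sanitize("/NewServlet"));         // prints /NewServlet
    }
}
```

Because the check works per Java char, a character outside the Basic Multilingual Plane becomes two underscores. Different names can also collapse to the same mapping, for example any two three-character hiragana names both give /___.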
Comment 17 Ken Frank 2008-01-10 19:56:37 UTC
am updating the status whiteboard with correct string for patch 3;
originally it was for patch 1.

patch 3 happens in the next few weeks; could it be fixed for it ?
(the case discussed in suggested fix part)

ken.frank@sun.com
Comment 18 Ken Frank 2008-01-11 16:53:36 UTC
correcting the status whiteboard item back to _candidate1, sorry for the misunderstanding
about this.

ken.frank@sun.com
Comment 19 Tomas Mysik 2008-01-15 10:19:16 UTC
Seems to be easy to fix.
Comment 20 Tomas Mysik 2008-01-15 15:03:15 UTC
Finally not so easy but should be fixed - please verify. Thanks.

Checking in src/org/netbeans/modules/web/wizards/Bundle.properties;
/cvs/web/core/src/org/netbeans/modules/web/wizards/Bundle.properties,v  <--  Bundle.properties
new revision: 1.14; previous revision: 1.13
done
Checking in src/org/netbeans/modules/web/wizards/DeployDataPanel.java;
/cvs/web/core/src/org/netbeans/modules/web/wizards/DeployDataPanel.java,v  <--  DeployDataPanel.java
new revision: 1.4; previous revision: 1.3
done
Checking in src/org/netbeans/modules/web/wizards/ServletData.java;
/cvs/web/core/src/org/netbeans/modules/web/wizards/ServletData.java,v  <--  ServletData.java
new revision: 1.8; previous revision: 1.7
done
Comment 21 Ken Frank 2008-01-17 00:17:27 UTC
is the fix in trunk or only in patch branch ?
Andrey please use applicable build for verification.

ken.frank@sun.com
Comment 22 Tomas Mysik 2008-01-17 12:33:24 UTC
Yes, it's in trunk.
Comment 23 kaa 2008-01-17 13:44:31 UTC
verified in trunk:
Product Version: NetBeans IDE 6.0 (Build 200801170000)
Comment 24 kaa 2008-01-17 15:32:38 UTC
checked with the following project encodings:
S10: utf-8/euc-jp
wXP: utf-8/win-31j
Comment 25 kaa 2008-01-17 16:10:31 UTC
reopening - I used incorrect setup for verification.
Servlets with mbyte in names couldn't be accessed. There was mentioned HTTP Status 404.
Compilation and deployment look ok on XP and Solaris 10. Checked with Mozilla and IE on windows.
I was able to use JSP pages with mbyte in their names only on Solaris 10 (proj encodings: utf-8 and x-euc)
Comment 26 Tomas Mysik 2008-01-18 09:28:48 UTC
Works for me ('dangerous' characters are replaced in web.xml, URL mapping for such servlet works for me) - are you 
sure you have tried the correct URL? HTTP status 404 means 'not found'. Reopen please if the problem still exists and 
provide exact steps how to reproduce. Thanks.

Product Version: NetBeans IDE Dev (Build 080118)
Java: 1.5.0_13; Java HotSpot(TM) Client VM 1.5.0_13-b05
System: Linux version 2.6.23-gentoo-r5 running on i386; UTF-8; cs_CZ (nb)
Comment 27 kaa 2008-01-21 15:46:31 UTC
I used 0121 build from here:
http://bits.netbeans.org/dev/nightly/latest/

Steps:
1. Created WebApp using mbyte only in its name
2. Added WS using mbyte only in its name and package
3. Changed context path to run using servlet name. (that was typed in the step 1)
4. Run the App

The only thing I got in Mozilla was:

HTTP Status 404 -

type Status report
message
description The requested resource () is not available.

The Servlet looks ok with its name converted to underscores.
Is this expected behavior?
Comment 28 kaa 2008-01-21 15:47:56 UTC
Created attachment 55330 [details]
using underscores in browser
Comment 29 kaa 2008-01-21 15:49:00 UTC
Created attachment 55331 [details]
mbyte name was converted by wizard using underscores
Comment 30 kaa 2008-01-21 15:49:44 UTC
Created attachment 55332 [details]
using underscores in browser
Comment 31 Tomas Mysik 2008-01-22 17:26:29 UTC
Reopening and adding Radim to CC to verify. Thanks Radime.
Comment 32 Radim Roska 2008-01-25 15:28:41 UTC
i've tested this on solaris, linux and win xp...just for sure :)

I've created servlets with czech characters (čřšřěšč) in name...if you select next in servlet wizard, you will see
correctly modified url pattern. Servlet can be run through IDE and firefox display it well...

Only problem is if you finish new servlet wizard in 2nd step = "Name and location"...then url pattern is as it used to
be...with mbyte chars. That should be fixed.

But beside this issue i think tomas's fix is ok.

KAA: please try to reproduce it once again....(id=55332) and (id=55331) are little strange...if 55332 displays servlet
created in 55331...i would expect text "servlet <servlet_name> at /<project_name>", where <servlet_name> should be 3x
that strange jpn char..
Comment 33 Petr Blaha 2008-02-01 12:31:45 UTC
The bug wasn't verified before cut-off date for 6.0.1patch1 and will be included in 6.0.1patch2.
Comment 34 kaa 2008-02-06 17:01:16 UTC
Please provide the build location with the fix.
Comment 35 Tomas Mysik 2008-02-06 17:09:23 UTC
It's not fixed completely yet - see radim's (rroska) comment.
Comment 36 Tomas Mysik 2008-02-13 16:41:34 UTC
Fixed. Please verify, thanks.

changeset:   67256:5c37900c595f
tag:         tip
user:        Tomas Mysik <tmysik@netbeans.org>
date:        Wed Feb 13 17:38:21 2008 +0100
files:       web.core/src/org/netbeans/modules/web/wizards/ServletData.java
description:
#121959: I18N - jsp or servlet with multibyte name does not run ok
Comment 37 kaa 2008-02-26 15:39:34 UTC
I tried to reproduce with:
Product Version: NetBeans IDE Dev (Build 200802191203)
Java: 1.6.0_03; Java HotSpot(TM) Client VM 1.6.0_03-b05
System: Windows XP version 5.1 running on x86; MS932; ja_JP (nb)

Result1:
The Servlet looks ok with its name converted to underscores. Servlet can be run through IDE ok.

Result2:
1. Name the Servlet and its package using 1 mbyte char: 粤
2. Changed relative URL in the project properties (RUN category) using this name /粤
3. Run the app

404 err was shown in browser.
The requested resource () is not available.

Is this expected behavior? If not then please reopen.
   
Comment 38 kaa 2008-02-26 15:41:14 UTC
Created attachment 57280 [details]
image
Comment 39 kaa 2008-02-26 15:45:20 UTC
With eng chars in names of the Servlet and its package everything looks ok.
Comment 40 Radim Roska 2008-02-26 16:19:51 UTC
VERIFIED

This behavior is ok. There must not be mbyte chars in the url. So when creating a servlet, web project, etc., its url is
rewritten to the ___ "shape"... If you change it back, it's your problem...

Comment 41 jinb 2008-03-05 15:09:28 UTC
Backported in release601_fixes branch

Checking in src/org/netbeans/modules/web/wizards/Bundle.properties;
/cvs/web/core/src/org/netbeans/modules/web/wizards/Bundle.properties,v  <--  Bundle.properties
new revision: 1.13.6.1; previous revision: 1.13 
done
Checking in src/org/netbeans/modules/web/wizards/DeployDataPanel.java;
/cvs/web/core/src/org/netbeans/modules/web/wizards/DeployDataPanel.java,v  <--  DeployDataPanel.java
new revision: 1.3.8.1; previous revision: 1.4 
done
Checking in src/org/netbeans/modules/web/wizards/ServletData.java;
/cvs/web/core/src/org/netbeans/modules/web/wizards/ServletData.java,v  <--  ServletData.java
new revision: 1.7.8.1; previous revision: 1.7 
done