34399 – I18N - default encoding is always "EUC-JP" on Ja Solaris

Summary: I18N - default encoding is always "EUC-JP" on Ja Solaris

Status:	VERIFIED FIXED

Alias:	None

Product:	javaee
Classification:	Unclassified
Component:	Code
Version:	-S1S-
Hardware:	Sun Solaris

Importance:	P3 blocker
Assignee:	Petr Pisl

URL:
Keywords:	I18N

Duplicates (1):	35332
Depends on:
Blocks:	40686

Reported:	2003-06-16 03:42 UTC by ohsumi
Modified:	2005-03-31 16:56 UTC
CC List:	5 users

See Also:
Issue Type:	DEFECT
Exception Reporter:

Attachments
sample code of java.nio.charset.Charset (1002 bytes, text/plain) 2003-06-23 12:08 UTC, ohsumi
Patch for the whole issue against the release35 branch (7.10 KB, patch) 2003-06-24 13:12 UTC, Petr Jiricka
New refined diff of the changes (in trunk) (7.61 KB, patch) 2003-06-30 08:45 UTC, Petr Jiricka
Additional changes fixes Keiichi's review (8.51 KB, patch) 2003-10-22 07:27 UTC, capSS
collapsed multibyte chars are placed between title tag in JSP (57.26 KB, image/jpeg) 2004-01-20 11:59 UTC, mtsuruta
logs on browser when executing jsp (1.71 KB, text/plain) 2004-03-01 11:00 UTC, mtsuruta

Description ohsumi 2003-06-16 03:42:54 UTC

Tested build: Nevada ML RC1 build (030607_02)

When a JSP file is created in the New Wizard on Japanese Solaris,
the default encoding is always "EUC-JP". There are three Japanese
locales, ja (ja_JP.eucJP), ja_JP.PCK and ja_JP.UTF-8, so the default
encoding should follow the system locale (the locale in effect when
the IDE was started).

I think the following code is related to this problem.

web/core/src/org/netbeans/modules/web/core/jsploader/JspDataObject.java

    public static String getDefaultEncoding() {
        String language = System.getProperty("user.language");
        if ("ja".equals(language)) { // NOI18N
            // we are Japanese
            if (org.openide.util.Utilities.isUnix())
                return "EUC-JP"; // NOI18N
            else
                return "Shift_JIS"; // NOI18N
        }
        else
            // we are English
            return "ISO-8859-1"; // NOI18N
            // per JSP 1.2 specification, the default encoding is always ISO-8859-1,
            // regardless of the setting of the file.encoding property
            //return System.getProperty("file.encoding", "ISO-8859-1");
    }

Also, when the IDE is run in a Chinese locale, the default encoding is
"ISO-8859-1". To support Chinese locales, I think the default encoding
should follow the system locale, so I would like to request that this
problem be fixed in NB3.5.1 and the Nevada (NB3.5) patch release.

In Nevada ML (en/ja), we will document this in the ja release notes.

Comment 1 Petr Jiricka 2003-06-16 09:34:18 UTC

This sounds like a reasonable suggestion. I talked about 
the default encoding before with Shioda, and we agreed on 
the behavior that's currently in place. I am happy to 
change this if you think the behavior should be different.

However, there is yet another issue: this encoding must 
match the <%@page contentType="text/html;charset=..."%> 
directive in the page. BTW, isn't this currently a bug? 
Shouldn't the file 
web/core/src/org/netbeans/modules/web/core/resources/templates/JSP.template 
also be localized, with the changed encoding? Should I add 
this file to l10n.list?

Comment 2 ohsumi 2003-06-16 10:39:59 UTC

The system locale does not equal the encoding of "charset=" in
a JSP file, so I think it is difficult to identify
the charset encoding from the system locale.
Also, I think the user can add the appropriate encoding to 
<%@page contentType="text/html"%> in the template file,
and that is safe. I think changing the default encoding
should be enough.

Comment 3 Petr Jiricka 2003-06-16 11:03:30 UTC

Well, the truth is that if the value of the Encoding 
property on a JSP file (in the "Text" tab) differs from 
the value specified in the @page directive, then the user 
will run into serious problems when trying to run the 
page. The users have come across this before and it 
resulted in hard-to-debug bugs and we were getting lots of 
complaints. So I believe the IDE must make sure that these 
two values are in sync after the page is created. Many 
users are not savvy enough to set the "charset=" value 
themselves.
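
The consistency requirement described in this comment can be sketched in Java as a small check. This is only an illustration under stated assumptions: the class and method names are hypothetical and not part of the NetBeans sources, and a page without "charset=" is assumed to fall back to ISO-8859-1, the JSP 1.2 default mentioned in the description.

```java
import java.nio.charset.Charset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the NetBeans sources.
class PageCharsetCheck {

    // Captures the value after "charset=" inside the contentType attribute
    // of a <%@page ... %> directive.
    private static final Pattern CHARSET = Pattern.compile(
            "<%@\\s*page\\b[^%]*?contentType\\s*=\\s*\"[^\"]*?charset\\s*=\\s*([^\\s\";]+)",
            Pattern.CASE_INSENSITIVE);

    // Assumption: a page without "charset=" uses the ISO-8859-1 default.
    static String declaredCharset(String jspSource) {
        Matcher m = CHARSET.matcher(jspSource);
        return m.find() ? m.group(1) : "ISO-8859-1";
    }

    // Two names are "the same" if the JVM resolves them to the same Charset;
    // names the JVM does not know are compared as plain strings.
    static boolean sameCharset(String a, String b) {
        try {
            if (Charset.isSupported(a) && Charset.isSupported(b)) {
                return Charset.forName(a).equals(Charset.forName(b));
            }
        } catch (IllegalArgumentException e) {
            // illegal charset name: fall through to the string comparison
        }
        return a.equalsIgnoreCase(b);
    }

    // True when the file's Encoding property agrees with the page directive.
    static boolean inSync(String jspSource, String fileEncoding) {
        return sameCharset(declaredCharset(jspSource), fileEncoding);
    }

    public static void main(String[] args) {
        String page = "<%@page contentType=\"text/html;charset=UTF-8\"%>";
        System.out.println(inSync(page, "UTF8"));       // alias of UTF-8
        System.out.println(inSync(page, "ISO-8859-1")); // mismatch
    }
}
```

Comparing resolved Charset objects rather than raw strings means an alias such as "UTF8" is not reported as a mismatch.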

Comment 4 Ken Frank 2003-06-16 16:40:40 UTC

Can the existing RFEs about encoding and charset
be looked at in the context of this issue?
These RFEs have been around a while, and I was wondering
if the whole issue of encoding and charset for JSPs
and webapps could be looked at again, perhaps as part
of fixing this issue?

The RFEs are 18651, 18652, 7427, 20161.

ken.frank@sun.com

Comment 5 ohsumi 2003-06-17 11:41:55 UTC

I think it would be better that the encoding of "charset=" in the
<%@page contentType="text/html;charset=..."%> 
directive is set dynamically by system locale (we do not
have template files for each locale).
Is it possible that the encoding of "charset" is identified
from system locale?

Comment 6 Petr Jiricka 2003-06-17 12:34:54 UTC

I agree this would be the ideal solution. However, it is 
not easy to do in the NB 3.5 code base. Would it be ok to 
do it in NB 4.0, and think of a simpler solution for NB 
3.5 ? Can you think of a simpler solution that would also 
work (esp. for the Chinese version) ?

Comment 7 ohsumi 2003-06-18 11:26:04 UTC

We have two ideas and would prefer option 1.

1. default encoding - system locale
   charset	    - keep the present (not set "charset=")

2. default encoding - UTF-8
   charset	    - UTF-8

Comment 8 Petr Jiricka 2003-06-18 12:18:55 UTC

Have you verified that solution #1 will work? So if I 
have a file encoded in some exotic encoding (and the page 
contains multibyte characters), and set the encoding 
property to this value, but I don't specify the charset= 
part in the page, will it be possible to successfully 
compile and deploy the page? In the past we had problems 
in this scenario.

Comment 9 ohsumi 2003-06-19 08:48:01 UTC

In the case of #1, we will document a notice 
(urging users to specify the appropriate encoding in "charset=")
in the ja and zh release notes. The current IDE also does not
set "charset=", so I think the description in the release notes
is enough.

Comment 10 Petr Jiricka 2003-06-19 13:08:49 UTC

Sorry, that seems like a half baked solution to me. I 
don't understand how it can work.

I am thinking now it would be better to do a proper fix, 
i.e. set the "charset=" clause dynamically based on the 
JVM encoding. I implemented this change in the NetBeans 
trunk - should appear in tomorrow's build.

If you think this is a showstopper, then we can test this 
fix properly and put it into the ML release. I think this 
solution is better than having the file property and the 
page directive out of synch.

Checking in JspDataObject.java;
/cvs/web/core/src/org/netbeans/modules/web/core/jsploader/JspDataObject.java,v  <--  JspDataObject.java
new revision: 1.26; previous revision: 1.25
done

Comment 11 Keiichi Oono 2003-06-20 09:02:27 UTC

Add I18N keyword.

Comment 12 ohsumi 2003-06-23 12:06:55 UTC

Below are the results of our test of the new jsp.jar.

locale        charset       default encoding
----------------------------------------------------
C             n/a           ISO-8859-1
ja            eucJP         eucJP
ja_JP.PCK     PCK           PCK
ja_JP.UTF-8   UTF-8         UTF-8
Shift JIS     MS932         MS932             (ja windows)
ja_JP.eucJP   EUC-JP-LINUX  ja_JP.eucJP       (ja Linux)
zh_CN.EUC     gb2312        gb2312
zh_CN.GBK     GBK           GBK
zh_CN.UTF-8   UTF-8         UTF-8
euro locales  ISO-8859-15   ISO-8859-15

Each default encoding works correctly, but the charset for
the ja and euro locales is not correct.

For example:
   charset  eucJP -> EUC-JP
   charset  PCK -> Shift_JIS
   charset  MS932 -> Windows_31J
   charset  EUC-JP-LINUX -> EUC-JP
   charset  ISO8859-15 -> ISO-8859-15

To get the correct charset name, could you please use
java.nio.charset.Charset? For details, please see
the attached file. However, java.nio.charset.Charset
does not seem to produce Windows_31J for MS932,
so please change MS932 to Windows_31J forcibly.
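
The attachment itself is not reproduced in this archive. A minimal sketch of the same idea, resolving an alias to the canonical name the JVM reports through java.nio.charset.Charset, might look like the following. The class and method names are assumptions for illustration, and the names actually printed depend on the JDK version and on which charsets it ships.

```java
import java.nio.charset.Charset;

// Hypothetical illustration; not the attached sample.
class CanonicalCharsetName {

    // Returns the JVM's canonical name for an alias, or the alias itself
    // when the JVM does not support (or rejects) the name.
    static String canonicalName(String alias) {
        try {
            if (Charset.isSupported(alias)) {
                return Charset.forName(alias).name();
            }
        } catch (IllegalArgumentException e) {
            // illegal charset name: return it unchanged
        }
        return alias;
    }

    public static void main(String[] args) {
        // Aliases taken from the table above; results vary between JDKs.
        String[] aliases = {"eucJP", "PCK", "MS932", "EUC-JP-LINUX", "ISO8859-15"};
        for (String alias : aliases) {
            System.out.println(alias + " -> " + canonicalName(alias));
        }
    }
}
```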

Comment 13 ohsumi 2003-06-23 12:08:27 UTC

Created attachment 10767 [details]
sample code of java.nio.charset.Charset

Comment 14 Petr Jiricka 2003-06-23 13:16:23 UTC

Yuko, thanks for your advice. (I don't know much about the Charset
class.) I implemented your suggestion and sent you the new jar file by
e-mail.

I don't understand the part about Windows_31J though. When tested with
your patch, it seems that Java does not consider "Windows_31J" to be a
valid encoding. Rather, "windows-31j" is the canonical name, and
that's also what I get if I feed "MS932" into your test. Note the
difference in case, and also the hyphen instead of the underscore.
So is "Windows_31J" really correct? Shouldn't it be "Windows-31J"?

BTW, here is the code I used to produce the default encoding, can you
please review it ?

public static String getDefaultEncoding() {
    String language = Locale.getDefault().getLanguage();
    if (language.startsWith("en")) {
        // we are English
        return "ISO-8859-1"; // NOI18N
        // per JSP 1.2 specification, the default encoding 
        // is always ISO-8859-1, regardless of the setting
        // of the file.encoding property
        // return System.getProperty("file.encoding", 
        // "ISO-8859-1");
    }
    return canonizeEncoding(System.getProperty(
        "file.encoding", "ISO-8859-1"));
}
    
private static final String CORRECT_WINDOWS_31J = "Windows-31J";
    
private static String canonizeEncoding(String encodingAlias) {
    if (Charset.isSupported(encodingAlias)) {
        Charset cs = Charset.forName(encodingAlias);
        String name = cs.name();
        if (name.equalsIgnoreCase(CORRECT_WINDOWS_31J)) {
            return CORRECT_WINDOWS_31J;
        }
        return name;
    } else {
        return encodingAlias;
    }
}

Comment 15 Keiichi Oono 2003-06-23 14:45:27 UTC

I guess "Windows_31J" is a typo. I don't think it's necessary to convert
"windows-31j" to "Windows-31J". I've also checked the attached program.

jdk1.4.1_02
   - when I feed MS932, Charset.isSupported() returns false.

jdk1.4.2 (beta)
   - when I feed MS932, Charset.isSupported() returns true,
     and the Charset.name() method returns "windows-31j"

I think it's a bug in jdk1.4.1, because windows-31j should be returned
as the charset name of MS932. Also, I can't find "EUC-JP-LINUX" as a
charset name in IANA, but it's returned in the Japanese locale on RH7.2.
The IANA website is here:
http://www.iana.org/assignments/character-sets

As a workaround for the above two things, would you check whether the
following can be implemented?
   - If "file.encoding" is "MS932", charset is set to
     "windows-31j"
   - If "file.encoding" is "EUC-JP-LINUX", charset is set to
     "EUC-JP".

Yuko, please correct me if anything is incorrect.
Thank you.
Keiichi

By the way, I don't know why the Japanese charsets are complex like this;
for other East Asian locales, the current implementation seems to work
fine in jdk1.4.1_02.

Comment 16 Petr Jiricka 2003-06-24 13:12:34 UTC

Created attachment 10784 [details]
Patch for the whole issue against the release35 branch

Comment 17 Petr Jiricka 2003-06-30 08:44:11 UTC

Fixed in the NetBeans trunk. Will attach the new refined patch, as the
previous one didn't quite work.

Comment 18 Petr Jiricka 2003-06-30 08:45:21 UTC

Created attachment 10833 [details]
New refined diff of the changes (in trunk)

Comment 19 Keiichi Oono 2003-08-29 13:21:36 UTC

I'm sorry for my late verification. I've verified in the latest
Q-build with:
   j2sdk 1.4.1_02
   j2sdk 1.4.2
   j2sdk 1.4.2_01
The behavior of the Charset class changed in the 1.4.2 release. Would
you add the following name conversions, in the same way as the
existing ones?

   x-EUC-CN -> GB2312
   eucJP-open -> EUC-JP
   x-euc-jp-linux -> EUC-JP

x-EUC-CN
Charset.name() returns x-EUC-CN, but it's not a valid charset name. I've
just filed this as a Java bug (4914869). The charset name should be
"GB2312".

eucJP-open
eucJP-open was added in 1.4.2 as a file encoding name.
System.getProperty("file.encoding") returns "eucJP-open" as the encoding
name, but it's not supported by the Charset class, and "eucJP-open" is not
a valid charset name.
The charset name should be "EUC-JP".

x-euc-jp-linux
This name returned by the JDK is not a registered charset name. When this
value is returned, the charset name in the JSP should be "EUC-JP".

I'm sorry for these additional conversions. I didn't expect the return
value from the JDK to change between versions.
Would you add them to the current fix?
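
The conversions requested here and in Comment 15 can be pictured as a small override table consulted before and after the Charset lookup. The sketch below is not the attached patch; the class name, method name, and table layout are assumptions, and only the five mappings themselves come from this thread.

```java
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of the requested name conversions.
class JspCharsetNames {

    // Keys are lower-cased JDK names; values are the names wanted in "charset=".
    private static final Map<String, String> OVERRIDES = new HashMap<>();
    static {
        OVERRIDES.put("ms932", "windows-31j");
        OVERRIDES.put("euc-jp-linux", "EUC-JP");
        OVERRIDES.put("x-euc-jp-linux", "EUC-JP");
        OVERRIDES.put("eucjp-open", "EUC-JP");
        OVERRIDES.put("x-euc-cn", "GB2312");
    }

    private static String override(String name) {
        return OVERRIDES.get(name.toLowerCase(Locale.ROOT));
    }

    // Maps a file.encoding value to the charset name to write into the page.
    static String toJspCharset(String fileEncoding) {
        // 1. Names the Charset class may not know at all (e.g. eucJP-open).
        String fixed = override(fileEncoding);
        if (fixed != null) {
            return fixed;
        }
        // 2. Otherwise ask the JVM for its canonical name.
        String canonical = fileEncoding;
        try {
            if (Charset.isSupported(fileEncoding)) {
                canonical = Charset.forName(fileEncoding).name();
            }
        } catch (IllegalArgumentException e) {
            // illegal charset name: keep the original value
        }
        // 3. Correct canonical names that are not registered (e.g. x-EUC-CN).
        fixed = override(canonical);
        return fixed != null ? fixed : canonical;
    }

    public static void main(String[] args) {
        System.out.println(toJspCharset(System.getProperty("file.encoding", "ISO-8859-1")));
    }
}
```

Checking the table both before and after the lookup covers the two failure modes reported above: names the Charset class rejects, and canonical names that are not valid in a page directive.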

Comment 20 capSS 2003-10-22 07:27:37 UTC

Created attachment 11927 [details]
Additional changes fixes Keiichi's review

Comment 21 Antonin Nebuzelsky 2003-11-04 14:57:58 UTC

Fixed also in Nevada Patch 1 and in Arrow.

Comment 22 mtsuruta 2003-12-05 07:39:40 UTC

I have verified these fixes using Nevada Patch1 and j2sdk 1.4.1_05,
not Arrow.
I checked that the charset of the @page directive and the encoding
property in the Text tab are set properly for the JSP file, as follows.


          OS       locale.lang   encoding type
                                 (text and @page)
====================================================
Japan     Sol8     ja            EUC-JP      -- OK
                   ja_JP.PCK     Shift_JIS   -- OK
                   ja_JP.UTF-8   UTF-8       -- OK
          Win2k        -         Windows-31j -- OK
          Linux    (default)     EUC-JP      -- OK
          
----------------------------------------------------
China     Solaris  zh            GB2312      -- OK
                   zh_CN.GB18030 GB2312      -- OK
                   zh_CN.UTF-8   UTF-8       -- OK
                   zh_CN.GBK     GB2312      -- OK         
          Win2k        -         GB2312      -- OK
          
----------------------------------------------------
Taiwan    Sol8     zh_TW         Big5        -- OK
----------------------------------------------------
France    Sol8     fr_FR.ISO8859-1 ISO8859-1 -- OK *
----------------------------------------------------
German    Sol8     de_DE.ISO8859-1 ISO8859-1 -- OK *
----------------------------------------------------
* no "charset=" in @page directive as the default

Comment 23 Petr Jiricka 2003-12-05 10:30:22 UTC

Excellent. Since we've done some changes in the encoding handling area
in the NetBeans 3.6 code base, I suggest this is also retested on the
current NetBeans trunk builds.

Comment 24 Ana.von Klopp 2003-12-05 23:34:15 UTC

Guys, I'm reopening this because I'm not convinced we are doing the 
right thing. Maybe we are, and it's just difficult to follow the 
behaviour from the issue entries. So please bear with me... If 
everything is OK, just update the issue with the details and we can 
close it again. 

Here is the problem I have: the page encoding in the JSP is used for 
two things. Firstly, it's used to read in the JSP file when the 
container compiles it. Secondly, it is used for the HTTP response, in 
case the response encoding has not been set explicitly.

Because of this second use of the page encoding, I don't think that 
the approach that was chosen here (based on the last comments from 
Tsuruta-san), where we set the encoding to the system default, 
is the correct one. We chose it on the basis of what the server 
where the development is done supports, but it is potentially used to 
create the HTTP response, and the response will be read by many 
different types of hosts (especially when it's Windows character sets).

The JSP loader's charset and the page encoding have to match and *can* 
be set to something that's suitable for the development host, but the 
response charset needs to be set to UTF-8 (which works on all the 
browsers since a few versions back), or if you're supporting PDAs, to 
something that's determined dynamically because of the client. 
Further, there is no reason to set the page encoding to anything but 
UTF-8 if you're only going to work on the JSPs in the IDE. The only 
reason to set it to something else is if you're going to use another 
tool to edit the JSPs and it doesn't support UTF-8. So in fact, Yuko's 
suggestion (2) was the best one. 

I hope I've explained this clearly. FWIW, I think we need a spec for 
all the i18n features in webapps, so that we can review it all in one 
go.

Comment 25 Petr Jiricka 2004-01-15 19:29:04 UTC

Reassigning to our new i18n guy.

Comment 26 Petr Pisl 2004-01-19 15:01:57 UTC

Fixed in 3.6. You can reopen this bug in bugtraq.


The solution is described in #7427: the default encoding for JSP
pages is set to UTF-8.

Comment 27 Ken Frank 2004-01-19 15:40:43 UTC

Couple of questions

- last comment says its ok to open back into bugtraq ? Should
bugs in this area be opened in BT ?

- could someone summarize the solution, since there are many comments
on this bug, and it will help us for testing to know the specific spec
of the solution.
(the note says the solution in 7427 was used, but that issue is also
complex and has many comments, so a summary of both issues will be
helpful for us in being able to verify them)

- Ana's comments below raised a concern about solving for the http
response also - does the solution to this issue address that concern?

ken.frank@sun.com
1/17/2003

Comment 28 Petr Jiricka 2004-01-19 17:39:33 UTC

> - last comment says its ok to open back into bugtraq ? Should
> bugs in this area be opened in BT ?

No, Petr meant to say that if the desire is to continue tracking this
issue for the purpose of the Arrow release, then a bugtraq bug should
be filed (as all Arrow bugs are tracked in bugtraq). All bugs against
open source trunk / NB 3.6 should continue to be tracked in issuezilla.

I'll let Petr speak to the other questions.

Comment 29 mtsuruta 2004-01-20 11:43:10 UTC

I have a couple of questions.
- If, for some reason, a user needs to create JSPs with a non-UTF-8
encoding, is there any approach for it?
- It seems that JSPs which were created in another IDE in a non-UTF-8
encoding display multibyte chars garbled. Is there any way for the IDE
to read a JSP that is not saved in UTF-8 and has no page encoding or
charset setting?

Comment 30 mtsuruta 2004-01-20 11:59:22 UTC

Created attachment 12971 [details]
collapsed multibyte chars are placed between title tag in JSP

Comment 31 Petr Pisl 2004-01-21 17:50:28 UTC

The Encoding property for JSP files was removed. The editor
(BaseJspEditorSupport) asks the JSP parser for the encoding when
loading and saving files. If the encoding is supported, the file is
loaded or saved in that encoding; otherwise the user is informed that
the file will be loaded in UTF-8, and on saving the IDE asks the user
whether to save in UTF-8 or not to save. When the user creates a new
JSP page, the page has pageEncoding="UTF-8".

The JSP parser is the same parser that is used in Tomcat 5. So if
Tomcat recognizes the encoding, then so does the IDE. The parser
obtains the encoding from the <page-encoding> element of a
<jsp-property-group> element in web.xml, from the pageEncoding
attribute of the page directive, or from the contentType attribute.
So the information about the encoding is held by the page itself or
by the deployment descriptor.

When you take a page from another IDE and the JSP parser is not able
to recognize the encoding for this page, the parser returns the
default, which is ISO-8859-1.
You can do a simple test: put the page on a standalone server in a web
module which doesn't define <page-encoding> in the deployment
descriptor. The page will not be displayed correctly.
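
The lookup described in this comment can be summarized as a fallback chain. The sketch below only illustrates that idea and is not the Tomcat parser's code: the class and method names are hypothetical, and the precedence (web.xml first, then pageEncoding, then the contentType charset) is an assumption based on the order in which the sources are listed above.

```java
// Hypothetical illustration of the encoding fallback chain.
class PageEncodingResolver {

    // Default returned when no source specifies an encoding.
    static final String DEFAULT_ENCODING = "ISO-8859-1";

    // Each argument may be null or empty when that source is absent.
    static String resolve(String webXmlPageEncoding,
                          String pageEncodingAttribute,
                          String contentTypeCharset) {
        if (isSet(webXmlPageEncoding)) {
            return webXmlPageEncoding.trim();
        }
        if (isSet(pageEncodingAttribute)) {
            return pageEncodingAttribute.trim();
        }
        if (isSet(contentTypeCharset)) {
            return contentTypeCharset.trim();
        }
        return DEFAULT_ENCODING;
    }

    private static boolean isSet(String value) {
        return value != null && !value.trim().isEmpty();
    }

    public static void main(String[] args) {
        // A page from another IDE with no encoding information at all.
        System.out.println(resolve(null, null, null)); // ISO-8859-1
        // A new page created by the IDE with pageEncoding="UTF-8".
        System.out.println(resolve(null, "UTF-8", null)); // UTF-8
    }
}
```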

Comment 32 mtsuruta 2004-01-22 10:25:54 UTC

I added the <page-encoding> element to web.xml, but multibyte chars
could not be loaded in the IDE for JSPs without a page directive. Could
you please check whether this test method is proper?
1. Added the element below to web.xml and saved it.
2. Saved JSPs with no tags on nb36, and reopened them.
       <jsp-property-group>
            <url-pattern>*.jsp</url-pattern>
            <page-encoding>UTF-8</page-encoding>
        </jsp-property-group>

It seems all JSPs with no page directive are saved and loaded in
ISO-8859-1, even if I create the JSP on nb3.6 and not another IDE.
Considering the include directive, the user needs a way to
specify the encoding type in the IDE without a page directive.
At least, a JSP needs to be saved in UTF-8 by default if Yuko-san's
2nd proposal is followed.

Test:Created JSPs on another IDE and displayed on nb3.6
Saved as  | Page Tag| Result
----------------------------
Shift_JIS | Added   | OK
          | NotAdded| *2
----------------------------
UTF-8     | Added   | OK
          | NotAdded| *1
----------------------------
EUC-JP    | Added   | OK
          | NotAdded| *3

nb36:bld200401151900

*1 - Loaded but multibyte chars are garbaged
*2, *3 - JSP does not load on Editor.
workaround: Right-click on Editor and select "Clone Document".

Comment 33 Petr Pisl 2004-01-23 13:02:23 UTC

You are right. The problem is that the parser doesn't recognize the
encoding in the web.xml.

Comment 34 Petr Pisl 2004-02-03 12:35:20 UTC

*** Issue 35332 has been marked as a duplicate of this issue. ***

Comment 35 Petr Pisl 2004-02-11 15:12:31 UTC

Hi,

I found out where the problem is with the parser. The parser uses a
cache for data. The information from the web.xml file is stored in this
cache as well, but the parser doesn't know about changes in the web.xml,
so it doesn't update the information in the cache. I fixed the
problem and committed the fix to the trunk.

So when the web.xml file contains something like
   <jsp-config>
        <jsp-property-group>
            <url-pattern>*.jsp</url-pattern>
            <page-encoding>ISO-8859-2</page-encoding>
        </jsp-property-group>
    </jsp-config>

and the jsp file sets neither pageEncoding nor charset, then the
encoding from the web.xml is used. Of course, the jsp file has to
match the url pattern.
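
To make the lookup concrete, here is a sketch that reads <page-encoding> from a descriptor like the one above for a given JSP path. It is not the parser's implementation: the class and method names are hypothetical, and the url-pattern matching is deliberately simplified to extension patterns ("*.jsp"), prefix patterns ("/dir/*") and exact paths.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical illustration; simplified url-pattern matching.
class WebXmlPageEncoding {

    // Returns the <page-encoding> of the first <jsp-property-group> whose
    // <url-pattern> matches jspPath, or null when none applies.
    static String pageEncodingFor(String webXml, String jspPath) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(webXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("web.xml could not be parsed", e);
        }
        NodeList groups = doc.getElementsByTagName("jsp-property-group");
        for (int i = 0; i < groups.getLength(); i++) {
            Element group = (Element) groups.item(i);
            NodeList encodings = group.getElementsByTagName("page-encoding");
            if (encodings.getLength() == 0) {
                continue; // this group does not set an encoding
            }
            NodeList patterns = group.getElementsByTagName("url-pattern");
            for (int j = 0; j < patterns.getLength(); j++) {
                if (matches(patterns.item(j).getTextContent().trim(), jspPath)) {
                    return encodings.item(0).getTextContent().trim();
                }
            }
        }
        return null;
    }

    // Simplified matching: "*.ext", "/prefix/*", or an exact path.
    static boolean matches(String pattern, String path) {
        if (pattern.startsWith("*.")) {
            return path.endsWith(pattern.substring(1));
        }
        if (pattern.endsWith("/*")) {
            return path.startsWith(pattern.substring(0, pattern.length() - 1));
        }
        return pattern.equals(path);
    }

    public static void main(String[] args) {
        String webXml = "<web-app><jsp-config><jsp-property-group>"
                + "<url-pattern>*.jsp</url-pattern>"
                + "<page-encoding>ISO-8859-2</page-encoding>"
                + "</jsp-property-group></jsp-config></web-app>";
        System.out.println(pageEncodingFor(webXml, "/index.jsp"));  // ISO-8859-2
        System.out.println(pageEncodingFor(webXml, "/index.html")); // null
    }
}
```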

There is still a minor issue where stale cache data is used. It
happens in this case:
1) start editing a jsp page which sets neither pageEncoding nor
charset, while the web.xml file sets the encoding for this page in the
<jsp-config> element.
2) change the encoding in the web.xml
3) save the change in the web.xml file.
4) save the changes in the jsp file which you had started to edit
before saving the web.xml file.

As a result, the old encoding is used for saving, but if you save the
page again, the right encoding is used. So I think this is not as
important a problem as the original issue, and I set the priority to
P3.
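
The thread does not show how the committed fix refreshes the cache. As a general illustration of avoiding stale reads, the sketch below keys a cached value on a modification stamp (for example the descriptor's last-modified time) and reloads whenever the stamp changes. This is a swapped-in technique, not the NetBeans implementation, and all names are hypothetical.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical illustration: a single-value cache validated by a stamp.
class StampedCache<T> {

    private final LongSupplier stamp;  // e.g. () -> webXmlFile.lastModified()
    private final Supplier<T> loader;  // e.g. re-parse web.xml
    private long cachedStamp;
    private T cachedValue;
    private boolean loaded;

    StampedCache(LongSupplier stamp, Supplier<T> loader) {
        this.stamp = stamp;
        this.loader = loader;
    }

    // Reloads the value on first use and whenever the stamp has changed.
    synchronized T get() {
        long current = stamp.getAsLong();
        if (!loaded || current != cachedStamp) {
            cachedValue = loader.get();
            cachedStamp = current;
            loaded = true;
        }
        return cachedValue;
    }

    public static void main(String[] args) {
        long[] modified = {1L};
        String[] encoding = {"EUC-JP"};
        StampedCache<String> cache =
                new StampedCache<>(() -> modified[0], () -> encoding[0]);
        System.out.println(cache.get()); // EUC-JP
        encoding[0] = "UTF-8";           // web.xml edited...
        modified[0] = 2L;                // ...and saved
        System.out.println(cache.get()); // UTF-8
    }
}
```

Checking the stamp on every read trades a cheap comparison for the guarantee that a saved descriptor is seen by the next save of the JSP.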

Comment 36 mtsuruta 2004-03-01 10:29:51 UTC

I have tested JSPs created on nb36 with the following steps.

1. Mount a new dir.
2. Create a web module.
3. Add <jsp-config>... to web.xml, and save it.
4. Create a JSP which has include directives for a jsp and an html file.
    <jsp:include page="test.jsp" flush="true" />
    <%@ include file="incHTML.html" %>
5. Create the included jsp and html, which contain a ja message.
6. Create a JSP segment and a JSP document with the template wizard.
7. Load the created jsps and html.
8. Execute the jsp created at step 4.

OS       locale.lang   encoding    result
                       (@page)     execution=e load=l
====================================================
Sol8     ja            EUC-JP      l -- OK *1
                                   e -- OK *2
         ja_JP.PCK     Shift_JIS   l -- OK *1
                                   e -- OK *2
         ja_JP.UTF-8   UTF-8       l -- OK *1
                                   e -- OK *2
Win2k        -         Windows-31j l -- OK *1
Linux    (default)     EUC-JP      e -- OK *1
                                   l -- OK *2
                                   
*1 ... JSP, JSP document, and html displayed multibyte chars properly
in the IDE, but the JSP segment did not.
       
*2 ... JSP executed properly, but values set by the included html are
garbled.

Comment 37 mtsuruta 2004-03-01 10:53:48 UTC

Created JSPs with various page encoding types in another IDE.
- Set up web.xml.
- Specified the @page directive in the jsp, which has include directives
for an html file and a jsp segment.
- The included jsp and html contain a ja message.


OS       locale.lang   jsp encoding    result
                       charset      execution=e load=l
====================================================
Sol8     ja            EUC-JP      l -- X *3
                                   e -- X *4
         ja_JP.PCK     Shift_JIS   l -- X *3
                                   e -- X *4
         ja_JP.UTF-8   UTF-8       l -- OK
                                   e -- X *2
Win2k        -         Windows-31j l -- X *6
Linux    (default)     EUC-JP      e -- OK *5
                                   l -- OK *6
                                   
*2 - JSP Executed properly but values settled by included html is
garbaged.
*3 - JSP displayed ja message garbaged on ide, but not HTML.
*4 - JSP displayed ja message garbaged on browser as ?????.
*5 - JSP is not loaded properly on ide(ja is garbaged), but html.
*6 - JSP had http error 500. please see attached log for more detail.

Comment 38 mtsuruta 2004-03-01 11:00:37 UTC

Created attachment 13738 [details]
logs on browser when executing jsp

Comment 39 mtsuruta 2004-03-03 08:50:44 UTC

All JSPs created in another IDE with various page encodings are
displayed properly in the IDE.
I had verified with the wrong version of web.xml in the previous
comment. I am sorry for the confusion.

Here is the correct result of loading the JSPs in the IDE:
             locale.lang   jsp file encoding    result of loading
                               &charset         
         ====================================================
         Sol8     ja            EUC-JP    OK
                                Shift_JIS OK
                                UTF-8     OK
                                PCK       OK

         Win2k        -         EUC-JP     OK
                                Shift_JIS  OK
                                UTF-8      OK
                                PCK        OK

         Linux    (default)     EUC-JP     OK                         
                                Shift_JIS  OK
                                UTF-8      OK
                                PCK        OK

- <url-pattern> of web.xml is "*.jsp".

Comment 40 Petr Jiricka 2004-03-03 18:09:41 UTC

Ok, so is this issue fixed, or is there any pending item? Can we mark
as fixed?

Comment 41 Petr Pisl 2004-03-04 09:51:06 UTC

I don't think so. There are still two issues:

1. Bad recognition of the encoding by the JSP parser. If the user has
defined both pageEncoding and charset in a jsp page, then the parser
returns the encoding from charset, not from pageEncoding.

2. Stale cache data is used. It happens in this case:

1) start editing a jsp page which sets neither pageEncoding nor
charset, while the web.xml file sets the encoding for this page in the
<jsp-config> element.
2) change the encoding in the web.xml
3) save the change in the web.xml file.
4) save the changes in the jsp file which you had started to edit
before saving the web.xml file.

As a result, the old encoding is used for saving, but if you
save the page again, the right encoding is used.

We can close this bug (if everybody agrees) and file two new issues
for the problems I describe above.

Comment 42 Keiichi Oono 2004-03-05 08:10:31 UTC

Mika (mtsuruta) and I agree to close this bug.
Please allow me to ask a question. The two listed issues do not seem
to be a problem in my environment. Probably I don't understand them
well enough.

above case 1,
Is it acceptable to set two different encodings in charset and
pageEncoding? I can't think of any case in which the user needs to set
two different encodings in charset and pageEncoding.

above case 2,
I've tested by following the steps of case #2 above, but the encoding
works fine.
  step 1 -> 2 -> 3 -> 4  : JSP is saved correctly
  step 1 -> 2 -> 4 : JSP is saved with the previous encoding.
                     I think it's because web.xml is not saved.
In my environment, the cache works fine for this implementation.

Comment 43 Petr Pisl 2004-03-05 12:38:32 UTC

case 1)
See my last comment in issue #40780

case 2)
I have some debug messages in the saving and opening methods. If you
do this quickly, then the bug happens. The cache is automatically
refreshed after 2 seconds.

And finally, I filed a new issue which is connected with encoding -
issue #40791.

So I am closing this bug as fixed and we can follow the new issues.

Comment 44 Ken Frank 2005-03-31 16:56:07 UTC

verified