Bug 40780 - I18N - pageEncoding property is always UTF-8
Status: RESOLVED INVALID
Product: javaee
Classification: Unclassified
Component: Code
Version: 3.x
Hardware: Sun Solaris
Priority: P2
Target Milestone: 3.x
Assigned To: issues@javaee
Petr Pisl
Keywords: I18N
Reported: 2004-03-05 07:54 UTC by Keiichi Oono
Modified: 2004-03-05 16:23 UTC (History)
Issue Type: DEFECT
Description Keiichi Oono 2004-03-05 07:54:19 UTC
NetBeans Q-Build 200402241900

to reproduce:
  - create web module
  - edit web.xml
    add <page-encoding> element to specify
    encoding (e.g. EUC-JP), save and close
  - create JSP

JSP is generated as follows:
------
<%@page contentType="text/html; charset=EUC-JP"%>
<%@page pageEncoding="UTF-8"%>
<html>...
------

I think charset, pageEncoding, and <page-encoding>
need to be the same. pageEncoding should be set
from <page-encoding>, the same way charset is.
Could you share your thoughts?
Comment 1 Keiichi Oono 2004-03-05 07:54:46 UTC
add I18N keyword
Comment 2 Petr Pisl 2004-03-05 12:28:35 UTC
Here are the relevant parts of the JSP 2.0 specification:

---------------------------------------------------
JSP.3.3.4
It is a translation-time error to name different encodings in the
pageEncoding attribute of the page directive of a JSP page and in a
JSP configuration element matching the page. It is also a
translation-time error to name different encodings in the prolog /
text declaration of the document in XML syntax and in a JSP
configuration element matching the document. It is legal to name the
same encoding through multiple mechanisms.

JSP.4.1
The page character encoding is the character encoding in which the JSP
page or tag file itself is encoded. The character encoding is
determined for each file separately, even if one file includes another
using the include directive
...
For JSP pages in standard syntax, the page character encoding is
determined from the following sources:
- A JSP configuration element page-encoding value whose URL pattern
matches the page.
- The pageEncoding attribute of the page directive of the page. It is a
translation-time error to name different encodings in the
pageEncoding attribute of the page directive of a JSP page and in a
JSP configuration element whose URL pattern matches the page.
- The charset value of the contentType attribute of the page
directive. This is used to determine the page character encoding if
neither a JSP configuration element page-encoding nor the pageEncoding
attribute are provided.
- If none of the above is provided, ISO-8859-1 is used as the default
character encoding.

JSP.4.2
The initial response character encoding is set to the CHARSET value of
the contentType attribute of the page directive. If the page doesn't
provide this attribute or the attribute doesn't have a CHARSET value,
the initial response character encoding is determined as follows:
- For documents in XML syntax, it is UTF-8.
- For JSP pages in standard syntax, it is the character encoding
specified by the pageEncoding attribute of the page directive or by a
JSP configuration element page-encoding whose URL pattern matches the
page. Only the character encoding specified for the requested page is
used; the encodings of files included via the include directive are
not taken into consideration. If there's no such specification, no
initial response character encoding is passed to
ServletResponse.setContentType() - the ServletResponse object's
default, ISO-8859-1, is used.
---------------------------------------------------
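
For illustration only, the JSP.4.1 lookup order quoted above can be sketched in Java. This is not code from NetBeans or from any JSP container: the class name PageEncodingSketch, the method resolvePageEncoding, and the convention that null means "source not present" are all assumptions, and comparing encoding names with equalsIgnoreCase is a simplification.

```java
// Hypothetical helper written for this discussion; not taken from any JSP container.
final class PageEncodingSketch {

    /** Default named in the quoted JSP.4.1 text. */
    static final String DEFAULT_ENCODING = "ISO-8859-1";

    /**
     * Resolves the page character encoding of a JSP page in standard syntax.
     * Each argument may be null when that source is absent (assumed convention).
     */
    static String resolvePageEncoding(String configPageEncoding,
                                      String pageEncodingAttr,
                                      String contentTypeCharset) {
        // Quoted JSP.3.3.4: naming different encodings in <page-encoding> and
        // in the pageEncoding attribute is a translation-time error.
        // Case-insensitive name comparison is a simplifying assumption.
        if (configPageEncoding != null && pageEncodingAttr != null
                && !configPageEncoding.equalsIgnoreCase(pageEncodingAttr)) {
            throw new IllegalStateException(
                    "page-encoding (" + configPageEncoding
                    + ") and pageEncoding (" + pageEncodingAttr + ") differ");
        }
        if (configPageEncoding != null) {
            return configPageEncoding;   // <page-encoding> matching the page
        }
        if (pageEncodingAttr != null) {
            return pageEncodingAttr;     // pageEncoding attribute of the page directive
        }
        if (contentTypeCharset != null) {
            return contentTypeCharset;   // charset of contentType, used only as a fallback
        }
        return DEFAULT_ENCODING;         // none of the above is provided
    }

    public static void main(String[] args) {
        // Same encoding named through two mechanisms is legal: prints EUC-JP
        System.out.println(resolvePageEncoding("EUC-JP", "EUC-JP", "UTF-8"));
        // Only charset is given: prints EUC-JP
        System.out.println(resolvePageEncoding(null, null, "EUC-JP"));
        // Nothing is given: prints ISO-8859-1
        System.out.println(resolvePageEncoding(null, null, null));
    }
}
```

In this sketch, passing "EUC-JP" as the configuration value together with "UTF-8" as the pageEncoding attribute throws, which models the translation-time error described in JSP.3.3.4.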

So, as you can see, pageEncoding and charset serve different
purposes and may differ.
- page-encoding and pageEncoding specify the encoding of the file
itself. If neither is defined for a file, the value of charset is
used, or the default encoding (ISO-8859-1) if charset is not defined
either.
- The value of charset is used for the response encoding, so it can
differ from the encoding of the file. If no charset value is defined,
then pageEncoding, page-encoding, or the default encoding is used for
the response encoding.
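
As a rough sketch of the response-encoding side (the quoted JSP.4.2 rules), the following Java shows how a page stored in one encoding can respond in another. The class ResponseEncodingSketch, its method, and the use of null for "not specified" are assumptions made for this example; it does not reflect how NetBeans or Tomcat implement this.

```java
// Hypothetical helper written for this discussion; not taken from any JSP container.
final class ResponseEncodingSketch {

    /**
     * Returns the effective initial response character encoding.
     * Each String argument may be null when not specified (assumed convention).
     */
    static String resolveResponseEncoding(String contentTypeCharset,
                                          String pageEncodingAttr,
                                          String configPageEncoding,
                                          boolean xmlSyntax) {
        if (contentTypeCharset != null) {
            return contentTypeCharset;   // CHARSET value of contentType comes first
        }
        if (xmlSyntax) {
            return "UTF-8";              // documents in XML syntax
        }
        if (pageEncodingAttr != null) {
            return pageEncodingAttr;     // pageEncoding attribute of the page directive
        }
        if (configPageEncoding != null) {
            return configPageEncoding;   // <page-encoding> matching the page
        }
        // Per the quoted text, nothing is passed to setContentType() here;
        // the ServletResponse default applies.
        return "ISO-8859-1";
    }

    public static void main(String[] args) {
        // File stored as EUC-JP, response sent as UTF-8: prints UTF-8
        System.out.println(resolveResponseEncoding("UTF-8", "EUC-JP", null, false));
        // No charset given, so the page encoding is reused: prints EUC-JP
        System.out.println(resolveResponseEncoding(null, "EUC-JP", null, false));
    }
}
```

The first call in main is the case where the two settings legitimately differ: pageEncoding describes how the file is stored, while charset controls what is sent to the client.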

I'm closing this bug as invalid.
There is a separate bug in Tomcat's parser; see issue #40791.
Comment 3 Keiichi Oono 2004-03-05 16:23:39 UTC
Thank you very much for your detailed clarification. I understand this
should be closed as invalid. But I'm still confused about how NB
currently handles the following three settings:
   charset, pageEncoding, and <page-encoding>
Please allow me to add comments in issue #40791.

