18866 – I18N - Output Window displays Japanese characters badly using internal execution

Summary: I18N - Output Window displays Japanese characters badly using internal execution

Status:	VERIFIED FIXED

Alias:	None

Product:	platform
Classification:	Unclassified
Component:	Execution
Version:	3.x
Hardware:	Other Windows ME/2000

Importance:	P3 blocker
Assignee:	David Strupl

URL:
Keywords:	I18N

Depends on:
Blocks:

Reported:	2001-12-20 13:24 UTC by Honza Firich
Modified:	2008-12-22 20:45 UTC
CC List:	5 users

See Also:
Issue Type:	DEFECT
Exception Reporter:

Attachments
source file (840 bytes, text/plain) 2001-12-20 13:26 UTC, Honza Firich
correct output - external execution (3.13 KB, image/gif) 2001-12-20 13:27 UTC, Honza Firich
wrong output - internal execution (3.16 KB, image/gif) 2001-12-20 13:28 UTC, Honza Firich
Print something in Turkish about taking a bus (475 bytes, text/x-java) 2001-12-20 13:59 UTC, Jesse Glick
Czech test case + gifs of Output Window (33.33 KB, application/octet-stream) 2002-04-23 11:59 UTC, Jiri Skrivanek
There is the fix, but it is done on the sierrafixes branch. (1.26 KB, patch) 2002-11-27 13:40 UTC, Petr Pisl

Description Honza Firich 2001-12-20 13:24:27 UTC

Hi,

I am working with Japanese Forte and I found a problem with
support of Japanese characters in the Output Window.


Hello.java
----------
When I ran this file with external execution, everything was OK.

But when I ran it with INTERNAL execution, only

        System.out.println("&#26085;&#26412;&#35486;");
        System.out.println("\u65e5\u672c");

was correct.

I tried this on Windows and on Solaris as well.


Environment:
------------
Windows: 2000 Pro - English, Locale - Japanese (as default)
Solaris: Solaris 8, locale - ja

Java: 	Solaris: jdk 1.3.1_02-b02
	Windows: jdk 1.3.1-b24

Forte for Java - Orion, Japanese version
Netbeans 3.3 - release

Comment 1 Honza Firich 2001-12-20 13:26:30 UTC

Created attachment 3904 [details]
source file

Comment 2 Honza Firich 2001-12-20 13:27:48 UTC

Created attachment 3905 [details]
correct output - external execution

Comment 3 Honza Firich 2001-12-20 13:28:27 UTC

Created attachment 3906 [details]
wrong output - internal execution

Comment 4 Jesse Glick 2001-12-20 13:58:03 UTC

AFAIK nothing is wrong with the Output Window. As I understand it, it
accepts whatever input it is given and displays it. It is up to Java
to determine the encodings used in I/O.

In fact in my tests it works as expected: internal execution preserves
strange chars, and external does not (passes through the OS which
munges everything to 8 bits). I use Linux, no special language
support, platform default encoding ISO8859-1.

Run the attached test class using internal and external execution. It
uses non-ISO8859-1 characters (Turkish accents). Internal works: you
can see dotless 'i', 's'-cedilla, palatal 'g', and so on. In external
mode, only ASCII chars and 128-255 chars like 'o'/'u' umlaut are
preserved, others show up as '?'. Tested in [r33 dec 20].

I suspect your calls setting the encoding of the output stream and so
on are just not correct, though I would not know offhand how to fix
them. Remember that the Output Window does not know what your program
did: it just receives bytes from System.out. Note that it is harder to
evaluate bugs using Japanese locale because Japanese fonts are not
available by default in the JDK, unlike miscellaneous European,
Indian, Hebrew, etc. fonts.
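
To illustrate the '?' behaviour described above, here is a minimal sketch, assuming the external path behaves like an ISO8859-1 byte pipe. The class and method names (Latin1RoundTrip, throughLatin1) are hypothetical and are not taken from the attached test class. It uses java.nio.charset.StandardCharsets, which needs Java 7 or later, so it would not compile on the JDK 1.3 mentioned in this report.

```java
import java.nio.charset.StandardCharsets;

class Latin1RoundTrip {
    /**
     * Encodes text to ISO-8859-1 bytes and decodes it back, as an 8-bit
     * pipe with that encoding would. String.getBytes(Charset) replaces
     * unmappable characters with the charset's replacement byte ('?').
     */
    static String throughLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // \u0131 = dotless i, \u015f = s-cedilla, \u00fc = u-umlaut
        String turkish = "\u0131\u015f\u00fc";
        // Only the u-umlaut is in the 128-255 range and survives.
        System.out.println(throughLatin1(turkish).equals("??\u00fc")); // prints true
    }
}
```

Characters outside ISO8859-1 are lost at the encoding step, before anything reaches the display, which is consistent with the Output Window only showing what it receives.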

Comment 5 Jesse Glick 2001-12-20 13:59:13 UTC

Created attachment 3907 [details]
Print something in Turkish about taking a bus

Comment 6 Honza Firich 2001-12-20 14:56:24 UTC

Hi Jesse,

I tried Test18866 on Solaris - tr_TR.ISO8859-9 and both external and 
internal execution work fine.

In Hello.java:
System.out.println("&#26085;&#26412;&#35486;");
System.out.println("\u65e5\u672c");
work fine for external execution and internal execution as well.

Problem is with 
PrintWriter p = new PrintWriter(o);
p.println("Hello &#26085;&#26412;");
p.flush();

new PrintWriter(new OutputStreamWriter(System.out, "MS932"), true).println("Hello &#26085;&#26412;&#35486;");
new PrintWriter(new OutputStreamWriter(System.out, "MS932"), true).println("\u65e5\u672c");

and it makes no difference whether I set up codepage or not
OutputStreamWriter o = new OutputStreamWriter(System.out, "MS932");
or 
OutputStreamWriter o = new OutputStreamWriter(System.out, "EUC_JP");
- internal execution is always bad.

I am not sure if this is a bug of the Output Window, but I think it 
is a bug. Could you please tell me where I should log this bug?

Thanks,
Honza

Comment 7 Jesse Glick 2001-12-20 16:00:40 UTC

This is the correct category for the bug report.

I assume by "&#NNNNN;" you mean you really type that non-ASCII
character into sources. Fine, but to keep things simple let us
consider only \uXXXX escapes, as then the compiler encoding does not
affect the results.

Ales should be able to diagnose this more. Can you confirm then that
your problem is specific to Japanese, and does not occur e.g. with
Turkish? I assume your platform default encoding is set to something
Japanese-related. Your success with the external version of the
Turkish test can probably be attributed to using ISO8859-9 encoding
for it; I did not change my platform encoding. I know other various
encodings work fine in internal execution, e.g. Devanagari (Hindi
etc.) works with no special setup.

Perhaps internal execution only understands the characters if you
directly use print() methods of PrintStream? It may directly pass
those through without byte translation. In that case you would want to
use an OutputStreamWriter with the platform default encoding in the
case of external execution, and direct calls to PrintStream in the
case of internal execution.
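
The two output paths contrasted above can be sketched as follows. This is only an illustration under stated assumptions: an ordinary byte-oriented PrintStream over a ByteArrayOutputStream stands in for System.out, UTF-8 stands in for the platform default encoding, and the names TwoOutputPaths, viaPrint, viaWriter and sameBytes are hypothetical. It says nothing about how the IDE's replacement stream is implemented.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintStream;
import java.io.Writer;
import java.util.Arrays;

class TwoOutputPaths {
    static final String ENC = "UTF-8"; // assumption: stand-in for the platform default

    /** Path 1: hand the String straight to PrintStream.print(). */
    static byte[] viaPrint(String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(sink, true, ENC);
        out.print(text);
        out.flush();
        return sink.toByteArray();
    }

    /** Path 2: encode in an OutputStreamWriter, so the stream only sees bytes. */
    static byte[] viaWriter(String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(sink, true, ENC);
        Writer writer = new OutputStreamWriter(out, ENC);
        writer.write(text);
        writer.flush();
        return sink.toByteArray();
    }

    static boolean sameBytes(String text) {
        try {
            return Arrays.equals(viaPrint(text), viaWriter(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameBytes("\u65e5\u672c")); // prints true
    }
}
```

On a plain byte stream both paths end up with the same bytes when the encodings match. A stream that handles print(String) at the character level, as hypothesised above for internal execution, would only ever see raw bytes on the second path and would have to decode them itself.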

Comment 8 Jan Chalupa 2002-01-11 14:04:30 UTC

Target milestone -> 3.4

Comment 9 Jan Chalupa 2002-01-11 14:08:24 UTC

Target milestone -> 3.4

Comment 10 Jan Chalupa 2002-01-11 14:09:19 UTC

Target milestone -> 3.4

Comment 11 Jan Chalupa 2002-01-11 14:11:54 UTC

Target milestone -> 3.4

Comment 12 akemr 2002-02-08 14:22:10 UTC

*** Issue 20331 has been marked as a duplicate of this issue. ***

Comment 13 akemr 2002-02-08 14:26:58 UTC

It looks like the problem is in execution, not the OW, because
running with external execution is fine.

So I'm reassigning to Svata - please reassign back if I'm wrong

Comment 14 Ken Frank 2002-02-27 16:07:12 UTC

Is there some limited encoding detection
that could be done by output window, based
on the locale user was running in,
in relation to information ow gets from
outside processes like app servers, databases,
etc ?

For example, if user in a ja locale or OS,
could it try some detection for the common
encodings used in that locale ?

We have seen various issues in other modules
of ow not showing multibyte correctly, since,
as is mentioned in this report, it just passes
thru what it is sent. And it would be helpful
both for localized release of FFJ and for
netbeans or FFJ customers running English
version but in other locale, if there would
be more of a chance that OW might show
multibyte from outside process correctly,
thus the idea of limited encoding detection.


Finally, I'd like to suggest this idea for
the exception window and any other parts of
netbeans that get and show data from
outside processes. Do they use shared
methods, or does each one get the info
and display it separately? If separately,
which modules would be involved?

ken.frank@sun.com
02/27/2002

Comment 15 David Strupl 2002-03-06 16:52:12 UTC

Ken,
the code is shared so the detection place can be common. But I don't
know of any method to find out the encoding of the incoming stream
(that does not mean such a method does not exist, I just don't know it).

Jesse's questions were not answered. And I have to admit that I don't
understand the problem. If using plain System.out works for both
internal and external execution can the problem be at our side?
I am marking this issue as invalid for now and please reopen after
adding more info.

Also I would like to ask you to use only \uXXXX escapes to rule out
the compiler encoding setting, which is irrelevant to this issue.

It would be great if we could demonstrate the problem using some
European (e.g. Czech) encoding for easier testing.

Thanks for your help.

Comment 16 Jiri Skrivanek 2002-04-23 11:57:17 UTC

David,
to summarize, the problem is that native (Czech, Japanese, ...) text
written to the Output Window is garbled when it is written through an
OutputStreamWriter or PrintWriter created from System.out. It happens
only for Internal Execution. Try the attached test case, which prints
Czech characters. It answers Jesse's question: this is not only a
problem of Japanese.

I would blame ThreadExecutor for wrong interpretation of incoming
characters, or rather the additional core or openide classes which map
the input/output of a process into the Output Window.

Running with the External executor is OK. If you look at the
org.netbeans.openide.execution.ProcessExecutor class, which actually
runs the external process, you see the following mapping

                (copyMakers[0] = new CopyMaker(fromEngine.getInputOutput().getIn(),
                        new OutputStreamWriter(process.getOutputStream()), true, className)).start();
                (copyMakers[1] = new CopyMaker(new InputStreamReader(process.getInputStream()),
                        fromEngine.getInputOutput().getOut(), false, className)).start();
                (copyMakers[2] = new CopyMaker(new InputStreamReader(process.getErrorStream()),
                        fromEngine.getInputOutput().getErr(), false, className)).start();
                
where CopyMaker is a thread which only writes the characters it gets
from the external process.
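
A rough sketch of a CopyMaker-style thread, for readers without the source at hand. This is not the NetBeans class: CharCopier and copyAll are hypothetical names, and the real constructor takes extra arguments (a boolean and a class name) that are left out here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

class CharCopier extends Thread {
    private final Reader in;
    private final Writer out;

    CharCopier(Reader in, Writer out) {
        this.in = in;
        this.out = out;
    }

    @Override
    public void run() {
        char[] buf = new char[256];
        try {
            int n;
            // Characters are passed on unchanged. Any byte-to-char decoding
            // has already happened in the Reader that wraps the process stream.
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
                out.flush();
            }
        } catch (IOException e) {
            // Sketch only: a real implementation would report the failure.
        }
    }

    /** Copies text through a CharCopier thread and returns what arrived. */
    static String copyAll(String text) {
        StringWriter sink = new StringWriter();
        CharCopier copier = new CharCopier(new StringReader(text), sink);
        copier.start();
        try {
            copier.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sink.toString();
    }
}
```

In the mapping quoted above, the InputStreamReader is created without an explicit charset, so the process output is decoded with the platform default encoding before the copier sees any characters.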

The internal executor (ThreadExecutor) is not so simple. It makes
this call:

            ret = TopManager.getDefault().getExecutionEngine().execute(info.getClassName(), run, inout);
            run.setInputOutput(ret.getInputOutput());
            
            
org.netbeans.core.execution.EngineExecution provides some mapping for
input and output streams, but it is hard to trace the path of the
incoming characters. In the same package there are additional classes
which have something to do with streams, e.g. DefaultSysProcess,
OutputStreamWriter, TaskIO, WriterPrintStream.
IMHO, the reported problems could be hidden inside those classes.

Comment 17 Jiri Skrivanek 2002-04-23 11:59:05 UTC

Created attachment 5520 [details]
Czech test case + gifs of Output Window

Comment 18 Ken Frank 2002-04-23 18:02:19 UTC

Please see also 21930, a general rfe on output window and
other parts that can display information from other external
processes.
Because we were seeing this issue in output window from
operations of various modules, and because other windows can
show information from external processes, it was suggested to
me to file general rfe.

ken.frank@sun.com
04/23/2002

Comment 19 Marek Grummich 2002-07-22 08:33:19 UTC

Target milestone was changed from '3.4' to TBD.

Comment 20 Marek Grummich 2002-07-22 08:44:18 UTC

Target milestone was changed from '3.4' to TBD.

Comment 21 Marek Grummich 2002-07-22 08:44:48 UTC

Target milestone was changed from '3.4' to TBD.

Comment 22 Marek Grummich 2002-07-22 08:48:56 UTC

Target milestone was changed from '3.4' to TBD.

Comment 23 Marek Grummich 2002-07-22 08:55:39 UTC

Target milestone was changed from '3.4' to TBD.

Comment 24 Marek Grummich 2002-07-22 08:58:52 UTC

Target milestone was changed from '3.4' to TBD.

Comment 25 Marek Grummich 2002-07-22 09:07:12 UTC

Set target milestone to TBD

Comment 26 Marek Grummich 2002-07-22 09:13:53 UTC

Set target milestone to TBD

Comment 27 Petr Pisl 2002-11-27 13:36:38 UTC

I hope I have found the problem. It is in ExecutionEngine.SysOut:
the method write(byte[] buff, int off, int len) converts byte[] to
char[], and the problem is in this conversion.
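
The attached patch is not reproduced here, so the following is only a sketch of one way a byte[] to char[] conversion can go wrong for multi-byte text. It is not the actual code of ExecutionEngine.SysOut. The names ByteToCharDemo, widen and decode are hypothetical, and UTF-8 stands in for MS932 or EUC_JP so that the example runs on any JRE.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ByteToCharDemo {
    /** Widens each byte to one char. Only correct for ISO-8859-1 data. */
    static String widen(byte[] buff, int off, int len) {
        char[] chars = new char[len];
        for (int i = 0; i < len; i++) {
            chars[i] = (char) (buff[off + i] & 0xff);
        }
        return new String(chars);
    }

    /** Decodes the bytes with the charset they were written in. */
    static String decode(byte[] buff, int off, int len, Charset cs) {
        return new String(buff, off, len, cs);
    }

    public static void main(String[] args) {
        Charset cs = StandardCharsets.UTF_8; // assumption: stand-in for MS932 / EUC_JP
        String text = "\u65e5\u672c";
        byte[] bytes = text.getBytes(cs);    // what an OutputStreamWriter would hand over

        System.out.println(decode(bytes, 0, bytes.length, cs).equals(text)); // prints true
        System.out.println(widen(bytes, 0, bytes.length).length());          // prints 6
    }
}
```

Per-byte widening turns the two characters into six unrelated ones, while decoding with the matching charset restores them. A conversion of this kind only has a chance of working if it decodes with the same encoding the program used in its OutputStreamWriter.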

Comment 28 Petr Pisl 2002-11-27 13:40:55 UTC

Created attachment 8067 [details]
There is the fix, but it is done on the sierrafixes branch.

Comment 29 David Strupl 2003-01-22 08:18:25 UTC

Fixed in trunk (by applying Petr's patch).
core/execution/src/org/netbeans/core/execution/ExecutionEngine.java 1.4

Thanks Petr.

Comment 30 pzajac 2003-01-29 13:32:38 UTC

Jirka Skrivanek:
Can you verify it on a Japanese platform?

Comment 31 Jiri Skrivanek 2003-01-29 14:00:50 UTC

Honza Firich will verify it. He has a Japanese environment ready to test.

Comment 32 Honza Firich 2003-02-05 11:51:20 UTC

Verified this bug on build:

NetBeans dev build
------------------
Number:   200302050100
Date:     February 5 2003
Branding:
Branch:   trunk

Checked it using JDK 1.4; it worked fine on Windows 
and Solaris.

Comment 33 Petr Pisl 2003-02-25 12:37:50 UTC

Integrated in the sierrafixes branch.

Comment 34 gautham mudra 2003-08-15 18:09:20 UTC

verified the fix on sierra update1 with patch 113638-4
Build:Sierra Update1 020923
JDK:1.4.0_03
Locale:ja_JP