This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 103193 - FileEncodingQuery.ProxyCharset.decode() returns an empty buffer for input of size 4 kB or less
Status: VERIFIED WONTFIX
Alias: None
Product: projects
Classification: Unclassified
Component: Generic Infrastructure
Version: 6.x
Hardware: All All
Importance: P3 blocker
Assignee: Tomas Zezula
URL:
Keywords: JDK_SPECIFIC
Depends on:
Blocks:
 
Reported: 2007-05-03 05:38 UTC by Marian Petras
Modified: 2012-04-25 10:00 UTC

See Also:
Issue Type: DEFECT
Exception Reporter:


Description Marian Petras 2007-05-03 05:38:31 UTC
When method decode(ByteBuffer) is called on a Charset obtained from the
FileEncodingQuery and the passed ByteBuffer contains 4096 bytes or fewer, an
empty CharBuffer is returned. This is JDK-specific: it happens on JDK 1.5.x but
not with newer JDKs.
Comment 1 Marian Petras 2007-05-03 05:40:22 UTC
This bug is the cause of P1 bug #103067 ("Find/Replace in projects removed
content of all not saved classes") - that's why I set the priority to P1.
Comment 2 Marian Petras 2007-05-03 05:56:27 UTC
For the immediate cause, look at the for(;;) loop in the source code of method

    java.nio.charset.CharsetDecoder.decode(ByteBuffer).

In JDK 1.5.0_11, the critical part is:

	for (;;) {
	    CoderResult cr;
	    if (in.hasRemaining())
		cr = decode(in, out, true);
	    else
		cr = flush(out);
	    if (cr.isUnderflow())
		break;
	    ...
	}

This means that as soon as decodeLoop(...) returns CoderResult.UNDERFLOW, the
subsequent cr.isUnderflow() check succeeds, the loop is exited (break;) and the
flush(...) method is never called.

In JDK 1.6.0_01, the critical part is different:

	for (;;) {
	    CoderResult cr = in.hasRemaining() ?
		decode(in, out, true) : CoderResult.UNDERFLOW;
	    if (cr.isUnderflow())
		cr = flush(out);

	    if (cr.isUnderflow())
		break;
	    ...
	}

If decodeLoop(...) returns CoderResult.UNDERFLOW, the first cr.isUnderflow()
condition is met and the flush(...) method is called. Only then is the loop
exited (break;).
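
The dependence on flush(...) can be observed with a small experiment. The
following is only a minimal sketch, not the FileEncodingQuery code:
FlushOnlyDecoder is a hypothetical decoder that buffers every input byte in
decodeLoop(...) and produces characters only in implFlush(...). With the JDK 6
loop quoted above, decode(ByteBuffer) should return the whole text; with the
JDK 1.5 loop it would presumably return an empty buffer, because flush(...) is
never reached.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical decoder, for illustration only: it buffers all input and emits
// characters solely from implFlush(), so its output depends on flush() being
// called by whoever drives the decoder.
final class FlushOnlyDecoder extends CharsetDecoder {
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    private CharBuffer decoded; // created on the first implFlush() call

    FlushOnlyDecoder() {
        super(StandardCharsets.ISO_8859_1, 1.0f, 1.0f);
    }

    @Override
    protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
        while (in.hasRemaining()) {
            pending.write(in.get()); // consume input, produce nothing yet
        }
        return CoderResult.UNDERFLOW;
    }

    @Override
    protected CoderResult implFlush(CharBuffer out) {
        if (decoded == null) {
            decoded = StandardCharsets.ISO_8859_1
                    .decode(ByteBuffer.wrap(pending.toByteArray()));
        }
        while (decoded.hasRemaining()) {
            if (!out.hasRemaining()) {
                return CoderResult.OVERFLOW; // caller enlarges out and flushes again
            }
            out.put(decoded.get());
        }
        return CoderResult.UNDERFLOW;
    }

    @Override
    protected void implReset() {
        pending.reset();
        decoded = null;
    }

    // Runs the one-argument decode(ByteBuffer) on a fresh decoder.
    static String decodeWhole(byte[] bytes) {
        try {
            return new FlushOnlyDecoder().decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] small = "well under 4 kB".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("[" + decodeWhole(small) + "]");
    }
}
```

The input here is far below the 4 kB threshold from the bug title; the sketch
only isolates the "output is produced at flush time" aspect and does not model
the delegation logic of the real ProxyCharset.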
Comment 3 Marian Petras 2007-05-03 06:00:46 UTC
A workaround for the unexpected behaviour in JDK 1.5.x is possible. In method
decodeLoop(...), before returning CoderResult.UNDERFLOW, decode all buffered
bytes and send the resulting chars to the output buffer.
Comment 4 Marian Petras 2007-05-03 06:12:24 UTC
I did not find an exactly matching JDK bug, but it seems that bug
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6221056 is close to it.
Comment 5 Tomas Zezula 2007-05-03 19:59:40 UTC
The client of Charset[Encoder|Decoder] has to flush the encoder|decoder at the
end of encoding|decoding by calling flush. You can easily verify this by running
the FileEncodingQueryTest from project/queries on JDK 1.5; it also tests blocks
smaller than 4 KB. For even more details see
sun.nio.cs.[StreamDecoder|StreamEncoder].
Comment 6 Marian Petras 2007-05-04 13:49:38 UTC
What I use in my code is method

     Charset.decode(ByteBuffer)

which is a shortcut for

     charset.newDecoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE)
            .decode(bb);

(see
http://java.sun.com/j2se/1.5.0/docs/api/java/nio/charset/Charset.html#decode%28java.nio.ByteBuffer%29)

Javadoc documentation for method

     CharsetDecoder.decode(ByteBuffer)

states that

     "This method implements an entire decoding operation;
     that is, it resets this decoder, then it decodes the bytes
     in the given byte buffer, and finally it flushes this decoder."

(see
http://java.sun.com/j2se/1.5.0/docs/api/java/nio/charset/CharsetDecoder.html#decode%28java.nio.ByteBuffer%29)

I have not studied the source code of sun.nio.cs.StreamDecoder but I have
studied this bug report against it:

     http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4744247
     ("StreamDecoder.CharsetSD.read does not invoke CharsetDecoder.flush")
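
The equivalence between the shortcut and the expanded form quoted above can be
checked with a short sketch. The class and method names (DecodeShortcutDemo,
viaCharset, viaDecoder) are made up for this illustration, and UTF-8 merely
stands in for the Charset that FileEncodingQuery would return.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class DecodeShortcutDemo {
    // The convenience method on Charset.
    static String viaCharset(Charset cs, byte[] bytes) {
        return cs.decode(ByteBuffer.wrap(bytes)).toString();
    }

    // The expanded form described in the Charset.decode Javadoc.
    static String viaDecoder(Charset cs, byte[] bytes) {
        try {
            return cs.newDecoder()
                     .onMalformedInput(CodingErrorAction.REPLACE)
                     .onUnmappableCharacter(CodingErrorAction.REPLACE)
                     .decode(ByteBuffer.wrap(bytes))
                     .toString();
        } catch (CharacterCodingException e) {
            // Not expected with REPLACE, but decode(ByteBuffer) declares it.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = { 0x61, (byte) 0xFF, 0x62 }; // 0xFF is never valid in UTF-8
        String a = viaCharset(StandardCharsets.UTF_8, bytes);
        String b = viaDecoder(StandardCharsets.UTF_8, bytes);
        System.out.println(a.equals(b));
    }
}
```

Both paths end up in the one-argument CharsetDecoder.decode(ByteBuffer), which
is why a client that only ever calls Charset.decode(ByteBuffer) has no place to
add an explicit flush.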
Comment 7 Tomas Zezula 2007-05-04 17:17:58 UTC
The issue against StreamDecoder does not matter. This looks more like a bug in
the 1.5 StreamEncoder, but there is a simple workaround: use the three-parameter
version of encode (decode).

Here is an algorithm:

    CharsetEncoder enc;
    while (haveSomethingToEncode) {
        enc.encode(in, out, false);   // more input will follow
    }
    enc.encode(emptyIn, out, true);   // signal the end of input
    enc.flush(out);                   // write out anything still buffered

I am not even sure if this problem should be worked around in our implementation
of Charset.
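
Applied to decoding, the algorithm above might look like the following sketch.
It assumes the whole input is already in one ByteBuffer, so the three-argument
decode is called with endOfInput set to true straight away; ExplicitFlushDecode,
decodeFully and grow are hypothetical names, and the REPLACE actions are chosen
only to mirror what Charset.decode(ByteBuffer) does.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class ExplicitFlushDecode {
    // Decodes the whole buffer and always calls flush() explicitly, so the
    // result does not rely on decode(ByteBuffer) doing the flush.
    static String decodeFully(Charset charset, ByteBuffer in) {
        CharsetDecoder dec = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        CharBuffer out = CharBuffer.allocate(
                (int) (in.remaining() * dec.averageCharsPerByte()));

        // Step 1: decode, telling the decoder that no more input follows.
        for (;;) {
            CoderResult cr = dec.decode(in, out, true);
            if (cr.isUnderflow()) {
                break;
            }
            if (!cr.isOverflow()) {
                // Not expected with REPLACE actions; fail loudly if it happens.
                throw new IllegalStateException(cr.toString());
            }
            out = grow(out);
        }
        // Step 2: flush explicitly, enlarging the output as needed.
        while (dec.flush(out).isOverflow()) {
            out = grow(out);
        }
        out.flip();
        return out.toString();
    }

    // Returns a larger buffer holding everything written to b so far.
    private static CharBuffer grow(CharBuffer b) {
        CharBuffer bigger = CharBuffer.allocate(b.capacity() * 2 + 1);
        b.flip();
        bigger.put(b);
        return bigger;
    }

    public static void main(String[] args) {
        ByteBuffer in = ByteBuffer.wrap("h\u00e9llo".getBytes(StandardCharsets.UTF_8));
        System.out.println(decodeFully(StandardCharsets.UTF_8, in));
    }
}
```

For input that arrives in several chunks, the earlier chunks would be passed
with endOfInput set to false, as in the loop of the algorithm above, and only
the final call would pass true before flush.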
Comment 8 Tomas Zezula 2007-05-04 18:03:15 UTC
The workaround mentioned by Marian does not work. The CharsetEncoder needs to
maintain internal state because it does not know in advance which CharsetEncoder
it should delegate to; that is decided either when flush is called on it or when
the input exceeds 4 KB. There is no way to work around the JDK 1.5 issue on the
FEQ side, since it cannot find out whether more data will come. Anyway, I don't
understand why you report it against NetBeans rather than the JDK. The client of
CharsetDecoder can use the three-parameter version of decode, as I explained
above, to work around this problem.
Comment 9 Marian Petras 2007-05-06 11:31:13 UTC
The workaround I described was the one I use in my custom decoder in the
Properties module. I did not know the details of your implementation so I did
not know it could not be used.

I know it is caused by a bug in the JDK but I thought that it would be better if
you made the workaround on the FEQ side than if your clients had to do their own
workarounds. Now that I understand that a workaround on your side is not
possible, I will use a workaround on my side, i.e. I will use the three-argument
decode method as you have suggested.
Comment 10 Tomas Zezula 2007-05-07 06:59:51 UTC
Thanks.
Comment 11 Quality Engineering 2012-04-25 10:00:48 UTC
Integrated into 'main-golden', will be available in build *201204250400* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)
Changeset: http://hg.netbeans.org/main-golden/rev/358f7f0a41c5
User: Jesse Glick <jglick@netbeans.org>
Log: decodeByteBuffer should no longer be needed for #103067/#103193 fixes since JDK 6.