
Bug 103193

Summary:	FileEncodingQuery.ProxyCharset.decode() returns an empty buffer for input of size 4 kB or less
Product:	projects	Reporter:	Marian Petras <mpetras>
Component:	Generic Infrastructure	Assignee:	Tomas Zezula <tzezula>
Status:	VERIFIED WONTFIX
Severity:	blocker	CC:	mmirilovic
Priority:	P3	Keywords:	JDK_SPECIFIC
Version:	6.x
Hardware:	All
OS:	All
Issue Type:	DEFECT	Exception Reporter:

Description Marian Petras 2007-05-03 05:38:31 UTC

When method decode(ByteBuffer) is called on a Charset obtained from the
FileEncodingQuery and the passed ByteBuffer has 4096 bytes or less, an empty
CharBuffer is returned. This is JDK specific - it happens on JDK 1.5.x but not
with newer JDKs.

Comment 1 Marian Petras 2007-05-03 05:40:22 UTC

This bug is the cause of P1 bug #103067 ("Find/Replace in projects removed
content of all not saved classes") - that's why I set the priority to P1.

Comment 2 Marian Petras 2007-05-03 05:56:27 UTC

For the immediate cause, look at the source code of method

    java.nio.charset.CharsetDecoder.decode(ByteBuffer),

at the for(;;) loop:

In JDK 1.5.0_11, the critical part is:

	for (;;) {
	    CoderResult cr;
	    if (in.hasRemaining())
		cr = decode(in, out, true);
	    else
		cr = flush(out);
	    if (cr.isUnderflow())
		break;
	    ...
	}

This means that as soon as method decodeLoop(...) returns CoderResult.UNDERFLOW,
the subsequent check cr.isUnderflow() succeeds, the loop exits (break;) and the
flush(...) method is never called.

In JDK 1.6.0_01, the critical part is different:

	for (;;) {
	    CoderResult cr = in.hasRemaining() ?
		decode(in, out, true) : CoderResult.UNDERFLOW;
	    if (cr.isUnderflow())
		cr = flush(out);

	    if (cr.isUnderflow())
		break;
	    ...
	}

If method decodeLoop(...) returns CoderResult.UNDERFLOW, the first
cr.isUnderflow() check succeeds and the flush(...) method is called. Only then
does the loop exit (break;).
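
The difference between the two loops can be reproduced with a small sketch. The
BufferingDecoder class below is hypothetical (it is not the FileEncodingQuery
implementation): it holds all input back in decodeLoop(...) and emits characters
only from implFlush(...). Under that assumption, the two helper methods mimic a
caller that stops at the first UNDERFLOW, as the JDK 1.5 loop does, and a caller
that flushes afterwards, as the JDK 1.6 loop does.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;

class BufferingDecoderDemo {

    /** Hypothetical decoder that emits nothing until flush(...) is called. */
    static final class BufferingDecoder extends CharsetDecoder {
        private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

        BufferingDecoder() {
            // ISO-8859-1 is chosen only to keep the byte-to-char mapping trivial.
            super(Charset.forName("ISO-8859-1"), 1.0f, 1.0f);
        }

        @Override
        protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
            // Consume all input but write nothing to the output buffer yet.
            while (in.hasRemaining()) {
                pending.write(in.get());
            }
            return CoderResult.UNDERFLOW;
        }

        @Override
        protected CoderResult implFlush(CharBuffer out) {
            byte[] bytes = pending.toByteArray();
            if (out.remaining() < bytes.length) {
                return CoderResult.OVERFLOW; // caller must supply a larger buffer
            }
            for (byte b : bytes) {
                out.put((char) (b & 0xFF));
            }
            pending.reset();
            return CoderResult.UNDERFLOW;
        }

        @Override
        protected void implReset() {
            pending.reset();
        }
    }

    /** Stops after the first UNDERFLOW, like the JDK 1.5 loop: no flush. */
    static String decodeWithoutFlush(byte[] bytes) {
        CharsetDecoder decoder = new BufferingDecoder();
        CharBuffer out = CharBuffer.allocate(bytes.length);
        decoder.decode(ByteBuffer.wrap(bytes), out, true);
        out.flip();
        return out.toString();
    }

    /** Flushes after the first UNDERFLOW, like the JDK 1.6 loop. */
    static String decodeWithFlush(byte[] bytes) {
        CharsetDecoder decoder = new BufferingDecoder();
        CharBuffer out = CharBuffer.allocate(bytes.length);
        decoder.decode(ByteBuffer.wrap(bytes), out, true);
        decoder.flush(out);
        out.flip();
        return out.toString();
    }
}
```

With such a decoder, decodeWithoutFlush(...) yields an empty result while
decodeWithFlush(...) yields the decoded text, which matches the symptom
described in this report.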

Comment 3 Marian Petras 2007-05-03 06:00:46 UTC

A workaround for the unexpected behaviour in JDK 1.5.x is possible. In method
decodeLoop(...), before returning CoderResult.UNDERFLOW, decode all buffered
bytes and send the resulting chars to the output buffer.
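
A minimal sketch of that idea, assuming a decoder that buffers bytes internally
and is able to decode whatever it has buffered at any time. The names
EagerDrainDemo, EagerDrainDecoder and drain are illustrative only and are not
taken from any NetBeans module; ISO-8859-1 is used just to keep the mapping
trivial.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;

class EagerDrainDemo {

    /** Hypothetical buffering decoder that drains its buffer in decodeLoop. */
    static final class EagerDrainDecoder extends CharsetDecoder {
        private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

        EagerDrainDecoder() {
            super(Charset.forName("ISO-8859-1"), 1.0f, 1.0f);
        }

        @Override
        protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
            while (in.hasRemaining()) {
                pending.write(in.get());
            }
            // The workaround: emit buffered data before reporting UNDERFLOW,
            // so the result does not depend on a later flush(...) call.
            return drain(out);
        }

        @Override
        protected CoderResult implFlush(CharBuffer out) {
            // Still drain here in case decodeLoop reported OVERFLOW earlier.
            return drain(out);
        }

        @Override
        protected void implReset() {
            pending.reset();
        }

        private CoderResult drain(CharBuffer out) {
            byte[] bytes = pending.toByteArray();
            if (out.remaining() < bytes.length) {
                return CoderResult.OVERFLOW; // keep the bytes for the next call
            }
            for (byte b : bytes) {
                out.put((char) (b & 0xFF));
            }
            pending.reset();
            return CoderResult.UNDERFLOW;
        }
    }

    /** Decodes without ever calling flush(...), as the JDK 1.5 loop does. */
    static String decodeWithoutFlush(byte[] bytes) {
        CharsetDecoder decoder = new EagerDrainDecoder();
        CharBuffer out = CharBuffer.allocate(bytes.length);
        decoder.decode(ByteBuffer.wrap(bytes), out, true);
        out.flip();
        return out.toString();
    }
}
```

Here the output is complete even though flush(...) is never called. The sketch
only applies when the decoder does not need to see the end of the input before
it can decide how to decode.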

Comment 4 Marian Petras 2007-05-03 06:12:24 UTC

I did not find an exactly matching JDK bug but it seems that bug
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6221056 is close to it.

Comment 5 Tomas Zezula 2007-05-03 19:59:40 UTC

The client of Charset[Encoder|Decoder] has to flush the encoder|decoder at the
end of encoding|decoding by calling flush. You can easily verify this by running
the FileEncodingQueryTest from project/queries on JDK 1.5; it also tests
blocks < 4KB. For even more details see sun.nio.cs.[StreamDecoder|StreamEncoder].

Comment 6 Marian Petras 2007-05-04 13:49:38 UTC

What I use in my code is method

     Charset.decode(ByteBuffer)

which is a shortcut for

     charset.newDecoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE)
            .decode(bb);

(see
http://java.sun.com/j2se/1.5.0/docs/api/java/nio/charset/Charset.html#decode%28java.nio.ByteBuffer%29)

Javadoc documentation for method

     CharsetDecoder.decode(ByteBuffer)

states that

     "This method implements an entire decoding operation;
     that is, it resets this decoder, then it decodes the bytes
     in the given byte buffer, and finally it flushes this decoder."

(see
http://java.sun.com/j2se/1.5.0/docs/api/java/nio/charset/CharsetDecoder.html#decode%28java.nio.ByteBuffer%29)

I have not studied source code of sun.nio.cs.StreamDecoder but I have studied
this bug report against it:

     http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4744247
     ("StreamDecoder.CharsetSD.read does not invoke CharsetDecoder.flush")

Comment 7 Tomas Zezula 2007-05-04 17:17:58 UTC

The issue against StreamDecoder does not matter. It looks more like a bug in the
1.5 StreamEncoder, but there is a simple workaround: use the 3-parameter version
of encode (decode).

Here is an algorithm:
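
    CharsetEncoder enc;                  // obtained from charset.newEncoder()
    while (haveSomethingToEncode) {
        enc.encode(in, out, false);      // more input will follow
    }
    enc.encode(emptyIn, out, true);      // signal the end of the input
    enc.flush(out);                      // write out any buffered state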

I am not even sure if this problem should be worked around in our implementation
of Charset.
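
For the decoding direction, the same steps could look roughly like the sketch
below. The class and method names (ExplicitFlushDecode, decodeFully, grow,
decodeUtf8) are made up for illustration. Because the whole input is already in
one ByteBuffer, endOfInput = true is passed directly instead of feeding chunks
with false first. The REPLACE actions are the ones Charset.decode(ByteBuffer)
uses, so decode(...) is expected to report only UNDERFLOW or OVERFLOW here.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

class ExplicitFlushDecode {

    /** Decodes the whole buffer and flushes explicitly, growing out on OVERFLOW. */
    static CharBuffer decodeFully(Charset charset, ByteBuffer in) {
        CharsetDecoder decoder = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        CharBuffer out = CharBuffer.allocate(
                (int) (in.remaining() * decoder.averageCharsPerByte()));

        // Step 1: three-argument decode with endOfInput = true.
        while (decoder.decode(in, out, true).isOverflow()) {
            out = grow(out);
        }
        // Step 2: explicit flush, independent of what the JDK's own loop does.
        while (decoder.flush(out).isOverflow()) {
            out = grow(out);
        }
        out.flip();
        return out;
    }

    /** Returns a larger buffer containing everything written to out so far. */
    private static CharBuffer grow(CharBuffer out) {
        CharBuffer bigger = CharBuffer.allocate(2 * out.capacity() + 1);
        out.flip();
        bigger.put(out);
        return bigger;
    }

    /** Convenience wrapper used for a quick check with UTF-8 input. */
    static String decodeUtf8(byte[] bytes) {
        return decodeFully(Charset.forName("UTF-8"), ByteBuffer.wrap(bytes)).toString();
    }
}
```

A call such as decodeFully(charset, bb) would then stand in for
charset.decode(bb) in client code that has to run on JDK 1.5.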

Comment 8 Tomas Zezula 2007-05-04 18:03:15 UTC

The workaround mentioned by Marian does not work. The CharsetEncoder needs to
maintain internal state because it does not yet know which CharsetEncoder it
should delegate to; that is decided either by a call to flush or by the input
exceeding 4KB. There is no way to work around the JDK 1.5 issue on the FEQ side,
since it cannot find out whether more data will come. Anyway, I don't understand
why you report it to NetBeans and not to the JDK. The client of CharsetDecoder
can use the 3-parameter version of decode, as I explained above, to work around
this problem.

Comment 9 Marian Petras 2007-05-06 11:31:13 UTC

The workaround I described was the one I use in my custom decoder in the
Properties module. I did not know the details of your implementation so I did
not know it could not be used.

I know it is caused by a bug in the JDK but I thought that it would be better if
you made the workaround on the FEQ side than if your clients had to do their own
workarounds. Now that I understand that a workaround on your side is not
possible, I will use a workaround on my side, i.e. I will use the three-argument
decode method as you have suggested.

Comment 10 Tomas Zezula 2007-05-07 06:59:51 UTC

Thanks.

Comment 11 Quality Engineering 2012-04-25 10:00:48 UTC

Integrated into 'main-golden', will be available in build *201204250400* on http://bits.netbeans.org/dev/nightly/ (upload may still be in progress)
Changeset: http://hg.netbeans.org/main-golden/rev/358f7f0a41c5
User: Jesse Glick <jglick@netbeans.org>
Log: decodeByteBuffer should no longer be needed for #103067/#103193 fixes since JDK 6.