129985 – I18N : erb editor complains "unterminated string meets end of file"

Summary: I18N : erb editor complains "unterminated string meets end of file"

Status:	VERIFIED FIXED

Alias:	None

Product:	ruby
Classification:	Unclassified
Component:	RHTML
Version:	6.x
Hardware:	All (OS: All)

Importance:	P2 blocker
Assignee:	Torbjorn Norbye

URL:
Keywords:	I18N

Depends on:
Blocks:

Reported:	2008-03-13 06:42 UTC by Masaki Katakai
Modified:	2008-04-09 18:28 UTC
CC List:	1 user

See Also:
Issue Type:	DEFECT
Exception Reporter:

Attachments
sample nb project (2.29 KB, application/octet-stream) 2008-03-13 06:46 UTC, Masaki Katakai
screenshot (21.14 KB, image/png) 2008-03-13 06:48 UTC, Masaki Katakai

Description Masaki Katakai 2008-03-13 06:42:58 UTC

Environment:
  Product Version: NetBeans IDE Dev (Build 200803111205)
  Java: 1.6.0_03; Java HotSpot(TM) Client VM 1.6.0_03-b05
  System: SunOS version 5.11 running on x86; UTF-8; ja_JP (nb)

I received the following error report from a Japanese community member.
He created an .erb file containing "<p><japanesecharacters></p>",
but the editor underlines it as an error and reports:

  "unterminated string meets end of file"

I tried it on the latest trunk and can reproduce it there. It did not
happen on 6.1 Beta.

Comment 1 Masaki Katakai 2008-03-13 06:46:39 UTC

Created attachment 58282
sample nb project

Comment 2 Masaki Katakai 2008-03-13 06:48:54 UTC

Created attachment 58283
screenshot

Comment 3 Torbjorn Norbye 2008-03-13 23:45:11 UTC

This was caused by the recent migration to JRuby 1.1. In 1.1, JRuby processes the input byte by byte rather than char
by char, and the lexer source takes an InputStream rather than a Reader. The input stream was not translating the
characters correctly, so one of these Unicode characters, once split into bytes, contained a byte that looked like a
string terminator. The string was terminated early, and when the real terminator showed up it opened another string
that was never closed. That is where the parser error came from.
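
A minimal Java sketch of this failure mode follows. It assumes, purely for illustration, that the faulty stream kept
only the low byte of each 16-bit char. The report does not say exactly how the stream mistranslated characters, and
the class and method names here are hypothetical, not taken from the NetBeans sources.

```java
final class LowByteNarrowingDemo {
    /** Narrows each UTF-16 code unit to its low 8 bits (the assumed faulty conversion). */
    static byte[] narrowToLowBytes(String source) {
        byte[] out = new byte[source.length()];
        for (int i = 0; i < source.length(); i++) {
            out[i] = (byte) source.charAt(i); // the cast keeps only the low byte
        }
        return out;
    }

    /** Counts bytes equal to an ASCII delimiter, as a byte-oriented lexer would see them. */
    static int countDelimiters(byte[] bytes, char delimiter) {
        int count = 0;
        for (byte b : bytes) {
            if (b == (byte) delimiter) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The CJK ideograph U+6027 has low byte 0x27, which is the ASCII apostrophe.
        String source = "x = '\u6027'";
        byte[] bytes = narrowToLowBytes(source);
        // Two quotes were written, but the byte view contains three.
        System.out.println(countDelimiters(bytes, '\'')); // prints 3
    }
}
```

With an odd number of apparent quotes, a lexer scanning these bytes would find the last string still open at end of
input, which matches the reported message.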

To fix this, I first added code to convert the input to a byte array using String.getBytes("UTF8"). However, this
caused some really bad bugs: all the AST node offsets are apparently byte buffer offsets, not the corresponding
character offsets, which broke a lot of code.
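
The offset mismatch can be illustrated with a short Java sketch. The class and method names are hypothetical, and the
sample line is made up; it only shows that a position measured in UTF-8 bytes drifts from the position measured in
chars as soon as a multi-byte character precedes it.

```java
import java.nio.charset.StandardCharsets;

final class OffsetMismatchDemo {
    /** Returns the offset of the first occurrence of an ASCII marker in the UTF-8 bytes, or -1. */
    static int byteOffsetOf(String source, char asciiMarker) {
        byte[] utf8 = source.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < utf8.length; i++) {
            if (utf8[i] == (byte) asciiMarker) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Three ideographs, each encoded as 3 bytes in UTF-8 but 1 char in a Java String.
        String source = "s = '\u65e5\u672c\u8a9e'; x";
        System.out.println(source.indexOf('x'));       // char offset: prints 11
        System.out.println(byteOffsetOf(source, 'x')); // byte offset: prints 17
    }
}
```

An editor that maps AST offsets back onto its character-based document would highlight the wrong span for everything
after the first non-ASCII character.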

So instead, since we don't handle Unicode identifiers anyway (so these characters should only appear in string
literals whose contents we don't care about), I translate the characters to bytes one to one and truncate values
greater than 255. All the unit tests pass, along with this new scenario.
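
A one-to-one conversion of this kind might look like the Java sketch below. This is not the code from the changeset.
The report does not say what value an out-of-range character ends up as, so the sketch substitutes a fixed
placeholder byte; that substitution, and all names used, are assumptions made for illustration.

```java
final class OneToOneBytes {
    /** Placeholder for chars above 255. This choice is an assumption of the sketch. */
    static final byte PLACEHOLDER = (byte) '?';

    /** Produces exactly one byte per char, so byte offsets equal char offsets. */
    static byte[] toBytesPreservingOffsets(String source) {
        byte[] out = new byte[source.length()];
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            // Chars that fit in 8 bits pass through; anything larger becomes the placeholder,
            // so it can never be mistaken for a quote or another delimiter.
            out[i] = c <= 255 ? (byte) c : PLACEHOLDER;
        }
        return out;
    }

    public static void main(String[] args) {
        String source = "s = '\u6027'; x";
        byte[] bytes = toBytesPreservingOffsets(source);
        System.out.println(bytes.length == source.length()); // prints true
    }
}
```

The trade-off is that the lexer no longer sees the real contents of non-ASCII string literals, which is acceptable
only because, as noted above, those contents are not inspected.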

This should be fixed by changeset a75e53d6a38f available in trunk build #1053.

Comment 4 Ken Frank 2008-04-09 18:28:09 UTC

Looks OK, at least on Solaris. Please reopen if the community member
finds that it is not fixed.

ken.frank@sun.com