This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 130145

Summary:	Use proper lexer
Product:	javafx	Reporter:	David Strupl <dstrupl>
Component:	Editor	Assignee:	Rastislav Komara <moonko>
Status:	VERIFIED FIXED
Severity:	blocker	CC:	av-nb, dstrupl, pnejedly, vvg
Priority:	P1
Version:	6.x
Hardware:	All
OS:	All
Issue Type:	TASK	Exception Reporter:
Bug Depends on:
Bug Blocks:	130138

Description David Strupl 2008-03-14 13:36:36 UTC

We should use the lexer that comes with javafxc to be as close to javafxc as possible in the editor.

Comment 1 Rastislav Komara 2008-03-14 13:45:56 UTC

I started work on a new lexer for JavaFX based on the javafxc compiler grammar.

Comment 2 Andrey Yamkovoy 2008-03-14 14:01:34 UTC

A lexer based on the javafxc grammar is already implemented.
Please describe what you mean ...

Comment 3 David Strupl 2008-03-14 14:24:54 UTC

I mean the lexer used by javafxc (v3Lexer), generated from the ANTLR file (v3.g).

We should avoid using hand-written lexers if possible.

Is there any compelling reason to use hand-written lexers, especially if the javafxc team keeps changing the language?
When they change the language they (hopefully!) change the grammar. And after they change the grammar, the lexer in the
compiler is regenerated during the compiler build (in the build/gensrc folder). We can either use the lexer from
the compiler or use the grammar directly to construct our (correct) lexer.

Rasta will check whether we will be able to use the v3Lexer from the compiler or whether we will use the grammar to
construct the lexer during the build.

I suggest deleting the hand-written lexers (of course, after we have a suitable replacement).

Comment 4 Alexei Mokeev 2008-03-14 14:42:23 UTC

That was our initial idea as well, but we ended up using a hand-written one. Victor, what were the specific problems with
using an automatically generated lexer?

Comment 5 Petr Nejedly 2008-03-15 21:45:46 UTC

BTW: The current (handwritten) lexer finally seems to handle the format string literal well, but only if it gets
the source right on the first attempt; otherwise it breaks. See the examples and explanation:
Frame {
   title: "Hello World {num}"
}
gets the tokens right. But if you add a space (or do any other edit) between '}' and '"', the token list breaks to:
...
QUOTE_LBRACE_STRING_LITERAL('"Hello World {')
IDENTIFIER('num')
RBRACE
WHITESPACE(' ')
STRING_LITERAL('"\n}\n')

While this might be simple to fix in the lexer, there's a harder case to look out for: what if the right quote
was missing initially?
Frame {
   title: "Hello World {num}
}

If you add the quote from the given state, both the handwritten lexer and the potential incremental-from-generated lexer
will have a hard time updating the tokens correctly because of a combination of three things:
*) '}' was already recognized as RBRACE
*) the modification happened (maybe far) after the RBRACE token
*) the (conditional) lexical rule is in the form of (simplified) '}.*"', that is, you both need to be in a state with
opened string-format AND have the corresponding tail quote.

Keep this case in mind when testing the generated lexer. Maybe it would help to treat an unmatched '}' inside an opened
string-format as the start of RBRACE_QUOTE_STRING_LITERAL even if it doesn't have a finishing '"'. Once you type the '"',
it will get re-recognized, and the following tokens will come out right.

Well, maybe the problem is completely different, but I just wanted to give you heads-up on this stateful-lexer issue.
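
The suggestion above can be illustrated with a small sketch. This is not the javafxc v3Lexer or the plugin's lexer; it is a hypothetical toy lexer in plain Java that reuses only the token names quoted in this comment (COLON, LBRACE and ERROR are invented for illustration). The `lenientTail` flag and the rule that an unterminated tail runs to the end of the line are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class FormatStringLexerSketch {

    /**
     * Lexes a tiny, hypothetical subset of the language and returns token type names.
     * With lenientTail set, an unmatched '}' inside an open string format still
     * becomes RBRACE_QUOTE_STRING_LITERAL (assumed to run to the end of the line).
     */
    public static List<String> lex(String src, boolean lenientTail) {
        List<String> tokens = new ArrayList<>();
        boolean inFormat = false; // true between "...{ and the matching }..."
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == '"') {
                // Scan to '{' (format start), '"' (plain string) or end of line.
                int j = i + 1;
                while (j < src.length() && "{\"\n".indexOf(src.charAt(j)) < 0) j++;
                if (j < src.length() && src.charAt(j) == '{') {
                    tokens.add("QUOTE_LBRACE_STRING_LITERAL");
                    inFormat = true;
                    i = j + 1;
                } else {
                    tokens.add("STRING_LITERAL");
                    // Consume the closing quote if there is one; stop before a newline.
                    i = (j < src.length() && src.charAt(j) == '"') ? j + 1 : j;
                }
            } else if (c == '}' && inFormat) {
                // Look for the tail quote on the rest of the line.
                int j = i + 1;
                while (j < src.length() && "\"\n".indexOf(src.charAt(j)) < 0) j++;
                boolean closed = j < src.length() && src.charAt(j) == '"';
                if (closed || lenientTail) {
                    tokens.add("RBRACE_QUOTE_STRING_LITERAL");
                    i = closed ? j + 1 : j;
                } else {
                    tokens.add("RBRACE"); // strict mode: the tail quote is missing
                    i++;
                }
                inFormat = false;
            } else if (Character.isWhitespace(c)) {
                while (i < src.length() && Character.isWhitespace(src.charAt(i))) i++;
                tokens.add("WHITESPACE");
            } else if (Character.isLetterOrDigit(c)) {
                while (i < src.length() && Character.isLetterOrDigit(src.charAt(i))) i++;
                tokens.add("IDENTIFIER");
            } else {
                tokens.add(c == '{' ? "LBRACE" : c == '}' ? "RBRACE" : c == ':' ? "COLON" : "ERROR");
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String unterminated = "title: \"Hello World {num}\n}\n";
        String terminated = "title: \"Hello World {num}\"\n}\n";
        System.out.println("strict,  no quote: " + lex(unterminated, false));
        System.out.println("lenient, no quote: " + lex(unterminated, true));
        System.out.println("after typing '\"':  " + lex(terminated, true));
    }
}
```

In lenient mode the sequence of token types is the same before and after the quote is typed, so an incremental relex only has to extend one token instead of turning an already recognized RBRACE into something else. In strict mode the token after `num` changes type when the quote is added, which is the hard case described above.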

Comment 6 Alexei Mokeev 2008-03-24 13:48:41 UTC

David, Petr,
Is there any news regarding the applicability of the generated lexer?
-A.

Comment 7 David Strupl 2008-03-24 14:20:31 UTC

Hello, today is Easter Monday here - everybody is at home. The question is for Rasta ... I think he will send an update
tomorrow. I will send some status report including the other parts to the mailing list tomorrow. Best regards, David

Comment 8 Rastislav Komara 2008-03-25 11:21:42 UTC

Hello,
 we now use a lexer generated from the v3 grammar of the javafx compiler. There is no evidence of performance problems;
the lexer is restartable and able to transfer important state information between restarts. There are still 3 problems:
  1) Javadoc: I'm trying to reuse the javadoc lexer (JavadocLexer class).
  2) The plugin grammar contains several enhancements in code (not rules!), and these differences need to be merged
automatically in the build process, not by hand. It is not possible to patch the original grammar in this case.
  3) We are still missing a massive performance test using really long (1000+ lines) files.
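
What "restartable, with state transferred between restarts" means can be sketched as follows. This is a minimal plain-Java illustration, not the plugin's code and not the ANTLR-generated lexer: the class name, the token names and the single boolean state are all hypothetical. The only point it makes is that a lexer resumed at a token boundary with the saved state yields the same tokens as a full lex from the start.

```java
import java.util.ArrayList;
import java.util.List;

public class RestartableLexerSketch {
    private final String src;
    private int pos;          // offset of the next unread character
    private boolean inFormat; // the only state that must survive a restart here

    /** Starts (or restarts) lexing at a token boundary with a previously saved state. */
    public RestartableLexerSketch(String src, int offset, boolean savedState) {
        this.src = src;
        this.pos = offset;
        this.inFormat = savedState;
    }

    /** State to store together with the token boundary at offset(). */
    public boolean state() { return inFormat; }

    public int offset() { return pos; }

    /** Returns the next token type name, or null at end of input. */
    public String nextToken() {
        int len = src.length();
        if (pos >= len) return null;
        char c = src.charAt(pos);
        int j = pos + 1;
        if (c == '"') {
            // A quote opens either a plain string or the head of a format string.
            while (j < len && src.charAt(j) != '{' && src.charAt(j) != '"') j++;
            inFormat = j < len && src.charAt(j) == '{';
            pos = Math.min(j + 1, len);
            return inFormat ? "FORMAT_HEAD" : "STRING";
        }
        if (c == '}' && inFormat) {
            // Inside a format string, '}' starts the tail that runs to the closing quote.
            while (j < len && src.charAt(j) != '"') j++;
            inFormat = false;
            pos = Math.min(j + 1, len);
            return "FORMAT_TAIL";
        }
        if (c == '}') {
            pos = j;
            return "RBRACE";
        }
        while (j < len && src.charAt(j) != '"' && src.charAt(j) != '}') j++;
        pos = j;
        return "TEXT";
    }

    /** Lexes from the given offset to the end, starting in the given state. */
    public static List<String> lexFrom(String src, int offset, boolean savedState) {
        RestartableLexerSketch lexer = new RestartableLexerSketch(src, offset, savedState);
        List<String> tokens = new ArrayList<>();
        for (String t = lexer.nextToken(); t != null; t = lexer.nextToken()) tokens.add(t);
        return tokens;
    }

    public static void main(String[] args) {
        String src = "t: \"Hi {num}\" }";
        List<String> full = lexFrom(src, 0, false);
        RestartableLexerSketch lexer = new RestartableLexerSketch(src, 0, false);
        // Restart at every token boundary and compare with the tail of the full lex.
        for (int index = 0; index <= full.size(); index++) {
            List<String> resumed = lexFrom(src, lexer.offset(), lexer.state());
            boolean same = resumed.equals(full.subList(index, full.size()));
            System.out.println("restart at " + lexer.offset() + ", state " + lexer.state()
                    + ": " + resumed + " same=" + same);
            lexer.nextToken();
        }
    }
}
```

Restarting at the `}` that follows `num` without the saved state gives RBRACE followed by a bogus STRING, which is why the state has to be stored alongside each token boundary. In the NetBeans Lexer SPI the corresponding hooks are, as far as I know, `Lexer.state()` and `LexerRestartInfo.state()`; how the generated lexer's state is mapped onto them is not described in this report.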

Comment 9 David Strupl 2008-03-27 15:47:29 UTC

I guess you can close this issue - the rest will be bugs assigned to you ;-)

Comment 10 Rastislav Komara 2008-03-27 16:07:22 UTC

Issue solved.

Comment 11 Lark Fitzgerald 2008-04-07 18:18:20 UTC

Verified using:
Product Version: NetBeans IDE Dev (Build 200804010004)
Java: 1.6.0_03; Java HotSpot(TM) Client VM 1.6.0_03-b05
System: Windows Vista version 6.0 running on x86; Cp1252; en_US (nb)

Plugin: CB#312 (2008-04-07_14-26-18.zip)