Integrating syntax coloring, code completion, and other editor features into the IDE used to be a lot of work. Not anymore! This article describes how a 19th century explorer called Heinrich Schliemann is inspiring the IDE to become fluent in many languages.

Traditionally, when creating editor support for a new programming language in the IDE, a vast variety of NetBeans APIs must be implemented. By “editor support”, we typically mean syntax coloring, code completion, and the source navigation features provided by the IDE’s Navigator. Other examples include code indentation and brace matching. Out of the box, the NetBeans IDE provides this kind of support for several languages and technologies, such as Java (of course), JSP, and HTML.

Having to implement so many NetBeans APIs is unfortunate for two reasons. Firstly, the domain knowledge that a language programmer typically brings to the table is the language itself, not detailed knowledge of the NetBeans APIs required to provide the necessary features. Secondly, the underlying infrastructure for editor support is the same for all languages. For example, the only difference between the Navigator for Java and the Navigator for HTML is the content displayed, not the container. For these reasons, the language programmer should only need to describe the language itself, in the form of tokens expressed as regular expressions. Nothing more than that should be needed.

Given the tokens and an indication of where they should be used, the NetBeans Platform should be able to figure out how to hook the tokens to the support features. Not only would this approach simplify the process of integrating a new language into the IDE, but it would leverage the current knowledge of the language programmer – rather than requiring a steep learning curve of acquiring new knowledge before coding can even begin.

Enter Schliemann

This, in sum, is what the new Schliemann project is all about. And why is it called Schliemann? Heinrich Schliemann was a 19th century explorer who had a gift for languages. He traveled the world while keeping a diary in the language of whichever country he happened to be in. In the spirit of Schliemann, the 6.0 release of the NetBeans Platform envisages the IDE as being Schliemannesque: able to pick up languages very quickly and then to communicate in them fluently.

The project is especially pitched towards scripting languages, because the Schliemann project does not provide compilation support, which is not required by scripting languages – and because scripting languages, in particular, are increasingly in vogue today. In this article, we will explore the main facets of the Schliemann project and touch on some contrasts with the traditional NetBeans API approach to providing the editor features it supports.

Everything in a single file!

A central contrast between the traditional API approach and the Schliemann approach is that the latter lets you specify all editor features declaratively in one single file. This file has the .NBS file extension, which stands for NetBeans Scripting. To get a quick flavor of some typical content of an NBS file, let’s examine a code snippet – see Listing 1.

Listing 1. NBS file snippet.

# NBS Template

# definition of tokens
TOKEN:keyword:( "while" | "if" | "else" )
TOKEN:operator:( "{" | "}" | "(" | ")" )
TOKEN:identifier:( ["a"-"z"] ["a"-"z" "0"-"9"]* )
TOKEN:whitespace:( [" " "\t" "\n" "\r"]+ )

# parser should ignore whitespaces
SKIP:whitespace

# definition of grammar
S = (Statement)*;
Statement = WhileStatement | IfStatement | ExpressionStatement;
WhileStatement = "while" "(" ConditionalExpression ")" Block;
IfStatement = "if" "(" ConditionalExpression ")" Block;
Block = "{" (Statement)* "}";
ConditionalExpression = <identifier>;
ExpressionStatement = <identifier>;

# code folding
FOLD:Block

# navigator support
NAVIGATOR:WhileStatement

# brace completion
COMPLETE "{:}"
COMPLETE "(:)"

# indentation support
INDENT "{:}"
INDENT "(:)"
INDENT "\\s*(((if|while)\\s*\\(|else\\s*|else\\s+if\\s*\\(|for\\s*\\(.*\\))[^{;]*)"

This template is what you are given when you use the new Generic Languages Framework wizard, which is part of NetBeans IDE 6.0. It gives you a single NBS file with sample content, which begins with the definition of four tokens. These tokens are named “keyword”, “operator”, “identifier” and “whitespace”. Within parentheses, on the same line as the name of each token, a regular expression defines it.

Right away, one can see the power of this new approach to language support provision: a regular expression language, rather than Java, is used to define tokens. As a result, programmers outside the Java ecosystem can integrate their programming languages into the NetBeans IDE. Not needing to know Java, at least for the simpler integrations of languages, is a central benefit of the Schliemann project.
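For readers who do know Java, the following sketch gives a rough idea of what the four TOKEN lines in Listing 1 spare you from writing by hand. It is not how Schliemann is implemented: the class name RegexTokenizer, the "name:text" output format, and the lookahead that stops “whilex” from being split into a keyword plus an identifier are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration only; not part of the Schliemann API. */
final class RegexTokenizer {
    // Token names and patterns mirror the four TOKEN lines of Listing 1.
    // Insertion order matters: "keyword" is tried before "identifier".
    private static final Map<String, Pattern> TOKENS = new LinkedHashMap<>();
    static {
        // Assumption: a keyword must not be followed by another identifier character.
        TOKENS.put("keyword", Pattern.compile("(?:while|if|else)(?![a-z0-9])"));
        TOKENS.put("operator", Pattern.compile("[{}()]"));
        TOKENS.put("identifier", Pattern.compile("[a-z][a-z0-9]*"));
        TOKENS.put("whitespace", Pattern.compile("[ \\t\\n\\r]+"));
    }

    /** Returns tokens as "name:text" strings; whitespace is matched but not reported. */
    static List<String> tokenize(String input) {
        List<String> result = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            boolean matched = false;
            for (Map.Entry<String, Pattern> entry : TOKENS.entrySet()) {
                Matcher m = entry.getValue().matcher(input);
                m.region(pos, input.length());
                if (m.lookingAt()) {
                    if (!entry.getKey().equals("whitespace")) {
                        result.add(entry.getKey() + ":" + m.group());
                    }
                    pos = m.end();
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                throw new IllegalArgumentException("Unexpected character at offset " + pos);
            }
        }
        return result;
    }
}
```

Calling RegexTokenizer.tokenize("while (x) { y }") returns the seven entries keyword:while, operator:(, identifier:x, operator:), operator:{, identifier:y and operator:}. Everything in this class other than the four patterns is plumbing, which is exactly the part the NBS file lets you leave to the framework.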

Once tokens are defined, one can already begin assigning features. For example, this single statement would fill the Navigator with the values provided by the “keyword” token:

NAVIGATOR:keyword

Readers who are familiar with the NetBeans Navigator API can only be amazed at this drastic simplification! However, normally you would like more robust support for a language and to provide a grammar in addition to tokens. The grammar that the Schliemann approach requires is also highly simplified. It is comparable to the grammars used by JavaCC or ANTLR. Ideally, one would wish that grammars written for JavaCC and ANTLR could be directly integrated into NetBeans IDE. Unfortunately, however, these grammars are not tailored to usage within an IDE. For this reason, a conversion process needs to take place, from ANTLR or JavaCC (or from a similar approach) to the Schliemann NBS format.

Early experiments have shown that both manual and automatic solutions for this process are feasible. However, this aspect of the Schliemann project is definitely the area where most work needs to be done. A unified, simple approach to integrating grammars provided by ANTLR, JavaCC, and the like, is needed in order for the Schliemann project to reach its full potential.

In the NBS code shown before, you can see, in addition to the tokens, that the grammar forms the basis of both the Navigator implementation and the code folding implementation. In the case of code folding, the Block grammar definition determines each code fold, while the Navigator is populated by values conforming to the WhileStatement definition.
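To see how one grammar can feed both features, consider this hedged sketch of a hand-written recursive-descent parser for the grammar in Listing 1. It records one Navigator entry per WhileStatement and one fold candidate per Block. The class TinyParser, its fields, and the requirement that tokens be separated by whitespace are assumptions made to keep the example short; the framework derives the equivalent information from the NBS file without any such code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical recursive-descent parser for the grammar in Listing 1. */
final class TinyParser {
    private final List<String> tokens;
    private int pos = 0;

    /** One entry per WhileStatement, as a Navigator might list them. */
    final List<String> navigatorItems = new ArrayList<>();
    /** One {firstToken, lastToken} index pair per Block, as fold candidates. */
    final List<int[]> folds = new ArrayList<>();

    TinyParser(String source) {
        // Assumption: tokens are whitespace-separated, so no real lexer is needed.
        String trimmed = source.trim();
        tokens = trimmed.isEmpty()
                ? new ArrayList<String>()
                : Arrays.asList(trimmed.split("\\s+"));
    }

    // S = (Statement)*;
    TinyParser parse() {
        while (pos < tokens.size()) {
            statement();
        }
        return this;
    }

    // Statement = WhileStatement | IfStatement | ExpressionStatement;
    private void statement() {
        String t = peek();
        if ("while".equals(t) || "if".equals(t)) {
            pos++;
            expect("(");
            String condition = identifier();
            expect(")");
            if ("while".equals(t)) {
                navigatorItems.add("while (" + condition + ")");
            }
            block();
        } else {
            identifier();
        }
    }

    // Block = "{" (Statement)* "}";
    private void block() {
        int start = pos;
        expect("{");
        while (pos < tokens.size() && !"}".equals(peek())) {
            statement();
        }
        expect("}");
        folds.add(new int[] {start, pos - 1});
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : null;
    }

    private void expect(String literal) {
        if (!literal.equals(peek())) {
            throw new IllegalStateException("Expected " + literal + " at token " + pos);
        }
        pos++;
    }

    // <identifier>: a lowercase letter followed by lowercase letters or digits, excluding keywords.
    private String identifier() {
        String t = peek();
        if (t == null || !t.matches("[a-z][a-z0-9]*") || t.matches("while|if|else")) {
            throw new IllegalStateException("Expected identifier at token " + pos);
        }
        pos++;
        return t;
    }
}
```

For the input "while ( a ) { if ( b ) { c } } d", the parser collects a single Navigator entry for the while statement and two fold candidates, the inner block first and the enclosing block second.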

Finally, notice that the code also shows how brace completion and indentation are defined, all within the same single file, and that one can fine-tune further by specifying that white space should be skipped by the parser.

Hence, when the NBS file in Listing 1 is associated with a MIME type, documents corresponding to the MIME type immediately have the following features:

  • Syntax coloring
  • Navigator
  • Code folding
  • Brace matching
  • Indentation

In similar ways, a wide range of other language-support features can be created, including code completion, which is frequently very high on the list of features that language programmers want to provide.

Getting started

Now that we have a general flavor of the Schliemann approach, let’s put it into practice and create an NBS file for Java Manifests. Manifests, as you know, are constructed from key/value pairs. In the IDE, there is no language support for Manifests, not even syntax coloring. Let’s provide that... and a lot more besides.

We begin as one always does when creating a plug-in for the IDE: by creating a new module project (see Figure 1). Next, in the New Project wizard, name the project “ManifestEditorFeatures” and specify “org.netbeans.modules.manifesteditorfeatures” as the Code Name Base. At the end of the wizard, after having clicked Finish, you’ll see that the IDE has created a basic source structure, as it does for every NetBeans module (see Figure 2).

Figure 1. Creating a new module project.

Figure 2. Result of the New Project wizard: Plugin Source Structure.

Next, we can use the Generic Languages Framework wizard to generate the NBS template discussed in the previous section. This template is found in the NetBeans Module Development section in the New File wizard (see Figure 3). Once you’ve completed the wizard, you have a single new file, in which we will do all our coding for this module (see Figure 4).

Figure 3. Generic Languages Framework Template.

Figure 4. Result of the New File wizard: One additional file!

Now, let’s begin! Unlike in the previous section, the syntax we are dealing with here has the notion of state. By state we mean that if we know in which token we find ourselves, we can always know where we are in relation to all the other tokens. So, for example, if we are in the “key” part of a key/value statement in a Manifest, we know that when we reach the colon we are entering the “value” part of the statement. As a result, we can define our tokens in the context of their states. Below you see how this is done. Not much of this should be foreign to you if you are familiar with regular expressions:

TOKEN:key:( [^"#"] [^ ":" "\n" "\r"]* ):<VALUE>
<VALUE> {
    TOKEN:whitespace:( ["\n" "\r"]+ ):<DEFAULT>
    TOKEN:operator:( ":" ):<IN_VALUE>
}
<IN_VALUE> {
    TOKEN:whitespace:( ["\n" "\r"]+ ):<DEFAULT>
    TOKEN:value:( [^ "\n" "\r"]* )
}

Notice that we start out by saying that we are not in a key if the first character is a hash (#). In that case we are, in fact, in a comment. It would also be good to provide a specific syntax color for comments, so let’s define a token for comments:

TOKEN:comment:( "#" [^ "\n" "\r"]* ["\n" "\r"]+ )
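If the idea of lexer states is new to you, this minimal Java sketch may help. It walks a Manifest with the same three states (DEFAULT, VALUE and IN_VALUE) and the same regular expressions as the token definitions. The class ManifestLexer and its Rule helper are hypothetical names, and the value pattern uses “+” rather than “*” so that the loop can never accept an empty match; neither detail comes from the framework itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical state-driven lexer mirroring the Manifest token definitions. */
final class ManifestLexer {
    enum State { DEFAULT, VALUE, IN_VALUE }

    /** A rule is active in one state, emits a token name, and selects the next state. */
    private static final class Rule {
        final State from;
        final String name;
        final Pattern pattern;
        final State to;

        Rule(State from, String name, String regex, State to) {
            this.from = from;
            this.name = name;
            this.pattern = Pattern.compile(regex);
            this.to = to;
        }
    }

    private static final Rule[] RULES = {
        new Rule(State.DEFAULT, "comment", "#[^\\n\\r]*[\\n\\r]+", State.DEFAULT),
        new Rule(State.DEFAULT, "key", "[^#][^:\\n\\r]*", State.VALUE),
        new Rule(State.VALUE, "whitespace", "[\\n\\r]+", State.DEFAULT),
        new Rule(State.VALUE, "operator", ":", State.IN_VALUE),
        new Rule(State.IN_VALUE, "whitespace", "[\\n\\r]+", State.DEFAULT),
        // Assumption: "+" instead of "*" so that this sketch never accepts an empty match.
        new Rule(State.IN_VALUE, "value", "[^\\n\\r]+", State.IN_VALUE),
    };

    /** Returns {name, text} pairs for every token in the input. */
    static List<String[]> lex(String input) {
        List<String[]> tokens = new ArrayList<>();
        State state = State.DEFAULT;
        int pos = 0;
        while (pos < input.length()) {
            Rule hit = null;
            Matcher m = null;
            for (Rule rule : RULES) {
                if (rule.from != state) {
                    continue; // only rules of the current state may match
                }
                m = rule.pattern.matcher(input).region(pos, input.length());
                if (m.lookingAt()) {
                    hit = rule;
                    break;
                }
            }
            if (hit == null) {
                throw new IllegalArgumentException(
                        "No rule matches in state " + state + " at offset " + pos);
            }
            tokens.add(new String[] {hit.name, m.group()});
            pos = m.end();
            state = hit.to;
        }
        return tokens;
    }
}
```

Feeding it a comment line followed by two key/value lines produces the sequence comment, key, operator, value, whitespace, key, operator, value, whitespace. The same character (a line break, say) can thus mean different things depending on the state the lexer is in, which is what the angle-bracketed state names express declaratively.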

Right now, without going any further, we can already assign colors. Again we do so declaratively:  

COLOR:key: {
    foreground_color: "blue";
}
COLOR:operator: {
    foreground_color: "black";
}
COLOR:value: {
    foreground_color: "magenta";
}

Apart from the foreground color, there are many other attributes that we can set per token, such as the style and background color. Without going much further, though, we can already install our module and then we’ll have syntax coloring (see Figure 5)! It couldn’t be much simpler. Before we do so, however, we need to create a MIME type resolver, which is a small XML file that specifies the file extension of the files we want to deal with.

Figure 5. A Manifest file with syntax coloring.

If you use the New File Type wizard, you can let the IDE generate such a MIME type resolver for you. You then need to register both the resolver and the NBS file in the XML layer file and declare a dependency on the Generic Languages Framework API. Eventually, the Generic Languages Framework template will do all of this for you, one imagines; but at the time of writing this is not the case.

After installing the module, we can develop it further. To help you, NetBeans 6.0 will provide a number of developer tools, such as the new AST window (see Figure 6), which lets you analyze a file, based on the tokens you have assigned to its MIME type. Ultimately, for Manifests, you could create a very detailed Navigator (see Figure 7), among other useful features for the end user.

Figure 6. AST window.

Figure 7. Navigator.


Hopefully this broad introduction gives you a flavor of what NetBeans 6.0 will do for scripting languages. Quickly and without much fuss, language developers will be able to integrate their favorite scripting languages into the IDE, thus turning NetBeans more and more into their own, customized development environment. In short, just like Heinrich Schliemann, NetBeans IDE will be able to pick up new languages and expand its usefulness across more and more development communities.

  The official Schliemann project page on
  The Schliemann page in the NetBeans Wiki
  Blog by Jan Jancura, the lead NetBeans engineer for Schliemann


Geertjan Wielenga is a technical writer for NetBeans IDE and a co-author of the book “Rich Client Programming: Plugging into the NetBeans Platform”. He is passionate about NetBeans and blogs about it daily.
