<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>
<head>
    <META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=iso-8859-1">
    <title>Memory Model Accuracy</title>
    <META NAME="AUTHOR" CONTENT="Vladimir Voskresensky">
</head>
<body LANG="en-US" >

<h1>Memory Model Accuracy: Architecture Review Opinion</h1>

<dl>
<dt><b>Issue:</b> <a href="http://www.netbeans.org/issues/show_bug.cgi?id=92584">92584</a></dt>
<dt><b>Submitter:</b> <a href="mailto:vkvahin@netbeans.org">Vladimir Kvashin</a></dt>
<dt><b>History:</b> <a href="http://www.netbeans.org/source/browse/cnd/www/dev/reviews/opinions_92584.html">in CVS</a></dt>

<dt><b>Date:</b> Jan 16, 2006</dt>
<dt><b>Reviewers:</b> </dt>
</dl>

<hr/>

<dl>
    <dt><b>Contents</b></dt>
    <dd>
	<ul>
	    <li><a href="#definitions">Definitions</a></li>
	    <li><a href="#architecture">Brief code model architecture overview</a></li>
	    <li><a href="#issues">Issues</a></li>
	    <li><a href="#statistics">Some statistics</a></li>
	    <li><a href="#solutions">Solutions</a></li>
	    
	    <li><a href="#advisory">Advisory Information</a></li>
	    <li><a href="#appendices">Appendices</a>
		<ul>
		    <li><a href="#TCRs">Appendix A: TCRs</a></li>
		    <li><a href="#TCAs">Appendix B: TCAs</a></li>
		    <li><a href="#references">Appendix C: Reference Material</a></li>
		</ul>
	    </li>
	</ul>
    </dd>
</dl>

<h2><a name="definitions">Definitions</a></h2>

<dl>
    
    <dt>AST</dt>
    <dd>
	<p>
	    AST stands for Abstract Syntax Tree - 
	    a tree that represents the code of an entire compilation unit
	    (see <a href="http://en.wikipedia.org/wiki/Abstract_syntax_tree">Wikipedia AST definition</a>
	    for more details).
	    In the CND code model, the AST is produced by the parser
	    and then processed by a special component 
	    (called the Renderer) that builds the implementation of the code model API.
	    This keeps the parser separate from the other components.
	</p>
    </dd>
    
    <dt>Symbol table</dt>
    <dd>
	<p>
	    A symbol table is a mechanism that associates each identifier in a program's source code 
	    with information such as its type, scope level and sometimes its location. 
	    This mechanism should make it efficient to look up information about an identifier.
	    (See <a href="http://en.wikipedia.org/wiki/Symbol_table">Wikipedia symbol table definition</a> 
	    or "Dragon Book", chapter 1.3.)
	</p>
    </dd>
    
    <dt>Dynamic vs static symbol table</dt>
    <dd>
	<p>
	    Associating an identifier with the element (variable, function, type, etc.)
	    that it represents may happen
	    either in the parsing phase or after the AST has already been built.
	    In the former case the symbol table is created and used by the parser and is dynamic
	    (i.e. each time the parser leaves a scope, the scope's content
	    "disappears" from the symbol table). 
	    In the latter case it is static - anyone can ask about any identifier
	    at any time.
	</p>
    </dd>

    <dt>Resolver</dt>
    <dd>
	<p>
	    Resolver is a historical term for 
	    a static symbol table.
	</p>
    </dd>

    <dt><a name="architecture">Code model architecture</a></dt>
    <dd>
	<p>
	    The chart below shows the code model architecture.
	</p>
	<img src="92584/code-model-detailed.png" alt="Detailed code model diagram"/>
    </dd>
    
</dl>


<h2><a name="issues">Issues</a></h2>

For now, the code model isn't accurate enough. Ideally, it should be 100% accurate, 
assuming the code contains no compiler errors. By accuracy we mean the accuracy of the data 
exposed via the code model API. Code model clients (completion, classview, etc.) 
might have their own issues that lower accuracy from the end user's point of view; 
we aren't going to discuss those here, since they are a matter for separate 
discussions, IZs, reviews, etc.

Particular issues are:
<dl>
    <dt><p><b>Lack of symbol table at parsing time</b></dt>
    <dd>Without a symbol table, several constructs cannot be recognized correctly in the parsing phase:
	<br><code> A  (a); <font color="gray"> // function call or variable declaration ? </font></code>
	<br><code> A&lt;B&gt;  c; <font color="gray"> // expression or variable declaration? </font></code>
	<br><code> (B)(c) ; <font color="gray"> // function call or cast expression? </font></code>
	<br><code> A b(t); <font color="gray"> // function declaration or variable declaration? </font></code>
    </dd>
    
    <dt><p><b>Poor resolver (static symbol table)</b></dt>
    <dd>
	For now, we have a static symbol table (AKA Resolver). 
	<br>The algorithm is poor and in some cases simply incorrect.
    </dd>

    <!--
    <dt><p><b>Inefficient resolver (static symbol table)</b></dt>
    <dd>
	Resolver searches model recursively to find out what does the given name refer to.
	On subsequent calls it isn't able to reuse any information gathered on previous calls.
	The makes algorythm inefficient.
    </dd>
    -->
    
    <dt><p><b>Using canonical parameter types representation</b></dt>
    <dd>
	The following declarations are treated as different, even when both names refer to the same type:
	<br><code> void foo(string) </code>
	<br><code> void foo(std::string) </code>
	<br>
	Fixing this requires a resolver that is both reliable and efficient.
    </dd>

    <dt><p><b>Lack of C/C++ distinction</b></dt>
    <dd>
	The following code means different things in C and C++.
	<br><code> void foo(); </code>
	<br><code> void foo(int p) { /* ... */ } </code>
	<br>
	In C++ it declares two functions, while in C 
	both lines refer to the same function (there are no overloads in C!)
   </dd>

    <dt><p><b>Parser errors</b></dt>
    <dd>
	We still have some parser errors (situations in which parser is not able to process 
    </dd>
    
   
</dl>
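
<p>
As a rough illustration of the parameter type and C/C++ issues above, the sketch below
builds a lookup key for a function declaration. All names in it (<code>SignatureKey</code>,
<code>addAlias</code>, the alias map) are hypothetical and not part of the code model API;
the fixed map merely stands in for a resolver, which a real implementation would query
for the canonical type.
</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SignatureKey {

    /** Hypothetical stand-in for a resolver: type name as written -> canonical name. */
    private final Map<String, String> canonicalNames = new HashMap<String, String>();

    public void addAlias(String written, String canonical) {
        canonicalNames.put(written, canonical);
    }

    /** Returns the canonical name if one is known, otherwise the name as written. */
    private String canonical(String typeName) {
        String known = canonicalNames.get(typeName);
        return known != null ? known : typeName;
    }

    /**
     * Builds a key that identifies a function.
     * In C there are no overloads, so the name alone is the key;
     * in C++ the canonical parameter types are part of the key.
     */
    public String key(String name, List<String> paramTypes, boolean isCpp) {
        if (!isCpp) {
            return name;
        }
        List<String> canon = new ArrayList<String>();
        for (String type : paramTypes) {
            canon.add(canonical(type));
        }
        return name + "(" + String.join(",", canon) + ")";
    }
}
```

<p>
With <code>string</code> registered as an alias of <code>std::string</code>, both
<code>foo(string)</code> and <code>foo(std::string)</code> produce the same C++ key,
while in C mode <code>foo()</code> and <code>foo(int)</code> collapse to one key.
</p>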

<h2><a name="statistics">Some statistics</a></h2>

These statistics are incomplete: 
for many projects the current (old) whitebox tests run out of memory, 
and the new whitebox tests aren't ready yet. 
Incomplete as it is, the table below still gives a lot of interesting information.

<pre>
 project              Dwarf    Model   Delta Accuracy  Parser err  Unresolved
 -------              -----    -----   ----- --------  ----------  ----------
 clucene.2            4,855    4,317     538   88.9 %       73         344
 litesql.1            1,612    1,250     362   77.5 %        7         104
 mico.1 (partial)     6,822    5,409   1,413   79.3 %       24       3,285
 mysql.3 (partial)   43,291   39,267   4,024   90.7 %      189         274
 python.2            14,782   13,905     877   94.1 %       13           2
 Total               71,362   64,148   7,214   89.9 %      306       4,009
</pre>
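
<p>
The column definitions are not spelled out here, but every row is consistent with
Delta being Dwarf minus Model and Accuracy being Model divided by Dwarf. The small
sketch below reproduces that arithmetic for the Total row; the class and method names
are hypothetical, and the reading of the columns is an assumption inferred from the figures.
</p>

```java
public class Accuracy {

    /** Accuracy in percent, assuming it is the Model count divided by the Dwarf count. */
    static double accuracy(int dwarf, int model) {
        return 100.0 * model / dwarf;
    }

    /** Delta, assuming it is the Dwarf count minus the Model count. */
    static int delta(int dwarf, int model) {
        return dwarf - model;
    }

    public static void main(String[] args) {
        // Figures taken from the Total row of the table.
        int dwarf = 71362;
        int model = 64148;
        System.out.println("delta    = " + delta(dwarf, model));
        System.out.println("accuracy = " + accuracy(dwarf, model) + " %");
    }
}
```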

<h2><a name="solutions">Solutions</a></h2>

<h3><a name="solutions-dynamic-symtav">Dynamic Symbol Table</a></h3>

The dynamic symbol table interface looks as follows.

<pre>

    //--------------------
    // Reading
    //--------------------
    
    /** Represents different kinds of identifiers */
    enum Kind {
	Type,
	Function,
	Variable
    }

    /** Determines the kind of the given identifier */
    Kind getKind(String name);

    //--------------------
    // Modification
    //--------------------
    
    /** Is called when entering a scope */
    void push();
    
    /** Is called when leaving a scope */
    void pop();
    
    /**  Adds an element to the current frame */
    void add(String name, Kind kind);

    /** Adds all symbols from the given namespace.
     *  Is called when entering namespace definition. */
    void addFromNamespace(String namespaceName);
    
    /** Adds symbols from the given class 
     *  Is called when entering a class that extends given class */
    void addFromClass(String className, boolean isPublic);
    
</pre>
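
<p>
A minimal sketch of how such a table might be implemented is shown below, assuming a
stack of frames with one frame per open scope. The class name
<code>ScopedSymbolTable</code> is hypothetical, <code>getKind</code> is assumed to
return the <code>Kind</code> itself (or <code>null</code> for an unknown name), and
<code>addFromNamespace</code> and <code>addFromClass</code> are left out because they
need access to the model.
</p>

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class ScopedSymbolTable {

    public enum Kind { Type, Function, Variable }

    /** One frame per open scope; the innermost scope is at the head of the deque. */
    private final Deque<Map<String, Kind>> frames = new ArrayDeque<Map<String, Kind>>();

    public ScopedSymbolTable() {
        push(); // the global scope
    }

    /** Called when entering a scope. */
    public void push() {
        frames.push(new HashMap<String, Kind>());
    }

    /** Called when leaving a scope; everything declared in it disappears. */
    public void pop() {
        frames.pop();
    }

    /** Adds an element to the current (innermost) frame. */
    public void add(String name, Kind kind) {
        frames.peek().put(name, kind);
    }

    /** Searches from the innermost scope outwards; returns null for an unknown name. */
    public Kind getKind(String name) {
        for (Map<String, Kind> frame : frames) {
            Kind kind = frame.get(name);
            if (kind != null) {
                return kind;
            }
        }
        return null;
    }
}
```

<p>
With a table like this, the parser could treat <code>A (a);</code> as a variable
declaration when <code>getKind("A")</code> returns <code>Kind.Type</code> and as a
function call when it returns <code>Kind.Function</code>.
</p>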

<h3><a name="solutions-static-symtab">Static Symbol Table</a></h3>

The static symbol table interface looks as follows.

<pre>
public class ResolverFactory {
    public static Resolver createResolver(CsmFile file, int offset);
}

public interface Resolver {
    /**
     * Resolves identifier name.
     *
     * @param nameTokens tokenized name to resolve
     * (for example, for std::vector it is new String[] { "std", "vector" })
     */
    public CsmObject resolve(String[] nameTokens);
    
    /**
     * Resolves identifier name.
     *
     * @param qualifiedName name to resolve 
     */
    public CsmObject resolve(String qualifiedName);

}
</pre>
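
<p>
The two <code>resolve</code> overloads are closely related: one plausible approach is
for the string form to split the qualified name on <code>::</code> and delegate to the
tokenized form. The sketch below shows that idea only; the class
<code>AbstractNameResolver</code> is hypothetical, and <code>Object</code> stands in
for <code>CsmObject</code> to keep the example self-contained.
</p>

```java
public abstract class AbstractNameResolver {

    /**
     * Resolves a tokenized name, e.g. { "std", "vector" }.
     * Object is a stand-in for CsmObject in this sketch.
     */
    public abstract Object resolve(String[] nameTokens);

    /**
     * Resolves a qualified name such as "std::vector" by splitting it
     * on "::" and delegating to the tokenized overload.
     */
    public Object resolve(String qualifiedName) {
        return resolve(qualifiedName.split("::"));
    }
}
```

<p>
Note that with this splitting a leading <code>::</code> produces an empty first token,
which an implementation could interpret as the global scope.
</p>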


<h3><a name="solutions-pros-and-cons">Static vs Dynamic Pros and Cons</a></h3>



<h2><a name="advisory">Advisory Information</a></h2>

<p>[List any non-blocking issues and suggestions for improvement.]</p>
    
<h2><a name="appendices">Appendices</a></h2>

<h3><a name="TCRs">Appendix A: Technical Changes Required</a></h3>

<p>[File all TCRs in Issuezilla with P1 or P2 priority and make the issue representing this 
review depend on them.]</p>

<h3><a name="TCAs">Appendix B: Technical Changes Advised</a></h3>

<p>[File all TCAs in Issuezilla with P3 to P5 priority and make the issue representing this 
review depend on them.]</p>

<h3><a name="references">Appendix C: Reference Material</a></h3>

<p>[List additional materials relevant to reviewed case]</p>

</body>
</html>
