<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>
<head>
    <META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=iso-8859-1">
    <title>Memory Model infrastructure</title>
    <META NAME="AUTHOR" CONTENT="Vladimir Voskresensky">
<style type="text/css">
<!--
body {color: #000000; background-color: #ffffff; font-family: Monospaced}
table {color: #000000; background-color: #e9e8e2; font-family: Monospaced}
.java-block-comment {color: #737373}
.java-layer-method {font-family: Monospaced; font-weight: bold}
.java-keywords {color: #000099; font-family: Monospaced; font-weight: bold}
-->
</style></head>
<body LANG="en-US" >
    <h1>Architecture Review Opinion</h1>
    
    <dl>
        <dt><b>Issue:</b> <a href="http://www.netbeans.org/issues/show_bug.cgi?id=91546">91546</a></dt>
        <dt><b>Submitter:</b> <a href="mailto:vv159170@netbeans.org">Vladimir Voskresensky</a></dt>
        <dt><b>History:</b> <a href="http://www.netbeans.org/source/browse/cnd/www/dev/reviews/opinions_91546.html">in CVS</a></dt>
        
        <dt><b>Date:</b> Dec 25, 2006</dt>
        <dt><b>Reviewers:</b> </dt>
    </dl>
    <hr/>
    <dl>
        <dt><b>Contents</b></dt>
        <dd>
            <ul>
                <li><a href="#summary">Summary</a></li>
                
                <li><a href="#decision">Decision</a></li>
                <li><a href="#opinion">Opinion</a></li>
                <li><a href="#mem_working_set">Memory Working Set</a></li>
                <li><a href="#minutes">Minutes</a></li>
                <li><a href="#issue_detailes">Issue details</a></li>
                <li><a href="#minority">Minority Opinion</a></li>
                <li><a href="#advisory">Advisory Information</a></li>
                <li><a href="#appendices">Appendices</a>
                    <ul>
                        <li><a href="#TCRs">Appendix A: TCRs</a></li>
                        <li><a href="#TCAs">Appendix B: TCAs</a></li>
                        <li><a href="#references">Appendix C: Reference Material</a></li>
                    </ul>
                </li>
            </ul>
        </dd>
    </dl>
    
    <hr/>
    
    <h2><a name="summary">Summary</a></h2>
    
    <p>
        There are a lot of complaints about memory usage, and several bugs are still open:
    </p>
    
    <ul>
        <li>
            <a href="http://www.netbeans.org/issues/show_bug.cgi?id=87921">
                Out of Memory Error
            </a>
        </li>
        <li>
            <a href="http://www.netbeans.org/issues/show_bug.cgi?id=89648">
                Code model memory consumption shouldn't depend linearly on project size
            </a>
        </li>
        <li>
            <a href="http://www.netbeans.org/issues/show_bug.cgi?id=87302">
                APTStringManager uses more memory than necessary.
            </a>
        </li>
    </ul>    
    <p>
        The <q>memory problem</q> seems to be one of the most important issues,
        and as such it should be addressed. How to solve it is the subject of this
        review. The general feeling so far has been that the solution should
        be based on a repository engine. 
    </p>
    
    
    <h2><a name="decision">Decision</a></h2>
    
    <p>
        
    </p>
    <!--
<p>[Keep this short; details will be given in <a href="#opinion">Opinion</a>
section. Mark one of three outcomes:</p>
<ul>
<li><b>Accepted</b> (Go to implement or commit, based on phase of review)</li>
<li><b>Accepted with change requests</b> (Go to implement or commit with completed Technical Change
requests)</li>
<li><b>Rejected</b> (No, go back and do it again.)</li>
</ul>
<p>]</p>
-->
    
    <h2><a name="opinion">Opinion</a></h2>
    
    <p>The following significant issues were discussed at the inception review.</p>
    
    <h2><a name="mem_working_set">Memory Working Set</a></h2>
    <p>
        The underlying memory metric is that of "memory working set". This is the
        minimum amount of memory required to run the application while achieving
        adequate performance. 
        <img src="91546/memory-set.png" alt="memory-set"/>
    </p>
    
    <h2><a name="minutes">Minutes</a></h2>
    
    <p>
        Memory Model [all high priority]
        <ul>
            <li>
                (9)a Design and implement a runtime repository to maintain parsed data
            </li>
            <li>
                (9)b Implement the client for accessing and using parsed data 
            </li>
        </ul>
    </p>
    <h3>Memory</h3>
    
    <p>
        The underlying memory metric is that of "memory working set". This is the
        minimum amount of memory required to run the application while achieving
        adequate performance. Detailed goals for what the required memory should be were
        not discussed. The team agreed that the selected solution should scale "nicely"
        with application size, so for example, even large applications like Firefox
        could run in standard system configurations. Additional quantification is needed
        here.
    </p>
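    <p>
        As a rough illustration of how such a metric could be sampled, the sketch below
        reads the used heap after requesting a garbage collection. It is only a proxy
        for the working set, not the measurement method used by the team; the class
        name <code>HeapSampler</code> is hypothetical, and the GC request is a hint
        that the JVM may ignore.
    </p>

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

/** Hypothetical helper: samples heap usage as a rough proxy for the working set. */
class HeapSampler {
    /** Returns the used heap in bytes after requesting (not forcing) a GC. */
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        memory.gc(); // a hint only; the JVM may ignore it
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();
        int[][] ballast = new int[16][256 * 1024]; // about 16 MB of live data
        long after = usedHeapBytes();
        System.out.println("Live heap grew by about "
                + (after - before) / (1024 * 1024) + " MB");
        System.out.println("Ballast rows: " + ballast.length); // keeps ballast reachable
    }
}
```

    <p>
        Sampling this value after a project has been parsed, for projects of different
        sizes, is one way to check whether memory scales acceptably with project size.
    </p>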
    <p>
        The development team proposed adding a repository to the language model. Two
        options were discussed: 
        <ul>
            <li>
                (i) use of Lucene
            </li>
            <li>
                (ii) custom solution.
            </li>
        </ul>
    </p>
    
    <p>
        Development agreed to: 
        <ul>
            <li>
                (i) execute a due diligence effort to identify available
                open source solutions (beyond Lucene) for a repository implementation
            </li>
            <li>
                (ii)
                simulate the memory and performance profile for Lucene with large production
                applications
            </li>
            <li>
                (iii) size the effort for a tuned custom implementation.
            </li>
        </ul>
    </p>
    <p>
        The selected repository approach needs to serve all language model needs,
        including the symbol table (required for the accuracy work), the core parsing,
        and a future cross reference ("xref"). The decision on which approach to use is
        pending until the listed options have been explored and a top-level design exists.
    </p>
    
    <h2>API design</h2>
    <h3>Code Model</h3>
    <p>
        <img src="91546/code-model-detailed.png" alt="detailed-csm"/>
    </p>
    <h3>Memory model: Client proposal</h3>
    <p>TBD</p>
    <h3>Memory model: Repository proposal</h3>
    <p>
        
<pre>
<span class="java-block-comment">//////////////////////////////////////////////////////////////////////////////</span>
<span class="java-block-comment">// First two interfaces should be implemented by client to use the repository.</span>

<span class="java-block-comment">/**</span>
<span class="java-block-comment"> * Interface which classes should implement to be persistable</span>
<span class="java-block-comment"> */</span>
<span class="java-keywords">public</span> <span class="java-keywords">interface</span> Persistent 
{
    <span class="java-block-comment">/**</span>
<span class="java-block-comment">     * Serialization </span>
<span class="java-block-comment">     */</span>
    <span class="java-keywords">void</span> <span class="java-layer-method">write</span>(OutputStream out); 
    <span class="java-block-comment">/**</span>
<span class="java-block-comment">     * Deserialization </span>
<span class="java-block-comment">     */</span>
    <span class="java-keywords">void</span> <span class="java-layer-method">read</span>(InputStream in); 
}

<span class="java-keywords">public</span> <span class="java-keywords">interface</span> PersistentObjectFactory
{
    <span class="java-block-comment">/**</span>
<span class="java-block-comment">     * create an object by handle, </span>
<span class="java-block-comment">     * handle is a sign for a factory to understand which kind (class) should</span>
<span class="java-block-comment">     * be used to create new object</span>
<span class="java-block-comment">     */</span>
    Persistent <span class="java-layer-method">createPersistent</span>(<span class="java-keywords">int</span> handle); 
    <span class="java-block-comment">/**</span>
<span class="java-block-comment">     * retrieve handle for object class </span>
<span class="java-block-comment">     */</span>
    <span class="java-keywords">int</span> <span class="java-layer-method">getHandle</span>(Persistent obj);     
}

<span class="java-block-comment">//////////////////////////////////////////////////////////////</span>
<span class="java-block-comment">// This interface would be implemented by repository provider.</span>
<span class="java-keywords">public</span> <span class="java-keywords">interface</span> Repository {
    <span class="java-block-comment">/**</span>
<span class="java-block-comment">     * initialize and provide Repository with objects factory</span>
<span class="java-block-comment">     */</span>
    <span class="java-keywords">void</span> <span class="java-layer-method">init</span>(PersistentObjectFactory factory, String repositoryId); 
    <span class="java-block-comment">/**</span>
<span class="java-block-comment">     * store object, maybe Id should be on behalf of the object itself</span>
<span class="java-block-comment">     */</span>
    <span class="java-keywords">void</span> <span class="java-layer-method">put</span>(Identifier id, Persistent obj); 
    <span class="java-block-comment">/**</span>
<span class="java-block-comment">     * retrieve object</span>
<span class="java-block-comment">     */</span>
    Persistent <span class="java-layer-method">get</span>(Identifier id); 
    <span class="java-block-comment">/**</span>
<span class="java-block-comment">     * stop storing object</span>
<span class="java-block-comment">     */</span>
    <span class="java-keywords">void</span> <span class="java-layer-method">remove</span>(Identifier id); 
    <span class="java-block-comment">/**</span>
<span class="java-block-comment">     * store all objects to permanent location </span>
<span class="java-block-comment">     * should be called, e.g., during IDE shutdown or project closing</span>
<span class="java-block-comment">     */</span>
    <span class="java-keywords">void</span> <span class="java-layer-method">flush</span>();     
}

<span class="java-block-comment">//////////////////////////////////////////////////////////////</span>
<span class="java-block-comment">// Accessor.</span>
<span class="java-keywords">public</span> <span class="java-keywords">class</span> RepositoryAccessor 
{
    <span class="java-keywords">private</span> <span class="java-layer-method">RepositoryAccessor</span>() {}
    <span class="java-keywords">private</span> <span class="java-keywords">static</span> Repository instance;
    <span class="java-block-comment">/**</span>
<span class="java-block-comment">     * Default way for clients to get instance.</span>
<span class="java-block-comment">     * Note: repositoryId is not used yet; a single shared instance is returned.</span>
<span class="java-block-comment">     */</span>
    <span class="java-keywords">public static synchronized</span> Repository <span class="java-layer-method">getRepository</span>(String repositoryId)
    {
        <span class="java-keywords">if</span> (instance == <span class="java-keywords">null</span>)
        {
            instance = (Repository)Lookup.<span class="java-layer-method">getDefault</span>().<span class="java-layer-method">lookup</span>(Repository.<span class="java-keywords">class</span>);
        }
        <span class="java-keywords">return</span> instance;
    }
}
        </pre>
    </p>
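    <p>
        To show how a client might use these interfaces, here is a minimal sketch of a
        round trip through a toy in-memory repository. Everything beyond the
        <code>Persistent</code> and <code>PersistentObjectFactory</code> method names is
        an assumption: <code>Declaration</code>, <code>DeclarationFactory</code> and
        <code>ByteArrayRepository</code> are hypothetical, a <code>String</code> stands in
        for the undefined <code>Identifier</code> type, and <code>throws IOException</code>
        has been added to the stream methods. Lookup registration, <code>init</code> and
        <code>flush</code> are left out.
    </p>

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;

// Restated from the proposal; "throws IOException" is an assumption added here.
interface Persistent {
    void write(OutputStream out) throws IOException;
    void read(InputStream in) throws IOException;
}

interface PersistentObjectFactory {
    Persistent createPersistent(int handle);
    int getHandle(Persistent obj);
}

/** Hypothetical persistable element: a named declaration with a line number. */
class Declaration implements Persistent {
    String name = "";
    int line;

    public void write(OutputStream out) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeUTF(name);
        data.writeInt(line);
        data.flush();
    }

    public void read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        name = data.readUTF();
        line = data.readInt();
    }
}

/** Hypothetical factory that knows a single kind of object, handle 1. */
class DeclarationFactory implements PersistentObjectFactory {
    static final int DECLARATION = 1;

    public Persistent createPersistent(int handle) {
        if (handle == DECLARATION) return new Declaration();
        throw new IllegalArgumentException("Unknown handle " + handle);
    }

    public int getHandle(Persistent obj) {
        if (obj instanceof Declaration) return DECLARATION;
        throw new IllegalArgumentException("Unknown class " + obj.getClass());
    }
}

/** Toy repository: keeps only serialized bytes; a String stands in for Identifier. */
class ByteArrayRepository {
    private final PersistentObjectFactory factory;
    private final Map<String, byte[]> records = new HashMap<>();

    ByteArrayRepository(PersistentObjectFactory factory) { this.factory = factory; }

    void put(String id, Persistent obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        bytes.write(factory.getHandle(obj)); // one-byte handle prefix (handles below 256)
        obj.write(bytes);
        records.put(id, bytes.toByteArray());
    }

    Persistent get(String id) throws IOException {
        byte[] record = records.get(id);
        if (record == null) return null;
        ByteArrayInputStream in = new ByteArrayInputStream(record);
        Persistent obj = factory.createPersistent(in.read()); // handle picks the class
        obj.read(in);
        return obj;
    }

    void remove(String id) { records.remove(id); }
}
```

    <p>
        The handle written in front of each record is what lets the repository ask the
        factory for an empty object of the right class before calling <code>read</code>.
        A real provider would write the bytes to disk rather than keep them in a map.
    </p>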
    
    <h2><a name="issue_detailes">Issue details</a></h2>
    <h3>Base Level</h3>
    <p>
        The most memory-critical part is the API Implementation component.
    </p>
    <h3>Tasks</h3>
    <p>
        <b>Action item:</b> Design the repository client and its place in the memory model
        <p>
            <img src="91546/repository.png" alt="repository"/>
        </p>
    </p>        
    <p>
        <b>Action item:</b> Analyze the current state of memory usage using a
        profiler on the MySQL project. Consider Library and Project elements separately:
        <ul>
            <li>
                Amount of used memory by different elements
            </li>            
            <li>
                Number of objects for different elements
            </li>
        </ul>            
    </p>
    <p>
        <b>Action item:</b> Prototype Light Weight Elements approach.
    </p>
    <p>
        <b>Action item:</b> Introduce CsmID as replacement for hard references to objects (for API clients)
        <p> 
            <img src="91546/UID.png" alt="UID"/>
        </p>
        
    </p>
    <p>
        <b>Action item:</b> Prototype using SoftReferences as the Java approach to memory management
    </p>
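    <p>
        A minimal sketch of what such a prototype could look like, assuming a cache whose
        values can be rebuilt on demand (for example, by re-reading them from the
        repository). <code>SoftCache</code> and its loader function are hypothetical names
        and not part of the code model.
    </p>

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical cache: values are softly reachable and reloaded on demand. */
class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<>();
    private final Function<K, V> loader;
    private int loads;

    SoftCache(Function<K, V> loader) { this.loader = loader; }

    synchronized V get(K key) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {              // never loaded, or cleared by the GC
            value = loader.apply(key);
            loads++;
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }

    /** Number of times the loader ran; useful for judging how often data is rebuilt. */
    synchronized int loadCount() { return loads; }
}
```

    <p>
        The garbage collector may clear softly reachable values when memory gets tight,
        so the cost of this approach is the time spent reloading; the load counter gives
        a first idea of how often that happens.
    </p>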
    <p>
        <b>Action item:</b> Rewrite the model to use RID instead of hard references, using KeyBasedUID in most cases.
        <p>
            <img src="91546/keyUID.png" alt="keyUID"/>
        </p>
    </p>
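    <p>
        The idea of a key-based UID can be sketched as follows. This is an illustration
        under assumptions, not the actual CsmID or KeyBasedUID design: the
        <code>Uid</code> interface, the <code>ModelStore</code> lookup and the choice of
        file path plus qualified name as the key are all hypothetical.
    </p>

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical UID: a small handle that clients keep instead of the object itself. */
interface Uid<T> {
    T getObject(); // may return null once the object has left the model
}

/** Hypothetical store standing in for the model or repository lookup. */
class ModelStore {
    private final Map<String, Object> objectsByKey = new HashMap<>();
    void put(String key, Object value) { objectsByKey.put(key, value); }
    void remove(String key) { objectsByKey.remove(key); }
    Object find(String key) { return objectsByKey.get(key); }
}

/** UID whose identity is derived from a key (here: file path plus qualified name). */
final class KeyBasedUid<T> implements Uid<T> {
    private final ModelStore store;
    private final String filePath;
    private final String qualifiedName;
    private final Class<T> type;

    KeyBasedUid(ModelStore store, String filePath, String qualifiedName, Class<T> type) {
        this.store = store;
        this.filePath = filePath;
        this.qualifiedName = qualifiedName;
        this.type = type;
    }

    private String key() { return filePath + "::" + qualifiedName; }

    @Override public T getObject() {
        return type.cast(store.find(key())); // Class.cast(null) returns null
    }

    @Override public boolean equals(Object other) {
        if (!(other instanceof KeyBasedUid)) return false;
        KeyBasedUid<?> that = (KeyBasedUid<?>) other;
        return filePath.equals(that.filePath) && qualifiedName.equals(that.qualifiedName);
    }

    @Override public int hashCode() { return Objects.hash(filePath, qualifiedName); }
}
```

    <p>
        Because equality depends only on the key, two UIDs created independently for the
        same element compare equal, and neither of them keeps the element alive.
    </p>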
    <p>
        <b>Action item:</b> Update API clients to use CsmID instead of hard references
        <p>
            <img src="91546/fileImpl.png" alt="fileImpl"/>            
        </p>
    </p>
    
    <h3>Some Implementation Notes to consider and not forget</h3>
    <p>
        <b>Action item:</b> FileBufferFile handles java.io.File objects (maybe the path is enough)
    </p>    
    <h4>Mem info for MySql on Opteron (19-01-2007)</h4>
    <table border="2">
        <thead>
            <tr>
                <th>Package</th>
                <th>Objects</th>
                <th>Shallow Size (bytes)</th>
                <th>Retained Size (bytes)</th>
            </tr>
        </thead>
        <tbody>
            <tr bgcolor="white">
                <td>all model</td>
                <td>2,677,267 (100%)</td>
                <td>75,214,328 (100%)</td>
                <td>185,727,576 (100%)</td>
            </tr>            
            <tr>
                <td>modelimpl</td>
                <td>1,417,137 (53%)</td>
                <td>44,408,944 (59%)</td>
                <td>177,009,664 (95%)</td>
            </tr>
            <tr>
                <td>apt</td>
                <td>1,260,093 (47%)</td>
                <td>30,804,776 (41%)</td>
                <td>82,807,280 (45%)</td>
            </tr>
            <tr>
                <td>repository</td>
                <td>0</td>
                <td>0</td>
                <td>0</td>
            </tr>
        </tbody>
    </table>

    <h4>Mem info for MySql on Opteron (22-01-2007) with "clean snapshot" prototype</h4>
    <table border="2">
        <thead>
            <tr>
                <th>Package</th>
                <th>Objects</th>
                <th>Shallow Size (bytes)</th>
                <th>Retained Size (bytes)</th>
            </tr>
        </thead>
        <tbody>
            <tr bgcolor="white">
                <td>all model</td>
                <td>1,418,969 (100%)</td>
                <td>44,453,680 (100%)</td>
                <td>112,002,680 (100%)</td>
            </tr>            
            <tr>
                <td>modelimpl</td>
                <td>1,415,948 (100%)</td>
                <td>44,380,408 (100%)</td>
                <td>103,736,544 (93%)</td>
            </tr>
            <tr>
                <td>apt</td>
                <td>2,984 (0%)</td>
                <td>72,664 (0%)</td>
                <td>9,110,824 (8%)</td>
            </tr>
            <tr>
                <td>repository</td>
                <td>0</td>
                <td>0</td>
                <td>0</td>
            </tr>
        </tbody>
    </table>
    <p>
        APT states handled by ProjectBase in modelimpl also affect memory usage.
    </p>
    <p>
        <img src="91546/APTStateHandlers.png" alt="APTState"/>
    </p>
    <pre>
Some details about "clean snapshot" (most improvements are in APT size).
                   Nr Objects              Shallow Size        Retained Size
APT part of DDD    317,000 -> 1,200        8 MB -> 28 KB       21 MB -> 3 MB
full DDD           663,000 -> 347,000      18.5 MB -> 11 MB    48 MB -> 29.5 MB
APT part of MySQL  1,260,000 -> 3,000      31 MB -> 72 KB      82 MB -> 9 MB
full MySQL         2,700,000 -> 1,400,000  75 MB -> 44.5 MB    186 MB -> 112 MB
    </pre>
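    <p>
        The relative savings follow directly from the "all model" rows of the two MySQL
        tables. The small sketch below does the arithmetic; <code>Reduction</code> is a
        hypothetical helper, and the input values are copied from the tables.
    </p>

```java
/** Hypothetical helper: relative reduction between two measurements. */
class Reduction {
    /** Percentage by which 'after' is smaller than 'before'. */
    static double percent(long before, long after) {
        return 100.0 * (before - after) / before;
    }

    public static void main(String[] args) {
        // "all model" rows for MySQL: 19-01-2007 versus the "clean snapshot" of 22-01-2007
        System.out.printf("Objects:       %.1f%% fewer%n", percent(2_677_267L, 1_418_969L));
        System.out.printf("Shallow size:  %.1f%% smaller%n", percent(75_214_328L, 44_453_680L));
        System.out.printf("Retained size: %.1f%% smaller%n", percent(185_727_576L, 112_002_680L));
    }
}
```

    <p>
        For MySQL this works out to roughly 47% fewer objects and roughly 40% less
        retained memory.
    </p>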
    <h2><a name="minority">Minority Opinion</a></h2>
    
    <p>
        to be done
    </p>
    
    <h2><a name="advisory">Advisory Information</a></h2>
    
    <p>
        Comment about repository from Nik:
    </p>
    <pre>
Sun Studio compilers have the "-sb" option.
This option is used to generate source browser data. It is a
kind of repository, and it can probably be considered as a candidate
for a "custom solution" repository. I don't know if it will fit,
but it seems worth checking out. 
    </pre>
    <p>
        
    </p>
    <p>
        Comment about stand alone parsing from Nik:
    </p>
    <pre>
Add the ability to parse projects outside of the IDE.
This is the only way to solve the problem with the memory limit.
The IDE cannot increase its own memory limit on the fly, but it can start
a child process with an increased maximum memory limit.
    </pre>
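    <p>
        A minimal sketch of that suggestion, assuming the parser can be packaged as a
        stand-alone Java program. <code>ParserLauncher</code>, the main class name and the
        class path passed in are hypothetical; only <code>ProcessBuilder</code> and the
        standard <code>-Xmx</code> JVM option are real.
    </p>

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical launcher for a stand-alone parser running in its own JVM. */
class ParserLauncher {
    /** Builds the command line; all four values are supplied by the caller. */
    static List<String> command(String maxHeap, String classPath,
                                String mainClass, String projectDir) {
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-Xmx" + maxHeap); // the child gets its own heap limit
        cmd.add("-cp");
        cmd.add(classPath);
        cmd.add(mainClass);
        cmd.add(projectDir);
        return cmd;
    }

    /** Starts the child JVM, shares the parent's console, and waits for the exit code. */
    static int run(List<String> cmd) throws IOException, InterruptedException {
        Process child = new ProcessBuilder(cmd).inheritIO().start();
        return child.waitFor();
    }
}
```

    <p>
        For example, <code>ParserLauncher.command("2g", "parser.jar",
        "org.example.StandaloneParser", projectDir)</code> would describe a child JVM with
        a 2 GB heap limit, independent of the limit the IDE itself was started with.
    </p>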
    <p>
        Comment about NIO from Tim:
    </p>
    <pre>
One option that seems not to be mentioned is using NIO memory mapped files for indexing sources.  
It doesn't solve the problem of nondeterministic behavior due to swapping, 
but it is a way around the heap size limit, and generally works quite well, 
and since you control the cache, some access-level optimizations are possible.  
If you can either do fixed-record-length caches or cache file + metadata lookup table, 
you can probably get around the heap limit quite nicely.

I wrote some (pretty embryonic and untested) code to build such caches in contrib/misc/cache - 
it's based on code I wrote for the NetBeans output window, which had to be able to handle 400Mb of text 
and still be able to scroll and line-wrap without a hiccup, 
and I was surprised at just how well it ended up working.

Ideally some existing library will do that for you, 
but this is the sort of thing that often needs to be really optimized for the particular use-case.  
Some people would argue that mapping objects to data kept in records off the Java heap 
is not "object oriented" enough, but it certainly is practical if you're dealing with huge data sets.   
    </pre>
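    <p>
        The fixed-record-length variant mentioned in this comment could look roughly like
        the sketch below. It is not the code from contrib/misc/cache; the class
        <code>MappedRecordFile</code> and the record layout (an <code>int</code> symbol id
        followed by a <code>long</code> file offset) are assumptions chosen for
        illustration.
    </p>

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical fixed-record store: each record is (int symbolId, long fileOffset). */
class MappedRecordFile implements AutoCloseable {
    static final int RECORD_SIZE = Integer.BYTES + Long.BYTES; // 12 bytes per record
    private final FileChannel channel;
    private final MappedByteBuffer buffer;

    MappedRecordFile(Path file, int maxRecords) throws IOException {
        long size = (long) maxRecords * RECORD_SIZE;
        channel = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
        if (channel.size() < size) {
            // Grow the file up front by writing one byte at the last position.
            channel.write(ByteBuffer.wrap(new byte[] {0}), size - 1);
        }
        // The mapped region lives outside the Java heap.
        buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, size);
    }

    void write(int index, int symbolId, long fileOffset) {
        int pos = index * RECORD_SIZE; // fixed length makes the position a multiplication
        buffer.putInt(pos, symbolId);
        buffer.putLong(pos + Integer.BYTES, fileOffset);
    }

    int symbolId(int index) { return buffer.getInt(index * RECORD_SIZE); }

    long fileOffset(int index) { return buffer.getLong(index * RECORD_SIZE + Integer.BYTES); }

    @Override public void close() throws IOException {
        buffer.force(); // push pending changes to the file
        channel.close();
    }
}
```

    <p>
        Since every record has the same length, no lookup table is needed to find record
        <i>n</i>; variable-length data would need the cache file plus metadata table
        arrangement that the comment also mentions.
    </p>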
    
    
    <h2><a name="appendices">Appendices</a></h2>
    
    <h3><a name="TCRs">Appendix A: Technical Changes Required</a></h3>
    
    <p>[File all TCRs in Issuezilla with P1 or P2 priority and make the issue representing this 
    review depend on them.]</p>
    
    <h3><a name="TCAs">Appendix B: Technical Changes Advised</a></h3>
    
    <p>[File all TCAs in Issuezilla with P3 to P5 priority and make the issue representing this 
    review depend on them.]</p>
    
    <h3><a name="references">Appendix C: Reference Material</a></h3>
    
    <p>[List additional materials relevant to reviewed case]</p>
    
</body>
</html>
