This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.
Created attachment 162108 nb-spellshecker-io.jpg

At initial startup, when NetBeans loads a project, there is a stage during which the message "Building dictionaries" appears in the status bar. In my environment it lasts around 7-10 seconds, blocks the subsequent background scan, and delays the point at which NetBeans becomes usable. The self profiler shows that 90% of that time goes into file IO done through RandomAccessFile (see attached screenshot). There is an API that was roughly 350 times faster in my test, and I recommend switching to it. Details are below:

randomAccessFile: 7535ms
mappedByteBuffer: 21ms

Both methods generate binary-identical files, in very different amounts of time.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import org.junit.Test;

public class FileRandomIOTest {

    public static int N_REPEAT = 1000000;
    public static int N_FILESIZE = N_REPEAT;

    public static int test_pos[] = new int[N_REPEAT];
    public static int test_data[] = new int[N_REPEAT];

    static {
        // Random offsets (leaving room for a 4-byte int at the end) and random payloads.
        for( int i = 0; i < N_REPEAT; i++ ) {
            test_pos[i] = (int)(Math.random() * (N_REPEAT - 8));
            test_data[i] = (int)(Math.random() * Integer.MAX_VALUE);
        }
    }

    @Test
    public void randomAccessFile() throws Exception {
        long started = System.currentTimeMillis();

        File tmpFile = File.createTempFile("randomAccessFile", ".tmp");
        try( RandomAccessFile out = new RandomAccessFile(tmpFile, "rw") ) {
            out.setLength(N_FILESIZE);
            // Every write is a seek followed by a small unbuffered write.
            for( int i = 0; i < N_REPEAT; i++ ) {
                out.seek(test_pos[i]);
                out.writeInt(test_data[i]);
            }
        }

        long finished = System.currentTimeMillis();
        System.out.println("randomAccessFile: " + (finished - started) + "ms");
    }

    @Test
    public void mappedByteBuffer() throws Exception {
        long started = System.currentTimeMillis();

        File tmpFile = File.createTempFile("mappedByteBuffer", ".tmp");
        // The mapping stays valid after the file is closed, but close it so the handle is not leaked.
        try( RandomAccessFile raf = new RandomAccessFile(tmpFile, "rw") ) {
            MappedByteBuffer out = raf.getChannel()
                    .map(FileChannel.MapMode.READ_WRITE, 0, N_FILESIZE);
            // Absolute puts go straight into the mapped memory.
            for( int i = 0; i < N_REPEAT; i++ ) {
                out.putInt(test_pos[i], test_data[i]);
            }
            out.force();
        }

        long finished = System.currentTimeMillis();
        System.out.println("mappedByteBuffer: " + (finished - started) + "ms");
    }
}

Current implementation of file IO in org.netbeans.modules.spellchecker.TrieDictionary (it can easily be adapted to use MappedByteBuffer):

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
private static class ByteArray {

    private final RandomAccessFile out;

    public ByteArray(File out) throws FileNotFoundException {
        this.out = new RandomAccessFile(out, "rw");
    }

    public void put(int pos, char what) throws IOException {
        out.seek(pos);
        out.writeChar(what);
    }

    public void put(int pos, byte what) throws IOException {
        out.seek(pos);
        out.writeByte(what);
    }

    public void put(int pos, int what) throws IOException {
        out.seek(pos);
        out.writeInt(what);
    }

    public void close() throws IOException {
        out.close();
    }
}
The MappedByteBuffer mentioned above (a memory-mapped file) has side effects in Java: it is not possible to explicitly close the mapping or resize the target file on disk. Since the generated file is rather small (only 4 MB on my machine), it is sufficient to build it in memory and then dump it to disk. Below is the adapted implementation, which works for me:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
private static class ByteArray {

    private final File outFile;
    private ByteBuffer buf = null;
    private int size = 0;   // highest offset written so far, i.e. the final file length

    public ByteArray(File out) throws IOException {
        this.outFile = out;
        // Fixed 65 MB heap buffer; puts beyond this capacity throw IndexOutOfBoundsException.
        this.buf = ByteBuffer.allocate(65 * 1024 * 1024);
    }

    private int markUsed(int pos, int itemSize) {
        size = Math.max(size, pos + itemSize);
        return size;
    }

    public void put(int pos, char what) throws IOException {
        buf.putChar(pos, what);
        markUsed(pos, Character.SIZE / 8);
    }

    public void put(int pos, byte what) throws IOException {
        buf.put(pos, what);
        markUsed(pos, Byte.SIZE / 8);
    }

    public void put(int pos, int what) throws IOException {
        buf.putInt(pos, what);
        markUsed(pos, Integer.SIZE / 8);
    }

    public void close() throws IOException {
        // Absolute puts never move the position, so the buffer still starts at 0.
        buf.limit(size);
        try( FileOutputStream fos = new FileOutputStream(outFile);
             FileChannel fc = fos.getChannel() ) {
            // Loop in case a single write() does not drain the whole buffer.
            while( buf.hasRemaining() ) {
                fc.write(buf);
            }
        }
    }
}
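For anyone who wants to double-check that an in-memory ByteBuffer really yields the same bytes as the seek-and-write RandomAccessFile approach, here is a minimal sketch. The class and method names (ByteOrderCheck, viaRandomAccessFile, viaByteBuffer) are made up for illustration and are not part of NetBeans. It relies on ByteBuffer defaulting to big-endian order, which is also the order used by RandomAccessFile.writeInt and writeChar.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.util.Arrays;

// Hypothetical helper, not part of the spellchecker module.
public class ByteOrderCheck {

    private static final int SIZE = 16;

    /** Writes three values at fixed offsets via seek + write and returns the file content. */
    public static byte[] viaRandomAccessFile() {
        try {
            File tmp = File.createTempFile("raf-check", ".tmp");
            try {
                try (RandomAccessFile out = new RandomAccessFile(tmp, "rw")) {
                    out.write(new byte[SIZE]);   // zero-fill explicitly so every byte is defined
                    out.seek(8);
                    out.writeInt(0x01020304);
                    out.seek(0);
                    out.writeChar('A');
                    out.seek(15);
                    out.writeByte(0x7F);
                }
                return Files.readAllBytes(tmp.toPath());
            } finally {
                tmp.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes the same values at the same offsets using absolute puts on a heap buffer. */
    public static byte[] viaByteBuffer() {
        ByteBuffer buf = ByteBuffer.allocate(SIZE);   // big-endian by default
        buf.putInt(8, 0x01020304);
        buf.putChar(0, 'A');
        buf.put(15, (byte) 0x7F);
        return buf.array();
    }

    public static void main(String[] args) {
        // Prints whether both approaches produced identical bytes.
        System.out.println(Arrays.equals(viaRandomAccessFile(), viaByteBuffer()));
    }
}
```

If someone later switches the buffer to little-endian with buf.order(...), this check would start failing, which is exactly the kind of mismatch that would corrupt the dictionary file.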
I will look at this for the next version, but it is not a top priority right now, since it is not a regression.
Please note that the writing was originally done in memory, but that was causing an OOME in some cases: https://netbeans.org/bugzilla/show_bug.cgi?id=191287
I'm not very certain of my conclusions, but it seems to me that the above-mentioned issue is the wrong fix for a non-existent problem.

"Wrong", because the performance degradation caused by the change is far more significant than the memory it saves.

"Non-existent", because I suspect that the main reason for the OOM was simply an Xmx that was too small. The memory dump attached there is 55 MB, which is roughly the allocated heap size for that application and suggests a rather low Xmx (in addition, the dump does not contain anything pointing to TrieDictionary). Most probably the failure occurred during serialization of TrieDictionary only because that was the last thing doing any work at that moment. It would have happened to anything else that tried to allocate another, let's say, 500K of memory. The dictionary currently used in NetBeans is roughly 40 times larger than the file attached to the mentioned issue, and in serialized form it takes 3.7 MB. That is next to nothing on modern computers.

So the quickest and most cost-efficient solution "for today" would be to revert the changes made in #191287. My proposal to use various flavors of ByteBuffer actually requires pre-allocating more memory than the original implementation did, so it is less memory-efficient.

The "ideal" fix for this issue, in terms of optimal performance, is to rework the serialization algorithm so that it saves through a sequential, buffered DataOutputStream. I am not sure whether that is possible and, most probably, the optimization gained would not justify the effort spent on the rework. I would not go that way unless the OOM reoccurs with a reasonable amount of memory.
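To illustrate what "sequential buffered DataOutputStream" could look like, here is a rough sketch. It is written under heavy assumptions: the Node class, the pre-order layout, and all names (SequentialTrieWriter, insert, save, toBytes) are hypothetical and do not reflect the actual TrieDictionary data structures or on-disk format. The only point is that a depth-first pass can emit each node exactly once, front to back, with no seeks.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; not the format used by org.netbeans.modules.spellchecker.TrieDictionary.
public class SequentialTrieWriter {

    /** Assumed in-memory trie node. */
    public static final class Node {
        boolean endOfWord;
        final Map<Character, Node> children = new TreeMap<>();
    }

    public static void insert(Node root, String word) {
        Node n = root;
        for (char c : word.toCharArray()) {
            n = n.children.computeIfAbsent(c, k -> new Node());
        }
        n.endOfWord = true;
    }

    /** Pre-order layout: [endOfWord: 1 byte][childCount: int], then per child [char][subtree]. */
    private static void write(Node node, DataOutputStream out) throws IOException {
        out.writeByte(node.endOfWord ? 1 : 0);
        out.writeInt(node.children.size());
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            out.writeChar(e.getKey());
            write(e.getValue(), out);   // recursion depth is bounded by the longest word
        }
    }

    /** Streams the whole trie to the target in one sequential, buffered pass. */
    public static void save(Node root, OutputStream target) throws IOException {
        DataOutputStream out = new DataOutputStream(new BufferedOutputStream(target));
        write(root, out);
        out.flush();   // the caller owns and closes the target stream
    }

    /** Convenience wrapper that serializes a few words into a byte array. */
    public static byte[] toBytes(String... words) {
        Node root = new Node();
        for (String w : words) {
            insert(root, w);
        }
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            save(root, sink);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }
}
```

Writing to disk would then be a matter of passing a FileOutputStream to save(...). The catch is that this layout stores no child offsets, so a reader has to parse it front to back. If the existing format depends on offsets for lookups directly in the file, the writer would need a size-computing pass first, which is part of why the rework may not be worth the effort.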