This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.
Summary: | MIMEResolution doesn't scale | ||
---|---|---|---|
Product: | platform | Reporter: | Petr Nejedly <pnejedly> |
Component: | Filesystems | Assignee: | Petr Nejedly <pnejedly> |
Status: | RESOLVED FIXED | ||
Severity: | blocker | CC: | issues, jglick, jskrivanek, jtulach, pjiricka, rmatous |
Priority: | P2 | Keywords: | PERFORMANCE |
Version: | 6.x | ||
Hardware: | All | ||
OS: | All | ||
Issue Type: | ENHANCEMENT | Exception Reporter: | |
Bug Depends on: | |||
Bug Blocks: | 114195 |
Description
Petr Nejedly
2007-10-01 14:04:50 UTC
There are currently two registration points for mime resolvers, meta-inf and services lookup. All in meta-inf is currently before everything in services lookup. Most of the entries in the services lookup are xml files that are processed by a private implementation into a resolver instances (private impl as well). If we change the xml processing to generate other instances that MIMEResolvers (e.g. MIMEResolverData) and register a new, singleton, MIMEResolver as a last slot in the meta-inf, we can make this resolver gather all the registered instances of MIMEResolverData (private class) and let it do post-processing of the instances (turning them into an internal sequence of MIMEResolvers), it would keep the semantics but allow much finer control of the resolution process. The only pitfall is a potential presence of explicitely coded MIMEResolver instance in the services folder, in the middle of the declaration xmls. (There _is_ one currenly in the big NetBeans). To solve this compatibly, we could find the original position of such an offender and split the optimized resolver list into two halves, having the offender between them. Of course, in the real product and once having tests for this case, we would move such a resolver somewhere else (end of meta-inf), as log as would the real (human-from-source-code-inferred) ordering be equivalent. I'll try to post an initial sketch of the framework later. I have also implemented an ugly proof-of-concept implementation (which didn't even obey the outlined rules) and it loked like it works reliably so far (no extensive testing, though). Note that we don't really need to implement optimal, bullet-proof classification framework, it can happily stay as a heuristic that works for us. Is calling a single method on a collection of ~40 resolvers, most of which will return null, really so expensive? Yes, if *) you do it 120 times in a row (e.g. 120 files in a folder). *) many resolver instances contains more sub-instances (more <file> tags in the declaration). *) each instance call typically call FO.getExt, a nontrivial method to begin with. *) the order of resolvers is suboptimal. *) each magic resolver copies its load of bytes again through InputStream API. And we're talking about scalability here, the iteration never scales much. The individual issues are fixable, but the scallability limit will remain. And I'm not sucking my thumb to get the arguments, I did the profiling, several times. If we solve all the subissues of 114195, we can speed up folder expansion (which regressed recently) by more than 60% Because keeping mutual order of real MIMEResolver instances and xml-based instances of some private helper class seems complicated (probably would require copying part of the FolderLookup implementation), we can do it in following compatible way: Let the "private implementation" actually implement the MIMEResolver interface (abstract class in reality), being mostly no-op (return null, should be fast enough). Let the convertor infrastructure look up MIMEResolver, not the private interface. It can then filter the instances to private and unknown, and optimize the private. For example, if there is a sequence: METAINF-Unknown1 METAINF-Unknown2 Services-xml1 Services-xml2 ... Services-xml15 Services-Unknown3 Services-xml16 Services-xml17 ... then Services-xml1 instance would cover the functionality of xml1-xml15, xml-15 would cover xml16-... and the other xml instances would do nothing(return null). Idea: rather than hack this case, when we will likely want to do something similar for e.g. DataLoader in the future: /** Metaservice which can coalesce adjacent services into an optimized form. */ interface LookupCoalescer<T> { Class<T> type(); T/*|null*/ coalesce(T former, T latter); } class Lookups { /** Coalesces an ordered list of services using all coalescers known in default lookup. */ public static <T> Lookup.Result coalesce(Lookup.Result<T> r, Class<T> type); } Usage: package core; class XMLMimeResolver extends MIMEResolver {...} public/* for M-I/services */ class XMLMRCoalescer implements LookupCoalescer<MIMEResolver> { public class<MIMEResolver> type() {return MIMEResolver.class;} public MIMEResolver coalesce(MIMEResolver m1, MIMEResolver m2) { if (m1 iof XMLMIMEResolver && m2 iof XMLMIMEResolver) { return new XMLMIMEResolver(m1.data.append(m2.data)); } else return null; } } package fs; static Lookup.Result<MIMEResolver> resolvers = Lookups.coalesce(Lookup.getDefault().lookupResult(MIMEResolver.class), MIMEResolver.class); public static String getMIMEType(FileObject f) { for (MIMEResolver m : resolvers.allInstances()/* or cached MIMEResolver[] + change listener */) { String t = m.getMIMEType(f); if (t != null) return t; } return null; } I have had something similar already coded as a proof of concept (not generic, tight to MIMEResolvers), but abadoned that approach, as it was API change. And it won't work with DataLoaders, they are not in Lookup currently. I'd like to try implementing the mime resolver coalescing without a visible change first, to prove it is sound and helpfull. We can switch over to your idea later, no doors closed by my hack. DataLoader's are not in Lookup currently but they probably could and should be. Changing to enhancement after filing issue 117781. Slightly changing the order of MIME resolvers can save us from the worst problems in common cases. While this change is desirable, it might not end up in 6.0 and thanks to the other fixes under umbrella issue 114195 is not strictly necessary to keep reasonable performance. *** Issue 64461 has been marked as a duplicate of this issue. *** Jesse rewritten the mime resolvers during 6.5 and move their implementation to openide.filesystems. Moving the impl of resolvers to openide.filesystems had nothing (intentional) to do with performance, so I'm not sure why this is being closed now. Perhaps we have done what we needed to do in 6.5, but that would be orthogonal to the move. I like Petr's idea of having a singleton resolver which reads all the declarative resolvers and placing this in M-I/s lookup in a specific location. This would permit individual procedural resolvers to decide for themselves whether to run before or after the block of declarative resolvers, whereas in the current state this order is hardcoded (declarative followed by procedural). It would be enough to get the position attribute from MetaInfServicesLookup.Item. Then procedural resolvers can be merged with declarative resolvers. BTW, I will try to enhance declarative resolvers to be capable of detecting files which only procedural resolvers can. Then we can replace all existing procedural resolvers in IDE. |