This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 124779

Summary: automatic detection of Javadoc root and source root
Product: java Reporter: fommil <fommil>
Component: ProjectAssignee: David Konecny <dkonecny>
Status: RESOLVED FIXED    
Severity: blocker CC: apireviews, dkonecny, jbecicka, jrojcek, mkleint, pjiricka, tzezula, wadechandler
Priority: P2 Keywords: API, API_REVIEW_FAST
Version: 6.x   
Hardware: All   
OS: All   
URL: http://www.nabble.com/extract-Javadoc-from-source-to14653323.html
Issue Type: ENHANCEMENT Exception Reporter:
Bug Depends on: 154894    
Bug Blocks:    
Attachments: patch implementing the feature

Description fommil 2008-01-06 21:45:50 UTC
When a jar/zip archive for source or Javadoc is attached to a library, NetBeans will automatically assume that the zipfile will start at the top level of the 
archive. e.g. by adding an entry to config/org-netbeans-api-project-libraries/Libraries/xstream.xml 

jar:file:/path/to/xstream-distribution-1.2.2-src.zip!/ 

But this is not always the case... in fact it is very rarely the case with most open source library sources and javadocs. The current workaround is to manually 
edit this file to point to

jar:file:/path/to/xstream-distribution-1.2.2-src.zip!/xstream-1.2.2/xstream/src/java

The solution could be fully automated by looking for relevant entries within the archive and setting the root appropriately, or partially-automated by 
prompting the user with an archive browser. Eclipse does the former for source archives and the latter for javadoc archives.
Comment 1 fommil 2008-01-06 21:50:18 UTC
it occurred to me that I wrote some code that does this for javadocs... in a different scenario. Check out the source code for the file JavadocParanamer in the 
project Paranamer

http://svn.codehaus.org/paranamer/trunk/paranamer/src/java/com/thoughtworks/paranamer/JavadocParanamer.java
Comment 2 fommil 2008-01-18 09:03:47 UTC
if somebody in the dev team can create the dialog box, I would be happy to write the javadoc/source archive scanner (and validator). We could agree on a 
short API for me to write to.
Comment 3 fommil 2008-01-28 02:00:04 UTC
I am willing to write a class that would enable the auto-detection of the correct base within a source or Javadoc archive, also accounting for archives with 
multiple bases (e.g. the SUN JDK source or documentation archives). I would be greatly obliged if somebody could contact me so that we can agree on an 
API.

At the moment I'm thinking that the following could work...

  /**
   * @param file
   * @throws NullPointerException if file is null
   * @throws IllegalArgumentException if the file does not exist or
   * if the file is not a valid archive.
   */
  public ArchiveBaseDetector(File file);

  /**
   * @returns the list of valid paths within the archive
   * that look like the base of Java source code. Never
   * null but may be empty. Uses the '/' character as
   * the path separator.
   * @throws IOException
   */
  public List<String> getJavaSourceBases() throws IOException;

  /**
   * @returns the list of valid paths within the archive
   * that look like the base of Javadocs. Never
   * null but may be empty. Uses the '/' character as
   * the path separator.
   * @throws IOException
   */
  public List<String> getJavadocBases() throws IOException;

I'd appreciate any help or additional requirements.

Implementation-wise, the Javadoc searcher should be relatively trivial. The base of any Javadoc directory should have a package-list file. I can simply search 
for these.

The source archive detector is slightly more complicated as it involves finding .java files and then parsing them to find the package name in order to discover 
the base directory. Other files in that branch can then be ignored.

As a test case, I will ensure that the code works on the SUN JDK source code and documentation archives... as those archives have multiple bases. I will also 
choose a popular open source library to test the single-base scenario.

This solution would require user feedback only when more than one base is found... and even then, some bases could be ignored. e.g. if the returned paths 
are "/libraryname/src/java" and "/libraryname/src/tests", then the latter should probably be ignored and the user not prompted. This is a much better 
solution than the original proposal of providing an archive browser, which is what Eclipse does.
Comment 4 Petr Jiricka 2008-01-28 13:13:54 UTC
Hi Sam, I'd like to ask about concrete libraries which have sources/javadoc in a non-root location. Are these your own
libraries, or 3rd party/open source ones? Can you please specify some examples of such libraries, so we know what to use
as test cases? Thanks.
Comment 5 fommil 2008-01-28 13:49:47 UTC
pjiricka: they are all third party (open source) libraries. We currently have about 30 different libraries (some contain more than one jar) and out of that list, only 
4 or 5 have sources or javadocs at the base of the archive. Hell, even the SUN JDK sources/javadocs aren't at the root!

Some typical examples (just google for them):-

- xstream
- commons-lang
- hsqldb
- hibernate (all packages)
- pretty much every J2ME Javadoc archive
Comment 6 Petr Jiricka 2008-01-28 14:04:47 UTC
These are very usual libraries that many projects routinely use, so I would say many people will hit this issue. I think
it is important to address this enhancement for 6.1.
Comment 7 fommil 2008-01-28 14:08:28 UTC
pjiricka: I'll get an implementation of the detector classes to this issue fairly soon (probably today). I'll also make a start on a few test cases.
Comment 8 Tomas Zezula 2008-01-28 14:16:21 UTC
It's not a problem to implement this, the code is already there, but it's a problem of designing an UI. Adding Jano to cc.
Comment 9 fommil 2008-01-28 14:30:23 UTC
oh right... ok... so code already exists to do both javadoc and source detection?

Well most sources/javadocs have only one base, so there is no need for a UI if there is only one base that is found. A simple (no-UI needed) alternative would 
be to add all the bases that are found as separate entries!

An archive browser is not required, so the UI could just be to present the user with a dialog after attaching sources/javadocs that presents the discovered 
roots and asks for a choice. Is that what you were thinking, or is there a lot of bureaucracy involved in getting a UI piece like this approved?
Comment 10 Tomas Zezula 2008-01-28 16:45:18 UTC
Yes, there is such a code for javadoc and sources somewhere in the j2seplatform module.
The no UI version is fine for me, the check of roots in the archive will be done to level 3 and all
source (javadoc) roots will be added automatically when you select the zip file.
Is it OK?
Comment 11 fommil 2008-01-28 17:08:50 UTC
sounds great!
Comment 12 jrojcek 2008-01-29 09:25:16 UTC
I support the solution with no special dialogs. If we can discover the roots of sources and javadoc, the best thing is to add all of them as separate items as 
suggested by fommil. If the user doesn't want some of the discovered items, she/he can remove them. I think that would well support the majority of cases.
Comment 13 David Konecny 2008-03-06 01:36:16 UTC
Tomas, this was not implemented yet, right?
Comment 14 fommil 2008-03-18 09:43:50 UTC
Is this part of 6.1 ?
Comment 15 fommil 2008-07-03 23:50:54 UTC
could somebody please think about adding this to a milestone? (lest it be forgotten) It results in the Javadoc window (and parameter name hints) being broken 
for most open source libraries as they do not always place the source/javadocs at the base of the archive. I'd say that's a fairly major bug.
Comment 16 David Konecny 2008-07-04 00:30:10 UTC
I do agree this is very common scenario and it is a shame we still have not done anything with such a prominent
deficiency. Unless you tell me not to Tomas I will go ahead and fix this for 6.5 M2.
Comment 17 Tomas Zezula 2008-07-04 13:51:42 UTC
David, it would be nice, if you have a time for this. I am now working on the new language integration framework, and I have no time for this. The code for 
finding source (javadoc) roots is already part of several modules. The code for finding javadoc root is in JavadocForBinaryQueryPlatformImpl and code for 
finding source root is in DefaultClassPathProvider. So you can reuse it.
Comment 18 David Konecny 2008-07-06 21:05:01 UTC
Re. time - it is not like I do not have anything else to do :-) but if code already exists and it is matter of hour or
two of tweaks and testing and its done then I can find time to do this. If there are some complications and it looks
more difficult I will update this issue and pass it back to you.
Comment 19 David Konecny 2008-07-17 04:28:13 UTC
Created attachment 64790 [details]
patch implementing the feature
Comment 20 David Konecny 2008-07-17 04:49:13 UTC
Please review following three new public API methods:
* JavadocAndSourceRootDetection.findJavadocRoot - traverses up to 5 folders deep and is looking for "package-list" file
* JavadocAndSourceRootDetection.findSourcesRoot - searches for a first Java file and uses below method findPackageRoot on it
* JavadocAndSourceRootDetection.findPackageRoot - private implementation taken from
java.j2seplatform/DefaultClassPathProvider.java and made public

First two methods if they find a match they cache it via NbPreferences. The methods are called whenever user selected a
JAR/folder in #1) set Javadoc/sources on a library and #2) set Javadoc/sources on a project's JAR/folder. At the moment
UI warning is not shown consistently when Javadoc/sources root detection reports none-found but that's how it's been for
a while now. Important is that calling these methods will pre-cache results and consequently will make performance of
JavadocForBinaryQ and SourceForBinaryQ constant.

What remains to be done:
* update apichanges document

Please review. Thanks, -David.
Comment 21 Jesse Glick 2008-07-17 23:41:10 UTC
[JG01] Why is SourceReader public?


[JG02] Test probably does not need a "live" project in the data dir. In fact could probably dispense with data dir
altogether and just create example Java sources in various disk locations - would be more compact to test a lot of
different scenarios.


[JG03] findJavaPackage (and part of findPackageRoot) look to do about the same thing as the much simpler regex-based
code I have in contrib/autoproject.java/src/org/netbeans/modules/autoproject/java/ClassPathProviderImpl.java (which I
would want to have call this new API when it is ready, of course).
Comment 22 David Konecny 2008-07-18 02:34:49 UTC
Thanks for comment Jesse.

JG01 - is it? was not intentional.

JG02 - no need for a live project - just something I could test on; I will look at it again

JG03 - your regexp is better but I do not like loading of whole file into memory; I'm little bit hesitant to do any
changes in implementation - I just wanted to reuse whatever has already being used and fix this problem.
Comment 23 Jesse Glick 2008-07-18 17:41:56 UTC
Regarding loading the entire file into memory - generally loading a single file of any reasonable size is going to be
nearly the same speed as opening it and just reading part of it, since the operating system is optimized to stream the
whole file (again unless you are talking about some 3GB DVD!). Anyway if you are searching for a package root it is
likely you will only wind up inspecting a single *.java file.

The regex is attractive because it is more precise and much simpler. (Also should handle package-info.java correctly
even with annotations on the package declaration.)

But whichever you prefer should be suitable, so long as tests pass. If you go with the big impl, I will still make
autoproject.java call the standard impl, though I would add the regex version in a comment to the standard impl for a
rainy day.
Comment 24 Tomas Zezula 2008-07-18 18:13:17 UTC
tz: JavadocAndSourceRootDetection is called from the queries on roots got from persistent state. Shouldn't be the root stored into the library (platform) in 
already fixed way. If someone uses Library.getContent("javadoc") he will need to call JavadocAndSourceRootDetection to obtain the same result as the JFBQ 
returns.
Comment 25 Jesse Glick 2008-07-18 18:34:18 UTC
To tz: absolutely. I would expect that the various wizards and customizer dialogs for platforms and libraries would save
the actual S/J root in the platform/lib settings file, so that the various queries need only return the stated URL
directly. If it doesn't already work this I would consider it a bug that could be fixed independently (with care for
compatibility of old settings files).
Comment 26 David Konecny 2008-07-22 22:36:26 UTC
JG02 - tests rewritten

JG01, JG03 - I'm using Jesse's regular expression version as it is *way* simpler

tz - I changed the code so that any UI - that is #1) Libraries Customizer and #2) JAR's sources and javadoc customizer -
to perform detection and to store directly javadoc/sources roots path. That eliminates caching from previous impl.

Fixed as 9387b72807ce and 9030d60cfc5e
Comment 27 David Konecny 2008-07-22 22:36:55 UTC
.
Comment 28 Jesse Glick 2008-07-22 22:52:45 UTC
I think you will find TestFileUtils (from either openide.util or openide.filesystems, acc. to whether you prefer File or
FileObject) to be useful in creating dummy files and ZIPs, as in JavadocAndSourceRootDetectionTest.


Regarding JG03 - it looks like you used an older version of my code which does not contain some miscellaneous fixes:

http://hg.netbeans.org/main/contrib/rev/16486c8846de
Comment 29 Jesse Glick 2008-07-22 23:17:16 UTC
[JG04] findPackageRoot fails to mention that it can return null.


[JG05] getPackageRoot checks for FileObject.getPath() ending with a suffix containing File.separator, which will never
be true on Windows.
Comment 30 Jesse Glick 2008-07-22 23:32:17 UTC
[JG06] "findSourcesRoot" is not great English. "findSourceRoot" would be better.
Comment 31 David Konecny 2008-07-22 23:33:48 UTC
Thanks Jesse for catching these, I will amend it ASAP. Yes, I have not fetch newer contrib for a while so I missed your
change.

[JG05] - are you saying that FileObject's path always uses '/' and not OS specific file separator? I see, completely
forgot about that.

Btw. I had to rewrite inferRootFromPackage() to use FileObjects instead of Files to support JAR entries as well.
Comment 32 Jesse Glick 2008-07-23 00:14:22 UTC
JG05 - right.

Regarding File vs. FileObject - right, we need the ability to search inside JARs. I hope the performance impact will not
be significant, especially for findSourcesRoot which does a tree traversal. Probably should be OK - should normally find
a Java source file quickly enough.

Making use of the new API in autoproject.java:

http://hg.netbeans.org/main/contrib/rev/9735ba80a14f
Comment 33 David Konecny 2008-07-23 02:30:34 UTC
Jesse's comments implemented in c7eba1a17a4a. autoproj on contrib updated in fa6a95fc783f.

re. performance - at the moment scan for Javadoc/sources root is done in UI after eg. closing a JFileChooser so it
should be OK even if it takes little bit of time. But hopefully it will be always quick not to notice.