
Bug 207137

Summary: Cache downloaded NBMs
Product: apisupport
Component: Harness
Reporter: Jesse Glick <jglick>
Assignee: Jesse Glick <jglick>
Status: RESOLVED WONTFIX
Severity: normal
Priority: P3
Version: 7.1
Hardware: All
OS: All
Issue Type: ENHANCEMENT
CC: jtulach
Bug Depends on: 196428
Bug Blocks: 197038
Attachments: Proposed patch

Description Jesse Glick 2012-01-10 18:00:09 UTC
nbproject/platform.xml currently just downloads all NBMs to the selected platform location if it is empty. For purposes of CI, where you want to clean unversioned files from the workspace before each build, this is unpleasant because you do not want to define e.g. nbplatform.active.dir=${suite.dir}/nb - the long slow download would occur on each build. nbplatform.active.dir=${suite.dir}/.hg/nb solves this but creates its own problem: when you switch platform versions, you need to wipe out the workspace. It also does not suffice to let multiple jobs built against the same NB version share download bandwidth.

These issues are especially irritating since dlc.sun.com.edgesuite.net often hangs during downloads for a few hours at a time.

We would prefer to keep some sort of cache, the way Maven does. Unfortunately it is not guaranteed that an OpenIDE-Module-Specification-Version uniquely identifies an NBM build. On the other hand, this in combination with OpenIDE-Module-Implementation-Version (or -Build-) is probably enough.
Comment 1 Jaroslav Tulach 2012-01-10 18:51:17 UTC
All Linux distros I know have a development update center where you can upload a new version of your module only if you increase the spec version. In my opinion we should do the same.

Let's modify nbms-and-javadoc to only update new nbms if spec version is boosted. Then the spec version uniquely identifies the module.
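The "only if the spec version is boosted" rule above can be sketched as a simple comparison. This is only an illustration of the policy, not code from nbms-and-javadoc; the function names are hypothetical, and it assumes spec versions are plain dotted decimal numbers.

```python
def parse_spec_version(text):
    """Turn a dotted spec version such as "1.23.1" into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def should_publish(published_spec, candidate_spec):
    """Accept a candidate NBM only if its spec version is strictly higher.

    published_spec is None when the module is not yet in the update center.
    Hypothetical helper: it illustrates the proposed policy only.
    """
    if published_spec is None:
        return True
    return parse_spec_version(candidate_spec) > parse_spec_version(published_spec)
```

Comparing tuples of integers rather than strings keeps "1.10" newer than "1.9".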

Of course caching into ./netbeans/hashCode(url-of-au-catalog) is a good idea.
Comment 2 Jesse Glick 2012-01-10 21:34:39 UTC
(In reply to comment #1)
> a development update center where you can upload
> new version of your module only if you increase the spec version

I have thought of this as well [1]. Doing it may not be simple, however.

> Let's modify nbms-and-javadoc

Actually the more common use case is updates.netbeans.org.

> caching into ./netbeans/hashCode(url-of-au-catalog) is good idea

I was actually planning to cache based on spec version + impl/build version, which is insensitive to minor variants in catalog URL, and works even when the UC does not require new spec versions to publish.

The impl version will typically be e.g. "201107282000" (7.0 UC) or "201112071828" (7.1 UC) or "nbms-and-javadoc-8507-on-20120109" (self-explanatory). Unfortunately it seems that UC generation discards OpenIDE-Module-Build-Version (in fact autoupdate-catalog-2_6.dtd does not even allow it), meaning that the subset of modules using something like OpenIDE-Module-Implementation-Version="1" have no useful identifying build number. However #/module_updates@timestamp is available when parsing and could be used as a fallback value.

[1] http://wiki.netbeans.org/AggregatingUC#Ill-defined_.22versions.22_of_modules_released_from_CI
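A minimal sketch of the cache key described in this comment: spec version plus impl version, falling back to the catalog timestamp when the impl version does not identify a build. This is not taken from the attached patch. The function names, the module names in the usage note, and the "at least 8 characters" heuristic for a useful impl version are all assumptions made for illustration.

```python
import re


def is_identifying(impl_version):
    # ASSUMPTION (hypothetical heuristic): values like "1" carry no build
    # information, while "201112071828" or
    # "nbms-and-javadoc-8507-on-20120109" do. Length is used as a crude test.
    return bool(impl_version) and len(impl_version) >= 8


def cache_key(code_name_base, spec_version, impl_version, catalog_timestamp):
    """Build a file-system-safe cache key for one NBM."""
    if is_identifying(impl_version):
        build_id = impl_version
    else:
        # Fall back to the catalog-level timestamp attribute.
        build_id = "ts-" + catalog_timestamp
    raw = "{}-{}-{}".format(code_name_base, spec_version, build_id)
    # Replace anything that is awkward in a file name, e.g. "/" in timestamps.
    return re.sub(r"[^A-Za-z0-9._-]", "_", raw)
```

For example, cache_key("org.example.mod", "1.0", "1", "00/30/18/10/01/2012") falls back to the timestamp, whereas an impl version of "201112071828" is used directly. Both module name and timestamp value here are made up.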
Comment 3 Jesse Glick 2012-01-10 23:27:30 UTC
Created attachment 114777
Proposed patch
Comment 4 Jesse Glick 2012-01-10 23:28:37 UTC
Please review.
Comment 5 Jaroslav Tulach 2012-01-12 10:06:54 UTC
I don't think [1] or the patch is heading in the right direction. Rather than modifying AutoUpdateTask, the scripts should be enhanced.

If a cache is defined (and I think it should have a default), the platform.xml script should mirror the center to the cache as
http://wiki.netbeans.org/wiki/images/9/9e/Mirror.xml
does and then just populate the working directory from the mirror.
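The mirror-then-populate flow suggested here can be sketched as follows. This is not the actual Mirror.xml or platform.xml logic. The function name, the flat cache layout, and the injected fetch callable (standing in for the real download step) are assumptions for illustration.

```python
import shutil
from pathlib import Path


def populate_platform(nbm_names, cache_dir, work_dir, fetch):
    """Fill work_dir with NBMs, downloading only those missing from cache_dir.

    fetch(name, destination_path) is a hypothetical callable that performs
    the slow network download; it is called only on a cache miss.
    """
    cache_dir = Path(cache_dir)
    work_dir = Path(work_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    work_dir.mkdir(parents=True, exist_ok=True)
    for name in nbm_names:
        cached = cache_dir / name
        if not cached.exists():
            fetch(name, cached)  # network access happens here, once per NBM
        # The workspace copy is cheap and can be wiped before every build.
        shutil.copy2(cached, work_dir / name)
```

With this split, cleaning the workspace before each CI build costs only a local copy, as long as the cache directory lives outside the workspace.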

The same applies to the nbms-and-javadoc or updates update center. The build can produce a temporary update center (not published outside) and then the existing update center is upgraded via the Mirror.xml script.

I can provide patches to demonstrate the approach works.
Comment 6 Jesse Glick 2012-01-12 15:05:54 UTC
(In reply to comment #5)
> the platform.xml script should mirror the center to the cache [...]
> and then just populate the working directory from the mirror.

You cannot mirror all UCs to the same cache location, because you may have unrelated jobs building against 6.9, 7.1, nbms-and-javadoc, etc. You could use some kind of Ant magic (<checksum>?) to produce a hash of the UC URL and cache into a corresponding subdirectory, though this is brittle since it would produce different caches for e.g. [2] and [3] and [4] and [5] even though they all have identical content at a given time.
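To illustrate the brittleness: hashing the catalog URL, sketched here in Python with SHA-1 rather than with Ant's <checksum>, maps URLs [2] through [5] to four different cache subdirectories even though they serve the same content. The function name and the 12-character truncation are arbitrary choices for this sketch.

```python
import hashlib

# The four catalog URLs listed as [2]-[5] in this comment.
CATALOG_URLS = [
    "http://updates.netbeans.org/netbeans/updates/7.0.1/uc/final/distribution/catalog.xml.gz",
    "http://dlc.sun.com.edgesuite.net/netbeans/updates/7.0.1/uc/final/distribution/catalog.xml.gz",
    "http://updates.netbeans.org/netbeans/updates/7.0.1/uc/final/distribution/catalog.xml",
    "http://dlc.sun.com.edgesuite.net/netbeans/updates/7.0.1/uc/final/distribution/catalog.xml",
]


def cache_subdir(catalog_url):
    """Derive a cache subdirectory name from the catalog URL alone."""
    return hashlib.sha1(catalog_url.encode("utf-8")).hexdigest()[:12]


if __name__ == "__main__":
    for url in CATALOG_URLS:
        print(cache_subdir(url), url)
```

Each URL gets its own subdirectory, so jobs that differ only in mirror host or in the .gz suffix would not share downloads.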

> The build can
> produce a temporary update center (not published outside) and then existing
> update center is upgraded via the Mirror.xml script.

You have suggested this in the past and I already described in [1] why it is not very satisfactory - requires rather complex CI job setup involving lastStableBuild, and is not flexible enough to handle alternate policies like (b1) or (b2). Anyway this is somewhat off topic for this issue, which is about how to address a lack of caching in suite builds using existing update centers.


[2] http://updates.netbeans.org/netbeans/updates/7.0.1/uc/final/distribution/catalog.xml.gz
[3] http://dlc.sun.com.edgesuite.net/netbeans/updates/7.0.1/uc/final/distribution/catalog.xml.gz
[4] http://updates.netbeans.org/netbeans/updates/7.0.1/uc/final/distribution/catalog.xml
[5] http://dlc.sun.com.edgesuite.net/netbeans/updates/7.0.1/uc/final/distribution/catalog.xml
Comment 7 Jaroslav Tulach 2012-01-12 17:56:19 UTC
> You cannot mirror all UCs to the same cache location,

I don't want to mirror UCs into the same location! There is an autoupdate.cache in your patch - so just create the local UC there using Mirror.xml. Should its value be automatically predefined (it is not now, but I suggested it), then let's make sure it encodes something unique (cluster versions, release version, etc.).

> Anyway this is somewhat off topic for this issue, which is about
> how to address a lack of caching in suite builds using existing update centers.

If the fact that specification version does not exactly identify a module is not related to this issue (which I question), then you don't need arguments like:

> Unfortunately
> it is not guaranteed that a OpenIDE-Module-Specification-Version uniquely
> identifies an NBM build.

> lastStableBuild, and is not flexible enough to handle alternate 
> policies like (b1) or (b2).

Is there any example where B-policies are used? Neither Mandriva, Fedora, nor Debian uses them, right? Nobody doing packaging, where the UC is the primary source of bits, wants automatic propagation of changes. People need to publish changes to the system by increasing the version.

Re. complex build setup: It is not complex at all. It is one build script (the Mirror.xml)! When using an A-policy, you need access to the previous state of the NBM center. That is how it goes: if you need to track history, you need to know the history.

A Mirror.xml-like solution can unify the local and AU production "caching" as well as make our AU state (and local cache state) consistent. I still prefer such unification over ad hoc hacks on the local caching side.
Comment 8 Jan Lahoda 2012-01-12 19:28:06 UTC
(In reply to comment #7)
> > Unfortunately
> > it is not guaranteed that a OpenIDE-Module-Specification-Version uniquely
> > identifies an NBM build.
> 
> > lastStableBuild, and is not flexible enough to handle alternate 
> > policies like (b1) or (b2).
> 
> Is there any example where B-policies are used? Neither in Mandriva, Fedora or
> Debian uses them right? Nobody, doing packaging, where UC is the primary source
> of bits wants automatic propagation of changes. People need to publish the
> changes to the system by increasing the version. 

I think that for development AUCs (corresponding/analogous to development/daily builds) B-policies are fine (and even preferable for me). Currently, one needs to manually upgrade the spec. version, which is both error-prone and increases the number of changesets. In the worst case, this means a separate commit to update the spec. version after each commit, since I do not like mixing the spec. version upgrade "just to update the AUC" with other changes. Changes to versions, dependencies, etc. that are required for correctness are something different, IMO, and belong to the commit that requires them.

For "release" AUCs, "manual" management of versions may be preferable.
Comment 9 Jesse Glick 2012-01-18 17:57:05 UTC
(In reply to comment #7)
> Should its value be automatically predefined [...], then let's make sure
> it encodes something unique (cluster versions, release version, etc.).

The question is exactly what. Comment #6 discussed some shortcomings with hashing the UC URL. I am not sure how cluster versions would be used as a key, since you do not even know this information until the download begins, and not all clusters have versions at all; "release version" is hard to pin down concretely.

> complex build setup: It is not complex at all.
> It is one build script (the Mirror.xml)

I was referring to complexity on the side of a CI job publishing an update center under an A-policy. Something in the style of Mirror.xml does not suffice for this purpose because ${user.dir}/mirror/ is not managed by the CI server and will not work well when jobs migrate between nodes etc.; nor can the workspace be used since that gets cleaned on each build. For keeping build-to-build state you need server-specific support like Hudson's Copy Artifact Plugin.

There does not seem to be a clear consensus as to how to fix the lack of caching from a suite build, so closing this for now.
Comment 10 Jaroslav Tulach 2012-01-19 16:27:12 UTC
(In reply to comment #9)
> (In reply to comment #7)
> > complex build setup: It is not complex at all.
> > It is one build script (the Mirror.xml)
> 
> I was referring to complexity on the side of a CI job publishing an update
> center an A-policy. Something in the style of Mirror.xml does not suffice for
> this purpose because ${user.dir}/mirror/ is not managed by the CI server and
> will not work well when jobs migrate between nodes etc.; nor can the workspace
> be used since that gets cleaned on each build. For keeping build-to-build state
> you need server-specific support like Hudson's Copy Artifact Plugin.

I don't think it is that complex, and Mirror.xml can be modified to be ready for a Hudson environment. Just update from the lastStableBuild URL first:

<autoupdate catalog="$JOB_URL/lastStableBuild/artifact/updates.xml">
  <include modules=".*"/>
</autoupdate>
Comment 11 Jesse Glick 2012-01-19 21:10:42 UTC
(In reply to comment #10)
> <autoupdate catalog="$JOB_URL/lastStableBuild/actifacts/updates.xml">

This is similar to what we do today (though for a slightly different reason - validating diachronic consistency). It is not robust, and has in fact failed at the whims of network admins; it assumes that a slave node can make an HTTP connection to the public URL of the master, which does not always work since it bypasses the remoting channel which is the only guaranteed point of communication between them. That is why Copy Artifact Plugin is better.