This bug was part of #9163, which I've broken up into several separate bugs in order to track them better.
--------------------------------------------------------------
Must specifically *not* index certain things, like
- cvs mail archives (actually this seems to be already done ?);
- collabnet/2 test projects (I get to http://www.netbeans.org/project/collabnet/index.html via a search);
- nbmoduleadmin mail archives (?? are these public ? Should these be browsable?)
- www-request_cert and www-request_passwd - you can't actually see the msgs if you browse to them, which is good, but they shouldn't show up in the search results at all;
- changelogs ??
This is an enhancement request to our search functionality (which we are reifying right now). Assigning to support so that they can put this in PCN for tracking for search enhancements for SC.
Requested status from stack@collab.net on pcn #3291.
Heard from ms:
<keric> ok. so www and mail are going to be the two options available to nb users when we move.
<ms> yes, but separately
<keric> ok. Thanks.
So there are going to be two independent search facilities that don't overlap -- one for mail archives (eyebrowse) and one for HTML.
Keri
** Note these last 2 entries are pertinent to 9721 rather than this issue ** Sounds good - will "eyebrowse" search allow you to select *which* mailing list archives you search, as described in 9721 ? Eg I want to search nbdev and dev@openide, but not nbusers ?
*** This issue has been marked as a duplicate of 9721 ***
Trying to clean up and clarify :
As noted in #9721, this bug is *not* a duplicate of 9721. Even if this issue is resolved, I would like some info as to exactly *what* the various searches index. For example, in 9721 Keri notes that there will be a separate HTML and mail-archive search. What exactly is indexed for the HTML search ? Can we exclude certain directories like
- /source/browse/
- testwww/
- there were a couple of Collab test projects at one stage
- ... ?
Is this configurable by us ? If I wanted to stop a certain directory/project/xxx from being indexed, how would I do it ?
Collabnet internal issue SC52.
Setting internal Target Milestone of 1.1.
This has been moved back to 2.0 upgrade. (May change back to 1.1- if so, we will reopen.)
Reopening and marking P5.
Accepting issue.
The internal issue (PCN3291) has been targeted for SC1.3.
Also wondering why P5 here ...
This enhancement is targeted for the Danube release of SourceCast. During the upgrade to that release we can confirm this issue or reopen if necessary.
Verified in 2.6.
Sounds like this is in place, but I don't know how to use it. Quoting my comment of May 29 11:15:40 +0000 2001 :
> Is this configurable by us ? If I wanted to stop a certain
> directory/project/xxx from being indexed, how would I do it ?
started...
- cvs mail archives (actually this seems to be already done ?): Yes, the corresponding host admin option has already been disabled.
- collabnet/2 test projects: Note: only one test project exists now. To keep a project from being indexed, adding "/servlets/ProjectHome" as a Disallow entry should fix it.
    User-agent: CEE
    Disallow: /servlets/ProjectHome
  To keep the whole project from being indexed:
    User-agent: CEE
    Disallow: /
- nbmoduleadmin mail archives:
    User-agent: CEE
    Disallow: /servlets/ReadMsg?list=nbmoduleadmin
    Disallow: /servlets/SummarizeList?listName=nbmoduleadmin
- www-request_cert and www-request_passwd: Again, the same kind of setting in robots.txt should help here.
- To keep directories like /source etc. from being indexed:
    User-agent: CEE
    Disallow: /source/
Note: Changes in robots.txt apply only to new data; for already indexed data, a full index rebuild is required.
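One way to sanity-check rules like these before asking for a full index rebuild is to run them through a standard robots.txt parser. The sketch below uses Python's urllib.robotparser and assumes the CEE indexer does ordinary prefix matching the way that parser does; that is an assumption, not something confirmed in this issue. Two of the rule sets above are merged under a single "User-agent: CEE" group, since this parser applies only the first group that matches an agent. The agent name "OtherBot" is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Rules taken from the comment above, merged into one group for the CEE agent.
RULES = """\
User-agent: CEE
Disallow: /servlets/ReadMsg?list=nbmoduleadmin
Disallow: /servlets/SummarizeList?listName=nbmoduleadmin
Disallow: /source/
"""


def build_parser(text):
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(RULES)

# (agent, path) pairs to check; "OtherBot" is a hypothetical second crawler.
checks = [
    ("CEE", "/source/browse/"),
    ("CEE", "/servlets/ReadMsg?list=nbmoduleadmin"),
    ("CEE", "/servlets/ReadMsg?list=nbusers"),
    ("OtherBot", "/source/browse/"),
]

for agent, path in checks:
    verdict = "allowed" if parser.can_fetch(agent, path) else "blocked"
    print(f"{agent:8} {path:45} {verdict}")
```

Under these assumptions only the CEE agent is affected, and only for paths that start with one of the Disallow prefixes.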
Jack, this was waiting for your review; however, as per your email, I am closing this.
***************
2) SC should *not* index some lists :
http://www.netbeans.org/issues/show_bug.cgi?id=9724
This is still open though I think should be closed/fixed - it looks like we can now do this with edits to robots.txt, cool.
Reopening. The fix isn't in place yet, until it is the issue should stay open so it doesn't get lost.
Added the following to www/www/robots.txt
    # Don't want SC indexer to return results from rollup cvs and issues lists
    # See http://www.netbeans.org/issues/show_bug.cgi?id=9724
    User-agent: CEE
    Disallow: /servlets/ReadMsg?list=nbcvs
    Disallow: /servlets/SummarizeList?listName=nbcvs
    Disallow: /servlets/ReadMsg?list=nbbugs
    Disallow: /servlets/SummarizeList?listName=nbbugs
In fact I don't find any hits from those lists anyway so not sure this is really needed.
In fact if the archives are deleted (issue 36647) they won't show up in search results anyway. But using robots.txt this way might be useful for other lists we don't want to index.
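If more lists need to be excluded later, the two Disallow lines per list follow a fixed pattern, so they could be generated rather than typed by hand. The following is a minimal sketch, assuming the ReadMsg and SummarizeList URL patterns quoted in this issue are the only ones that expose a list archive; the helper name disallow_rules is hypothetical.

```python
def disallow_rules(list_names, agent="CEE"):
    """Build one robots.txt group that hides the given mailing lists.

    Emits the two URL patterns used in this issue for each list:
    /servlets/ReadMsg?list=NAME and /servlets/SummarizeList?listName=NAME.
    """
    lines = [f"User-agent: {agent}"]
    for name in list_names:
        lines.append(f"Disallow: /servlets/ReadMsg?list={name}")
        lines.append(f"Disallow: /servlets/SummarizeList?listName={name}")
    return "\n".join(lines) + "\n"


# Reproduce the entries for the rollup cvs and issues lists.
print(disallow_rules(["nbcvs", "nbbugs"]), end="")
```

Keeping all lists in one group under a single User-agent line avoids relying on how a given crawler treats repeated groups for the same agent.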
We recently moved out from Collabnet's infrastructure