It seems that searching the mailing list archives does not work at all. No results are returned for any keyword. Here is an example: http://translatedfiles.netbeans.org/servlets/SearchList?list=dev&searchText=katakai&defaultField=author&Search=Search
It is as if there are no mail archives to be searched. Here's another random example: http://www.netbeans.org/servlets/SearchList?list=broken_builds&searchText=build&defaultField=subject&Search=Search nothing found in the broken_builds with "build" in Subject -- ha! Here's one of the emails that has "build" in its Subject: http://www.netbeans.org/servlets/ReadMsg?list=broken_builds&msgNo=4451
P2 defect with no response from support in almost 1 week. Pls respond ASAP. Mailing list search is a critical piece of site functionality.
Hi, Let me look at this ASAP and I will update you with my findings. Thanks, Kavitha Support Operations
Hi, This seems to be a site-wide problem, so we are taking this issue as high priority. Will update you soon on any progress made on this. Thanks, Kavitha Support Operations
Changing the status.
Jack, This again has to do with the robots exclusion option. Will update you again with more info after I get the consolidated info. -Priya
Hi Jack, To fix a few of the earlier requests (see issue 22183), we enabled "Enable robots exclusion", which means that on netbeans the indexer will not index /servlets (because it is disallowed in their robots.txt). We enabled it so that the robots rules would be recognized. On netbeans, the host-level attribute "Enable robots exclusion" wasn't enabled, but they were trying to exclude some of the URLs from indexing. At some point we enabled it, which fixed the original issues, but other projects then ran into problems. Hence we need the list of projects that really need to be excluded from indexing. Thanks, Kavitha Support Operations
changing the status, waiting for the response.
As far as I know, the testwww project should be excluded for sure. There are probably others, but I am 100% sure about only this one... Jack, which others? jan
I guess the question is : "on which nb.org projects can we allow servlets/ to be indexed by robots". Is that correct ? ie we have mailing lists on www, so we need to update our robots.txt file to say "robots can index www.netbeans.org/servlets/". Yes ? Indexing of servlets/ was disallowed to stop robots bringing nb.org down. If we re-allow indexing on some project sites, aren't we asking for trouble ? Are there no other options ? Eg have the SourceCast mailing list indexer ignore the robots.txt file ? Pls advise.
Jack, I will discuss this again with the engineer and will update you ASAP. Thanks, Priya
I think the confusion here is between the internal CEE indexer (internal robot) and external spiders/crawlers (external robots). The internal robot is the CEE indexer, which lets us do the site-wide search. Prior to the Danube release our internal robot did not respect robots.txt; in Danube we introduced the option "Enable robots exclusion" to force the CEE indexer to respect robots.txt.

I guess the question is : "on which nb.org projects can we allow servlets/ to be indexed by robots". Is that correct ?
>>>> No. The question is: "For which nb.org projects can we disallow the **internal** robot/indexer from indexing?"

ie we have mailing lists on www, so we need to update our robots.txt file to say "robots can index www.netbeans.org/servlets/". Yes ?
>>>> No. Again, this is not about external robots. It's about the internal CEE indexer.

Indexing of servlets/ was disallowed to stop robots bringing nb.org down.
>>>> Yes. That is still true. We are not going to change anything here.

If we re-allow indexing on some project sites, aren't we asking for trouble ?
>>>> I think you are referring to the external robots here. For external robots, the functionality remains the same. The question is about allowing the internal robot, i.e. the CEE internal indexer, to index the project artifacts.

I would like to paste a snippet here that may help you:

<snip>
A robots.txt file is used to tell external spiders and indexers (such as Google's Googlebot, or anyone else's) what web pages to not add to their index. Since there are some portions of a typical CEE installation we won't want indexed by external indexers, there will be a default robots.txt for the domain and for each project (achieved through rewrite rules), which customers may override if they wish. Customers can also use such an override to tell CEE's own indexer to not index certain portions of a project or a domain.

The default robots.txt at the domain level currently excludes all the URL patterns matching '/source/', '/search/', '/issues/' and '/servlets/'. But for the CEE internal indexer, we won't want to exclude these by default. This can be achieved by having a separate record for the CEE internal indexer. The following robots.txt file should be used as the default.

    User-agent: CEE
    Disallow:

    User-agent: *
    Disallow: /source/
    Disallow: /search/
    Disallow: /issues/
    Disallow: /servlets/

The above robots.txt file can be read as "No robot should index any page in the site matching the URL patterns '/source/', '/search/', '/issues/' and '/servlets/' except the robot CEE for which there are no restrictions".
</snip>
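For readers who want to check how the two records in the quoted robots.txt interact, here is a minimal sketch using Python's standard `urllib.robotparser`. It is only an illustration: the assumptions are that the internal indexer identifies itself with the user-agent token `CEE` (as the snippet suggests) and that it matches records the way Python's parser does. `Googlebot` stands in for any external crawler, and `build_parser` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# The default robots.txt quoted in the snippet above.
ROBOTS_TXT = """\
User-agent: CEE
Disallow:

User-agent: *
Disallow: /source/
Disallow: /search/
Disallow: /issues/
Disallow: /servlets/
"""


def build_parser(text):
    """Hypothetical helper: parse robots.txt text without fetching anything."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A mailing list search URL of the kind reported in this issue.
mail_search = "http://www.netbeans.org/servlets/SearchList"

# "CEE" is assumed to be the internal indexer's user-agent token;
# "Googlebot" represents an external robot.
for agent in ("CEE", "Googlebot"):
    print(agent, "may index", mail_search, "->", parser.can_fetch(agent, mail_search))
```

An empty `Disallow:` line in the `CEE` record means "nothing is disallowed", so that record places no restrictions on the internal indexer, while every other robot falls through to the `*` record and is kept out of `/servlets/`.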
*** Issue 86075 has been marked as a duplicate of this issue. ***
>>>> No. The question is " On nb.org what are all the projects can we disallow **internal** robot/indexer from indexing?" For mailing lists (and probably other servlets/ content), the internal indexer should index every project. Pls enable ASAP, thanks. HTML content is different, eg we don't want testwww/ indexed by anything. But so far we have no issues with the way html content is indexed, so nothing should change here.
I have updated the engineers on this and will make sure all of this is set in the config.
*** Issue 86512 has been marked as a duplicate of this issue. ***
OK. At last the robots.txt has been edited as per the requirement, and we ran the full indexer for all the plugins. We verified all the above complaints about mailing list search and domain search, and it should now work as expected. Please verify and let us know your feedback. Thanks for your patience on this issue. -Priya
I have verified all the above concerns users had about 'search', and it now works as expected, so closing. Please feel free to reopen if you happen to find anything that does not work as expected. -Priya
Verified the search works now.
We recently moved out from Collabnet's infrastructure