- From: Jeff Taylor <shdwdrgn AT sourpuss.net>
- To: discuss AT lists.opennicproject.org
- Subject: [opennic-discuss] Search engine notes
- Date: Fri, 01 Apr 2011 16:47:04 -0600
- List-archive: <http://lists.darkdna.net/pipermail/discuss>
- List-id: <discuss.lists.opennicproject.org>
I built a new script on grep.geek today for spidering the domains. It no longer relies on searching specific TLDs each day; instead it runs from a randomized list of domains under all the TLDs. This means that all pages currently get spidered at least twice a week, and there are no longer any restrictions on how much the search engine can grow. It also provides an easy method for adding specific web pages from the ICANN realm, so I have added a few basics like the Wikipedia page on OpenNIC.
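For illustration only, the randomized approach could be sketched roughly as follows. This is not the actual script running on grep.geek; the function names, the sample domains, and the idea of dealing the list into a fixed number of runs are all assumptions made for the example.

```python
import random


def build_crawl_queue(domains_by_tld, seed=None):
    """Flatten the domains of every TLD into one list and shuffle it.

    `domains_by_tld` maps a TLD (without the dot) to a list of domain labels.
    The mapping and the optional `seed` are hypothetical, for illustration.
    """
    rng = random.Random(seed)
    queue = [
        f"{name}.{tld}"
        for tld, names in domains_by_tld.items()
        for name in names
    ]
    rng.shuffle(queue)  # no TLD gets its own fixed day any more
    return queue


def split_into_runs(queue, runs):
    """Deal the shuffled queue into `runs` roughly equal batches."""
    return [queue[i::runs] for i in range(runs)]


# Hypothetical sample data, not the real domain list.
domains_by_tld = {
    "geek": ["grep", "example"],
    "free": ["demo"],
    "null": ["test"],
}

queue = build_crawl_queue(domains_by_tld, seed=1)
batches = split_into_runs(queue, runs=3)
```

Because the queue is built from whatever domains exist at the time, new domains and hand-picked pages can be appended to the input without changing the schedule logic.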
Eventually I plan to create a web page where anyone can specifically add or block a page from the search engine. In the meantime, if you know of any websites that should be added, let me know the URL and whether I should grab JUST the one page or whether the entire website is relevant. I'm not putting any restrictions on what would be considered relevant, but I would suggest sites that are pertinent to OpenNIC - such as DNS info or networking - or any sites that are related to a website you host on OpenNIC. Please format lists with one URL per line.
Also note that robots.txt is supposed to be honored; if you know of a case where it is not, please let me know.
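If you want to check what a spider that honors robots.txt should be allowed to fetch from your site, a minimal sketch using Python's standard urllib.robotparser is shown below. The user-agent string, the sample rules, and the example.geek URLs are placeholders, not details of the actual spider.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if `robots_txt` permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    # Parse rules from a string so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content for a hypothetical site.
SAMPLE_ROBOTS = """User-agent: *
Disallow: /private/
"""

# "example-spider" is a placeholder name, not the real crawler's user agent.
blocked = is_allowed(SAMPLE_ROBOTS, "example-spider",
                     "http://example.geek/private/page.html")
permitted = is_allowed(SAMPLE_ROBOTS, "example-spider",
                       "http://example.geek/index.html")
```

A page that shows up in the search results even though a check like this reports it as blocked would be a case worth reporting.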