
Re: [opennic-discuss] search engine opennicproject?


  • From: Morten Oesterlund Joergensen <mortenoesterlundjoergensen AT mortenoesterlundjoergensen.dk>
  • To: discuss AT lists.opennicproject.org
  • Subject: Re: [opennic-discuss] search engine opennicproject?
  • Date: Sun, 08 May 2011 18:37:39 +0000
  • List-archive: <http://lists.darkdna.net/pipermail/discuss>
  • List-id: <discuss.lists.opennicproject.org>

On Sat, 2011-05-07 at 16:18 +0200, JP Blankert (thuis & PC based) wrote:
> What search engine do you all use to find each other's .glue, .geek, etc.
> domain names? 'Normal' search engines such as Google do not index outside
> the IANA root zone (that is my finding...).
>
> Best regards,
>
> Philippe Blankert
> _______________________________________________
> discuss mailing list
> discuss AT lists.opennicproject.org
> http://lists.darkdna.net/mailman/listinfo/discuss

Maybe someone should set up an instance of YaCy
(http://en.wikipedia.org/wiki/YaCy), starting the crawl at some of the
websites that use OpenNIC's top-level domains?
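If someone sets this up, the crawl could probably be started from a small
script against the peer's web interface instead of clicking through the
crawl-start form. A rough sketch in Python follows, assuming a local peer on
the default port; the endpoint and parameter names are my guesses based on
the crawl-start form and may differ between YaCy versions, and the seed hosts
are only placeholders (the machine also needs to resolve OpenNIC's TLDs, of
course):

# Rough sketch: start a crawl on a locally running YaCy peer over HTTP.
# The /Crawler_p.html endpoint and the parameter names are assumptions
# based on the crawl-start form and may differ between YaCy versions.
import requests

YACY = "http://localhost:8090"        # YaCy's default web interface port
ADMIN = ("admin", "yacy")             # placeholder credentials; YaCy may
                                      # require digest auth instead

# Placeholder OpenNIC seed sites; replace with real ones.
seeds = ["http://grep.geek/", "http://wiki.opennic.glue/"]

for url in seeds:
    resp = requests.post(
        f"{YACY}/Crawler_p.html",
        data={
            "crawlingMode": "url",    # start from an explicit URL
            "crawlingURL": url,       # the seed page
            "crawlingDepth": "3",     # follow links a few hops deep
        },
        auth=ADMIN,
        timeout=30,
    )
    resp.raise_for_status()
    print("started crawl at", url)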
I had YaCy running for years, up until about half a year ago. It
requires several GiB of RAM; otherwise the Java virtual machine runs out
of memory, and that often results in some kind of internal corruption
which requires reinstalling YaCy itself. An easier fix should of course
be possible, for instance clearing its internal database. That was the
reason I stopped running it. Maybe I should spend some time installing
it again.
One also needs to tweak the I/O usage, as it really slows down the
system if it is not configured correctly.
Even though it is not strictly necessary, I recommend doing a bit of
maintenance about every two weeks or so: restart the crawl from the
website it started at (otherwise it may never reach that site again) and
clear the old data from the index. I believe the search engine always
returns results from the most recent crawl, and old data is by default
overwritten when the crawler stumbles upon an already visited website,
but there is really no reason to store old and possibly outdated data.
Unfortunately there isn't an automated way to delete old data, so one
has to clear the entire database at once like that.
It seems that one can test the search engine here:
http://yacy.net/en/Searchportal.html
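And for querying such a peer from a script rather than through the web page,
something along these lines should work. Again, only a sketch: the
yacysearch.json endpoint and the response layout are assumptions on my part
and may vary between versions:

# Minimal sketch: query a YaCy peer's JSON search interface.
import requests

YACY = "http://localhost:8090"        # or the address of a public peer

resp = requests.get(
    f"{YACY}/yacysearch.json",
    params={"query": "opennic", "maximumRecords": 10},
    timeout=30,
)
resp.raise_for_status()

# Results are assumed to come back OpenSearch-style: a list of channels,
# each holding a list of result items.
for channel in resp.json().get("channels", []):
    for item in channel.get("items", []):
        print(item.get("title"), "-", item.get("link"))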




