discuss AT lists.opennicproject.org
Subject: Discuss mailing list
List archive
- From: mike <mike AT pikeaero.com>
- To: discuss AT lists.opennicproject.org
- Subject: Re: [opennic-discuss] Grep.geek offline for maintenance
- Date: Thu, 04 Oct 2012 03:48:43 -0500
- Envelope-to: discuss AT lists.opennicproject.org
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Since grep.geek is such an important part of OpenNIC and is subject to
scalability issues, would it be appropriate to have a discussion
around some sort of distributed architecture, such that others
could pool resources toward grep.geek and have some redundancy and
more storage capacity at the same time?
In other words, would a discussion around laying the foundation for a
scalable grep.geek make any sense at this time?
- --Mike
On 10/04/2012 04:10 AM, Jeff Taylor wrote:
> The database is back online again, sorry for the delay in notifying
> the list. Part of the delay was that copying the old data took far
> longer than I expected, and this is in fact due to the database for
> grep.geek spiraling out of control. The database is currently
> sitting at 118 GB and contains information for over 24 million
> pages.
[clip]
>
> Regardless of the intentions, my poor little servers are not up to
> the task of indexing Google. Therefore I am implementing some new
> code to reject the indexing of any pages that appear to be
> redirects to another site. I will index the home page of your
> domain, but that is all. This should be enough to get your website
> listed in grep.geek and have it appear in general searches, but
> will not chew up large portions of my storage drives to retain
> multiple copies of the same websites, not to mention the bandwidth
> required to actually crawl these sites multiple times. Once the
> redundant data has been removed from the database, grep.geek should
> also respond much faster to queries.
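
The rule Jeff describes above (index the home page of a domain, but skip any other page that redirects to another site) could be sketched roughly as follows. This is only an illustration, not the actual grep.geek code: the function names, the idea of comparing the requested host against the host reached after redirects, and the example.geek domain are all assumptions.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the lowercased hostname of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def is_home_page(url: str) -> bool:
    """Treat a URL as the home page if it has no path beyond '/' and no query."""
    parts = urlsplit(url)
    return parts.path in ("", "/") and not parts.query


def should_index(requested_url: str, final_url: str) -> bool:
    """Decide whether a crawled page should be stored in the index.

    requested_url: the URL the crawler asked for (hypothetical input).
    final_url: the URL reached after following redirects (hypothetical input).
    """
    redirected_offsite = host_of(requested_url) != host_of(final_url)
    if not redirected_offsite:
        # Same site: index normally.
        return True
    # Redirect to another site: keep only the domain's home page.
    return is_home_page(requested_url)
```

Under these assumptions, a domain that simply forwards to a large external site would still get one entry (its home page), while its deeper pages would be rejected instead of being stored as duplicate copies.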
- --
Regards,
Mike Sharkey, CEO
Pike Aerospace Research Corp. (Pike Aero)
420 Cross Street
Sudbury, Ontario
Canada P3E-3W1
P:1+(705)586-2255
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://www.enigmail.net/
iQEcBAEBAgAGBQJQbU1mAAoJEA7EcEr0emgfhkwIAJqJono5HINVLv0KjIGrgzZ1
14mFHLTAg0uECAHECtU3iM3sKz3+qC7yCUbyYpoCGWeMvoJMOIHT/M/h5oLiGe2y
DRdC7/tZJd8Ap/nTEQW/qXME+Rdc/ueJCGw4MUro24rxLZqzhWeV4Rer/7MUKDEh
+f9XEwMkyd6PwfsCMON+VIX5rFYfh6YLZt4VKXVhM4Hm2WX/NFSlTR+3Te6zYdoy
tXgwtuXGS8opA79JtDoIxWVZJwWTumFwp/kBuzz5d4WfRY/pLy24c2hqPvRw6eU8
x/C7EP2a4ODHEn46D1/lrtEp1cGcdHM23ipkc8rKcHsqSI2x9Chdc20Q0cgWbS4=
=y5cC
-----END PGP SIGNATURE-----