Re: Cheap DIY web crawling?

From: Kyle Banerjee <banerjek_at_nyob>
Date: Wed, 30 Sep 2009 09:53:09 -0700
To: NGC4LIB_at_LISTSERV.ND.EDU
If you are actually running the crawls, as opposed to consulting a
database of crawls that have already been done, this sounds like a
great way to aggravate the heck out of a lot of sysadmins.

Two-bit crawlers that don't play nice look a lot like DoS attacks.
Even when the sysadmin knows what's going on, a lockout is what will
result if you start hosing services.
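
For what it's worth, "playing nice" mostly comes down to two things:
obeying the site's robots.txt and spacing out your requests. Below is
a rough Python sketch of that bookkeeping for a single host, not a
tested crawler. The class name PoliteFetcher, the "examplebot" user
agent, and the one-second default delay are all made up for
illustration; the only library used is the standard urllib.robotparser
module. It does no network I/O itself: you hand it the lines of a
robots.txt you have already fetched.

```python
import time
from urllib import robotparser


class PoliteFetcher:
    """Tracks robots.txt rules and request spacing for ONE host.

    Hypothetical helper for illustration; it never touches the network.
    """

    def __init__(self, user_agent, robots_lines, default_delay=1.0):
        self.user_agent = user_agent
        self.rules = robotparser.RobotFileParser()
        self.rules.parse(robots_lines)  # lines of an already-fetched robots.txt
        # Use the site's Crawl-delay if it declares one, else our own default.
        declared = self.rules.crawl_delay(user_agent)
        self.delay = float(declared) if declared is not None else default_delay
        self.last_hit = None  # monotonic time of the previous request, if any

    def allowed(self, url):
        """True if robots.txt permits this user agent to fetch the URL."""
        return self.rules.can_fetch(self.user_agent, url)

    def wait_time(self, now=None):
        """Seconds to sleep before the next request to this host."""
        now = time.monotonic() if now is None else now
        if self.last_hit is None:
            return 0.0
        return max(0.0, self.delay - (now - self.last_hit))

    def record(self, now=None):
        """Call right after each request so the spacing is enforced."""
        self.last_hit = time.monotonic() if now is None else now
```

In a real crawl loop you would skip any URL where allowed() is false,
sleep for wait_time() before each request, and call record() right
after it. That alone goes a long way toward not looking like an attack.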

kyle

On Wed, Sep 30, 2009 at 6:22 AM, Cab Vinton <bibliwho_at_gmail.com> wrote:
> 1. Crawl up to 2 billion web pages/day just by filling out a web form.
>
> 2. Design and run your own crawls in minutes -- API for full customization.
>
> 3. $2 per million pages crawled and $0.03 per CPU-hr used.
>
> Wow, sounds like something libraries might be interested in, if the
> service works as advertised:
>
> http://80legs.com/
>
> [found via Mashable blog: http://mashable.com/2009/09/30/80legs/]
>
> Cheers,
>
> Cab Vinton, Director
> Sanbornton Public Library
> Sanbornton, NH
>



-- 
----------------------------------------------------------
Kyle Banerjee
Digital Services Program Manager
Orbis Cascade Alliance
banerjek_at_uoregon.edu / 503.999.9787
Received on Wed Sep 30 2009 - 12:58:58 EDT