Re: screen scraping

From: Genny Engel <gengel_at_nyob>
Date: Mon, 3 Oct 2011 19:21:09 +0000
To: CODE4LIB_at_LISTSERV.ND.EDU

Another reason to check with the webmaster, all legalities aside, is that their top ten list might actually be built from an RSS feed that, for whatever reason, they don't offer directly (or they do offer it, but it wasn't obvious to you where to find it).  They might prefer that you grab the feed rather than scrape the screen.  I don't actually have any feed-based pages on our site that aren't also available as feeds -- but some people might.  Also, for the sake of usage statistics, I'd rather have bots hitting the feeds instead of the pages.
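
One way to look for a feed that isn't obviously linked is to check the page's <head> for feed autodiscovery <link> tags before writing a scraper. The following is only a rough sketch of that check using Python's standard library; the function name, the sample markup, and the example.org URL are made up for illustration, and a site can easily have a feed without advertising it this way, so asking the webmaster is still the better route.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types conventionally used to advertise RSS and Atom feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collects hrefs of <link rel="alternate"> tags that point at feeds."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_feed = attr.get("type", "").lower() in FEED_TYPES
        if "alternate" in rels and is_feed and attr.get("href"):
            # Resolve relative hrefs against the page's own URL.
            self.feeds.append(urljoin(self.base_url, attr["href"]))


def find_feed_links(html, base_url):
    """Return the feed URLs advertised in the given HTML (hypothetical helper)."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds


# Made-up page markup standing in for the site you were about to scrape.
SAMPLE = """<html><head>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/rss+xml" href="/top-ten.xml">
</head><body></body></html>"""

print(find_feed_links(SAMPLE, "http://www.example.org/lists/"))
```

If this turns up a feed URL, pulling that is likely to be kinder to the site (and to its statistics) than scraping the rendered page.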

Genny Engel
Sonoma County Library
gengel_at_sonoma.lib.ca.us
707 545-0831 x581
www.sonomalibrary.org


-----Original Message-----
From: Code for Libraries [mailto:CODE4LIB_at_LISTSERV.ND.EDU] On Behalf Of Nate Hill
Sent: Sunday, October 02, 2011 7:23 PM
To: CODE4LIB_at_LISTSERV.ND.EDU
Subject: [CODE4LIB] screen scraping

A question: what are the 'rules' around screen scraping?
If one site doesn't offer an RSS feed and you want to grab (for example)
their weekly top ten list with a script and then redisplay it on another
site, is that bad form?  Or even illegal?
Thanks-
Nate


-- 
Nate Hill
nathanielhill_at_gmail.com
http://www.natehill.net
Received on Mon Oct 03 2011 - 15:22:40 EDT