Re: screen scraping

From: Simon Spero <ses_at_nyob>
Date: Mon, 3 Oct 2011 12:20:08 -0400
To: CODE4LIB_at_LISTSERV.ND.EDU
On Oct 3, 2011 9:19 AM, "Ed Summers" <ehs_at_pobox.com> wrote:

> On Sun, Oct 2, 2011 at 10:32 PM, Ken Irwin <kirwin_at_wittenberg.edu> wrote:
> > 1. respect robots.txt

Disclaimer: I am not a lawyer.

Remember that robots.txt applies only to recursive web crawlers, and not to
screen-scraping per se. In cases where it does apply, it has limited legal
effect, but ignoring it is not cricket.
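
For a crawler where robots.txt does apply, the check itself is cheap. Here is a minimal sketch using Python's standard urllib.robotparser module. The helper name `allowed`, the user-agent string, and the example.org URLs are made up for illustration. The rules are parsed from a string so that the example runs without touching the network.

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so split the text first.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Hypothetical robots.txt content, for illustration only.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(allowed(EXAMPLE_ROBOTS, "example-crawler", "http://example.org/private/page.html"))
print(allowed(EXAMPLE_ROBOTS, "example-crawler", "http://example.org/public/page.html"))
```

Against a live site one would presumably call set_url() and read() on the parser instead of parse(). Passing such a check says nothing about the license, copyright, or disruption questions.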

Important considerations are: is access to the site governed by a license
that prohibits the activity; is the content being scraped subject to
copyright, and if so, is the screen scraping covered by one of the
exceptions to exclusive rights of the copyright holder; is the
screen-scraping activity disruptive and damaging to the site being used
(trespass to chattels, etc.)?

> A bit of reflection on the Golden Rule is probably more important
> than pondering the legality of what you are doing.

Ed invoking philosophy? With citation? (Wikipedia still counts) :-p

The usual objection to the Golden Rule applies here: just because you have
no objection to having a screen scraper used on your own site doesn't
automatically imply that others are willing to have their sites scraped.

Simon
Received on Mon Oct 03 2011 - 12:20:59 EDT