I don’t know how well this applies to your specific use of screen-scraping, but for libraries’ broader use of crawlers to build archives, the Section 108 Study Group Recommendations are a good source of guidance (though not law). They propose specific copyright exceptions for libraries in regard to collecting and archiving “publicly accessible online content”. Their recommendations are clear and sensible; they run from pages 80 to 87 of the report.
http://www.section108.gov/docs/Sec108StudyGroupReport.pdf
Tracy Seneca
California Digital Library
________________________________________
From: Code for Libraries [CODE4LIB_at_LISTSERV.ND.EDU] on behalf of Nate Hill [nathanielhill_at_GMAIL.COM]
Sent: Sunday, October 02, 2011 7:23 PM
To: CODE4LIB_at_LISTSERV.ND.EDU
Subject: [CODE4LIB] screen scraping
A question: what are the 'rules' around screen scraping?
If one site doesn't offer an RSS feed and you want to grab (for example)
their weekly top ten list with a script and then redisplay it on another
site, is that bad form? Or even illegal?
Thanks-
Nate
--
Nate Hill
nathanielhill_at_gmail.com
http://www.natehill.net
Received on Sun Oct 02 2011 - 22:55:01 EDT