Re: Anyone web scraping to benefit their library?

From: Brett <brett.l.williams_at_nyob>
Date: Tue, 28 Nov 2017 14:08:11 -0500
To: CODE4LIB_at_LISTS.CLIR.ORG
I leveraged the IMPORTXML() and XPath features in Google Sheets to pull
information from a large university website to help create a set of weeding
lists for a branch campus. They needed extra details about what was in
off-site storage and what was held at the central campus library.
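
In case it helps anyone, the core of it is a single formula per row. A
minimal sketch (the catalog URL and XPath here are made up; the real ones
depend on your catalog's markup):

    =IMPORTXML("https://catalog.example.edu/record/12345", "//div[@id='holdings']//td")

IMPORTXML fetches the page and returns whatever nodes the XPath matches,
spilling them across adjacent cells.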

This was very much like Jason's FIFO API: the central reporting group had
sent me a spreadsheet with horrible data that I would have had to sort out
almost completely by hand, but the call numbers were pristine. I used the
call numbers as keys to query the catalog, with limits for each campus I
needed to check, and the formulas dumped all of the necessary content
(holdings, dates, etc.) into the spreadsheet.
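
Roughly, each lookup concatenated the call number from the sheet into a
catalog search URL, something like this (hypothetical query parameters,
with the call number sitting in A2):

    =IMPORTXML("https://catalog.example.edu/search?cn=" & ENCODEURL(A2) & "&campus=branch", "//tr[@class='holding']/td")

ENCODEURL keeps the spaces and periods in call numbers from breaking the
query string. Fair warning: IMPORTXML can be slow to recalculate over a
long list, so expect some waiting.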

I've also used Feed43 as a way to reshape certain RSS feeds and scrape
websites so that they display only the content I want.
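
If you haven't tried Feed43, you define an extraction pattern against the
page's HTML, where {%} captures a field and {*} skips markup you don't
care about. A made-up example that would capture each item's link and
title as {%1} and {%2} for the output feed template:

    <h3><a href="{%}">{%}</a></h3>{*}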

Brett Williams


On Tue, Nov 28, 2017 at 1:24 PM, Brad Coffield <bcoffield.library_at_gmail.com>
wrote:

> I think there are likely a lot of possibilities out there, and I was
> hoping to hear examples of web scraping for libraries. Your example might
> just inspire me or another reader to do something similar. At the very
> least, the ideas will be interesting!
>
> Brad
>
>
> --
> Brad Coffield, MLIS
> Assistant Information and Web Services Librarian
> Saint Francis University
> 814-472-3315
> bcoffield_at_francis.edu
>
Received on Tue Nov 28 2017 - 14:09:41 EST