Re: SV: Integrating Google Book Search content into OPACs

From: Jonathan Rochkind <rochkind_at_nyob>
Date: Fri, 16 May 2008 11:07:04 -0400
To: NGC4LIB_at_LISTSERV.ND.EDU

Some have reported success sending all Google Books API queries through
an Apache mod_proxy reverse proxy.  This was to deal with other
problems, not this one.

But the reason I bring it up is that it suggests it _may_ be possible to
make server-side requests without running into Google's security, since
the mod_proxy reverse proxy approach DOES result in requests being sent
from the server, and some have reported doing that without being
blocked. If you wanted to try this (and I plan to eventually, when I get
the chance), I'd try aping a mod_proxy reverse proxy request as closely
as possible. That will definitely include setting an X-Forwarded-For
header, and probably also a Referer: header.  It may also take a bit of
examination of a real mod_proxy reverse proxy request to see what else
it sends (User-Agent? etc.).
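
As a rough illustration of the idea above, here is a minimal Python sketch
that builds (but does not send) a server-side request carrying the headers
a mod_proxy reverse proxy is documented to add (X-Forwarded-For,
X-Forwarded-Host, X-Forwarded-Server), plus the browser's Referer and
User-Agent passed through. The function name, the example URL, and all the
example values are hypothetical, and nothing here is confirmed to get past
Google's checks; it only shows the shape of the request one might try.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_proxy_style_request(base_url, params, client_ip, referer,
                              user_agent, proxy_host, proxy_server):
    """Build a request that imitates what a reverse proxy would forward.

    Hypothetical helper: every argument is supplied by the caller, and the
    request is only constructed here, never sent.
    """
    url = base_url + "?" + urlencode(params)
    headers = {
        # Headers mod_proxy adds when acting as a reverse proxy.
        "X-Forwarded-For": client_ip,        # the end user's IP address
        "X-Forwarded-Host": proxy_host,      # Host the user originally requested
        "X-Forwarded-Server": proxy_server,  # hostname of the proxy itself
        # Headers a proxy would pass through from the user's browser.
        "Referer": referer,
        "User-Agent": user_agent,
    }
    return Request(url, headers=headers)


if __name__ == "__main__":
    # Placeholder endpoint and values, for illustration only.
    req = build_proxy_style_request(
        "http://books.example.org/books",
        {"bibkeys": "ISBN:0451526538"},
        client_ip="192.0.2.10",
        referer="http://catalog.example.edu/record/1",
        user_agent="Mozilla/5.0",
        proxy_host="catalog.example.edu",
        proxy_server="catalog.example.edu",
    )
    print(req.full_url)
    for name, value in req.header_items():
        print(name + ": " + value)
```

To actually send it one would pass the returned object to
urllib.request.urlopen(). Note that forwarding the user's real IP in
X-Forwarded-For gives up the privacy benefit discussed in the quoted
message below, so that value is a trade-off to decide on, not a given.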

Jonathan

Ashley Sanders wrote:
> Henrik Lindström wrote:
>
> > We also discovered the problem, described by Janes, that the script
> > stops the loading of the page. We tried an AJAX workaround, but that
> > got stuck in Google's security precautions, resulting in a
> > 401-response. The solution we chose was to make sure that the
> > Google-script is the last thing to be added to the page when loading.
>
> We've had pretty much the same experiences -- eventually settling on
> an AJAX approach. The way I worked it was that all requests to Google
> went from our machine rather than the user's machine. This worked
> for a while, but we have now been blocked by Google. All accesses now
> result in the following message from Google:
>
>      We're sorry...
>      ... but your query looks similar to automated requests from a
>      computer virus or spyware application. To protect our users, we
>      can't process your request right now.
>
> The reason I went for the above approach was that some of our
> users had let it be known that they didn't want information about
> which records they were viewing to be sent to Google.
>
> If you provide access to Google Book Search the way Google suggest then
> Google get to learn the ISBN (and other identifiers you've put in the
> URL) of all the books a user is viewing -- all linked by their IP
> address. That could be valuable marketing information for Google
> and potentially a breach of your users' privacy.
>
> So Google Book Search isn't currently working on Copac until we
> decide where to go from here...
>
> Ashley.
> --
> Ashley Sanders               a.sanders_at_manchester.ac.uk
> Copac http://copac.ac.uk A Mimas service funded by JISC
>

--
Jonathan Rochkind
Digital Services Software Engineer
The Sheridan Libraries
Johns Hopkins University
410.516.8886
rochkind (at) jhu.edu