Re: Google can't be trusted with our books

From: Jonathan Rochkind <rochkind_at_nyob>
Date: Wed, 27 Apr 2011 11:59:34 -0400
To: NGC4LIB_at_LISTSERV.ND.EDU

If you authenticate as a HathiTrust partner, you can download PDFs of 
all public domain books from HathiTrust.

But, yes, this requires you to be an affiliate of a HathiTrust partner 
institution.

My understanding is that this restriction is required by 
Umich/HathiTrust's contract with Google. It's not something HathiTrust 
wants to do, but Google requires it -- I guess Google wants to insist 
that the public get these PDFs, even for public domain books, only from 
Google.

In fact, this restriction only applies to Google scans. HathiTrust went 
the extra mile to make it apply only to those Google scans which their 
contract says it has to apply to. If you can find a public domain book 
on HathiTrust that did NOT come from a Google scan, you can in fact 
download it as a PDF without authenticating as a HathiTrust member. There 
are a few such items in HathiTrust, although not many (yet?); most of 
their content comes from Google scans. I don't have an example on hand 
to verify, but I have verified this in the past.

And yes, this story should not increase one's happiness with Google, heh.

Jonathan

On 4/27/2011 11:47 AM, James Weinheimer wrote:
> Going back to the Guardian article, the author discusses something 
> different. He talks about how Google was intending to simply *delete* 
> all of Google Video, a huge resource that people have relied on for a 
> few years now. Google decided not to delete it only when people 
> started complaining. Their reasoning was that they wanted to 
> concentrate on "search" instead of hosting content--a rather strange 
> reason that I suspect may betray more intentions, since Google has 
> been buying up all kinds of hosting resources for several years.
>
> The author compared this with Google Books, saying that based on such 
> reasoning, a private profit-making corporation could not be trusted 
> with such an incredible resource. As he says,
> "As a private sector company, the core aim of Google is to make money. 
> The Google Videos situation shows that in order to lower expenditure 
> and adjust its priorities, Google was willing to delete content 
> entrusted to it by users. Libraries have trusted Google with millions 
> of documents: many of the books scanned by Google are not digitised or 
> OCR-processed anywhere else and, with budgets for university libraries 
> shrinking year after year, may not be digitised again any time in the 
> near future. Google acted admirably by listening to users and working 
> to save the videos but entrusting such vast cultural archives to a 
> body that has no explicit responsibilities to protection, archiving 
> and public cultural welfare is inherently dangerous: as the situation 
> made clear, private sector bodies have the ability to destroy archives 
> at a whim."
>
> Naturally, the long-term purpose of the Google Books project was for 
> Google to make money, not from any altruistic motives. They do *not* 
> do it all for us. ;-) If it looks as if they will not make money, it 
> will wind up being a drain and what will happen to the project? That 
> is why I mentioned the absolute need from Google's viewpoint (and 
> unfortunately, our viewpoint necessarily) to monetize the book project 
> somehow. While I haven't read anything, it wouldn't surprise me if 
> Google is pinching pennies now along with everybody else in this down 
> economy. If it were bad enough, Google would probably be willing to 
> jettison some of their holdings (is that the idea of closing down 
> Google Video?), including selling the book scans, but if those scans 
> are illegal, they cannot be sold. As my father would have said, it's 
> like spending money on a dead horse. It's a real dilemma for Google 
> but the main losers would be us, the public.
>
> Sooner or later, the books in our libraries will become available 
> electronically because as people become more and more used to 
> accessing materials electronically, the more distasteful they will 
> find the labor, the wait, and the general hassle of getting a physical 
> book to be able to hold in their hands for only a couple of weeks. The 
> non-electronic materials will slowly begin to go ignored, just as 
> happened before, when the texts in manuscripts that were not printed 
> were ignored and forgotten. It would be a genuine tragedy for our 
> entire civilization if the scans are not made available.
>
> By the way, while I very much appreciate HathiTrust, I cannot download 
> the public domain books and place them on my ebook reader. I must read 
> them one page at a time on my computer, which I will not do. I 
> discussed this in a blog entry earlier. 
> http://catalogingmatters.blogspot.com/2010/03/observations-of-bookman-on-his-initial.html 
> Therefore, I look for a downloadable version on either Google Books, 
> or I prefer the scans at the Internet Archive. There are lots of other 
> sites out there with some excellent book scans, though.
>