Re: Linking to mass digitized books from library catalogs: one month later

From: Jan Szczepanski <jan.szczepanski_at_nyob>
Date: Mon, 22 Oct 2007 15:14:54 +0200
To: NGC4LIB_at_listserv.nd.edu

Hi Maurice,

Libraries stand for quality, reliability, neutrality, objectivity and
much more. Google stands for other things, but not these.

Google doesn't care about quality because it can't. It just gives
everything to everybody, but we serve a special group of people:
scientists, humanists, academics and students, not people in real life
outside the university.

I don't feel overwhelmed. I can see enormous possibilities; neither
money nor space will stop me. Now I have the possibility to collect
everything I couldn't collect earlier. That will take many years, and I
would need the help of other libraries with the cataloguing (as usual,
the bottleneck).

Compare with the best American libraries, and now also with the
Max-Planck Institute in Europe, that has said no to "Big Deals",
commercial packages incorporating hundreds or thousands of scientific
journals, when they found out that 80% was not used. Other libraries
think having everything is better, even if it is never used.

We are not in the "everything goes" business, we are not commercial, we
are just one of the most valuable pillars in human culture.

Jan

Maurice York wrote:
> Hi Jan,
> I'm curious about this trash-or-treasure line of thinking as a
> reasoned basis for the manual effort of selection of digitized texts.
> You are quite right that libraries specialize in selection and have
> been doing it for thousands of years (more in generalities than
> realities, since I don't believe any library with a currently
> functioning collection has been around for more than a few hundred).
> But it seems to me that this is the very reason Google saw libraries
> as such an attractive proposition for digitization--they have been
> building high-quality collections of print materials and (presumably)
> sorting much of the dross according to sustained plans over long
> periods of time. When you say that the vast majority of texts in
> Google are "bad quality, bad relevance", that seems more a dig at
> American libraries and how we collect than at Google, since Google's
> collection is no more and no less than what librarians have created.
> Let me expand that a bit....it's something of a criticism of the
> libraries of Spain, Germany, the Netherlands, Japan, England, and
> France as well, all of whom are digitizing books with Google.
>
> I do respect the amount of effort you are putting into selecting ebook
> titles for your catalog--your faculty and students are lucky to have
> such mindfulness and dedication. I think very few people would argue
> for dumping every book in GBS into their own catalog--that's what
> Google and WorldCat are for. But if we are harvesting links to
> digitized content of items that we already own (which unless I
> misunderstand is the approach Tim is putting forward), then we are
> simply extending the utility of the collections we have already
> built--not throwing white noise into them.
>
> One last point I will comment on, which I think gets to the heart of
> one of the issues we need to grapple with in looking at how our
> catalogs behave in the broader context of the digitized environment.
> That is the working theorem that academics only want to see what is
> "relevant to them" rather than "everything that's available". If that
> were true, libraries would be the most beloved place on the planet to
> start looking for information, since we have traditionally tried to
> make a friendly, peaceful environment stocked with just "what's
> relevant to you". But we're not--only 1% of people start their
> searches at a library catalog, and among both faculty and students
> Google blows libraries, PubMed, ScienceDirect, you name it, out of
> the water as the first place to go for information. By and large, they
> come to the library catalog after they've found what they want
> somewhere else.
>
> My point is that we should use and promote all the tools at our
> disposal for what they are good for. One of the great utilities of GBS
> and similar tools in my own research is that I can discover relevant
> content and leads in places I never would have imagined looking, and
> which are not reflected in the cataloging and organization of the
> library collections I use. Conversely, my favorite library collections
> give me structured entry points for discovery that GBS currently can't
> deliver. My ideal research environment would be a happy marriage of
> the two, and I believe in an increasingly multi-disciplinary academic
> environment where the best research crosses unanticipated boundaries
> and pulls together unexpected avenues of thought, that is what our
> libraries should strive for.
>
> -Maurice
>
> --
> ************************************
> Maurice York
> Associate Head, Information Technology
> NCSU Libraries
> North Carolina State University
> Raleigh, NC 27695
>
> maurice_york_at_ncsu.edu
> Phone: 919-515-3518
>
> On 10/18/07, Jan Szczepanski <jan.szczepanski_at_ub.gu.se> wrote:
>
>> Thanks Steve for showing interest
>>
>> Steve Toub wrote:
>>
>>> Very interesting, Jan. Thanks for sharing.
>>>
>>>
>>>> It takes less than five minutes to create an e-record by reusing a
>>>> p-record and adding the fields necessary to transform the record
>>>> into an e-record.
>>>>
>>> Are you making the edits manually or have you automated this process in
>>> some way?
>>>
>> I do it manually. I would love automation, but that seems to be a
>> dream that will take 5-10 years to become reality.
>>
>>>
>>>> To date I have collected more than 17,000 e-books by myself.
>>>> I can do about 10,000 per year.
>>>>
>>> Wow! Is your employer supportive of this or are you doing this on your
>>> own time?
>>>
>> This is part of a project. My hopeless dream is that by showing the
>> way, others would follow. Only a couple of small special libraries have
>> been inspired and have started cataloguing OA working papers in the
>> political field.
>>
>> Libraries all over the world pay for ebrary and/or NetLibrary books, in
>> spite of the fact that most of the titles are uninteresting; 95% of the
>> selection is below all decent quality criteria. They can fool a student,
>> but how can they fool academic libraries? That's strange.
>>
>> In theory any library could import my 17,000 titles for free, but why
>> don't they do that? I can understand that nobody outside Sweden is
>> interested in the Swedish e-books, but why not the rest?
>>
>> We are still too much in the pulp business, and we have handed over too
>> much power to commercial companies selling "Big Deals".
>>
>>
>>>> So what is the point of mechanically harvesting GBS URLs if most of
>>>> it is not of any value?
>>>>
>>> Hmmm. One man's trash is another man's treasure. I think I'd have a hard
>>> time convincing a faculty member at my institution that a volume we had
>>> in print wasn't worth being digitized.
>>>
>> That may be right, but we serve a specific group of men and women,
>> academics, and they are not interested in having "everything"
>> digitized. You can use Bradford's law 20/80. Only twenty percent of the
>> Google books are of interest, and because I'm working in a Swedish
>> context, maybe just 5-10% will be of interest in Sweden.
>>
>>> I've heard that the selection process takes more effort/time than the
>>> technical processing--folks like Google may be scanning everything on
>>> the shelf since it's too much effort to do the selection. How much time
>>> do you spend on "selection" to separate the trash from the treasure?
>>>
>> Compared with a librarian, what is Google? We have been around now
>> for thousands of years, long before Google and even universities, and
>> selection is our speciality. We have never acquired everything.
>>
>> Google is just a clever machine making a lot of money on a commercial
>> market.
>>
>> How much time do I spend "selecting"? Less than 5%; the rest is boring
>> and mechanical cataloguing.
>>
>> Yesterday afternoon I catalogued these twenty-four books:
>>
>> Freely available from Center for Contemporary Arab Studies
>> http://ccas.georgetown.edu/research-papers.cfm
>> Total: 12 free e-books, 17.10.07
>>
>> Freely available via Swisspeace
>> http://www.swisspeace.ch/typo3/en/publications/working-papers/index.html
>> Total: 12 free e-books, 17.10.07
>>
>> and selected and catalogued about 25 from:
>>
>> Freely available via Religion Online
>> http://www.religion-online.org/listbooks.asp
>>
>> The titles on this list really range from trash to treasure. I will
>> select less than 50% of these roughly two hundred titles when I
>> continue with the project later today.
>>
>>
>> Jan
>>
>>
>>
>>>        --SET
>>>
>> --
>>
>> Jan Szczepanski
>> Förste bibliotekarie
>> Goteborgs universitetsbibliotek
>> Box 222
>> SE 405 30 Goteborg, SWEDEN
>> Tel: +46 31 773 1164 Fax: +46 31 163797
>> E-mail: Jan.Szczepanski_at_ub.gu.se
>>
>>

--

Jan Szczepanski
Förste bibliotekarie
Goteborgs universitetsbibliotek
Box 222
SE 405 30 Goteborg, SWEDEN
Tel: +46 31 773 1164 Fax: +46 31 163797
E-mail: Jan.Szczepanski_at_ub.gu.se