Re: Spiderable OPACs

From: Jonathan Rochkind <rochkind_at_nyob>
Date: Mon, 23 Apr 2007 18:26:26 -0400
To: NGC4LIB_at_listserv.nd.edu

Well, if each bib were exposed as its own web page---then it _would_ be
linkable. That's kind of what Tim is suggesting, I think: if it were
linkable, then people would link to it, and then useful page ranks
could be calculated.

I'm not sure this works, for the reason Ross mentions: just the sheer
number of records we have. Not only does each of our libraries have
hundreds of thousands or millions of records, but many of our records
really represent the _same_ work. So if someone means to link to the
work, they have a lot of records to choose from, and the "google juice"
gets distributed so thinly it's useless (or concentrated in the big
ones, leaving everyone else out to dry).

But okay, WorldCat wants to be 'the big one'--they want everyone to link
to them as the authoritative source, so the google juice points everyone
to them, and then they will happily redirect to your local library---IF
your local library pays them.

Except, aside from the monopolistic concerns there, currently Google
refuses to index every bib on worldcat.org because there are just too
many! "Open WorldCat" is a subset of bibs that Google will index.

I think that we have some unique problems here--the volume of our
records being in fact just ONE of them, but a big one (searching for
Pizza in Portland is like searching for a Library in Portland, not like
searching for a copy of Dhalgren in Portland, unless like Ross suggests
a pizza place has a half million kinds of pizza)--such that just "throw
it at Google" is certainly not going to be a good solution. Now,
certainly, for a variety of reasons, all of our _local_ bibs should
ideally have persistent URLs suitable for 'spidering'. They don't now,
and that needs to be fixed. But I still remain skeptical that just putting
all of our local bibs on Google can lead to much useful discovery. It
might lead to something else interesting, it would be worth an
experiment, but an experiment is what it would be.

Jonathan

Karen Coyle wrote:
> The difficulty that I see with adding the contents of the library
> catalogs is the page rank. It's kind of the same problem Google is
> having with coming up with a ranking for its books database. Since the
> data in many library catalogs isn't linkable, there's no data to use to
> calculate the rank, just as there are not enough current links to Google
> books that would inform ranking. We could use library holdings as a
> ranking characteristic -- basically, you query WorldCat to find out how
> many libraries own the book. That's pretty crude, requires a good
> FRBR-ization of the titles, and is going to give us a very, very long
> tail. (a large number of WorldCat records have only one holding library
> -- based on data about the Google 5 libraries:
> http://dlib.org/dlib/september05/lavoie/09lavoie.html)
>
> Ranking is absolutely key. I gave up using Google desktop search because
> what I was looking for never showed up in the first 1-2 screens.
>
> kc
>
> Tim Spalding wrote:
>> Does anyone know of examples of a fully-spiderable OPAC?
>>
>> It's my contention that libraries would do well in Google and even
>> Google Local if they were spiderable. I've seen the Lamson Library
>> catalog do very well—tops in Google, even without mentioning Plymouth
>> State, but it gets a LOT of push from its association with WpOPAC.
>>
>> But I need some examples. Anyone?
>>
>> Tim
>>
>>
>
> --
> -----------------------------------
> Karen Coyle / Digital Library Consultant
> kcoyle@kcoyle.net http://www.kcoyle.net
> ph.: 510-540-7596
> fx.: 510-848-3913
> mo.: 510-435-8234
> ------------------------------------
>

--
Jonathan Rochkind
Sr. Programmer/Analyst
The Sheridan Libraries
Johns Hopkins University
410.516.8886
rochkind (at) jhu.edu