More recent discussion of a similar idea can be found in the Code4Lib
Journal:
Googlizing a Digital Library
Jody DeRidder
http://journal.code4lib.org/articles/43
It's still not obvious to me that an XML surrogate gets you anything,
unless you simply can't make your actual HTML crawlable. I think making
the actual HTML crawlable is highly preferable.
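For what it's worth, the usual way to do that is to expose stable
per-record URLs and list them in a sitemap per the sitemaps.org
protocol (which is XML, but just a list of URLs, not a surrogate of
the records themselves). Here's a minimal sketch in Python; the base
URL and record IDs are hypothetical placeholders for whatever your
catalog actually exposes:

import xml.etree.ElementTree as ET

BASE_URL = "http://catalog.example.edu/record/"    # hypothetical URL pattern
record_ids = ["b1234567", "b1234568", "b1234569"]  # stand-in for a real ID export

# Build a urlset in the sitemaps.org namespace, one <url><loc> per record.
urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
for rid in record_ids:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = BASE_URL + rid

ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)

You'd then point Google at the resulting file via a Sitemap line in
robots.txt or through Webmaster Tools.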
Jonathan
Steven Harris wrote:
> Someone at our library asked recently about getting Google to crawl our catalog. The primary motivation was to reveal some unique items in Special Collections to a wider audience. I found this description of an experiment from several years ago:
>
> http://www.theshiftedlibrarian.com/2003/02/03.html#a3569
>
> Basically, it requires the creation of an XML surrogate of the catalog. What's the status of this idea? Possible? Desirable? Hopelessly labor-intensive? Stupid? Superseded by other approaches? The materials are already in OCLC, so I don't know what a Google crawl of our data would add. Just a chin-scratching morning here today.
>
>
>
> Steven R. Harris
> Collection Development Librarian
> Utah State University
> (435) 797-3861
> http://collections2point0.wordpress.com/
>
>
--
Jonathan Rochkind
Digital Services Software Engineer
The Sheridan Libraries
Johns Hopkins University
410.516.8886
rochkind (at) jhu.edu