Re: opac live search

From: Jonathan Rochkind <rochkind_at_nyob>
Date: Fri, 6 Mar 2009 11:46:30 -0500
To: NGC4LIB_at_LISTSERV.ND.EDU

I think you're right, Jim.

One caveat. In general I'd be cautious about the idea that including 
proper semantic markup on your web page can aid web searching. Didn't we 
try that with "meta" tags, and find that web page authors are neither 
trustworthy nor reliable?  I don't think that's changed.

So I don't think it's in the free web in general that this kind of 
technology will show its promise. It's instead in the kind of 
_controlled_ data that we specialize in: allowing controlled data to be 
easily remixed and re-used in novel ways by third parties, opening up 
our data (and isn't that what libraries are about?) and enabling 
synergistic innovation. 

Note, though, that there are a variety of ways of exercising 
'control'.  The way we've traditionally dealt with our metadata is one 
of them.  But I would consider the Wikipedia model another example of a 
controlled data space.  It's community-exercised control, in a community 
that anyone can join (rather than 'authorized expert control'), but it's 
still a space of more control than the free web at large, one where 
malfeasance and incompetence are _generally_ filtered out, and I think 
that is sufficient control to make this kind of linked data markup useful.  
Wikipedia is already experimenting with various kinds of linked data in 
several ways, and I'd expect that to increase, and to start being very 
valuable. Or Freebase is probably a better example of Wikipedia-like 
control exercised on structured metadata.  I expect that the methods of 
control we use in our domain are increasingly going to move in that 
direction, if they are to stay capable (the OCLC experiment in 
_slightly_ opening up editing ability in Worldcat is one example).

Jonathan

Weinheimer Jim wrote:
> Bernie Sloan wrote:
>   
>> Jim Weinheimer said:
>>
>> "I think it may still become a basic focal point for the Semantic
>> Web."
>>
>> Feel free to smack me if this is a dumb/naive question, but how would this be a
>> basic focal point for the Semantic Web?
>>
>> I'm not doubting the statement...I'm just trying to understand it.
>>
>> Bernie Sloan
>>     
>
> This always scares me. I remember the saying of Mark Twain, "It is better to keep your mouth shut and let everybody suspect you are a fool, than open it and remove all doubt." So, I'm opening my mouth. I just hope my understanding isn't too far off the mark.
>
> I have my own understanding of the Semantic Web. What Eric wrote, I agree with, but I would just like to add a few points. The way I see it, there are essentially two areas:
>
> Let's say you are looking for a new job on the web (just like I’m sure many of us are doing right now), and have you seen those new sites lately? You have your beautiful resume all done, but they want you to reproduce it in their system so that they can run through it quickly in their database to save their own time and energy. So, you have to sit there, copying and pasting, or retyping the same things in, over and over and over, for each job. I understand their needs, but isn’t there an easier way?
>
> The Semantic Web offers a method by which, if you code everything correctly in your resume, it would automatically feed into their system. Being librarians, we understand that this is similar to bringing all the world’s library catalog MARC fields together, so that 100/700 in MARC21 equals the equivalent fields for personal authors in Russian MARC, the equivalent in other MARCs and so on. If we could get to that level, it would make it much easier to share data.
>
> As librarians, it is natural to ask, "It's nice to link fields, but what about the forms of the names, e.g. Samuel Clemens or Mark Twain?" This is the other part I mentioned, or the "conceptual part" that Eric wrote of. In essence, any concept will be expressed as a URI, not by a textual label. Therefore, the idea is that if URIs exist, everybody can link to the same concept. E.g., I have a web page, where I write:
> "I love my cat."
>
> When Google eats it, people can find my page when they search "cat."
>
> But with the correct coding I can do it "semantically": 
> I love my 
> <item rdf:about="http://dbpedia.org/resource/Cat">Cat</item>.
>
> This would allow people to link all "cat" values together, and these could be used for searching. The word "Cat" may appear as Gatto, Kot, or whatever the user wants.
>
> Click on that link to dbpedia and see how it works. In a semantic web search engine, you could search for "cat" and it would actually search for the URI, in this case, http://dbpedia.org/resource/Cat, and bring things together in this way.
>
> Another possibility:
> <item rdf:about="http://dbpedia.org/resource/JimWeinheimer">I</item>
> love my 
> <item rdf:about="http://dbpedia.org/resource/Cat">Cat</item>.
>
> (no page there for me!)
>
> This has consequences for textual labels of the URI and I have suggested that there may no longer be real meaning in the term "preferred form" since theoretically, each user could select any label he or she wants.
>
> This is why I think the LC announcement is so important. It must be one of the great repositories of "concepts" in the world. dbpedia is too. Could these be linked? Of course.
>
> I think the power inherent in such a system is clear enough to envision and its possibilities seem endless. To make it complete, i.e. encompassing all pages on the Internet, would be useless and unrealistic, but this is where the controlled area (Ross Atkinson’s "control zone") vs. the uncontrolled area would be. I see libraries doing their work in the controlled area, and it seems to me to be the logical extension of librarianship in the digital world.
>
> But...
> I may be completely wrong, so if I am, pardons in advance!
>
> Jim Weinheimer
>
>
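
To make Jim's "cat" example above a bit more concrete, here is a minimal
Python sketch of searching by concept URI instead of by text label. Only
the dbpedia Cat URI and the labels Cat/Gatto/Kot come from his message;
the example.org pages, the language codes, the Dog entry, and every
function name are assumptions made up for illustration, not how any real
semantic web search engine works.

```python
from collections import defaultdict

CAT = "http://dbpedia.org/resource/Cat"

# Hypothetical label table: one concept URI, many textual labels.
LABELS = {
    CAT: {"en": "Cat", "it": "Gatto", "pl": "Kot"},
}

# Hypothetical pages, each annotated with the concept URIs it mentions.
PAGES = {
    "http://example.org/jim": [CAT],
    "http://example.org/maria": [CAT],
    "http://example.org/dogs": ["http://dbpedia.org/resource/Dog"],
}


def build_index(pages):
    """Map each concept URI to the set of pages annotated with it."""
    index = defaultdict(set)
    for page, uris in pages.items():
        for uri in uris:
            index[uri].add(page)
    return index


def resolve(term, labels):
    """Return every concept URI that has `term` as a label in any language."""
    term = term.casefold()
    return {
        uri
        for uri, names in labels.items()
        if any(name.casefold() == term for name in names.values())
    }


def search(term, index, labels):
    """Turn the typed word into URIs first, then look up pages by URI."""
    pages = set()
    for uri in resolve(term, labels):
        pages |= index.get(uri, set())
    return sorted(pages)


def display(uri, lang, labels, fallback="en"):
    """Pick the label the user prefers; no single 'preferred form' is built in."""
    names = labels.get(uri, {})
    return names.get(lang) or names.get(fallback) or uri


index = build_index(PAGES)
print(search("gatto", index, LABELS))
print(display(CAT, "pl", LABELS))
```

Searching for "cat", "gatto", or "kot" resolves to the same URI and so
finds the same pages, and display() lets each user choose the label,
which is the point Jim makes about "preferred form". Note that it only
works as long as the annotations in PAGES can be trusted, which is the
controlled-data caveat again.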