Re: The "A" in RDA

From: Stephen Paling <paling_at_nyob>
Date: Sat, 3 Aug 2013 22:27:19 -0500
To: NGC4LIB_at_LISTSERV.ND.EDU

OK, now I'm ~really~ baffled.

On 08/03/13, Karen Coyle wrote:
> On 8/3/13 9:01 AM, Stephen Paling wrote:

> Pagerank is the ranking algorithm, not the search. The search is on words, primarily, although since the algorithm is proprietary, no one knows for sure. But there is no classification, no "this is broader than that", etc.

But "this is broader than that" is only one kind of classification. That's why I asked about the graphs you referred to. Graphs can represent different strengths of relationship, the direction of a relationship (digraphs), etc. Google creates graphs of sites. Sure, the initial search involves the words in the query. But don't you think they use tools like term weighting and inverse document frequency (IDF)? That's why I asked what you meant by "keyword" searching. To me that term whiffs of old-fashioned Boolean searching over sparse surrogate records.
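
To make those two ideas concrete, here is a minimal sketch in Python. It is purely illustrative and is not a description of Google's algorithm, which is proprietary. The toy documents, the example site names, and the helper names idf and in_strength are all assumptions made up for this example.

```python
import math

def idf(term, docs):
    """Inverse document frequency, log(N / df): rarer terms score higher.

    Returns 0.0 for a term that appears in no document (an assumption
    made here only to avoid dividing by zero).
    """
    df = sum(1 for doc in docs if term in doc)
    return math.log(len(docs) / df) if df else 0.0

# Toy collection: each document is reduced to its set of words.
docs = [
    {"library", "catalog", "search"},
    {"library", "graph"},
    {"library", "catalog"},
    {"search", "ranking"},
]

# Weighted digraph of sites: source -> {target: strength of the link}.
# Direction matters: a.example links to b.example, not the reverse.
links = {
    "a.example": {"b.example": 0.9, "c.example": 0.2},
    "b.example": {"c.example": 0.5},
    "c.example": {},
}

def in_strength(node, graph):
    """Sum of the weights on all edges pointing at `node`."""
    return sum(targets.get(node, 0.0) for targets in graph.values())
```

In this toy collection "graph" appears in one document out of four and "library" in three, so idf("graph", docs) is larger than idf("library", docs). That is term weighting rather than Boolean matching. The links structure shows relationships that have both a direction and a strength without any "broader than" hierarchy.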

> >Just to be clear, are we talking about the same kind of graph? https://en.wikipedia.org/wiki/Graph_%28mathematics%29
> 
> No.

What kind of graphs are you referring to, then? What does "an unending graph of relationships" refer to?

> Now THAT'S a shift in topic, aimed to derail the discussion. These systems use library catalog data. Whether or not they've had user testing is another issue -- are the cataloging rules based on user testing? Are the catalog displays based on user testing? Is your catalog based on user testing? Why change the subject?

The disconnect between users' expectations and what the catalog actually delivers is a staple of discussion on this list. It's not changing the subject at all to ask whether these sites have addressed the problems in some way.

> Again, a change of subject. Anything you do on the internet that combines resources rides on the back of content providers. That's what the internet is all about - sharing. If you don't want someone using your data, don't put it online. Why shouldn't LIBRIS link to information resources, like VIAF or DBPedia, that have been made available SO THAT PEOPLE WILL LINK TO THEM? That's what some of us are trying to create with the semantic web. Where are you going with this? Do you think that scientists and governments and academic disciplines should not share their data online?

I'm trying to make sense of what you're saying. I don't mean to be disrespectful. I know that you've done a lot of work over a long period of time. But your arguments often seem at odds with each other. You criticize Google for running algorithms over text, but that's what OPACs do, too. You remark that Google's algorithms are proprietary, but how often do commercial OPAC or database vendors give you their algorithms? You argue that Google doesn't organize information, but then make an unclear reference to graphs, which is exactly what Google ~does~ use to organize information. You talk about wanting to get outside the library, but point to sites that focus on material that libraries own or already have access to. You refer to relationships, which are a key part of Google's ranking. Google is bad because it aggregates without giving anything back, but a library site does the same thing and it's OK?

Google provides infrastructure for people to find what they want. It's like the city streets that allow you to get to the library. Instead of criticizing Google, why not concentrate on building things that users actually want to find when they're out on the Web?

Steve