Re: Another nail in the coffin

From: Alexander Johannesen <alexander.johannesen_at_nyob> Date: Tue, 5 May 2009 15:02:16 +1000 To: NGC4LIB_at_LISTSERV.ND.EDU

On Tue, May 5, 2009 at 00:18, Jonathan Rochkind <rochkind_at_jhu.edu> wrote:
> What is your point, Alex?  How can this discussion help us serve our users better?

What's my point? Ouch. Let's first turn to a guy I happen to trust,
David Weinberger (of "Everything is Miscellaneous" fame), who you'll
notice was in the room of the demonstration (and asked the third
question or so), with a pro and con post ;
http://www.hyperorg.com/blogger/2009/05/04/how-important-is-wolframalpha/

Ok, that out of the way, I *was* expecting this sort of reply (and
possibly especially from you, Jonathan), and I do set myself up for
such, so let's make a few points then ;

1. Sanitizing data is a huge problem in the library world. The MARC
record is a mess and lacks the rigour to make it normalized enough
for computational consumption, which seems to be what most techie
library folks are trying to achieve. The library world spends millions
on this problem alone (the larger libraries, mostly through e.g. OCLC),
and it's at the forefront of the whole FRBR / RDA debacle.

Centralizing this task is of course something that's a bit scoffed at
these days, and as such Wolfram|Alpha fits smack bang into that, but for
highly curated data this is probably something you need. And you have
done this for some time. No one has curated their data more than
libraries have over the last 30 years, but where the data should have
been itemized, tokenized and normalized, you have done none of these
(except for some work by OCLC).

Point: You were on the right track to having the best metadata
collection that curation can muster, yet you let people and technology
race past you without any real or tangible reaction. Was this because
you didn't see it coming, because you didn't think it would happen, or
because you can't even see it still? And if the latter, I've got some
swamp I'd like to sell you.

This is not about doom and gloom. This is about pointing out realities
that are happening right in front of our faces, things that will
impact us deeply, and, if nothing else, speculating as to why it's
happening seemingly without the library world pointing this out.

2. Analyzing sanitized data is hard, but it's something people have
been pursuing for years. The AI crowd of course has hailed this as
their holy grail, and even where NLP has made huge strides, W|A only
has to deal with contextualized first-order logic. How? Well, by the
use of ontologies, of course. (Hint: It's not done in RDF, but it can
certainly be shared as such.)

Point: New, exciting technology that has huge relevance to the library
world is happening through technologies in which the library world is
sadly lacking resources and expertise, and perhaps it would be a good
idea to invest in the directions in which advances seem to be made.
This has implications for both the library business model and the
internal infrastructure and formats used.

3. There are things you can do to data, and Wolfram, being a
mathematical guy, applies equations, etc., from all over. “There are
finite numbers of methods that have been discovered in the history of
science.” There are 5-6 million lines of Mathematica code at work,
all linked into the ontology of what the data is, how it can be
applied, typified and ready for parsing and application.

Point: There are 27 years of really hard mathematical work at play in
the background. This is not an indexer, this is not competing with
Google. It's a mathematical engine for parsing inputs and doing
calculations over them. Divide the magnitude of that tremendous task
by the number of enquiries to the reference desk that either Google
or Wolfram can handle. "How many people died in WWII?" and so on. The
dent in reference librarianship *will* be large.

4. Automated presentation. What do you show people so they can
cognitively grasp it? “Algorithmic presentation technology … tries to
pick out what is important.” Mathematica has worked on “computational
aesthetics” for years. (Pinched from
http://www.hyperorg.com/blogger/2009/04/28/berkman-stephen-wolfram-wolframalphacom/
DW live-blogging during the presentation)

Point: Add the 27 years in point 3 to 1 part semantic, contextually
aware widgets and 1 part enabling technology, and you get an advance
in user experience few can match, even Google. This highly
contextualized platform will put a lot of librarians out of a job if
it works, and to be frank, I see no evidence to the contrary. Also,
the future goes in only one direction, and the system won't get worse
over time.

Wolfram|Alpha has a framework in place that solves every big problem
the library world has technologically struggled with. People shouldn't
dismiss this as "how is this relevant?" when they should be sitting
upright, taking notes, and *learning* from this. Ignoring this kind
of thing is exactly what got library systems to where they are today:
below par and becoming more or less irrelevant.

I didn't post this in a "doom and gloom" way; I posted it because this
will have implications, and the more you ignore it, the larger those
implications will be. They're doing things highly relevant to a big
swathe of librarianship (if not the referencing, then the actual
answers in books that won't be written, and so forth). This is why I
was saying people will become data producers, not book producers,
because it's cheaper and probably is easier to profit from. But don't
worry, you'll have your hardcore fans still.

(Written in haste)

Regards,

Alex
-- 
---------------------------------------------------------------------------
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
------------------------------------------ http://shelter.nu/blog/ --------