Re: Resignation

From: Alexander Johannesen <alexander.johannesen_at_nyob>
Date: Fri, 31 Aug 2007 10:04:37 +1000
To: NGC4LIB_at_listserv.nd.edu

On 8/31/07, Sperr, Edwin <sperr_at_nelinet.net> wrote:
> I guarantee you
> plenty of librarians who *do* care about the relevant developments in CS
> have a hard time stumbling upon them.

Cheap shot: could that be because library systems don't work all
that well? :)

I need to make a few things clear, though. The whole field of AI, and
machine learning especially, is all about creating systems that can
mimic human interpretation. One thing to note here is that the more
specific the desired result, the easier it is to achieve, so it's
easier to identify people in an article than it is to find contextual
relevance within a scientific field, for example. The reason we can't
point to an LCSH cataloger is that no one (in their right mind :)
has applied AI to that task (although we hear that someone has done
something about it). But then, why would subject headings be what
we're aiming for?

The point I'm trying to make is that AI is capable of doing the job,
and as soon as someone catches on to doing it in *our* domain,
catalogers will have a hard time. And that "someone" is probably
Google, and it's probably happening as we speak. (I know several
semWeb and AI folks who started at Google a couple of years ago, and I
doubt they're working as web monkeys.)

The question is simple, though: What is the library world doing in
terms of AI? And if nothing, why not?

> One concern I have is that both World Bank Docs and News stories seem to
> be limited to defined scopes: in one case technical reports that are
> extremely focused (and likely to telegraph their main points ad nauseam)
> and in the other, punchy 10-15 paragraph news stories.

Try the UN corpus with reports that are longer than your average
novel, translated into multiple languages. But I digress from the
important point that unless the library does something in this area,
the library will not be a part of the future of that area.

> I wonder how well such approaches would work in an environment where the
> length of the texts is variable and the texts themselves often
> meandering from point to point?  Is there another test corpus that
> models library requirements better?  Anybody banging at the Project
> Gutenberg docs yet?

Good. (There are those approaches, and numerous other ways to traverse
text.) Yes. (The UN corpus, Project Gutenberg, Wikipedia, Archives.org,
etc.) And yes.
(http://www.ltg.ed.ac.uk/np/publications/ltg/papers/Betts2007Utility.pdf
although I suspect there are many more.)

For a lot more answers, visit the AI FAQs:
   http://www.faqs.org/faqs/ai-faq/

Alex
--
 ---------------------------------------------------------------------------
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
------------------------------------------ http://shelter.nu/blog/ --------