Re: After MARC...MODS?

From: Alexander Johannesen <alexander.johannesen_at_nyob>
Date: Tue, 27 Apr 2010 14:29:11 +1000
To: NGC4LIB_at_LISTSERV.ND.EDU

Jim,

> Pardon, I have never maintained that searching library catalogs is
> *better.* I have said that there are powers that are absolutely not
> in the Google-type searches, and that these powers are important.

Ok, let's go through this together ; we are talking about how the one
is better at getting to results over the other. Your initial challenge
was in "Afro-Americans in US agriculture through history" where the
pitfalls are what certain concepts were called at various points
through time, so if you search for Afro-Americans you won't get the
more contemporary blacks (or worse). And you give anecdotal evidence
for you being correct that Google does a crap job at this.

Here's my point again ; there are many different ways of achieving the
"best" results, either through your way or the Google way or some
other way. This is all search methodology, and *not* eschatology of
the tools we use. These are human problems of a linguistic nature
more than they are concepts that are embedded in our various cultures
and our data models.

How to get this point across? Let's build on logic ; if I can get from
A to B (the path of starting from nothing, and getting to the piece of
knowledge I sought) through means X, why is means Y important? By this
I'm not saying means Y is not important nor helpful nor interesting.
I'm asking *why* it is so. You might have an opinion and some
experience that tell you that means Y *is* important, maybe there's
some cross-pollination going on there, tacit journeys of knowledge,
a stronger basis of truth, whatever, something! But a bare statement
that an alternative means is important should be backed up with
statements that point to its value.

How do we get to knowledge? How do we get to *good* knowledge? What
does "good" mean? It used to be that we *only* got good knowledge
("good" as in trusted, verified, backed-up, written down, stored away,
agreed through consensus, etc.) through means Y. These days there's
also means X, Z and L. Their intrinsic importance is rooted in whether
they bring the good knowledge or not. How they do it is actually
irrelevant in a value system based on the knowledge itself. (Do also
note here that the value system for knowledge has changed just as
dramatically as the means and medium of knowledge have.)

I suspect this is the cause for our disagreement. There is the
potential of knowledge (the corpus), then the found knowledge (the
search), and then the scrutiny of the found knowledge (eh, scrutiny).
Both our models (yours and mine in this dispute) have *greatly*
different results at every single point, and the method of doing each
point differs, but the functionality of the points is exactly the
same. You might say that the quest is the same, the paths, lakes and
mountains are the same, but they all have different cities and towns
with different names around, and the languages used are different.

It's easy to scrutinize the corpus for which is better (Google has
more, has different things, in more languages, unstructured), a little
harder looking at the search results, and hardest at the actual
scrutiny of what was found (especially hard in libraries where you
don't actually have the knowledge yet, just a physical "link" to it).
But this is not controversy. The real problem lies in the change of
value systems of knowledge, the change in access to the channels of
knowledge, and the changing shape of knowledge itself. Knowledge is
not attached to concepts, is not found in metadata, it cannot be
found through a search on Google nor through a reference librarian
with spare time ; you find it in the model in which you perceive that
knowledge. What I suspect we all (all who create computer systems that
deal with knowledge) struggle to do is to create a computer system
model that matches the mental model as far as possible so that
knowledge is easier to spot and consume.

Let me get a few words in in answer to your next statements to clarify ;

> While you may continue to maintain that you can search concepts through
> these newer tools, or that it's unimportant to do so, many of my students
> certainly have problems with full-text and have wound up in deep holes.

Jim, my point here was that "concepts" are themselves just a concept,
a human construct to define something that is mostly undefinable yet
recognizable as knowledge. "African-American" is such a concept that
sometimes holds value to the searcher, other times not. There is no
given intrinsic value in that concept on the knowledge it tries to
portray, so to speak. In order for the human brain to encapsulate the
knowledge it needs the external model to match with the internal
model. Concepts, as widely used, try to perform this function, but
they are as fallible as there are other concepts that overlap and as
many other concepts are attached to the knowledge (and remember that
*no* concept is also a concept!).

You have as much chance of getting the right piece of knowledge as
there are pieces of knowledge around. There are simply too many
parameters in the search to say that it holds any particular value
over any other method of doing so, and our example here is our library
specialist vs. Google dichotomy ; your method, in order to be a
contender for what we want to practice and persevere with, must be
able to demonstrate value. What value?

Well, that was what I wanted you to point out. And this next section
points to this, and I feel I must say something about it ;

> The Semantic Web project wouldn't need to be built. I demonstrated
> what you would miss by searching Google in the way you did "blacks
> agriculture united states", but those are all dismissed

... because it's a straw man. I can do the same to you; if you
search for Mark Twain you will miss out on all of that which was
written by Samuel Clemens or Josh or Thomas Jefferson Snodgrass. Yeah,
I know ; rookie mistake, so off you go and do the right searches, tick
the right boxes, concoct the right search. Therein lies my point ; in
Google you do the same. And there just like in the library, if no one
tells you or gives you the hints of these alternative paths, you will
get stuck in a hole.

So what you're saying is, and I asked you this specifically in my last
post, that the librarian helping you search *is* this advantage you're
talking about, and in fact it is not how you can search for concepts,
tick the right boxes or concoct just the right arcane search riddle in
order to get the OPAC gods to answer properly. The librarian *knows*
these tricks, and demonstrates their awesome powers, just like someone
else like me can know the Google tricks and demonstrate my powers. The
methodology is the same, the path is the same. However, the data is
not, nor is the result.

> and besides, how could somebody be aware of materials you never
> even see in the first place?

The library doesn't do any better here.

> You didn't see it; you were happy with the result you would get in Google,
>  even after I showed some of the concepts you could not--by definition
> --possibly see or even know about. How are you supposed to know about
>  these things that are in your "concept" but that don't appear in your
> search result?

Here I feel you're grasping for straws (not least by saying I'm so
happy with whatever I get from Google. Didn't you read my Google
methodology?). Why are you so persistent in saying that the search
result can't possibly (by its very definition, no less!) give these
hints? Of course they do! Every single returned item is a hint of
where to go to or search for next. Remember, it's a network? A network
of fully available text? It's millions upon millions of links, all
there to let you search around until you find what you're after. I'm
sure we can take off on some imaginary search and explain the good and
the bad of the method, but if you get to good knowledge, why does the
method matter? What value do you offer that is worth persevering with?

> How do you do this in a library catalog? It's a lot of work. You
> have to find the authorized forms (i.e. "concepts") of everything,
> look at the cross-references for related "concepts," find catalog
> records that seem to match your search and search the "concepts"
> you find there. The tool is powerful if you know how to use it.

As an aside, do you know that this is the very essence of Topic Maps
technology, something I've peddled here and other places for over 10
years? Sometimes I wonder if librarians ever pay attention to anything
people outside ever say. *grumble*

> But I can't show this in 5 minutes--that's why there is such a thing
> as bibliographic instruction and information literacy courses.

Their complexity does not bode well for explaining their value, though.

> No, it's not uncomfortable ; it's nonsense.
>
> Of course it's nonsense. I'm glad you have such a deep expertise on
> the fundamentals of how a catalog works that it simply outshines all
> of my own experiences as well as that of many other catalogers
> on this list, so that our ideas can simply be dismissed as nonsense.

No need to get snarky; I stated my credentials just to make sure that
I wouldn't be dismissed as an outsider who doesn't know what they're
talking about, which is what a lot of librarians do when they're
challenged. I know *something* about what goes on in the back there,
and I have some knowledge to judge what you say about concept search,
specialist search tools and their like. I'm not some Google-loving
wanker who dismisses the good ol' librarian ways because I don't think
they're worth anything.

I'm trying to make you explain that value, a) so that I can understand
it better, b) for you to scrutinize it better, and c) for all of
us to make sure we know that what we claim is good *really* is good.

> Continue to insist that a search for "cats"

In a discussion like this even *you* will have to give context. A
concept doesn't exist until you need it. I could indeed be looking for
plain normal feline cats, and my result would have been perfect. So
what cats are you on about?

> or "blacks"

1. Blacks clothing, 2. Wikipedia "Black" the color, 3. Wikipedia
"Black people" ; any of these gives a hint that your search was
wrong, and 3. gives specific details of what you should be looking for.

> or "Dostoyevsky"

Did you mean: Dostoevsky

> pull out their related "concepts" from Google, and maintain how
> useful they are to people. And people will believe it until it is not just
> a matter of surfing and hunting around, and information becomes
> a serious matter for them. That's when we have to pick them up.

I suspect that there are many, many people you don't pick up at all
because, well, they are better at searching Google. Or a semantic
search engine. Or the search in specialist tools. Or DBpedia. Or
Wikipedia.

Why do I get the feeling you think the web and Google are dead-ends in
finding good knowledge, that after the search result comes back,
*that* is the premise for the argument for Google? It sounds a bit,
um, strange coming from someone like you who has demonstrated in the
past deep knowledge of all the paths that lead to good knowledge.
Surely you haven't forgotten? Surely this is a misunderstanding
between us, where rhetorical devices get a little overused in trying
to score points for your home team?

> This is one of the big problems that catalogers have towards many
> systems people. While catalogers and librarians are supposed to
> respect everything the systems people say about the power of the
> new searching (which my students will insist doesn't work much of the
> time, protestations to the contrary--but their complaints can all be
> ignored), these same systems people insist that they know everything
> we know, only more and better.

If you mean me as part of that systems people group, I'm saddened that
your reading comprehension of my mails and the generic gist of what
I'm trying to depict has gone down the gurgler. No one in this systems
people group fights for the librarian knowledge nuggets more than me.
Maybe you're confusing my scrutiny of their value in the future
library and non-library systems for a generic devaluation of library
knowledge?

> I am happy in your faith in Google. It is truly unshakeable.

I don't care about Google, nor do I care about any brand or method. I
do, however, care about what brings results. I do believe in getting
better results, quite a lot. That is the belief that is unshakable;
that we must always strive for getting better, and that there always
will be better ways. I don't care whether that is called Jim's crazy
system or Google. Results. End of story.


Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ ----------------------------------------------
------------------ http://www.google.com/profiles/alexander.johannesen ---
Received on Tue Apr 27 2010 - 00:30:47 EDT