Re: Spell checking (was "Elitism - and Aristotle again!")

From: Jon Gorman <jonathan.gorman_at_nyob>
Date: Mon, 6 Aug 2007 12:45:42 -0500
To: NGC4LIB_at_listserv.nd.edu
On 8/5/07, Dan Scott <denials_at_gmail.com> wrote:

> I think we can work towards a UI experience for spell-checking in
> catalogues that is biased towards precision, but enables the user to
> quickly expand recall via spell-checking and thesaurus capabilities in
> a helpful (that is, offer no suggestions that lead to zero hits) and
> progressively disclosed (that is, leave the user in control of the
> search session) manner.


I must admit to not following the thread that spawned this, but I
think this is an interesting example.  I'm glad to see that the [sic]
wasn't in the actual MARC record.  AACR2 only calls for it in case of
inaccuracy in the record. This is not a misspelling or an inaccuracy,
just an archaic/variant spelling.  In either case, a 246 should have
been added.

The interesting thing here is that this could have been caught at a
couple of levels.  The catalog client could have recognized the
variant/archaic spelling and asked the cataloger before committing
the record if they wanted to add a 246.  If there had been a [sic], I
would think that would definitely issue a warning if no 246 was
present.
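
To make the client-side check concrete, here is a rough sketch of what
I have in mind.  It assumes a record is just a dict from MARC tag to a
list of field strings, and that the client has some table of known
variant spellings; VARIANT_SPELLINGS and precommit_warnings are names
made up for illustration, not part of any real cataloging client.

```python
import re

# Hypothetical table of archaic/variant spellings -> modern form.
VARIANT_SPELLINGS = {"serjeant": "sergeant"}


def precommit_warnings(record):
    """Return warnings for a record given as {MARC tag: [field strings]}.

    Assumption: the title lives in 245 and variant titles in 246.
    """
    warnings = []
    has_246 = bool(record.get("246"))
    for title in record.get("245", []):
        lowered = title.lower()
        # A [sic] in the title with no 246 should always be flagged.
        if "[sic]" in lowered and not has_246:
            warnings.append("245 contains [sic] but no 246 is present")
        # A known variant spelling with no 246 prompts the cataloger.
        for word in re.findall(r"[a-z]+", lowered):
            if word in VARIANT_SPELLINGS and not has_246:
                modern = VARIANT_SPELLINGS[word]
                warnings.append(
                    f"'{word}' is a variant of '{modern}'; add a 246?"
                )
    return warnings
```

The client would show these before committing and let the cataloger
decide; nothing here blocks the save.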

If it wasn't caught at that level, still, a smart indexer could have
noticed that it wasn't a common spelling and indicated a variant title
automatically.  (This may or may not be searched, or might display
differently, but would be created).
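
Something like the following is what I mean by a smart indexer, again
only as a sketch.  The spelling table and the function names are
hypothetical, and a real indexer would need a much better notion of
"not a common spelling" than a hand-built lookup table.

```python
import re

# Hypothetical table of archaic/variant spellings -> modern form.
VARIANT_SPELLINGS = {"serjeant": "sergeant"}


def variant_title(title):
    """Return the title with known variants modernized, or None if unchanged."""
    def swap(match):
        word = match.group(0)
        modern = VARIANT_SPELLINGS.get(word.lower())
        if modern is None:
            return word
        # Keep the capitalization of the first letter.
        return modern.capitalize() if word[0].isupper() else modern

    normalized = re.sub(r"[A-Za-z]+", swap, title)
    return normalized if normalized != title else None


def index_entries(title):
    """Titles to index: the original plus an automatic variant, if any."""
    entries = [title]
    alt = variant_title(title)
    if alt:
        entries.append(alt)
    return entries
```

Whether the generated entry is searched by default or displayed
differently is a separate decision; the point is only that it exists.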

Then, at the point the user entered the search, variants could have
been searched as well, with some sort of interface to allow those
results to be added to the search (e.g. a sidebar going "sergeant is
also spelled serjeant, 15 records found with that spelling, add to
search?"), or this could be done automatically.  Finding the variants
only takes a dictionary lookup, though the hit count still has to
come from the index.
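
As a sketch of the search-time side, assuming a toy inverted index
(term -> set of record IDs) and a hand-built variants table, both
hypothetical: variants with zero hits are never suggested, and nothing
is added to the search unless the user asks for it.

```python
# Hypothetical spelling-variant table; a real one would come from a
# dictionary or thesaurus.
VARIANTS = {"sergeant": ["serjeant"], "serjeant": ["sergeant"]}


def suggest_variants(term, index):
    """Return (variant, hit_count) pairs, skipping variants with zero hits."""
    suggestions = []
    for variant in VARIANTS.get(term.lower(), []):
        hits = len(index.get(variant, set()))
        if hits:
            suggestions.append((variant, hits))
    return suggestions


def search(term, index, include_variants=False):
    """Search the toy index; optionally merge in hits for variant spellings."""
    results = set(index.get(term.lower(), set()))
    if include_variants:
        for variant, _ in suggest_variants(term, index):
            results |= index[variant]
    return results
```

The sidebar text would be built from suggest_variants(), and the "add
to search?" link would rerun search() with include_variants=True.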

So this is indicative of a problem not at just one level but at
multiple levels.

Jon Gorman
Received on Mon Aug 06 2007 - 11:38:11 EDT