On Sat, 2007-09-01 at 09:03 +1200, Stephen Hearn wrote:
> I'm fine with the idea of testing computational linguistics' ability
> to distinguish David Johnsons, but I wouldn't wave the banner of
> AACR2/LCRI authority control's achievements too high on this one.
> When I checked just now, LC's undifferentiated personal name
> authority for "Johnson, David" had 12 differentiated identities
> resident under that one heading. Current AACR2/LCRI rules restrict
> the allowable qualifiers fairly narrowly. Even if sound statistics
> could sort out the works of "David Johnson" among these twelve
> identities, the rules would still have them all sharing one
> heading--as a matter of principle, I suppose. :)
>
> This points again to the drawbacks of relying on heading forms rather
> than more neutral identifiers as the locus of differentiation.

True! But presumably a catalogue record could already use a Library of
Congress Control Number or something to uniquely identify each
particular author?

I agree that using human-readable names for identifiers is problematic.
This is perhaps another "thing that LIS can learn from CS". In database
design, best practice is to use opaque tokens as identifiers. Putting
actual DATA about an entity into its identifier is a failure of
normalisation rules for database design!
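
To make that concrete, here is a minimal sketch in Python. Everything in
it is hypothetical and only for illustration: the Person class, its
field names and the example works are my own inventions, not any real
catalogue's schema or LC's actual records. The point is only that the
identifier is generated independently of the heading, so two people can
share "Johnson, David" and still be told apart.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Person:
    # Hypothetical authority record. The identifier is an opaque token:
    # it carries no data about the person it identifies.
    heading: str
    note: str = ""
    person_id: str = field(default_factory=lambda: uuid.uuid4().hex)


# Two distinct (invented) people who share the same heading form.
first = Person("Johnson, David", note="hypothetical identity A")
second = Person("Johnson, David", note="hypothetical identity B")

# Works point at the opaque identifier, never at the heading string.
works = {
    "Work one (hypothetical)": first.person_id,
    "Work two (hypothetical)": second.person_id,
}

same_heading = first.heading == second.heading      # headings collide
distinct_ids = first.person_id != second.person_id  # identities do not
```

Because the works refer to person_id rather than to the heading, the
heading can be corrected or qualified later without touching any of the
records that refer to the person.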

I remember, in a previous life, while working for the national office of
a trade union, I was appalled to find that not only did our membership
database NOT identify each member with a unique number, but that this
ridiculous situation was explicitly required by the union rules.
Presumably, the members who had voted this rule into effect had
telephones and bank accounts ... but no way would they allow their own
union to identify them uniquely! :-D The rule is long gone now, thank
goodness!
--
Conal Tuohy
New Zealand Electronic Text Centre
www.nzetc.org
Received on Tue Sep 04 2007 - 01:24:32 EDT