Re: Regexp for rewriting LoC LCCN authorised personal names

From: Stuart A. Yeates <syeates_at_nyob>
Date: Tue, 5 May 2026 07:06:34 +1200
To: CODE4LIB_at_LISTS.CLIR.ORG
As it happens, I have already downloaded the records in bulk. What I need
is a regexp to parse the "quoted text".
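
For illustration, a minimal Python sketch of what such a regexp might look
like. It assumes each line of interest has the shape of the sample line
quoted below (subject URI, the mads authoritativeLabel predicate, a quoted
literal, a closing dot). The function names parse_label, unescape and
to_direct_order are hypothetical, and the inversion rule covers only the
simple "Surname, Forenames" and "Surname, Forenames, dates" shapes; anything
else (no comma, extra commas, parentheses) is returned unchanged rather than
guessed at.

```python
import re

# Predicate copied from the sample line in this thread.
LABEL = "http://www.loc.gov/mads/rdf/v1#authoritativeLabel"

# One N-Triples statement: <subject> <predicate> "literal" .
# An optional language tag or datatype after the literal is tolerated.
TRIPLE_RE = re.compile(
    r'^\s*<(?P<uri>[^>]+)>\s+<' + re.escape(LABEL) + r'>\s+'
    r'"(?P<label>(?:[^"\\]|\\.)*)"(?:@[A-Za-z-]+|\^\^<[^>]+>)?\s*\.\s*$'
)

# N-Triples string escapes: \uXXXX, \UXXXXXXXX and single-character escapes.
_ESC = re.compile(r'\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|(.))')
_SIMPLE = {"t": "\t", "n": "\n", "r": "\r", "b": "\b", "f": "\f"}


def unescape(text):
    """Decode N-Triples escapes inside a literal."""
    def repl(m):
        if m.group(1):
            return chr(int(m.group(1), 16))
        if m.group(2):
            return chr(int(m.group(2), 16))
        # \" \' and \\ decode to the character itself.
        return _SIMPLE.get(m.group(3), m.group(3))
    return _ESC.sub(repl, text)


def parse_label(line):
    """Return (uri, label) for an authoritativeLabel line, else None."""
    m = TRIPLE_RE.match(line)
    if m is None:
        return None
    return m.group("uri"), unescape(m.group("label"))


# Assumption: forenames contain no digits; an optional third part that
# contains a digit is treated as dates and left in place after the name.
NAME_RE = re.compile(
    r'^(?P<sur>[^,()\d]+), (?P<fore>[^,()\d]+?)'
    r'(?:, (?P<dates>[^,()]*\d[^,()]*))?$'
)


def to_direct_order(label):
    """Rewrite simple inverted names; return anything else unchanged."""
    m = NAME_RE.match(label)
    if m is None:
        return label
    name = m.group("fore") + " " + m.group("sur")
    if m.group("dates"):
        name += ", " + m.group("dates")
    return name
```

Run over the bulk file, this would be something like calling parse_label on
each line, skipping the None results, and passing the label to
to_direct_order. Labels that come back unchanged are the ones needing a
closer look.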

cheers
stuart

--
...let us be heard from red core to black sky


On Tue, 5 May 2026 at 06:33, Trail, Nate <ntra_at_loc.gov> wrote:

> Stuart,
>
> You could download the entire Names file in "nt" serialization, then
> there's one line for each name you can filter on:
>
>         <http://id.loc.gov/authorities/names/nr2001046558> <http://www.loc.gov/mads/rdf/v1#authoritativeLabel> "Smith, Jim, 1940 October 17-" .
>
> Then you can do what you want with the quoted text.
>
> Saves bandwidth for you and us.
>
> https://id.loc.gov/download/
>
> Good luck,
>
> Nate
>
>
> -----------------------------------------
> Nate Trail
> Network Development & MARC Standards Office
> LCSG/DPS/ABA/NDMSO
> Library of Congress
> Washington DC 20540
>
>
> -----Original Message-----
> From: Code for Libraries <CODE4LIB_at_LISTS.CLIR.ORG> On Behalf Of Kevin
> Hawkins
> Sent: Monday, May 04, 2026 2:08 PM
> To: CODE4LIB_at_LISTS.CLIR.ORG
> Subject: Re: [CODE4LIB] Regexp for rewriting LoC LCCN authorised personal
> names
>
> Hello Stuart,
>
> Do you mean that you want to convert LCNAF personal names from this sort
> of order:
>
> Mudge, Lewis Seymour, 1868-1945
>
> to something like this:
>
> Lewis Seymour Mudge, 1868-1945
>
> ?  But then also deal with authorized forms containing no commas, forms
> with more than two commas, and occasional use of parentheses.  So, as you
> know, it gets complicated.
>
> I wonder if a different approach might make more sense here:
>
> 1. Query the inverted LCNAF form at https://id.loc.gov/
>
> 2. Retrieve the URI, extracting the identifier (beginning with "n")
>
> 3. Query Wikidata using this identifier.
>
> 4. Retrieve Wikidata's form of the name, which is not inverted.
>
> --Kevin
>
> On 5/3/26 1:25 PM, Stuart A. Yeates wrote:
> > Does anyone know of somewhere that describes LCCN authorised personal
> > names as regexps? I want to be able to rewrite them at scale to
> > 'normal' order.
> >
> > AI appears to be actively undermining the functionality of search
> > engines.
> >
> > cheers
> > stuart
> > --
> > ...let us be heard from red core to black sky
>
Received on Mon May 04 2026 - 15:07:35 EDT