Re: Converting image of MARC to text MARC?

From: Gavett, Celia <celia.gavett_at_nyob>
Date: Mon, 21 Jul 2025 19:24:36 +0000
To: CODE4LIB_at_LISTS.CLIR.ORG

Hi Erich, 

I do not yet have first-hand experience with this exact issue, but perhaps you could try out Glen A. Greenly's CatalogerGPT tool? From its description (https://chatgpt.com/g/g-8ymg1Ftwo-catalogergpt): "CatalogerGPT creates MarcEdit format MARC records from book contents you provide as images, text, or PDF files."
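
If the OCR layer in the PDFs exposes word positions, a non-AI route might also be worth a try: sort the words by their page coordinates so each MARC field reads top to bottom, left to right. Below is a minimal sketch of that idea only. The (text, x, y) tuple layout, the rebuild_lines name, and the y_tolerance value are all assumptions for illustration, not the output of any particular OCR or PDF tool, and the sample words are made up.

```python
from typing import List, Tuple

# (text, x, y): left edge and top edge of an OCR'd word, in page units.
# This layout is a hypothetical stand-in for whatever your PDF/OCR
# extraction step actually returns.
Word = Tuple[str, float, float]


def rebuild_lines(words: List[Word], y_tolerance: float = 3.0) -> List[str]:
    """Group words into rows by vertical position, then read each row left to right."""
    rows: List[List[Word]] = []
    for word in sorted(words, key=lambda w: w[2]):  # walk the page top to bottom
        # Same row if this word's top edge is close to the row's first word.
        if rows and abs(word[2] - rows[-1][0][2]) <= y_tolerance:
            rows[-1].append(word)
        else:
            rows.append([word])
    # Within each row, order words by their left edge.
    return [" ".join(w[0] for w in sorted(row, key=lambda w: w[1])) for row in rows]


if __name__ == "__main__":
    # Made-up sample: two MARC-like lines whose words arrive out of order.
    sample = [
        ("$aHammer,", 60.0, 100.5),
        ("100", 10.0, 100.0),
        ("1_", 35.0, 101.0),
        ("245", 10.0, 120.0),
        ("$aTitle", 60.0, 119.5),
        ("10", 35.0, 120.0),
    ]
    for line in rebuild_lines(sample):
        print(line)
```

The rebuilt lines would still need proofreading and conversion into whatever format MarcEdit or your ILS expects, and the tolerance would likely need tuning to the scans' line spacing.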

Fingers crossed for your team! 

Best wishes, 
Celia

Celia Gavett
Emerging Technology & Digital Projects Librarian
NYU School of Law Library
40 Washington Square South, Room LB-73
New York, NY 10012
celia.gavett_at_nyu.edu 

-----Original Message-----
From: Code for Libraries <CODE4LIB_at_LISTS.CLIR.ORG> On Behalf Of Hammer, Erich F
Sent: Monday, July 21, 2025 3:14 PM
To: CODE4LIB_at_LISTS.CLIR.ORG
Subject: [CODE4LIB] Converting image of MARC to text MARC?

Without going into details, we inherited a sizeable collection of physical materials from another library, and were only able to capture the unique MARC records in image (PDF) form.  

Visually, they are quite readable and obviously MARC (to a human eye).  They are OCR'd, but as you can imagine, the text is in blocks that, when copied together, do not paste in any useable order that would allow us to process them.  Copy/pasting every little block of text into the right order would take as much time as (likely more than) simply re-typing them all, although possibly with fewer errors.  

Does anyone know of a way to automatically convert these into useable MARC?  It feels like something AI could do if trained, but I haven't a clue how to go about doing that.

Thanks,
Erich

--
Erich Hammer            Head of Library Systems
erich_at_albany.edu         University Libraries
518-442-3891              University @ Albany

"Belief gets in the way of learning."     -- Robert Heinlein