MARC XML from Internet Archives - 404 Errors - Google Books Scans?

From: Sara Amato <samato_at_nyob>
Date: Sun, 4 Sep 2022 21:07:45 -0700
To: CODE4LIB_at_LISTS.CLIR.ORG
Hello All -
I'm trying to download a few thousand MARC XML files from the Internet
Archive for out-of-copyright books. I'm following the URL pattern described
in the Instructions for contributing MARC records for Open Libraries
<https://docs.google.com/document/d/1EnmeLTWhJMRpS860UMFkqCNzb-FEfiT8xFG8oXnAybk/edit#heading=h.fbdtjmsnogsi>
(where iaidentifier is the item's IA identifier):

https://archive.org/download/iaidentifier/iaidentifier_archive_marc.xml

which works for a large portion of the titles, e.g.

https://archive.org/download/scotlandantholog0000unse/scotlandantholog0000unse_archive_marc.xml
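For anyone scripting this, the pattern above can be sketched in Python as follows. This is only a minimal illustration of the URL construction described in this message; the function name marc_xml_url and the constant BASE are hypothetical, not part of any Internet Archive library.

```python
from urllib.parse import quote

# Base of the download URL pattern quoted in this message.
BASE = "https://archive.org/download"


def marc_xml_url(ia_identifier: str) -> str:
    """Build the MARC XML URL for one IA identifier (hypothetical helper)."""
    # Trim stray whitespace and percent-encode, in case the identifier
    # list was pasted from a spreadsheet or similar.
    ident = quote(ia_identifier.strip(), safe="")
    return f"{BASE}/{ident}/{ident}_archive_marc.xml"


if __name__ == "__main__":
    print(marc_xml_url("scotlandantholog0000unse"))
```

The identifier appears twice in the URL, once as the item directory and once as the file name prefix, so building it from a single variable avoids mismatches between the two.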

However, a significant proportion of the URLs give me 404 errors. I suspect
these are all Google Books scans, since the IA identifier ends in 'goog', e.g.

https://archive.org/download/worksstephenoli00churgoog/worksstephenoli00churgoog_archive_marc.xml

and

https://archive.org/download/worldhereandthe00dickgoog/worldhereandthe00dickgoog_archive_marc.xml
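One way to test the 'goog' suspicion across the whole identifier list is to tally HTTP status codes by identifier suffix. The following is a rough sketch using only the Python standard library; the names http_status and tally_by_suffix are hypothetical, and the status_of parameter exists only so the tally can be tried without network access.

```python
import urllib.error
import urllib.request
from collections import Counter


def http_status(url, timeout=30):
    """Return the HTTP status code for url (only the status is inspected)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is on the exception.
        return err.code


def tally_by_suffix(identifiers, status_of=http_status):
    """Count (group, status) pairs, grouping identifiers by a 'goog' suffix."""
    counts = Counter()
    for ident in identifiers:
        url = f"https://archive.org/download/{ident}/{ident}_archive_marc.xml"
        group = "goog" if ident.endswith("goog") else "other"
        counts[(group, status_of(url))] += 1
    return counts
```

If every 404 lands in the "goog" group and none in "other", that would support the suspicion; any 404 in "other" would suggest a different cause.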


Does anyone have experience with this, or any ideas on how to get the MARC
records for titles such as these?
Received on Sun Sep 04 2022 - 23:51:47 EDT