Re: METS in JSON-LD?

From: Murray, Gregory <gregory.murray_at_nyob>
Date: Tue, 1 Aug 2023 16:08:45 +0000
To: CODE4LIB_at_LISTS.CLIR.ORG
Hi Manuela,

Personally I’d recommend storing and managing your METS documents as XML and converting to JSON only at the point where you need to process them as JSON. That assumes you have the option of storing as XML. If your database is built mainly or only for JSON, then you’ll have to do a one-time conversion from XML to JSON. In any case, as has already been said, it’s strictly a matter of conversion, and there are lots of tools that can take XML and output generic JSON in a lossless way. Whether you would then need another process to go from JSON to JSON-LD, I don’t know; that’s outside my knowledge.
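
To make the "generic JSON" idea concrete, here is a minimal sketch of such a conversion using only the Python standard library. The function names (element_to_dict, dict_to_element, xml_to_json) are made up for illustration and do not come from any particular tool. It keeps tags, attributes, text, tail text and child order, so the element tree can be rebuilt from the JSON; it does not keep comments, processing instructions or the original namespace prefixes, so it is not byte-for-byte lossless.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Convert one element into a plain dict, keeping child order."""
    return {
        "tag": elem.tag,              # namespaced tags appear as "{uri}local"
        "attrib": dict(elem.attrib),
        "text": elem.text,
        "tail": elem.tail,
        "children": [element_to_dict(child) for child in elem],
    }


def dict_to_element(node):
    """Rebuild an element from the dict produced by element_to_dict."""
    elem = ET.Element(node["tag"], node["attrib"])
    elem.text = node["text"]
    elem.tail = node["tail"]
    for child in node["children"]:
        elem.append(dict_to_element(child))
    return elem


def xml_to_json(xml_text):
    """Parse an XML string and return its generic JSON representation."""
    return json.dumps(element_to_dict(ET.fromstring(xml_text)), indent=2)
```

Calling xml_to_json on the text of a METS file gives a JSON string that can be stored as-is, and json.loads followed by dict_to_element turns it back into an element tree. The result is plain JSON, not JSON-LD; it carries no @context.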

Hope this helps,
Greg


Gregory Murray
Director of Digital Initiatives
Wright Library
Princeton Theological Seminary


From: Code for Libraries <CODE4LIB_at_LISTS.CLIR.ORG> on behalf of parker, anson D (adp6j) <000000d5611add13-dmarc-request_at_LISTS.CLIR.ORG>
Date: Tuesday, August 1, 2023 at 10:55 AM
To: CODE4LIB_at_LISTS.CLIR.ORG <CODE4LIB_at_LISTS.CLIR.ORG>
Subject: Re: [CODE4LIB] METS in JSON-LD?

might be worth spending a minute on some AI bots to play with this

at the end of the day it's an XML->JSON project and there are a bunch of tools that will streamline that

for instance here's a dumbed down python script with a streamlit interface i got out of claude.ai in a couple of queries


import streamlit as st
import xml.etree.ElementTree as ET
import json

st.title('METS to JSON-LD Converter')

uploaded_file = st.file_uploader('Choose a METS XML file', type=['xml'])

if uploaded_file is not None:
    # Load METS file
    tree = ET.parse(uploaded_file)
    root = tree.getroot()

    # JSON-LD context
    context = {
        '@vocab': 'http://schema.org/',
        'dc': 'http://purl.org/dc/elements/1.1/'
    }

    jsonld = {'@context': context}
    jsonld['metadata'] = []

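    # METS and MODS elements are namespaced, so un-prefixed paths match nothing.
    # The URIs below assume METS 1.x with embedded MODS 3 records.
    ns = {'mets': 'http://www.loc.gov/METS/', 'mods': 'http://www.loc.gov/mods/v3'}
    mods_path = 'mets:mdWrap/mets:xmlData/mods:mods/'

    # Parse METS and generate JSON-LD
    for dmdSec in root.iter('{http://www.loc.gov/METS/}dmdSec'):
        def text(path):
            # Return the element's text, or None when the element is absent
            return dmdSec.findtext(mods_path + path, namespaces=ns)

        # Extract metadata
        md = {
            'name': text('mods:titleInfo/mods:title'),
            'author': [{'@type': 'Person', 'name': namePart.text}
                       for namePart in dmdSec.iterfind(mods_path + 'mods:name/mods:namePart', ns)],
            'datePublished': text('mods:originInfo/mods:dateIssued'),
            'publisher': {'@type': 'Organization', 'name': text('mods:originInfo/mods:publisher')},
            'genre': text('mods:genre'),
            'description': text('mods:abstract')
        }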

        jsonld['metadata'].append(md)

    # Output JSON-LD file for download
    json_txt = json.dumps(jsonld, indent=4)
    st.download_button('Download JSON-LD', json_txt, 'metadata.jsonld')

________________________________________
From: Code for Libraries <CODE4LIB_at_LISTS.CLIR.ORG> on behalf of Manuela Pallotto Strickland <000001243159acc2-dmarc-request_at_LISTS.CLIR.ORG>
Sent: Tuesday, August 1, 2023 10:47 AM
To: CODE4LIB_at_LISTS.CLIR.ORG
Subject: [CODE4LIB] METS in JSON-LD?

Hello,
I am posting this question on a couple of lists, so sincere apologies to those who might see it twice (or thrice).
Does anyone know of any work that has been/is being/will be done on 'a' METS JSON-LD serialization?
Any relevant info or comment in this regard will be very much appreciated.
Thank you!
Best wishes,
Manuela


__________________________________________________________________________________

Dr Manuela Pallotto Strickland | Metadata and Digital Preservation Coordinator | Archives & Research Collections | Libraries & Collections
King's College London | Strand | London WC2R 2LS | manuela.pallotto.strickland_at_kcl.ac.uk
Tel: Please call me using MS Teams or Skype for Business, or email to arrange a call

W: https://www.kcl.ac.uk/library/collections/archives
T: twitter.com/KingsArchives and twitter.com/kingslibraries
Blog: blogs.kcl.ac.uk/kingscollections
Received on Tue Aug 01 2023 - 11:30:50 EDT