On Oct 21, 2025, at 5:00 PM, Edward M. Corrado <ecorrado_at_ECORRADO.US> wrote:
> Issue 61 of the Code4Lib Journal is now available at: https://journal.code4lib.org/ Articles include:
>
> - What it Means to be a Repository: Real, Trustworthy,
> or Mature? by Seth Shaw
>
> - Building and Deploying the Digital Humanities Quarterly
> Recommender System by Haining Wang, Joel Lee, John A.
> Walsh, Julia Flanders, and Benjamin Charles Germain Lee
>
> - From Notes to Networks: Using Obsidian to Teach Metadata
> and Linked Data by Kara Long and Erin Yunes
>
> - Retrieval-Augmented Generation for Web Archives: A Comparative
> Study of WARC-GPT and a Custom Pipeline by Corey Davis
>
> - Extracting A Large Corpus from the Internet Archive, A Case
> Study by Eric C. Weig
>
> - Liberation of LMS-siloed Instructional Data by Hyung Wook
> Choi, Jonathan Wheeler, Weimao Ke, Lei Wang, Jane Greenberg,
> and Mat Kelly
>
> - Mitigating Aggressive Crawler Traffic in the Age of Generative
> AI: A Collaborative Approach from the University of North
> Carolina at Chapel Hill Libraries by Jason Casden, David Romani,
> Tim Shearer, and Jeff Campbell
For a good time, I applied various distant reading and machine learning techniques to the whole of the Code4Lib issue, and I documented what I learned at the following (temporary) URL: https://bit.ly/3JqwN0X
One of the somewhat different modeling techniques I applied was named-entity extraction. More specifically, I wrote a named-entity extraction tool that identifies and lists human values in documents. This work was part of a research project and was published as "Corpus-based analysis of human values in blockchain and constitutions" by Aditya Joshi, et al. (doi:10.1145/3761826). Based on my observations, our Code4Lib authors value individuality, integrity, transparency, privacy, and trust.
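To give a rough idea of what listing values in a document can look like, here is a minimal sketch. It is not the tool described above and not the method from the paper; it simply counts lexicon matches. The lexicon, its terms, and the function name extract_values are all assumptions invented for illustration.

```python
import re
from collections import Counter

# Hypothetical lexicon: each value maps to a few surface forms.
# These terms are illustrative assumptions, not the vocabulary of the actual tool.
VALUE_LEXICON = {
    "individuality": ["individuality", "individual", "autonomy"],
    "integrity": ["integrity", "authentic", "authenticity"],
    "transparency": ["transparency", "transparent", "openness"],
    "privacy": ["privacy", "private", "confidential"],
    "trust": ["trust", "trusted", "trustworthy"],
}


def extract_values(text, lexicon=VALUE_LEXICON):
    """Count mentions of each value using whole-word, case-insensitive matching."""
    # Invert the lexicon so each surface form points back to its value.
    term_to_value = {
        term: value for value, terms in lexicon.items() for term in terms
    }
    counts = Counter()
    # Tokenize into lowercase alphabetic words.
    for token in re.findall(r"[a-z]+", text.lower()):
        value = term_to_value.get(token)
        if value is not None:
            counts[value] += 1
    return counts


sample = (
    "A trustworthy repository is transparent about its policies "
    "and protects privacy."
)
for value, count in extract_values(sample).most_common():
    print(value, count)
```

Run against the full text of each article, counts like these could then be ranked to suggest which values come up most often. A real extractor would likely need a much larger lexicon or a trained model to handle context and ambiguity.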
Fun with distant reading.
--
Eric Lease Morgan
Librarian Emeritus, University of Notre Dame
Received on Wed Oct 22 2025 - 13:22:24 EDT