Hi Jean,
> So, I was wondering if any one had an example Python or Perl script for
> reading RDF/XML, Turtle, or N-triples file. A simple/partial example
> would be fine.
I worked on a Perl script for reading RDF during last year's OCLC Developer House event.
I used the Perl "RDF::Helper" module since it claimed to "Provide a consistent, high-level API for working with RDF with Perl" [1]. There was a bit of a learning curve, and I was not able to find many RDF::Helper code examples on the interwebs.
For the OCLC Developer House project, we were extracting, parsing, and displaying a library's hours from institutional data in the OCLC WorldCat Registry [2]. I've attached a Perl "proof-of-concept" script and a couple of screenshots showing output. The script file has an additional ".txt" file extension for safe travels through email. The script requires non-core Perl module(s), and you will need to specify a path to a CA root certs file (for HTTPS GETs).
The other Developer House "Registry Hours" project team members worked on a PHP script to do essentially the same thing (although more elegantly and with more functionality). Their code is available on GitHub [3].
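On the Python side of the question, here is a minimal sketch of reading N-Triples using only the standard library. It is not taken from the attached script or from the PHP project. The function name read_ntriples, the regular expression, and the example.org sample data are all made up for illustration, and the pattern handles only simple statements (IRIs, blank nodes, plain/typed/language-tagged literals), not the full N-Triples grammar. For real data, a dedicated RDF library such as rdflib would likely be the safer choice.

```python
import re

# Matches one simple N-Triples statement: subject, predicate, object, final dot.
# Deliberately simplified; it does not cover every corner of the grammar.
TRIPLE_RE = re.compile(
    r'^\s*(<[^>]*>|_:\S+)'   # subject: IRI or blank node
    r'\s+(<[^>]*>)'          # predicate: IRI
    r'\s+(<[^>]*>|_:\S+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)'  # object
    r'\s*\.\s*$'             # terminating dot
)


def read_ntriples(lines):
    """Yield (subject, predicate, object) string tuples from N-Triples lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        match = TRIPLE_RE.match(line)
        if match is None:
            raise ValueError('Unrecognized line: %r' % line)
        yield match.groups()


# Made-up sample data (hypothetical example.org identifiers).
sample = [
    '# made-up sample data',
    '<http://example.org/countries/xx> '
    '<http://www.w3.org/2004/02/skos/core#prefLabel> "Example Land"@en .',
    '<http://example.org/countries/xx> '
    '<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> '
    '<http://www.w3.org/2004/02/skos/core#Concept> .',
]

triples = list(read_ntriples(sample))
for s, p, o in triples:
    print(s, p, o)
```

Because read_ntriples accepts any iterable of lines, an open file handle can be passed in directly, so a large file is processed one line at a time rather than loaded into memory all at once.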
Good luck!
- Michael Doran
[1] http://search.cpan.org/dist/RDF-Helper/
[2] Examples of data for UTA Libraries:
https://worldcat.org/wcr/normal-hours/data/2928
https://worldcat.org/wcr/special-hours/data/2928
[3] https://github.com/oclc-developer-house/wclibhours
# Michael Doran, Systems Librarian
# University of Texas at Arlington
# 817-272-5326 office
# 817-688-1926 mobile
# doran_at_uta.edu
# http://rocky.uta.edu/doran/
> -----Original Message-----
> From: Code for Libraries [mailto:CODE4LIB_at_LISTSERV.ND.EDU] On Behalf Of Jean Roth
> Sent: Tuesday, September 30, 2014 9:14 AM
> To: CODE4LIB_at_LISTSERV.ND.EDU
> Subject: [CODE4LIB] Python or Perl script for reading RDF/XML, Turtle, or N-triples Files
>
> Thank you so much for the reply.
>
> I have not investigated the LCNAF data set thoroughly. However, my
> default/ideal is to read in all variables from a dataset.
>
> So, I was wondering if any one had an example Python or Perl script for
> reading RDF/XML, Turtle, or N-triples file. A simple/partial example
> would be fine.
>
> Thanks,
>
> Jean
>
> On Mon, 29 Sep 2014, Kyle Banerjee wrote:
>
> KB> The best way to handle them depends on what you want to do. You need to
> KB> actually download the NAF files rather than countries or other small files
> KB> as different kinds of data will be organized differently. Just don't try to
> KB> read multigigabyte files in a text editor :)
> KB>
> KB> If you start with one of the giant XML files, the first thing you'll
> KB> probably want to do is extract just the elements that are interesting to
> KB> you. A short string parsing or SAX routine in your language of choice
> KB> should let you get the information in a format you like.
> KB>
> KB> If you download the linked data files and you're interested in actual
> KB> headings (as opposed to traversing relationships), grep and sed in
> KB> combination with the join utility are handy for extracting the elements you
> KB> want and flattening the relationships into something more convenient to
> KB> work with. But there are plenty of other tools that you could also use.
> KB>
> KB> If you don't already have a convenient environment to work on, I'm a fan
> KB> of virtualbox. You can drag and drop things into and out of your regular
> KB> desktop or even access it directly. That way you can view/manipulate files
> KB> with the linux utilities without having to deal with a bunch of clunky file
> KB> transfer operations involving another machine. Very handy for when you have
> KB> to deal with multigigabyte files.
> KB>
> KB> kyle
> KB>
> KB> On Mon, Sep 29, 2014 at 11:19 AM, Jean Roth <jroth_at_nber.org> wrote:
> KB>
> KB> > Thank you! It looks like the files are available as RDF/XML, Turtle, or
> KB> > N-triples files.
> KB> >
> KB> > Any examples or suggestions for reading any of these formats?
> KB> >
> KB> > The MARC Countries file is small, 31-79 kb. I assume a script that
> KB> > would read a small file like that would at least be a start for the LCNAF
> KB> >
> KB> >
> KB>
Received on Tue Sep 30 2014 - 12:43:12 EDT