Re: LIBER Quarterly Article on Europeana

From: Joe Hourcle <oneiros_at_nyob>
Date: Tue, 19 Jan 2010 12:44:09 -0500
To: NGC4LIB_at_LISTSERV.ND.EDU

On Tue, 19 Jan 2010, Thomale, J wrote:

>>> A butterfly in the wild is not data.
>>
>> Why not ?
>>
>> -- or at least a *datum*.
>
> The fact of the butterfly being in the wild is a datum. We could say, 
> number_of_butterflies_in_wild = 1. That's your piece of data. The 
> butterfly *itself* is not (at least, based on the commonly understood 
> definition of "data").

I'd have to agree with this.  I collected a bunch of posters at last 
month's AGU (American Geophysical Union) meeting, so when I get more free 
time, I can try to analyze the terms that people use to describe the 
collection and processing of their data, and their data in general.

For the science community (at least, the segments that I deal with), the 
'data' are observations of the environment.  They might be something like 
a measurement taken by hand, or a recording from a sensor.  The specimen / 
sample / whatever is the object of interest, but is *not* the data.

The 'metadata' is then all of the other information about the data --
information about the act of collecting the data (when, where, by whom), 
possibly aggregations of the data (min/max/mean/mode/std. dev.), although 
some might consider these a 'higher level dataset', and other information 
about the handling, processing and storage of the data.
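
As a rough illustration of that split (a sketch only, not any community 
standard; the class names, field names and values below are all made up 
for the example), the observations and the information about them might 
be kept apart like this:

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class CollectionInfo:
    """Metadata about the act of collecting: when, where, by whom."""
    when: str
    where: str
    by_whom: str


@dataclass
class Dataset:
    """The observations (the data) plus the information describing them."""
    values: list[float]          # the data: measurements or sensor readings
    collection: CollectionInfo   # the metadata about how they were collected


def summarize(ds: Dataset) -> dict[str, float]:
    """Aggregates of the data: metadata to some, a higher level dataset to others."""
    return {"min": min(ds.values), "max": max(ds.values), "mean": mean(ds.values)}


# Hypothetical readings; the specimen or sample itself appears nowhere in here.
readings = Dataset(
    values=[11.5, 12.0, 12.5],
    collection=CollectionInfo(
        when="2010-01-19",
        where="hypothetical site A",
        by_whom="hypothetical observer",
    ),
)
summary = summarize(readings)
```

Whether 'summary' gets filed as metadata or as a derived dataset is 
exactly the judgment call described above.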


> And I would say the same thing in response to Simon Spero's assertion 
> that documents == data. Documents aren't data. Documents *contain* data, 
> but they are not themselves the data. A jar containing jellybeans is not 
> itself a jellybean.

Depending on your definition of 'data', documents can be data.  However, 
in my definition, documents are containers for data and possibly attached 
metadata.  (And there's metadata about the containers as well as the 
content in our catalogs.)


> With that said, I do think we use the term "metadata" differently than 
> its literal interpretation ("data about data"). When a librarian talks 
> about metadata, it's pretty much understood that they mean "data about 
> documents," isn't it? Library metadata == most other professions' data.

And that's part of why we get into the issue of 'higher level datasets'.  
(For those who aren't familiar with the CODMAC levels -- be very, very 
thankful.)  Most people cited a NASA page when talking about them, but I 
haven't figured out where it was moved when they redid their websites; an 
archived copy is at:
 	http://web.archive.org/web/20080124142217/http://science.hq.nasa.gov/research/earth_science_formats.html
I've never seen the original CODMAC list, though, only other lists 
explaining how they compare to the CODMAC levels.

Anyway, one person's metadata is another person's data.

-Joe
Received on Tue Jan 19 2010 - 12:39:33 EST