On Tue, 19 Jan 2010, Thomale, J wrote:
>>> A butterfly in the wild is not data.
>>
>> Why not ?
>>
>> -- or at least a *datum*.
>
> The fact of the butterfly being in the wild is a datum. We could say,
> number_of_butterflies_in_wild = 1. That's your piece of data. The
> butterfly *itself* is not (at least, based on the commonly understood
> definition of "data").
I'd have to agree with this. I collected a bunch of posters at last
month's AGU (American Geophysical Union) meeting, so when I get more free
time, I can try to analyze the terms that people use to describe their
collection and processing, and their data in general.
For the science community (at least, the segments that I deal with), the
'data' are observations of the environment. They might be something like
a measurement taken by hand, or a recording from a sensor. The specimen /
sample / whatever is the object of interest, but is *not* the data.
The 'metadata' is then all of the other information about the data:
information about the act of collecting the data (when, where, by whom),
information about the handling, processing and storage of the data, and
possibly aggregations of the data (min/max/mean/mode/std.dev), although
some might consider those a 'higher level dataset'.
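To make the split concrete, here is a rough sketch of how I'd model it.
The class and field names (Observation, CollectionMetadata, Dataset) are
made up for illustration and don't come from any standard or real system;
the point is only that the observations are the data, the collection
details are metadata, and the aggregates sit in the gray area.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Observation:
    """The data: one measured value (by hand or from a sensor)."""
    value: float
    units: str


@dataclass
class CollectionMetadata:
    """Metadata: information about the act of collecting the data."""
    when: str
    where: str
    by_whom: str


@dataclass
class Dataset:
    """Hypothetical container pairing observations with their metadata."""
    observations: list[Observation]
    metadata: CollectionMetadata

    def summary(self) -> dict[str, float]:
        # Aggregates of the data: metadata to some people,
        # a 'higher level dataset' to others.
        values = [obs.value for obs in self.observations]
        return {"min": min(values), "max": max(values), "mean": mean(values)}
```

Note that the specimen itself doesn't appear anywhere in the structure;
only the observations made of it do.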
> And I would say the same thing in response to Simon Spero's assertion
> that documents == data. Documents aren't data. Documents *contain* data,
> but they are not themselves the data. A jar containing jellybeans is not
> itself a jellybean.
Depending on your definition of 'data', documents can be data. However,
in my definition, documents are containers for data and possibly attached
metadata. (And our catalogs hold metadata about the containers as well as
about the content.)
> With that said, I do think we use the term "metadata" differently than
> its literal interpretation ("data about data"). When a librarian talks
> about metadata, it's pretty much understood that they mean "data about
> documents," isn't it? Library metadata == most other professions' data.
And that's part of why we get into the issue of 'higher level datasets'.
(For those who aren't familiar with the CODMAC levels: be very, very
thankful.) Most people cited a NASA page when talking about them, but I
haven't figured out where they moved it when redoing their websites:
http://web.archive.org/web/20080124142217/http://science.hq.nasa.gov/research/earth_science_formats.html
I've never seen the original CODMAC list, though, only other lists
explaining how they compare to the CODMAC levels.
Anyway, one person's metadata is another person's data.
-Joe
Received on Tue Jan 19 2010 - 12:39:33 EST