On 9/11/2012 3:26 PM, Karen Coyle wrote:
> I suggest you keep an eye on the W3C provenance work. It was recently
> explained to me that they see a move from triples to quads, where the
> source is no more burdensome than the subject, predicate or object, and
> there is no "keeping track." It comes with the data.
But data is not immutable. What happens when a human edits a literal
value? Does it need multiple provenance values? Must the original
attribution be maintained? Does it depend on how much they edit? Do you
have to run a 'diff', and if there are NO substrings in common, the
original attribution is lost? What if a machine makes the edit, possibly
by merging together two data sets?
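
To make the 'diff' question concrete, here is a rough sketch in Python.
Everything in it is an assumption for illustration: the Quad tuple, the
"keep the source if any 4-character substring survives" rule, and the
sample titles are made up, and none of it comes from the W3C provenance
work.

```python
from collections import namedtuple
from difflib import SequenceMatcher

# Hypothetical quad: a triple plus a fourth slot listing the sources.
Quad = namedtuple("Quad", ["subject", "predicate", "obj", "sources"])

def shares_substring(old, new, min_len=4):
    """True if old and new share a contiguous run of at least min_len chars."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    match = matcher.find_longest_match(0, len(old), 0, len(new))
    return match.size >= min_len

def edit_literal(quad, new_value, editor):
    """Naive rule (an assumption): the old sources survive an edit only if
    some substring of the old literal survives it."""
    if shares_substring(quad.obj, new_value):
        sources = quad.sources + (editor,)
    else:
        sources = (editor,)
    return quad._replace(obj=new_value, sources=sources)

# Made-up record and edits.
q = Quad("book:1", "dc:title", "Moby Dick, or, The whale", ("WorldCat",))
fixed = edit_literal(q, "Moby-Dick; or, The Whale", "human-editor")
replaced = edit_literal(q, "XYZ", "human-editor")
print(fixed.sources)     # both WorldCat and the editor are attributed
print(replaced.sources)  # only the editor is attributed
```

The threshold of 4 characters is arbitrary, which is the problem: any
such cutoff is a policy choice that the data itself cannot settle.
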
Even if you keep a complete history of all immutable snapshots
(obviously an increased technical cost in itself, which you may or may
not have wanted to take on otherwise), it's not at all clear when a
given piece of data has changed 'enough' that the original attribution
is no longer required. If it's been entirely replaced with a value from
another source, it SEEMS like it would no longer be required. But what
if that value from another source is actually MOSTLY the same as the
original value, even though the new value came entirely from another
source with its OWN licensing requirements? We're talking about data
here: the title I get from Amazon is quite likely to be similar to the
title I get from WorldCat, even though the 'provenance' of each does not
include the other. If, for a particular book, I start with a title from
WorldCat and replace it with a title from Amazon, does the WorldCat
attribution license still apply because my data set has somehow been
tainted?
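
A small sketch of why a 'diff' can't answer that, using Python's
difflib. The two title strings are invented stand-ins for records
obtained independently from each source, not actual WorldCat or Amazon
data.

```python
from difflib import SequenceMatcher

# Made-up titles standing in for two independently sourced records.
worldcat_title = "Moby Dick, or, The whale"
amazon_title = "Moby-Dick; or, The Whale"

def similarity(a, b):
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower(), autojunk=False).ratio()

score = similarity(worldcat_title, amazon_title)
print(round(score, 2))
```

The score comes out high even though neither string was derived from the
other, so a textual comparison alone can't distinguish "lightly edited
WorldCat title" from "wholesale replacement with the Amazon title."
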
Also, it's great that the W3C is doing provenance work to make this
easier (great in my opinion because it's a genuinely useful function for
reasons other than license requirements).
But to license your data (assuming it's enforceable) such that it's only
convenient/non-burdensome to comply with the license using one
particular not-even-yet-finished-being-invented technology, and not
other technologies, is obviously a barrier/burden to use.
Received on Tue Sep 11 2012 - 15:40:00 EDT