Re: FRBR WEMI and identifiers

From: Alexander Johannesen <alexander.johannesen_at_nyob>
Date: Thu, 12 Nov 2009 11:15:16 +1100
To: NGC4LIB_at_LISTSERV.ND.EDU

On Thu, Nov 12, 2009 at 09:24, Jon Phipps <jonphipps_at_gmail.com> wrote:
> As in all things web, your mileage may vary.

Well, I'd disagree slightly with this; there are standards that do
tell us what these things are. It's just that people don't always
implement them accordingly, which may be what you're referring to.

Let me throw in my two bob's worth as well, because there seems to be
some confusion as to what that magical anchor # really means:

   http://id.loc.gov/authorities/sh2008115565
   http://id.loc.gov/authorities/sh2008115565#

These two are different URIs indicating different things, which may
seem rather confusing at first, but we need to think of the URI
business in three distinct steps:

1. At the unparsed level (first-order), think of identities as strict,
case-sensitive string literals. That's really what they are, and that's
our first stop, although no one in their right mind would (or should)
be happy at that level. And yet *many* systems treat URIs as string
literals. It's wrong, but widespread (and probably needs some
consideration).

2. The *next* step is to parse the URI, which will yield equality for
our two URIs above (rule: an empty anchor is ignored).

3. The third step is the actual resolving of the URIs (which includes
the very important but often overlooked business of content
negotiation), which may or may not yield more info about the resources.
But we need to be careful about this step:

   http://id.loc.gov/authorities/sh2008115565
   http://id.loc.gov/authorities/sh2008115565#something

For a *browser* it means moving the top of the viewport as close to the
anchor as possible (so either the top of the document, or the
#something anchor). For this to happen, the content-type (quirkily
resolved to some form of HTML) tells the server to still return the
whole page *regardless* of anchors; it must return the exact same
string of HTML characters. So make a note that content-type has a
significant role in what the URI's resolved resource means.
If the content-type is something more like XML, the anchor - being at
the mercy of the server - is a different resource altogether. The
only reason HTML content-types return the *same* document with or
without an anchor is that we have decided on that special case for HTML
and browsers.
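
For what it's worth, here is a rough Python sketch of the three steps
above, using only the standard library. The function names are made up
for illustration, and the "application/rdf+xml" media type is just an
assumption about what one might ask for; nothing here touches the
network, it only prepares a request.

```python
from urllib.parse import urlsplit
from urllib.request import Request

PLAIN = "http://id.loc.gov/authorities/sh2008115565"
EMPTY = "http://id.loc.gov/authorities/sh2008115565#"
NAMED = "http://id.loc.gov/authorities/sh2008115565#something"


def same_as_strings(a: str, b: str) -> bool:
    """Step 1: compare the URIs as raw, case-sensitive string literals."""
    return a == b


def same_when_parsed(a: str, b: str) -> bool:
    """Step 2: compare parsed components, so an empty anchor drops out."""
    return urlsplit(a) == urlsplit(b)


def prepare_request(uri: str, accept: str) -> Request:
    """Step 3 (preparation only): attach the media type we would ask for.

    The Accept value is an illustrative assumption; no request is sent.
    """
    return Request(uri, headers={"Accept": accept})


if __name__ == "__main__":
    print(same_as_strings(PLAIN, EMPTY))   # literal comparison
    print(same_when_parsed(PLAIN, EMPTY))  # comparison after parsing
    print(same_when_parsed(PLAIN, NAMED))  # a non-empty anchor survives
    req = prepare_request(PLAIN, "application/rdf+xml")
    print(req.get_header("Accept"))
```

The point of keeping the steps as separate functions is that a system
can be honest about which level it is comparing at, rather than
silently stopping at step 1.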

And I think that has caused quite a lot of confusion, including the
whole "cool URIs" thing. :)

Regards,

Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ ----------------------------------------------
------------------ http://www.google.com/profiles/alexander.johannesen ---