Re: FRBR WEMI and identifiers

From: Jon Phipps <jonphipps_at_nyob>
Date: Thu, 12 Nov 2009 17:29:45 -0500
To: NGC4LIB_at_LISTSERV.ND.EDU

Hi Jonathan,

Any http server that receives a request for "http://something/index#fragment"
must return the document located at "http://something/index"
if it's operating correctly. So whether the user-agent sends the #fragment
as part of the request shouldn't matter, although if the user-agent expects
to be able to do something with the #fragment, like scroll to that location
in the document, it should send the fragment so it's returned as part of the
response.
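To make the split concrete, here is a minimal sketch in Python, assuming the standard library's urllib.parse is a fair stand-in for what a user-agent does internally. The helper name split_for_request is hypothetical, not something from any spec or from this thread. It only separates the part of the URL that identifies the document from the #fragment:

```python
from urllib.parse import urldefrag


def split_for_request(url):
    """Split a URL into (document URL, fragment).

    Hypothetical helper for illustration: the first element identifies
    the document the server returns, the second is the #fragment part.
    """
    document_url, fragment = urldefrag(url)
    return document_url, fragment


document_url, fragment = split_for_request("http://something/index#fragment")
print(document_url)  # the document the server is asked for
print(fragment)      # the part after the "#"
```

Either way the document in question is the one at "http://something/index"; the fragment only says what to do with it once it arrives.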

If the server is making a decision about how to respond based on the
presence of a URL #fragment alone, it can't depend on the user agent
providing it, as you note, and this isn't a reliable practice.
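One way a server could avoid depending on the fragment is to drop it before deciding what to return. The following is only a sketch under that assumption, in Python with the standard library; document_key is a hypothetical name, and a real server would do this wherever it parses the request line:

```python
from urllib.parse import urlsplit


def document_key(request_target):
    """Return the path (plus query, if any) used to pick the document.

    Hypothetical helper: any #fragment a client happened to send is
    ignored, so the response is the same whether or not it was sent.
    """
    parts = urlsplit(request_target)
    if parts.query:
        return parts.path + "?" + parts.query
    return parts.path


# A client that sends the fragment and one that strips it
# both resolve to the same document.
print(document_key("/index#fragment") == document_key("/index"))
```

With this, a curl-style request for "/index#fragment" and a wget-style request for "/index" are handled identically.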

When we're talking about URIs for linked data, which must adhere to _both_
the rules for URIs and the rules for URLs, we need to make sure we're clear
about which set of specs we mean. It's a lot more apples & oranges than we
sometimes think.

--Jon Phipps
Who sometimes wishes he'd stuck with the whole name too.

On Thu, Nov 12, 2009 at 10:24 AM, Jonathan Rochkind <rochkind_at_jhu.edu> wrote:

> Hmm, okay, I was wrong, the fragment is sent to the server. As I expect you
> knew before asking, heh. Weird.  Since the HTTP spec doesn't seem to allow
> for it.
>
> Interestingly, while curl as a user-agent sends the fragment, wget does
> NOT.   I suspect that many programming libraries used for HTTP get requests
> won't send it either, as I don't think I'm the only one under the impression
> that it's not supposed to be sent to the server.  Let's see if ruby open-uri
> does...     ruby open-uri does NOT,
> open("http://something/index#fragment") in ruby results in simply "/index"
> in the apache logs, NOT "index#fragment".
> I'm not sure it's not _curl_ that's misbehaving, rather than those other
> things!
> This definitely makes things even MORE confusing to me.  I don't like those
> fragment identifiers.
>
> Jonathan
>
>
> Ross Singer wrote:
>
>> On Thu, Nov 12, 2009 at 10:13 AM, Jonathan Rochkind <rochkind_at_jhu.edu>
>> wrote:
>>
>>
>>
>>> Huh, I don't think this is true. I thought it was part of HTTP in
>>> general,
>>> regardless of content-type, that the user-agent will never send the
>>> "fragment identifier" (aka the "hash part") to the server. So the
>>> disposition of a request can't possibly be determined by the server based
>>> on
>>> the fragment identifier, whether it's text/html or application/xml or
>>> anything else. It's not a special case for HTML, it's general for
>>> anything
>>> HTTP.
>>>
>>>
>>
>> Jonathan, if you have access to the commandline "curl", try opening a
>>
>> tail -f /path/to/your/apache/access/log
>>
>> in one terminal and then in another terminal
>>
>> curl "http://some.url.that.the.above.log.will.see/index.html#somecrazyfragment"
>>
>> and tell me what url appears in your log.
>>
>> -Ross.
>>
>>
>>
>