Re: URN and citations

Alexsander Totic (atotic@ncsa.uiuc.edu)
Sat, 16 Apr 1994 16:53:31 -0500 (CDT)

From: atotic@ncsa.uiuc.edu (Alexsander Totic)
Message-Id: <9404162153.AA22310@void.ncsa.uiuc.edu>
Subject: Re: URN and citations
To: moore@cs.utk.edu (Keith Moore)
Date: Sat, 16 Apr 1994 16:53:31 -0500 (CDT)
In-Reply-To: <199404150149.VAA20425@wilma.cs.utk.edu> from "Keith Moore" at Apr 14, 94 09:49:09 pm

> The problem with this argument is that whether or not two resources
> are "the same" differs depending on your purpose.
>
> It would be reasonable to assign a URN to a resource that changed over time
> (like the weather map). While the meaning of the URN doesn't itself change,
> the object named by the URN does. We wouldn't normally call today's weather
> map and yesterday's weather map "the same".
>
> However, I might need to reference the weather map at 14 April 1994 at
> 3:00pm EDT. This might require a different URN, and a different notion of
> "sameness" would be used.

Being able to refer to a particular field within a URC would be very
useful. For example, the weather map URC might contain URLs for all
available weather maps. We could use something similar to the # directive
in URLs, which would carry the additional query arguments for the URC.

For example:

URN:/uiuc/cs/wxatmos:usmap
    would refer to the weather map (the latest one by default), and
    would be queried from the server with something like:
        SEARCH ID=USMAP;RETURN=URL

URN:/uiuc/cs/wxatmos:usmap#2300190494
    would refer to the weather map available for April 19, 94, 23:00,
    and would be queried with something like:
        SEARCH ID=USMAP;RETURN=URL;RETURN=2300190494
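
A rough sketch of how a client might turn such a URN into the query
strings above. Everything here is an assumption made for illustration:
the function names (parse_urn, parse_timestamp, build_query) are
hypothetical, the SEARCH syntax is just the "something like" form from
this message, and the fragment is assumed to be laid out as HHMMDDMMYY.

```python
from datetime import datetime


def parse_urn(urn):
    """Split 'URN:<path>:<id>#<fragment>' into (path, id, fragment)."""
    if not urn.upper().startswith("URN:"):
        raise ValueError("not a URN: %r" % urn)
    body, _, fragment = urn[4:].partition("#")
    # The identifier is whatever follows the last colon in the body.
    path, _, ident = body.rpartition(":")
    return path, ident, fragment or None


def parse_timestamp(fragment):
    """Interpret a fragment as HHMMDDMMYY (assumed layout)."""
    return datetime.strptime(fragment, "%H%M%d%m%y")


def build_query(urn, want="URL"):
    """Build the hypothetical SEARCH query for a URN."""
    _path, ident, fragment = parse_urn(urn)
    parts = ["SEARCH ID=%s" % ident.upper(), "RETURN=%s" % want]
    if fragment:
        # The fragment is passed through as an extra query argument.
        parts.append("RETURN=%s" % fragment)
    return ";".join(parts)


print(build_query("URN:/uiuc/cs/wxatmos:usmap"))
print(build_query("URN:/uiuc/cs/wxatmos:usmap#2300190494"))
```

With no fragment the query asks for the default (latest) map; with a
fragment the extra argument is appended unchanged.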

This would create some comparison problems. You could not do a strict
comparison of the canonical form. But these problems arise anyway, even
with http: URLs, because for many purposes comparison involves both the
UR* and what you want to do with it. Right now, when lynx caches documents
based upon URLs, it stores both the URL and the method used to obtain the
document (http:.... + (POST|GET)). Similarly, for URCs, when comparing
two of them you have to take into account what you are planning to do
with them (what you expect back as the result of the query). It could be
a full URC, just a list of URLs, or a list of URLs + LIFN. So, on the
client side, for caching, you will have to compare both the URCs and the
expected return values.
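
As a minimal sketch of that client-side caching idea, the cache key
below combines the name with the set of fields the client expects
back. The QueryCache class, its method names, and the example URL are
all hypothetical, and the name is compared as a plain string.

```python
class QueryCache:
    """Cache keyed on (name, expected return fields), not the name alone."""

    def __init__(self):
        self._entries = {}

    @staticmethod
    def make_key(name, expected_fields):
        # Field order and case should not matter, so normalise to a frozenset.
        return (name, frozenset(f.upper() for f in expected_fields))

    def put(self, name, expected_fields, result):
        self._entries[self.make_key(name, expected_fields)] = result

    def get(self, name, expected_fields):
        # Returns None on a cache miss.
        return self._entries.get(self.make_key(name, expected_fields))


cache = QueryCache()
urn = "URN:/uiuc/cs/wxatmos:usmap"
cache.put(urn, ["URL"], ["http://example.invalid/usmap.gif"])

print(cache.get(urn, ["URL"]))          # hit: same name, same expectation
print(cache.get(urn, ["URL", "LIFN"]))  # miss: same name, different expectation
```

The same URN asked for with a different expected result is treated as
a different cache entry, which is the point of the comparison above.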

> Yet, there's another notion of "sameness" that will be important for nearly
> every object to be named -- if two instances of a "file" resource contain
> exactly the same sequence of bytes.

> One means of doing this is to have a distinguished "location independent
> file name" (LIFN) for each valid instance of an object, with a description
> of that object (containing e.g. the MD5 signature) available from the
> naming authority.

As someone mentions later in this thread, why not return MD5 together with
the URL, and then run the check when you download the file?
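
A sketch of that check, assuming the server hands back a hex MD5
digest alongside the URL. The function names are made up for
illustration; hashlib is the Python standard library. MD5 is used only
because it is what the thread discusses; it catches accidental
corruption but is no longer considered collision resistant.

```python
import hashlib


def md5_hex(data):
    """Return the MD5 digest of a byte string as lowercase hex."""
    return hashlib.md5(data).hexdigest()


def verify_download(data, expected_md5):
    """True if the downloaded bytes match the digest returned with the URL."""
    return md5_hex(data) == expected_md5.strip().lower()


# Stand-in for the bytes fetched from the URL and the digest sent with it.
downloaded = b"weather map bytes"
advertised = md5_hex(downloaded)

print(verify_download(downloaded, advertised))
print(verify_download(downloaded + b"x", advertised))
```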

> Ideally, this LIFN would be always used as the actual handle from which a
> location of the resource were derived. In that way it would be possible to
> verify the authenticity and/or integrity of a resource. Furthermore, a
> LIFN->URL mapping obtained from the naming authority for the LIFN, would
> provide reasonable assurance that the URL pointed to the correct (and
> current) version of that resource.

This would create another level of indirection. Why is a LIFN to URL
mapping any more inherently secure than a URN to URL mapping? In both
cases you trust the server to provide you with a valid URL. And both
servers would have the same problems ensuring their URLs are valid.

> > It is not the business of this
> > architecture to make policy choices like that but rather allow
> > flexibility and heterogeneity in how these decisions are made. It is
> > for exactly this reason that version management, for example, is NOT
> > in the list of requirements.

I think the # directive would add a level of flexibility to URN to URC
mappings that is very useful (for versioning, for example).

Aleks