Message-Id: <9410060041.AA09204@expresso.bunyip.com>
From: Peter Deutsch <peterd@bunyip.com>
Date: Wed, 5 Oct 1994 20:41:05 -0400
In-Reply-To: "Daniel W. Connolly"'s message as of Oct 5, 14:19
To: "Daniel W. Connolly" <connolly@hal.com>
Subject: Re: No "TOP" of the docuverse [Was: URC usage scenarios ]
[ Daniel W. Connolly wrote: ]
. . .
> Bingo. I don't know why folks go to such trouble to distinguish URLs
> and URNs.
Perhaps you don't understand what we want them for... :-)
> . . . The URL concept grew out of the WWW addressing architecture
> which was designed to include locationally transparent addresses[1].
> The fact that such addresses have not yet been deployed is not a
> design decision but a reflection of the fact that it takes time to
> deploy technology.
And the URN concept grew out of the need of services such
as ours (archie and its follow-ons) to identify multiple
instantiations of information independent of its location.
When I get lots of archie hits I cannot simply compare
their URLs for equality to see if they're the same
document (because a URL identifies a resource's location
but not its content), nor can I be sure that I found all
copies of a document (because someone is free to rename a
document and its URL would change).
What _I_ want from URNs is an identifier which will be the
same for any copy of a resource regardless of its location
that I can then use to distinguish among multiple
resources. You may not need this functionality, in which
case you don't need URNs, but for some of us there is a
fundamental difference here and we need both URLs and URNs.
Put another way, the requirements for URNs are intended to
allow us to perform an entirely different set of
operations on the named objects. In computer science
terms, you compare URNs and dereference URLs. Thus, I
submit that the difference between the two classes is real.
> The idea that URNs are somehow fundamentally different from URLs is
> odd, and the proposals of deploying a namespace disjoint with the WWW
> address syntax is just plain silly. The WWW address space accommodates
> multiple addressing schemes. When we come up with a service that
> provides the features that folks are looking for in URNs (high
> availability and authentication), we can start writing urn://... if we
> like, but I bet we will write whois://... and solo://... and
> lifn://... or md5://... for a while until one becomes the clear
> winner and the others die out.
I respectfully disagree with the above paragraph. The WWW
address space is just that, an address space, along with
accompanying protocol (and where appropriate, host)
information. A URL gives you the information you need to
access a copy of a resource. It does _not_ allow me to
perform the operation I need to perform, which is to compare
multiple instantiations of resources for equality of
content without examining the content itself. On the other
hand, a suitable URN _will_ allow me to perform that
operation. Ergo, URLs and URNs are not the same thing.
(BTW, I certainly don't require URNs to have high
availability nor authentication. I merely require that
they identify content, not location.)
I do agree that if you have something like
"whois://server/query-string" you in fact have a URL, not
a URN, but an MD5 checksum is not a location pointer and
cannot be used for dereferencing and access without
further work. You still need to be told which host to
connect to, which port to use, which protocol to use and
so on. Given an MD5 checksum, you will still need to find
the appropriate URL before you can go get a copy. This is
the _other_ fundamental operation we require for a URN
scheme, after comparison.
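
As a minimal sketch of the comparison operation (an illustration only, not
how archie or any URN proposal actually computes identifiers), the following
derives an MD5-based identifier from content. The `md5:` prefix, the
`content_id` helper, and the sample data are assumptions made for this example.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Return a hypothetical content identifier based on an MD5 digest."""
    return "md5:" + hashlib.md5(data).hexdigest()


# Two copies of the same bytes, imagined as living at two different locations.
copy_at_site_a = b"contents of the file fred\n"
copy_at_site_b = b"contents of the file fred\n"
unrelated_file = b"something else entirely\n"

id_a = content_id(copy_at_site_a)
id_b = content_id(copy_at_site_b)
id_c = content_id(unrelated_file)

# Once the identifiers exist, equality of content is decided by comparing
# the identifiers alone; the content is not examined again.
same_document = id_a == id_b
different_document = id_a != id_c
```

Note that nothing in the identifier names a host, port, or protocol, so it
cannot be dereferenced on its own; a separate mapping service still has to
turn it into one or more URLs.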
Perhaps the difference between URLs and URNs is being lost
on some people because the current proposals are focusing
not on the comparison requirement, but on the companion
requirement that URNs be easy to dereference. Remember,
resolution is _not_ the only requirement and for many
applications things like MD5 checksums will work fine
(assuming we can build mapping services, which we've
proved we can do with archie).
With that as background, let's consider a couple of
scenarios.
In the archie context, we plan to serve to our users both
a location pointer and a content identifier at the same
time. Thus, a search for the string fred might return:
    URN:12345 URL:ftp://site.com/pub/fred/
    URN:45666 URL:gopher://site.com/usr/fred/
    URN:12345 URL:ftp://bozo.com/pub/fred/
    URN:59555 URL:ftp://mysite.edu/pub/fred/
This allows me to see that the first and third entries are
the same item, so I don't need to examine both. (Of course,
this example also illustrates how archie will allow us
to use multiple access protocols, not just ftp. For what
it's worth, archie now supports multiple collections, and
internal test versions support directing a single query to
multiple collections. Coming soon to an information
service near you... ;-) )
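
A rough sketch of how a client might use such a result, assuming the hits
arrive as (URN, URL) pairs; the `group_by_urn` helper and the tuple layout are
hypothetical and do not describe an actual archie interface:

```python
from collections import defaultdict

# The four hits from the search for "fred", as (urn, url) pairs.
hits = [
    ("12345", "ftp://site.com/pub/fred/"),
    ("45666", "gopher://site.com/usr/fred/"),
    ("12345", "ftp://bozo.com/pub/fred/"),
    ("59555", "ftp://mysite.edu/pub/fred/"),
]


def group_by_urn(pairs):
    """Collect the URLs that share a URN, preserving the order of the hits."""
    groups = defaultdict(list)
    for urn, url in pairs:
        groups[urn].append(url)
    return dict(groups)


groups = group_by_urn(hits)

# Hits with equal URNs are copies of the same item, so only one URL per
# URN needs to be examined.
distinct_items = len(groups)
copies_of_12345 = groups["12345"]
```

Here the grouping is done purely by comparing URNs for equality; no URL is
ever dereferenced.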
Alternatively, I might want to do a search for a
particular URN, say number 12345, and get something like:
    URN:12345 URL:ftp://mysite.com/pub/fred/
    URN:12345 URL:ftp://yoursite.com/usr/zork/
    URN:12345 URL:http://bozo.com/pub/fred/
    URN:12345 URL:gopher://another.site.edu/pub/peterd/ramblings/
This tells me that there are four copies of this
particular document on the net, under several different
combinations of naming or protocol. I can now choose to
fetch the most appropriate copy, if I want it, using my
favorite client.
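
Choosing "the most appropriate copy" could be sketched as follows. The
preference order and the `pick_copy` helper are assumptions for illustration;
a real client might instead weigh network distance, load, or user settings.

```python
from urllib.parse import urlsplit

# The four locations returned for URN 12345.
locations = [
    "ftp://mysite.com/pub/fred/",
    "ftp://yoursite.com/usr/zork/",
    "http://bozo.com/pub/fred/",
    "gopher://another.site.edu/pub/peterd/ramblings/",
]


def pick_copy(urls, preferred_schemes):
    """Return the first URL whose scheme ranks best in preferred_schemes."""
    def rank(url):
        scheme = urlsplit(url).scheme
        if scheme in preferred_schemes:
            return preferred_schemes.index(scheme)
        return len(preferred_schemes)  # unknown schemes sort last
    return min(urls, key=rank)


# A hypothetical client that speaks http best, then ftp, then gopher.
chosen = pick_copy(locations, ["http", "ftp", "gopher"])

# A hypothetical ftp-only client gets the first ftp copy instead.
chosen_ftp = pick_copy(locations, ["ftp"])
```

The URN did the work of finding all the copies; only at this last step does a
URL, with its protocol and host, come into play.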
Of course, we want URNs to have a few other
characteristics as well. We want multiple naming schemes
to allow us to grandfather existing info collections (e.g.,
ISBN numbers). We want them to be easy to transcribe, etc.
Still, the point is that at its heart, a URN is not
intended to allow access but to provide location-independent
naming.
Just as ISBNs and library call numbers are different
things and serve different purposes, I submit that URNs
and URLs are different things and will serve different
purposes.
One final thought. I can imagine future systems which deal
_only_ with URNs, hiding access details from the user. In
fact, we're working on such a system ourselves. I don't
think this means that the distinction between URNs and
URLs will disappear, but simply that we will eventually be
able to hide the URLs from the user (in most cases).
Conceptually and practically, there are still two different
classes of identifier being used, and of course getting to
this ideal state will still require working with the
installed base of URLs. There is a difference here and
even if you don't need both, some of us most definitely
do...
- peterd
--
==============================================================================
...
"It's a -. Shall I tell him?" he asked, looking at Bill. Bill nodded, and
the Penguin leaned across to Bunyip Bluegum and said in a low voice,
"It's a Magic Puddin'."
...
"That's where the Magic comes in," explained Bill. "The more you eat the more
you gets. Cut-an'-come-again is his name, an' cut, an' come again, is his
nature. Me and Sam has been eating away at this Puddin' for years, and
there's not a mark on him."
"The Magic Pudding", by Norman Lindsay
Sounds like a pretty good analogy for the Internet to me
(and yes, that's where we got the name "Bunyip"...)
==============================================================================