Message-Id: <9310152318.AA14612@expresso.bunyip.com>
From: Peter Deutsch <peterd@bunyip.com>
Date: Fri, 15 Oct 1993 19:18:39 -0400
In-Reply-To: "Rob Raisch, The Internet Company"'s message as of Oct 15, 13:29
To: "Rob Raisch, The Internet Company" <raisch@internet.com>,
Michael Mealling <ccoprmm@oit.gatech.edu>
Subject: Re: The URN: wrapper and URLs...
Hi Rob,
[ You wrote: ]
> On Fri, 15 Oct 1993, Michael Mealling wrote:
>
> > Personally the biggest gain I think we stand to get from URL: is that
> > we can come up with new schemes without having to notify everyone's
> > software that this is a URL and not a URN. A good example is this:
>
> But, Michael, we need to inform everyone's software how to retrieve the
> URL from the URN, no?
Not necessarily. Although dereferencing a URN into a URL
is a nice trick, it was not the primary goal when we first
discussed the need for URNs. When I first got involved in
this debate, oh so many moons ago, it was because I wanted
something which I could use to test the property of
"sameness" (and we all seem to agree that here "sameness"
means "as defined by the issuing authority"). Note that
this is _not_ a requirement of a URL, which is a pointer
to resources and used for mediating access. Thus, by
definition URNs and URLs are different and we must at
least sometimes be able to distinguish between them.
Now, it's nice to also be able to dereference, and I
personally think we're fully capable of deploying a system
today capable of doing this, but it's not true that a URN
is a URL or that dereferencing must work for the URN to be
of any use.
Returning to the issue of whether we need a prefix to
disambiguate whether something is a URL, URN or whatever,
it might be argued that you will always be able to
distinguish between them by context, or that it's enough
to try a fetch and see if it works, but I'm not sure about
that. Certainly, given that these things are still highly
speculative, and we're still not sure what they'll be used
for, I'd personally vote to hedge our bets and say "stick
on a prefix". It's still early, it makes everything
orthogonal and it's just not that big a deal.
The question seems to be whether having the "URL:" prefix
in all cases makes things easier or simply breaks existing
code for no good reason. I'm not sure, but suspect that
having the prefix will be a help, although to be honest
there is still a certain element of "artistic sensibility"
about this right now. I'm still chewing on this but feel
that there is merit in a truly orthogonal design.
> Both the URL and the URN are fundamentally mechanisms to retrieve
> resources from the network. In the first case, the resource is data, and
> in the second, the resource is a list of URLs.
Actually, the way I see it URNs serve another role. Their
original one was to be used as a test of "sameness". This
is a requirement of URNs which is _not_ a requirement of
URLs and thus I'd like a simple mechanism for determining
when it makes sense to apply this "test of sameness" and
when it doesn't. Now, since I am identifying something
with my URN, it is nice to also ask "is it on the net" and
"where" but these are subsidiary questions, which only
make sense once the item is recognized.
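To make the "test of sameness" point concrete, here is a minimal
sketch of how a prefix tells software when the test applies at all.
The function names and the example labels are hypothetical, and
plain string equality is only a stand-in: the real rule for
"sameness" is whatever the issuing authority defines.

```python
def split_label(label):
    """Split 'URN:rest' or 'URL:rest' into (kind, rest).

    kind is None when the label carries no recognized prefix.
    """
    prefix, sep, rest = label.partition(":")
    if sep and prefix.upper() in ("URN", "URL"):
        return prefix.upper(), rest
    return None, label


def same_resource(a, b):
    """Apply the sameness test to two labels.

    Returns True or False when both labels are URNs, and None when
    the test does not make sense (at least one label is not a URN).
    ASSUMPTION: equality of the name strings stands in for the
    issuing authority's own definition of sameness.
    """
    kind_a, name_a = split_label(a)
    kind_b, name_b = split_label(b)
    if kind_a != "URN" or kind_b != "URN":
        return None
    return name_a == name_b
```

With this, same_resource("URN:a:b:1", "URL:http://www.gatech.edu/")
comes back as None rather than False: the question was never
meaningful, which is different from the answer being "no".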
Taking a slightly different tack, if we recall all those
endless discussions about transcribing these onto napkins,
perhaps letting the user see the label has some use? If
someone is to read over the phone "it says 'help - errr,
some double-yous, something'" you might not recognize it
as "http://www.gatech.edu..." (with the user assuming http
was a typo). It's harder to miss what you have when they
say something like "err, it says 'You Are En ..." etc.
I recognize that this is a weak argument, especially given
that I never got my list of desired and required traits
for these things. It's probably not even valid to quote
"transcribability, readability, etc." since we never agreed
on the relative importance of each trait. Still, these are
issues to consider. Having the prefix seems to make them
more readable and some people thought that important at
least in the past.
> What's the difference between
>
> ftp:some opaque string which is interpreted by the software
>
> (In this case, I as programmer know that I need to
> use 'ftp' to get the property, so I send it to the
> FTP RETRIEVAL ENGINE for further processing.)
>
> and
>
> URL:some opaque string which is interpreted by the software
>
> (In this case, I as programmer know that I need to
> use 'URL' to get the property, so I send it to the
> URL RETRIEVAL ENGINE for further processing.)
Actually, the latter approach lets you have a single URL
engine, which may actually live on another machine. You
don't need to have _any_ knowledge of retrieval protocols
in your application code at that abstraction layer,
everything is handled in the URL processing code.
The former approach assumes you have a minimal core of
protocols that have found their way into your application.
If you can't identify a particular protocol, you might
assume it is one you haven't heard of and send it to a
proxy server anyways to see if they can make anything of
it, you might assume it is an error and report "protocol
not valid" when in fact it is, or you might guess that
there is yet another acronym dreamed up by the URI working
group and send it to the bit bucket in disgust. Seems
cleaner to have all knowledge of retrieval protocols in
one place so I'd vote to structure things to encourage
such behaviour.
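As a rough sketch of what "all knowledge of retrieval protocols in
one place" could look like: the application hands anything marked
"URL:" to a single engine and never inspects the scheme itself. The
class name, the handler registry, the fake FTP handler and the host
name are all assumptions made up for illustration, not an existing
API.

```python
class UrlEngine:
    """Single place that knows which retrieval protocols exist."""

    def __init__(self):
        self._handlers = {}

    def register(self, scheme, handler):
        # Map a scheme name such as "ftp" to a callable that fetches it.
        self._handlers[scheme.lower()] = handler

    def process(self, label):
        # Only labels carrying the "URL:" prefix belong here.
        prefix, sep, rest = label.partition(":")
        if not sep or prefix.upper() != "URL":
            raise ValueError("not a URL: " + label)
        scheme, sep, _ = rest.partition(":")
        handler = self._handlers.get(scheme.lower()) if sep else None
        if handler is None:
            # The engine, not the application, decides what an
            # unknown scheme means (error, proxy, and so on).
            raise LookupError("no handler for scheme: " + scheme)
        return handler(rest)


def fake_ftp(url):
    # Hypothetical stand-in for a real FTP retrieval routine.
    return "ftp fetch of " + url


engine = UrlEngine()
engine.register("ftp", fake_ftp)
result = engine.process("URL:ftp://host.example/pub/file")
```

The application code only ever calls engine.process(); adding a new
scheme means registering one more handler in the engine, which
could just as well live on another machine.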
>
> URN:some opaque string which is interpreted by the software
> (In this case, I as programmer know that I need to
> use 'URN' to get the property, so I send it to the
> URN RETRIEVAL ENGINE for further processing.)
Assuming retrieval is the appropriate action at that
point. It will not always be so.
> ????
>
> Frankly, I don't see the difference. You are adding URL: to the
> beginning, which adds an unnecessary extra step to the process of
> retrieving the property.
But you are moving knowledge about the retrieval layer out
of the application into an abstract layer responsible for
URL processing. If that layer can't process things you can
assume there is a problem. Otherwise, if you try to
process it and fail you still have to send to someone else
for checking, anyways. Again, this may be simply an
artistic issue, but that actually seems like not such a
bad argument to me.
> And in fact, you are always adding, at least, one extra step to all
> processing since you need to strip the unnecessary descriptor. And
> in the usual case, consider:
>
> URN:mechanism://namespace/ref becomes
> mechanism://namespace/ref becomes
Actually, that would be "URN:naming-authority:publisher:identifier:"
(modulo a debate over punctuation, etc., which I'll address
in my reply to Tim to follow).
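For illustration only, a small sketch of pulling that form apart.
It assumes colon separators and a trailing colon exactly as written
above, which is the very punctuation still under debate, and the
type and function names are hypothetical.

```python
from typing import NamedTuple


class Urn(NamedTuple):
    naming_authority: str
    publisher: str
    identifier: str


def parse_urn(label):
    """Parse 'URN:naming-authority:publisher:identifier:'.

    ASSUMPTION: fields are colon-separated, contain no colons
    themselves, and the label ends with a trailing colon.
    """
    parts = label.split(":")
    # Expect the prefix, three fields, and an empty string produced
    # by the trailing colon.
    if len(parts) != 5 or parts[0].upper() != "URN" or parts[4] != "":
        raise ValueError("not in the assumed URN form: " + label)
    if not all(parts[1:4]):
        raise ValueError("empty field in URN: " + label)
    return Urn(*parts[1:4])
```

Note that nothing here says how, or whether, to retrieve anything;
the parse only yields the pieces that identify the item.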
This seems to me an important distinction. We're not just
arguing about two ways to encode retrieval info. This is
about two entities with two different functions.
- peterd
--