Message-Id: <9305252056.AA05203@expresso.bunyip.com>
From: Peter Deutsch <peterd@bunyip.com>
Date: Tue, 25 May 1993 16:56:23 -0400
In-Reply-To: Marc Andreessen's message as of May 25, 14:33
To: marca@ncsa.uiuc.edu (Marc Andreessen)
Subject: Re: URLs, URIs, and references
[ You wrote: ]
> I don't get it. How did URL's ever get to the point they're at now if
> there are still calls from some of the principals to reexamine basic
> issues and essentially start over from scratch?
Sorry if I scared anyone, I may just be having a bad day.
I guess my last posting was a sort of "cri du coeur" and
as usual when you don't take the time to fully explain
things you can be misunderstood. Permit me to elaborate
somewhat on my rather hasty posting of last night.
I wish to make it very clear that I'm not having a crisis of
faith in the need and value of URLs and their brethren. I
actually think we've done a pretty good job over the past
year or so defining the needs of an information system
architecture and proposing specific elements of such an
architecture to address the perceived needs. I _do_ have
some concerns about our current working proposal for one
component which I felt I had to raise here, given the
activity on this list over the past couple of weeks and
what I've picked up from the various threads.
I think it fair to ask at this point exactly where we are
with URLs. I think there's been a lot of good and useful
theoretical discussion, which was needed to define the
needs and enunciate possibilities. I'd say that this phase
(which lasted for about a year) went well and I think the
current model that most people seem to agree upon is a
good one.
But it's time to move on to deployed systems. I know of
lots of people holding up their projects for these things
and we seem to be drifting.
When we took a stab at proposing a specific structure for
URLs, to make it easier we based it upon their use in a
single functioning system that has something like what we
need (WWW). Tim had a document on hand which has formed
and determined much of the debate on these things since we
started. In effect, by taking on an existing document from
an existing system we have implicitly accepted all the
assumptions the WWW people made about their own
URL-equivalents.
Unfortunately, the first few times people working on other
systems tried to use the current URL proposal, they seem
to have turned up some problems.
As one example, when I tried to do an additional encoding
for WHOIS++ I seem to have turned up what I think is a
conflict in the community's understanding of whether we
wanted "readability" (as suggested in the draft) or
"transcribability" (as suggested by the stated desire to
be able to write them down on bar napkins). I think we can
agree now that these two concepts are obviously not the
same thing.
Also, when I complained that my mapping wouldn't look a
lot like my own particular protocol (problems with missing
characters, long lines, and so on, all of which imply that
there could conceivably be more work to get the mappings
right in the code) I was basically told "you are using too
many illegal characters" and that I should consider using
the hex encoding mechanism or map different characters
into what I have. Fair enough, if that's what's needed,
but it made me go "hummm".
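For anyone who wants to see what the hex encoding route looks
like in practice, here is a minimal Python sketch. It is an
illustration only: the selector string is a made-up
WHOIS++-style query, not taken from any real server or from the
draft, and treating every reserved character as "illegal"
(safe="") is an assumption chosen to show the worst case.

```python
from urllib.parse import quote, unquote

# Hypothetical selector with spaces, quotes and punctuation that a
# URL draft might class as illegal. Not a real WHOIS++ query.
raw_query = 'name=Peter Deutsch;org="Bunyip Inc"'

# Hex-encode everything except unreserved characters
# (letters, digits, and "_", ".", "-", "~").
encoded = quote(raw_query, safe="")

# Decoding must give back the protocol's own string unchanged.
decoded = unquote(encoded)

print(encoded)
print(decoded == raw_query)
```

The round trip is lossless, so machines are fine, but the
encoded form no longer looks much like the protocol it maps.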
This problem seems to indicate to me that "cross-platform
usability by humans" (driving decisions about character
sets, use of white space, and so on) was to be considered
a major requirement (which could even impact upon
efficient and functional machine handling) and not just a
desirable extra which could be forgone or compromised.
This seems to me to be a fairly major sacrifice and not
one I think we should be making without clearly agreeing
that this is what we want to do.
In this case, I concluded that the intention to make these
easily manipulated by _machines_ was being sacrificed and
I have come to the conclusion that this is wrong. I think
humans should be writing down URNs, and if writing down
URLs is a little hard, well maybe that will encourage
everyone to get URNs right, as well. Others may disagree,
but let's make it clear that the decision was a conscious
one.
Lots of questions about formatting and line wrapping have
also turned up, and someone has just pointed out that their
URLs broke fairly fast when they set up a pilot system and
started mailing them around.
Again, we seem to have conflicting requirements here. In
this case, it seems to be that the unstated desire to
keep them small and compact and looking like a single
selector string has had an impact on the desire to have
them operate harmoniously with the existing Internet
architecture. Is it fair to expect to email these? If so,
the current proposal definitely is flawed.
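A small Python sketch of the kind of breakage in question. The
assumptions are loud ones: the URL is invented, the 72-column
limit is just a typical mailer setting, and hard_wrap() is a
hypothetical stand-in for whatever a real mail path does to
long lines.

```python
def hard_wrap(text, width=72):
    """Break text every `width` characters, as a crude mailer might."""
    pieces = [text[i:i + width] for i in range(0, len(text), width)]
    return "\n".join(pieces)

# Invented URL, built from parts so that it exceeds 72 columns.
segments = ["hypertext", "DataSources", "bySubject", "Overview",
            "archive", "1993", "may", "urls.html"]
url = "http://info.example.org/" + "/".join(segments)

mailed = hard_wrap(url)
lines = mailed.split("\n")

# A reader who takes the first line as the URL gets a truncated one.
truncated = lines[0]

# Rejoining works only because the URL itself contains no white space.
repaired = "".join(mailed.split())

print(len(url), len(lines))
print(truncated == url, repaired == url)
```

The repair step is only safe if white space is never legal
inside a URL, which is one of the character set decisions we
have been arguing about.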
And so it goes....
At first, I couldn't quite decide if we were only having a
few problems with the current doc which could be fixed with
some tweaking, or if maybe we haven't actually started off
on the wrong foot, given the problems that are turning up
as people try to use them in other environments.
Note that I am _not_ claiming that the idea of a URL is
broken. And I'm not even quite willing to yet vote for
chucking the current approach of trying to make them look
like one long selector string. The point I was trying to
make was that we seem to have been moving around in circles for
the past couple of iterations, without moving towards
closure on what a lot of people thought was going to be
the simple stuff. I certainly tried to do a couple of
things and it wasn't what I had imagined it to be. I think
we need more experimentation, and we need more work on the
ground rules.
Part of the problems we're experiencing may have resulted
from a dose of feeping creaturism, and part because not
enough people have actually gone out and done some coding,
while waiting for something to gel. Both should be fairly
easy to fix. We just have to have some people do the
equivalent of that Nike ad and "Just Do It".
Still, I feel some of what we are experiencing comes about
from the fact that we are actually working to a
fundamentally irreconcilable (sp?) set of requirements and
I think that this is giving us some problems. My current
interpretation is that the conflict in our design goals is
real and needs to be addressed before we can get this
right.
The whole point of my only half-joking suggestion that we
consider a MIME format is that those guys have spent a lot
of time on the "fit within the architecture"
considerations, foreign character set support and so on.
Certainly, their stuff is more readable than what we have,
and there's more experience with working with it at this
point. If we want that kind of stuff, there is no need to
reinvent the wheel.
If we don't need what they have, fine. Let's just find out.
Whatever we do, I think we should first go back and spend
one more turn of the crank on defining what we expect of
URLs, get an ordered list of priorities for their
characteristics and then let this ordered list be the
final sanity check for any and all proposals.
So, in the spirit of the season, here is a little snap end of
semester quiz. I've come up with a variety of statements
about URLs that I ask you to rank in order of importance
for your needs.
You have thirty minutes. All questions carry equal weight.
You may refer to your text, but no calculators allowed.
Begin now...
----------------------------------------------------------------------
--- Peter's Pop Quiz for the URL challenged. ---
1) With which of the following statements do you most
agree (pick one and only one):
- "URLs are for machines"
- "URLs are for people"
2) For each set of statements below, rank them in order of
preference. (with a clear understanding that if we give
you your first choice this may make it difficult or
impossible to give you your last choice):
Bonus Marks: rank them all as one complete list.
Encoding:
- "URLs are to be easily readable by humans"
- "URLs are to be easily transcribable without error (by humans)"
- "URLs are to be easily transmitted through existing
infrastructure (e.g. email)"
- "URLs must fit within the confines of a single bar
napkin, at no less than 7 point type."
Functionality:
- "URLs are to be easily manipulated by machine"
- "URLs are to be easily transcribable without error (by machines)"
- "URLs must be easily extensible to new systems"
- "URLs should provide strong typing information."
- "URLs should provide hints to the system about
when they may or may not have gone out of date."
Creation and Ownership:
- "URLs should be derivable from first principles
by anyone who needs them."
- "URLs should be issued only by a specific server."
- "It should be possible, starting with a URL, to
identify the appropriate URN and associated meta-information."
----------------------------------------------------------------------
Okay, that's enough for now, although I'd appreciate it if
anyone cares to add to this list.
Again, I'm not attacking URLs. Despite appearances I'm not
even attacking the current proposal, despite my obvious
misgivings. What I'm doing is asking everyone to agree on
the ground rules so we can make decisions in a meaningful
way, and not as a bunch of ad-hoc hacks to what already
exists.
Enough. Now where'd I leave the ol' asbestos suit?
- peterd
--
------------------------------------------------------------------------------
Peter Deutsch, (514) 875-8611 (phone)
Bunyip Information Systems Inc. (514) 875-8134 (fax)
<peterd@bunyip.com>
"Charging for information is not a crime, any more than charging for food is
a crime. On the other hand, I agree that letting people _starve_ is a crime."
------------------------------------------------------------------------------