Message-Id: <9310150127.AA13363@expresso.bunyip.com>
From: Peter Deutsch <peterd@bunyip.com>
Date: Thu, 14 Oct 1993 21:27:40 -0400
In-Reply-To: Kevin Altis's message as of Oct 8, 22:13
To: kevin@scic.intel.com (Kevin Altis), uri@bunyip.com
Subject: Re: The URN: wrapper and URLs...
---- WARNING ----
The following posting contains polemic material. I
consumed a whole cappuccino before starting this posting
and as those who know me will attest, that would be enough
to get me going at warp speed. Fortunately, it was only a
small one...
Also, I do hope no one reading this takes offense. I'm
definitely not aiming this at any one person or at the WWW
development community in particular. In fact, I've spent
most of the past two weeks working with, and demoing, WWW
to the publishing community and was most impressed with
many of its capabilities. I'm just somewhat concerned at
the attitude of some in the WWW community who seem to think
that what they have defines the Internet's info delivery
state-of-the-art for the next few years. As a consequence
I've noticed a marked unwillingness to make changes in
their work to accommodate others. I think this is a mistake,
and I'll try to address why in this posting.
Now, on with the show....
[ Kevin Altis wrote: ]
> At 6:14 PM 10/8/93 -0700, Marc Andreessen wrote:
> >Because software has already been deployed on a very wide scale to
> >both serve and retrieve information using the current scheme, and we
> >and others made a huge commitment to maintain and support the current
> >methodology -- which works fine. Changing the fundamental schema of
> >URLs at this point would be a *huge* pain in the ass, and we'd have to
> >support backward compatibility for years.
> >
> >Besides, what's the point? URLs are file:, http:, wais:, etc. URNs
> >can be urn:. URCs can be urc:. That would do the job just fine...
>
> I agree with Marc that a huge installed base of clients, servers, and more
> importantly "content" already exists utilizing the existing URL structure,
> changing the schema at this point will destroy much of the success of WWW
> with little potential for gain in the long run.
Let's not kid ourselves here. Yes, WWW has had some
success with URL-like thingies, and yes, Xmosaic is a
popular program on the Internet among certain groups of
well connected users. On the other hand, the current level
of deployment of any of our tools is a minuscule fragment
of a drop in the teacup compared to the ocean of tools,
info and users that's a'coming in the next few years. If
we let an installed base of a few tens of thousands (or
even a few hundreds of thousands, although I don't think
it's that large) get in the way of getting things right we
will be doing a disservice to the entire Internet
community.
I spent last week at the Frankfurt Book Fair. There were
ten exhibition halls full of publishers' stands. There
were literally thousands of companies, with thousands and
thousands and thousands of titles on display. The Fair had
something like 265,000 visitors in a week and just walking
across the exhibition complex took about 30 minutes. This
place was HUGE...
And guess what? Some of the visitors had heard of the
Internet, almost none of them were connected and absolutely
none of them (except the few people at the one booth we
participated in) were yet serving real info onto the net
in quantity. _These_ are the people we're developing the
UR* specs for, not the few developers and pioneer users of
WWW, Xmosaic or archie. If we insist that the 15 million
users coming on line in the next 12 months use something
"because that's what was used in the Apollo Domain System"
or "because that's what the original archie telnet client
used" we deserve to spend the rest of our days using OSI
protocols to do remote COBOL programming. Please, let's
keep things in perspective and not take quite such a
parochial view of what's going on here.
> Many URLs are protocol specific. Well, technically the first element is the
> name of the scheme and the rest of the URL after the colon may be different
> depending on the scheme. However, most of our URL forms today are
> protocols.
And I'd add developed primarily for specifying and
accessing files.
This dependency on protocols and file formats formed the
heart of my objections to the current URL spec, as I feel
the document is way too tied to the existing practice of
WWW, which is in turn way too file-oriented for what I see
coming in the next couple of years. I made my objections
to this clear over the past year or so, so there's no
particular need to rehash them here, but I'm a little
disappointed that I'm now hearing from the WWW community
that, now that we've taken their standard as the basis of the
URL spec we can't make any changes to it because it would
break WWW and its clients. That strikes me as just
slightly like the case of the guy who murders his parents
and then pleads for leniency since he's an orphan...
The fact is, we're supposed to be building these things
for the next generation of tools. They're supposed to work
not just to make WWW a success, but to make the Internet
as a whole a better place to turn to for information. WWW
is not the end of the line in tools development and we'd be
kidding ourselves if we pretend that it is. We'd also be
setting ourselves up to be completely ignored by the
_next_ set of clever developers who would rightly bypass
what we've done if it doesn't do what they need.
> . . . [An unfortunate side effect is that the most common URLs we see
> today (ftp, gopher, and http) are often confused to mean that if something
> is available at http://domain/file then the user can also get it with
> ftp://domain/file which is almost always not true. Watch some "normal"
> users to see if they make this mistake.] . . .
Of course, I don't think the average user should have to
type these things in anyways, or be trying to ftp a file just
because the tool couldn't find it.
Let's do a reality check here. There are about 2.5 million
files in anonFTP, an equivalent number in Gopher (with
_lots_ of duplicates) plus whatever is available in WWW.
That's peanuts, folks.
In the future, people will use URLs to specify MTV video
feeds, interactive weather servers, on-line phone-in chat
show feeds, volunteer chess position calculation services
(a project I've been planning to implement for years now),
plus _LOTS_ more things that are neither specifically
protocols nor at all file-like. They will simply be
resources which we want to specify in a uniform manner.
The current format will probably be bent to support these
things, but it really is too file/protocol oriented right
now and we'd better allow graceful escape (which is why
I'd prefer to see the leading "URL:" tag required in the
future). To do otherwise is to "program ourselves for
failure" as my dad likes to say.
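To make the "graceful escape" idea concrete, here is a minimal
sketch in Python of a parser that accepts both the tagged form
and today's bare form. The function name split_identifier and
the "chess-position" scheme are hypothetical, made up purely
for illustration; nothing here is taken from the URL draft.

```python
def split_identifier(text):
    """Split 'URL:scheme:rest' or bare 'scheme:rest'.

    Returns (tagged, scheme, rest). The rest is handed back untouched,
    since its layout depends entirely on the scheme.
    """
    head, sep, tail = text.partition(":")
    if not sep or not head:
        raise ValueError("no scheme found in %r" % text)

    if head.upper() == "URL":
        # Tagged form: the real scheme follows the leading "URL:" tag.
        tagged = True
        scheme, sep, rest = tail.partition(":")
        if not sep or not scheme:
            raise ValueError("no scheme after the URL: tag in %r" % text)
    else:
        # Bare form, as deployed today: the first element is the scheme.
        tagged = False
        scheme, rest = head, tail

    return tagged, scheme.lower(), rest


if __name__ == "__main__":
    for example in ("URL:http://host/path",
                    "ftp://host/file",
                    "URL:chess-position:e4-opening"):
        print(split_identifier(example))
```

The point of the sketch is that a required tag gives a reader one
unambiguous place to start, after which the scheme alone decides
how the remainder is interpreted, file-like or not.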
Yes, WWW serves files nicely. Yes, it has some primitive
graphics and page description formatting capability. Still,
apart from its awareness of the client-server model it is
hardly state of the art in electronic publishing. Last
week I saw interactive CD-ROMs with hundreds of commercial
titles shrink-wrapped and ready to go, pocket book readers
the size of calculators with half-size CDs with full
motion video, and I saw full-function page description and
markup languages that support hypertext, full font
control, full text searching and lots more. Although none
of these other systems supports client-server networked
access (which is where we shine), they are all available
as products _today_.
We've got a great medium here, but we do _not_ have any
real products yet. Let's keep this all in perspective.
> . . . I don't think it is clear whether
> URNs, URCs, etc. will necessarily indicate protocols in their schema. Also,
> the slashes of URLs imply hierarchy. I don't necessarily want to impose
> hierarchies on URNs.
I think it's clear that they will _not_ necessarily indicate
protocols. I also think that the presence of slashes in
some particular formats does _not_ indicate general
hierarchy.
As one example, here at Bunyip we have now merged WAIS,
WWW, Gopher and the existing archie server into a single
system (yeah, we toss in Prospero, telnet and email for
good measure). When a user comes in through the gopher
front end and specifies a menu item that is in fact a
WAIS database item (for example, to select an entry in a
catalogue, an entry in the yellow pages, etc) then the
resulting gopher selector string that comes out as part of
the menu item is in fact the information that we need to
feed to the WAIS search engine to extract the info. No
hierarchy implied here.
The appearance of hierarchy to date is in fact an accident
of the widespread use of files and file systems in early
examples. That's an accident of history and should be
treated as such.
Note that the user need never know that the gopher
selector string we supply them is really a WAIS selector
(or notice when we replace the WAIS search engine with an
SQL search engine and change the format of the selectors
one day). That's why we're building this stuff, so they
don't need to know. That's why they're to be treated as
opaque strings.
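A rough sketch of the opaque-string idea, in Python. This is not
our actual server code: the Server class, the two lookup
functions and the selector layouts ("db|id" and "table=key") are
all hypothetical stand-ins, chosen only to show that the client
side never changes when the back end does.

```python
def wais_lookup(selector):
    # Hypothetical stand-in for a WAIS search engine.
    database, doc_id = selector.split("|", 1)
    return "WAIS record %s from %s" % (doc_id, database)


def sql_lookup(selector):
    # Hypothetical stand-in for an SQL search engine.
    table, key = selector.split("=", 1)
    return "SQL row %s from %s" % (key, table)


class Server:
    """Hands out menu items; the selector layout is private to the server."""

    def __init__(self, make_selector, lookup):
        self._make_selector = make_selector
        self._lookup = lookup

    def menu(self, entries):
        # Each menu item pairs a display title with an opaque selector.
        return [(title, self._make_selector(source, key))
                for title, source, key in entries]

    def fetch(self, selector):
        return self._lookup(selector)


def client_pick(server, entries, index):
    # The client echoes the selector back verbatim and never parses it.
    title, selector = server.menu(entries)[index]
    return server.fetch(selector)


if __name__ == "__main__":
    entries = [("Plumbers", "yellow-pages", "42")]
    wais = Server(lambda s, k: "%s|%s" % (s, k), wais_lookup)
    sql = Server(lambda s, k: "%s=%s" % (s, k), sql_lookup)
    print(client_pick(wais, entries, 0))
    print(client_pick(sql, entries, 0))
```

Note that client_pick is identical for both servers. Only the
server knows, or cares, what is inside the selector string.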
> . . . So if anything, maybe the U of "Uniform" (or
> "Universal") should be changed to some other term when we talk about and
> define URNs rather than try and mangle what has already been done with
> URLs.
OH NO!! NOT AGAIN!! ACRONYM WAR ALERT!! THIS IS NOT A DRILL!! :-)
Seriously, I don't think we could stand another round of
acronym wars. The "uniform" means just what it implies.
These are supposed to be a system-independent mechanism
for specifying access to resources (not just files) on the
Internet. They are supposed to be usable in multiple
systems, not just the particular WWW, Gopher or whatever
client you happen to be using this month. They should be
used to specify all sorts of resources, not just files.
And not just WWW items.
Let me assure you, there will be a system to replace WWW,
and I predict that it will take shape in the next year or
so. There has to be.
Why? Because there is too little security, there is too
little support for page description languages, there is
too little support for linking graphics, video and other
multimedia, there is too little support for interacting
with other, dynamically accessible services. Because of
all this, WWW will mutate and be replaced in the next year
or so by something better.
Now, it may be called WWW+, WWW-prime or whatever, or it
may even have a spiffy name like "Surfer" or "Voyager"
(yeah, I know there's already a system called "Voyager").
The point is, it will change. Go back two years and look
around, then fast forward back to today. Do you still feel
like arguing that WWW and its URL format must perforce
determine what the next two years will look like? Seems a
tad short-sighted to me...
Enough ranting. As I said in the disclaimer at the
beginning, I'm not trying to take shots at the WWW
community, but I do feel that some among you have been a
tad short-sighted on this issue. Try to keep in mind that
there are other needs and other communities involved here.
Be prepared to flex a bit. After all, the goal is for us
all to be able to work together. To do that will take some
compromise on everyone's part.
- peterd
--