The first URL discussion?
Written by Mitra Ardron   
Monday, 21 October 1991 19:49

A record of one of the first discussions - 21st October 1991 - to come up with a common identifier (URL) for everything on the internet.

We were trying to come up with an identifier that could be used across multiple information systems to refer to documents on others. Until that time, identifiers were internal to a specific system, i.e. a WAIS search could only refer to documents that were accessible from WAIS servers, not from Lexus or Gopher, and so on.

This document is the first record I can find of discussions of URLs. The four of us:

  • Dean Tribble, then at Xanadu,
  • Brewster Kahle, inventor of WAIS, then at Thinking Machines, and now Founder & Digital Librarian of the Internet Archive,
  • Bob Schumaker, then at AMiX, now at Fujitsu Cultural Technologies,
  • Mitra Ardron, then at Pandora, and main external developer on Gopher,

had met at the Hackers conference in Tahoe (note this is the old use of the word Hackers - meaning people who push technology to its limits, not people who break into systems!).

The WALS (Wide Area Location Server) server described here was basically a URN resolver that used a URL as a URN. The clever part was that the WALS server could either locate or retrieve the document, allowing it to also be a proxy server.

Nothing much came of the WALS server; it was ahead of its time, but it matched much of the discussion of how to implement URNs. However Brewster took these ideas to IETF in Boston, where they were described over beers to Tim Berners Lee who was looking for something like this for WWW. The rest, as they say, is history.

March 2009: I was emailing with Tim Berners-Lee recently, who corrected our memory of history:

Thanks, Mitra, for the pointer to the WALS document. I would point out that http://www.mitra.biz/uri/wals.html is wrong in saying:

"However Brewster took these ideas to IETF in Boston, where they were described over beers to Tim Berners Lee who was looking for something like this for WWW. The rest, as they say, is history. "

The UDI (Universal Document Identifier) design was in September or October 1990. The scheme was there in the original UDI design for WWW, in the fall of 1990. I coded up http: and ftp: and news: immediately, as I needed to be able to incorporate a bulk of stuff which I used day to day. The news: engine was just an NNTP client which converted to HTML, a very small layer. Not a brilliant news reader, but it allowed you to make hypertext links to any article or group, which I thought was cute at the time, as it was the same hypertext model but a very different distribution protocol. Gopher was similar. Wais: had to go through a gateway, of course, as there was more code than one could just throw in. The prefix and the colon were taken from the VMS disk name (etc), the // was taken from Apollo Domain (which later influenced MS to produce their \\ in Windows SMB). All to make things feel familiar to people who had been using workstations.

Yes, Brewster, I remember you suggesting an email-address-like system. It must have been in Boston in 1992. I think had we talked in 1990 I would have been receptive to it, but in 92 it seemed to me like something which would fragment the web, if people started to fork off different URI syntaxes. When you wrote it up, were you aware of URI, and felt they could be improved on, or not aware of them?

On the question of order, I thought what you proposed was something like [address obscured] in which the order was consistently increasing left to right. And of course then to be consistent you need the scheme name on the right hand side, which you do. But the WALS document has examples like /pub/[address obscured]:ftp where the path is most significant first.

In a way, it would have been good to have http:com/acme/tutorials/mytutorial/chap1 (or chap1.mytutorial.tutorials.acme.com:http) and make the browser figure out where domain name ends and path begins, so as to be able to move that point with time. But anyway.

Ah, those days :)

"Oh yes, my friend we're older but no wiser, as in our hearts the dreams are still the same". :) I hope wisdom and dreams are not totally incompatible!

Tim

So it looks like WALS and URI were developed separately at around the same time - not aware of each other.

For the curious and rushed, skip to the BNF and examples to see just how close they were to the final version.

Bob's version of this is at http://www.io.com/~cobblers/URLS.html

- Mitra


bob, tribble, brewster, 

Dean: Here's my interpretation of the last slide. I tried to capture things
that we talked about at various places on the slide. Don't shoot me
if I'm wrong :-) I'll be writing up something like a proposal RSN.

Mitra: - this looks good, it seems to cover what we are talking
about, some additional notes.

Mitra: Some pseudo-code which I hope makes some things clearer, note
I'm not suggesting things have to be done this way, only that this is
one way to do it, and may clarify intentions to some people.

external locate(Identifier) -> Identifier
# the returned identifier presumably has more location information.
{ locatelocal(Identifier)                # Can we locate this in our rule-base
  if sufficient(Identifier)              # If we can locate it exactly
  then return(Identifier)                # Then we have the answer
  else if Host=wheretoask(Identifier)    # Otherwise who might know
  then return Host!locate(Identifier)    # Ask that WALS server
  else return Identifier                 # Can't tell you anything more
}

local locatelocal(Identifier) -> Identifier
# Mitra: This is a simple subset of locate which won't ask another server;
# it's needed to save fetch having to traverse a tree of WALS servers once
# to find the document and then again to fetch it
{ if can find Identifier in its rule-base
  (see below for example rule-base)
  then transform Identifier by rule and recurse
  else return Identifier
}

local sufficient(Identifier) -> Boolean
# Mitra: I've added this; it's needed below to know when to ask other hosts
{ Return true if this identifier contains sufficient location
  information to retrieve the document
}

local wheretoask(Identifier) -> WALS-server | NULL
# Mitra: Again a rule based function to figure out which WALS server to
# ask about this Identifier. This could be part of locatelocal.
{ return the name of another WALS server to ask about this Identifier
  typically returns the host portion of the location
  if not possible to ask anyone else about this then return NULL
}
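
The lookup flow in the pseudocode above can be sketched in Python. This is a minimal illustration under stated assumptions, not the design the four authors implemented: the rule-base is a plain dictionary, remote WALS servers are modelled as callables keyed by host name, an identifier is treated as "sufficient" once its last Location names a Server (contains "@"), and commas are assumed not to occur inside a GID. The names `rules`, `remote`, and `last_location` are invented for the example.

```python
from typing import Callable, Dict, Optional


def last_location(identifier: str) -> Optional[str]:
    """Return the text after the final comma, or None if there is only a GID."""
    if "," not in identifier:
        return None
    return identifier.rsplit(",", 1)[1].strip()


def locatelocal(identifier: str, rules: Dict[str, str]) -> str:
    """Apply rule-base rewrites until none match; never asks another server."""
    seen = set()  # guards against rules that loop back on themselves
    while identifier in rules and identifier not in seen:
        seen.add(identifier)
        identifier = rules[identifier]
    return identifier


def sufficient(identifier: str) -> bool:
    """Assumption: a Location that names a Server ('@') is enough to retrieve."""
    location = last_location(identifier)
    return location is not None and "@" in location


def wheretoask(identifier: str) -> Optional[str]:
    """Return the host portion of the last Location, or None if there is none."""
    location = last_location(identifier)
    if location is None:
        return None
    return location.rsplit("@", 1)[-1]


def locate(
    identifier: str,
    rules: Dict[str, str],
    remote: Dict[str, Callable[[str], str]],
) -> str:
    """Resolve locally first, then ask the WALS server the rules point to."""
    identifier = locatelocal(identifier, rules)
    if sufficient(identifier):
        return identifier
    host = wheretoask(identifier)
    if host is not None and host in remote:
        return remote[host](identifier)
    return identifier  # cannot say anything more
```

With a rule that only knows which site to ask, `locate` forwards the question to that site; with no matching rule it hands the identifier back unchanged, as in the last branch of the pseudocode.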


external fetch(Identifier) -> doc
# get the contents of the document corresponding to the identifier.
{ locatelocal(Identifier)             # Find as much as we can locally
  if Host=wheretoask(Identifier)      # Figure out who to ask about it
  then return Host!fetch(Identifier)  # ask that host for the document
  else return NULL
}
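
The proxy behaviour of fetch can be sketched as a chain of toy servers. This is an illustrative Python model only: the `WalsServer` class, its `docs` store, and its `peers` table are all invented for the example, and the "serve it ourselves if we hold it" step is an added assumption, since the pseudocode above does not show how the chain terminates. Commas are assumed not to occur inside a GID.

```python
from typing import Dict, Optional


class WalsServer:
    """Toy WALS node: resolves identifiers locally and proxies fetches."""

    def __init__(self, host: str, rules: Dict[str, str], docs: Dict[str, bytes]):
        self.host = host
        self.rules = rules   # GID -> identifier with a location appended
        self.docs = docs     # GIDs this node can serve itself (assumption)
        self.peers: Dict[str, "WalsServer"] = {}

    def locatelocal(self, identifier: str) -> str:
        # Single rule lookup; a fuller version would keep rewriting.
        return self.rules.get(identifier, identifier)

    def wheretoask(self, identifier: str) -> Optional[str]:
        if "," not in identifier:
            return None
        location = identifier.rsplit(",", 1)[1].strip()
        return location.rsplit("@", 1)[-1]

    def fetch(self, identifier: str) -> Optional[bytes]:
        gid = identifier.split(",", 1)[0]
        if gid in self.docs:                        # we hold the document
            return self.docs[gid]
        identifier = self.locatelocal(identifier)   # find what we can locally
        host = self.wheretoask(identifier)          # figure out who to ask
        if host is None or host == self.host or host not in self.peers:
            return None
        return self.peers[host].fetch(identifier)   # act as a proxy
```

A client only ever talks to its nearest server; that server either returns the bytes or relays the request to the host its rules name.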

store(doc, <Identifier>)=> <Identifier>

# Dean: Register the document at some new location. The operation here is
# clearly bogus, but something like that needs to be available. (For
# instance a Xanadu server could claim that it stores a particular
# document, and supply the navigation information.)

BNF
Mitra: I've re-written the BNF from Dean's description. I think my version
is the same, just more precise, if my memory of BNF is still
intact. Dean - please check and see that this is what you meant.
Explanation:
"," means the character comma
[xxx] means none or one occurrence of xxx (the [] is for grouping)
[xxx]* means none or more occurrences of xxx
[xxx]+ means one or more occurrences of xxx
[a-z] means any of the letters a thru z

<Identifier> ::= <GID> [ "," <Location> ]*
# Mitra: Note the , is repeated for multiple locations, also that it is not
# anticipated that multiple Locations will be supported by early
# implementations

<Location> ::= [[ <LocalID> ":" ] <Server> "@" ] <SiteAddress>

# Mitra: note that if there is no LocalID the : is omitted, ditto the @

Mitra: Please check that I have the next group correct; what we allow here
will determine whether we have an Identifier we can parse from right to left

<SiteAddress> ::= <a standard internet site>
<a standard internet site> ::= [ <site part> "." ]+ <site part>
<site part> ::= <alphanumeric>+ # Is this sufficient?

<GID> ::= <SubGID> ":" <Naming convention>

Dean: renamed DB to Server so that we remember that the servers could do
anything to retrieve the document once they are supplied with the
localID. FTP is an example.
Mitra: Do we want to keep this as restrictive as below, or is Server
anything but ":"
<Server> ::= <alphanumeric>+
<alphanumeric> ::= [a-zA-Z0-9]

Mitra: The SubGID can be anything, it is bounded by the known,
parsable, pattern on the right of it.

<SubGID> ::= <any printing ascii character>+

Mitra: I'm not sure what LocalID can be, we need to be able to find
the "," on the left of it, so I'm not sure what we allow - maybe
anything but ","
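
To show how the grammar above supports right-to-left parsing, here is a hedged Python sketch. It follows the BNF as written, with several assumptions that the thread leaves open: commas do not occur inside a GID or LocalID, the naming convention is alphanumeric like Server, and whitespace around commas is ignored. Because it keeps the rule that a SiteAddress has at least two dot-separated parts, a single-label host such as the "localhost" used in Dean's example would be rejected. The names `Location`, `parse_location`, and `parse_identifier` are invented for the example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

SITE = re.compile(r"[A-Za-z0-9]+(\.[A-Za-z0-9]+)+")  # <SiteAddress>
WORD = re.compile(r"[A-Za-z0-9]+")                   # <Server>, naming convention


@dataclass
class Location:
    site: str
    server: Optional[str] = None
    local_id: Optional[str] = None


def parse_location(text: str) -> Location:
    """Parse [[LocalID ':'] Server '@'] SiteAddress, working from the right."""
    left, at, site = text.rpartition("@")
    if not SITE.fullmatch(site):
        raise ValueError(f"bad site address: {site!r}")
    if not at:
        return Location(site)
    local_id, colon, server = left.rpartition(":")
    if not WORD.fullmatch(server):
        raise ValueError(f"bad server name: {server!r}")
    return Location(site, server, local_id if colon else None)


def parse_identifier(text: str) -> Tuple[str, str, List[Location]]:
    """Split an Identifier into (SubGID, naming convention, Locations)."""
    gid, *rest = [part.strip() for part in text.split(",")]
    sub_gid, colon, naming = gid.rpartition(":")
    if not colon or not sub_gid or not WORD.fullmatch(naming):
        raise ValueError(f"bad GID: {gid!r}")
    return sub_gid, naming, [parse_location(part) for part in rest]
```

Splitting on the last "@" and then the last ":" is what lets the LocalID and SubGID contain almost anything: the parser only needs the known, restricted pattern on the right.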

Dean:

Examples of GIDs:
<hash><docID><backendID>:xanadu
docID:lexus
locdocID:[address obscured]:wais
/pub/[address obscured]:ftp

Mitra: I've taken out the {} - were they intended?

Dean:

An example of ftp:
locate(/pub/[address obscured]:ftp) returns either:

/pub/[address obscured]:ftp, /pub/[address obscured]:myFTP@localhost

to invoke a local server "myFTP" with the LocalID: /pub/[address obscured], or

/pub/[address obscured]:ftp, /pub/foo:[address obscured]

to call WALS at think.com and have it run an ftp server there.

Dean: Note that the mapping from <Server> names to the actual servers is
created from a configuration file when the local WALS starts up. In
the example above, myFTP is just the name provided that somehow
identifies the process that will perform FTP.
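
One way to picture the startup mapping Dean describes is the Python sketch below. The thread does not specify a configuration format, so the "name = kind" line syntax, the `#` comments, and the `load_server_table` and `handlers` names are all assumptions made for illustration.

```python
from typing import Callable, Dict

Handler = Callable[[str], bytes]  # takes a LocalID, returns document bytes


def load_server_table(config_text: str, handlers: Dict[str, Handler]) -> Dict[str, Handler]:
    """Build the <Server> name -> handler table from 'name = kind' lines."""
    table: Dict[str, Handler] = {}
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, sep, kind = (piece.strip() for piece in line.partition("="))
        if not sep or kind not in handlers:
            raise ValueError(f"bad config line: {raw!r}")
        table[name] = handlers[kind]
    return table
```

The point of the indirection is that "myFTP" in a Location is only a label: the local WALS decides at startup which process or routine stands behind it.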

An example of lexus:
locate(doc39:lexus) returns doc39:lexus, doc39:[address obscured]

Dean: Among other things, the locate service can apply heuristics to the GID
to figure out who to call for more information. It could also call
some directory service to find out more information.

Dean: I came up with a new way to point out some of the properties of this
scheme. The Global IDs are uninterpreted bytes to WALS, whereas the
location information is uninterpreted bytes to servers using WALS. So
when storing a reference to an external document, Xanadu would store
the location information without trying to parse it, and would simply
hand that information back to WALS when it wanted the document.

Dean: Great stuff!
Mitra: Second that!