Re: Version 5 of URL draft

Roy T. Fielding (fielding@simplon.ICS.UCI.EDU)
Wed, 27 Jul 1994 01:55:47 -0700

To: Larry Masinter <masinter@parc.xerox.com>
Subject: Re: Version 5 of URL draft
In-Reply-To: Your message of "Tue, 26 Jul 1994 13:34:55 PDT."
<94Jul26.133457pdt.2760@golden.parc.xerox.com>
Date: Wed, 27 Jul 1994 01:55:47 -0700
From: "Roy T. Fielding" <fielding@simplon.ICS.UCI.EDU>
Message-Id: <9407270155.aa18240@paris.ics.uci.edu>

To save Larry some grief, I'll separate my response to this draft into
non-controversial edits (this message) and possibly controversial edits
(my next message).

> 2. Recommendations
>
> This document describes the syntax for "Uniform Resource Locators"
> (URLs): a compact representation of the location and access method
> for a resource available on the Internet. As there are many
> different method of accessing resources, there are several
> _schemes_ for describing the location of such resources.

That would be better as:
"Just as there are many methods of access to resources, ..."

> The generic syntax provides a framework for new schemes for names
> to be resolved using as yet undefined protocols.

Replace "schemes for names" with "URL schemes".

> The syntax is described in two parts. First, we give the syntax
> rules of a completely specified name; second, we give the rules
> under which parts of the name may be omitted in a well-defined
> context.

Replace "name" with "URL" (twice).

>...
> 2.1.2. Scheme
>
> After the initial "URL:" string, the next component of a URL is the
> name of the scheme used, followed by a colon. Scheme names are made
> of lower case letters "a"--"z", digits, and the character plus
> ("+"), period ("."), and hyphen ("-"). For resiliancy, programs
> interpreting URLs may wish to allow upper case letters as
> equivalent to lower case in scheme names (e.g., allow
> "URL:HTTP://host/" as well as "URL:http://host/").

Replace "resiliancy" with "resiliency".
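
For what it's worth, the lenient handling the draft suggests here could
be sketched roughly as below. This is only an illustration under my own
assumptions: the helper name normalize_scheme is hypothetical, and the
character set is taken straight from the quoted paragraph (letters,
digits, "+", ".", "-"), with upper case folded to lower case.

```python
import re

# Characters allowed in a scheme name per the quoted section 2.1.2,
# plus upper case letters, which a lenient program may accept.
_SCHEME_CHARS = re.compile(r"[A-Za-z0-9+.-]+")


def normalize_scheme(scheme: str) -> str:
    """Validate a scheme name and fold it to lower case.

    Hypothetical helper, not part of the draft.
    """
    if not _SCHEME_CHARS.fullmatch(scheme):
        raise ValueError(f"invalid scheme name: {scheme!r}")
    return scheme.lower()
```

With this, "HTTP" and "http" both come back as "http", while a name
containing a space or a slash is rejected.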

>...
> 3.2.1. FTP Name and Password
>
> A user name and password may be supplied. If no user name or
> password is supplied and one is requested by the FTP server, the
> conventions for "anonymous" FTP are to be used, as followed:

Replace "as followed:" with "as follows:".

>...
> 3.11 PROSPERO
>...
> URL:prospero://<host>:<port>/<hsoname>;<field>=<value>
>
> as specified in section 3.1 The port defaults to 1525. No username
> or password is allowed.

That would be better as:
"where <host> and <port> are as described in Section 3.1. If :<port>
is omitted, the port defaults to 1525. No username or password is
allowed."
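
To make the intended reading concrete, here is a small sketch of the
defaulting rule. The function name split_hostport and the constant name
are my own assumptions; only the default value 1525 comes from the
draft text.

```python
PROSPERO_DEFAULT_PORT = 1525  # default stated in the draft for prospero


def split_hostport(hostport: str, default_port: int = PROSPERO_DEFAULT_PORT):
    """Split "<host>[:<port>]" and apply the default when ":<port>" is omitted.

    Hypothetical helper; it does not validate the host itself.
    """
    host, sep, port = hostport.partition(":")
    if not sep:
        # No ":<port>" given, so the scheme default applies.
        return host, default_port
    # The BNF requires at least one digit after the colon.
    if not (port.isascii() and port.isdigit()):
        raise ValueError(f"invalid port: {port!r}")
    return host, int(port)
```

A bare host yields port 1525, and an explicit ":<port>" overrides it.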

>...
> 4. REGISTRATION OF NEW SCHEMES
>...
>
> URL schemes must have demonstratable utility and operability. ...

Replace "demonstratable" (not a word) with "demonstrable".

>...
> 5. BNF for specific URL schemes
>
> This is a BNF-like description of the Uniform Resource Locator
> syntax, using the conventions of RFC822, except that "|" is used to
> designate alternatives, and brackets [] are used around optional or
> repeated elements. Briefly, literals are quoted with "", optional
> elements are enclosed in [brackets], and elements may be preceded
> with <n>* to designate n or more repetitions of the following
> element; n defaults to 0.
>
> url = "URL:" unlabelled
> unlabelled = httpaddress | ftpaddress | newsaddress |
> nntpaddress | telnetaddress | gopheraddress |
> waisaddress | mailtoaddress | fileaddress |
> prosperoaddress | otheraddress
> otheraddress = scheme ":" schemepart
> scheme = 1*[ lowalpha | digit | "+" | "-" | "." ]
> schemepart = *xchar
>
> login = [ user [ ":" password ] "@" ] hostport
> hostport = host [ ":" port ]
> host = hostname | hostnumber
> hostname = alpha *uchar
> hostnumber = digits "." digits "." digits "." digits
> port = digits
> user = *[ uchar | ";" | "?" | "&" | "=" ]
> password = *[ uchar | ";" | "?" | "&" | "=" ]
>
> ftpaddress = "ftp://" login [ "/" path [ ";type=" ftptype ]]
> path = segment *[ "/" segment ]
> segment = *[ uchar | "?" | ":" | "@" | "&" | "=" ]

Should be:
segment = *[ uchar | ":" | "@" | "&" | "=" ]
(or httpaddress cannot use "path").
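
The problem is easiest to see by counting parses. The sketch below is
only an illustration: the function name is hypothetical, and it looks at
nothing but the position of "?" (all other character rules are ignored).
It lists every way the text after "http://hostport/" can be read as
path [ "?" search ], given that search itself never admits "?".

```python
def http_path_search_readings(rest: str, segment_allows_qmark: bool):
    """Return every (path, search) reading of the text after "http://hostport/".

    Hypothetical helper; only the role of "?" is modeled.
    """
    readings = []
    # Reading 1: no search part, so the whole text must be a path.
    if segment_allows_qmark or "?" not in rest:
        readings.append((rest, None))
    # Other readings: one particular "?" acts as the path/search delimiter.
    for i, ch in enumerate(rest):
        if ch != "?":
            continue
        path, search = rest[:i], rest[i + 1:]
        if "?" in search:
            continue  # search never allows "?" in the draft BNF
        if "?" in path and not segment_allows_qmark:
            continue  # path may hold "?" only under the draft's segment rule
        readings.append((path, search))
    return readings
```

With the draft's segment rule, "index?words" has two readings (a path
"index?words" with no search, or a path "index" with search "words").
With "?" removed from segment, only the second reading remains.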

> ftptype = "A" | "I" | "D" | "a" | "i" | "d"

Add:

fileaddress = "file://" host [ "/" path ]

> httpaddress = "http://" hostport [ "/" path [ "?" search ]]
> search = *[ uchar | ";" | ":" | "@" | "&" | "=" ]
>
> gopheraddress = "gopher://" hostport [ / [ gtype [ selector
> [ "%09" search [ "%09" gopher+_string ] ] ] ] ]
> gtype = xchar
> selector = *xchar
> gopher+_string = *xchar
>
> mailtoaddress = "mailto:" encoded822addr
> encoded822addr = *xchar

Should be:
encoded822addr = 1*xchar

> newsaddress = "news:" grouppart
> grouppart = "*" | group | article
> group = alpha *[ alpha | digit | "-" | "." ]
> article = 1*[ uchar | ";" | "/" | "?" | ":" | "&" | "=" ] "@" host
>
> nntpaddress = "nntp://" hostport "/" group [ "/" digits ]
>
> telnetaddress = "telnet://" login [ "/" ]
>
> waisaddress = waisdatabase | waisindex | waisdoc
> waisdatabase = "wais://" hostport "/" database
> waisindex = "wais://" hostport "/" database "?" search
> waisdoc = "wais://" hostport "/" database "/" wtype "/" wpath
> database = *uchar
> wtype = *uchar
> wpath = *uchar

? Is it valid to have an empty database, wtype, or wpath?

> prosperoaddress= "prospero://" hostport "/" path *[ fieldspec ]
> fieldspec = ";" fieldname "=" fieldvalue
> fieldname = *[ uchar | "?" | ":" | "@" | "&" ]
> fieldvalue = *[ uchar | "?" | ":" | "@" | "&" ]
>
> lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
> "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
> "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
> "y" | "z"
> hialpha = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" |
> "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" |
> "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"
> alpha = lowalpha | hialpha
> digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
> "8" | "9"
> safe = "$" | "-" | "_" | "." | "+"
> extra = "!" | "*" | "'" | "(" | ")" | "," | "="
> national = "{" | "}" | "|" | "\" | "^" | "~" | "[" | "]"
> punctuation = "<" | ">" | """ | "#"
> reserved = ";" | "/" | "?" | ":" | "@" | "&" | "="
> hex = digit | "A" | "B" | "C" | "D" | "E" | "F"
> escape = "%" hex hex
>
> unreserved = alpha | digit | safe | extra | national
> uchar = unreserved | escape
> xchar = unreserved | reserved | escape
> digits = 1*digit

That's it for the non-controversial changes. More to follow.

....Roy Fielding ICS Grad Student, University of California, Irvine USA
(fielding@ics.uci.edu)
<A HREF="http://www.ics.uci.edu/dir/grad/Software/fielding">About Roy</A>