Re: <URL:...> considered harmful

Owen Rees (rtor@ansa.co.uk)
Tue, 13 Sep 1994 12:43:23 BST

Message-Id: <9409131143.AA15208@plato.ansa.co.uk>
To: uri@bunyip.com
Subject: Re: <URL:...> considered harmful
In-Reply-To: Message from connolly@hal.com of Tue, 13 Sep 1994 00:09:09 -0500.
<9409130509.AA06650@austin2.hal.com>
Date: Tue, 13 Sep 1994 12:43:23 BST
From: Owen Rees <rtor@ansa.co.uk>

"Daniel W. Connolly" <connolly@hal.com> writes:
> In message <16569.779430042@hound.cs.indiana.edu>, Marc VanHeyningen writes:
> >Um... OK, I give up... parentheses are better than angle brackets
> >because why?
>
> Because they're not as overloaded in the same contexts as <>'s. But
> the real point is to put whitespace around the URL because that works
> today and will probably continue to work.

Parentheses being less overloaded is a valid point, but whitespace does not
work in practice, because conventional wisdom holds that it is wrong to leave
whitespace before punctuation in text. Here is an example taken from a message
sent to orb@omg.org on Fri, 20 May 1994:

=====
The release notes are available as an HTML document,

ftp://parcftp.parc.xerox.com/pub/ilu/1.6.4/announce.html.

The full source code, including documentation, is available
as a 3 MB compressed tar file as

ftp://parcftp.parc.xerox.com/pub/ilu/1.6.4/ilu-1.6.4.tar.Z

The 1.6.4 ILU manual is also available separately, either in
Postscript (231 KB) as

ftp://parcftp.parc.xerox.com/pub/ilu/1.6.4/ilu-manual-1.6.4.ps.Z

or via World Wide Web at

ftp://parcftp.parc.xerox.com/pub/ilu/1.6.4/manual-html/manual_toc.html.
=====

The absence of a standard for delimiting URLs and the ingrained habit of
punctuating text correctly lead to a conflict. It is not clear whether the
trailing dots are part of the URLs or punctuation in the text. I received the
message containing the above fragment from a colleague, together with a
comment to the effect that the URL was wrong and that his attempts to guess
the right one had been unsuccessful. (This was not a novice; it was a very
experienced person with more urgent things to do than resolve the ambiguity.)
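
To make the ambiguity concrete, here is a minimal sketch (in Python, which is
my choice for illustration and not anything from the original message) of an
extractor that relies on whitespace alone. The function name and the scheme
pattern are assumptions. It has no way to tell whether the final dot belongs
to the URL or to the sentence, so it simply keeps it.

```python
import re

# First sentence of the quoted announcement, copied from the fragment above.
text = (
    "The release notes are available as an HTML document,\n"
    "\n"
    "ftp://parcftp.parc.xerox.com/pub/ilu/1.6.4/announce.html.\n"
)

def whitespace_delimited_urls(text):
    """Return every whitespace-separated token that starts like a URL.

    Hypothetical helper: the only delimiter it knows about is whitespace.
    """
    return [tok for tok in text.split()
            if re.match(r"[a-z][a-z0-9+.-]*://", tok)]

urls = whitespace_delimited_urls(text)
# The sentence-ending dot comes back as part of the token.
print(urls)
```

Stripping trailing dots afterwards would only be a guess, since a URL may
legitimately end in one.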

Note that the <URL:...> wrapper is "recommended", and that this is one of the
terms used (in capitals) to indicate the significance of requirements in
Internet standards, being weaker than "REQUIRED" but stronger than "OPTIONAL".
This seems to me to be the correct degree of significance for this wrapper,
now that it is a wrapper that contains the URL rather than the "URL:" being
part of the URL.

I feel it is important to have a recommended delimiter. Of the possible
options, the <URL:...> form seems to me to be the least inconsistent with
existing practice - it cannot be a valid "msg-id" or "route-addr" as defined
in RFC-822 (":" is a "special" so MUST NOT occur unquoted). Using "<...>" to
"indicate the presence of a one machine-usable reference" is quite plausible.
Parentheses indicate comments and square brackets indicate domain literals in
RFC-822, so using either of these seems doubtful.

Parentheses are also a poor choice because traditional punctuation demands
".)" when ")" occurs at the end of a sentence, thus defeating the whole
purpose of the delimiter. (Using ")." is now considered acceptable, but I am
sure that there are traditionalists who will "correct" this "mistake" given a
chance.)

The references in draft-ietf-uri-url-07.txt illustrate the value of the
recommendation. The commas (after every "<URL:...>") are clearly not part of
the URL thanks *only* to the ">". The extra space in ref 7 introduces no
ambiguity, nor do the line breaks in several of the refs.
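
As a rough sketch of why the wrapper survives adjacent punctuation, stray
spaces and line breaks, the following Python fragment pulls out everything
between "<URL:" and ">" and discards any whitespace found inside. Treating
whitespace inside the wrapper as insignificant is an assumption made for this
example, as are the function name and the example.org URLs, which are
placeholders and not taken from the draft.

```python
import re

# Everything between "<URL:" and the next ">" is treated as the URL text.
WRAPPED_URL = re.compile(r"<URL:([^>]*)>")

def wrapped_urls(text):
    """Return the URLs found in <URL:...> wrappers.

    Hypothetical helper: whitespace inside the wrapper (spaces, line
    breaks) is assumed to be insignificant and is removed.
    """
    return ["".join(body.split()) for body in WRAPPED_URL.findall(text)]

# Placeholder text: a comma right after one wrapper, an extra space and
# a line break inside the other, and a full stop at the end.
sample = (
    "See <URL:ftp://example.org/pub/a.txt>, and also\n"
    "<URL: http://example.org/some/long/\n"
    "path.html>."
)

print(wrapped_urls(sample))
```

The comma and the full stop fall outside the ">" and so never reach the
result; no guessing about punctuation is involved.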

In summary, my position is: a recommended (rather than required) delimiter is
needed, the current proposal works, and nothing with fewer problems is on
offer; keep the text as it is.

Regards,
Owen Rees <rtor@ansa.co.uk>
Information about ANSA is at <URL:http://www.ansa.co.uk/>.