Message-Id: <9409122013.AA02543@ulua.hal.com>
To: uri@bunyip.com
Subject: <URL:...> considered harmful
Date: Mon, 12 Sep 1994 15:13:27 -0500
From: "Daniel W. Connolly" <connolly@hal.com>
OK... we're at the "last call" stage, and this <URL:...> silliness is
still in there. I thought it would die of its own weight, but
alas... So I'll go to bat one last time:
Regarding the section:
>APPENDIX: Recommendations for URLs in Context
>
> URIs, including URLs, are intended to be transmitted though
> protocols which provide a context for their interpretation.
I find no basis in fact for the following:
> In some cases, it will be necessary to distinguish URLs from other
> possible data structures in a syntactic structure. In this case, is
> recommended that URLs be preceeded with a prefix consisting of the
> characters "URL:". For example, this prefix may be used to
> distinguish URLs from other kinds of URIs.
Please motivate this assertion with a concrete example or two or
strike it from the text.
Further, if we do need some way to reliably pick URLs out of plain
text, let's use _anything_ but <>'s and URL:
<>'s are already used for mail addresses (e.g. <connolly@hal.com>),
message IDs (e.g. in <12343@hal.com>, Dan writes:...), and SGML
tags (e.g. for more info, see <a href="...">this</a> -- even in plain
text, folks write this these days).
URL: looks like a URL scheme, but it's not. There is lots of software
that searches for URLs by using a regular expression like:
[A-Za-z0-9\.-]+:[^ \t\n]+
I suppose they can use:
[A-Za-z0-9\.-]+://[^ \t\n]+
and get away with it, at least for URLs that use the //hostname syntax.
Each piece of software that basically looks for "scheme:..." will need
a special case to check whether scheme: is URL:, and skip it
if so.
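To make the special case concrete, here is a minimal Python sketch. The two patterns are the ones quoted above; the function name find_urls and the choice to strip a trailing ">" are assumptions made for illustration, not a description of any existing tool.

```python
import re

# The two patterns quoted in the message, unchanged.
SCHEME_ANY = re.compile(r"[A-Za-z0-9\.-]+:[^ \t\n]+")
SCHEME_SLASHES = re.compile(r"[A-Za-z0-9\.-]+://[^ \t\n]+")


def find_urls(text):
    """Return scheme:... matches, with a special case for a 'URL:' prefix.

    Hypothetical helper: it shows the extra check every scanner would need.
    """
    found = []
    for match in SCHEME_ANY.finditer(text):
        candidate = match.group(0)
        # The special case: "URL:" looks like a scheme but is not one.
        if candidate.upper().startswith("URL:"):
            candidate = candidate[4:]
        # Assumption: a trailing ">" is wrapper residue, not part of the URL.
        candidate = candidate.rstrip(">")
        if candidate:
            found.append(candidate)
    return found


sample = "see <URL:ftp://host/path> ok"
print(SCHEME_ANY.findall(sample))      # the naive scan reports "URL:" as the scheme
print(SCHEME_SLASHES.findall(sample))  # skips "URL:" but keeps the trailing ">"
print(find_urls(sample))               # special-cased result
```

On the sample line, the first pattern returns "URL:ftp://host/path>", the second returns "ftp://host/path>", and only the special-cased helper returns "ftp://host/path". Stripping ">" would also damage a URL that legitimately ends in that character, which is one more reason the wrapper is awkward to handle.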
In practice, I find that the most reliable way to communicate a URL in
plain text is to put it on a line by itself, preferably with a little
whitespace on each side, e.g.:
ftp://cnri.reston.va.us/internet-drafts/draft-ietf-uri-url-07.txt
That it is a URL is self-evident, or given by context.
I'm willing to see something like:
URL: ftp://cnri.reston.va.us/internet-drafts/draft-ietf-uri-url-07.txt
or:
(URL: ftp://cnri.reston.va.us/internet-drafts/draft-ietf-uri-url-07.txt )
with space before and after the URL itself. That way, folks can
double-click on the URL and get the right thing, and all sorts of
other happy, practical things.
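A rough sketch of why the surrounding whitespace helps: software can split the text on whitespace and keep only the tokens that are entirely "scheme:rest". The helper name urls_from_tokens is hypothetical, and \S is used in place of [^ \t\n] as a slightly broader notion of whitespace.

```python
import re

# Assumption: a token counts as a URL if the whole token is "scheme:rest".
TOKEN_IS_URL = re.compile(r"[A-Za-z0-9.-]+:\S+")


def urls_from_tokens(text):
    """Split on whitespace and keep the tokens that look like URLs."""
    return [tok for tok in text.split() if TOKEN_IS_URL.fullmatch(tok)]


line = "(URL: ftp://cnri.reston.va.us/internet-drafts/draft-ietf-uri-url-07.txt )"
print(urls_from_tokens(line))
```

With the spaced form, "(URL:" and ")" are separate tokens that fail the check, so only the URL itself is returned and no special case is needed. The same helper returns nothing for "<URL:ftp://host/path>", because the wrapper is glued to the URL.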
"What about long URLs?" you might ask. Well, they don't work in plain
text. They just don't. The receiver has to glue them together by hand.
It's a tedious, error-prone situation with no widely deployed
solution. Empirical arguments to the contrary are welcome.
<URL:...> is invention by committee. It serves no useful purpose. It
is harmful in at least the above ways. I move to replace it with some
mechanism that has proved effective, or to strike it from the document
until such time as some mechanism emerges that works.
Dan