Re: <'s for URLs

Tim Berners-Lee (timbl@www3.cern.ch)
Mon, 18 Oct 93 17:20:20 +0100

Date: Mon, 18 Oct 93 17:20:20 +0100
From: Tim Berners-Lee <timbl@www3.cern.ch>
Message-Id: <9310181620.AA02269@www3.cern.ch>
To: marca@ncsa.uiuc.edu (Marc Andreessen)
Subject: Re: <'s for URLs

>Date: Mon, 18 Oct 93 09:48:08 -0700
>From: marca@ncsa.uiuc.edu (Marc Andreessen)

>That theory [that WYSIWYG editors will exist] isn't working
> -- it has to be assumed people will be
>creating, handling, manipulating, converting, and tweaking data
>(text, HTML, whatever) by hand. Experience is showing that every day.

That is true.

>This really is unnecessary confusion, easily avoidable by not
>overloading the same characters SGML (and, for that matter, email
>addresses) uses.

I don't feel that, given the limited number of characters,
this is _over_loading. Indeed, as it happens, if a news or
mail reader finds a <news@article> or <mail@address>
inside a plain text document, it stands a reasonable
chance of figuring out that it is in fact something which
can be converted to full URL form too. I don't think
that will cause a problem. In fact I like it.
From the point of view of a "normal user" adept at
spotting things in <>, I don't see a problem either.
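
As a rough illustration of the point above, here is a minimal Python sketch
of how a reader program might pick out things in <> and turn them into full
URL form. The function name, the regular expressions, and the choice of
"mailto" as the default scheme for tokens containing "@" are all assumptions
made for this example; a real news reader would presumably pass "news"
instead, since the text alone cannot tell a message-id from a mail address.

```python
import re

# Anything between "<" and ">" that contains no whitespace or nested brackets.
BRACKETED = re.compile(r"<([^<>\s]+)>")

# A leading scheme such as "ftp:" or "news:".
SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def bracketed_to_urls(text, at_scheme="mailto"):
    """Return URL forms for the <...> tokens found in plain text.

    Tokens that already start with a scheme are kept as they are.
    Tokens containing "@" are assumed (hypothetically) to be mail
    addresses or message-ids and are prefixed with `at_scheme`.
    Everything else, e.g. an SGML tag, is ignored.
    """
    urls = []
    for token in BRACKETED.findall(text):
        if SCHEME.match(token):
            urls.append(token)
        elif "@" in token:
            urls.append(f"{at_scheme}:{token}")
    return urls
```

Given the text "See <ftp://info.cern.ch/pub/www/doc/url7a.txt> or write to
<timbl@www3.cern.ch>, not <soapbox>.", this sketch keeps the ftp URL, turns
the address into a mailto: URL, and skips the tag.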

You obviously want to think about alternatives. I couldn't
think of any which don't have problems. They have to
be non-national-variant characters which are not in use
in normal text (as brackets [of other kinds] are), and
ideally they have to form a matching pair. And they have
to be excluded from URLs.
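
To make the last requirement concrete: if the delimiters never appear
literally inside a URL, the wrapper is unambiguous. The sketch below is only
an illustration of that idea, assuming that a stray "<" or ">" in a URL is
percent-encoded before wrapping. The function name and the host name in the
usage note are hypothetical.

```python
def wrap_url(url):
    """Wrap a URL in <> for use in plain text.

    Any literal "<" or ">" inside the URL is percent-encoded first
    (%3C and %3E), so the closing ">" of the wrapper is always the
    first ">" after the opening "<".
    """
    safe_url = url.replace("<", "%3C").replace(">", "%3E")
    return f"<{safe_url}>"
```

For example, wrap_url("ftp://example.host/a<b>.txt") gives
"<ftp://example.host/a%3Cb%3E.txt>", and a URL without brackets is simply
enclosed unchanged.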

I completely agree that people will be editing HTML by hand
for a long time. But we are talking about a rather special
case here. The input format is plain text, but sufficiently
advanced to have URLs in it. The output is SGML but
sufficiently low-tech to want URLs to be displayed as though
in plain text - like an SGML simulation of a plain
text environment. And we are talking about a person, not
a script, doing the work, with cut and paste. If we change
because of this small but important sector, we make life
more difficult for:

- Those who are using SGML and don't want the document to
contain URLs *in the text*;

- Those who use URLs in mail messages to refer to GIF files
and don't even know what SGML is;

- Those who use SGML with any tool which does the escaping
for them.

Why do you want to write

1 Read the spec, available on
&lt;ftp://info.cern.ch/pub/www/doc/url7a.txt>.

rather than the awfully SGML

2 Read the spec, available on
<ucite>ftp://info.cern.ch/pub/www/doc/url7a.txt</ucite>.

or the even cooler

3 Read the <a
href="ftp://info.cern.ch/pub/www/doc/url7a.txt">spec</a>.

? Do you think [1] is the mainstream use
of URLs? In fact, real people (well, fairly real people) were
happily typing in [3] in otherwise plain text annotations
during the Mosaic annotation trial.
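
For those using a tool which does the escaping for them, the step from the
plain text wrapper to SGML is mechanical. The following Python sketch is one
possible way to do it, not a description of any existing tool: escape_only
produces something in the style of [1] (here both brackets are escaped), and
link_urls produces an anchor in the style of [3], using the URL itself as the
link text. Both function names and the regular expression are assumptions
made for this example.

```python
import html
import re

# A <...> token that starts with a scheme such as "ftp:" and contains
# no whitespace or further angle brackets.
BRACKETED_URL = re.compile(r"<([A-Za-z][A-Za-z0-9+.-]*:[^<>\s]+)>")


def escape_only(text):
    """Escape &, < and > so the plain text shows up literally in HTML."""
    return html.escape(text, quote=False)


def link_urls(text):
    """Escape plain text, turning each <URL> into an <a href="URL"> anchor."""
    parts = []
    last = 0
    for match in BRACKETED_URL.finditer(text):
        # Text before the URL is escaped as ordinary character data.
        parts.append(html.escape(text[last:match.start()], quote=False))
        url = match.group(1)
        parts.append('<a href="{}">{}</a>'.format(
            html.escape(url, quote=True),   # attribute value
            html.escape(url, quote=False),  # visible link text
        ))
        last = match.end()
    parts.append(html.escape(text[last:], quote=False))
    return "".join(parts)
```

Applied to the line "Read the spec, available on
<ftp://info.cern.ch/pub/www/doc/url7a.txt>.", escape_only leaves the URL
visible between &lt; and &gt;, while link_urls replaces the bracketed URL
with an anchor pointing at it.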

One more point. <soapbox>
The WWW designs should not, in the interests
of headlong development, blur the edges between URLs, HTTP
and HTML. I am in favour of keeping the three specs as
independent as possible: history will thank us and it will
make integration of WWW ideas with the rest of the world
easier in both directions. So I am reluctant to put
SGML-specific constraints on URLs: the plain text wrapper
is for [those poor unfortunate] guys who are NOT writing
in HTML. It should be designed from their standpoint, and
posing as one, I think the <> is more natural, safer, and
appropriate than anything else. </soapbox>

>Marc

Tim