Re: are #anchor-id's allowed in Gopher or FTP URLs?

Daniel W. Connolly (connolly@hal.com)
Fri, 01 Jul 1994 14:52:59 -0500

Message-Id: <9407011953.AA09505@ulua.hal.com>
To: Larry Masinter <masinter@parc.xerox.com>
Subject: Re: are #anchor-id's allowed in Gopher or FTP URLs?
In-Reply-To: Your message of "Fri, 01 Jul 1994 11:12:33 PDT."
<94Jul1.111234pdt.2760@golden.parc.xerox.com>
Date: Fri, 01 Jul 1994 14:52:59 -0500
From: "Daniel W. Connolly" <connolly@hal.com>

In message <94Jul1.111234pdt.2760@golden.parc.xerox.com>, Larry Masinter writes:
>This is actually current practice, isn't it? You can say
>
>ftp://host/dir1/dir2/name.html#FOOT1

This is consistent with the WWW URI spec and with some code I've
seen, but I'm not sure whether it's fully supported by Mosaic, for
example, or widely used. (Some issues that ought to be orthogonal
are, surprisingly, not handled that way by Mosaic.)

>Can HTML documents occur in news articles?

Of course, with MIME. But the # construct can conceivably be
used to select parts of any sort of document: tar files, text
files, etc.

>Maybe the way to resolve this is to make "#" unsafe, take anchor IDs
>out of this document, and relegate anchors to the same place that
>partial URLs went: to the WWW document for which this should be a
>subset.

I highly recommend this. Resolving #fragment identifiers is orthogonal
to network resource retrieval. When a WWW client sees:

scheme://host/path#fragment

it strips off the fragment and resolves...

scheme://host/path

and then finds #fragment inside the resulting document in a manner
that may vary with the data format.
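
As a rough illustration of the two steps above (not taken from any
existing client), here is a minimal Python sketch. It uses the standard
library's urllib.parse.urldefrag to strip the fragment. The names
split_reference, locate_in_html, locate_in_text, LOCATORS and resolve
are hypothetical, the fetch argument stands in for whatever retrieval
code the client has, and the "fragment is a line number" rule for plain
text is purely an assumed convention to show that lookup can vary with
the data format.

```python
from urllib.parse import urldefrag


def split_reference(reference):
    """Strip the #fragment; return (url_to_retrieve, fragment)."""
    url, fragment = urldefrag(reference)
    return url, fragment


def locate_in_html(document, fragment):
    # Naive lookup: character offset of name="fragment", or -1 if absent.
    return document.find('name="%s"' % fragment)


def locate_in_text(document, fragment):
    # Assumed convention (hypothetical): the fragment is a 1-based line number.
    lines = document.splitlines(keepends=True)
    return sum(len(line) for line in lines[: int(fragment) - 1])


# Format-specific fragment lookups; the set of formats is illustrative only.
LOCATORS = {"html": locate_in_html, "text": locate_in_text}


def resolve(reference, fetch, data_format):
    """Retrieve the fragment-free URL, then find the fragment in the result."""
    url, fragment = split_reference(reference)
    document = fetch(url)  # network retrieval never sees the fragment
    offset = LOCATORS[data_format](document, fragment) if fragment else 0
    return url, offset
```

Note that fetch is only ever handed scheme://host/path; everything
after the # stays on the client side.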

So in order to keep the specification of various features in the right
place without hampering compatibility, we should say that # is not
legal in a URL and leave it at that.

In the same vein, I'd like to reserve the scheme names "file:" and
"local-file:". The URL document should not really specify them, other
than to say that they are reserved for use by applications for
access to the local file system.

This is a good trend... while the URL document need not specify all
the features of the WWW addressing architecture, it should be 100%
compatible; i.e. we should be able to look at the WWW addressing
architecture (relative addresses, fragment identifiers, etc.) as "an
application of the URL specification."
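
To make the layering concrete, here is a small Python sketch under
stated assumptions. check_url plays the role of the URL document (where
# is simply not legal), and parse_www_address plays the role of the WWW
addressing layer built on top of it. All of the names are hypothetical,
and the scheme pattern is my assumption about what the common syntax
would allow, not something taken from a draft.

```python
import re

# Assumed scheme syntax: a letter, then letters, digits, "+", "." or "-".
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")

# Scheme names reserved for local file system access, as proposed above.
LOCAL_SCHEMES = {"file", "local-file"}


def check_url(url):
    """URL layer: '#' is not legal here, and a scheme is required."""
    if "#" in url:
        raise ValueError("'#' is not legal in a URL")
    if not _SCHEME.match(url):
        raise ValueError("URL has no scheme")
    return url


def is_local_scheme(url):
    """True if the URL uses one of the reserved local-file scheme names."""
    return url.split(":", 1)[0].lower() in LOCAL_SCHEMES


def parse_www_address(address):
    """WWW addressing layer: split off the fragment, then defer to the URL layer."""
    url, _, fragment = address.partition("#")
    return check_url(url), fragment
```

With this split, parse_www_address("scheme://host/path#fragment")
yields a URL the lower layer accepts plus a separate fragment, while
passing the same string straight to check_url is rejected.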

So now all we need is a BNF for the "common syntax." A while ago I had
a lex/yacc description of something close to it, along with a test
suite for pathological cases (see http://www.hal.com/%7Econnolly/url_test/).
Maybe I'll update it for comparison with the recent URL drafts.

Dan