Re: what good is a URL without type information?

Mark P. McCahill (mpm@boombox.micro.umn.edu)
Mon, 25 Oct 93 19:57:20 CDT

Date: Mon, 25 Oct 93 19:57:20 CDT
Message-Id: <9310260057.AA28871@boombox.micro.umn.edu>
From: "Mark P. McCahill" <mpm@boombox.micro.umn.edu>
To: jak@violet.berkeley.edu, jcurran@nic.near.net, masinter@parc.xerox.com,
Subject: Re: what good is a URL without type information?

In message <199310251839.LAA00777@violet.berkeley.edu> John A. Kunze writes:
> > Date: Mon, 25 Oct 1993 01:17:20 -0400
> > From: John Curran <jcurran@nic.near.net>
> >
> > Practically speaking, there's little gained and significant functionality
> > lost by not including "type" information in URLs.
>
> The problem is the entire UR* architecture implodes.
>
> This in no way minimizes the importance of "type" discrimination,
> which I think we all agree is a very high priority.
>

Part of the type issue revolves around distinguishing broad classes
of types from each other (for instance files and directories) because the
clients for some protocols must know the difference to resolve the URL.

For instance, in ftp the commands for fetching a directory are different from
the commands for fetching a document. An FTP URL isn't really functional unless
the client knows whether it's fetching a document or a directory. My reading of
the July 14th URL draft is that the URL path is supposed to be converted to
unix-style filenames. I think this means that if you want to have a URL that
points to an ftp directory (rather than a document), then the path ends in a
slash... and in this case the type information to distinguish between files and
directories is implicit in the URL (coded into the path). This only works if
you are dealing with Unix-style file/directory names... and the URL draft
says that everything has to be mapped into unix-style names. So the type
information the client needs is in the FTP URL if you accept that all ftp
selectors can be mapped to unix pathnames.
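
To make the implicit typing concrete, here is a minimal sketch in Python of
how a client might pick its FTP commands from the path alone. It assumes the
trailing-slash reading of the draft described above; the function name
ftp_plan and the host ftp.example.org are made up for illustration, and the
command lists are simplified (no login, TYPE, or data-connection setup).

```python
from urllib.parse import urlsplit

def ftp_plan(url):
    """Guess (kind, commands) for an FTP URL from its path alone.

    Assumption: a path ending in "/" names a directory, anything else
    names a document. This only holds for unix-style pathnames.
    """
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("not an ftp URL: " + url)
    path = parts.path or "/"
    if path.endswith("/"):
        # Directory: change into it and ask for a listing.
        return "directory", ["CWD " + path, "LIST"]
    # Document: retrieve the file itself.
    return "document", ["RETR " + path]
```

With this sketch, ftp_plan("ftp://ftp.example.org/pub/") picks the listing
commands and ftp_plan("ftp://ftp.example.org/pub/README") picks RETR. The
decision depends entirely on the client being able to read the path.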

The gopher URL defined in the July 14 draft works because the gopher client can
distinguish a document from a directory... the type information is (in this
case) explicitly coded into the URL path as a separate item before the gopher
selector. Since gopher clients cannot inspect the selector string to determine
type (gopher selector strings are opaque), enough type information for a client
to function is coded into the URL path before the selector string.
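
A minimal sketch of the explicit case, again in Python. It assumes the type
is a single character placed right after the slash that ends the host part,
using the Gopher item types "0" (text file) and "1" (directory); the exact
layout in the draft may differ. The function name split_gopher_url and the
host gopher.example.org are made up for illustration.

```python
from urllib.parse import urlsplit

# Item types "0" (text file) and "1" (directory) come from the Gopher protocol.
GOPHER_KINDS = {"0": "document", "1": "directory"}

def split_gopher_url(url):
    """Split a gopher URL into (kind, selector) without parsing the selector.

    Assumption: the first character after the host's slash is the type,
    and everything after it is the opaque selector string.
    """
    parts = urlsplit(url)
    if parts.scheme != "gopher":
        raise ValueError("not a gopher URL: " + url)
    rest = parts.path[1:]  # drop the slash that ends the host part
    if not rest:
        # No type given: assume the server's top-level menu.
        return "directory", ""
    gtype, selector = rest[0], rest[1:]
    return GOPHER_KINDS.get(gtype, "other"), selector
```

Here split_gopher_url("gopher://gopher.example.org/1/pub") yields
("directory", "/pub"). The selector is handed back untouched; only the type
character is interpreted.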

So... for URLs to function at all, some basic type information must be coded
into the URL, either explicitly or implicitly... otherwise clients for some of
the protocols can't resolve the URL.

It might be better if the directory/filename typing was explicit for ftp. If
this had been done, you could treat the ftp path as an opaque string and the
URL spec would not have to resort to hand waving about mapping non-unix ftp
implementations into Unix-style pathnames. This would let ftp clients fetch
either the directory or the document from any ftp server... not just ones where
the client understands how to map to/from unix paths.
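
As a rough sketch of what explicit typing would buy, assume the type reaches
the client as its own item alongside an opaque path. Nothing like this is in
the draft; the function name ftp_plan_explicit, the kind labels, and the
VMS-style name in the usage note are all hypothetical, and the command lists
are simplified.

```python
def ftp_plan_explicit(kind, opaque_path):
    """Pick FTP commands from an explicit type; never inspect the path.

    Hypothetical: kind is "directory" or "document" and arrives separately
    from the path, so the path can be in any server's native syntax.
    """
    if kind == "directory":
        return ["CWD " + opaque_path, "LIST"]
    if kind == "document":
        return ["RETR " + opaque_path]
    raise ValueError("unknown kind: " + kind)
```

For example, ftp_plan_explicit("document", "PUB:[DOCS]README.TXT") gives
["RETR PUB:[DOCS]README.TXT"] without the client knowing anything about how
that server names its files.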

Anyway, for clients to function on some of these protocols, some basic
type information has to be in the URL or the URL is non-functional without
some other UR*.

Mark P. McCahill

gopherspace engineer/University of Minnesota
mpm@boombox.micro.umn.edu
612 625 1300 (voice) 612 625 6817 (fax)