Date: Thu, 3 Mar 1994 13:33:16 +0500
From: dupuy@smarts.com (Alexander Dupuy)
Message-Id: <9403031833.AA26278@brainy.smarts.com>
To: hallam@alws.cern.ch, moore@cs.utk.edu
Subject: Re: FTP URL mapping
Phil Hallam-Baker writes:
> It is worse than Tim points out! Not only is the mapping from the
> path to directory system different on each machine a sequence of
> CD commands will *NOT* work on certain systems, specifically those
> that do not have a single root to the file structure and those that
> do not require access to intermediate directories in a path to
> use it.
I have seen this restriction on CD commands on Unix systems as well. CMU, in
particular, supports AFS access via anonymous FTP. You could argue that since
this is really AFS, not FTP, an AFS URL would be more appropriate. But I have
FTP and I don't have AFS (like most people, I suspect), so an AFS URL isn't
really much use, while an FTP URL is something I can deal with.
Anyhow, you sometimes get an FTP path like /afs/cs.cmu.edu/jdoe/herfile where
you can only access the file using the full path name, and CD is only allowed
for /afs/cs.cmu.edu/jdoe, not for /afs or /afs/cs.cmu.edu. So I would make
the attempt to retrieve the file in a single operation a required step, and
not just an optional performance enhancement which allows you to reuse FTP
connections.
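
As a rough illustration of making the single-operation retrieve a required
first step, here is a minimal Python sketch. The function name fetch and its
return values are hypothetical; it assumes an object with the retrbinary() and
cwd() methods of Python's ftplib.FTP, and it is not a definitive algorithm.

```python
import ftplib


def fetch(ftp, path, sink):
    """Retrieve `path`, trying the whole path in one RETR first.

    `ftp` is assumed to behave like ftplib.FTP; `sink` receives data blocks.
    Returns "single" or "stepwise" (hypothetical labels) to show which
    strategy succeeded.
    """
    try:
        # Required first step: hand the server the full path untouched.
        ftp.retrbinary("RETR " + path, sink)
        return "single"
    except ftplib.error_perm:
        # Permanent error (e.g. 550): fall back to one CWD per component.
        pass

    *dirs, name = path.split("/")
    if dirs and dirs[0] == "":
        # A leading "/" means "start from the server's root".
        ftp.cwd("/")
        dirs = dirs[1:]
    for component in dirs:
        ftp.cwd(component)
    ftp.retrbinary("RETR " + name, sink)
    return "stepwise"
```

On a server like the CMU one described above, the first RETR is the only
attempt that can succeed, since the fallback would fail at "CWD afs".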
Also worth noting is that the initial / is significant: without it, accessing
afs/cs.cmu.edu/jdoe will not work. This raises another subtle question about
FTP URLs (not just for AFS access via FTP). While most anonymous FTP servers
for Unix start with CWD = / (relative to the FTP root), this is not always the
case, and for non-anonymous FTP access the initial CWD is almost never /.
If I specify an FTP URL ftp://host/a/b, does this reference the file a/b or
/a/b on host? If the former, is it possible to add an extra initial / (i.e.
ftp://host//a/b) to force it to be retrieved as /a/b? In that case, the extra
/ must be sent to the FTP server as a literal /, not interpreted as a generic
directory separator (requiring the single-operation retrieval attempt would
address this, provided the extra / is not deleted as "redundant").
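
To make the question concrete, here is a small Python sketch of one possible
reading: only the single / that separates the host from the path is dropped,
and everything after it is passed to the server unchanged. The function name
ftp_path is hypothetical, and this interpretation is an assumption, not
something the URL spec is known to say.

```python
from urllib.parse import urlsplit


def ftp_path(url):
    """Return the path to send to the FTP server for `url`.

    Assumed reading: strip exactly one "/" (the host/path separator) and
    never collapse a second "/" as "redundant".
    """
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("not an ftp URL: " + url)
    path = parts.path
    if path.startswith("/"):
        path = path[1:]  # drop only the separator after the host
    return path
```

Under this reading ftp://host/a/b names a/b relative to the login directory,
while ftp://host//a/b names /a/b.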
> Why does all this matter? After all can't we just write
> ftp:///u1:[hallam]fred.html?
>
> No we can't. Imagine we have an integrated programming system supporting
> code development on VMS, UNIX and WNT. We have our source file with
> links to the data dictionary or whatever :-
>
> #include <string.h>
> #include <http://mb1.sol.moon/some_code.byron>
> #attach <ftp://md2.sol.moon/security/moonbase/logicals.byron>
>
> Now imagine md2.sol.moon suffers some catastrophe or is upgraded
> to a VMS system. We do not want the code link to change.
While this might be a desirable feature, I'm not sure it's feasible. Mapping
this URL to "CD security/moonbase" for Unix, "CD [security.moonbase]" for VMS,
"CD security\moonbase" for Windows NT, and N other variants for other
operating systems adds too much complexity. The multistep CD procedure is
designed to hide the peculiar directory semantics of each system not just from
end users, but from the URL client code as well. Either change the link to
ftp://md2.sol.moon/u1:[security.moonbase]/logicals.byron, or better yet, use a
URN in the first place.
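
The multistep procedure can be sketched in a few lines of Python. The point is
that the client sends each URL path component on its own and never builds an
OS-specific directory string. The function name stepwise_commands is
hypothetical, and the sketch assumes a relative path with no empty components.

```python
def stepwise_commands(url_path):
    """Translate a relative URL path into one CWD per directory plus a RETR.

    No separator ("/", "\\", "[a.b]") is ever sent to the server, so the
    client needs no knowledge of the server's directory syntax.
    """
    *dirs, name = url_path.split("/")
    return ["CWD " + d for d in dirs] + ["RETR " + name]


# Example with the path from the quoted message:
for command in stepwise_commands("security/moonbase/logicals.byron"):
    print(command)
```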
Keith Moore writes:
> I'd like to see the ftp part of the URL spec defined as a precise,
> step-by-step algorithm for translating an ftp URL to a series of ftp commands.
I agree. There is enough complexity and variety in FTP servers out there
that providing a precise algorithm is probably the only way to ensure
interoperability.
> [The FTP URL spec] should also specify how you know whether a URL is a file
> or a directory (in which case the "filename" is the argument to LIST/NLST,
> maybe even a wildcard pattern).
It seems to me that the ? separator could be used to indicate LIST/NLST
queries if desired, so that ftp://host/some/dir? would reference a directory,
and ftp://host/some/dir?*.txt would reference the "*.txt" files in that
directory.
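
A minimal Python sketch of how a client might recognize this convention
follows. The function name classify and the "file"/"listing" labels are
hypothetical, and which of LIST or NLST to issue for a listing is left open
here, as it is above.

```python
from urllib.parse import urlsplit


def classify(url):
    """Return (kind, path, pattern) for an ftp URL.

    kind is "listing" if the URL contains "?", otherwise "file"; pattern is
    the text after "?" (possibly empty). The split is done by hand because
    urlsplit() cannot tell a bare trailing "?" from no "?" at all.
    """
    base, marker, pattern = url.partition("?")
    path = urlsplit(base).path[1:]  # drop the "/" after the host
    kind = "listing" if marker else "file"
    return (kind, path, pattern)
```

With this sketch, ftp://host/some/dir? yields ("listing", "some/dir", "") and
ftp://host/some/dir?*.txt yields ("listing", "some/dir", "*.txt").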
@alex