Re: Unresolved URL issues

Keith Moore (moore@cs.utk.edu)
Wed, 16 Mar 1994 23:26:23 -0500

Message-Id: <199403170426.XAA01206@wilma.cs.utk.edu>
From: Keith Moore <moore@cs.utk.edu>
To: bajan@bunyip.com (Alan Emtage)
Subject: Re: Unresolved URL issues
In-Reply-To: Your message of "Mon, 07 Mar 1994 00:07:12 EST."
<9403070507.AA26431@mocha.bunyip.com>
Date: Wed, 16 Mar 1994 23:26:23 -0500

> a) the FTP URL.
>
> Majority(?): The syntax should be "URL:ftp://host/a/b/c/d". Meaning that
> repeated CWD commands "a", "b", "c" should be performed and a RETR done
> on "d". The "/" is a directory boundary and if embedded "/" are to be
> allowed they must be quoted via the same mechanism as whitespace (ie,
> %<number>).
>
> Pro: Will work in most cases
>
> Con: Will fail in (a minority of) cases

Unless someone can demonstrate a (real) case where zero or one CWD commands will
NOT work, I strongly prefer that the FTP URL *not* specify repeated CWD commands.

For all of the servers I know of, at most one CWD command is required to get to
any file accessible from the server. (Even those for which multiple CWDs will
fail!)
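
To make the difference concrete, here is a minimal sketch (not taken from any draft) of the command sequence each reading of the same URL would produce. The function names are hypothetical, and joining the directory part with "/" in the single-CWD case is an assumption that only suits servers using "/" as their separator.

```python
from urllib.parse import urlsplit, unquote


def _segments(url):
    """Split the URL path on "/" first, then decode %XX escapes per segment."""
    if url.startswith("URL:"):          # optional "URL:" prefix, as in the proposal
        url = url[4:]
    path = urlsplit(url).path.lstrip("/")
    return [unquote(s) for s in path.split("/")]


def repeated_cwd_commands(url):
    """Majority reading: one CWD per directory segment, then RETR on the last."""
    *dirs, name = _segments(url)
    return [f"CWD {d}" for d in dirs] + [f"RETR {name}"]


def single_cwd_commands(url):
    """Alternative reading: zero or one CWD for the whole directory part."""
    *dirs, name = _segments(url)
    commands = []
    if dirs:
        # Assumption: "/" is a separator the server itself accepts.
        commands.append("CWD " + "/".join(dirs))
    commands.append(f"RETR {name}")
    return commands


if __name__ == "__main__":
    print(repeated_cwd_commands("URL:ftp://host/a/b/c/d"))
    print(single_cwd_commands("URL:ftp://host/a/b/c/d"))
```

Because the path is split before it is decoded, a quoted "/" (%2F) stays inside its segment in the repeated-CWD reading; in the single-CWD reading it becomes indistinguishable from a separator once joined.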

(If this is not the case for most of the ftp servers out there, we need to
consider updating the MIME message/external-body anon-ftp access-type!)

> Personal comments: I've had some experience with automated ftp retrieval
> through archie and the technique we use is that proposed above.

This is understandable given that archie has to scan the file tree using FTP
commands. But I don't think it's necessary for URLs. An archie-like file scanner
that generated URLs could always do a PWD command after each CWD to determine the
single CWD command required to access the files in that directory. (Might help
eliminate loops caused by symlinks, too, if that's ever a problem.)
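
A rough sketch of that scanner step, assuming an object with cwd() and pwd() methods such as Python's ftplib.FTP. The function name and the "revisited" flag are hypothetical, and treating a repeated PWD result as a possible symlink loop is only a heuristic.

```python
def resolve_directory(ftp, segments):
    """Walk `segments` with one CWD each, issuing PWD after every step.

    Returns (path, revisited): `path` is the server's own name for the final
    directory, usable later as the argument of a single CWD; `revisited` is
    True if any PWD result repeated, which may indicate a symlink loop.
    """
    here = ftp.pwd()
    seen = {here}
    revisited = False
    for segment in segments:
        ftp.cwd(segment)        # step into the next directory
        here = ftp.pwd()        # ask the server where we actually are
        if here in seen:
            revisited = True
        seen.add(here)
    return here, revisited
```

With a real server this might be called as resolve_directory(ftplib.FTP("host"), ["a", "b", "c"]) after logging in; "host" and the segment names are placeholders.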

> I propose that the path be given relative to the login (uid/password) in
> the URL, as opposed to an absolute path.

Agreed.

> The URL still contains enough
> information that it is not "relative" (or "partial") and the context may
> be fully resolved on the host in question. It does however prevent the
> conversion of the URL to another access method. Not a requirement in any
> case, I believe.

Conversion of a URL from one access method to another should be strongly
discouraged. The root of the anon FTP tree may differ (and often does) from
the root used by other servers.

> Also as Larry notes, there is no current provision for typing the object
> being referenced or the transfer mode that has to be used. Since both are
> required for access to the object and since the draft requirements allow
> such typing in cases where the information is necessary for access, I
> propose that we allow the terms "binary", "ascii" and "tenex" to be used
> as transmission specifiers (again, see RFC 959).

You could probably leave out tenex. Binary should be the default, to maximize
backward compatibility with present-day URLs. It would be nice to have text URLs
where the text attribute would be ignored on present-day systems. Is there any
way to do this?
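
As a small illustration, the three proposed specifiers could map onto RFC 959 TYPE commands roughly as follows, with binary as the default when none is given. How the specifier would be attached to the URL is left open here, so the function (a hypothetical name) just takes it as a plain argument; "TYPE L 8" for tenex follows what common FTP clients send.

```python
# Proposed transmission specifiers and the FTP TYPE command each would select.
TYPE_COMMANDS = {
    "binary": "TYPE I",    # image (binary) transfer
    "ascii": "TYPE A",     # text transfer
    "tenex": "TYPE L 8",   # local byte size 8, as sent by common clients
}


def type_command(specifier=None):
    """Return the TYPE command for a specifier; default to binary if absent."""
    if specifier is None:
        return TYPE_COMMANDS["binary"]
    try:
        return TYPE_COMMANDS[specifier.lower()]
    except KeyError:
        raise ValueError(f"unknown transmission specifier: {specifier!r}")
```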

> IMHO, I agree with John Curran. With something like FTP we can't bother
> about every possible implementation under the sun... it's been around too
> long and in many cases is too unstandardized to try to get 100% of all
> implementations.

Agreed. Just try to make it work for the most common cases. (But we should be
able to do better than just UNIX.)

Keith