Message-Id: <9411181904.AA26430@mocha.bunyip.com>
To: uri@bunyip.com
Subject: AFS URLs
Date: Fri, 18 Nov 1994 14:03:49 -0500 (EST)
From: Mic Bowman <mic+@transarc.com>
------- Forwarded Message
To: Larry Masinter <masinter@parc.xerox.com>
Cc: mic+@transarc.com
Subject: AFS URLs
In-Reply-To: Your message of "Thu, 17 Nov 1994 23:12:55 -0800."
<94Nov17.231300pst.2760@golden.parc.xerox.com>
Date: Fri, 18 Nov 1994 08:57:26 -0500 (EST)
From: Mic Bowman <mic+@transarc.com>
> The main reason for not using 'file:' is that it's really already
> defined for other purposes, and the overloading that you're proposing
> for 'afs' doesn't work well for sites that might want to use local
> files but don't run afs. It also doesn't work well for non-Unix
> systems, does it?
Why is this overloading file? AFS is a file system. AFS users
don't really care if they are using AFS or NFS. It is the local
file system. It just happens that the local file system is much
bigger than the typical NFS. It seems to me that this is precisely
the thing for which "file" was designed.
The translation we proposed happens to work for AFS because the
name space on a remote host is the same as the name space on the
local host. What we proposed with our translation is a retrieval
optimization. (I don't want to shortchange URL translation; it is
much more useful than that, but for the purpose of this discussion it
is just an optimization.)
> It seems like if you mean 'use AFS if you can, else use FTP', then you
> should use a different URL scheme than 'file:'.
No. I mean, use the *locally available file system* if the URL is defined in
the local name space; otherwise, find a machine that exports the file system
name space. It just happens that the locally available file system for
an AFS client is VERY big. It is my responsibility to specify a gateway
that exports the same name space.
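As a rough illustration of the rule above (a sketch only, not part of the
proposal as written): the function name `resolve_file_url`, the
`local_roots` parameter, and the example host and path are all hypothetical,
and the fallback to an `ftp:` URL on the gateway is just the translation
discussed in this thread.

```python
from urllib.parse import urlsplit


def resolve_file_url(url, local_roots=("/afs",)):
    """Return ("local", path) or ("gateway", ftp_url) for a file: URL.

    local_roots is a hypothetical list of name-space prefixes that this
    client has mounted locally (for example "/afs" on an AFS client).
    """
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError("not a file: URL: %r" % url)
    path = parts.path

    # Use the locally available file system when the name falls inside
    # a name space this client already has.
    for root in local_roots:
        prefix = root.rstrip("/") + "/"
        if path == root or path.startswith(prefix):
            return ("local", path)

    # Otherwise fall back to the host named in the URL, which is assumed
    # to be a gateway exporting the same name space (here over FTP).
    if not parts.netloc:
        raise LookupError("name is not local and the URL names no gateway")
    return ("gateway", "ftp://" + parts.netloc + path)


if __name__ == "__main__":
    url = "file://gateway.example.org/afs/example.org/pub/readme"
    print(resolve_file_url(url))                  # client with /afs mounted
    print(resolve_file_url(url, local_roots=()))  # client without AFS
```

The same URL serves both kinds of client; only the retrieval path differs.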
> Now, why do you need:
>
> afs://{gateway}/afs/{cell}/path
>
> when
> afs://{gateway}/{cell}/path
Fine. But why make this specific to afs? The URL for DFS will
look very similar except that the path will probably be /.../{cell}/path.
Likewise, the URL for files in other file system name spaces. This
is really just a file system, why not call it that?
> might work, or even
>
> afs://{cell}/path
>
> where you suggest that sites that don't run afs do keep a
> configuration that deals with:
>
> afs://{cell}/{path}
I disagree with this. It will be too difficult to propagate information
about AFS gateways. In addition, the AFS name space is sufficiently large
that one gateway is not sufficient. You would need to keep a table that
maps CELL to a GATEWAY. Maintaining the table would be very difficult.
Also remember that AFS is just one of many large, distributed file
systems. You'll need this kind of configuration for each.
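To make the maintenance concern concrete, here is a minimal sketch of the
kind of per-client table that approach would imply. The table layout, the
name `gateway_for`, and every cell and gateway name below are hypothetical.

```python
# Hypothetical client-side configuration: one entry per cell, and one
# sub-table per distributed file system the client wants to reach.
GATEWAYS = {
    "afs": {
        "cell-a.example.org": "gw.cell-a.example.org",
        "cell-b.example.org": "gw.cell-b.example.org",
    },
}


def gateway_for(scheme, cell, table=GATEWAYS):
    """Look up the gateway for a cell in the local table.

    Fails for any cell the client was never told about, which is the
    propagation problem: every new cell must reach every client's table.
    """
    try:
        return table[scheme][cell]
    except KeyError:
        raise LookupError(
            "no gateway configured for %s cell %r" % (scheme, cell)
        ) from None


if __name__ == "__main__":
    print(gateway_for("afs", "cell-a.example.org"))
```

Every client needs its own up-to-date copy of this table, whereas a gateway
named in the URL needs no client-side table at all.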
> The revised specification for 'ftp:' also suggests that
> ftp://host/a/b/c/d
>
> treat access as
> ftp host
> cd a
> cd b
> cd c
> retr d
>
> but this apparently doesn't work with many AFS sites that actually
> want
> ftp host
> cd a/b/c
> retr d
I only know of one site where this is the case. If you look at the
configuration of the FTP directory on grand.central.org, you see that
the "afs" directory is actually /afs/grand.central.org/afs so that the
standard FTP definition is precisely the one we export (cd to each
component separately).
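The two access patterns under discussion can be sketched as follows. This
only builds the command sequences quoted above and does not talk to a
server; the name `ftp_commands` and its `per_component` flag are
hypothetical.

```python
from urllib.parse import urlsplit


def ftp_commands(url, per_component=True):
    """Return the command sequence implied by an ftp: URL.

    per_component=True issues one "cd" per path component, as the revised
    'ftp:' specification suggests; False issues a single "cd" with the
    whole directory path.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if not segments:
        raise ValueError("URL names no file: %r" % url)
    *dirs, name = segments

    commands = ["ftp " + parts.netloc]
    if per_component:
        commands += ["cd " + d for d in dirs]
    elif dirs:
        commands.append("cd " + "/".join(dirs))
    commands.append("retr " + name)
    return commands


if __name__ == "__main__":
    for line in ftp_commands("ftp://host/a/b/c/d"):
        print(line)
```

A gateway whose exported tree mirrors the name space component by component
works with the first form, which is the configuration described above.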
Even with the 'chroot' that most ftp servers use to cut off the root
of the tree, it is very easy to export the public portion of the name
space: make local directories and a mount point to the public volumes.
Don't get me wrong, I'm not particularly tied to ftp. It just happens
to be the easiest to configure and use. Would it be possible to put
more gateway information (like protocol) in the URL?
> Thus, a direct translation of "afs:" to "ftp:" might lead to further
> problems. This may be something that has worked with previous clients,
> but the specification has been ambiguous, and there have been problems
> in the inability to talk to some FTP servers.
As I said, whoever maintains the gateway (and each cell can maintain its
own gateway) is responsible for ensuring that the name space is the same
both inside and outside the gateway.
Change gears...
Let me ask a question. Why is it necessary to have separate URL schemes
for each commercial file system? Is there a push for an NFS scheme or
for one that would let me name files in a Netware file system? Granted,
AFS is a very big name space (so is Netware). It is very unlikely that
anyone will put enough smarts in a URL resolution library to handle native
AFS access. DFS is much worse. I don't know about Netware but I assume
that it would be nearly impossible given the protocol base.
We must assume that clients without native AFS access require a gateway.
The question then is where to put the information about the gateway. Your proposal puts
the information in a separate configuration record. I claim that it is
not possible to maintain a configuration record that can efficiently
represent the entire AFS name space.
A separate question... what is the point of a URL? If the intention
is to provide a way to retrieve a file, then putting the gateway in
the URL should not be a problem. If the intention is to specify a
protocol/name space and a name in the name space, then gateway
information should not be in the URL.
For the purpose of defining an AFS-specific URL, I have a few requirements:
1) If the client is running AFS, the URL should be resolved through
the file system.
2) If the client is not running AFS, the URL should be sent to a
host that is an AFS client and is 'close' to the file server that
maintains the file.
3) A non-AFS client must be able to locate an appropriate AFS
gateway with minimal or no pre-configuration of the client.
I'm very interested in getting this issue resolved. I still think that
something like 'file' is most appropriate and I don't see a conflict with
the specification in the August version of the URL spec. If you can propose
something better than 'file' I will do everything I can to help design and
implement a resolver for it.
--Mic
------- End of Forwarded Message