Re: URN functionality from URLs

Alexander Dupuy (dupuy@smarts.com)
Tue, 14 Dec 1993 13:12:12 +0500

Date: Tue, 14 Dec 1993 13:12:12 +0500
From: dupuy@smarts.com (Alexander Dupuy)
Message-Id: <9312141812.AA04704@brainy.smarts.com>
To: winograd@interval.com, timbl@www0.cern.ch
Subject: Re: URN functionality from URLs

> 2. "Within the context of a host". There is a problem here, which
> I mentioned on 3 Nov on the URL list. You have a nice
> hierarchy for the hostname, and a nice hierarchy for the
> filename, but the two are joined flatly. Just like current
> URNs. I proposed that the
> division between the hostname part and the filename part
> (in your terms) be indistinguishable from the other
> hierarchical divisions. This would make a cleaner name space
> and scale better, as hosts could coalesce and split
> to cover sub-domains. (I proposed using MX-like records in DNS
> to actually find the join quickly)

> One of the reasons I wanted to move the Information Exchange
> (IX?) records away from being hosts is that conventionally people
> delete hostnames (CN) whenever they like, whereas IX record
> names could start off with the convention that they don't
> delete, they just get moved to different addresses.

> > This can be easily handled by assigning pseudo-domain names
> > to any naming authority that does not want to use a real host name. These
> > "virtual host" names would be assigned by whatever mechanisms are used to
> > assign network host names by the IANA or its successors. For example we
> > might have "ISBN.vir" which would appear in URLs such as:
> >
> > <URL:STANF://ISBN.vir/0-201-11297-3>
> >
> > The .vir would key the client immediately to use one of the indirect access
> > modes described above since it obviously can't directly access the naming
> > authority host.
>
> Why not put their hostnames into DNS under the .vir
> domain and go straight there?

I've only been following the URI discussion for a little while, but it does
seem to me that having URNs simply be URLs that specify a directory service
which returns an access-based URL is a very nice solution. (But in this case,
why bother to surround it with <URL: >? The simpler STANF://some/stable/name
would seem to be enough.)

I also think that Tim's suggestion of using a new DNS record type (I like IX)
is very elegant, and eliminates the need for kludgy names like .vir (shades of
.uucp?). In order to make the host and file hierarchies seamless, you need to
reverse the components when doing the DNS lookup, which is a pain, but pretty
simple compared to the stuff being done in the tpc.int domain (subdomains of
tpc.int are phone numbers reversed digit for digit, with each digit a separate
component, e.g. 1-800-555-1212 becomes 2.1.2.1.5.5.5.0.0.8.1.tpc.int).
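
For anyone who wants to see the tpc.int-style reversal spelled out, here is a
minimal sketch in Python. The function name phone_to_tpc_domain is my own
invention for illustration; only the digit-reversal rule and the example
number come from the description above.

```python
def phone_to_tpc_domain(phone: str, suffix: str = "tpc.int") -> str:
    """Turn a phone number into a tpc.int-style name (illustrative helper)."""
    # Keep only the digits, dropping dashes, spaces and other punctuation.
    digits = [ch for ch in phone if ch.isdigit()]
    if not digits:
        raise ValueError("phone number contains no digits")
    # Each digit becomes its own DNS component, least significant first.
    return ".".join(reversed(digits)) + "." + suffix


print(phone_to_tpc_domain("1-800-555-1212"))
# 2.1.2.1.5.5.5.0.0.8.1.tpc.int
```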

So a URN/STANF URL might look like this:

STANF://mil/ddn/nic/rfc/1234

which would be turned into a DNS IX query for 1234.rfc.nic.ddn.mil, which
might match an IX wildcard record for *.rfc.nic.ddn.mil that points you at a
set of servers that support retrieval of RFCs.
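
The component reversal is mechanical enough to sketch in a few lines of
Python. This is only an illustration under the assumptions of this message:
the STANF scheme is the proposal under discussion, and the helper name
stanf_to_dns_name is hypothetical.

```python
def stanf_to_dns_name(url: str, scheme: str = "STANF") -> str:
    """Reverse the path components of a STANF URL into a DNS query name."""
    prefix = scheme + "://"
    if not url.upper().startswith(prefix.upper()):
        raise ValueError("not a %s URL: %r" % (scheme, url))
    # Split the hierarchical part on "/" and ignore empty components.
    components = [c for c in url[len(prefix):].split("/") if c]
    if not components:
        raise ValueError("URL has no name components")
    # DNS names are written most-specific first, so reverse the order.
    return ".".join(reversed(components))


print(stanf_to_dns_name("STANF://mil/ddn/nic/rfc/1234"))
# 1234.rfc.nic.ddn.mil
```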

A nice feature of DNS that can be exploited is the concept of a DNS search
path. When resolving 1234.rfc.nic.ddn.mil, my DNS resolver will first try
1234.rfc.nic.ddn.mil.smarts.com. and then 1234.rfc.nic.ddn.mil. If I have a
local cache of RFCs that I would like to use, I can set up a wildcard IX
record for *.rfc.nic.ddn.mil.smarts.com, and all clients in the smarts.com
domain will use it (the wildcard IX would also reference the standard servers,
in case the local ones were not working).
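
To make the search-path trick concrete, here is a rough Python sketch. It
models only the behaviour described above (local domain first, then the bare
name); real resolvers apply further rules about when to consult the search
list. The IX record type does not exist yet, so the records table is a plain
dictionary, and cache.example is a made-up host name standing in for a local
RFC cache.

```python
def search_candidates(name, search_list):
    """List the fully qualified names to try, in order."""
    if name.endswith("."):
        return [name]  # already fully qualified, no search path applied
    candidates = ["%s.%s." % (name, d.rstrip(".")) for d in search_list]
    candidates.append(name + ".")
    return candidates


def matches_wildcard(name, pattern):
    """Simplified wildcard match: '*.' stands for one or more labels."""
    name = name.rstrip(".").lower()
    pattern = pattern.rstrip(".").lower()
    if pattern.startswith("*."):
        tail = pattern[1:]  # keeps the leading dot, e.g. ".rfc.nic.ddn.mil"
        return name.endswith(tail) and len(name) > len(tail)
    return name == pattern


def resolve_ix(name, search_list, records):
    """Return (matched name, target) for the first candidate with a record."""
    for candidate in search_candidates(name, search_list):
        for pattern, target in records.items():
            if matches_wildcard(candidate, pattern):
                return candidate, target
    return None


records = {
    "*.rfc.nic.ddn.mil.smarts.com": "cache.example",  # hypothetical local cache
    "*.rfc.nic.ddn.mil": "nic.ddn.mil",               # standard servers
}
print(resolve_ix("1234.rfc.nic.ddn.mil", ["smarts.com"], records))
# ('1234.rfc.nic.ddn.mil.smarts.com.', 'cache.example')
```

Clients outside smarts.com have no search-path entry for it, so they fall
through to the *.rfc.nic.ddn.mil record.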

A good question is what, exactly, will these IX records contain? Since MX
records contain a hostname and priority (for search ordering), that might be a
good place to start. However, once you have a hostname (and thus, presumably,
an IP address), what do you do with it? Tim would probably suggest that you
contact the HTTP server at that address, and send it a request for the STANF
URL that you are trying to resolve. This seems pretty reasonable to me, but it
might be nice to have a bit more flexibility, so that the DNS response itself
could direct the client to retrieve the information using other protocols.
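
As a sketch of the MX-like starting point, assuming an IX record carries just
a preference and a hostname (lower preference tried first, as with MX), a
client might order its attempts like this. The IXRecord layout and the host
cache.example are assumptions for illustration, not a defined format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class IXRecord:
    # Hypothetical layout modelled on MX: a preference and a hostname.
    preference: int
    host: str


def resolution_attempts(records: List[IXRecord],
                        stanf_url: str) -> List[Tuple[str, str]]:
    """Pair each server with the URL to ask it for, best preference first."""
    ordered = sorted(records, key=lambda r: r.preference)
    return [(r.host, stanf_url) for r in ordered]


records = [IXRecord(20, "nic.ddn.mil"), IXRecord(10, "cache.example")]
for host, url in resolution_attempts(records, "STANF://mil/ddn/nic/rfc/1234"):
    print("ask the HTTP server on %s for %s" % (host, url))
```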

One way might be to have protocol-specific URLs in the DNS records, but this
would probably overwhelm the DNS servers with records for thousands and
thousands of individual documents. A better approach might be to have two
strings, one specifying a URL prefix and another specifying the initial
substring of the STANF URL to strip off. So the example URL
STANF://mil/ddn/nic/rfc/1234 might match a wildcard IX record for
*.rfc.nic.ddn.mil which provides a prefix string "FTP://nic.ddn.mil/" and a
strip match of "STANF://mil/ddn/nic/", which together convert
STANF://mil/ddn/nic/rfc/1234 into FTP://nic.ddn.mil/rfc/1234.
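
The strip-and-prefix rule amounts to a single string substitution. A minimal
Python sketch, with the function name apply_ix_rule made up for illustration
and the two strings taken from the example above:

```python
from typing import Optional


def apply_ix_rule(url: str, strip: str, prefix: str) -> Optional[str]:
    """Replace the leading 'strip' substring of url with 'prefix'."""
    if not url.startswith(strip):
        return None  # the rule does not apply to this URL
    return prefix + url[len(strip):]


print(apply_ix_rule("STANF://mil/ddn/nic/rfc/1234",
                    strip="STANF://mil/ddn/nic/",
                    prefix="FTP://nic.ddn.mil/"))
# FTP://nic.ddn.mil/rfc/1234
```

Because the rule only looks at a fixed leading substring, one wildcard record
can cover every document below that point in the hierarchy.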

This "editing rules" info might need to be a little bit more sophisticated,
but I would avoid anything too complex, as it will make maintaining the DNS
records more difficult (in particular, I would avoid anything that uses
regular expression syntax).

Finally, perhaps the prefix STANF: is a bit obscure, and renaming it IX: to
match the DNS record type might be clearer.

@alex