Date: Tue, 25 May 93 10:22:04 -0400
From: "Chris Weider" <clw@merit.edu>
Message-Id: <9305251422.AA17930@merit.edu>
To: sollins@lcs.mit.edu, uri@bunyip.com
Subject: Response to Karen Sollins' comments and some more URN jazz
Hi gang:
As I mentioned, we are going to remove the restriction on 'human-readable'
URNs. This is because we need something human readable right off the bat, and
as Peter mentioned there is a screaming need for these types of identifiers.
But let me run by you all the theoretical framework that I see these
fitting into (especially if you haven't read the 'Vision' paper Peter
and I wrote).
I see all three of the identifiers currently under discussion as 'objects',
if you will. There are certain functions which apply to each one...
Uniform Resource Citations (contain metainformation about the resource,
one or more URNs, and perhaps some URLs). Functions allowed on
Citations are: Extract_URN, Extract_URL, Extract_Fields (this allows
a user to retrieve the Author's name, for example, to then use a resource
location system to find other works about this author), Retrieve_Metainformation
(which uses the URN to retrieve all the metainformation about the resource),
and perhaps also Translate_to_local_language. I'm sure we'll come up with
others. The Uniform Resource Citation is the only place, in my opinion, where we should
allow fragment specifiers, meaning that the smallest conceptual quantum is
the 'resource', whatever that is. Also, I believe that in the long term the
URC is the only place where the metainformation (used by humans to guide
their determination of whether to retrieve the resource) should go.
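As a rough illustration only (not a proposed format), the Citation functions
listed above could be sketched as follows. The class name, the field names,
and every identifier string are hypothetical placeholders; no URC, URN, or
URL syntax is assumed, and Retrieve_Metainformation is left out because it
would need a lookup service that is not defined here.

```python
from dataclasses import dataclass, field


@dataclass
class Citation:
    """Hypothetical in-memory stand-in for a Uniform Resource Citation."""
    fields: dict = field(default_factory=dict)  # metainformation (author, title, ...)
    urns: list = field(default_factory=list)    # one or more URNs
    urls: list = field(default_factory=list)    # perhaps some URLs


def extract_urn(citation):
    """Extract_URN: return the URNs carried by the citation."""
    return list(citation.urns)


def extract_url(citation):
    """Extract_URL: return the URLs carried by the citation (possibly none)."""
    return list(citation.urls)


def extract_fields(citation, *names):
    """Extract_Fields: return only the requested fields that are present."""
    return {n: citation.fields[n] for n in names if n in citation.fields}


# Placeholder data, invented purely for illustration.
example = Citation(
    fields={"author": "A. Author", "title": "An Example Resource"},
    urns=["urn-placeholder-1"],
    urls=["url-placeholder-1", "url-placeholder-2"],
)
author_only = extract_fields(example, "author")
```

A client could then hand the extracted author name to a resource location
system to look for other works, as described above.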
Uniform Resource Names (contain, incidentally, some metainformation about
the resource). Functions allowed on the URN are: Retrieve_Metainformation,
Get_URLs, and perhaps Generate_Citation. The function allowed on a set of
URNs is Remove_Duplicates. A couple of notes here: the metainformation
returned by the URN's Retrieve_Metainformation function is the same as that
returned by the URC version, but slightly less than that returned by the URL
version. Again, I'm sure we'll come up with others...
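A minimal sketch of Remove_Duplicates on a set of URNs, under one loud
assumption: two URNs name the same resource exactly when their strings are
identical. The message does not say how URN equality is decided, so a real
comparison rule could differ. The function name and the sample strings are
hypothetical.

```python
def remove_duplicates(urns):
    """Drop repeated URNs, keeping the first occurrence of each in order.

    Assumption: URNs are compared as exact strings.
    """
    # dict preserves insertion order, so the first occurrence of each URN wins.
    return list(dict.fromkeys(urns))


# Placeholder identifiers, invented for illustration.
merged = remove_duplicates(
    ["urn-placeholder-A", "urn-placeholder-B", "urn-placeholder-A"]
)
```

This is the kind of operation a client might apply after collecting URNs
from several Citations that point to the same resource.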
Uniform Resource Locators (contain no metainformation about the resource).
Functions allowed include: Access_Resource, Retrieve_Metainformation,
Determine_URN. The Retrieve_Metainformation function returns the standard
URN metainformation, along with any metainformation relevant to this
particular instantiation of the resource (such things as encoding schemes,
access control, size, and cost). This information can reasonably be expected
to change depending on the instantiation's location on the Net.
The Determine_URN function may be quite expensive (you can think of it as
an inverse address mapping) but may sometimes be required, especially early
in deployment.
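To make the "inverse address mapping" remark concrete, here is a toy sketch
assuming the only available data is a forward table from each URN to its
known locations. The table, its contents, and the function names are
hypothetical; a real resolution service would not be a single in-memory
dictionary.

```python
# Hypothetical forward table: URN -> known locations (placeholder strings).
URN_TO_URLS = {
    "urn-placeholder-A": ["url-placeholder-1", "url-placeholder-2"],
    "urn-placeholder-B": ["url-placeholder-3"],
}


def get_urls(urn):
    """Get_URLs: the forward direction is a single table lookup."""
    return list(URN_TO_URLS.get(urn, []))


def determine_urn(url):
    """Determine_URN: the inverse direction, with no reverse index available.

    Every entry may have to be examined, which is why this can be costly.
    Returns None when no URN is known for the given URL.
    """
    for urn, urls in URN_TO_URLS.items():
        if url in urls:
            return urn
    return None


found = determine_urn("url-placeholder-3")
```

The forward lookup touches one entry, while the inverse scan grows with the
number of registered locations unless someone maintains a reverse index.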
One further comment on the URN human readability. I agree completely with
Karen Sollins' earlier comments about human readability. Although the
problems may seem rather remote, just think of what can happen with multiple
languages and character sets. If I'm putting lots of human usable
relevant information into the URN, then I'm doing a number of things:
I'm assuming that everyone can read my character set, I'm assuming that
everyone speaks the same language I do, and I'm assuming that everyone
has exactly the same ontology for the metainformation that I do. I think that
in a global internet, none of these assumptions is justifiable, and in fact
they are positively harmful. So I would like to start a drive in this group to
get the URC stuff defined as soon as possible so that we can put the
metainformation where it belongs. Many of the examples which have been given
of 'natural' metainformation to include in the URN work only because we all
speak English, many of us are in the U.S. and thus know what Time Magazine
or Scientific American is, and because we already have a fairly well
defined ontology for specifications in print materials (Volume number,
page number, etc). It's going to be much more difficult to build standard
ontologies for video resources, interactive databases, etc.