Message-Id: <199409282106.RAA14270@thud.cs.utk.edu>
From: Keith Moore <moore@cs.utk.edu>
To: rdaniel@acl.lanl.gov
Subject: Re: URCs and URL resolution
In-Reply-To: Your message of "Wed, 28 Sep 1994 12:19:00 MDT."
    <199409281819.MAA29478@idaknow.acl.lanl.gov>
Date: Wed, 28 Sep 1994 17:06:12 -0400
>
> > We have envisioned URC-like information being available
> > in 2 ways.
> >
> > One would be a full URC (or URC-like thing) with lots of
> > handy info which you might get via a whois++ search or
> > related lookup.
> >
> > The other--for URL resolution would use a very lightweight
> > RPC or RPC-like mechanism.
>
> I guess I don't see these as separate. URLs will be part of
> the URC. It will be possible (if the URC server administrator
> allows it) to launch a general query against the contents of
> a URC server. The URN->URL resolution will be the most common
> query. The service, the servers, and the browsers should be
> optimized for that common case.

It turns out that there are several different kinds of information that you
might want in a URC-thingy. Some of the information is specific to the
overall resource, some of it is specific to an instance of a resource, and
some is specific only to a location of a resource.

Different kinds of descriptive information for any single resource may have
different sources, different lifetimes, and different access patterns.

As an example, take the following attributes of a resource (a flat file,
in this case):
    author
    title
    version
    abstract
    date
    md5
    content-type
    URN
    locations (i.e. URLs)
    authentication-information

Some of these would change more frequently than others. While author, title,
and abstract might never change or change only rarely, date and md5 could be
expected to change each time a new version of the file were issued. The list
of locations would change even more frequently, as file servers added the
file to, or deleted it from, their collections. The location fields would
ideally be updated by the file servers themselves (with appropriate
authentication to prevent spoofing), but updates to the other fields would be
more tightly restricted.
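
One way to picture this split is the rough Python sketch below. It is only an
illustration: the class names (ResourceInfo, InstanceInfo, URCRecord) are made
up for this example, and the placement of version, content-type, URN, and
authentication-information into a particular group is an assumption, since
the text above only places author, title, abstract, date, md5, and locations.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResourceInfo:
    """Describes the resource as a whole; rarely or never changes."""
    urn: str        # assumed to be resource-level
    author: str
    title: str
    abstract: str


@dataclass
class InstanceInfo:
    """Describes one issued version; replaced whenever the file is reissued."""
    version: str
    date: str
    md5: str
    content_type: str                 # assumed to be per-version
    authentication_information: str   # assumed to be per-version


@dataclass
class URCRecord:
    """Hypothetical container tying the three groups together."""
    resource: ResourceInfo
    instance: InstanceInfo
    # The most volatile part: file servers add and remove entries here.
    locations: List[str] = field(default_factory=list)

    def add_location(self, url: str) -> None:
        """Record that a server now holds a copy (no duplicates)."""
        if url not in self.locations:
            self.locations.append(url)

    def drop_location(self, url: str) -> None:
        """Record that a server no longer holds a copy."""
        if url in self.locations:
            self.locations.remove(url)
```

In this layout a file server can add or drop a location without touching the
resource-level or version-level fields at all.
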
Also, for the md5 and authentication information to have any value, they must
match the files obtained from the listed locations. So every time a file is
updated, several fields have to change together, and this updating must work
with file and database servers distributed all over the world, and also be
tolerant of network failures.
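
As a rough sketch of what "matching" means on the client side, the Python
below tries each listed location in turn, skips servers that cannot be
reached, and accepts only a copy whose MD5 digest equals the one in the
record. The function names and the fetch callable are hypothetical, invented
for this example; a real client would plug in its own transfer code.

```python
import hashlib
from typing import Callable, Iterable, Optional, Tuple


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a lowercase hex string."""
    return hashlib.md5(data).hexdigest()


def fetch_verified(
    locations: Iterable[str],
    expected_md5: str,
    fetch: Callable[[str], bytes],
) -> Optional[Tuple[str, bytes]]:
    """Return (url, data) for the first copy matching expected_md5.

    fetch is a caller-supplied function (an assumption of this sketch)
    that returns the file contents or raises OSError on network failure.
    """
    for url in locations:
        try:
            data = fetch(url)
        except OSError:
            continue  # unreachable server: try the next location
        if md5_hex(data) == expected_md5.lower():
            return url, data
        # Stale or corrupted copy: the record and this server disagree.
    return None
```

A location that still serves the previous version fails the check, which is
why the md5 field and the location list have to be updated consistently.
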
Also, in organizing the descriptive information for a resource, it pays to
take into account where the information comes from. In the print world, some
of the information is effectively generated by publishers (who assign title,
author, ISBN, and arrange for an LC number), while other "cataloging
information" is generated independently by librarians. The reason?
Librarians are the ones who have an interest in making sure that people can
locate the resources that they need. (The main concern of publishers is to
sell books, which isn't quite the same thing...)

So you want to design the URC in such a way that each field gets maintained
by the group most interested in getting the information "right".
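
A minimal sketch of that idea in Python follows. The role names and the
field-to-role table are assumptions made for illustration (in particular,
giving the abstract to a "cataloger" role is a guess, not something settled
above); the point is only that each field has exactly one kind of maintainer.

```python
from typing import Dict

# Hypothetical mapping from field name to the role allowed to update it.
FIELD_MAINTAINER: Dict[str, str] = {
    "author": "publisher",
    "title": "publisher",
    "version": "publisher",
    "date": "publisher",
    "md5": "publisher",
    "abstract": "cataloger",      # assumed: descriptive cataloging data
    "locations": "file-server",   # servers report their own copies
}


def update_field(record: dict, name: str, value: object, role: str) -> None:
    """Set record[name] to value if role maintains that field.

    Raises KeyError for an unknown field and PermissionError when the
    caller's role is not the designated maintainer.
    """
    owner = FIELD_MAINTAINER[name]  # KeyError if the field is unknown
    if role != owner:
        raise PermissionError(f"{role} may not update {name}")
    record[name] = value
```

For example, update_field(rec, "locations", [...], "file-server") succeeds,
while the same call with role "publisher" raises PermissionError.
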
While you *can* treat the set of descriptive information about a resource as
one abstract entity called a URC, when you get down to "optimizing for the
common cases", there is a real need to treat some of the information very
differently from the rest. This is true not only of the query protocol, but
also of how the information gets maintained and how widely it is distributed.
--
Keith Moore
NETLIB development group
Computer Science Department / University of Tennessee at Knoxville
107 Ayres Hall / Knoxville TN 37996-1301
Let's stamp out license managers in our lifetime.