Message-Id: <9404271841.AA08525@ulua.hal.com>
To: uri@bunyip.com
Subject: Re: Seperating URC format and URN->URC resolution
In-Reply-To: Your message of "Wed, 27 Apr 1994 16:11:40 GMT."
<CoxEBH.GEE@pandora.sf.ca.us>
Date: Wed, 27 Apr 1994 13:41:16 -0500
From: "Daniel W. Connolly" <connolly@hal.com>
In message <CoxEBH.GEE@pandora.sf.ca.us>, Mitra writes:
>
>Question 1: What does a URC look like, what CAN it contain, and how is
>it formatted.
>
>Question 2: In the case of a URN->URC resolution service, what SHOULD a
>URC contain, how will it get used, how will we find the resolution
>service etc.
I agree that too much focus on question 1 has greatly impeded progress
in this group.
I suggest the following focusing questions:
1. What problems are we trying to solve?
(specific scenarios, definition of scope,
possibly requirements)
2. What abstractions model the problem well?
(what objects, operations, functions, ...)
3. How can we realize those abstractions?
Talking about the format while the abstractions are yet to be defined
is silly, except to present examples of how an abstraction might be used.
Is there an archetypal example that motivates the concept of a URC?
I spent several hours yesterday messing with ideas on just what a URC
is, and I can't get any sense of clarity.
I messed with perl scripts to convert rfc-index.txt to something
between BibTeX and SGML, with HTML-style HREF links. But then I
brought it up under Mosaic, and it took too long to display and I
couldn't effectively navigate it. What I needed was WAIS, but with an
SQL front end:
SELECT number, title, author FROM rfc_index_table
WHERE abstract LIKE '%whois solo%'
ORDER BY date
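A minimal sketch of that kind of query, using Python's standard sqlite3 module in place of a real WAIS/SQL gateway. The table name rfc_index_table (underscores, since hyphens are not legal in unquoted SQL identifiers), the column layout, and the three sample rows are placeholder assumptions for illustration, not data taken from rfc-index.txt.

```python
import sqlite3


def search_index(conn, phrase):
    """Return (number, title, author) for rows whose abstract contains
    the phrase, ordered by date (oldest first)."""
    cur = conn.execute(
        "SELECT number, title, author FROM rfc_index_table "
        "WHERE abstract LIKE ? ORDER BY date",
        (f"%{phrase}%",),
    )
    return cur.fetchall()


# Hypothetical in-memory table standing in for a converted index.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rfc_index_table "
    "(number INTEGER, title TEXT, author TEXT, date TEXT, abstract TEXT)"
)
conn.executemany(
    "INSERT INTO rfc_index_table VALUES (?, ?, ?, ?, ?)",
    [
        # Placeholder rows, not real index entries.
        (2, "Placeholder title B", "Author B", "1994-02-01",
         "notes on whois and solo directory services"),
        (1, "Placeholder title A", "Author A", "1993-06-01",
         "a whois overview"),
        (3, "Placeholder title C", "Author C", "1994-03-01",
         "unrelated topic"),
    ],
)

results = search_index(conn, "whois")
for number, title, author in results:
    print(number, title, author)
```

The phrase is passed as a bound parameter rather than pasted into the SQL string, so a search term containing quote characters cannot change the query.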
>The consensus I hear on Question 1 is that a URC is a collection of:
>zero or one URN's, zero or more URL's, and meta data associated with any
>of these.
But until those terms (URN, URL, meta data) are defined in terms of
how they interact with other symbols, this definition is useless.
What are the essential properties of a URN? If you're looking for a
hierarchical namespace with an associated management structure, there
are lots of them: ISO OIDs, ISSNs, ISBNs... if you want one that's
also a distributed computing application, have a look at DNS, DCE Cell
Directory Service, Prospero, X.500 ... if, however, you want to
grandfather all these, then please enumerate the essential properties
that they all have in common and that you're so keen on. And please...
can we leave out the question of how to send them through internet
mail until after we've got a line on some of these larger issues?
[Specifically... what's wrong with the URL/URI concept as deployed in
the WWW application? Just introduce a few new schemes.]
Anyway... My answers:
1. What problems are we trying to solve?
(a) Link reliability, fault tolerance, and maintenance
"I want to rename a file, but I know that
will screw up all the pointers to that file."
"I made a link to a big compressed postscript
file on an FTP site, and Joe's browser presented
it to him like it was text"
"I made some comments on the Jan 15 version of the document,
and now the document has changed, and so folks that
read my comments and follow links to the original think
I'm nuts"
"Somebody spoofed DNS and now half the time when I
follow links to info.cern.ch, I get forged info"
(b) Link expressiveness
"I'm reading a document that has a link to RFC822
at an Australian FTP archive. I've got a copy of
RFC822 right here on my local machine, but my browser
has no way to recognize that it can use the local copy!"
"The file is available in postscript and text form,
but my comments only apply to the postscript version --
how do I make a link to the postscript version?"
(c) Resource Discovery
"What resources are relevant to my task?"
What sites/databases/tools/applications?
What documents in this database?
What section of this document?
"What resources are related to resource X?"
"Is there a newer version of this document
out there somewhere?"
(d) Electronic Marketing
"How do I make a publication available to my audience?"
"How do I notify them of changes/updates?"
"How do I collect feedback (readership statistics)?"
(e) Scalable Administration
"Somebody gave me a pointer to this document,
but my client can't get at it. Is it gone? Did it
move? Who do I contact to find out?"
"I've got a bunch of stuff to put online. Who
should I notify?"
"I'm moving a bunch of info from one machine to
another... how do I deal with all the references
to this info?"
"I'm willing to index a bunch of data at my site...
but I don't have the CPU resources to service
thousands of queries from everywhere... how can
we distribute the indexes and search engine resources
like the ARCHIE folks did?"
"How can I establish mirror sites for this data?"
(f) Security (integrity, secrecy, access control)
"All this sounds great, but I only want to distribute
stuff within my corporation. I'd like to use the
internet, but I don't want any joe out there to
be able to read this stuff."
"How do I allow feedback/annotation without making my
servers vulnerable to attack?"
"How do I keep folks from forging the data I provide?"
2. What abstractions model the problem well?
See * for my thoughts on this
* A Formalism for Internet Information References
$Id: formalism.html,v 1.1 1994/04/25 17:48:29 connolly Exp $
http://www.hal.com/%7Econnolly/drafts/formalism.html
3. How can we realize those abstractions?
HTTP, WHOIS++, SOLO, Prospero, and X.500 are likely candidates,
but without more clarity on questions 1 and 2, I can't seem
to organize my thoughts on how they apply.
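To make scenario 1(b) concrete (the reader who already has RFC822 on the local machine), here is a minimal sketch of what a client might do with a record shaped like the consensus description quoted earlier: zero or one name, zero or more locations, plus meta data. This is an illustration only, not a proposal for any of the protocols listed above. The class, the function, the "rfc822" name string, and both addresses are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Reference:
    """Hypothetical record: zero or one name, zero or more locations, meta data."""
    name: Optional[str] = None
    locations: list = field(default_factory=list)
    meta: dict = field(default_factory=dict)


def choose_location(ref, local_copies):
    """Prefer a local copy registered under the same name; otherwise
    fall back to the first listed location, or None if there is none."""
    if ref.name is not None and ref.name in local_copies:
        return local_copies[ref.name]
    return ref.locations[0] if ref.locations else None


# Placeholder values: the name string and both addresses are made up.
ref = Reference(
    name="rfc822",
    locations=["ftp://ftp.example.org/rfc/rfc822.txt"],
    meta={"format": "text"},
)
local_copies = {"rfc822": "file:///usr/local/doc/rfc/rfc822.txt"}

print(choose_location(ref, local_copies))
```

The point of the sketch is that the choice depends on the name, not on the location: a link that carries only the FTP address gives the client nothing to match against its local store.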
Daniel W. Connolly "We believe in the interconnectedness of all things"
Software Engineer, Hal Software Systems, OLIAS project (512) 834-9962 x5010
<connolly@hal.com> http://www.hal.com/%7Econnolly/index.html