four-part harmony? (was "URN single or multiple variants")

John A. Kunze (jak@violet.berkeley.edu)
Tue, 21 Sep 93 00:18:44 -0700

Date: Tue, 21 Sep 93 00:18:44 -0700
From: jak@violet.berkeley.edu (John A. Kunze)
Message-Id: <9309210718.AA26093@violet.berkeley.edu>
To: uri@bunyip.com
Subject: four-part harmony? (was "URN single or multiple variants")

Regarding the thread "URN single or multiple variants", I think
I hear at least four-part harmony. Do I hear five? If you think
of it as four-part cacophony, sing out now (but in key).

For now, I for one would really like to restrict discussion to high-level
requirements. Even if we reach only a provisional consensus, the
lower-level discussions that need to follow won't make sense to me
unless we fix these few data points.

The "parts" I hear to be in concord are below. I've truncated several
trenchant insights to make room for my favorite refrain (yea, that one).
I'm sure you'll be delighted :-) to find much of this stuff addressed
as high-level requirements in Part 4. Points made are:

variant packed in or next to URN
IdAuthority says when and how *its own* URN sprouts variants
versioning handled within variants
variant dimensions (format, encoding, quality, version, etc.)
root URN traceable back from specific variants
all variants traceable forward from root URN

Part 1.
> From: mitra@path.net (Mitra)
> It seems like all these things - versioning, quality (converted etc) are
> things for the next meta level. I see the order of tasks...

Part 2.
> From: Tim Berners-Lee <timbl@www3.cern.ch>
> I agree in general with Mitra (that the same URN should
> refer to any variants, machine conversion, etc) ...

Part 3.
> From: winograd@interval.com (Terry Winograd)
> This requires that each "information object" be given a characterization by
> a "responsible party" as to what variance it is intended to cover. That
> is, each of the above three examples in addition to having a unique
> identifying string would have auxiliary information saying what constitutes
> its "unique identity". ...

Part 4.
> From jak Tue Jul 6 19:04:21 1993
> ...
> Resource Citations for Electronic Discovery and Retrieval
> 26 March 1993

[edited to supply missing context, make a trivial name change (ERI->URN),
and remove some confusing red flags -jak]
>
> 2. Uniform Resource Name (URN)

[This is a conceptual grammar, not actual syntax.]
> URN ::= IdAuthority IdDesignator
> IdDesignator ::= IdString [ VariantSpecifier ] [ Hints ]

["Hints" are about size and usage restrictions which are independent of
and confusing to this discussion, so I've suppressed them for now.
They can be killed off or resurrected later if need be.]

[This is a summary of a few required operations.]
> Operations:
> URNtoURLlist(URN) --> (URL1, URL2, ...)
> URNequal(URN1, URN2) --> { true | false }
> URNroot(URN) --> { URN | URNwithoutVariantSpec }
> ...
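
A minimal Python sketch of the conceptual grammar and two of the operations
listed above. The field names, the tuple standing in for the variant
specifier, and the strict field-by-field comparison in urn_equal are all
assumptions made for illustration; the draft gives no concrete syntax, and
Hints are left out as in the excerpt.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class URN:
    """Hypothetical in-memory form of: URN ::= IdAuthority IdDesignator."""
    id_authority: str
    id_string: str
    # Optional VariantSpecifier, modeled here as (format, encoding, version).
    variant: Optional[Tuple[Optional[str], Optional[str], Optional[int]]] = None


def urn_equal(a: URN, b: URN) -> bool:
    """URNequal, assumed here to mean same authority and same full designator."""
    return a == b


def urn_root(urn: URN) -> URN:
    """URNroot: the same URN with any variant specifier stripped off."""
    return URN(urn.id_authority, urn.id_string)
```

Under these assumptions, a URN that already lacks a variant specifier is its
own root, which matches the "{ URN | URNwithoutVariantSpec }" result above.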
>
> A URN is a pair consisting of an identifying authority (IdAu-
> thority) and an identifier string specific to that authority,
> .... This is a coded pointer to an organization that
> assigns identifiers (e.g., a publisher), a specific *designator*
> for the resource....
>
> Resource designators have the following properties:
>
> (a) two resources have identical designators if and only if the
> IdAuthority deems them the same in content (even if another
> IdAuthority would call them different),
>
> (b) the IdAuthority guarantees not to re-use designators,
>
> (c) while they are in general opaque to everyone but the IdAu-
> thority, the IdAuthority may choose to assign designators
> using a system that conveys extra information about the
> resource (e.g., subject area derivable from call number),
> and
>
> (d) they may contain an optional *variant* specifier.
>
> URNs were conceived to fulfill the need for a small amount of
> per-item information that can be used to derive the location of
> the rest of the per-item information. This helps systems keep
> down the overhead by returning only a modest amount with each
> item in the initial search results. ...
>
> The guarantee never to re-use designators is critical to support
> search and retrieval far into the future. Because an URN itself
> is also unique, it can be used by client systems to help remove
> duplicate items from the pooled results of searching several
> different servers.
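
The duplicate removal described in this paragraph might look roughly like the
following sketch. The (urn, title) record shape and the function name are
hypothetical; the only property relied on is that equal URNs denote the same
item.

```python
from typing import Iterable, List, Tuple

Record = Tuple[str, str]  # (urn, title); a hypothetical search-result shape


def pool_results(result_sets: Iterable[Iterable[Record]]) -> List[Record]:
    """Merge results from several servers, keeping the first record per URN."""
    seen = set()
    pooled = []
    for results in result_sets:
        for urn, title in results:
            if urn not in seen:  # URN uniqueness makes it a safe dedup key
                seen.add(urn)
                pooled.append((urn, title))
    return pooled
```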
>
> A *variant* specifier is an optional subpart of a resource desig-
> nator that preserves all of the above semantics while allowing an
> IdAuthority, as an added service, to encode the designator for-
> mally to convey specific extra information to external systems
> (different from (c) above). By this means, for example, a
> software system comparing two designators could tell when the
> IdAuthority deemed two documents to be the same except for cer-
> tain dimensions along which similar documents often vary. Of
> course, no IdAuthority is obliged to reveal such distinctions
> even if it knows them.
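
As a rough illustration of the comparison described in this paragraph, the
sketch below reports the dimensions along which two designators from the same
IdAuthority differ. Representing a designator as a root string plus a dict of
dimension values is an assumption; the draft does not say how a variant
specifier is encoded.

```python
from typing import Dict, Optional, Set, Tuple

Designator = Tuple[str, Dict[str, str]]  # (root id string, variant dimensions)


def variant_difference(a: Designator, b: Designator) -> Optional[Set[str]]:
    """Return the dimensions on which a and b differ, or None if roots differ."""
    root_a, dims_a = a
    root_b, dims_b = b
    if root_a != root_b:
        return None  # the IdAuthority deems these different resources
    all_dims = dims_a.keys() | dims_b.keys()
    return {d for d in all_dims if dims_a.get(d) != dims_b.get(d)}
```

An empty set means the two designators are identical; a non-empty set names
the dimensions in which the "same" document varies.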
>
> The variant dimensions, each one optional, are: (a) format, (b)
> encoding or archive scheme, and (c) version number (1, 2, 3, etc.
> with -1 meaning the most recent, -2 the next most recent, etc.).
> By using the variant dimensions, an IdAuthority can continue to
> inform others what it considers different resources (as before)
> while assisting users in making up their own minds.
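
The version-number convention above (1, 2, 3, ... with -1 the most recent and
-2 the next most recent) can be sketched as follows. The function name and
the list of available versions are hypothetical, and treating 0 as invalid is
an assumption, since the draft does not mention it.

```python
from typing import List


def resolve_version(requested: int, available: List[int]) -> int:
    """Map a version specifier to a concrete version number.

    Positive values name a version directly; -1 is the most recent,
    -2 the next most recent, and so on.
    """
    ordered = sorted(available)
    if requested > 0:
        if requested not in ordered:
            raise LookupError(f"no version {requested}")
        return requested
    if requested == 0 or -requested > len(ordered):
        raise LookupError(f"no version {requested}")
    # Negative indexing into the ascending list matches the convention.
    return ordered[requested]
```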
>
> An important property of a designator containing a variant
> specifier is that the IdAuthority assigning it implicitly agrees
> to recognize the "root" designator obtained by stripping off the
> variant specifier. Because it is a logically distinct subpart,
> user systems can remove the variant specifier to derive the root
> designator, which can then be used to query the IdAuthority about
> the root document, or about other variants that might be avail-
> able. This is particularly important when an URN designates a
> variant that is unusable by the client (e.g., an unsupported for-
> mat), because it provides a backward pointer through which more
> suitable forms may be discovered.
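
The "backward pointer" idea in this last paragraph could be sketched like
this. The tuple layout, the format-only variant specifier, and the catalog
dict (standing in for a query to the IdAuthority about a root designator) are
all assumptions for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

Root = Tuple[str, str]             # (IdAuthority, IdString)
VariantURN = Tuple[str, str, str]  # root plus a format-only variant specifier


def strip_variant(urn: VariantURN) -> Root:
    """Derive the root designator by removing the variant specifier."""
    return urn[0], urn[1]


def find_usable(
    urn: VariantURN,
    supported: Set[str],
    catalog: Dict[Root, List[str]],
) -> Optional[VariantURN]:
    """Return urn if its format is supported, else another variant of its root."""
    if urn[2] in supported:
        return urn
    # Fall back to the root and look through the variants known for it.
    for fmt in catalog.get(strip_variant(urn), []):
        if fmt in supported:
            return (urn[0], urn[1], fmt)
    return None
```

A client handed a variant it cannot render would call find_usable with its
own list of supported formats and either get a usable variant or learn that
none is known for that root.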

-John