Message-Id: <199410241854.OAA03368@postman.osf.org>
To: John Curran <jcurran@nic.near.net>, mitra@path.net
Subject: Re: Current URN syntax is unacceptable
In-Reply-To: Your message of "Sun, 23 Oct 1994 07:14:22 EDT."
<aacff45502021004e74e@[192.52.71.147]>
Date: Mon, 24 Oct 1994 14:53:38 -0400
From: "Norbert Leser - OSF DCE: (617)621-8715" <nl@osf.org>
John Curran wrote:
>At 1:27 AM 10/23/94, Roy T. Fielding wrote:
>>The following syntax (as seen in <URL:http://www.path.net/mitra/urn.html>)
>>is unacceptable for use as a URN standard.
>>
>> URN:dns:path.net:mitra1234
>>
>>should be
>>
>> dns:/path.net/mitra1234
>>or
>> urn:/dns/path.net/mitra1234
>>
>>so as to preserve a common URI syntax, as previously agreed on this forum.
>
> Is there a common URI syntax? I remember vocal debates about everything
> to the right of the scheme tag being "opaque"... Are you indicated that
> you believe that all URI-class objects will have a common syntax that can
> be applied prior to scheme recognition?
>
> (I don't mind the change, but would rather not have capricious changes to
> meet potentially non-existance consistency belief...)
I'm glad that you have brought up this point again. This is still a
contentious area that is more than fuzzy, but we need a solution in order
to make progress. After having brought up some issues surrounding the previous
"Functional Requirements for URNs" draft, it is my understanding that
1. there is no "common URI syntax" other than the generic format
in the form of
[scheme id tag] : [opaque string]
2. the syntactical definitions specified are for URL schemes such
as http and ftp, for locators and not for names and identifiers;
3. the URN requirements document identifies a few encoding requirements
that (though debatable) don't directly impose a specific syntax,
permitted character sets, etc. It specifically says:
"... there is not yet consensus on what the limit might be."
While I agree with the statement that one should not introduce "capricious
changes", I clearly see the need for separating the syntax and encoding
definitions into things that are appropriate for the body of the URN document
and elements (the opaque strings) that are the business of naming authorities
identified by the scheme id tags.
Although I acknowledge that the updated URN requirements draft now qualifies
the supported existing legacy naming systems with "insofar as they satisfy the
other requirements", there is no good reason for imposing unnecessary
constraints that would prevent a number of naming systems from using URNs.
Therefore, I'd propose to limit the specification in the "Syntax"
section of the URN draft to the following concepts:
- The URN consists of three parts, its URN header, a scheme identifier,
and an opaque string.
- The three parts of the name are delimited by ":" characters.
- In free text, the URN is delimited by "<" and ">" characters.
- The URN header is the ASCII string "urn", case insensitive.
- The scheme identifier is a registered, case insensitive ASCII
string, identifying the top level naming authority.
The syntax for this scheme id should be equivalent to the scheme id
in URLs (see draft-ietf-uri-url-08.txt), which permits:
a...z 0...9 + . -
There is no reason for doing it differently for URNs, where the
current draft says:
a...z 0...9 . / percentencoded
- The opaque string is interpreted by its respective naming authority
only. URN doesn't impose any encoding restrictions other than those
required by the underlying (transport) protocols.
Any registered naming authority specifies the encoding rules for
these opaque strings. This includes the possible insertion of
multiple and hierarchical sub-naming authorities, the ordering
of multiple atomic names (in hierarchical naming systems, for
instance), the delimiters used for sequences of atomic names,
and other naming system specific properties.
For the purpose of simple name comparison only, by instances (such as
client interpreters) that don't have the naming authority's knowledge,
matching is case insensitive and white space is not significant.
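To make the generic rules above concrete, here is a minimal sketch in
Python. It is only my reading of the list, not text from any draft: the
names parse_urn and simple_equal are hypothetical, and I assume that only
the first two ":" characters are generic delimiters, so that any further
":" belongs to the opaque string.

```python
import re

# Scheme id characters as permitted for URL scheme ids: a...z 0...9 + . -
SCHEME_ID = re.compile(r"[a-z0-9+.\-]+", re.IGNORECASE)


def parse_urn(text):
    """Split a URN into (header, scheme_id, opaque) per the generic rules."""
    text = text.strip()
    # In free text the URN is delimited by "<" and ">".
    if text.startswith("<") and text.endswith(">"):
        text = text[1:-1]
    # Assumption: only the first two ":" are generic delimiters; the rest
    # of the string is opaque and left to the naming authority.
    parts = text.split(":", 2)
    if len(parts) != 3:
        raise ValueError("expected header:scheme-id:opaque-string")
    header, scheme_id, opaque = parts
    if header.lower() != "urn":
        raise ValueError("URN header must be 'urn' (case insensitive)")
    if not SCHEME_ID.fullmatch(scheme_id):
        raise ValueError("scheme id contains characters outside a-z 0-9 + . -")
    # Header and scheme id are case insensitive, so fold them; the opaque
    # string is returned untouched.
    return header.lower(), scheme_id.lower(), opaque


def simple_equal(a, b):
    """Compare two names without any naming-authority knowledge:
    case is ignored and white space is not significant."""
    def normalize(s):
        return "".join(s.split()).lower()
    return normalize(a) == normalize(b)
```

With this sketch, parse_urn("<URN:dns:path.net:mitra1234>") yields the
scheme id "dns" and the opaque string "path.net:mitra1234", which the
generic parser does not look into any further.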
Now you can go further and define syntactical rules for the opaque string for
already registered naming authorities, such as "dns" or "isbn". I wouldn't
mind having these in separate subsections of the URN draft; but these
have to be separate and distinct from the generic URN "Syntax" section!
These authorities could define names according to their own rules. A dns authority,
for instance, may define how sub-authorities are encoded, which could lead to
names that look like
URN:dns:path.net:mitra1234
or
URN:dns:/path.net/mitra1234
or whatever appears to be appropriate for this particular naming system.
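As an illustration of keeping such rules out of the generic syntax, the
following Python sketch looks up a per-authority rule by scheme id. All
names here (split_dns_colon, split_dns_slash, AUTHORITY_RULES, interpret)
are hypothetical, and the choice of the colon form for "dns" is only an
assumption for the example; nothing like this is specified in the draft.

```python
def split_dns_colon(opaque):
    """Hypothetical rule for the form path.net:mitra1234."""
    return opaque.split(":")


def split_dns_slash(opaque):
    """Hypothetical rule for the form /path.net/mitra1234."""
    return [part for part in opaque.split("/") if part]


# Registry from scheme id to the rule its naming authority has chosen.
# Picking the colon form for "dns" is an assumption for this example.
AUTHORITY_RULES = {"dns": split_dns_colon}


def interpret(scheme_id, opaque):
    """Apply the authority's rule if one is known; otherwise the string
    stays opaque and is returned as a single element."""
    rule = AUTHORITY_RULES.get(scheme_id.lower())
    if rule is None:
        return [opaque]
    return rule(opaque)
```

Swapping split_dns_slash into the registry changes how "dns" names are
split without touching anything in the generic part.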
The encoding rules for these particular naming authorities may also determine
the character sets used and other limitations and mapping rules.
For "dns", it would be similar to the "Dnsasciis" (including dashes),
currently specified in the URN draft, section "Syntax". It would also allow
for - in whatever form - sub-naming authorities. I'd view a colon separated
element, similar to the scheme id, as intuitive here.
Norbert