URN contents

Peter Svanberg (psv@nada.kth.se)
Tue, 25 May 1993 20:32:31 +0200

Message-Id: <9305251832.AA23074@nada.kth.se>
To: "Chris Weider" <clw@merit.edu>
Subject: URN contents
In-Reply-To: Your message of "Tue, 25 May 1993 10:22:04 EDT."
<9305251422.AA17930@merit.edu>
Date: Tue, 25 May 1993 20:32:31 +0200
From: Peter Svanberg <psv@nada.kth.se>

Quoting: "Chris Weider" <clw@merit.edu>

> One further comment on the URN human readability. I agree completely with
> Karen Sollins' earlier comments about human readability. Although the
> problems may seem rather remote, just think of what can happen with multiple
> languages and character sets. If I'm putting lots of human usable
> relevant information into the URN, then I'm doing a number of things:
> I'm assuming that everyone can read my character set, I'm assuming that
> everyone speaks the same language I do, and I'm assuming that everyone
> has exactly the same ontology for the metainformation that I do. I think that
> in a global internet, none of these assumptions are justifiable and in fact
> are positively harmful. So I would like to start a drive in this group to
> get the URC stuff defined as soon as possible so that we can put the
> metainformation where it belongs. Many of the examples which have been given
> of 'natural' metainformation to include in the URN work only because we all
> speak English, many of us are in the U.S. and thus know what Time Magazine
> or Scientific American is, and because we already have a fairly well
> defined ontology for specifications in print materials (Volume number,
> page number, etc). It's going to be much more difficult to build standard
> ontologies for video resources, interactive databases, etc.

I agree! Actually, the only way to be totally culturally independent
is to reduce the allowed contents of the URN to just digits (and
perhaps dots, dashes and some other punctuation marks)! Everything
else assumes knowledge as described above. The days when ASCII was
some kind of common base will soon be gone. Computer use solely in a
language written in a non-Latin script will not be unusual.

Cons: The "Naming Authorities" must be enumerated or encoded in some
way. Perhaps a slightly increased input error rate compared to
URNs containing English words, when input by users who know
English. Perhaps slightly increased URN length.

Pros: Culturally independent. Perhaps a slightly decreased
input error rate (especially for users *without* knowledge of
English), as there is no confusing O with zero, l with 1,
etc. Impossible (?) to include metainformation.
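
To make the digits-only idea concrete, here is a minimal sketch in
Python. Everything specific in it is an assumption for illustration:
the syntax (groups of digits separated by single dots or dashes), the
function names, and the tiny table of enumerated naming authorities.
Nothing of the kind has been specified.

```python
import re

# Hypothetical syntax: groups of ASCII digits separated by single dots
# or dashes. [0-9] is used instead of \d because \d would also accept
# digits from other scripts.
NUMERIC_URN = re.compile(r"[0-9]+(?:[.-][0-9]+)*")

# Hypothetical registry: naming authorities are enumerated, not named.
AUTHORITIES = {"17": "example authority A", "42": "example authority B"}


def is_numeric_urn(urn: str) -> bool:
    """Return True if the string follows the assumed digits-only syntax."""
    return NUMERIC_URN.fullmatch(urn) is not None


def authority_of(urn: str) -> str:
    """Look up the naming authority encoded in the first digit group."""
    if not is_numeric_urn(urn):
        raise ValueError(f"not a numeric URN: {urn!r}")
    number = re.split(r"[.-]", urn, maxsplit=1)[0]
    return AUTHORITIES[number]  # KeyError if the number is not registered
```

With these assumptions, "42.1993-0525" is accepted and resolves to the
authority registered as 42, while "time-magazine.1993" is rejected
because it contains letters.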

(URLs, on the other hand, must in some way allow *any* character
used in a culture if they are to have any chance of being used
internally in that culture. I'll return to that.)

---
Peter Svanberg, NADA, KTH		    Email: psv@nada.kth.se
Dept of Num Analysis and Comp. Science,
Royal Institute of Technology		    Phone: +46 8 790 71 40
S-100 44  Stockholm, SWEDEN		    Fax:   +46 8 790 09 30