on Sameness of resources

Larry Masinter (masinter@parc.xerox.com)
Fri, 25 Feb 1994 16:06:46 PST

To: uri@bunyip.com
Subject: on Sameness of resources
From: Larry Masinter <masinter@parc.xerox.com>
Message-Id: <94Feb25.160648pst.2732@golden.parc.xerox.com>
Date: Fri, 25 Feb 1994 16:06:46 PST

The draft we have says:

> Sameness: There exists a mechanism for asking whether two resources
> are the same, based on their URNs without going to the naming
> authority. This is a simple equality test on the URNs, which
> requires a canonicalization of the strings.

This was definitely an awkward bit of wording, because of how we
arrived at it.

There are two issues in this bullet, and I'd like to separate them.
I'll do this by proposing two replacement paragraphs:

1) Two URNs are `the same URN' only if they are spelled the same.
In particular, URNs are case sensitive, have no
optional parts that default, and have no alternative encodings.


2) The resource denoted by a URN is `the same resource' no matter
when and where it is denoted; however, what counts as
`the same resource' is decided by `the publisher', which, for
the purposes of this specification, is defined to be `the entity
that assigned the URN'.

Point 1 keeps URN comparison simple. By contrast, http://foo.com/ and
http://foo.com:80/ are `the same' URL, since the port is optional and
defaults to 80, and URLs allow varying levels of escape characters.
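
To make the contrast concrete, here is a minimal Python sketch. The
function names and the `urn:example:...' strings are hypothetical and
only illustrative (no URN syntax is assumed beyond "a string"). The URL
side handles only the default port and an empty path; it does not
attempt to reconcile escape characters, which is part of what makes URL
comparison harder.

```python
from urllib.parse import urlsplit


def same_urn(a: str, b: str) -> bool:
    """Point 1: two URNs are the same only if spelled identically."""
    return a == b


def same_http_url(a: str, b: str) -> bool:
    """Illustrative only: http URLs must be normalized before comparing.

    Assumption: only the default port (80) and an empty path are
    normalized here; escape sequences are left untouched.
    """
    def normalize(url: str):
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
        port = parts.port or 80      # the optional port defaults to 80
        path = parts.path or "/"     # treat an empty path as "/"
        return (parts.scheme.lower(), host, port, path, parts.query)

    return normalize(a) == normalize(b)


# URNs: a plain string comparison, so a change of case is a different URN.
urn_equal = same_urn("urn:example:report-42", "urn:example:report-42")
urn_case_differs = same_urn("urn:example:report-42", "URN:EXAMPLE:REPORT-42")

# URLs: the two spellings from the text compare equal after normalization.
url_equal = same_http_url("http://foo.com/", "http://foo.com:80/")
```

Under Point 1, nothing like `normalize' is ever needed for URNs, which
is the simplification being proposed.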

Point 2 is more controversial, because we still haven't really defined
what `the same resource' might mean, and must defer it to some other
authority. But what authority? There are certainly some security
concerns in allowing anyone to reassign someone else's URNs to their
own objects, or, on the other hand, not allowing any reassignment
ever.

================================================================
I normally don't like to extensively quote previous mail, but many of
these were a while ago.
================================================================
Desjardins@citix400.doc.ca asks:
- sameness of resource names (URNs) will imply "sameness" of the
resources (is that concept defined anywhere?)
- sameness verification involves a string (URN representation)
equality test -- i.e. URN resolution into an URL should not be
required or rather IS not required

raisch@internet.com replies:

> Sameness can only be defined by the publisher. Anything else
> implies control or definition of intellectual content, which is an
> abstraction and cannot be defined.

brennan@hal.com replies:

> Does "asking" mean comparing? It implies I have to go somewhere else
> to ask the question. Maybe this should say "...there exists an algorithm
> for determining whether two URNs refer to the same resource..." Or,
> even better just specify that a string compare is that algorithm.

> Also, I'm not sure what "canonicalization" is referring to in this
> context. Is there something I have to do to the URNs before comparing
> them. If so it should be specified. If not, there's no need to mention
> this.

Karen answered more definitely:

> ... Again, this is a
> difficult problem. It has been discussed by the group at great
> length. Sameness is not and cannot be defined by us. This is
> something (just as with equality) that is determined by the nature of
> the information being evaluated and the particular evaluator (and
> perhaps other factors as well). But is certainly not something that
> can be defined by a standards group at the infrastructure level.
> Consider several examples. What the book industry defines as sameness
> or distinction is totally unrelated to what the database people might
> say or the telephone company in providing telephone information or the
> film industry or... The best thing that we as the information
> infrastructure architects can do is to provide the hooks needed to
> support whatever policies particular information providing and
> managing communities wish to implement. So this functional spec says
> that a naming authority determines sameness or difference by whatever
> criteria it chooses and reflects those decisions by the imposition of
> URNs. These URNs should then be comparable by a string equality test
> that will reflect the decision previously made about the sameness of
> information. So, you are right about the testing and have hit on a
> point that is often difficult to tease apart about where our
> responsibilities do and don't lie.
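
The division of labour described in the quote above could be sketched
as follows. This is only an illustration under stated assumptions: the
`NamingAuthority' class, its `assign' method, the `urn:example:books'
prefix, and the book records are all hypothetical. The authority applies
whatever sameness criterion it likes when assigning URNs; everyone else
just compares strings.

```python
class NamingAuthority:
    """Hypothetical naming authority that decides sameness for itself."""

    def __init__(self, prefix, sameness_key):
        self.prefix = prefix
        # sameness_key encodes the authority's own policy: items that map
        # to the same key are treated as the same resource.
        self.sameness_key = sameness_key
        self._urn_by_key = {}

    def assign(self, item):
        """Return the URN for item, minting a new one if it is new."""
        key = self.sameness_key(item)
        if key not in self._urn_by_key:
            serial = len(self._urn_by_key) + 1
            self._urn_by_key[key] = f"{self.prefix}:{serial}"
        return self._urn_by_key[key]


# Assumed policy: this publisher treats bindings of one edition as the
# same resource, but treats different editions as distinct.
publisher = NamingAuthority(
    "urn:example:books",
    sameness_key=lambda book: (book["title"], book["edition"]),
)

hardcover = publisher.assign({"title": "T", "edition": 1, "binding": "hard"})
paperback = publisher.assign({"title": "T", "edition": 1, "binding": "paper"})
second_ed = publisher.assign({"title": "T", "edition": 2, "binding": "hard"})

# Clients never see the policy; they only test string equality.
same_resource = hardcover == paperback
different_resource = hardcover != second_ed
```

A different community could plug in a different `sameness_key' without
any change to how the resulting URNs are compared.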