To: uri@bunyip.com
Subject: diffs for latest revision
From: Larry Masinter <masinter@parc.xerox.com>
Message-Id: <94Oct17.172812pdt.2760@golden.parc.xerox.com>
Date: Mon, 17 Oct 1994 17:27:59 PDT

This may seem like the never-ending draft, but... this is really *it*.
These are diffs from what I sent out a week ago. You will see, I hope,
that all revisions are of an editorial nature.
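
(Illustration only, not part of the draft.) The largest set of wording
changes in the diffs below is in section 2.2, on which octets must be
encoded as "%" plus two hexadecimal digits. A minimal sketch of that rule,
as I read the revised text, might look like the following. The function
names encode_octets and decode_octets and the keep parameter are
hypothetical and appear nowhere in the draft. The sketch assumes the caller
knows which reserved characters are being used for their reserved purpose
in the scheme at hand.

```python
import string

# Characters the revised section 2.2 allows unencoded in any URL:
# alphanumerics plus the special characters "$-_.+!*'(),".
ALWAYS_SAFE = set(string.ascii_letters + string.digits + "$-_.+!*'(),")

# Characters that a scheme may reserve for a special meaning.
RESERVABLE = set(";/?:@=&")


def encode_octets(octets: bytes, keep: str = "") -> str:
    """Hypothetical helper: encode a sequence of octets for one URL part.

    `keep` lists reserved characters that are being used for their
    reserved purpose and therefore stay unencoded (an assumption the
    caller must supply; it depends on the scheme).
    """
    allowed_reserved = set(keep)
    if not allowed_reserved <= RESERVABLE:
        raise ValueError("keep may only contain ; / ? : @ = &")
    out = []
    for octet in octets:
        ch = chr(octet)
        if ch in ALWAYS_SAFE or ch in allowed_reserved:
            out.append(ch)
        else:
            # Control characters, space, 80-FF, unsafe characters and
            # reserved characters not in `keep` all become %XX.
            out.append("%{:02X}".format(octet))
    return "".join(out)


def decode_octets(text: str) -> bytes:
    """Hypothetical helper: reverse encode_octets.

    Accepts both "ABCDEF" and "abcdef" in the hexadecimal digits.
    """
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "%":
            pair = text[i + 1:i + 3]
            if len(pair) != 2 or any(c not in string.hexdigits for c in pair):
                raise ValueError("malformed % encoding at position %d" % i)
            out.append(int(pair, 16))
            i += 3
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)
```

For example, encode_octets(b"/pub/file~1", keep="/") leaves the "/"
separators alone and encodes the unsafe "~", while encode_octets(b"/")
with no keep argument encodes the slash itself. That difference is the
point of the revised text: encoding a reserved character may change the
semantics of the URL.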
================================================================
*** url-07x.txt Mon Oct 10 01:08:19 1994
--- url-07y.txt Mon Oct 17 17:24:02 1994
***************
*** 1,8 ****
Uniform Resource Locators T. Berners-Lee
! draft-ietf-uri-url-XX.txt L. Masinter
! Expires April 9, 1995 M. McCahill
Editors
! October 9, 1994
Uniform Resource Locators (URL)
--- 1,8 ----
Uniform Resource Locators T. Berners-Lee
! draft-ietf-uri-url-09.txt L. Masinter
! Expires April 17, 1995 M. McCahill
Editors
! October 17, 1994
Uniform Resource Locators (URL)
***************
*** 25,32 ****
Shadow Directories on ds.internic.net, nic.nordu.net,
ftp.isi.edu, or munnari.oz.au.
- This Internet Draft expires April 9, 1995.
-
0. Abstract
This document specifies a Uniform Resource Locator (URL), the
--- 25,30 ----
***************
*** 35,43 ****
1. Introduction
! This document describes the syntax for a compact string
! representation for a resource available via the Internet. These
! strings are called "Uniform Resource Locators" (URLs).
The specification is derived from concepts introduced by the
World-Wide Web global information initiative, whose use of such
--- 33,41 ----
1. Introduction
! This document describes the syntax and semantics for a compact
! string representation for a resource available via the Internet.
! These strings are called "Uniform Resource Locators" (URLs).
The specification is derived from concepts introduced by the
World-Wide Web global information initiative, whose use of such
***************
*** 52,58 ****
archived at <URL:http://www.acl.lanl.gov/URI/archive/uri-archive.
index.html>
! 2. Definitions
Just as there are many different methods of access to resources,
there are several _schemes_ for describing the location of such
--- 50,56 ----
archived at <URL:http://www.acl.lanl.gov/URI/archive/uri-archive.
index.html>
! 2. General URL Syntax
Just as there are many different methods of access to resources,
there are several _schemes_ for describing the location of such
***************
*** 69,75 ****
`update', `replace', `find attributes'. In general, only the
`access' method needs to be specified for any URL scheme.
! 2.1. URL SYNTAX
A full BNF description of the URL syntax is given in Section 5.
--- 67,73 ----
`update', `replace', `find attributes'. In general, only the
`access' method needs to be specified for any URL scheme.
! 2.1. The main parts of URLs
A full BNF description of the URL syntax is given in Section 5.
***************
*** 87,93 ****
interpreting URLs should treat upper case letters as equivalent to
lower case in scheme names (e.g., allow "HTTP" as well as "http").
! 2.2. Encoding of reserved and unsafe characters
URLs are sequences of _characters_, i.e., letters, digits, and
special characters. A URLs may be _represented_ in a variety of
--- 85,91 ----
interpreting URLs should treat upper case letters as equivalent to
lower case in scheme names (e.g., allow "HTTP" as well as "http").
! 2.2. URL Character Encoding Issues
URLs are sequences of _characters_, i.e., letters, digits, and
special characters. A URLs may be _represented_ in a variety of
***************
*** 95,111 ****
character set. The interpretation of a URL depends only on
the identity of the characters used.
! In most URL schemes, different parts of a URL are used to represent
! sequences of octets used in Internet protocols. For example, in the
! ftp scheme, the host name, directory name and file names are
! represented by parts of the URL. Within those parts, chararacters
! are generally used to represent the corresponding octet within the
US-ASCII [20] coded character set.
In addition, octets may be _encoded_ by a character triplet
! consisting of the character "%" followed by two hexadecimal digits
! (from "0123456789ABCDEF"), which forming the hexadecimal value of
! the octet.
Octets must be encoded if they have no corresponding graphic
character within the US-ASCII coded character set, if the use of
--- 93,111 ----
character set. The interpretation of a URL depends only on
the identity of the characters used.
! In most URL schemes, the sequences of characters in different parts
! of a URL are used to represent sequences of octets used in Internet
! protocols. For example, in the ftp scheme, the host name, directory
! name and file names are such sequences of octets, represented by
! parts of the URL. Within those parts, an octet may be represented
! by the chararacter which has that octet as its code within the
US-ASCII [20] coded character set.
In addition, octets may be _encoded_ by a character triplet
! consisting of the character "%" followed by the two hexadecimal
! digits (from "0123456789ABCDEF") which forming the hexadecimal
! value of the octet. (The characters "abcdef" may also be used in
! hexadecimal encodings.)
Octets must be encoded if they have no corresponding graphic
character within the US-ASCII coded character set, if the use of
***************
*** 116,137 ****
No corresponding graphic US-ASCII:
URLs are written only with the graphic printable characters of the
! US-ASCII coded character set. All octets that correspond to
! non-printable characters or space must be encoded.
Unsafe:
! Characters can be unsafe for a number of reasons. The characters
! "<" and ">" are unsafe because they are used as the delimiters
! around URLs in free text; the quote mark (""") is used to delimit
! URLs in some systems. The character "#" is unsafe and should
! always be encoded because it is used in World Wide Web and in other
! systems to delimit a URL from a fragment/anchor identifier that
! might follow it. The character "%" is unsafe because it is used
! for encodings of other characters. Other characters are unsafe
! because gateways and other transport agents are known to sometimes
! modify such characters. These characters are "{", "}", "|", "\",
! "^", "~", "[", "]", and "`".
All unsafe characters must always be encoded within a URL. For
example, the character "#" must be encoded within URLs even in
--- 116,141 ----
No corresponding graphic US-ASCII:
URLs are written only with the graphic printable characters of the
! US-ASCII coded character set. The octets 80-FF hexadecimal are not
! used in US-ASCII, and the octets 00-1F and 7F hexadecimal represent
! control characters; these must be encoded.
Unsafe:
! Characters can be unsafe for a number of reasons. The space
! character is unsafe because significant spaces may disappear and
! insignificant spaces may be introduced when URLs are transcribed or
! typeset or subjected to the treatment of word-processing programs.
! The characters "<" and ">" are unsafe because they are used as the
! delimiters around URLs in free text; the quote mark (""") is used
! to delimit URLs in some systems. The character "#" is unsafe and
! should always be encoded because it is used in World Wide Web and
! in other systems to delimit a URL from a fragment/anchor identifier
! that might follow it. The character "%" is unsafe because it is
! used for encodings of other characters. Other characters are
! unsafe because gateways and other transport agents are known to
! sometimes modify such characters. These characters are "{", "}",
! "|", "\", "^", "~", "[", "]", and "`".
All unsafe characters must always be encoded within a URL. For
example, the character "#" must be encoded within URLs even in
***************
*** 140,151 ****
does use them, it will not be necessary to change the URL encoding.
Reserved:
! Many URL schemes reserve certain characters for a special
! meaning: their appearance in the scheme-specific part of the URL
! has a designated semantics. If the character corresponding to an
! octet is _reserved_ in a scheme, the octet must be encoded. The
! characters ";", "/", "?", ":", "@", "=" and "&" are the characters
! which may be reserved for special meaning within a scheme. No other
characters may be reserved within a scheme.
Usually a URL has the same interpretation when an octet is
--- 144,155 ----
does use them, it will not be necessary to change the URL encoding.
Reserved:
! Many URL schemes reserve certain characters for a special meaning:
! their appearance in the scheme-specific part of the URL has a
! designated semantics. If the character corresponding to an octet is
! _reserved_ in a scheme, the octet must be encoded. The characters
! ";", "/", "?", ":", "@", "=" and "&" are the characters which may
! be reserved for special meaning within a scheme. No other
characters may be reserved within a scheme.
Usually a URL has the same interpretation when an octet is
***************
*** 153,167 ****
not true for reserved characters: encoding a character reserved for
a particular scheme may change the semantics of a URL.
! Summary:
!
! In all URLs, irrespective of scheme, only alphanumerics, reserved
! characters used for their reserved purposes, "$", "-", "_", ".",
! "!", "*", "'", "(", ")", "," and "+" may be used unencoded.
!
! On the other hand, even safe characters such as alphanumerics _may_
! be encoded, as long as they are not being used for a reserved
! purpose.
2.3 Hierarchical schemes and relative links
--- 157,170 ----
not true for reserved characters: encoding a character reserved for
a particular scheme may change the semantics of a URL.
! Thus, only alphanumerics, the special characters "$-_.+!*'(),", and
! reserved characters used for their reserved purposes may be used
! unencoded within a URL.
!
! On the other hand, characters that are not required to be encoded
! (including alphanumerics) _may_ be encoded within the
! scheme-specific part of a URL, as long as they are not being used
! for a reserved purpose.
2.3 Hierarchical schemes and relative links
***************
*** 629,635 ****
The WAIS URL scheme is used to designate WAIS databases, searches,
or individual documents available from a WAIS database. WAIS is
! described in [7]; the WAIS protocol is described in RFC 1625 [17].
A WAIS URL takes one of the following forms:
--- 632,640 ----
The WAIS URL scheme is used to designate WAIS databases, searches,
or individual documents available from a WAIS database. WAIS is
! described in [7]. The WAIS protocol is described in RFC 1625 [17];
! Although the WAIS protocol is based on Z39.50-1988, the WAIS URL
! scheme is not intended for use with arbitrary Z39.50 services.
A WAIS URL takes one of the following forms:
***************
*** 855,862 ****
newsurl = "news:" grouppart
grouppart = "*" | group | article
group = alpha *[ alpha | digit | "-" | "." | "+" | "_" ]
! article = 1*articlechar "@" 1*articlechar
! articlechar = uchar | ";" | "/" | "?" | ":" | "&" | "="
; NNTP (see also RFC977)
--- 860,866 ----
newsurl = "news:" grouppart
grouppart = "*" | group | article
group = alpha *[ alpha | digit | "-" | "." | "+" | "_" ]
! article = 1*[ uchar | ";" | "/" | "?" | ":" | "&" | "=" ] "@" host
; NNTP (see also RFC977)
***************
*** 958,964 ****
Most recently, careful readings and comments by Dan Connolly, Ned
Freed, Roy Fielding, Guido van Rossum, Michael Dolan, Bert Bos,
! John Kunze, and many others have helped refine the current draft.
APPENDIX: Recommendations for URLs in Context
--- 962,969 ----
Most recently, careful readings and comments by Dan Connolly, Ned
Freed, Roy Fielding, Guido van Rossum, Michael Dolan, Bert Bos,
! John Kunze, Olle Jarnefors, Peter Svanberg and many others have
! helped refine the current draft.
APPENDIX: Recommendations for URLs in Context
***************