NEWS via NNTP and HTTP more URL stuff

hallam@alws.cern.ch
Mon, 14 Mar 1994 20:50:03 +0100

Date: Mon, 14 Mar 1994 20:50:03 +0100
Message-Id: <9403141950.AA08754@dxmint.cern.ch>
From: hallam@alws.cern.ch
Subject: NEWS via NNTP and HTTP more URL stuff

Really-From: Phill Hallam-Baker

On the NNTP side we need to sort out not only the article number stuff but also how
the data is arranged in directories. Here I would also like to propose a scheme for
aliasing URLs.

In my view the NNTP URL should be a superset of the NEWS URL. This is to allow
a mapping from the NEWS URL to the NNTP URL. One use for this would be to specify the
news server(s) within a client. For example:

NEWS:<messageid@324578>

is mapped to:-

NNTP://newshost.here.ch:119/<messageid@324578>
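A minimal sketch of this client-side mapping, in Python. The function name news_to_nntp and the DEFAULT_* settings are made up for illustration; they stand in for whatever a client uses to record its configured news server.

```python
# Hypothetical client settings: the news server this client talks to.
DEFAULT_NEWS_HOST = "newshost.here.ch"
DEFAULT_NNTP_PORT = 119  # the usual NNTP port


def news_to_nntp(url, host=DEFAULT_NEWS_HOST, port=DEFAULT_NNTP_PORT):
    """Rewrite a NEWS: URL as an NNTP: URL aimed at a concrete server."""
    scheme, sep, rest = url.partition(":")
    if not sep or scheme.upper() != "NEWS":
        raise ValueError("not a NEWS URL: %r" % url)
    # Drop any leading "/" so the result has exactly one after the port.
    return "NNTP://%s:%d/%s" % (host, port, rest.lstrip("/"))


print(news_to_nntp("NEWS:<messageid@324578>"))
```

The NEWS URL itself stays server-independent; only the client's configuration decides which machine is asked.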

Increasing the similarity between the NNTP and HTTP URL forms makes a
uniform aliasing scheme workable. It also makes it possible to consider a uniform
mechanism to extend URLs to other protocols for the glorious day when we can junk
TCP/IP and move to ATM or whatever new system is about... Unless
someone works out a way to reserve bandwidth on TCP/IP channels, it will not be
suitable for carrying telephone calls (voice/video) over network links. This
is something many Web people will want to do.

[Digression, adding protocols...
Adding a transport protocol may be done by extending the port number
field. The form HTTP://newshost.here.ch:TCPIP=119/ might be used;
the protocol defaults to TCPIP if none is specified. If a URL were to be
resolved via DECNET then we would use HTTP://newshost.here.ch:DECNET=WWW/
where WWW is the facility name on port 0...
]
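To make the digression concrete, here is a rough Python sketch of how a parser might split such an extended port field. The function name parse_host_field and the tuple it returns are assumptions for illustration only; nothing here is an agreed syntax.

```python
def parse_host_field(field):
    """Split 'host[:[TRANSPORT=]endpoint]' into (host, transport, endpoint).

    The transport defaults to TCPIP when none is given. The endpoint is
    kept as a string: a port number for TCPIP, a facility name for DECNET.
    It is None when the field has no ':' part at all.
    """
    host, sep, port_part = field.partition(":")
    if not sep:
        return host, "TCPIP", None
    transport, eq, endpoint = port_part.partition("=")
    if not eq:
        # Plain ':119' form: the whole part is the endpoint.
        return host, "TCPIP", port_part
    return host, transport.upper(), endpoint


for example in ("newshost.here.ch:119",
                "newshost.here.ch:TCPIP=119",
                "newshost.here.ch:DECNET=WWW"):
    print(example, "->", parse_host_field(example))
```

Existing URLs with a bare port number parse exactly as before, which is the point of making TCPIP the default.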

We also need to tie up the URLs for describing the list of newsgroups. Here
the convention `anything with a single @ is a message id' is very tiresome.
It means the whole URL has to be parsed before anything sensible can be
done with it.

I suggest we give ourselves a bit of space by using the / character to indicate that a
spiffy new NEWS URL is in use. This will prevent a lot of older URLs in existing
documents from breaking; the alternative is to prefix message ids somehow. It also
narrows the gap between NEWS and HTTP. This is important because in the long run we
should think of using HTTP as an interface to NEWS.
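A small Python sketch of why the leading / helps: the new form is recognised from the first character, while the old form still needs the whole string scanned for an @. The function name classify_news_url and its return labels are illustrative assumptions.

```python
def classify_news_url(url):
    """Return 'new' for NEWS:/..., else 'message-id' or 'group' (old forms)."""
    scheme, sep, rest = url.partition(":")
    if not sep or scheme.upper() != "NEWS":
        raise ValueError("not a NEWS URL: %r" % url)
    if rest.startswith("/"):
        # New style: decided from the first character alone.
        return "new"
    # Old convention: scan the whole string for exactly one '@'.
    return "message-id" if rest.count("@") == 1 else "group"


print(classify_news_url("NEWS:/soc.culture.british"))
print(classify_news_url("NEWS:<messageid@324578>"))
print(classify_news_url("NEWS:soc.culture.british"))
```

Old URLs in existing documents keep working because they fall through to the old rule unchanged.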

Referring to the hierarchy
--------------------------

People do want to be able to write `see the newsgroups in the soc.* hierarchy' as a
live link.

The mechanism I would propose is:

NEWS:/*                      List of all newsgroups
NEWS:/*.                     List of top level news hierarchies (alt, soc etc.)
NEWS:/soc.*                  List of all newsgroups in the soc hierarchy
NEWS:/soc.*.                 List of all newsgroups and first level subhierarchies in soc

NEWS:/soc.culture.british    List all the articles in soc.culture.british
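One possible reading of these wildcard forms, sketched in Python. The function name list_hierarchy is hypothetical, and the interpretation of the trailing "*." (return only the next name component below the prefix, whether it is a group or a subhierarchy) is my assumption about what the listing should contain.

```python
def list_hierarchy(pattern, groups):
    """Expand a wildcard pattern (the part after NEWS:/) over group names."""
    if pattern.endswith("*."):
        one_level, prefix = True, pattern[:-2]
    elif pattern.endswith("*"):
        one_level, prefix = False, pattern[:-1]
    else:
        raise ValueError("not a wildcard pattern: %r" % pattern)
    if prefix and not prefix.endswith("."):
        raise ValueError("wildcard must follow a '.': %r" % pattern)
    matches = [g for g in groups if g.startswith(prefix)]
    if not one_level:
        return sorted(matches)
    # Keep only the next name component below the prefix, once each.
    return sorted({prefix + g[len(prefix):].split(".")[0] for g in matches})


GROUPS = ["alt.folklore.urban", "comp.infosystems.www",
          "soc.culture.british", "soc.culture.french", "soc.misc"]
print(list_hierarchy("*.", GROUPS))
print(list_hierarchy("soc.*", GROUPS))
print(list_hierarchy("soc.*.", GROUPS))
```

With the sample list, "*." gives alt, comp and soc, and "soc.*." gives soc.culture and soc.misc.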

Being able to refer to the newsgroup an article is in makes aliasing much more
powerful. It is then possible to set the system up so that all requests for
messages in soc.culture.british are sent to one server and all requests for
messages in local.very.secret go to another.

Another point about aliasing: one might want to set the system up to use NNTP to access
general news and HTTP to access certain newsgroups on another machine. HTTP is a very
nice protocol for the client side to interface to for reading news. Often the primary
source will be on NNTP but the secondary will come from a CD ROM; why not use an HTTP
server in that case? A group may start out with a news type system based around a
quick 'n dirty HTTP setup but then want to move to NNTP as the group gets bigger.

So the sort of article URL I would like to be able to use is:

NEWS:/soc.culture.british/<messageids@wherever>
NEWS:/local.very.secret/<another@secret>

Internally these are aliased onto the protocols NNTP and HTTP ...
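A sketch of what such an alias table could look like in Python. The table contents, the host secret.here.ch, the /news/ path and the function name resolve are all invented for illustration; the only idea taken from the proposal is that the newsgroup name selects the server and protocol.

```python
import fnmatch

# Hypothetical alias table: (newsgroup pattern, base URL). First match wins.
ALIASES = [
    ("local.very.secret", "HTTP://secret.here.ch:80/news/"),
    ("soc.*", "NNTP://newshost.here.ch:119/"),
    ("*", "NNTP://newshost.here.ch:119/"),
]


def resolve(news_url, aliases=ALIASES):
    """Map NEWS:/group/<message-id> onto a concrete NNTP or HTTP URL."""
    prefix = "NEWS:/"
    if not news_url.upper().startswith(prefix):
        raise ValueError("not a new-style NEWS URL: %r" % news_url)
    path = news_url[len(prefix):]
    group = path.split("/", 1)[0]
    for pattern, base in aliases:
        if fnmatch.fnmatchcase(group, pattern):
            return base + path
    raise LookupError("no alias for group %r" % group)


print(resolve("NEWS:/soc.culture.british/<messageids@wherever>"))
print(resolve("NEWS:/local.very.secret/<another@secret>"))
```

Moving a group from the quick HTTP setup to NNTP later would then be a one-line change in the table, with no change to the URLs in documents.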

Crossposting
------------

What about multiple groups? We have to be able to post as well as get so we should
consider the case of having more than one group. Here the question is what sort of
separator one should use, a comma or a plus.

The URL

NEWS:/soc.culture.british+comp.infosystems.www

Would seem to fit the bill. It works for a GET or a POST operation...
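A short Python sketch of handling the plus form, assuming the new-style NEWS:/ prefix. The function names split_groups and newsgroups_header are illustrative; the comma-separated Newsgroups: line is what a posted article's header uses.

```python
def split_groups(news_url):
    """Return the list of newsgroups named in NEWS:/group1+group2[/...]."""
    prefix = "NEWS:/"
    if not news_url.upper().startswith(prefix):
        raise ValueError("not a new-style NEWS URL: %r" % news_url)
    group_part = news_url[len(prefix):].split("/", 1)[0]
    groups = group_part.split("+")
    if not all(groups):
        raise ValueError("empty newsgroup name in %r" % news_url)
    return groups


def newsgroups_header(news_url):
    """Build the Newsgroups: header line a POST to this URL would carry."""
    return "Newsgroups: " + ",".join(split_groups(news_url))


url = "NEWS:/soc.culture.british+comp.infosystems.www"
print(split_groups(url))
print(newsgroups_header(url))
```

For a GET the same list would simply be the set of groups whose articles are merged into one listing.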

Article Numbers
---------------

So where do article numbers enter into this? They are useful in the
NAME= section of HTML anchors:-

<A name=3241 href="NEWS:<article@ERGGU>">

Then a browser can knock out previously read articles on the basis of the article
number...

Searching
---------

I love those question marks.... So what can we use them for in NEWS? I suggest that we
allow searches according to the content of text or headers...

NEWS:/soc.culture.british?From=Hallam
All the messages in s.c.b with the string `hallam' in the From field.

NEWS:/soc.culture.british?Subject=HTTP&Text=bong
All the messages with HTTP in the subject field and bong in the text.
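A rough Python sketch of how a server or gateway might interpret such search URLs. The names parse_search and matches, the dictionary layout of an article, and the choice of case-insensitive substring matching are assumptions on my part; the examples above only suggest that `Hallam' should find `hallam'.

```python
from urllib.parse import parse_qsl


def parse_search(news_url):
    """Split NEWS:/group?Field=value&... into (group, [(field, value), ...])."""
    prefix = "NEWS:/"
    if not news_url.upper().startswith(prefix):
        raise ValueError("not a new-style NEWS URL: %r" % news_url)
    group, _, query = news_url[len(prefix):].partition("?")
    return group, parse_qsl(query)


def matches(article, terms):
    """True if every term occurs, ignoring case, in the named field.

    `article` is a dict keyed by lowercase header name, plus "text"
    for the body (a made-up layout for this sketch).
    """
    return all(value.lower() in article.get(field.lower(), "").lower()
               for field, value in terms)


article = {"from": "hallam@alws.cern.ch",
           "subject": "NEWS via NNTP and HTTP",
           "text": "bong"}
group, terms = parse_search("NEWS:/soc.culture.british?Subject=HTTP&Text=bong")
print(group, terms, matches(article, terms))
```

Several terms joined with & are treated here as a conjunction, which matches the second example above.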

This scheme seems to do everything I would like to do... any comments?

Summary
-------

1) Keep the URL as close as possible to the HTTP URL. Currently the NEWS URL is the
only one that does not start off with the form PROTOCOL://machine:port/ . Let's
keep this as general as possible. OK, you get the same NEWS from every machine, but
internally there has to be a given machine to use.

2) Leave open the possibility of using other transport protocols. The idea that everyone
will always be on TCP/IP seems to me to be counter to the philosophy of the Web. If I
have DECnet or ATM on my machine I want to use it. The Web may offer a way to move off
TCP/IP to some new protocol five or ten years hence if we make the right move now.

3) Need standard forms for the hierarchies.

4) The `anything with a single @ in it is a message id' convention is an abomination
that should be put to death as quickly as possible.

5) Want to have search terms.

6) POSTING!!!! Need to deal with crossposting.

Phill Hallam-Baker