[6025] in www-talk@info.cern.ch


Re: Putting the "World" back in WWW...

daemon@ATHENA.MIT.EDU (Chris Lilley, Computer Graphics Un)
Tue Oct 4 07:37:11 1994

Date: Tue, 4 Oct 1994 12:31:38 +0100
Errors-To: listmaster@www0.cern.ch
Reply-To: lilley@v5.cgu.mcc.ac.uk
From: lilley@v5.cgu.mcc.ac.uk (Chris Lilley, Computer Graphics Unit)
To: Multiple recipients of list <www-talk@www0.cern.ch>

Frank Rojas wrote in message  <9410032259.AA23180@nlsarch.austin.ibm.com>  

>  > From:  hallam@dxal18.cern.ch (HALLAM-BAKER Phillip)

>  >  It is simply another content encoding to deal with.
>  >  A charset module can easily be written to convert fairly arbitrary encodings
>  >  into UNICODE tokens. This can also do UTS, ASCII, ISO-8893, JIS, and whacky
>  >  Russian etc. encodings.

> I'm not sure I follow... excuse me if I missed the point ... but it sounds
> like you are suggesting we put "ANY ENCODING" in the document and have each
> viewer convert into UNICODE... 

The alternative seems to be to force everyone to write all their documents in 
Unicode, which would, at a stroke, give a large increase in server disk space 
and transfer time.
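
To put a rough number on that, here is a minimal sketch. It assumes a 16-bit Unicode form (Python's "utf-16-be" codec stands in for it) and a made-up sample sentence; neither comes from any existing Web software.

```python
# Illustrative only: Python's built-in codecs stand in for a charset library.
text = "Grüße aus Zürich"  # hypothetical sample sentence, all within ISO-8859-1

latin1_bytes = text.encode("iso-8859-1")   # one byte per character
unicode_bytes = text.encode("utf-16-be")   # two bytes per character, no byte-order mark

# For text that fits an 8-bit national encoding, the 16-bit form is twice the size.
print(len(latin1_bytes), len(unicode_bytes))
```

The doubling applies to text that already fits an 8-bit encoding; for encodings that already use two bytes per character the difference would be smaller.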

As I said yesterday, people in other countries already have methods for encoding 
the characters of their national languages, and these methods should be 
supported.

Put yourself in others' shoes - how would you feel if the Web technology were all 
Japanese, say, and the instructions said something like:

 " To type a letter 'e', use shift control right bracket kanji-something. 
  On keyboards without a kanji-something, refer to your manufacturer's
  instructions. Pressing the letter 'e' on your keyboard will not work."

> If so, this will cause MAJOR interoperability problems across the network. 

Why? Why would this cause more severe problems than forcing everyone to use 
Unicode when authoring documents?

> Expecting every client to be convert to from every possible encoding will
> never work 

[Is that "to be able to convert" ?]

Sure it will. We *are* using a common libwww aren't we?

> But this causes a nightmare for system administrators that need to provide 
> conversions from any other encoding to UNICODE... and puts the burden of 
> conversion on the clients each time the document is accessed rather than on
> the supplier one time.

I appreciate what you are saying, but the picture is not entirely as you present 
it for two reasons.

Firstly, not all clients will need to convert. Realistically, many of the 
documents using a particular encoding will be read by people also using that 
encoding. So, converting to Unicode on the server would impose a burden of two 
conversions - into Unicode and back out to the same native encoding that the 
people in a particular country are already using.
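
The two cases can be sketched side by side. This is only an illustration: the function names, the conversion counter and the KOI8-R sample are my own assumptions, and "utf-16-be" again stands in for Unicode on the wire.

```python
def serve_native(doc: bytes, doc_charset: str, reader_charset: str):
    """Send the document as authored; the client converts only if it must."""
    if doc_charset.lower() == reader_charset.lower():
        return doc, 0                        # same encoding: nothing to convert
    text = doc.decode(doc_charset)           # one conversion, done at the client
    return text.encode(reader_charset), 1


def serve_unicode(doc: bytes, doc_charset: str, reader_charset: str):
    """Convert to Unicode on the server; every client then converts back."""
    wire = doc.decode(doc_charset).encode("utf-16-be")          # conversion 1 (server)
    return wire.decode("utf-16-be").encode(reader_charset), 2   # conversion 2 (client)


# Hypothetical Russian document read by someone who also uses KOI8-R.
doc = "Привет".encode("koi8-r")
native_result, native_steps = serve_native(doc, "koi8-r", "koi8-r")
unicode_result, unicode_steps = serve_unicode(doc, "koi8-r", "koi8-r")
print(native_steps, unicode_steps)
```

Both routes hand the reader the same bytes; the Unicode-on-the-server route simply does two conversions to get there.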

Secondly, the phrase "system administrators" rings warning bells here. Your 
mental model seems to be of a technical support team running a server, doing 
code conversion on all their documents to a common format, etc. This is the 
traditional heavyweight publishing model. 

Fine, some servers are like that but not all. Remember that the first Web 
browser for the NeXT was also an editor, and remember Tim BL's address at WWW94 
describing how important it was that the Web encouraged collaborative, 
lightweight publishing. The readers are also the writers. So putting a burden on 
the 'supplier' is the same thing as putting a burden on the consumer.

It strikes me that an alternative reading of what you wrote would mean the 
'system administrators' are the people who install the clients. If that is what 
you mean, then clients using a common code library would use that for the 
conversions, so sysadmins would not, individually, have to solve the problem.
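
As a rough sketch of what such a shared entry point might look like: the name to_unicode is hypothetical and is not an actual libwww call, and Python's codecs module stands in for the charset tables the library would carry.

```python
import codecs


def to_unicode(data: bytes, charset: str) -> str:
    """Hypothetical shared-library call: decode bytes labelled with `charset`."""
    try:
        codecs.lookup(charset)               # is this encoding known to the library?
    except LookupError:
        raise ValueError(f"unsupported charset: {charset}") from None
    return data.decode(charset)


# Every client calls the same function, whatever the document's encoding.
print(to_unicode(b"caf\xe9", "iso-8859-1"))
print(to_unicode("日本語".encode("iso-2022-jp"), "iso-2022-jp"))
```

Supporting a new encoding then means updating the library once, not every client or every site.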

--
Chris
