[6519] in www-talk@info.cern.ch


RE: WWW support for Cyrillic (and UNICODE)

daemon@ATHENA.MIT.EDU (Vladimir Sukonnik, Process Softwar)
Wed Nov 2 17:13:17 1994

Date: Wed, 2 Nov 1994 23:09:33 +0100
Errors-To: listmaster@www0.cern.ch
Reply-To: sukonnik@elnath.process.com
From: sukonnik@elnath.process.com (Vladimir Sukonnik, Process Software Corp)
To: Multiple recipients of list <www-talk@www0.cern.ch>

>Several reasons, in my estimation:
>
>    1) Unicode increases overhead, being 16-bit rather than 8
>    2) It is not supported by GUIs
>    3) 16-bit characters aren't supported by compilers
>    4) US/European programmers are cultural chauvinists
>
>Of course most of these objections are spurious:
>
>    1) UTF-8 allows for backwards 8-bit compatibility, adding
>       to storage requirements only for characters outside the
>       one-byte range
>    2) (valid objection)
>    3) Wide characters should be supported by ANSI-conformant
>       compilers and libraries
>    4) US and European programmers aren't stupid; they are just
>       not terribly aware of the situation in countries like
>       Russia, India, Japan, etc.  Give them gentle nudges, and
>       they *will* respond....
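
To make point 1 concrete, here is a minimal sketch (not part of the quoted message) comparing byte counts per character in UTF-8 and in a fixed 16-bit encoding. The sample characters and the helper names utf8_len and fixed16_len are illustrative assumptions, and Python's "utf-16-be" codec stands in for the fixed 16-bit form. Note that Cyrillic letters take two bytes either way, so the saving applies mainly to ASCII text.

```python
# Sketch: storage cost per character, UTF-8 versus a 16-bit encoding.
# The sample characters and helper names are illustrative assumptions.

def utf8_len(ch):
    """Number of bytes needed to store one character in UTF-8."""
    return len(ch.encode("utf-8"))

def fixed16_len(ch):
    """Number of bytes for the same character as a 16-bit unit
    (utf-16-be is used here as a stand-in; no byte-order mark)."""
    return len(ch.encode("utf-16-be"))

samples = [
    ("A", "ASCII letter"),
    ("\u042f", "Cyrillic capital Ya"),
    ("\u65e5", "CJK ideograph"),
]

for ch, name in samples:
    print(f"{name}: UTF-8 = {utf8_len(ch)} byte(s), "
          f"16-bit = {fixed16_len(ch)} byte(s)")

# ASCII text is byte-for-byte unchanged under UTF-8.
assert "plain ASCII".encode("utf-8") == b"plain ASCII"
```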

>Adding multi-language capability to the Web is going to take time,
>because it will require changes to HTML, to servers, and to clients.
>Given the lack of multilingual support in most GUIs, a lot of new
>widgets will have to be created, and people will have to stretch
>themselves to learn how things like Japanese and Arabic scripts
>work.  It's going to take time, but it appears we'll get there.

>Right here several things have been hashed out.  For example, we
>all pretty much seem to agree that LANG and CHARSET or CODEPAGE
>attributes will be needed for HTML (with some sensible defaults).
>We've also come to the realization that logical ordering of data
>is the only way to go.  The "visual" ordering that MIME allows
>for embedded Hebrew or Arabic just won't work in the long run.
>So we have to bite the bullet and make sure that clients can
>do the visual reordering themselves for mixed right-left/left-right
>languages.  (There is, by the way, a terribly explained algorithm
>for doing this in appendix A volume 1 of the old published
>Unicode standard; I can supply people with a more practical
>tutorial if anyone wants it.)
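
As a rough illustration of the logical-versus-visual distinction (not part of the quoted message, and not the algorithm from the Unicode standard), the sketch below stores text in logical order and has the "client" reverse each run of right-to-left characters for display on a left-to-right line. The function names bidi_classes and toy_visual_order are hypothetical. The toy deliberately ignores digits, neutral characters between right-to-left words, and embedding levels, all of which the real algorithm has to handle.

```python
import unicodedata

def bidi_classes(text):
    """Return the Unicode bidirectional class of each character,
    e.g. 'L' (left-to-right), 'R' or 'AL' (right-to-left), 'WS'."""
    return [unicodedata.bidirectional(ch) for ch in text]

def toy_visual_order(text):
    """Toy reordering for a left-to-right line: reverse each maximal
    run of right-to-left letters (classes R and AL), leave the rest.
    This is a simplification, NOT the full bidirectional algorithm."""
    out = []
    run = []  # pending right-to-left characters, in logical order
    for ch in text:
        if unicodedata.bidirectional(ch) in ("R", "AL"):
            run.append(ch)
        else:
            out.extend(reversed(run))  # flush the run in display order
            run = []
            out.append(ch)
    out.extend(reversed(run))
    return "".join(out)

# Logical (reading) order: Latin word, Hebrew alef-bet-gimel, Latin word.
logical = "abc \u05d0\u05d1\u05d2 def"
print(bidi_classes(logical))
print(toy_visual_order(logical))
```

The stored data never changes; only the display step reorders it, which is the division of labor the paragraph above argues for.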

>Things *are* happening.  Be patient, and offer to help.  Inject
>comments where you feel they will be appropriate.  Cut some code
>if you know how; otherwise, do some research on scripts and
>standards in various countries and help guide the process.  Above
>all, though, don't complain.  Help us out!

>Richard Goerwitz


Richard,

Thanks for your reply. I agree, in general, with your estimate
of the state of the art on this issue. I just want to point out a few
things. First, the Microsoft GUI (the 32-bit one) supports UNICODE,
and so does MS Visual C++. I know that this does not solve the problem
for non-Windows platforms. Second, EMWAC supports UNICODE in
their release of an HTTP server. I believe (please correct me if I am
wrong) that the NCSA Mosaic browser supports Unicode as well. The URL of
the UNICODE testing facility for browser developers is:

http://emwac.ed.ac.uk/html/internet_toolchest/UNICODE.HTM


 
		Best regards,
		Vladimir.



+---------------------------------------------------------------+
| Vladimir Sukonnik		Voice: 1-508-879-6994		|
| Principal Software Engineer	http://www.process.com		|
| Process Software Corp		Fax:   1-508-879-0042		|
| 959 Concord Street		E-mail: sukonnik@process.com or |
| Framingham, MA 01760 USA		sukonnik@bumetb.bu.edu	|
+---------------------------------------------------------------+

