[15588] in Athena Bugs


sun4 8.1.11: /usr/bin/tr

daemon@ATHENA.MIT.EDU (Jacob Morzinski)
Tue Oct 21 01:10:13 1997

To: bugs@MIT.EDU
Date: Tue, 21 Oct 1997 01:10:09 EDT
From: "Jacob Morzinski" <jmorzins@MIT.EDU>

System name:		portnoy
Type and version:	SPARC/5 8.1.11 (with mkserv)
Display type:		cgthree

What were you trying to do?
	Use Solaris's /usr/bin/tr to translate lowercase characters
	to uppercase:
	    /usr/bin/tr '[:lower:]' '[:upper:]' < latin1

What's wrong:
	/usr/bin/tr makes an off-by-one error when it uses the POSIX
	character classes to convert the characters [\337-\366\370-\377]
	to upper case.
	(Those are the characters [ßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ].)

	The bug also exists under Solaris 2.6.


	Sample invocation:

    % setenv LC_CTYPE iso_8859_1

    % cat < latin1
    @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_                (0100-0137, 0x40-0x5f)
    `abcdefghijklmnopqrstuvwxyz{|}~                 (0140-0176, 0x60-0x7e)
    ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞß                (0300-0337, 0xc0-0xdf)
    àáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ                (0340-0377, 0xe0-0xff)

    % /usr/bin/tr '[:lower:]' '[:upper:]' < latin1
    @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_                (0100-0137, 0X40-0X5F)
    `ABCDEFGHIJKLMNOPQRSTUVWXYZ{|}~                 (0140-0176, 0X60-0X7E)
    ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞÀ                (0300-0337, 0XC0-0XDF)
    ÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØ÷ÙÚÛÜÝÞþÿ                (0340-0377, 0XE0-0XFF)


	Note especially the last two lines of the tr output.
	"à" (0340) has been upcased into "Á" (0301),
	"á" (0341) has been upcased into "Â" (0302),
	and so on.  Somehow we've gotten off-by-one errors.
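
The shifted output is what one would see if tr paired the two classes
by position while the two lists had different lengths. The Python sketch
below is only a model of that guess, not the Solaris implementation.
LOWER is the range named in this report; UPPER (\300-\326 and \330-\336)
is an assumption, and the name positional_upper is made up for
illustration.

```python
# Toy model of a positional class-to-class mapping (a guess at the bug).

# Code points this report names as lowercase: \337-\366 and \370-\377.
LOWER = list(range(0o337, 0o367)) + list(range(0o370, 0o400))

# ASSUMPTION: the uppercase class is \300-\326 and \330-\336.
UPPER = list(range(0o300, 0o327)) + list(range(0o330, 0o337))


def positional_upper(data: bytes) -> bytes:
    """Map the i-th member of LOWER to the i-th member of UPPER."""
    # zip() stops at the shorter list, so the last two members of
    # LOWER get no partner and pass through unchanged.
    table = dict(zip(LOWER, UPPER))
    return bytes(table.get(b, b) for b in data)


if __name__ == "__main__":
    for code in (0o337, 0o340, 0o341, 0o366, 0o376):
        mapped = positional_upper(bytes([code]))[0]
        print(f"{code:04o} -> {mapped:04o}")
```

With these lists, \337 maps to \300, \340 to \301, \366 to \330, and
\376 and \377 are left alone, which agrees with the tr output in the
sample invocation. LOWER has 32 members and UPPER has 30, and \337 has
no uppercase partner, so every later pair is shifted by one.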



What should have happened:

    % perl5 -ne 'print uc($_)' < latin1
    @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_                (0100-0137, 0X40-0X5F)
    `ABCDEFGHIJKLMNOPQRSTUVWXYZ{|}~                 (0140-0176, 0X60-0X7E)
    ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞß                (0300-0337, 0XC0-0XDF)
    ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ÷ØÙÚÛÜÝÞÿ                (0340-0377, 0XE0-0XFF)
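
For comparison, here is a small Python sketch of a per-character
mapping that gives the expected lines above. It assumes the input is
ISO 8859-1, and the helper name latin1_upper is hypothetical. A
character whose uppercase form does not fit in a single Latin-1 byte
(\337 and \377, for example) is left unchanged, as in the perl5 output.

```python
def latin1_upper(data: bytes) -> bytes:
    """Upper-case ISO 8859-1 bytes one character at a time."""
    out = bytearray()
    for b in data:
        up = chr(b).upper()
        # Keep the original byte when the uppercase form is not one
        # character that still fits in Latin-1 (0x00-0xFF).
        if len(up) == 1 and ord(up) < 0x100:
            out.append(ord(up))
        else:
            out.append(b)
    return bytes(out)


if __name__ == "__main__":
    import sys
    # Usage: python3 this_script.py < latin1
    sys.stdout.buffer.write(latin1_upper(sys.stdin.buffer.read()))
```

Each byte is looked up on its own, so a character with no uppercase
partner cannot shift the mapping of the characters after it.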



Please describe any relevant documentation references:

(/usr/man/man1/tr.1)
     2. The next example translates all lower-case characters  in
        file1  to  upper-case  and writes the results to standard
        output.

             tr "[:lower:]" "[:upper:]" <file1




Thank you,
-- 
 Jacob Morzinski                                jmorzins@mit.edu




You can reconstruct the "latin1" file from the message body, but
if it makes things simpler, here's a uuencoded version of it:


begin 644 latin1
M0$%"0T1%1D=(24I+3$U.3U!14E-455976%E:6UQ=7E\@(" @(" @(" @(" @
M(" @*# Q,# M,#$S-RP@,'@T,"TP>#5F*0I@86)C9&5F9VAI:FML;6YO<'%R
M<W1U=G=X>7I[?'U^(" @(" @(" @(" @(" @(" H,#$T,"TP,3<V+" P>#8P
M+3!X-V4I"L#!PL/$Q<;'R,G*R\S-SL_0T=+3U-76U]C9VMO<W=[?(" @(" @
M(" @(" @(" @("@P,S P+3 S,S<L(#!X8S M,'AD9BD*X.'BX^3EYN?HZ>KK
M[.WN[_#Q\O/T]?;W^/GZ^_S]_O\@(" @(" @(" @(" @(" @*# S-# M,#,W
.-RP@,'AE,"TP>&9F*0KZ
 
end
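
If uudecode is not at hand, the block above can also be unpacked with a
few lines of Python. This is a sketch that assumes the whole message has
been saved as text. The function name uudecode_text is made up here;
binascii.a2b_uu is the standard-library call that decodes one uuencoded
line.

```python
import binascii


def uudecode_text(text: str) -> bytes:
    """Return the bytes between the 'begin' and 'end' lines of text."""
    out = bytearray()
    in_body = False
    for line in text.splitlines():
        if not in_body:
            # Skip everything up to and including the 'begin' line.
            in_body = line.startswith("begin ")
            continue
        stripped = line.strip()
        if stripped == "end":
            break
        if stripped in ("", "`"):
            # Zero-length line that precedes 'end'; nothing to decode.
            continue
        out += binascii.a2b_uu(line.encode("ascii"))
    return bytes(out)


if __name__ == "__main__":
    import sys
    # Usage: python3 this_script.py < saved_message > latin1
    sys.stdout.buffer.write(uudecode_text(sys.stdin.read()))
```

If the encoded lines survived mailing intact, the bytes written this way
should be the latin1 test file used in this report.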
