[24297] in Perl-Users-Digest


Perl-Users Digest, Issue: 6488 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Apr 29 18:06:56 2004

Date: Thu, 29 Apr 2004 15:05:08 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Thu, 29 Apr 2004     Volume: 10 Number: 6488

Today's topics:
        (R) character in RegEXP (Tsu-na-mi)
    Re: (R) character in RegEXP <flavell@ph.gla.ac.uk>
    Re: (R) character in RegEXP <jtc@shell.dimensional.com>
        ANNOUNCE: Audio::M4pDecrypt 0.01 released <wherrera@lynxview.com>
        ANNOUNCE: Javascript::MD5 1.03 <ron@savage.net.au>
        ANNOUNCE: Javascript::SHA1 1.00 <ron@savage.net.au>
        ANNOUNCE: Spreadsheet::WriteExcel 0.43 <jmcnamara@cpan.org>
        ANNOUNCE: Spreadsheet::WriteExcelXML 0.02 <jmcnamara@cpan.org>
    Re: convert utf8 to latin-1/iso-8859-1 <shailesh@nothing.but.net>
    Re: convert utf8 to latin-1/iso-8859-1 <shailesh@nothing.but.net>
    Re: convert utf8 to latin-1/iso-8859-1 <shailesh@nothing.but.net>
    Re: Count how many times find and replaced happened <tadmc@augustmail.com>
    Re: Count how many times find and replaced happened <webmaster @ infusedlight . net>
        cywin versus activestate on xp <beau@oblios-cap.com>
        generating time series graphs with perl <a5ufv8u02@sneakemail.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 29 Apr 2004 12:32:12 -0700
From: tsunami@zedxinc.com (Tsu-na-mi)
Subject: (R) character in RegEXP
Message-Id: <a6b1e337.0404291132.1541084e@posting.google.com>

Hi,

I am having trouble getting a simple regexp to recognize the
registered trademark symbol (R) when it is read from XML.  The XML
uses &#174; for the symbol, and if I print the string after parsing,
it prints correctly.  However, the regexp:

$string =~ s/(R)/somethingelse/g;

does not recognize the (R) symbol.  NOTE: (R) is the single-ASCII
character.  I also tried using \x{AE} which did not work either.  The
regular TM symbol doesn't work either, and seems to throw everything
into unicode mode, screwing up other stuff like the bullet and
copyright symbols.

So my question is, If I have XML like :

<P>This is my Widget&#174;</P>

And read it into a string with XML::Parser, how should I address this
character (and any char > 256 if you know).

For the record, I am using Perl 5.8.3 on Red Hat 9.0.  Thanks for any
help anyone can provide.


------------------------------

Date: Thu, 29 Apr 2004 21:58:55 +0100
From: "Alan J. Flavell" <flavell@ph.gla.ac.uk>
Subject: Re: (R) character in RegEXP
Message-Id: <Pine.LNX.4.53.0404292110370.28014@ppepc56.ph.gla.ac.uk>

On Thu, 29 Apr 2004, Tsu-na-mi wrote:

> I am having trouble getting a simple regexp to recognize the
> registered trademark symbol (R) when it is read from XML.  The XML
> uses &#174; for the symbol,

But what are you giving to Perl - the character itself, or that
numerical character reference?

> However, the regexp:
>
> $string =~ s/(R)/somethingelse/g;
>
> does not recognize the (R) symbol.

You said the XML contains &#174;

How do you expect Perl to know what it's intended to mean?

Or were you getting that as a result from the XML parser? (sorry if
your description didn't seem clear enough).
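
A minimal sketch of the distinction being asked about, assuming the string still holds the literal reference text rather than the decoded character. The helper name decode_numeric_refs is hypothetical, and a real XML parser would normally do this decoding itself:

```perl
use strict;
use warnings;

# Hypothetical helper: turn decimal numeric character references
# such as "&#174;" into the characters they denote.
sub decode_numeric_refs {
    my ($text) = @_;
    $text =~ s/&#(\d+);/chr($1)/ge;
    return $text;
}

my $raw     = 'This is my Widget&#174;';
my $decoded = decode_numeric_refs($raw);

# The raw text holds only ASCII characters, so U+00AE is not found in it.
my $raw_matches = ($raw =~ /\x{AE}/) ? 1 : 0;

# After decoding, the string ends with the single character U+00AE.
my $decoded_matches = ($decoded =~ /\x{AE}/) ? 1 : 0;

printf "raw: %d, decoded: %d\n", $raw_matches, $decoded_matches;
```

If the pattern fails against the parser's output, checking which of these two forms the string is actually in is a reasonable first step.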

> NOTE: (R) is the single-ASCII character.

Ahem.  The ASCII code does not contain this character.  ASCII is a
7-bit code, which is at the basis of numerous 8-bit codings.
Presumably you're thinking in terms of iso-8859-1, or maybe even
Windows-1252, as the 8-bit coding.
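
To make the 7-bit point concrete, here is a small sketch using the core Encode module. The variable names are illustrative only:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $registered = "\x{AE}";    # U+00AE REGISTERED SIGN

# Code point 174 lies outside the 7-bit ASCII range 0..127.
my $code_point = ord($registered);

# iso-8859-1 maps U+00AE to the single byte 0xAE.
my $latin1 = encode('iso-8859-1', $registered);

# ASCII has no slot for it; Encode substitutes "?" by default.
my $ascii = encode('ascii', $registered);

printf "code point %d, latin1 byte 0x%02X, ascii '%s'\n",
    $code_point, ord($latin1), $ascii;
```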

>  I also tried using \x{AE} which did not work either.  The
> regular TM symbol doesn't work either, and seems to throw everything
> into unicode mode, screwing up other stuff like the bullet and
> copyright symbols.

> So my question is,

Not so fast!  Let's see some Perl code first.  Preferably a
manageable-sized snippet that's complete enough to run for ourselves
and that demonstrates the problem you're experiencing.  You're
obviously in a quagmire, and folks would like to help you, but if you
keep yelling and struggling, then you're liable to just get deeper;
stay calm, take it step by step, show us your working...

> If I have XML like :
>
> <P>This is my Widget&#174;</P>
>
> And read it into a string with XML::Parser, how should I address this
> character (and any char > 256 if you know).

With respect, my advice would have to be that you -do- need to take a
while out with perldoc perluniintro and (maybe) perlunicode, to get a
start on Perl's handling of unicode.  Just getting a prescription
handed out isn't going to be a great deal of help - one needs to
understand this sufficiently for it to make sense, rather than just
applying magic incantations.

> For the record, I am using Perl 5.8.3 on Red Hat 9.0.  Thanks for any
> help anyone can provide.

Looks OK.  But I think you need to (a) show a bit more of your working
and (b) understand the difference between Perl's legacy 8-bit handling
and its Unicode character model.

So, let's see a bit more of your code, if we're to help with the
code.  But your part of the bargain would be (no offence intended) to
get a bit more up to speed with the underlying principles.

hope this helps.


------------------------------

Date: 29 Apr 2004 15:17:02 -0600
From: Jim Cochrane <jtc@shell.dimensional.com>
Subject: Re: (R) character in RegEXP
Message-Id: <slrnc92s6e.ts8.jtc@shell.dimensional.com>

In article <a6b1e337.0404291132.1541084e@posting.google.com>, Tsu-na-mi wrote:
> Hi,
> 
> I am having trouble getting a simple regexp to recognize the
> registered trademark symbol (R) when it is read from XML.  The XML
> uses &#174; for the symbol, and if I print the string after parsing,
> it prints correctly.  However, the regexp:
> 
> $string =~ s/(R)/somethingelse/g;

If this is literally what is in your code, you're using the (...) grouping
construct.  If you want to literally match '(R)', you need to escape the
parens: s/\(R\)/...
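
A short sketch of the three cases, assuming the data holds either the ASCII sequence "(R)" or the single character U+00AE. The sample strings are made up for illustration:

```perl
use strict;
use warnings;

my $ascii = 'Widget(R)';       # three ASCII characters: ( R )
my $sign  = "Widget\x{AE}";    # one character: U+00AE REGISTERED SIGN

# Unescaped parentheses form a capture group, so only the bare "R" is replaced.
(my $grouped = $ascii) =~ s/(R)/X/g;

# Escaped parentheses match the literal three-character sequence.
(my $literal = $ascii) =~ s/\(R\)/X/g;

# \x{AE} matches the registered sign itself.
(my $symbol = $sign) =~ s/\x{AE}/X/g;

print "$grouped $literal $symbol\n";
```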

> 
> does not recognize the (R) symbol.  NOTE: (R) is the single-ASCII
> character.  I also tried using \x{AE} which did not work either.  The
> regular TM symbol doesn't work either, and seems to throw everything
> into unicode mode, screwing up other stuff like the bullet and
> copyright symbols.
> 
> So my question is, If I have XML like :
> 
> <P>This is my Widget&#174;</P>
> 
> And read it into a string with XML::Parser, how should I address this
> character (and any char > 256 if you know).
> 
> For the record, I am using Perl 5.8.3 on Red Hat 9.0.  Thanks for any
> help anyone can provide.


-- 
Jim Cochrane; jtc@dimensional.com
[When responding by email, include the term non-spam in the subject line to
get through my spam filter.]


------------------------------

Date: Wed, 28 Apr 2004 23:41:08 GMT
From: Bill <wherrera@lynxview.com>
Subject: ANNOUNCE: Audio::M4pDecrypt 0.01 released
Message-Id: <Hwy59J.oBG@zorch.sf-bay.org>

The pure Perl module Audio::M4pDecrypt has been posted to CPAN.

NAME

Audio::M4pDecrypt -- DRMS decryption of Apple iTunes style MP4 player files

DESCRIPTION

Perl port of the DeDRMS.cs program by Jon Lech Johansen

SYNOPSIS

use Audio::M4pDecrypt;

my $mp4file = 'myfile';
my $outfile = 'mydecodedfile';
my $deDRMS = Audio::M4pDecrypt->new;
$deDRMS->DeDRMS($mp4file, $outfile);


--Bill




------------------------------

Date: Wed, 28 Apr 2004 09:42:13 GMT
From: Ron Savage <ron@savage.net.au>
Subject: ANNOUNCE: Javascript::MD5 1.03
Message-Id: <Hwy58q.oAF@zorch.sf-bay.org>

The pure Perl module Javascript::MD5 1.03
is available immediately from CPAN,
and from http://savage.net.au/Perl-modules.html.

On-line docs, and a *.ppd for ActivePerl are also
available from the latter site.

An extract from the docs:
1.03  Tue Apr  27 14:59:04 2004
	- Replace Yahoo!'s version of the Javascript with Paul Johnston's version,
		from his web site: http://pajhome.org.uk/crypt/md5
	- Change the name of the function you call in your submit button code,
		from RetMD5() - the Yahoo! name - to str2hex_md5() - a name more in keeping
		with Paul's naming convention
	- Add 2 extra functions, str2b64_md5() and str2str_md5(), to return other versions
		of the digest
--
Cheers
Ron Savage, ron@savage.net.au on 28/04/2004
http://savage.net.au/index.html




------------------------------

Date: Wed, 28 Apr 2004 09:43:16 GMT
From: Ron Savage <ron@savage.net.au>
Subject: ANNOUNCE: Javascript::SHA1 1.00
Message-Id: <Hwy58x.1vsy@zorch.sf-bay.org>

The pure Perl module Javascript::SHA1 1.00
is available immediately from CPAN,
and from http://savage.net.au/Perl-modules.html.

On-line docs, and a *.ppd for ActivePerl are also
available from the latter site.

An extract from the docs:
1.00  Fri Mar  05 10:23:29 2004
	- Original version
	- The Javascript is Paul Johnston's version, from his web site:
		http://pajhome.org.uk/crypt/md5
	- There are 3 functions, str2hex_sha1(), str2b64_sha1() and str2str_sha1(), to return various versions
		of the digest

--
Cheers
Ron Savage, ron@savage.net.au on 28/04/2004
http://savage.net.au/index.html




------------------------------

Date: Wed, 28 Apr 2004 23:05:23 GMT
From: John McNamara <jmcnamara@cpan.org>
Subject: ANNOUNCE: Spreadsheet::WriteExcel 0.43
Message-Id: <Hwy59A.oBM@zorch.sf-bay.org>

======================================================================
ANNOUNCE

    Spreadsheet::WriteExcel version 0.43 has been uploaded to CPAN.

    http://search.cpan.org/~jmcnamara/Spreadsheet-WriteExcel

======================================================================
NAME

    Spreadsheet::WriteExcel - Write formatted text and numbers to a
    cross-platform Excel binary file.

======================================================================
CHANGES

    Minor release

    ! Fixed longstanding bug where page setup features didn't
      show up in OpenOffice.

    ! Fixed localised @_ bug when using threaded perls.

======================================================================
DESCRIPTION

    The Spreadsheet::WriteExcel module can be used to create a cross-
    platform Excel binary file. Multiple worksheets can be added to a
    workbook and formatting can be applied to cells. Text, numbers,
    formulas, hyperlinks and images can be written to the cells.

    The Excel file produced by this module is compatible with Excel 5,
    95, 97, 2000 and 2002.

    The module will work on the majority of Windows, UNIX and
    Macintosh platforms. Generated files are also compatible with the
    Linux/UNIX spreadsheet applications Gnumeric and OpenOffice.
    The generated files are not compatible with MS Access.

    This module cannot be used to read an Excel file. See
    Spreadsheet::ParseExcel or look at the main documentation for some
    suggestions.

    This module cannot be used to write to an existing Excel file.

======================================================================
SYNOPSIS

    To write a string, a formatted string, a number and a formula to
    the first worksheet in an Excel workbook called perl.xls:

        use Spreadsheet::WriteExcel;

        # Create a new Excel workbook
        my $workbook = Spreadsheet::WriteExcel->new("perl.xls");

        # Add a worksheet
        $worksheet = $workbook->addworksheet();

        #  Add and define a format
        $format = $workbook->addformat();    # Add a format
        $format->set_bold();
        $format->set_color('red');
        $format->set_align('center');

        # Write a formatted and unformatted string
        $col = $row = 0;
        $worksheet->write($row, $col, "Hi Excel!", $format);
        $worksheet->write(1,    $col, "Hi Excel!");

        # Write a number and a formula using A1 notation
        $worksheet->write('A3', 1.2345);
        $worksheet->write('A4', '=SIN(PI()/4)');

======================================================================
REQUIREMENTS

    This module requires Perl 5.005 (or later), Parse::RecDescent and
    File::Temp

        http://search.cpan.org/search?dist=Parse-RecDescent
        http://search.cpan.org/search?dist=File-Temp


======================================================================
AUTHOR

    John McNamara (jmcnamara@cpan.org)

--




------------------------------

Date: Wed, 28 Apr 2004 23:03:55 GMT
From: John McNamara <jmcnamara@cpan.org>
Subject: ANNOUNCE: Spreadsheet::WriteExcelXML 0.02
Message-Id: <Hwy595.oAx@zorch.sf-bay.org>

======================================================================
ANNOUNCE

    Spreadsheet::WriteExcelXML version 0.02 has been uploaded to CPAN.

    http://search.cpan.org/~jmcnamara/Spreadsheet-WriteExcelXML/

======================================================================
NAME

    Spreadsheet::WriteExcelXML - Create an Excel file in XML format.

======================================================================
DESCRIPTION

    The Spreadsheet::WriteExcelXML module can be used to create an
    Excel file in XML format. The Excel XML format is supported in
    Excel 2002 and 2003.

    Multiple worksheets can be added to a workbook and formatting
    can be applied to cells. Text, numbers, and formulas can be
    written to the cells. The module supports strings up to 32,767
    characters and the strings can be in UTF8 format.

    Spreadsheet::WriteExcelXML uses the same interface as
    Spreadsheet::WriteExcel.

    This module cannot, as yet, be used to write to an existing
    Excel XML file.


======================================================================
SYNOPSIS

    To write a string, a formatted string, a number and a formula to
    the first worksheet in an Excel XML spreadsheet called perl.xml:

        use Spreadsheet::WriteExcelXML;

        # Create a new Excel workbook
        my $workbook = Spreadsheet::WriteExcelXML->new("perl.xml");

        # Add a worksheet
        $worksheet = $workbook->add_worksheet();

        #  Add and define a format
        $format = $workbook->add_format(); # Add a format
        $format->set_bold();
        $format->set_color('red');
        $format->set_align('center');

        # Write a formatted and unformatted string.
        $col = $row = 0;
        $worksheet->write($row, $col, "Hi Excel!", $format);
        $worksheet->write(1,    $col, "Hi Excel!");

        # Write a number and a formula using A1 notation
        $worksheet->write('A3', 1.2345);
        $worksheet->write('A4', '=SIN(PI()/4)');


======================================================================
REQUIREMENTS

    This module requires Perl 5.005 (or later).


======================================================================
INSTALLATION

    Use the standard Unix-style installation. A ppm for Windows
    users will be available in the next release.

        Unzip and untar the module as follows or use winzip:

            tar -zxvf Spreadsheet-WriteExcelXML-0.xx.tar.gz

        The module can be installed using the standard Perl procedure:

            perl Makefile.PL
            make
            make test
            make install    # You may need to be root
            make clean      # or make realclean


======================================================================
AUTHOR

    John McNamara (jmcnamara@cpan.org)

--




------------------------------

Date: Thu, 29 Apr 2004 19:35:59 GMT
From: Shailesh <shailesh@nothing.but.net>
Subject: Re: convert utf8 to latin-1/iso-8859-1
Message-Id: <zOckc.39929$Vp5.6607@fe2.columbus.rr.com>

Never mind: the trademark symbol is a Microsoft-specific extension to 
the Latin-1 character set and has no representation in 
iso-8859-1.  That is why Perl is correctly erasing it during 
the conversion.

http://casa.colorado.edu/~ajsh/iso8859-1.html

http://www.fourmilab.ch/webtools/demoroniser/
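
A minimal sketch of that observation using the core Encode module, assuming the sample text "abc", U+2122, "123" from the attached file. The variable names are illustrative:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $line = "abc\x{2122}123";    # U+2122 TRADE MARK SIGN

# iso-8859-1 has no trademark sign, so Encode substitutes "?" by default.
my $latin1 = encode('iso-8859-1', $line);

# Windows-1252 places the trademark sign at byte 0x99 (decimal 153).
my $cp1252 = encode('cp1252', $line);

# UTF-8 needs three bytes for it, so the 7-character line is 9 bytes long.
my $utf8 = encode('UTF-8', $line);

printf "latin1=%s cp1252 byte=%d utf8 length=%d\n",
    $latin1, ord(substr($cp1252, 3, 1)), length($utf8);
```

So if the single octet 153 really is wanted in the output, the target encoding would be cp1252 rather than iso-8859-1.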


Shailesh wrote:

> I have a file that contains the trademark character encoded in 
> utf8/Unicode (i.e. it takes up 3 octets of space).  I can read this file 
> into a string like this:
> 
> $filename = "testutf8.txt";
> $source = IO::File->new( $filename, 'r' );
> binmode( $source, ':utf8' );
> @filedata = <$source>;
> $line = $filedata[0];
> 
> The problem is, I want to write out the $line to another file encoded as 
> iso-8859-1 (a.k.a latin-1 and extended US ASCII).  The trademark 
> character should be translated to a single octet with value 153. I have 
> tried encode/decode and pack/unpack with no success.  The trademark 
> character either gets erased, or converted into three separate gibberish 
> characters.  The only thing that seems to work is 
> HTML::Entities::encode_entities, which correctly detects the trademark 
> symbol in utf8, and converts it to the entity "&trade;".  Any help?
> 
> I have attached the utf8 encoded file, which contains "abc|123", where 
> '|' is the TM symbol.  Note that the file is 9 bytes, as expected. Below 
> is the rest of the testing code.
> 
> # Open a test output file
> $dest = IO::File->new("outenc.txt" , 'w' );
> 
> # Write a correct trademark symbol
> $test = pack("C", 153);
> print "char is: ".$test."\n";
> $dest->write($test);
> $dest->write("\n");
> 
> # Write the utf8 line converted to iso-8859-1 -- HOW TO CONVERT IT?
> $dest->write($line);
> 
> $source->close();
> $dest->close();
> 
> 
> ------------------------------------------------------------------------
> 
> abc™123


------------------------------

Date: Thu, 29 Apr 2004 19:57:19 GMT
From: Shailesh <shailesh@nothing.but.net>
Subject: Re: convert utf8 to latin-1/iso-8859-1
Message-Id: <z6dkc.40077$Vp5.31173@fe2.columbus.rr.com>

This one-liner does the conversion correctly:

$line = Unicode::String::utf8($line)->latin1();

Shailesh wrote:

> Never mind: the trademark symbol is a Microsoft-specific extension to the 
> Latin-1 character set and has no representation in 
> iso-8859-1.  That is why Perl is correctly erasing it during the 
> conversion.
> 
> http://casa.colorado.edu/~ajsh/iso8859-1.html
> 
> http://www.fourmilab.ch/webtools/demoroniser/
> 
> 
> Shailesh wrote:
> 
>> I have a file that contains the trademark character encoded in 
>> utf8/Unicode (i.e. it takes up 3 octets of space).  I can read this 
>> file into a string like this:
>>
>> $filename = "testutf8.txt";
>> $source = IO::File->new( $filename, 'r' );
>> binmode( $source, ':utf8' );
>> @filedata = <$source>;
>> $line = $filedata[0];
>>
>> The problem is, I want to write out the $line to another file encoded 
>> as iso-8859-1 (a.k.a latin-1 and extended US ASCII).  The trademark 
>> character should be translated to a single octet with value 153. I 
>> have tried encode/decode and pack/unpack with no success.  The 
>> trademark character either gets erased, or converted into three 
>> separate gibberish characters.  The only thing that seems to work is 
>> HTML::Entities::encode_entities, which correctly detects the trademark 
>> symbol in utf8, and converts it to the entity "&trade;".  Any help?
>>
>> I have attached the utf8 encoded file, which contains "abc|123", where 
>> '|' is the TM symbol.  Note that the file is 9 bytes, as expected. 
>> Below is the rest of the testing code.
>>
>> # Open a test output file
>> $dest = IO::File->new("outenc.txt" , 'w' );
>>
>> # Write a correct trademark symbol
>> $test = pack("C", 153);
>> print "char is: ".$test."\n";
>> $dest->write($test);
>> $dest->write("\n");
>>
>> # Write the utf8 line converted to iso-8859-1 -- HOW TO CONVERT IT?
>> $dest->write($line);
>>
>> $source->close();
>> $dest->close();
>>
>>
>> ------------------------------------------------------------------------
>>
>> abc™123


------------------------------

Date: Thu, 29 Apr 2004 21:44:35 GMT
From: Shailesh <shailesh@nothing.but.net>
Subject: Re: convert utf8 to latin-1/iso-8859-1
Message-Id: <7Hekc.41033$Vp5.40794@fe2.columbus.rr.com>

Shailesh wrote:
> This one-liner does the conversion correctly:
> 
> $line = Unicode::String::utf8($line)->latin1();

The above only seems to work for the trademark symbol.  This is more reliable:

use utf8;

s/([\x{80}-\x{FFFF}])//gse;


From: http://perl-xml.sourceforge.net/faq/#encoding_conversion
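
As archived, the substitution above has an empty replacement, so it simply deletes every character from U+0080 upward; the /e flag suggests a replacement expression may have been lost in transit. A hedged sketch, assuming the intent was to emit decimal numeric character references instead. The helper name to_numeric_refs is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: replace every character from U+0080 to U+FFFF
# with a decimal numeric character reference, leaving ASCII untouched.
sub to_numeric_refs {
    my ($text) = @_;
    $text =~ s/([\x{80}-\x{FFFF}])/'&#' . ord($1) . ';'/ge;
    return $text;
}

print to_numeric_refs("abc\x{2122}123\x{AE}"), "\n";
```

The result is pure ASCII, so it can be written out under iso-8859-1 (or any ASCII-compatible encoding) without losing the trademark sign, provided the consumer understands character references.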


------------------------------

Date: Thu, 29 Apr 2004 15:52:59 -0500
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: Count how many times find and replaced happened
Message-Id: <slrnc92qpb.3qo.tadmc@magna.augustmail.com>

Robin <webmaster@infusedlight> wrote:

> From: "Robin" <webmaster @ infusedlight . net>


Please choose one posting address and stick to it.


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: Thu, 29 Apr 2004 13:37:31 -0800
From: "Robin" <webmaster @ infusedlight . net>
Subject: Re: Count how many times find and replaced happened
Message-Id: <c6rsp9$lnp$1@reader2.nmix.net>


"Tad McClellan" <tadmc@augustmail.com> wrote in message
news:slrnc92qpb.3qo.tadmc@magna.augustmail.com...
> Robin <webmaster@infusedlight> wrote:
>
> > From: "Robin" <webmaster @ infusedlight . net>
>
>
> Please choose one posting address and stick to it.
yeah I did...sorry.
-Robin





------------------------------

Date: Thu, 29 Apr 2004 20:03:27 GMT
From: usenet_spam_cygwin <beau@oblios-cap.com>
Subject: cywin versus activestate on xp
Message-Id: <Pine.CYG.4.58.0404291259130.1808@beren>

Howdy.  A google search "activestate versus cygwin" didn't do me much
good, so I'm asking the fine folks at comp.lang.perl.misc: any clear
reason to use one over the other?  My objective, really, is as seamless as
possible an experience as I work my way through the llama book for the
first time.  Many thanks!  (My feelings won't be hurt by backchannel
responses if you feel it's not sufficiently on topic.)
-- 
beau


------------------------------

Date: Thu, 29 Apr 2004 21:45:06 GMT
From: Po Boy <a5ufv8u02@sneakemail.com>
Subject: generating time series graphs with perl
Message-Id: <pan.2004.04.29.21.46.50.587897@sneakemail.com>


I'm trying to find a perl module that will help me make a certain kind of
graph. I believe it's called a time-series graph, but I'm not sure. It's a
graph of how two variables behave over time. You can see a couple examples
at:

http://www.trendmacro.com/a/goodman/keyIndicators/pvCharting.asp
and
http://www.thestreet.com/comment/openbook/1332231.html

I have tried using gnuplot and the GD package for these graphs, but have
been unable to get either one to generate reasonable looking graphs. I
believe it's because the data that I'm graphing is not strictly described
by any function since some X value can have multiple Y values (just at
different times).

Has anyone had success in graphing this kind of data? Can you recommend a
module or some documentation or pointers that may help me out?

Looking forward to any help you can give me!

-pb



------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 6488
***************************************

