[30270] in Perl-Users-Digest
Perl-Users Digest, Issue: 1513 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue May 6 16:09:41 2008
Date: Tue, 6 May 2008 13:09:07 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Tue, 6 May 2008 Volume: 11 Number: 1513
Today's topics:
Error in Handling Unicode(UTF16-LE) File & String <iaminsik@gmail.com>
Re: Error in Handling Unicode(UTF16-LE) File & String <benkasminbullock@gmail.com>
Re: Error in Handling Unicode(UTF16-LE) File & String <benkasminbullock@gmail.com>
Re: interacting with another program that requires inpu <zentara@highstream.net>
Re: IPC::Open3 : Why can't I catch program output here? (Darren Dunham)
Re: IPC::Open3 : Why can't I catch program output here? <ben@morrow.me.uk>
Re: IPC::Open3 : Why can't I catch program output here? xhoster@gmail.com
Re: maximum hash/array keys/values <spamtrap@dot-app.org>
Re: perl GD Image resolution problem <zentara@highstream.net>
Re: perl GD Image resolution problem <zhilianghu@gmail.com>
perl PNG image searching <mazzawi@gmail.com>
Re: Process to fix a broken CPAN module? (Jens Thoms Toerring)
Re: Process to fix a broken CPAN module? <ramesh.thangamani@gmail.com>
retrieve info from searching site <uspensky@gmail.com>
Re: retrieve info from searching site <greymausg@mail.com>
Why doesn't Perl complain about this bareword? <ro.naldfi.scher@gmail.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Tue, 6 May 2008 01:00:50 -0700 (PDT)
From: iaminsik <iaminsik@gmail.com>
Subject: Error in Handling Unicode(UTF16-LE) File & String
Message-Id: <d9cdfbb7-8d70-41dd-a08d-493e160eaf67@w34g2000prm.googlegroups.com>
In most cases, I converted utf-16le files into utf-8 encoding.
But, I want to handle utf-16le files directly.
My first source is "read a line from utf-16le file and write it in
utf-16le encoding".
It works well.
==========================================================
use utf8;
use Encode;
open ($infile, "<:encoding(UTF-16LE):crlf", "unicodefile.dat");
binmode $infile;
open ($outfile, ">:raw:encoding(UTF-16LE):crlf", "unicodefile.out");
binmode $outfile;
while ($line = <$infile>)
{
print $outfile $line;
}
close($infile);
close($outfile);
==========================================================
Second source is "read one line, split it into array, and print array
by line in utf-16le encoding".
It seemed to work well, but some characters came out broken.
After a long web search, I gathered that Unicode::String could solve
this problem.
==========================================================
use utf8;
use Encode;
$\ = "\n";
open ($infile, "<:encoding(UTF-16LE):crlf", "unicodefile.dat");
binmode $infile;
open ($outfile, ">:raw:encoding(UTF-16LE):crlf", "unicodefile.out");
binmode $outfile;
while ($line = <$infile>)
{
chomp($line);
@words = split(/[ ]+/, $line);
foreach $word (@words)
{
print $outfile $word;
}
}
close($infile);
close($outfile);
==========================================================
Using Unicode::String, I made the third source, but still it doesn't
work.
It means "reading" is OK, but split function isn't.
Is there any solution?
==========================================================
use utf8;
use Encode;
use Unicode::String;
Unicode::String->stringify_as('utf16');
$\ = "\n";
open ($infile, "<:encoding(UTF-16LE):crlf", "unicodefile.dat");
binmode $infile;
open ($outfile, ">:raw:encoding(UTF-16LE):crlf", "unicodefile.out");
binmode $outfile;
while ($line = <$infile>)
{
chomp($line);
$sep = new Unicode::String ("[ ]+");
@words = split($sep, $line);
foreach $word (@words)
{
print $outfile $word;
}
}
close($infile);
close($outfile);
==========================================================
Best Regards.
Remi
------------------------------
Date: Tue, 6 May 2008 10:44:09 +0000 (UTC)
From: Ben Bullock <benkasminbullock@gmail.com>
Subject: Re: Error in Handling Unicode(UTF16-LE) File & String
Message-Id: <fvpcpp$2fs$1@ml.accsnet.ne.jp>
On Tue, 06 May 2008 01:00:50 -0700, iaminsik wrote:
> In most cases, I converted utf-16le files into utf-8 encoding. But, I
> want to handle utf-16le files directly.
>
> My first source is "read a line from utf-16le file and write it in
> utf-16le encoding".
> It works well.
No it doesn't. Your problems are all in the first file.
> open ($infile, "<:encoding(UTF-16LE):crlf", "unicodefile.dat");
> binmode $infile;
> open ($outfile, ">:raw:encoding(UTF-16LE):crlf", "unicodefile.out");
> binmode $outfile;
Do you know what binmode does? You'd better have another look at the
manual (perldoc -f binmode). The binmode statements here switch OFF all
the :raw:encoding(UTF... stuff you'd put in the previous lines, which
explains all the other problems you had.
To demonstrate, try this:
#!/usr/local/bin/perl
use warnings;
use strict;
use utf8;
use Encode;
binmode STDOUT, "utf8";
my $utf8 = "モンスター 自惚れ";
for (qw/file1 file2/) {
open (my $outfile, ">:raw:encoding(UTF-16LE):crlf", "$_.dat") or die $!;
binmode $outfile if /1/; # do what you did for file1 only
print $outfile $utf8;
close $outfile or die $!;
open (my $infile, "<:encoding(UTF-16LE):crlf", "$_.dat");
while (my $line = <$infile>)
{
print "$_: $line\n";
}
close($infile) or die $!;
}
The reason your code appeared to work is because you never did anything
with the data. It was actually just reading and writing it as bytes
without any knowledge of the encoding. As soon as you tried to manipulate
the data, the problem which had been there all along became visible.
P.S. use warnings; use strict; & check the values of open and close as
above.
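A minimal sketch of the "split" script with the bare binmode calls removed, so that the UTF-16LE layers stay in place. The temporary directory and the sample text are assumptions added so that it runs on its own. ":raw" is also added on the input side, on the assumption that this keeps a platform's default CRLF layer from sitting underneath the decoder.

```perl
use strict;
use warnings;
use utf8;                       # the sample text below is written in UTF-8
use File::Temp qw(tempdir);

# Assumption: scratch files stand in for unicodefile.dat / unicodefile.out.
my $dir      = tempdir(CLEANUP => 1);
my $in_path  = "$dir/unicodefile.dat";
my $out_path = "$dir/unicodefile.out";

# Create a small UTF-16LE input file with CRLF line endings.
open(my $seed, ">:raw:encoding(UTF-16LE):crlf", $in_path)
    or die "open $in_path: $!";
print {$seed} "モンスター 自惚れ\n", "alpha  beta\n";
close($seed) or die "close $in_path: $!";

# No bare binmode after these opens: that would strip the layers again.
open(my $infile, "<:raw:encoding(UTF-16LE):crlf", $in_path)
    or die "open $in_path: $!";
open(my $outfile, ">:raw:encoding(UTF-16LE):crlf", $out_path)
    or die "open $out_path: $!";

my @all_words;
while (my $line = <$infile>) {
    chomp $line;
    # $line holds decoded characters here, so split works per character.
    for my $word (split /[ ]+/, $line) {
        print {$outfile} "$word\n";
        push @all_words, $word;
    }
}
close($infile)  or die "close $in_path: $!";
close($outfile) or die "close $out_path: $!";

print scalar(@all_words), " words written\n";
```

Because the strings are decoded on the way in, a plain regex in split is enough and Unicode::String is not needed for this case.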
------------------------------
Date: Tue, 6 May 2008 10:57:39 +0000 (UTC)
From: Ben Bullock <benkasminbullock@gmail.com>
Subject: Re: Error in Handling Unicode(UTF16-LE) File & String
Message-Id: <fvpdj3$2o5$1@ml.accsnet.ne.jp>
On Tue, 06 May 2008 10:44:09 +0000, Ben Bullock wrote:
> open (my $infile, "<:encoding(UTF-16LE):crlf", "$_.dat");
> P.S. use warnings; use strict; & check the values of open and close as
> above.
Oops!
------------------------------
Date: Tue, 06 May 2008 07:09:52 -0400
From: zentara <zentara@highstream.net>
Subject: Re: interacting with another program that requires input
Message-Id: <6oe02419i4k65emp0t25b9kqcqbugdffj2@4ax.com>
On Mon, 5 May 2008 11:37:52 -0700 (PDT), whcchoi@gmail.com wrote:
>
>I need help with "NEED HELP HERE", if my cmd "ipa add" requires input
>of "abc", how can I provide that and interact with "ipa add"??
For bi-directional interaction you need IPC::Open3 , IPC::Open2,
or IPC::Run. Read "perldoc perlipc".
A rudimentary example:
#!/usr/bin/perl
use warnings;
use strict;
use IPC::Open3;
#interface to "bc" calculator
#my $pid = open3(\*WRITE, \*READ, \*ERROR,"bc");
my $pid = open3(\*WRITE, \*READ,0,"bc");
#if \*ERROR is false, STDERR is sent to STDOUT
while(1){
print "Enter expression for bc, i.e. 2 + 2\n";
chomp(my $query = <STDIN>);
#send query to bc
print WRITE "$query\n";
#give bc time to output
select(undef,undef,undef,.5);
#get the answer from bc
chomp(my $answer = <READ>);
print "$query = $answer\n";
}
waitpid($pid, 1);
# It is important to waitpid on your child process,
# otherwise zombies could be created.
__END__
#############################################
You can also add IO::Select to be fancier.
#!/usr/bin/perl
# Something like this:
# Its only drawback is that it only outputs 1 line of bc output
# so it errs on something like 234^12345 (which outputs a big number)
use warnings;
use strict;
use IPC::Open3;
use IO::Select;
#interface to "bc" calculator
my $pid = open3(\*WRITE, \*READ,\*ERROR,"bc");
my $sel = new IO::Select();
$sel->add(\*READ);
$sel->add(\*ERROR);
my($error,$answer)=('','');
while(1){
print "Enter expression for bc, i.e. 2 + 2\n";
chomp(my $query = <STDIN>);
#send query to bc
print WRITE "$query\n";
foreach my $h ($sel->can_read)
{
my $buf = '';
if ($h eq \*ERROR)
{
sysread(ERROR,$buf,4096);
if($buf){print "ERROR-> $buf\n"}
}
else
{
sysread(READ,$buf,4096);
if($buf){print "$query = $buf\n"}
}
}
}
waitpid($pid, 1);
__END__
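A sketch of the same request/response pattern without the fixed half-second pause, assuming the child answers every input line with exactly one output line. The child here is a hypothetical stand-in (the running perl upper-casing its input), not "ipa add" or bc, so that the example runs anywhere perl does.

```perl
use strict;
use warnings;
use IPC::Open2;
use IO::Handle;

# Hypothetical child: unbuffered, echoes each input line in upper case.
my @cmd = ($^X, '-e', '$| = 1; while (<STDIN>) { print uc($_) }');

my $pid = open2(my $from_child, my $to_child, @cmd);
$to_child->autoflush(1);    # make sure each query is sent immediately

my @answers;
for my $query ("abc\n", "hello\n") {
    print {$to_child} $query;         # send one line of input
    my $answer = <$from_child>;       # block until the child replies
    push @answers, $answer;
    print "sent: $query", "got:  $answer";
}

close $to_child;      # EOF on the child's stdin lets it exit
waitpid($pid, 0);     # reap the child so no zombie is left behind
```

If the child buffers its own output, or prints more than one line per query, this blocking read is not enough and something like the IO::Select loop is needed instead.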
zentara
--
I'm not really a human, but I play one on earth.
http://zentara.net/japh.html
------------------------------
Date: Tue, 06 May 2008 16:01:19 GMT
From: ddunham@taos.com (Darren Dunham)
Subject: Re: IPC::Open3 : Why can't I catch program output here?
Message-Id: <jh%Tj.15904$2g1.4887@nlpi068.nbdc.sbc.com>
Ben Morrow <ben@morrow.me.uk> wrote:
> I would normally say 'Use lexical filehandles!' at this point;
> unfortunately, IPC::Open3 was written before they existed and the
> obvious way
>
> my $pid = open3(my $CMD_IN, my $CMD_OUT, my $CMD_ERR, @cmd)...
>
> doesn't work (and can't be made to since undef is already meaningful).
What's wrong with the above?
The only issue I see is that Open3 will not autogenerate a filehandle
for stderr (instead it will combine stdout and stderr and place them
both on the $CMD_OUT filehandle). I actually use that method so I don't
have to select between two filehandles and can just read one at a time.
# capture STDOUT and STDERR, on a single filehandle
use IPC::Open3;
open3(my $cmd_in, my $cmd_out, undef, @cmd);
while (<$cmd_out>)
{
...
}
--
Darren Dunham ddunham@taos.com
Senior Technical Consultant TAOS http://www.taos.com/
Got some Dr Pepper? San Francisco, CA bay area
< This line left intentionally blank to confuse you. >
------------------------------
Date: Tue, 6 May 2008 17:00:19 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: IPC::Open3 : Why can't I catch program output here?
Message-Id: <jtb6f5-qg1.ln1@osiris.mauzo.dyndns.org>
Quoth Ilya Zakharevich <nospam-abuse@ilyaz.org>:
> [A complimentary Cc of this posting was NOT [per weedlist] sent to
> Ben Morrow
> <ben@morrow.me.uk>], who wrote in article
> <vrf4f5-d901.ln1@osiris.mauzo.dyndns.org>:
> > I would normally say 'Use lexical filehandles!' at this point;
> > unfortunately, IPC::Open3 was written before they existed and the
> > obvious way
> >
> > my $pid = open3(my $CMD_IN, my $CMD_OUT, my $CMD_ERR, @cmd)...
> >
> > doesn't work (and can't be made to since undef is already meaningful).
>
> I have no idea what you mean here.
>
> sub open3 ($$$$@) {
> $_[0] = IO::Handle->new unless defined $_[0]; # Or whatever is THE initializer
> ...
> }
Yeah, in general you can do that; however, for the specific case of
open3, the docs say
If CHLD_ERR is false, or the same file descriptor as CHLD_OUT, then
STDOUT and STDERR of the child are on the same filehandle.
so at least the CHLD_ERR filehandle can't be autovivified. Presumably it
is felt that this behaviour can't be changed.
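A sketch of the usual workaround: hand open3 a pre-made glob from Symbol::gensym for the error handle, so that the argument is true and stderr stays separate, while the other two handles are still autovivified lexicals. The child command is a hypothetical stand-in that writes one line to each stream.

```perl
use strict;
use warnings;
use IPC::Open3;
use Symbol qw(gensym);

# Hypothetical child: the running perl writes one line to each stream.
my @cmd = ($^X, '-e',
    'print STDOUT "to stdout\n"; print STDERR "to stderr\n";');

# $cmd_err must already hold a handle; an undef value would be
# taken as "send stderr to the same place as stdout".
my $pid = open3(my $cmd_in, my $cmd_out, my $cmd_err = gensym, @cmd);
close $cmd_in;                  # nothing to send to the child

# Fine for small outputs; large ones need select() to avoid deadlock.
my @out = <$cmd_out>;
my @err = <$cmd_err>;

waitpid($pid, 0);
my $exit = $? >> 8;

print "stdout: @out", "stderr: @err", "exit: $exit\n";
```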
Ben
--
"If a book is worth reading when you are six, * ben@morrow.me.uk
it is worth reading when you are sixty." [C.S.Lewis]
------------------------------
Date: 06 May 2008 16:07:52 GMT
From: xhoster@gmail.com
Subject: Re: IPC::Open3 : Why can't I catch program output here?
Message-Id: <20080506120753.993$9E@newsreader.com>
ddunham@taos.com (Darren Dunham) wrote:
> Ben Morrow <ben@morrow.me.uk> wrote:
> > I would normally say 'Use lexical filehandles!' at this point;
> > unfortunately, IPC::Open3 was written before they existed and the
> > obvious way
> >
> > my $pid = open3(my $CMD_IN, my $CMD_OUT, my $CMD_ERR, @cmd)...
> >
> > doesn't work (and can't be made to since undef is already meaningful).
>
> What's wrong with the above?
It causes cmd's stderr to go to $CMD_OUT rather than the obviously
intended $CMD_ERR.
>
> The only issue I see is that Open3 will not autogenerate a filehandle
> for stderr (instead it will combine stdout and stderr and place them
> both on the $CMD_OUT filehandle).
And this is obviously a problem if that is not what you want to happen.
Xho
--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
------------------------------
Date: Tue, 06 May 2008 11:12:51 -0400
From: Sherman Pendley <spamtrap@dot-app.org>
Subject: Re: maximum hash/array keys/values
Message-Id: <m163tro498.fsf@dot-app.org>
Slickuser <slick.users@gmail.com> writes:
> What's the maximum hash/array in Perl can hold for keys and values?
There is an internal limit to the number of items, but the limit matches
that of a pointer. That is, on 32-bit Perl the limit is 2^32 items, and
the available address space is 2^32 bytes. Since Perl's scalars are always
at least 4 bytes wide, there will therefore never be enough memory to fill
a Perl array or hash to its theoretical maximum capacity.
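The arithmetic behind this, as a small sketch for the 32-bit case. The 4-byte figure is the lower bound quoted above, not a measured size of a real scalar.

```perl
use strict;
use warnings;

my $address_space      = 2**32;   # bytes addressable by a 32-bit pointer
my $max_items          = 2**32;   # theoretical item limit on 32-bit Perl
my $min_bytes_per_item = 4;       # lower bound per scalar, as stated above

# Memory needed to fill a container to its theoretical limit.
my $needed = $max_items * $min_bytes_per_item;
my $ratio  = $needed / $address_space;

print "needed is $ratio times the address space\n";
```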
sherm--
--
My blog: http://shermspace.blogspot.com
Cocoa programming in Perl: http://camelbones.sourceforge.net
------------------------------
Date: Tue, 06 May 2008 07:15:11 -0400
From: zentara <zentara@highstream.net>
Subject: Re: perl GD Image resolution problem
Message-Id: <95f0241cpusaoagm22lspj77jh9ktaq6af@4ax.com>
On Mon, 5 May 2008 20:28:15 -0700 (PDT), Zhiliang Hu
<zhilianghu@gmail.com> wrote:
>I use GD::Image to create figures. Here is an example:
>http://sphinx.vet.unimelb.edu.au/QTLdb/tmp/map490151833.png
>
>For publication purposes we need high resolution pictures. I am
>seeking expert advice as how can I improve the resolution?
>
>Thanks in advance!
>
>Zhiliang
Use a camera with more mega-pixels, and set it to the highest
resolution. ( Usually the largest picture, allowing fewest pictures
in storage).
If you take a low resolution photo, and use a program to make it larger,
each pixel will be duplicated with the same color, to accommodate the
larger size. So the picture will be larger, but the resolution will not
improve.
Resolution comes from the camera's megapixel count and settings.
zentara
--
I'm not really a human, but I play one on earth.
http://zentara.net/japh.html
------------------------------
Date: Tue, 6 May 2008 08:29:09 -0700 (PDT)
From: Zhiliang Hu <zhilianghu@gmail.com>
Subject: Re: perl GD Image resolution problem
Message-Id: <46aca95f-290f-4a52-ad05-477dd3d249fc@34g2000hsh.googlegroups.com>
Many thanks for all your hints. That'll be useful.
--
Zhiliang
------------------------------
Date: Tue, 6 May 2008 11:02:22 -0700 (PDT)
From: elie <mazzawi@gmail.com>
Subject: perl PNG image searching
Message-Id: <fdbb47f5-4269-4cca-bfc2-48e42d224eb6@34g2000hsh.googlegroups.com>
Hello,
so I have some images (all PNG)
I have a small image, ( a checkered box) and other larger images that
may or may not contain the checkered box small image.
I want to somehow find out if the small image (the checkered box)
appears anywhere in the larger PNG's or not.
any hints would be appreciated.
Regards,
------------------------------
Date: 6 May 2008 08:59:40 GMT
From: jt@toerring.de (Jens Thoms Toerring)
Subject: Re: Process to fix a broken CPAN module?
Message-Id: <68akvsF2r8td5U1@mid.uni-berlin.de>
google@obmac.org wrote:
> I have found a major flaw in a CPAN package. The package is
> File::Binary and it has big endian and little endian unpack-ing
> backwards. Likewise, all the test files are reversed so that the
> tests all pass!
> Why do I believe this is true?
> From the perl manual on pack:
> n,N unpacks a 16 or 32 bit integer in "network" or big endian order
> v,V unpacks a 16 or 32 bit integer in "VAX" or little endian order
> From the code:
> if ($endian == $BIG_ENDIAN) {
> $self->{_ui16} = 'v';
> $self->{_ui32} = 'V';
> } else {
> $self->{_ui16} = 'n';
> $self->{_ui32} = 'N';
> }
> When I 'od -x' the test files, they are clearly reversed:
> od -x be.power10.n32.ints
> 0000000 ffff ffff f6ff ffff 9cff ffff 18fc
I hope you do realize that 'od -x' does reverse the bytes when
used on a little endian system. Just create a file that only
contains two letters, e.g. first 'a' and then 'b'. Now look at
the file with 'od -x' and you will find (at least if you're
on a little endian system)
0000000 6261
i.e. the 'b' seems to come first and only then the 'a'.
That's because the '-x' option makes od deal with two
bytes at once. If you want to see what's really in the
file in a byte-by-byte fashion use instead
od -t x1 filename
(or e.g. load the file into emacs and switch to hexl-mode).
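The same effect can be reproduced from Perl with unpack, as a small sketch. The two-byte string stands in for the 'a', 'b' file described above; 'v' reads the pair as a little-endian 16-bit word, which is what 'od -x' shows on a little-endian host, and 'n' reads it as big-endian.

```perl
use strict;
use warnings;

my $bytes = "ab";    # file contents: byte 0x61 followed by byte 0x62

# Byte-by-byte view, comparable to `od -t x1`.
my $per_byte = join ' ', map { sprintf '%02x', $_ } unpack 'C*', $bytes;

# The same two bytes read as one 16-bit little-endian word.
my $le_word = sprintf '%04x', unpack 'v', $bytes;

# The same two bytes read as one 16-bit big-endian ("network") word.
my $be_word = sprintf '%04x', unpack 'n', $bytes;

print "bytes:         $per_byte\n";
print "little-endian: $le_word\n";
print "big-endian:    $be_word\n";
```

The bytes in the file never change; only the grouping used to display them does.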
> OK, so there is a mistake here -- I would like to submit a fix to all
> this -- my question is how do I go about doing this? I have contacted
> the author, but no response.
It looks as if the module is actively maintained (the last
version was uploaded on April 1st, 2008). How long did you
give the author to reply? Did you consider that he could
be on vacation and not able to read emails at the moment?
Best regards, Jens
--
\ Jens Thoms Toerring ___ jt@toerring.de
\__________________________ http://toerring.de
------------------------------
Date: Tue, 6 May 2008 05:14:43 -0700 (PDT)
From: rthangam <ramesh.thangamani@gmail.com>
Subject: Re: Process to fix a broken CPAN module?
Message-Id: <62505649-83b6-4c5b-ac7c-68cb08667616@y22g2000prd.googlegroups.com>
Jens Thoms Toerring wrote:
> google@obmac.org wrote:
> > I have found a major flaw in a CPAN package. The package is
> > File::Binary and it has big endian and little endian unpack-ing
> > backwards. Likewise, all the test files are reversed so that the
> > tests all pass!
>
> > Why do I believe this is true?
>
> > From the perl manual on pack:
> > n,N unpacks a 16 or 32 bit integer in "network" or big endian order
> > v,V unpacks a 16 or 32 bit integer in "VAX" or little endian order
>
> > From the code:
> > if ($endian == $BIG_ENDIAN) {
> > $self->{_ui16} = 'v';
> > $self->{_ui32} = 'V';
> > } else {
> > $self->{_ui16} = 'n';
> > $self->{_ui32} = 'N';
> > }
>
> > When I 'od -x' the test files, they are clearly reversed:
>
> > od -x be.power10.n32.ints
> > 0000000 ffff ffff f6ff ffff 9cff ffff 18fc
>
> I hope you do realize that 'od -x' does reverse the bytes when
> used on a little endian system. Just create a file that only
> contains two letters, e.g. first 'a' and then 'b'. Now look at
> the file with 'od -x' and you will find (at least if you're
> on a little endian system)
>
> 0000000 6261
>
> i.e. the 'b' seems to come first and only then the 'a'.
> That's because the '-x' option makes od deal with two
> bytes at once. If you want to see what's really in the
> file in a byte-by-byte fashion use instead
>
> od -t x1 filename
>
> (or e.g. load the file into emacs and switch to hexl-mode).
>
> > OK, so there is a mistake here -- I would like to submit a fix to all
> > this -- my question is how do I go about doing this? I have contacted
> > the author, but no response.
>
> It looks as if the module is actively maintained (the last
> version was uploaded on April 1st, 2008). How long did you
> give the author to reply? Did you consider that he could
> be on vacation and not able to read emails at the moment?
>
> Best regards, Jens
> --
> \ Jens Thoms Toerring ___ jt@toerring.de
> \__________________________ http://toerring.de
Which CPAN module are you talking about? Every CPAN module has a
'Report a bug' link where you can log a bug. Try it; you may get a
response.
------------------------------
Date: Tue, 6 May 2008 08:24:02 -0700 (PDT)
From: batman <uspensky@gmail.com>
Subject: retrieve info from searching site
Message-Id: <b242cb61-c046-4720-8164-8c09e588285c@t54g2000hsg.googlegroups.com>
there is a website that contains 2 search input fields: ID and Lastname
i can input to search using either one but lastname uses fuzzy logic.
if more than one result comes back (like if i pass in % for last
name), i get a selection list of all the people.
i can then click on a single person and click the 'info' button to get
a table returned back with all the pertinent info about that person.
i'm trying to come up with a way to do the above and then cycle
through each person in the selection list and get 'info' and pull the
returned table into a local SQL database.
im familiar with programming but never done any perl before-
anybody have an idea?
------------------------------
Date: 6 May 2008 17:43:56 GMT
From: greymaus <greymausg@mail.com>
Subject: Re: retrieve info from searching site
Message-Id: <slrng216br.bv2.greymausg@maus.org>
On 2008-05-06, batman <uspensky@gmail.com> wrote:
> there is a website that contains 2 search input fields: ID and Lastname
>
> i can input to search using either one but lastname uses fuzzy logic.
>
> if more than one result comes back (like if i pass in % for last
> name), i get a selection list of all the people.
>
> i can then click on a single person and click the 'info' button to get
> a table returned back with all the pertinent info about that person.
>
> i'm trying to come up with a way to do the above and then cycle
> through each person in the selection list and get 'info' and pull the
> returned table into a local SQL database.
>
> im familiar with programming but never done any perl before-
> anybody have an idea?
perl module WWW::Mechanize might help
--
Greymaus
Anything that can not kill you is a boring experience.
------------------------------
Date: Tue, 6 May 2008 04:25:10 -0700 (PDT)
From: Ronny <ro.naldfi.scher@gmail.com>
Subject: Why doesn't Perl complain about this bareword?
Message-Id: <47c82be2-6151-416e-87f5-8898dae02d5b@x41g2000hsb.googlegroups.com>
By chance I found out that no error is issued on the following
program:
perl -w -e 'use strict; print(Does::Not::Exist,"\n")'
Instead, "Does::Not::Exist" is printed. Shouldn't there be a warning
about the improper use of a bareword? Similarly, the program
perl -w -e 'use strict; system(Does::Not::Exist,"\n")'
results in the message
Can't exec "Does::Not::Exist": No such file or directory at -e line 1.
which too seems to suggest that Does::Not::Exist is simply interpreted
as string.
But when I use it like this:
perl -w -e 'use strict; print(ref(Does::Not::Exist),"\n")'
I get the more reasonable:
Bareword "Does::Not::Exist" not allowed while "strict subs" in use at -e line 1.
Why is this bareword treated differently in these contexts?
Ronald
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 1513
***************************************