[31781] in Perl-Users-Digest
Perl-Users Digest, Issue: 3044 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sun Jul 25 21:09:28 2010
Date: Sun, 25 Jul 2010 18:09:11 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Sun, 25 Jul 2010 Volume: 11 Number: 3044
Today's topics:
Re: exist function in perl 5.12.1 <nospam-abuse@ilyaz.org>
Re: exist function in perl 5.12.1 <nospam-abuse@ilyaz.org>
Re: exist function in perl 5.12.1 <willem@turtle.stack.nl>
Re: exist function in perl 5.12.1 sln@netherlands.com
Re: exist function in perl 5.12.1 <nospam-abuse@ilyaz.org>
Re: FAQ 5.29 How can I read in an entire file all at on <tw@dionic.net>
Re: FAQ 5.29 How can I read in an entire file all at on <hjp-usenet2@hjp.at>
Re: FAQ 5.29 How can I read in an entire file all at on <uri@StemSystems.com>
Re: FAQ 5.29 How can I read in an entire file all at on <tw@dionic.net>
Re: FAQ 5.29 How can I read in an entire file all at on <uri@StemSystems.com>
Re: FAQ 5.29 How can I read in an entire file all at on <ben@morrow.me.uk>
Re: FAQ 8.35 How do I close a process's filehandle with <ben@morrow.me.uk>
Help with regular expression <markhobley@yahoo.donottypethisbit.co>
Re: Help with regular expression <hjp-usenet2@hjp.at>
Re: Help with regular expression <ben@morrow.me.uk>
Re: Help with regular expression (Jens Thoms Toerring)
Re: Speed of reading some MB of data using qx(...) <ben@morrow.me.uk>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Sun, 25 Jul 2010 11:26:09 +0000 (UTC)
From: Ilya Zakharevich <nospam-abuse@ilyaz.org>
Subject: Re: exist function in perl 5.12.1
Message-Id: <slrni4o7qh.dfg.nospam-abuse@powdermilk.math.berkeley.edu>
On 2010-07-24, Uri Guttman <uri@StemSystems.com> wrote:
> >> exists was just a function that provided no information
> >> to the hash expression it was checking.
>
> IZ> ... which in 99.9999% of cases is not what the intent was.
>
> and writing a deep exists is such an easy task anyhow.
Technically: yes, and this is why Perl should provide this itself.
Yours,
Ilya
------------------------------
Date: Sun, 25 Jul 2010 11:28:03 +0000 (UTC)
From: Ilya Zakharevich <nospam-abuse@ilyaz.org>
Subject: Re: exist function in perl 5.12.1
Message-Id: <slrni4o7u3.dfg.nospam-abuse@powdermilk.math.berkeley.edu>
On 2010-07-24, Dr.Ruud <rvtol+usenet@xs4all.nl> wrote:
> Dilbert wrote:
>>> On Jul 23, 10:40 am, Dilbert <dilbert1...@gmail.com> wrote:
>
>> http://perldoc.perl.org/5.8.8/functions/exists.html
>>
>>>> This surprising autovivification in what does not at first -- or even
>>>> second -- glance appear to be an lvalue context may be fixed in
>>>> a future release.
>>
>> Let's hope that the surprising autovivification will be fixed in Perl
>> 5.14
>
> It is not surprising,
For each person for whom it is not surprising, how many for which it is?
Ilya
------------------------------
Date: Sun, 25 Jul 2010 12:02:39 +0000 (UTC)
From: Willem <willem@turtle.stack.nl>
Subject: Re: exist function in perl 5.12.1
Message-Id: <slrni4o9uv.2ddq.willem@turtle.stack.nl>
<about exists autovivifying intermediates>
Ilya Zakharevich wrote:
) For each person for whom it is not surprising, how many for which it is?
For each person for whom it is surprising, how many for which it is not?
SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
------------------------------
Date: Sun, 25 Jul 2010 13:56:51 -0700
From: sln@netherlands.com
Subject: Re: exist function in perl 5.12.1
Message-Id: <lh8p46hli03ufiibcd5hej51hmh3h74erc@4ax.com>
On Fri, 23 Jul 2010 09:40:06 -0700 (PDT), Dilbert <dilbert1999@gmail.com> wrote:
>In perl 5.12.1, with reference to the exists function "perldoc -f
>exists" ( see also http://perldoc.perl.org/functions/exists.html ) it
>says
>
>>> [...]
>>> Although the most deeply nested array or hash will
>>> not spring into existence just because its existence
>>> was tested, any intervening ones will. Thus $ref->{"A"}
>>> and $ref->{"A"}->{"B"} will spring into existence due to
>>> the existence test for the $key element above.
>>> [...]
>>> This surprising autovivification in what does not at first
>>> --or even second-- glance appear to be an lvalue context
>>> may be fixed in a future release.
>
>Has this particular case of surprising autovivification always
>existed, even in perl 5.10 or 5.8 ?
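For reference, the behaviour described in the quoted documentation can be reproduced with a minimal sketch like the one below. The hash and variable names are made up for the example. It shows the intervening hashes appearing after a failed exists test, then one possible guard that checks each level before descending.

```perl
use strict;
use warnings;

my $ref = {};

# Testing a deeply nested key: the test fails, but the
# intervening hashes $ref->{A} and $ref->{A}{B} get created.
my $found   = exists $ref->{A}{B}{C} ? 1 : 0;
my $a_there = exists $ref->{A}       ? 1 : 0;
my $b_there = exists $ref->{A}{B}    ? 1 : 0;

print "found = $found, A exists = $a_there, B exists = $b_there\n";

# One way to avoid this: check each level is a hash before descending.
# The && chain short-circuits, so nothing below a missing level is touched.
my $safe = {};
my $safe_found =
    (    ref( $safe->{A} )    eq 'HASH'
      && ref( $safe->{A}{B} ) eq 'HASH'
      && exists $safe->{A}{B}{C} ) ? 1 : 0;

print "safe_found = $safe_found, keys left behind = ",
      scalar( keys %$safe ), "\n";
```

The first print should report found = 0 with both A and B existing; the guarded version leaves the hash empty.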
This may be a workaround (surprisingly nitpicky to do).
-sln
-------------------
use strict;
use warnings;
use Data::Dumper;

my $x;
$x->{''}{''}{2}   = '';
$x->{a}{''}{c}{d} = undef;
print Dumper( $x );

print "1> pass = ", scalar deep_exists( $x, 'a', undef, 'c' ), "\n";
print "2> pass = ", scalar deep_exists( $x, '', undef ), "\n";
print "3> pass = ", scalar deep_exists( $x, '', undef, '2', '' ), "\n";
my ($pass, $found) = deep_exists( $x, 'a', '', 'c', 'd' );
print "4> pass = $pass, found = $found\n";
print "5> pass = ", scalar deep_exists( $x ), "\n";
print "6> pass = ", scalar deep_exists(), "\n";
exit 0;

##
# Walk $t down through the given keys without autovivifying anything.
# An undef key is treated as the empty string.
# Scalar context: returns the pass flag.
# List context: returns (pass flag, number of keys that matched).
sub deep_exists {
    return wantarray ? (0, 0) : 0 unless @_;
    my $t = shift;
    my $count = map {
        my $key = $_ // '';
        # Descend only when the current level is a hash holding the key;
        # a failed step contributes nothing to the count.
        ( ref($t) eq "HASH" and exists $t->{$key} )
            ? ( $t = $t->{$key} )
            : ()
    } @_;
    my $res = (@_ && $count == @_) ? 1 : 0;
    return wantarray ? ($res, $count) : $res;
}
------------------------------
Date: Sun, 25 Jul 2010 21:08:30 +0000 (UTC)
From: Ilya Zakharevich <nospam-abuse@ilyaz.org>
Subject: Re: exist function in perl 5.12.1
Message-Id: <slrni4p9ue.p32.nospam-abuse@powdermilk.math.berkeley.edu>
On 2010-07-25, Willem <willem@turtle.stack.nl> wrote:
> ) For each person for whom it is not surprising, how many for which it is?
> For each person for whom it is surprising, how many for which it is not?
0, up to experiment's errors. But you know this already; why ask?
Puzzled,
Ilya
------------------------------
Date: Sun, 25 Jul 2010 14:35:03 +0100
From: Tim Watts <tw@dionic.net>
Subject: Re: FAQ 5.29 How can I read in an entire file all at once?
Message-Id: <i2hei7$q8f$1@news.eternal-september.org>
Uri Guttman <uri@StemSystems.com>
wibbled on Friday 23 July 2010 23:15
> that is true. the readonly aspect of a mmap slurp is a win. but given
> the small sizes of most files slurped it isn't that large a win. today
> we have 4k or larger page sizes and many files are smaller than
> that. ram and vram are cheap as hell so fighting for each byte is a long
> lost art that needs to die. :)
Yes that would be true of small files.
But what if you're dealing with 1GB files or just multi MB files? This is
extremely likely if you were processing video or scientific data (ignoring
the fact that you probably wouldn't be using perl for either!)
--
Tim Watts
Managers, politicians and environmentalists: Nature's carbon buffer.
------------------------------
Date: Sun, 25 Jul 2010 16:16:43 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: FAQ 5.29 How can I read in an entire file all at once?
Message-Id: <slrni4ohqb.2df.hjp-usenet2@hrunkner.hjp.at>
On 2010-07-25 13:35, Tim Watts <tw@dionic.net> wrote:
> Uri Guttman <uri@StemSystems.com>
> wibbled on Friday 23 July 2010 23:15
>> that is true. the readonly aspect of a mmap slurp is a win. but given
>> the small sizes of most files slurped it isn't that large a win.
>
> Yes that would be true of small files.
>
> But what if you're dealing with 1GB files or just multi MB files? This is
> extremely likely if you were processing video or scientific data (ignoring
> the fact that you probably wouldn't be using perl for either!)
Perl was used in the Human Genome project.
hp, who also routinely processes files in the range of a few GB.
------------------------------
Date: Sun, 25 Jul 2010 11:38:21 -0400
From: "Uri Guttman" <uri@StemSystems.com>
Subject: Re: FAQ 5.29 How can I read in an entire file all at once?
Message-Id: <87sk37fu6q.fsf@quad.sysarch.com>
>>>>> "TW" == Tim Watts <tw@dionic.net> writes:
TW> Uri Guttman <uri@StemSystems.com>
TW> wibbled on Friday 23 July 2010 23:15
>> that is true. the readonly aspect of a mmap slurp is a win. but given
>> the small sizes of most files slurped it isn't that large a win. today
>> we have 4k or larger page sizes and many files are smaller than
>> that. ram and vram are cheap as hell so fighting for each byte is a long
>> lost art that needs to die. :)
TW> Yes that would be true of small files.
TW> But what if you're dealing with 1GB files or just multi MB files?
TW> This is extremely likely if you were processing video or
TW> scientific data (ignoring the fact that you probably wouldn't be
TW> using perl for either!)
and your point is?
and someone else pointed out that perl was and is used for genetic
work. ever heard of bioperl? it is a very popular package for
biogenetics. look for the article about perl saving the human genome
project (that was done by the author of cgi.pm!). of course those
systems don't slurp in those enormous data files. but they can always
slurp in the smaller (for some definition of smaller) config, control,
and other files.
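
A minimal sketch of that split, assuming two hypothetical helper names (slurp_file and count_lines_streaming are not from the thread): small files are read whole, while large ones are walked a line at a time so that memory use stays flat.

```perl
use strict;
use warnings;

# Read a whole (small) file into one scalar.
sub slurp_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "cannot open $path: $!";
    local $/;                 # undef record separator: read to EOF
    my $content = <$fh>;
    close $fh;
    return $content;
}

# Process a (large) file one line at a time; here we only count lines.
sub count_lines_streaming {
    my ($path) = @_;
    open my $fh, '<', $path or die "cannot open $path: $!";
    my $lines = 0;
    $lines++ while <$fh>;
    close $fh;
    return $lines;
}

# Usage (file names are placeholders):
#   my $config = slurp_file('app.conf');
#   my $n      = count_lines_streaming('huge_dataset.txt');
```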
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
------------------------------
Date: Sun, 25 Jul 2010 21:47:06 +0100
From: Tim Watts <tw@dionic.net>
Subject: Re: FAQ 5.29 How can I read in an entire file all at once?
Message-Id: <i2i7sb$b9s$1@news.eternal-september.org>
Uri Guttman <uri@StemSystems.com>
wibbled on Sunday 25 July 2010 16:38
>>>>>> "TW" == Tim Watts <tw@dionic.net> writes:
>
> TW> Uri Guttman <uri@StemSystems.com>
> TW> wibbled on Friday 23 July 2010 23:15
>
>
> >> that is true. the readonly aspect of a mmap slurp is a win. but given
> >> the small sizes of most files slurped it isn't that large a win.
> >> today we have 4k or larger page sizes and many files are smaller than
> >> that. ram and vram are cheap as hell so fighting for each byte is a
> >> long lost art that needs to die. :)
>
> TW> Yes that would be true of small files.
>
> TW> But what if you're dealing with 1GB files or just multi MB files?
> TW> This is extremely likely if you were processing video or
> TW> scientific data (ignoring the fact that you probably wouldn't be
> TW> using perl for either!)
>
> and your point is?
>
> and someone else pointed out that perl was and is used for genetic
> work. ever heard of bioperl? it is a very popular package for
> biogenetics. look for the article about perl saving the human genome
> project (that was done by the author of cgi.pm!). of course those
> systems don't slurp in those enormous data files.
^^^ That was my point.
I may have misread previous posts, but I was reading them as "mmap is no
more efficient than slurping", which of course is not generally true (though
it makes little practical difference, as you say, for small files).
BTW - I am surprised the genome project was done in perl. I *would* have
thought, even from a perl fanboi perspective, that C would have been
somewhat faster and the amount of data would have made it worth optimising
the project even at the expense of simplicity. I shall have to read up on
that.
> but they can always
> slurp in the smaller (for some definition of smaller) config, control,
> and other files.
>
> uri
>
--
Tim Watts
Managers, politicians and environmentalists: Nature's carbon buffer.
------------------------------
Date: Sun, 25 Jul 2010 17:51:52 -0400
From: "Uri Guttman" <uri@StemSystems.com>
Subject: Re: FAQ 5.29 How can I read in an entire file all at once?
Message-Id: <874ofncjrb.fsf@quad.sysarch.com>
>>>>> "TW" == Tim Watts <tw@dionic.net> writes:
TW> BTW - I am surprised the genome project was done in perl. I
TW> *would* have thought, even from a perl fanboi perspective, that C
TW> would have been somewhat faster and the amount of data would have
TW> made it worth optimising the project even at the expense of
TW> simplicity. I shall have to read up on that.
the article i referred to can likely be found. it wasn't that the whole
project was done in perl. the issue was worldwide they ended up with
about 14 different data formats and they couldn't share it with each
other. so this one guy (as i said author of cgi.pm and several perl
books) wrote modules to convert each format to/from a common format
which allowed full sharing of data. that 'saved' the project from its
babel hell. since then, perl is a major language used in biogen both for
having bioperl and for its great string and regex support. c sucks for
both of those and its faster run speed loses out to perl's much better
development time.
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
------------------------------
Date: Sun, 25 Jul 2010 23:20:02 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: FAQ 5.29 How can I read in an entire file all at once?
Message-Id: <itpuh7-66m2.ln1@osiris.mauzo.dyndns.org>
Quoth "Peter J. Holzer" <hjp-usenet2@hjp.at>:
>
> I wish Perl would fight for each byte at the low level. The overhead for
> each scalar, array element or hash element is enormous, and these really
> add up if you have enough of them.
Take a look at sv.h from a recent perl sometime. The contortions it goes
through to avoid allocating anything not absolutely needed are quite
incredible (and, incidentally, make C-level debugging really rather
hard).
Ben
------------------------------
Date: Sun, 25 Jul 2010 23:27:07 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: FAQ 8.35 How do I close a process's filehandle without waiting for it to complete?
Message-Id: <raquh7-66m2.ln1@osiris.mauzo.dyndns.org>
Quoth "Peter J. Holzer" <hjp-usenet2@hjp.at>:
> On 2010-07-25 04:00, PerlFAQ Server <brian@theperlreview.com> wrote:
> > 8.35: How do I close a process's filehandle without waiting for it to
> > complete?
> >
> > Assuming your system supports such things, just send an appropriate
> > signal to the process (see "kill" in perlfunc). It's common to first
> > send a TERM signal, wait a little bit, and then send a KILL signal to
> > finish it off.
>
> To me "closing a file handle" and "killing a process" are two completely
> different concepts. If somebody asks this question and it turns out that
> killing the process is what they need to do then I smell an XY problem.
The question would be better phrased as 'How do I close a piped-open
filehandle without waiting for the process on the other end to terminate
on its own?', or something like that. When you close a piped-open
filehandle, perl will wait(2) for the process it started, which may take
forever if the process has got stuck somehow.
I can easily see this being a FAQ by people who start with 'why is my
filehandle taking forever to close', not realising that their question
should be 'how do I kill this child process': that is, the FAQ is
attempting to *address* a (once?) common XY problem. It might be better
for the answer to explain the problem in more detail, making it clear
that the filehandle is not really part of the problem.
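
A minimal sketch of the situation, assuming a Unix-like system: the child here is just the running perl ($^X) sleeping for 30 seconds, standing in for a stuck process. Without the kill, the close would block until the child exits on its own.

```perl
use strict;
use warnings;

# Piped open: open returns the child's process ID.
my $pid = open( my $fh, '-|', $^X, '-e', 'sleep 30' )
    or die "cannot fork: $!";

my $start = time;

kill 'TERM', $pid;   # ask the child to exit
close $fh;           # waits for the child, which is now gone or going

my $elapsed = time - $start;
print "close returned after ${elapsed}s, wait status $?\n";
```

A child that ignores TERM would still hang the close, which is why the FAQ suggests following up with KILL after a short wait.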
Ben
------------------------------
Date: Sun, 25 Jul 2010 19:19:33 +0000 (UTC)
From: Mark Hobley <markhobley@yahoo.donottypethisbit.co>
Subject: Help with regular expression
Message-Id: <i2i2o4$14uv$1@adenine.netfront.net>
I need a regular expression with the following properties.
I need to match text (typically, though not necessarily expressions)
enclosed within double parentheses. However, I do not want to match nested
single parentheses enclosed text.
So ((*)) is a match, but ((*)*(*)) is not a match.
Here are some examples to illustrate this.
((FOO)) - This is a match
(()) - This is a match
((3 + 2)) - This is a match
((3 + 2) + (2 * foo)) - This is not a match
((3 * bar) + ((foo))) - This is a match
((3 * bar) + ((foo))bar) - This is a match.
I hope that lot makes sense.
Thanks in advance to anyone who can help.
--
Mark Hobley
Linux User: #370818 http://markhobley.yi.org/
--- news://freenews.netfront.net/ - complaints: news@netfront.net ---
------------------------------
Date: Sun, 25 Jul 2010 22:04:25 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: Help with regular expression
Message-Id: <slrni4p669.693.hjp-usenet2@hrunkner.hjp.at>
On 2010-07-25 19:19, Mark Hobley <markhobley@yahoo.donottypethisbit.co> wrote:
> I need a regular expression with the following properties.
> I need to match text (typically, though not necessarily expressions)
> enclosed within double parentheses. However, I do not want to match nested
> single parentheses enclosed text.
>
> So ((*)) is a match, but ((*)*(*)) is not a match.
> Here are some examples to illustrate this.
>
> ((FOO)) - This is a match
> (()) - This is a match
> ((3 + 2)) - This is a match
> ((3 + 2) + (2 * foo)) - This is not a match
> ((3 * bar) + ((foo))) - This is a match
> ((3 * bar) + ((foo))bar) - This is a match.
Is this a match?
(((1 + 2) * (3 +4)))
hp
------------------------------
Date: Sun, 25 Jul 2010 23:29:26 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Help with regular expression
Message-Id: <6fquh7-66m2.ln1@osiris.mauzo.dyndns.org>
Quoth Mark Hobley <markhobley@yahoo.donottypethisbit.co>:
> I need a regular expression with the following properties.
> I need to match text (typically, though not necessarily expressions)
> enclosed within double parentheses. However, I do not want to match nested
> single parentheses enclosed text.
>
> So ((*)) is a match, but ((*)*(*)) is not a match.
> Here are some examples to illustrate this.
>
> ((FOO)) - This is a match
> (()) - This is a match
> ((3 + 2)) - This is a match
> ((3 + 2) + (2 * foo)) - This is not a match
> ((3 * bar) + ((foo))) - This is a match
> ((3 * bar) + ((foo))bar) - This is a match.
Is there something wrong with /\(\([^(]*\)\)/ ?
(Hmm, that's *seriously* unreadable.)
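One way to make it easier on the eyes is the /x modifier, which allows whitespace and comments inside the pattern. A sketch, with $double_parens as a made-up variable name:

```perl
use strict;
use warnings;

my $double_parens = qr/
    \( \(      # two opening parentheses
    [^(]*      # any run of characters with no opening parenthesis
    \) \)      # two closing parentheses
/x;

my ($hit) = '((3 * bar) + ((foo)))' =~ /($double_parens)/;
print "$hit\n";    # prints ((foo))
```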
Ben
------------------------------
Date: 25 Jul 2010 22:36:44 GMT
From: jt@toerring.de (Jens Thoms Toerring)
Subject: Re: Help with regular expression
Message-Id: <8b3sjsF29kU1@mid.uni-berlin.de>
Mark Hobley <markhobley@yahoo.donottypethisbit.co> wrote:
> I need a regular expression with the following properties.
> I need to match text (typically, though not necessarily expressions)
> enclosed within double parentheses. However, I do not want to match nested
> single parentheses enclosed text.
> So ((*)) is a match, but ((*)*(*)) is not a match.
> Here are some examples to illustrate this.
> ((FOO)) - This is a match
> (()) - This is a match
> ((3 + 2)) - This is a match
> ((3 + 2) + (2 * foo)) - This is not a match
> ((3 * bar) + ((foo))) - This is a match
Should the whole thing be the match or only the "((foo))" part?
> ((3 * bar) + ((foo))bar) - This is a match.
Same question here
> I hope that lot makes sense.
If in e.g. "((3 * bar) + ((foo)))" only the "((foo))" part is
meant to be the match then I would think
\(\([^(]*\)\)
should do the job - you seem to want two opening parentheses,
followed by some text that does not contain another opening
parenthesis, and finally two closing parentheses.
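
As a quick check, here is a small sketch that runs that pattern over the examples from the original post, plus the extra case Peter asked about. The %result hash is only there to collect what was matched.

```perl
use strict;
use warnings;

my @examples = (
    '((FOO))',
    '(())',
    '((3 + 2))',
    '((3 + 2) + (2 * foo))',
    '((3 * bar) + ((foo)))',
    '((3 * bar) + ((foo))bar)',
    '(((1 + 2) * (3 +4)))',
);

my %result;
for my $text (@examples) {
    # Capture the matched part, or record undef when nothing matches.
    $result{$text} = $text =~ /(\(\([^(]*\)\))/ ? $1 : undef;
    printf "%-26s %s\n", $text,
        defined $result{$text} ? "matches $result{$text}" : 'no match';
}
```

Under this reading the two nested examples match only on their "((foo))" part, and both "((3 + 2) + (2 * foo))" and "(((1 + 2) * (3 +4)))" give no match.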
Regards, Jens
--
\ Jens Thoms Toerring ___ jt@toerring.de
\__________________________ http://toerring.de
------------------------------
Date: Sun, 25 Jul 2010 23:10:53 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Speed of reading some MB of data using qx(...)
Message-Id: <dcpuh7-66m2.ln1@osiris.mauzo.dyndns.org>
Quoth Wolfram Humann <w.c.humann@arcor.de>:
> I have a program that processes PDF files by converting them to
> Postscript, reading the PS and doing something with it. I use pdftops (from
> xpdf) for the pdf->ps conversion and retrieve the result like this:
>
> $ps_text = qx( pdftops $infile - );
>
> On win32 using strawberry perl (tried 5.10 and 5.12) this takes much
> more time than I expected so I did a test and first converted the PDF
> to Postscript, then read the Postscript (about 12 MB) like this (cat
> on win32 provided by cygwin):
>
> perl -E" $t = qx(cat psfile.ps); say length $t "
>
> This takes about 16 seconds on win32 but only <1 seconds on Linux. I
> was afraid that this might be a 'binmode' problem so I also tried
> this:
>
> perl -E" open $in,'cat psfile.ps |'; binmode $in; local $/; $t=<$in>;
> say length $t "
>
> But the effect is the same: fast on linux, slow on win32. Besides
> bashing win32 :-) any ideas for reason and (possibly) cure?
Win32's pipes are *really* slow. Write it to a temporary and then read
the file normally in perl.
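
A rough sketch of that approach, assuming a hypothetical helper named slurp_via_tempfile: the command's STDOUT is pointed at a temporary file (from the core File::Temp module) instead of a pipe, and the file is then slurped in binary mode. Whether this is actually faster on a given Windows setup would need measuring.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Run @cmd with STDOUT redirected to a temporary file, then slurp the file.
sub slurp_via_tempfile {
    my @cmd = @_;
    my ( $tmp_fh, $tmp_name ) = tempfile( UNLINK => 1 );
    close $tmp_fh;

    # Save STDOUT, redirect it to the file, run the command, restore it.
    open my $saved_out, '>&', \*STDOUT or die "cannot dup STDOUT: $!";
    open STDOUT, '>', $tmp_name        or die "cannot redirect STDOUT: $!";
    my $status = system(@cmd);
    open STDOUT, '>&', $saved_out      or die "cannot restore STDOUT: $!";
    die "command failed with status $status" if $status != 0;

    open my $in, '<', $tmp_name or die "cannot read $tmp_name: $!";
    binmode $in;
    local $/;                 # read the whole file in one go
    my $data = <$in>;
    close $in;
    return $data;
}

# Usage, mirroring the command from the original post:
#   my $ps_text = slurp_via_tempfile( 'pdftops', $infile, '-' );
```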
Ben
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 3044
***************************************