[25461] in Perl-Users-Digest
Perl-Users Digest, Issue: 7706 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Jan 28 21:05:40 2005
Date: Fri, 28 Jan 2005 18:05:15 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Fri, 28 Jan 2005 Volume: 10 Number: 7706
Today's topics:
Re: [perl-python] 20050127 traverse a dir <skip@pobox.com>
Re: [perl-python] 20050127 traverse a dir <abigail@abigail.nl>
getting {string} from \${string} bongo@frii.com
Re: getting {string} from \${string} <noreply@gunnar.cc>
Re: getting {string} from \${string} <jl_post@hotmail.com>
Re: getting {string} from \${string} <jl_post@hotmail.com>
Re: getting {string} from \${string} <noreply@gunnar.cc>
Re: getting {string} from \${string} <notvalid@email.com>
Re: getting {string} from \${string} (Anno Siegel)
Re: getting {string} from \${string} (Anno Siegel)
Re: IO::Select extension xhoster@gmail.com
Re: IO::Select extension <tassilo.von.parseval@rwth-aachen.de>
Re: Old tutorial - now corrected binnyva@hotmail.com
Re: Old tutorial - now corrected <1usa@llenroc.ude.invalid>
regular expression question bayxarea-usenet@yahoo.com
Re: regular expression question <noreply@gunnar.cc>
Re: regular expression question <noelt.dolan@virgin.net>
Re: regular expression question bayxarea-usenet@yahoo.com
Re: regular expression question bayxarea-usenet@yahoo.com
Re: regular expression question <noreply@gunnar.cc>
Re: Script dumps core....? Any suggestions... xhoster@gmail.com
Re: Script dumps core....? Any suggestions... (Anno Siegel)
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Fri, 28 Jan 2005 14:52:14 -0600
From: Skip Montanaro <skip@pobox.com>
To: abigail@abigail.nl
Subject: Re: [perl-python] 20050127 traverse a dir
Message-Id: <mailman.1540.1106945544.22381.python-list@python.org>
abigail> @@ No. Second, learn Python. Third, learn Perl (optional). :)
abigail> Just leave the third option out. Let him learn Python. We don't
abigail> want him. ;-)
We don't want him either. Perhaps we can persuade him to learn INTERCAL...
Skip
------------------------------
Date: 28 Jan 2005 21:14:37 GMT
From: Abigail <abigail@abigail.nl>
Subject: Re: [perl-python] 20050127 traverse a dir
Message-Id: <slrncvlapt.a9.abigail@alexandra.abigail.nl>
Skip Montanaro (skip@pobox.com) wrote on MMMMCLXVIII September MCMXCIII
in <URL:news:mailman.1540.1106945544.22381.python-list@python.org>:
__
__ abigail> @@ No. Second, learn Python. Third, learn Perl (optional). :)
__
__ abigail> Just leave the third option out. Let him learn Python. We don't
__ abigail> want him. ;-)
__
__ We don't want him either. Perhaps we can persuade him to learn INTERCAL...
Please don't send stealth CCs.
Abigail
--
perl -wle\$_=\<\<EOT\;y/\\n/\ /\;print\; -eJust -eanother -ePerl -eHacker -eEOT
------------------------------
Date: 28 Jan 2005 12:33:47 -0800
From: bongo@frii.com
Subject: getting {string} from \${string}
Message-Id: <1106944427.741379.101990@c13g2000cwb.googlegroups.com>
Hi folks. Perl newbie here. Having a heck of a
time with something which probably isn't that hard...
When debugging code in progress, I often find myself
typing:
print "\$var1=$var1, \$var2=$var2 \n";
not a big deal, but when you do it all the time, it
would be nice to save some typing by putting it in a
subroutine, such as:
dbg_pr($var1,$var2);
Easy enough if I just wanted the values, but I want to
print both the variable names and the values, and I'd
like to get both from one string. If I pass $var1 or \$var1
I can get the value but not the variable name, and if I
pass "var1" I get the name but not the value
(as ${"var1"} isn't defined in the subroutine name space.)
It doesn't seem like I can pass a string and construct a
reference from it. Can I somehow stringify \${var1} to
get "var1" instead of SCALAR(Hhex address)?
Any help appreciated. It seems as if this has to be simple,
but the best I can currently do is pass both the string and
the reference, which is almost as redundant as just retyping
the whole print statement every time.
thanks,
--bongo
------------------------------
Date: Fri, 28 Jan 2005 22:28:32 +0100
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: getting {string} from \${string}
Message-Id: <35vp3bF4rbar8U1@individual.net>
bongo@frii.com wrote:
> When debugging code in progress, I often find myself
> typing:
>
> print "\$var1=$var1, \$var2=$var2 \n";
Me too. :)
> not a big deal, but when you do it all the time, it
> would be nice to save some typing by putting it in a
> subroutine, such as:
>
> dbg_pr($var1,$var2);
>
> Easy enough if I just wanted the values, but I want to
> print both the variable names and the values, and I'd
> like to get both from one string. If I pass $var1 or \$var1
> I can get the value but not the variable name, and if I
> pass "var1" I get the name but not the value
> (as ${"var1"} isn't defined in the subroutine name space.)
Maybe you should start using hashes more often instead of simple scalars...
my %hash = ( key1 => 'one', key2 => 'two' );
sub dbg_pr { my $ref = shift; print "$_=$ref->{$_}\n" for @_ }
dbg_pr( \%hash, qw/key1 key2/ );
Outputs:
key1=one
key2=two
Or hashes together with Data::Dumper:
use Data::Dumper;
my %hash = ( key1 => 'one', key2 => 'two' );
print Dumper \%hash;
Outputs:
$VAR1 = {
          'key2' => 'two',
          'key1' => 'one'
        };
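Data::Dumper can also label each value with a name of your choosing, which comes close to the name-plus-value output asked for. A minimal sketch, assuming two hypothetical variables $var1 and $var2; the names are still typed separately from the values, so it saves the formatting rather than the typing:

```perl
use strict;
use warnings;
use Data::Dumper;

my ( $var1, $var2 ) = ( 'one', 'two' );

# Dump(\@values, \@names) labels each value with the corresponding name.
my $report = Data::Dumper->Dump( [ $var1, $var2 ], [qw/var1 var2/] );
print $report;
```

Because the values are passed explicitly, this works for my() variables in any scope.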
> It doesn't seem like I can pass a string and construct a
> reference from it.
Well..
our ($var1, $var2) = ('one', 'two');
sub dbg_pr {
    my %vars = map { $_ => eval } @_;
    print "$_=$vars{$_}\n" for @_
}
dbg_pr( qw/$var1 $var2/ );
( but don't take it too seriously ;-) )
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
------------------------------
Date: 28 Jan 2005 13:56:11 -0800
From: "jl_post@hotmail.com" <jl_post@hotmail.com>
Subject: Re: getting {string} from \${string}
Message-Id: <1106949371.160978.117810@f14g2000cwb.googlegroups.com>
bongo@frii.com wrote:
> Hi folks. Perl newbie here. Having a heck of a
> time with something which probably isn't that hard...
>
> When debugging code in progress, I often find myself
> typing:
>
> print "\$var1=$var1, \$var2=$var2 \n";
>
> not a big deal, but when you do it all the time, it
> would be nice to save some typing by putting it in a
> subroutine, such as:
>
> dbg_pr($var1,$var2);
You can try using this subroutine:
sub printVar
{ my $v = shift; print "\$$v = ", eval "\$$v", "\n"; }
Then call it like this:
$var = 7;
printVar("var");
# or even like: printVar qw(var);
You'll see the output:
$var = 7
(You can easily modify the printVar() function to handle multiple
variables, if you wish.)
I hope this helps.
-- Jean-Luc
------------------------------
Date: 28 Jan 2005 14:03:53 -0800
From: "jl_post@hotmail.com" <jl_post@hotmail.com>
Subject: Re: getting {string} from \${string}
Message-Id: <1106949833.961085.82780@z14g2000cwz.googlegroups.com>
jl_post@hotmail.com wrote:
>
> (You can easily modify the printVar() function to
> handle multiple variables, if you wish.)
Such as:
sub printVar { print "\$$_ = ", eval "\$$_", "\n" for @_ }
Then you can call it like:
my $wow = 5;
my $var = "hello";
printVar qw(wow var);
And you'll see the output:
$wow = 5
$var = hello
(Sorry about the separate post. This came to me right after I posted
my first response.)
-- Jean-Luc
------------------------------
Date: Fri, 28 Jan 2005 23:13:35 +0100
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: getting {string} from \${string}
Message-Id: <35vro2F4qq0fqU1@individual.net>
jl_post@hotmail.com wrote:
> You can try using this subroutine:
>
> sub printVar
> { my $v = shift; print "\$$v = ", eval "\$$v", "\n"; }
>
> Then call it like this:
>
> $var = 7;
> printVar("var");
That works only with file scoped variables, and hopefully most variables
are my() declared in the lowest possible scope.
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
------------------------------
Date: Fri, 28 Jan 2005 23:17:00 GMT
From: Ala Qumsieh <notvalid@email.com>
Subject: Re: getting {string} from \${string}
Message-Id: <MJzKd.17477$5R.10931@newssvr21.news.prodigy.com>
bongo@frii.com wrote:
> would be nice to save some typing by putting it in a
> subroutine, such as:
>
> dbg_pr($var1,$var2);
Do you really need to print the leading '$'?
print "var1 = $var1.\n";
--Ala
------------------------------
Date: 29 Jan 2005 01:54:03 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: getting {string} from \${string}
Message-Id: <cteqbr$srt$1@mamenchi.zrz.TU-Berlin.DE>
<bongo@frii.com> wrote in comp.lang.perl.misc:
> Hi folks. Perl newbie here. Having a heck of a
> time with something which probably isn't that hard...
>
> When debugging code in progress, I often find myself
> typing:
>
> print "\$var1=$var1, \$var2=$var2 \n";
>
> not a big deal, but when you do it all the time, it
> would be nice to save some typing by putting it in a
> subroutine, such as:
>
> dbg_pr($var1,$var2);
>
> Easy enough if I just wanted the values, but I want to
> print both the variable names and the values, and I'd
> like to get both from one string. If I pass $var1 or \$var1
> I can get the value but not the variable name, and if I
> pass "var1" I get the name but not the value
> (as ${"var1"} isn't defined in the subroutine name space.)
>
> It doesn't seem like I can pass a string and construct a
> reference from it. Can I somehow stringify \${var1} to
> get "var1" instead of SCALAR(Hhex address)?
No, not without Devel::Peek.
> Any help appreciated. It seems as if this has to be simple,
> but the best I can currently do is pass both the string and
> the reference, which is almost as redundant as just retyping
> the whole print statement every time.
It's not trivial. You can use "eval" to get the value of an expression
given as a string (including simple variables). Then you can print
the literal expression and the value. Define two routines:
sub showval {
    my ( $name, $value) = @_;
    print defined $value ? "$name = $value\n" : "$name -undef-\n";
}
sub show {
    join '; ', map "showval( '$_', $_)", @_;
}
(If you put them in a module, import both show() and showval().)
In a program, you may have:
my $x = 3;
my $y;
my %z = ( aaa => 123, bbb => 456 );
To see some of the values, say
eval show qw( $x $y $z{bbb} $z{gibsnich});
That prints
$x = 3
$y -undef-
$z{bbb} = 456
$z{gibsnich} -undef-
Close enough? Myself, I have it somewhere but never use it.
Anno
------------------------------
Date: 29 Jan 2005 01:58:57 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: getting {string} from \${string}
Message-Id: <cteql1$srt$2@mamenchi.zrz.TU-Berlin.DE>
jl_post@hotmail.com <jl_post@hotmail.com> wrote in comp.lang.perl.misc:
> bongo@frii.com wrote:
> > Hi folks. Perl newbie here. Having a heck of a
> > time with something which probably isn't that hard...
> >
> > When debugging code in progress, I often find myself
> > typing:
> >
> > print "\$var1=$var1, \$var2=$var2 \n";
> >
> > not a big deal, but when you do it all the time, it
> > would be nice to save some typing by putting it in a
> > subroutine, such as:
> >
> > dbg_pr($var1,$var2);
>
>
> You can try using this subroutine:
>
> sub printVar
> { my $v = shift; print "\$$v = ", eval "\$$v", "\n"; }
>
>
> Then call it like this:
>
> $var = 7;
> printVar("var");
> # or even like: printVar qw(var);
>
> You'll see the output:
>
> $var = 7
>
> (You can easily modify the printVar() function to handle multiple
> variables, if you wish.)
That only works for package variables, while these days most variables
are lexical. With those, "eval" must be called in their lexical scope
to access the value. Putting "eval" in a sub like that won't do.
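The scoping difference can be sketched as follows. The helper name value_inside_sub and the variables $pkg_var and $lexical are made up for illustration; the point is only that a string eval sees the lexicals of the place where it is compiled, not those of the caller:

```perl
use strict;
use warnings;

# Hypothetical helper: evaluates the variable name inside the sub's own scope.
sub value_inside_sub {
    my ($name) = @_;
    no strict 'vars';              # let the eval fall back to package variables
    my $value = eval "\$$name";
    return $value;
}

our $pkg_var = 7;
my @results;
{
    my $lexical = 42;
    push @results, value_inside_sub('pkg_var');   # package variable: found
    push @results, value_inside_sub('lexical');   # caller's lexical: not visible
    push @results, scalar eval '$lexical';        # eval in the lexical's own scope
}

print join( ', ', map { defined $_ ? $_ : '-undef-' } @results ), "\n";
```

This is why the show()/showval() approach returns a string of code and leaves the eval to the caller.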
Anno
------------------------------
Date: 28 Jan 2005 22:05:05 GMT
From: xhoster@gmail.com
Subject: Re: IO::Select extension
Message-Id: <20050128170505.455$yW@newsreader.com>
Brian McCauley <nobull@mail.com> wrote:
> xhoster@gmail.com wrote:
>
> > IO::Select does a nice job of hiding the ugliness of 'select'. But
> > it warns that it shouldn't be mixed with buffered I/O, which still
> > leaves me to deal with all the messy sysread and syswrite stuff. I've
> > been thinking lately of making a module that would either subclass or
> > contain IO::Select objects and give them the veneer of being buffered
> > IO rather than unbuffered IO. Have I overlooked something already in
> > CPAN that does this? Am I insane for thinking this is desirable and/or
> > possible?
>
> I have often thought about this but have never done anything about it.
>
> There is a problem. If you want to avoid the need to play with sysread
> then you need to find a way to allow read() or readline() not to block
> when there is insufficient data available to satisfy them.
>
> To do this all the handles returned by IO::Select would have to be
> wrapped in a special class that simulates read()/readline() using
> sysread() and return undef with $!=EAGAIN when there's insufficient data.
My idea was to keep the interface as close to IO::Select as possible,
so can_read would return handles that would not block upon (a
single) readline. If someone were foolish enough to call readline on a
handle not returned by can_read, then just let them block. That way you
don't need to screw around with EAGAIN.
But this would still require the handle to be wrapped in a special class,
which means the handle returned by can_read would not test identical to the
handle that was passed in to the selector class in the first place, so this
would still break the analogy to IO::Select.
Once I decided that I can't keep it highly analogous to IO::Select, I started
wondering why even have can_read(). If the module knows the complete line
can be read, rather than returning a flag, return the line itself. So you
would have do_read(), which would return a hashref where keys are
filehandles and values are the corresponding lines that were read for those
handles. But then I wonder if I haven't digressed from an IO::Select-like
module into a full message passing module.
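A rough sketch of what such a do_read() might look like, assuming a per-handle buffer kept in a plain hash and values that are array refs of complete lines (more than one line can arrive at once). The name do_read comes from the paragraph above; everything else here is an assumption, and a pipe stands in for a real socket:

```perl
use strict;
use warnings;
use IO::Select;

my %buffer;    # partial data per handle, keyed by the stringified handle

# Hypothetical do_read(): returns { handle => [ complete lines ] }.
sub do_read {
    my ( $sel, $timeout ) = @_;
    my %lines;
    for my $fh ( $sel->can_read($timeout) ) {
        my $n = sysread $fh, my $chunk, 4096;
        if ( !$n ) {    # undef on error, 0 at end of file
            $sel->remove($fh);
            next;
        }
        $buffer{$fh} .= $chunk;

        # Hand out only complete lines; keep any trailing partial line.
        while ( $buffer{$fh} =~ s/\A(.*?\n)// ) {
            push @{ $lines{$fh} }, $1;
        }
    }
    return \%lines;
}

pipe( my $reader, my $writer ) or die "pipe: $!";
my $sel = IO::Select->new($reader);

syswrite $writer, "first line\npartial";
my $first = do_read( $sel, 1 );     # only "first line\n" is complete so far

syswrite $writer, " record\n";
my $second = do_read( $sel, 1 );    # the buffered "partial" is now completed

print @{ $first->{$reader} }, @{ $second->{$reader} };
```

Note that the hash keys are stringified handles, so the caller needs the original handle to look up its lines.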
> Then you have to find a way to avoid a busy loop when there is data in
> the buffer but not enough to satisfy the read() or readline(). One way
> to do this would be to make an unsatisfied read()/readline() set a flag
> to say that even though there is data in the buffer that the special
> buffered IO::Select must not report that the handle is readable.
I think that that is about what I was thinking. The hardest part would be
the timeouts. If the system-level select comes back with a sysread-able
handle, you sysread it into an internal buffer. But if that was not enough
to make a full record, then you should restart the system-level select
with a new timeout which is the old timeout minus the time spent in the
prior select. But I guess most systems don't reliably report the info
needed to compute this. Of course, you could probably just return early
(with an empty readable list/hash) in this case.
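One possible way around the missing time-left information is to measure it with a clock instead of relying on select to report it. A minimal sketch, assuming Time::HiRes is available; the helper name time_left and the 2-second timeout are made up for illustration:

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper: seconds remaining until $deadline, never negative.
sub time_left {
    my ($deadline) = @_;
    my $left = $deadline - time;
    return $left > 0 ? $left : 0;
}

my $timeout  = 2.0;
my $deadline = time + $timeout;

sleep 0.1;    # stands in for a select that returned early with a partial record

my $retry_timeout = time_left($deadline);
printf "retry select with %.2f seconds left\n", $retry_timeout;
```

The value of $retry_timeout would then be passed to the next can_read() call in place of the original timeout.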
>
> On the write side you need an extra layer of buffering to set a
> threshold and tell the buffered IO::Select only to return a handle as
> writable if there is at least that much space in the buffer.
I would use a dynamically resizable internal write buffer, then have
can_write report handles that are under some in-use buffer-space threshold,
rather than have it report handles that are over some free buffer-space
threshold. And if they keep writing to a handle even though it is no longer
"writable" by that threshold, there is no blocking and no problem as long
as they have enough RAM to keep expanding the buffer.
> As you can see I've given this a little thought.
Me too. But now I'm wondering if my time wouldn't be better spent just
learning Event.pm or something like that rather than rewriting IO::Select.
Xho
--
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB
------------------------------
Date: Fri, 28 Jan 2005 23:33:25 +0100
From: "Tassilo v. Parseval" <tassilo.von.parseval@rwth-aachen.de>
Subject: Re: IO::Select extension
Message-Id: <slrncvlfdl.1pg.tassilo.von.parseval@localhost.localdomain>
Also sprach xhoster@gmail.com:
> Brian McCauley <nobull@mail.com> wrote:
[ select(2) with buffered I/O ]
>> I have often thought about this but have never done anything about it.
>>
>> There is a problem. If you want to avoid the need to play with sysread
>> then you need to find a way to allow read() or readline() not to block
>> when there is insufficient data available to satisfy them.
>>
>> To do this all the handles returned by IO::Select would have to be
>> wrapped in a special class that simulates read()/readline() using
>> sysread() and return undef with $!=EAGAIN when there's insufficient data.
>
> My idea was to keep the interface as close to IO::Select as possible,
> so can_read would return handles that would not block upon (a
> single) readline. If someone were foolish enough to call readline on a
> handle not returned by can_read, then just let them block. That way you
> don't need to screw around with EAGAIN.
>
> But this would still require the handle to be wrapped in a special class,
> which means the handle returned by can_read would not test identical to the
> handle that was passed in to the selector class in the first place, so this
> would still break the analogy to IO::Select.
How about overloading the '==' operator?
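A minimal sketch of that idea, assuming a hypothetical wrapper class BufferedHandle that stores the original handle under the key fh. Two operands compare equal when they refer to the same underlying handle, wrapped or not:

```perl
use strict;
use warnings;

{
    package BufferedHandle;    # hypothetical wrapper class
    use Scalar::Util qw(blessed refaddr);
    use overload
        '==' => sub { _addr( $_[0] ) == _addr( $_[1] ) },
        '!=' => sub { _addr( $_[0] ) != _addr( $_[1] ) },
        fallback => 1;

    sub new {
        my ( $class, $fh ) = @_;
        return bless { fh => $fh }, $class;
    }

    # Address of the underlying handle, whether wrapped or not.
    sub _addr {
        my ($thing) = @_;
        $thing = $thing->{fh} if blessed($thing) && $thing->isa(__PACKAGE__);
        return refaddr($thing);
    }
}

my $wrapped = BufferedHandle->new( \*STDIN );
my @checks  = (
    ( $wrapped == \*STDIN )  ? 1 : 0,    # wrapper vs. its own handle
    ( \*STDIN == $wrapped )  ? 1 : 0,    # same, operands swapped
    ( $wrapped == \*STDOUT ) ? 1 : 0,    # wrapper vs. a different handle
);
print "@checks\n";
```

This only covers numeric comparison; code that compares handles with eq or uses them as hash keys would still see the wrapper as a different object.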
Tassilo
--
use bigint;
$n=71423350343770280161397026330337371139054411854220053437565440;
$m=-8,;;$_=$n&(0xff)<<$m,,$_>>=$m,,print+chr,,while(($m+=8)<=200);
------------------------------
Date: 28 Jan 2005 11:50:34 -0800
From: binnyva@hotmail.com
Subject: Re: Old tutorial - now corrected
Message-Id: <1106941834.289660.226900@c13g2000cwb.googlegroups.com>
I am beginning to see that this 'conversation' is going
nowhere - you are not understanding what I am saying and
you feel the same about me. I came seeking advice about
my tutorial, and all I got were complaints about my
scripts.
So I have decided to drop this subject. But none of you
have convinced me to take the tutorial offline. Rather
than pointing out what was wrong with the tutorial and
how to correct it, what you did was claim that I was
unqualified to write it in the first place. And you
did this by judging me by my 2-3 year old scripts.
The scripts available at
http://www.geocities.com/binnyva/code/perl/cgi/
were among the very first perl scripts that I created.
So it is natural that there would be some problems
with them.
Those scripts were not intended for students of perl;
I wrote those scripts years before I wrote this tutorial.
The intended audience for those scripts were people
who had no knowledge of perl (or very little knowledge),
but were looking for the simplest way to set up a
guestbook (or whatever) for their websites - and
unwilling to learn a programming language for that.
My goal in writing the tutorial was to provide an
introduction to perl. I wanted to get the readers
to be interested in perl enough to go to more
advanced tutorials or books. I tried to create
a tutorial that was easy to read and not hard
to understand.
At the same time, in many ways, I had been stubborn
too. I am not able to see the problems with my
tutorial - and my writing it. I still cannot see the
logic in many things pointed out by you gentlemen.
So, I think that it is better for all to quit this
pointless discussion before it degenerates into a
flame war.
If you still think that my tutorial was a waste of
web space and a bad influence for beginners, I am
sorry about the 'mess' I have made. I will still
uphold my promise that I will correct my old scripts.
So far, that is the only good thing that came out
of these postings.
Once again, thank you for not losing your patience,
and sorry for not finding a solution for this problem.
Binny V A
http://www.geocities.com/binnyva/code
------------------------------
Date: Fri, 28 Jan 2005 21:28:40 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude.invalid>
Subject: Re: Old tutorial - now corrected
Message-Id: <Xns95ECA7A8E5D3Basu1cornelledu@127.0.0.1>
binnyva@hotmail.com wrote in
news:1106941834.289660.226900@c13g2000cwb.googlegroups.com:
> I came seeking advice about my tutorial, and all I got
> were complaints about my scripts.
See http://tinyurl.com/6qe9e for a critique of a section of your
tutorial.
Sinan
------------------------------
Date: 28 Jan 2005 13:00:19 -0800
From: bayxarea-usenet@yahoo.com
Subject: regular expression question
Message-Id: <1106946019.687452.98920@z14g2000cwz.googlegroups.com>
I am trying to test a string to see if it begins with one of several
possible 3 letter words plus exactly 5 digits and a . (leading spaces
are ok)
for instance
example: if I want to test -> abc,def,ghi
then
abc00000.
def00001.
ghi00002.
should all pass - but
jkl00003. would fail
abc0002. would fail
def00002 would fail
I know how to handle the leading spaces and I can use | to test each
of the possible letter combinations - but how do I handle this with the
5 digits if I don't know what the digits are going to be?
Thanks,
John
John
------------------------------
Date: Fri, 28 Jan 2005 22:33:22 +0100
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: regular expression question
Message-Id: <35vpceF4o00ppU1@individual.net>
bayxarea-usenet@yahoo.com wrote:
> I am trying to test a string to see if it begins with one of several
> possible 3 letter words plus exactly 5 digits and a . (leading spaces
> are ok)
<snip>
> I know how to handle the leading spaces and I can use | to test each
> of the possible letter combinations - but how do I handle this with the
> 5 digits if I don't know what the digits are going to be?
perldoc perlrequick
perldoc perlretut
perldoc perlre
Do some studying, give it a try, and come back here with some code if
you don't manage to figure it out that way.
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
------------------------------
Date: Sat, 29 Jan 2005 00:14:51 GMT
From: "Atlantis" <noelt.dolan@virgin.net>
Subject: Re: regular expression question
Message-Id: <%zAKd.908$nQ1.405@newsfe4-gui.ntli.net>
<bayxarea-usenet@yahoo.com> wrote in message
news:1106946019.687452.98920@z14g2000cwz.googlegroups.com...
> I am trying to test a string to see if it begins with one of several
> possible 3 letter words plus exactly 5 digits and a . (leading spaces
> are ok)
>
> for instance
>
> example: if I want to test -> abc,def,ghi
>
> then
>
> abc00000.
> def00001.
> ghi00002.
>
> should all pass - but
>
> jkl00003. would fail
> abc0002. would fail
> def00002 would fail
>
> I know how to handle the leading spaces and I can use | to test each
> of the possible letter combinations - but how do I handle this with the
> 5 digits if I don't know what the digits are going to be?
> Thanks,
>
> John
> John
>
#!c:\perl\bin\perl
use strict;
use warnings;
my @Input = ("abc00000.", "def00001.", "ghi00002.", "jkl00003.", "abc0002.",
             "def00002", "ghi00002..", "ghi0002..");
foreach my $Record (@Input)
{
    # Checks for leading and trailing white spaces.
    if ($Record =~ /\s*(abc|def|ghi)\d{5}\.{1}\s*/)
    {
        print "$Record is valid\n";
    }
    else
    {
        print "$Record is invalid\n";
    }
}
------------------------------
Date: 28 Jan 2005 17:08:43 -0800
From: bayxarea-usenet@yahoo.com
Subject: Re: regular expression question
Message-Id: <1106960923.741385.15800@f14g2000cwb.googlegroups.com>
Atlantis,
Thanks for your help - that was what I was looking for!
John
------------------------------
Date: 28 Jan 2005 17:09:03 -0800
From: bayxarea-usenet@yahoo.com
Subject: Re: regular expression question
Message-Id: <1106960943.300723.17180@f14g2000cwb.googlegroups.com>
Thanks for your commentary.
------------------------------
Date: Sat, 29 Jan 2005 03:01:18 +0100
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: regular expression question
Message-Id: <360942F4rhiniU1@individual.net>
Atlantis wrote:
>
> if ($Record =~ /\s*(abc|def|ghi)\d{5}\.{1}\s*/)
> # Checks for leading and trailing white spaces.
In what way does the regex check for leading and trailing white spaces?
Which strings would it match that are not matched by:
/(abc|def|ghi)\d{5}\./
(or vice versa) ??
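For comparison, a sketch of an anchored variant, which is one way to express "begins with" while allowing leading spaces. The test strings with leading spaces and a leading 'x' are additions for illustration and are not from the original post:

```perl
use strict;
use warnings;

# Anchored variant: optional leading whitespace, one of the words,
# exactly five digits, then a literal dot.
my $re = qr/^\s*(?:abc|def|ghi)\d{5}\./;

my @input = ( 'abc00000.', '  ghi00002.', 'jkl00003.', 'abc0002.',
              'def00002', 'xabc00000.' );
my %valid = map { $_ => ( /$re/ ? 1 : 0 ) } @input;

printf "%-14s %s\n", "'$_'", $valid{$_} ? 'valid' : 'invalid' for @input;
```

Without the ^ anchor, a string such as 'xabc00000.' would also match, since the pattern could start matching anywhere in the string.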
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
------------------------------
Date: 28 Jan 2005 21:17:34 GMT
From: xhoster@gmail.com
Subject: Re: Script dumps core....? Any suggestions...
Message-Id: <20050128161734.453$aW@newsreader.com>
anno4000@lublin.zrz.tu-berlin.de (Anno Siegel) wrote:
> <xhoster@gmail.com> wrote in comp.lang.perl.misc:
> > "Gancy" <ganesh_tiwari@hotmail.com> wrote:
> > > Here is the snippet of the perl script, I have perl version v5.8.5
> > > built for sun4-solaris. I have run this script on thousands of 'c',
> > > 'C++' headers and source files. Runs smoothly as my new ESTEEM car.
> > > But i have one source file toke.c in my test case. As soon as this
> > > script hits this file it dumps.
> >
> > Is this feature still considered highly experimental in 5.8.5? I guess
> > the experiment failed in your case. Can you monitor memory usage and
> > see if it becomes exorbitant somewhere? Perhaps from excessive nesting
> > of parentheses in the toke.c file?
> >
> > I can't make it coredump with a simple test case below, but it doesn't
>
> [...]
>
> I have replied to this in the other thread the OP started about it.
Multiposting strikes again! Gancy, shame on you.
> To
> summarize: It does indeed segfault with toke.c from the perl source
> (v5.8.6) as the input. The reason is simple: The recursive regex goes
> into deep recursion. Perl doesn't warn about that, that's the bug, if
> there is one. Otherwise, it's just that the regex is broken.
His regex seems to come right out of the documentation (serves as an
example for use of (??{}) feature in perldoc perlre), which doesn't
necessarily mean it isn't broken, but it does give it enough exposure that
I was interested in looking into it. But I looked into it, was confused,
and gave up.
Xho
--
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB
------------------------------
Date: 29 Jan 2005 00:24:34 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: Script dumps core....? Any suggestions...
Message-Id: <ctel42$q7s$1@mamenchi.zrz.TU-Berlin.DE>
<xhoster@gmail.com> wrote in comp.lang.perl.misc:
> anno4000@lublin.zrz.tu-berlin.de (Anno Siegel) wrote:
> > <xhoster@gmail.com> wrote in comp.lang.perl.misc:
> > > "Gancy" <ganesh_tiwari@hotmail.com> wrote:
> > > > Here is the snippet of the perl script, I have perl version v5.8.5
> > > > built for sun4-solaris. I have run this script on thousands of 'c',
> > > > 'C++' headers and source files. Runs smoothly as my new ESTEEM car.
> > > > But i have one source file toke.c in my test case. As soon as this
> > > > script hits this file it dumps.
> > >
> > > Is this feature still considered highly experimental in 5.8.5? I guess
> > > the experiment failed in your case. Can you monitor memory usage and
> > > see if it becomes exorbitant somewhere? Perhaps from excessive nesting
> > > of parentheses in the toke.c file?
> > >
> > > I can't make it coredump with a simple test case below, but it doesn't
> >
> > [...]
> >
> > I have replied to this in the other thread the OP started about it.
>
> Multiposting strikes again! Gancy, shame on you.
>
>
> > To
> > summarize: It does indeed segfault with toke.c from the perl source
> > (v5.8.6) as the input. The reason is simple: The recursive regex goes
> > into deep recursion. Perl doesn't warn about that, that's the bug, if
> > there is one. Otherwise, it's just that the regex is broken.
>
> His regex seems to come right out of the documentation (serves as an
> example for use of (??{}) feature in perldoc perlre), which doesn't
> necessarily mean it isn't broken, but it does give it enough exposure that
> I was interested in looking into it. But I looked into it, was confused,
> and gave up.
Ah, that's where it's from.
It's fragile, not broken per se. Too many parentheses inside one outer
pair throw it:
my $re;
$re = qr{
    \(
    (?:
        (?> [^()]+ )    # Non-parens without backtracking
      |
        (??{ $re })     # Group with matching parens
    )*
    \)
}x;
$_ = 'a (' . '(many)' x 5000 . ')';
/$re/;
An initial non-parenthesized part ("a ") must be present to trigger
recursion on the outer parentheses. From then on, recursion goes one
deeper for every "(" and every ")". Too much is too much.
How does this happen in toke.c? It's C, not Lisp, after all. Early on
(line 111 of 8090) there is this part of a comment:
* TOKEN : generic token (used for '(', DOLSHARP, etc)
The regex knows nothing about comments or quoting and tries to parse
the rest of the source enclosed in the spurious "'('". Too much is
too much.
The error isn't in the regex, as I prematurely assumed, but in its
careless application.
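A more careful application might strip comments and character literals before matching. The following is only a crude sketch: the helper strip_c_noise and the sample input are made up for illustration, and it ignores string literals, // comments and other C subtleties:

```perl
use strict;
use warnings;

my $re;
$re = qr{
    \(
    (?:
        (?> [^()]+ )    # Non-parens without backtracking
      |
        (??{ $re })     # Group with matching parens
    )*
    \)
}x;

# Hypothetical helper: crude removal of C block comments and
# single-character literals such as '('.
sub strip_c_noise {
    my ($src) = @_;
    $src =~ s{/\*.*?\*/}{}gs;             # block comments
    $src =~ s{'(?:\\.|[^'\\])'}{0}g;      # character literals
    return $src;
}

my $src     = "/* TOKEN : generic token (used for '(', etc) */ call(a, (b + c));";
my $clean   = strip_c_noise($src);
my ($group) = $clean =~ /($re)/;
print "$group\n";
```

With the comment gone, the first opening parenthesis the regex sees is a real one, so the recursion depth is bounded by the actual nesting in the code.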
Anno
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 7706
***************************************