[33116] in Perl-Users-Digest
Perl-Users Digest, Issue: 4392 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Mar 19 00:09:20 2015
Date: Wed, 18 Mar 2015 21:09:05 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Wed, 18 Mar 2015 Volume: 11 Number: 4392
Today's topics:
An error on page 142 of The Camel Book. <see.my.sig@for.my.address>
Re: An error on page 142 of The Camel Book. <news@lawshouse.org>
Re: An error on page 142 of The Camel Book. <No-Spam@deezee.org>
Re: An error on page 142 of The Camel Book. <No-Spam@deezee.org>
Re: An error on page 142 of The Camel Book. sharma__r@hotmail.com
Re: An error on page 142 of The Camel Book. <rweikusat@mobileactivedefense.com>
Re: An error on page 142 of The Camel Book. <kaz@kylheku.com>
Re: An error on page 142 of The Camel Book. <rweikusat@mobileactivedefense.com>
Re: An error on page 142 of The Camel Book. <news@todbe.com>
Re: An error on page 142 of The Camel Book. <kaz@kylheku.com>
Re: An error on page 142 of The Camel Book. <see.my.sig@for.my.address>
Re: An error on page 142 of The Camel Book. <see.my.sig@for.my.address>
Re: each ARRAY <uri@stemsystems.com>
Re: each ARRAY <bauhaus@futureapps.invalid>
Re: each ARRAY <derykus@gmail.com>
Re: each ARRAY <rweikusat@mobileactivedefense.com>
Re: Read from huge text file <see.my.sig@for.my.address>
Re: Read from huge text file <jurgenex@hotmail.com>
Re: Read from huge text file <rweikusat@mobileactivedefense.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Tue, 17 Mar 2015 23:11:02 -0700
From: Robbie Hatley <see.my.sig@for.my.address>
Subject: An error on page 142 of The Camel Book.
Message-Id: <l4OdnTLhlq1pi5TInZ2dnUVZ57ydnZ2d@giganews.com>
I just noticed that page 142 of The Camel Book (4th ed) gives the
following snippet of code:
for ($i=0; $i<$#ARGV; $i++)
That's not going to work as intended, because it will always ignore
the last argument. To work correctly it should have read:
for ($i=0; $i<=$#ARGV; $i++)
That does not appear to be on O'Reilly's errata page, either.
Wait, let me rectify that... there, submitted.
Seems to me, that kind of error is a good argument against using such
C-like programming idioms in Perl when they can be avoided.
The following neatly avoids that kind of error:
foreach (@ARGV) {
    ... # process $_
}
(Hmmm. I wonder if the authors purposely put that error in there,
to see how many years it would take for someone to notice it?
About 3, as it turned out. And it's such an easy mistake to make
that something tells me there's programs out there that have that
error in them, manifesting as puzzling bugs.)
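To make the off-by-one concrete, here is a minimal sketch that runs all three loops over a small array. The name @args is an assumption used in place of @ARGV so the snippet needs no command-line arguments.

```perl
use strict;
use warnings;

my @args = qw(a b c);    # stand-in for @ARGV (assumption for this sketch)

my (@buggy, @fixed, @plain);

# Test as printed in the book: stops one element early.
for (my $i = 0; $i < $#args; $i++)  { push @buggy, $args[$i] }

# Corrected test: $#args is the last valid index.
for (my $i = 0; $i <= $#args; $i++) { push @fixed, $args[$i] }

# No index at all, so there is no boundary to get wrong.
foreach (@args) { push @plain, $_ }

print "buggy: @buggy\n";    # buggy: a b
print "fixed: @fixed\n";    # fixed: a b c
print "plain: @plain\n";    # plain: a b c
```

The first loop silently drops "c", which is the symptom described above.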
--
Cheers,
Robbie Hatley
Midway City, CA, USA
perl -le 'print "\154o\156e\167o\154f\100w\145ll\56c\157m"'
http://www.well.com/user/lonewolf/
https://www.facebook.com/robbie.hatley
------------------------------
Date: Wed, 18 Mar 2015 10:57:26 +0000
From: Henry Law <news@lawshouse.org>
Subject: Re: An error on page 142 of The Camel Book.
Message-Id: <XbidnV9vx-m-x5TInZ2dnUVZ8kGdnZ2d@giganews.com>
On 18/03/15 06:11, Robbie Hatley wrote:
> for ($i=0; $i<=$#ARGV; $i++)
>
> Seems to me, that kind of error is a good argument against using such
> C-like programming idioms in Perl when they can be avoided.
I agree with you; if for no other reason there's less typing and less
room for error. Well spotted.
But using a loop variable is sometimes essential when, for example, you
want to be able to issue error messages like "Item $i is borked".
--
Henry Law Manchester, England
------------------------------
Date: Wed, 18 Mar 2015 11:53:00 +0000 (UTC)
From: "Dave Saville" <No-Spam@deezee.org>
Subject: Re: An error on page 142 of The Camel Book.
Message-Id: <fV45K0OBJxbE-pn2-1SiydTydGdS9@paddington.bear.den>
On Wed, 18 Mar 2015 10:57:26 UTC, Henry Law <news@lawshouse.org>
wrote:
> On 18/03/15 06:11, Robbie Hatley wrote:
> > for ($i=0; $i<=$#ARGV; $i++)
> >
> > Seems to me, that kind of error is a good argument against using such
> > C-like programming idioms in Perl when they can be avoided.
>
> I agree with you; if for no other reason there's less typing and less
> room for error. Well spotted.
>
> But using a loop variable is sometimes essential when, for example, you
> want to be able to issue error messages like "Item $i is borked".
>
--
Regards
Dave Saville
------------------------------
Date: Wed, 18 Mar 2015 11:54:20 +0000 (UTC)
From: "Dave Saville" <No-Spam@deezee.org>
Subject: Re: An error on page 142 of The Camel Book.
Message-Id: <fV45K0OBJxbE-pn2-pzM9pGfHk3e9@paddington.bear.den>
On Wed, 18 Mar 2015 10:57:26 UTC, Henry Law <news@lawshouse.org>
wrote:
> On 18/03/15 06:11, Robbie Hatley wrote:
> > for ($i=0; $i<=$#ARGV; $i++)
> >
> > Seems to me, that kind of error is a good argument against using such
> > C-like programming idioms in Perl when they can be avoided.
>
> I agree with you; if for no other reason there's less typing and less
> room for error. Well spotted.
>
> But using a loop variable is sometimes essential when, for example, you
> want to be able to issue error messages like "Item $i is borked".
>
So it would be better in that case to use each and keep your own index
counting?
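One reading of that suggestion, assuming "each" here means a plain foreach loop with a hand-maintained counter, might look like the sketch below. The sample data and the rule that an empty string marks a bad item are made up for illustration.

```perl
use strict;
use warnings;

my @items = ('ok', '', 'ok', '');    # sample data; '' marks a bad item (assumption)
my @messages;

my $i = 0;                           # index kept by hand alongside foreach
foreach my $item (@items) {
    push @messages, "Item $i is borked" if $item eq '';
    $i++;
}

print "$_\n" for @messages;
# Item 1 is borked
# Item 3 is borked
```

The counter has to be incremented on every pass, including any pass that bails out early with "next", which is the main thing to watch with this approach.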
--
Regards
Dave Saville
------------------------------
Date: Wed, 18 Mar 2015 06:49:50 -0700 (PDT)
From: sharma__r@hotmail.com
Subject: Re: An error on page 142 of The Camel Book.
Message-Id: <4197bf01-a875-4aa1-8782-078b7f67d952@googlegroups.com>
On Wednesday, 18 March 2015 11:41:03 UTC+5:30, Robbie Hatley wrote:
> I just noticed that page 142 of The Camel Book (4th ed) gives the
> following snippet of code:
>
> for ($i=0; $i<$#ARGV; $i++)
>
> That's not going to work as intended, because it will always ignore
> the last argument. To work correctly it should have read:
>
> for ($i=0; $i<=$#ARGV; $i++)
>
> That does not appear to be on O'Reilly's errata page, either.
> Wait, let me rectify that... there, submitted.
>
> Seems to me, that kind of error is a good argument against using such
> C-like programming idioms in Perl when they can be avoided.
> The following neatly avoids that kind of error:
>
> foreach (@ARGV) {
> ... # process $_
> }
>
> (Hmmm. I wonder if the authors purposely put that error in there,
> to see how many years it would take for someone to notice it?
> About 3, as it turned out. And it's such an easy mistake to make
> that something tells me there's programs out there that have that
> error in them, manifesting as puzzling bugs.)
>
>
> --
> Cheers,
> Robbie Hatley
> Midway City, CA, USA
> perl -le 'print "\154o\156e\167o\154f\100w\145ll\56c\157m"'
> http://www.well.com/user/lonewolf/
> https://www.facebook.com/robbie.hatley
> for ($i=0; $i<$#ARGV; $i++)
They probably had the following in mind:
for ($i=0; $i<@ARGV; $i++)
--Rakesh
------------------------------
Date: Wed, 18 Mar 2015 15:48:06 +0000
From: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Subject: Re: An error on page 142 of The Camel Book.
Message-Id: <873852gvd5.fsf@doppelsaurus.mobileactivedefense.com>
"Dave Saville" <No-Spam@deezee.org> writes:
> On Wed, 18 Mar 2015 10:57:26 UTC, Henry Law <news@lawshouse.org>
> wrote:
>
>> On 18/03/15 06:11, Robbie Hatley wrote:
>> > for ($i=0; $i<=$#ARGV; $i++)
>> >
>> > Seems to me, that kind of error is a good argument against using such
>> > C-like programming idioms in Perl when they can be avoided.
>>
>> I agree with you; if for no other reason there's less typing and less
>> room for error. Well spotted.
>>
>> But using a loop variable is sometimes essential when, for example, you
>> want to be able to issue error messages like "Item $i is borked".
>>
>
> So it would be better in that case to use each and keep your own index
> counting?
The 'foreach' for loop iterates over some list. If I actually want to
iterate over the indices of some array, I usually use foreach/for with
a suitable list, e.g.,
[rw@doppelsaurus]/tmp#perl -e 'print("$_:\t$ARGV[$_]\n") for 0 .. $#ARGV' a b c
0: a
1: b
2: c
This will even end up as a 'counting loop', ie,
perl -e 'print for 0 .. 1000000'
won't push 1000001 integers into the perl stack:
[rw@doppelsaurus]/tmp#perl -MO=Concise,-exec -e 'print for 0 .. 1000000'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <0> pushmark s
4 <$> const[IV 0] s
5 <$> const[IV 1000000] s
6 <#> gv[*_] s
7 <{> enteriter(next->b last->e redo->8) lKS/8
c <0> iter s
d <|> and(other->8) vK/1
8 <0> pushmark s
9 <#> gvsv[*_] s
a <@> print vK
b <0> unstack v
goto c
e <2> leaveloop vK/2
f <@> leave[1 ref] vKP/REFC
------------------------------
Date: Wed, 18 Mar 2015 22:35:20 +0000 (UTC)
From: Kaz Kylheku <kaz@kylheku.com>
Subject: Re: An error on page 142 of The Camel Book.
Message-Id: <20150318152812.690@kylheku.com>
On 2015-03-18, Robbie Hatley <see.my.sig@for.my.address> wrote:
>
> I just noticed that page 142 of The Camel Book (4th ed) gives the
> following snippet of code:
>
> for ($i=0; $i<$#ARGV; $i++)
>
> That's not going to work as intended, because it will always ignore
> the last argument. To work correctly it should have read:
>
> for ($i=0; $i<=$#ARGV; $i++)
>
> That does not appear to be on O'Reilly's errata page, either.
> Wait, let me rectify that... there, submitted.
>
> Seems to me, that kind of error is a good argument against using such
> C-like programming idioms in Perl when they can be avoided.
There is nothing "C like" about:
for (i = 0; i <= n; i++)
where the loop actually performs n+1 iterations.
While such a thing could occur in a C program (hopefully for a very
good reason) C argument processing in the main function doesn't work that way.
The "argc" value counts the number of arguments including the program name:
argv[0] through argv[argc-1]. There is an extra element in the array which is
a null pointer: argv[argc] == 0. So a C loop to walk and print the arguments
including program name would be:
for (i = 0; i < argc; i++)
    puts(argv[i]);
------------------------------
Date: Wed, 18 Mar 2015 23:19:46 +0000
From: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Subject: Re: An error on page 142 of The Camel Book.
Message-Id: <87r3slevvx.fsf@doppelsaurus.mobileactivedefense.com>
Kaz Kylheku <kaz@kylheku.com> writes:
> On 2015-03-18, Robbie Hatley <see.my.sig@for.my.address> wrote:
>>
>> I just noticed that page 142 of The Camel Book (4th ed) gives the
>> following snippet of code:
>>
>> for ($i=0; $i<$#ARGV; $i++)
>>
>> That's not going to work as intended, because it will always ignore
>> the last argument. To work correctly it should have read:
>>
>> for ($i=0; $i<=$#ARGV; $i++)
>>
>> That does not appear to be on O'Reilly's errata page, either.
>> Wait, let me rectify that... there, submitted.
>>
>> Seems to me, that kind of error is a good argument against using such
>> C-like programming idioms in Perl when they can be avoided.
>
> There is nothing "C like" about:
>
> for (i = 0; i <= n; i++)
>
> where the loop actually performs n+1 iterations.
The "C-like" part is using for (;;) to implement a counting loop in
order to iterate over the values of a set. The problem of using the
wrong test or stop value (correct would be either
for ($i = 0; $i <= $#ARGV; $i++) {
}
or
for ($i = 0; $i < @ARGV; $i++) {
}
) can easily be avoided in Perl by iterating over the set itself instead, ie
either
for (@ARGV) {
}
or
for (0 .. $#ARGV) {
}
> While such a thing could occur in a C program (hopefully for a very
> good reason) C argument processing in the main function doesn't work that way.
> The "argc" value counts the number of arguments including the program name:
> argv[0] through argv[argc-1]. There is an extra element in the array which is
> a null pointer: argv[argc] == 0. So a C loop to walk and print the arguments
> including program name would be:
>
> for (i = 0; i < argc; i++)
> puts(argv[i]);
Even in C, I'd usually rather avoid this idiom, eg, in the given case,
use something like
-------
#include <stdio.h>
int main(int argc, char **argv)
{
    char *p;
    while ((p = *argv)) {
        puts(p);
        ++argv;
    }
    return 0;
}
-------
or even
-------
#include <stdio.h>
int main(int argc, char **argv)
{
    do puts(*argv); while (*++argv);
    return 0;
}
-------
IMHO, for (;;) counting loops are a bad idea in any language.
------------------------------
Date: Wed, 18 Mar 2015 17:44:25 -0700
From: "$Bill" <news@todbe.com>
Subject: Re: An error on page 142 of The Camel Book.
Message-Id: <med63s$m87$1@dont-email.me>
On 3/18/2015 16:19, Rainer Weikusat wrote:
>
> IMHO, for (;;) counting loops are a bad idea in any language.
I can't think of a single reason why.
I guess all of my code is a bad idea - I use the for (;;) in most every
script I write and am not about to change. I think it's a wonderful
construct and have been using it most of my life (C, Perl, etc). ;)
And I almost never use
$ii <= $#ARGV;
in deference to
$ii < @ARGV;
or
$ii < $num_items;
and I always use $ii, $jj, $kk in deference to $i, $j, $k for easy
searching of lines using indexes (even though it only saves 1 stroke
and a hand slide).
Druthers ...
------------------------------
Date: Thu, 19 Mar 2015 00:54:04 +0000 (UTC)
From: Kaz Kylheku <kaz@kylheku.com>
Subject: Re: An error on page 142 of The Camel Book.
Message-Id: <20150318174055.104@kylheku.com>
On 2015-03-18, Rainer Weikusat <rweikusat@mobileactivedefense.com> wrote:
> Kaz Kylheku <kaz@kylheku.com> writes:
>> On 2015-03-18, Robbie Hatley <see.my.sig@for.my.address> wrote:
>>>
>>> I just noticed that page 142 of The Camel Book (4th ed) gives the
>>> following snippet of code:
>>>
>>> for ($i=0; $i<$#ARGV; $i++)
>>>
>>> That's not going to work as intended, because it will always ignore
>>> the last argument. To work correctly it should have read:
>>>
>>> for ($i=0; $i<=$#ARGV; $i++)
>>>
>>> That does not appear to be on O'Reilly's errata page, either.
>>> Wait, let me rectify that... there, submitted.
>>>
>>> Seems to me, that kind of error is a good argument against using such
>>> C-like programming idioms in Perl when they can be avoided.
>>
>> There is nothing "C like" about:
>>
>> for (i = 0; i <= n; i++)
>>
>> where the loop actually performs n+1 iterations.
>
> The "C-like" part is using for (;;) to implement a counting loop in
> order to iterate over the values of a set. The problem of using the
> wrong test or stop value, correct would be either
>
> for ($i = 0; $i <= $#ARGV; $i++) {
> }
>
> or
>
> for ($i = 0; $i < @ARGV; $i++) {
> }
>
> can easily be avoided in Perl by iterating over the set itself instead, ie
> either
Nevertheless, there is some counting stupidity there which contributes
to the probability of a bug.
You would think that $#ARGV means "number of arguments", right?
But in fact, the number of arguments (not including the
script name) is not $#ARGV, but $#ARGV+1, and $#ARGV+2 if
the script name is included.
This program
#!/usr/bin/perl
print $#ARGV, "\n";
print $0, "\n";
confirms that, when it is run with no arguments, $#ARGV is -1,
which is fucking retarded right off the bat.
If one argument is supplied, this is zero, and so on.
In the shell language, $# counts the arguments starting with $1, not including
$0. This is sane and reasonable enough.
In C main, argc counts the arguments including the program name argv[0].
Reasonable.
$#ARGV being one less than the number of arguments is not reasonable.
So, overall point: don't blame this bug entirely on the damn for loop.
Part of the blame goes to the (C coding) numbnut who came up with this ARGV
in Perl.
------------------------------
Date: Wed, 18 Mar 2015 18:44:13 -0700
From: Robbie Hatley <see.my.sig@for.my.address>
Subject: Re: An error on page 142 of The Camel Book.
Message-Id: <O9GdnUfU-et2tJfInZ2dnUVZ572dnZ2d@giganews.com>
On 3/18/2015 3:57 AM, Henry Law wrote:
> ... But using a loop variable is sometimes essential when, for example,
> you want to be able to issue error messages like "Item $i is borked"...
Yes, and also, when you don't want to do "for each", but rather
"for some but not all". For example, I have one program which needs to use
C-style for loops because it needs to process just part of an array.
One of its loops looks roughly like this:
for ( my $i = $start ; $i < $stop ; ++$i ) {
... # process $array[$i]
}
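For comparison, a half-open range like that can also be walked without a C-style loop, either over the indices or over an array slice. This is only a sketch; the sample @array and the values of $start and $stop are made up for illustration.

```perl
use strict;
use warnings;

my @array = map { $_ * 10 } 0 .. 9;    # sample data: 0, 10, ..., 90
my ($start, $stop) = (3, 6);           # process indices 3, 4, 5 (assumed values)

# Variant 1: iterate over the indices $start .. $stop - 1.
my @by_index;
for my $i ($start .. $stop - 1) {
    push @by_index, $array[$i];
}

# Variant 2: iterate over a slice when the index itself is not needed.
my @by_slice;
for my $elem (@array[$start .. $stop - 1]) {
    push @by_slice, $elem;
}

print "@by_index\n";    # 30 40 50
print "@by_slice\n";    # 30 40 50
```

The "- 1" is still a boundary one can get wrong, so this moves the off-by-one risk rather than removing it.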
--
Cheers,
Robbie Hatley
Midway City, CA, USA
perl -le 'print "\154o\156e\167o\154f\100w\145ll\56c\157m"'
http://www.well.com/user/lonewolf/
https://www.facebook.com/robbie.hatley
------------------------------
Date: Wed, 18 Mar 2015 20:36:49 -0700
From: Robbie Hatley <see.my.sig@for.my.address>
Subject: Re: An error on page 142 of The Camel Book.
Message-Id: <_76dnRpvSbfT2ZfInZ2dnUVZ57ydnZ2d@giganews.com>
On 3/18/2015 3:35 PM, Kaz Kylheku wrote:
> On 2015-03-18, Robbie Hatley <see.my.sig@for.my.address> wrote:
> >
> > I just noticed that page 142 of The Camel Book (4th ed) gives the
> > following snippet of code:
> >
> > for ($i=0; $i<$#ARGV; $i++)
> >
> > That's not going to work as intended, because it will always ignore
> > the last argument. To work correctly it should have read:
> >
> > for ($i=0; $i<=$#ARGV; $i++)
> >
> > That does not appear to be on O'Reilly's errata page, either.
> > Wait, let me rectify that... there, submitted.
> >
> > Seems to me, that kind of error is a good argument against using such
> > C-like programming idioms in Perl when they can be avoided.
>
> There is nothing "C like" about:
>
> for (i = 0; i <= n; i++)
Actually, that *is* a valid C statement (except that it's missing the
block which normally follows after).
> where the loop actually performs n+1 iterations.
It will (in either C or Perl), unless something stops it from doing
so ("last" in Perl, "break" in C, or an error occurs).
> While such a thing could occur in a C program
Huh? I thought you just asserted that such a thing could never occur
in C? But that was an incorrect assertion anyway, so I'm glad you've
now changed your mind.
> (hopefully for a very good reason)
The loop is valid whether the programmer's reason for writing his program
that way is very good, moderately good, mediocre, or atrocious.
> C argument processing in the main function doesn't work that way.
Perl isn't C.
> The "argc" value counts the number of arguments including the program name:
> argv[0] through argv[argc-1].
Perl equivalent: @ARGV in a scalar context is the number of arguments.
(The name of the program, if memory serves, is in $0.)
$#ARGV on the other hand is "last index".
Also, $#ARGV can do things that have no C equivalent at all, because it is
an LVALUE. So you can foreshorten a 100-element array to 50 elements like so:
#! /usr/bin/perl
# ~/scripts/test/foreshorten.perl
our @array;
for ( my $i = 1 ; $i <= 100 ; ++$i ) {push(@array,$i);}
$#array = 49;
# After that last line, @array is now a 50-element array containing
# the integers 1 through 50. An attempt to access $array[73] will not
# be considered an error, but will yield the "undefined" value.
print("$array[24]\n"); # prints 25
print("$array[73]\n"); # prints blank line (because $array[73] is undefined)
$#array = -1; # erase the whole damn array
print("$array[ 0]\n"); # prints blank line (because $array[ 0] is undefined)
print("$array[13]\n"); # prints blank line (because $array[13] is undefined)
print("$array[62]\n"); # prints blank line (because $array[62] is undefined)
> There is an extra element in the array which is a null pointer: argv[argc] == 0.
Lemme research that. :::researches::: Ok, yes, that's true. But not a
particularly useful piece of knowledge that I can see, if one already
knows that argc is the size of argv.
> So a C loop to walk and print the arguments including program name would be:
>
> for (i = 0; i < argc; i++)
> puts(argv[i]);
Well, yes. But I don't see how that's relevant to anything. The topic here is
Perl, and more specifically, the difference between scalar(@ARGV) and $#ARGV.
And the difference is 1. :-) Always the following expression is true:
scalar(@ARGV) == $#ARGV + 1
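A quick way to check that relation, including the empty case where the last index is -1, is the sketch below. It uses an ordinary lexical array named @list as a stand-in for @ARGV, which is an assumption made so the snippet runs without arguments.

```perl
use strict;
use warnings;

my @results;
for my $n (0 .. 3) {
    my @list  = (1 .. $n);       # $n elements; the range is empty when $n == 0
    my $count = scalar(@list);   # number of elements
    my $last  = $#list;          # index of the last element, -1 if empty
    die "invariant broken" unless $count == $last + 1;
    push @results, "$count:$last";
    print "count=$count last_index=$last\n";
}
# count=0 last_index=-1
# count=1 last_index=0
# count=2 last_index=1
# count=3 last_index=2
```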
--
Cheers,
Robbie Hatley
Midway City, CA, USA
perl -le 'print "\154o\156e\167o\154f\100w\145ll\56c\157m"'
http://www.well.com/user/lonewolf/
https://www.facebook.com/robbie.hatley
------------------------------
Date: Mon, 16 Mar 2015 23:17:40 -0400
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: each ARRAY
Message-Id: <87sid448iz.fsf@stemsystems.com>
>>>>> "PJH" == Peter J Holzer <hjp-usenet3@hjp.at> writes:
PJH> Where using each on an array shines is that you can use it to get the
PJH> index and value at once:
PJH> while (my ($i, $v) = each @arr) { say $i if $v }
PJH> It's not much of an improvement (if any) in this trivial example, and
PJH> you could always replace it with
PJH> for my $i (0 .. $#arr) {
PJH> $v = $arr[$i];
PJH> ...
the each will be faster too as it doesn't need to index. array indexing
in perl is slowish and i try to avoid it. in fact when i review code, i
will notice array indexing and see if it was really needed. too often
perl coders (of all skills) are not thinking perlish and use indexing
when it is not needed. each on arrays is one way to reduce its use.
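For reference, a minimal runnable form of the each-on-array loop quoted above might look like this. It is a sketch only: it requires perl 5.12 or later, the sample array is made up, and it says nothing about relative speed.

```perl
use 5.012;       # each on arrays was added in perl 5.12
use warnings;

my @arr = qw(x y z);    # sample data (assumption)
my @pairs;

# each returns (index, value) for arrays and an empty list when done
while (my ($i, $v) = each @arr) {
    push @pairs, "$i=$v";
}

say "@pairs";    # 0=x 1=y 2=z
```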
uri
------------------------------
Date: Tue, 17 Mar 2015 11:26:41 +0100
From: "G.B." <bauhaus@futureapps.invalid>
Subject: Re: each ARRAY
Message-Id: <me8vfh$4fj$1@dont-email.me>
On 17.03.15 04:17, Uri Guttman wrote:
> too often
> perl coders (of all skills) are not thinking perlish and use indexing
> when it is not needed
In fairness, Perl version 5.12 or later is not available
with typical system installations, even today. So, "each"
does not work on arrays.
$ perl -w -e '@a = (1,2,3,4,5); while (($_, $a) = each @a) { print }'
Type of arg 1 to each must be hash (not array dereference) at -e line 1,
near "@a) "
Execution of -e aborted due to compilation errors.
$
------------------------------
Date: Tue, 17 Mar 2015 06:50:15 -0700 (PDT)
From: "C.DeRykus" <derykus@gmail.com>
Subject: Re: each ARRAY
Message-Id: <fa8ecdd8-5846-4b81-9c6f-b2e8940bf70b@googlegroups.com>
On Monday, March 16, 2015 at 8:19:54 PM UTC-7, Uri Guttman wrote:
> >>>>> "PJH" == Peter J Holzer <hjp-usenet3@hjp.at> writes:
>
>
> PJH> Where using each on an array shines is that you can use it to get the
> PJH> index and value at once:
>
> PJH> while (my ($i, $v) = each @arr) { say $i if $v }
>
> PJH> It's not much of an improvement (if any) in this trivial example, and
> PJH> you could always replace it with
>
> PJH> for my $i (0 .. $#arr) {
> PJH> $v = $arr[$i];
> PJH> ...
>
> the each will be faster too as it doesn't need to index. array indexing
> in perl is slowish and i try to avoid it. in fact when i review code, i
> will notice array indexing and see if it was really needed. too often
> perl coders (of all skills) are not thinking perlish and use indexing
> when it is not needed. each on arrays is one way to reduce its use.
>
It was noted earlier.
And, of course, if @arr is expendable, a shift is stratospherically faster:
for ( 0 .. $#arr) { $v = shift @arr; say $_ if $v }
--
Charles DeRykus
------------------------------
Date: Tue, 17 Mar 2015 15:20:06 +0000
From: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Subject: Re: each ARRAY
Message-Id: <87fv93mz15.fsf@doppelsaurus.mobileactivedefense.com>
"G.B." <bauhaus@futureapps.invalid> writes:
> On 17.03.15 04:17, Uri Guttman wrote:
>>the each will be faster too as it doesn't need to index. array indexing
>> in perl is slowish and i try to avoid it. in fact when i review code, i
>> too often perl coders (of all skills) are not thinking perlish and
>> use indexing when it is not needed
>
> In fairness, Perl version 5.12 or later is not available
> with typical system installations, even today. So, "each"
> does not work on arrays.
I didn't originally plan to write this as I'm not the style police and
this is IMHO mostly a matter of style: Applying each to an array is a
fishy, new-fangled idea smelling strongly like PHP (that's where they
don't have arrays to begin with) while Perl has supported indexed access
to array elements since the neolithic age or so. Further, this 'slowish
operation' takes (5.14.2) on one computer where I tested this about
6.92E-8s, being about an order of magnitude faster than a comparable hash
lookup (8.07E-7),
--------------
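use Benchmark;
my @a = 0 .. 999;
my %h = map { $_ => 1 } 0 .. 999;
timethese(-3,
          {
           # baseline: cost of rand() alone
           rand => sub {
               rand(1000);
           },
           # array access; the fractional index is truncated to an integer
           ary => sub {
               $a[rand(1000)];
           },
           # hash access; the fractional key is stringified, so this also
           # times number-to-string conversion and almost never hits a key
           dict => sub {
               $h{rand(1000)};
           }});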
-------------
Further (and that's exactly the result I expected), at least for me, the
indexed for loop below
-------------
use Benchmark qw(cmpthese);
my @a = 0 .. 999;
cmpthese(-3,
         {
          each => sub {
              my ($k, $v, $s);
              $s += $k * $v while ($k, $v) = each(@a);
          },
          ndx => sub {
              my $s;
              $s += $a[$_] * $_ for 0 .. $#a;
          }});
-------------
is a little more than twice as fast as the other.
------------------------------
Date: Tue, 17 Mar 2015 21:54:20 -0700
From: Robbie Hatley <see.my.sig@for.my.address>
Subject: Re: Read from huge text file
Message-Id: <F-idnaF0HqVnmZTInZ2dnUVZ57ydnZ2d@giganews.com>
On 3/13/2015 12:22 PM, Rainer Weikusat wrote:
> gamo <gamo@telecable.es> writes:
>> El 13/03/15 a las 09:35, Robert Crandal escribió:
>>> Suppose I have the following input text file which is
>>> divided into paragraphs. Each paragraph is separated
>>> by a border of 25 star characters.:
>>
>> local $/ = "*" x 25;
>>
>> and proceed as usual
>>
>> while (<IN>){
>> # $_ now contains a paragraph
>> }
>
> When doing this, $_ will end with the 25 stars and all paragraphs after
> the first will begin with the \n of the separator line. Probably more in
> line with what the OP wanted:
>
> local $/ = ('*' x 25)."\n";
>
> while (<IN>) {
> chomp; # remove separator from $_
> # now contains the paragraph
> }
Fascinating. So you're saying "chomp" will delete whatever $/ is
from the end of $_ (or its argument)? If so, I can simplify what
I was doing in my last script to get rid of Windows newlines.
Standard chomp was just chomping the \x0a off the end and leaving
a \x0d at the end of every line, which was screwing up my file
processing. So I made a Windows version of chomp and saved it in
a perl library folder as a module so I could "use" it:
#! /usr/bin/perl
# /cygwin64/lib/perl5/5.14/WinChomp.pm
# Removes windows newline characters from the ends of text lines.
use v5.14;
use strict;
use warnings;
sub winchomp (_) {
s/\x0a$//; # get rid of LF
s/\x0d$//; # get rid of CR
}
1;
Then I invoked it in scripts like so:
#! /usr/bin/perl
# /rhe/scripts/util/some-damn-script.perl
use v5.14;
use strict;
use warnings;
use WinChomp;
while (<>) {
winchomp;
...
}
However, from what you're telling me, I could just do this instead?
$/="\x0d\x0a";
while (<>) {
chomp;
...
}
--
Cheers,
Robbie Hatley
Midway City, CA, USA
perl -le 'print "\154o\156e\167o\154f\100w\145ll\56c\157m"'
http://www.well.com/user/lonewolf/
https://www.facebook.com/robbie.hatley
------------------------------
Date: Tue, 17 Mar 2015 23:17:05 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: Read from huge text file
Message-Id: <206iga5qi848bs85om1qm85qqnghvvvj3b@4ax.com>
Robbie Hatley <see.my.sig@for.my.address> wrote:
>Fascinating. So you're saying "chomp" will delete whatever $/ is
>from the end of $_ (or its argument)?
Well, at least it is what the documentation is saying:
chomp [...] removes any trailing string that
corresponds to the current value of $/
[...].
jue
------------------------------
Date: Wed, 18 Mar 2015 19:55:03 +0000
From: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Subject: Re: Read from huge text file
Message-Id: <87zj7adqso.fsf@doppelsaurus.mobileactivedefense.com>
Robbie Hatley <see.my.sig@for.my.address> writes:
> On 3/13/2015 12:22 PM, Rainer Weikusat wrote:
[...]
>> local $/ = ('*' x 25)."\n";
>>
>> while (<IN>) {
>> chomp; # remove separator from $_
[...]
> I made a Windows version of chomp and saved it in
> a perl library folder as a module so I could "use" it:
>
> #! /usr/bin/perl
> # /cygwin64/lib/perl5/5.14/WinChomp.pm
> # Removes windows newline characters from the ends of text lines.
> use v5.14;
> use strict;
> use warnings;
> sub winchomp (_) {
> s/\x0a$//; # get rid of LF
> s/\x0d$//; # get rid of CR
> }
[...]
> However, from what you're telling me, I could just do this instead?
>
> $/="\x0d\x0a";
> while (<>) {
> chomp;
> ...
> }
That's what the chomp documentation (perldoc -f chomp) says:
chomp VARIABLE
chomp( LIST )
chomp
This safer version of "chop" removes any trailing string that
corresponds to the current value of $/ (also known as
$INPUT_RECORD_SEPARATOR in the "English" module).
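A small sketch of that behaviour with a CR LF separator follows. It reads from an in-memory filehandle so it is self-contained; the sample data is made up, and it assumes no I/O layer is translating line endings before the script sees them.

```perl
use strict;
use warnings;

# Sample input with Windows line endings; the last line has no terminator.
my $data = "first\x0d\x0asecond\x0d\x0athird";
open(my $in, '<', \$data) or die "cannot open in-memory file: $!";

my @lines;
{
    local $/ = "\x0d\x0a";           # records now end in CR LF
    while (my $line = <$in>) {
        chomp $line;                 # removes a trailing CR LF, if present
        push @lines, $line;
    }
}
close $in;

print join('|', @lines), "\n";       # first|second|third
```

Using local inside a block restores the previous value of $/ afterwards, so code elsewhere that expects newline-terminated records is not affected.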
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 4392
***************************************