[12716] in Perl-Users-Digest
Perl-Users Digest, Issue: 126 Volume: 9
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Jul 13 19:47:25 1999
Date: Tue, 13 Jul 1999 16:37:41 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Tue, 13 Jul 1999 Volume: 9 Number: 126
Today's topics:
Re: I need help <gellyfish@gellyfish.com>
Re: Is it just me or....... (John Borwick)
Re: Is it just me or....... <mjcarman@zeus.ia.net>
Re: Is it just me or....... (Abigail)
Re: Is it just me or....... <chris@inta.net.uk>
Digest Administrivia (Last modified: 1 Jul 99) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: 12 Jul 1999 20:52:11 -0000
From: Jonathan Stowe <gellyfish@gellyfish.com>
Subject: Re: I need help
Message-Id: <7mdkhr$45a$1@gellyfish.btinternet.com>
On Sun, 11 Jul 1999 15:58:44 -0400 (EDT) PaCzBuckinAtCha@webtv.net wrote:
>
> --WebTV-Mail-7676-19459
> Content-Type: Text/Plain; Charset=US-ASCII
> Content-Transfer-Encoding: 7Bit
>
> ok im looking for a C compiler. im on webtv as u already can see.
<enough>
It is a little ironic that this has threaded (on my system, anyhow) with
two other separate threads bearing an identical Subject line - both off
topic as well, and one similarly enmired in MIME (my response to which
incidentally led to a particularly amusing flame from the original poster).
Anyhow, this is not a question appropriate for this group - but I will
suggest that if you need to use proper tools then you ought to get a
proper computer.
/J\
--
Jonathan Stowe <jns@gellyfish.com>
Some of your questions answered:
<URL:http://www.btinternet.com/~gellyfish/resources/wwwfaq.htm>
Hastings: <URL:http://www.newhoo.com/Regional/UK/England/East_Sussex/Hastings>
------------------------------
Date: Mon, 12 Jul 1999 19:30:48 GMT
From: John.Borwick@sas.com (John Borwick)
Subject: Re: Is it just me or.......
Message-Id: <379239d4.25517402@newshost.unx.sas.com>
[ Sorry for the length ]
On Mon, 12 Jul 1999 19:04:21 +0100, "Chris Denman" <chris@inta.net.uk>
wrote:
>>>split command. Later I can then access any record by using
>>>$DATA[$loop]{'Name'}. All of the code is fine, but I have pinpointed the
>>You could create fewer hashes, like
>>$DATA{'Name'}[$loop];
>>$DATA{'Whatever'}[$loop];
>$DATA[1020]{'Name'} would be the Name field of record number 1020
Just look at what you're doing, though. With your storage method, you
create a new hash and a new array element for each line. With my
method, you create a new array element for each line but only create a
new hash when you've never seen that element name before.
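To make the difference concrete, here is a minimal sketch of the two layouts
side by side. The sample lines are the ones quoted in this thread; the
variable names @AoH and %HoA are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @lines = ("Name:Fred:\n", "Address:Bloggs:\n", "Age:21:\n");

my (@AoH, %HoA);
my $loop = 0;
for my $line (@lines) {
    chomp $line;
    my ($field, $value) = (split /:/, $line)[0, 1];

    # Chris's layout: array of hashes -- a fresh anonymous hash
    # springs into existence for every single line.
    $AoH[$loop]{$field} = $value;

    # John's layout: hash of arrays -- a new hash key is created only
    # the first time a field name is seen; after that, each line adds
    # just one array element.
    push @{ $HoA{$field} }, $value;

    $loop++;
}

print $AoH[0]{Name}, "\n";   # Fred
print $HoA{Name}[0], "\n";   # Fred
```

One caveat: in the hash-of-arrays layout the array index counts occurrences
of that particular field, not the overall record number, so the two layouts
only line up directly when every record carries every field.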
>I know there are better ways of storing and/or extracting data, but we are
>stuck with the current system at the moment as so many of our clients use
>it. We would have to re-write everyone's data structures.
If you're stuck, you should at least consider implementing an OO
method of storing data so you can change it more easily in the future.
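The point of the OO suggestion is that callers stop touching the data
structure directly, so the layout can change later without rewriting client
code. A minimal sketch, with a hypothetical class name (RecordStore, set,
and get are my inventions, not anything from the thread):

```perl
use strict;
use warnings;

package RecordStore;   # hypothetical class name

sub new { my $class = shift; return bless { data => {} }, $class }

# set($record_number, $field, $value) -- the hash-of-arrays layout
# below is an internal detail that callers never see, so it could be
# swapped for an array of hashes (or a database) without breaking them.
sub set {
    my ($self, $n, $field, $value) = @_;
    $self->{data}{$field}[$n] = $value;
}

sub get {
    my ($self, $n, $field) = @_;
    return $self->{data}{$field}[$n];
}

package main;

my $store = RecordStore->new;
$store->set(1020, 'Name', 'Fred');
print $store->get(1020, 'Name'), "\n";   # Fred
```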
As far as split goes, you might want to use a regular expression
instead of a split if you only want the first two fields in your file,
every time.
I've done some tests of split and regular expressions, and in your
case given the sample data regexps work better. I did 5 loops of
2**17 iterations for each sample line of code you gave, and each time
the 'regexp' fn won.
So maybe try:
@_ = /^([^:]*):([^:]*)/;
Below is the program & results. Again, I'm sorry this message is so
long.
#-- begin
use strict;
use Benchmark;

foreach my $line ( "Name:Fred:\n", "Address:Bloggs:\n", "Age:21:\n" ) {
    print "line = $line\n";
    for (my $c = 0; $c < 5; $c++) {
        print "Try number $c\n";
        timethese(2**17, {
            'split' => sub {
                $_ = $line;
                @_ = (split /:/)[0,1];
            },
            'regexp' => sub {
                $_ = $line;
                @_ = /^([^:]*):([^:]*)/;
            },
        });
    }
}
#-- end
# results:
# All lines starting with "Benchmark:" have been removed
# to shorten this message. All iterations were
# performed 131072 times.
line = Name:Fred:
Try number 0
regexp: 4 secs ( 3.13 usr 0.01 sys = 3.14 cpu)
split: 4 secs ( 3.82 usr 0.00 sys = 3.82 cpu)
Try number 1
regexp: 2 secs ( 3.17 usr 0.00 sys = 3.17 cpu)
split: 7 secs ( 3.87 usr 0.02 sys = 3.89 cpu)
Try number 2
regexp: 7 secs ( 3.20 usr 0.01 sys = 3.21 cpu)
split: 4 secs ( 3.94 usr 0.02 sys = 3.96 cpu)
Try number 3
regexp: 4 secs ( 3.24 usr 0.02 sys = 3.26 cpu)
split: 7 secs ( 3.85 usr 0.01 sys = 3.86 cpu)
Try number 4
regexp: 2 secs ( 3.14 usr 0.00 sys = 3.14 cpu)
split: 3 secs ( 3.83 usr 0.00 sys = 3.83 cpu)
line = Address:Bloggs:
Try number 0
regexp: 4 secs ( 3.24 usr 0.01 sys = 3.25 cpu)
split: 4 secs ( 3.81 usr 0.00 sys = 3.81 cpu)
Try number 1
regexp: 2 secs ( 3.23 usr 0.00 sys = 3.23 cpu)
split: 3 secs ( 3.83 usr 0.00 sys = 3.83 cpu)
Try number 2
regexp: 5 secs ( 3.24 usr 0.01 sys = 3.25 cpu)
split: 6 secs ( 3.84 usr 0.01 sys = 3.85 cpu)
Try number 3
regexp: 4 secs ( 3.26 usr 0.01 sys = 3.27 cpu)
split: 5 secs ( 3.85 usr 0.00 sys = 3.85 cpu)
Try number 4
regexp: 4 secs ( 3.24 usr 0.00 sys = 3.24 cpu)
split: 5 secs ( 3.79 usr -0.01 sys = 3.78 cpu)
line = Age:21:
Try number 0
regexp: 4 secs ( 3.13 usr 0.00 sys = 3.13 cpu)
split: 4 secs ( 3.81 usr 0.01 sys = 3.82 cpu)
Try number 1
regexp: 2 secs ( 3.11 usr 0.00 sys = 3.11 cpu)
split: 3 secs ( 3.78 usr 0.00 sys = 3.78 cpu)
Try number 2
regexp: 4 secs ( 3.12 usr 0.00 sys = 3.12 cpu)
split: 4 secs ( 3.82 usr 0.01 sys = 3.83 cpu)
Try number 3
regexp: 4 secs ( 3.11 usr 0.00 sys = 3.11 cpu)
split: 5 secs ( 3.85 usr 0.00 sys = 3.85 cpu)
Try number 4
regexp: 3 secs ( 3.11 usr 0.00 sys = 3.11 cpu)
split: 3 secs ( 3.82 usr 0.00 sys = 3.82 cpu)
--
John Borwick
------------------------------
Date: Mon, 12 Jul 1999 18:55:32 -0500
From: Michael Carman <mjcarman@zeus.ia.net>
Subject: Re: Is it just me or.......
Message-Id: <378A8074.C0C11C78@zeus.ia.net>
Chris Denman wrote:
> $line is the record number so
>
> $DATA[1020]{'Name'} would be the Name field of record number 1020
If you know (or can determine) ahead of time how many elements you're going to
have, you can allocate storage for the whole mess up front. This saves having
to extend your array on each pass through the loop. (I think this is what John
was hinting at.)
e.g. (assuming you've read your file into @file)
$line_count = @file;
$DATA[$line_count] = undef; # Don't need to actually set a value yet.
foreach (@file) {
# Do something interesting
}
I believe you can preallocate hashes as well to avoid all the autovivification
-- you may want to look into that.
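A sketch of both preallocations, with a made-up $line_count standing in for
scalar(@file) since the real file isn't shown. Note that assigning to
$#array is the usual idiom: it sizes the array to exactly $line_count
elements, whereas assigning undef to $DATA[$line_count] leaves you one
element longer than you need.

```perl
use strict;
use warnings;

# Hypothetical line count standing in for scalar(@file).
my $line_count = 3000;

# Preallocate the array: setting $#DATA extends it to exactly
# $line_count elements (indices 0 .. $line_count - 1).
my @DATA;
$#DATA = $line_count - 1;

# Preallocate hash buckets: assigning to keys() (see perldoc -f keys)
# hints how many entries to expect, avoiding rehashing as it grows.
my %record;
keys(%record) = 128;

print scalar(@DATA), "\n";   # 3000
```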
All this doesn't make split() run any faster, of course, but it could improve
the performance of the loop as a whole.
-mjc
------------------------------
Date: 12 Jul 1999 23:18:34 -0500
From: abigail@delanet.com (Abigail)
Subject: Re: Is it just me or.......
Message-Id: <slrn7olfg1.h7.abigail@alexandra.delanet.com>
Chris Denman (chris@inta.net.uk) wrote on MMCXLI September MCMXCIII in
<URL:news:7mctci$1q68$1@news2.vas-net.net>:
[]
[] What I do is open up each record and create an associative array using the
[] split command. Later I can then access any record by using
[] $DATA[$loop]{'Name'}. All of the code is fine, but I have pinpointed the
[] bottleneck to the split command.
[]
[] There are about 3000 records, and the results take many seconds to appear.
[] The server is fast, and the files load in fast.
Well, yes, now that I see it, that split command is indeed silly.
(What exactly do you expect as an answer - you want input on your code,
but you don't show anything)
Abigail
--
sub _'_{$_'_=~s/$a/$_/}map{$$_=$Z++}Y,a..z,A..X;*{($_::_=sprintf+q=%X==>"$A$Y".
"$b$r$T$u")=~s~0~O~g;map+_::_,U=>T=>L=>$Z;$_::_}=*_;sub _{print+/.*::(.*)/s}
*_'_=*{chr($b*$e)};*__=*{chr(1<<$e)};
_::_(r(e(k(c(a(H(__(l(r(e(P(__(r(e(h(t(o(n(a(__(t(us(J())))))))))))))))))))))))
-----------== Posted via Newsfeeds.Com, Uncensored Usenet News ==----------
http://www.newsfeeds.com The Largest Usenet Servers in the World!
------== Over 73,000 Newsgroups - Including Dedicated Binaries Servers ==-----
------------------------------
Date: Tue, 13 Jul 1999 12:16:51 +0100
From: "Chris Denman" <chris@inta.net.uk>
Subject: Re: Is it just me or.......
Message-Id: <7mf7de$2n5o$1@news2.vas-net.net>
Thanks to all for your interest. I am sorry about not supplying enough
code, but all comments have been greatly appreciated and I will implement
the techniques soon! I will let you know how I get on.
Thanks again...
Chris Denman
------------------------------
Date: 1 Jul 99 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 1 Jul 99)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.misc (and this Digest), send your
article to perl-users@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
The Meta-FAQ, an article containing information about the FAQ, is
available by requesting "send perl-users meta-faq". The real FAQ, as it
appeared last in the newsgroup, can be retrieved with the request "send
perl-users FAQ". Due to their sizes, neither the Meta-FAQ nor the FAQ
are included in the digest.
The "mini-FAQ", which is an updated version of the Meta-FAQ, is
available by requesting "send perl-users mini-faq". It appears twice
weekly in the group, but is not distributed in the digest.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine by
sending perl questions to the -request address; I don't have time to
answer them even if I knew the answer.
------------------------------
End of Perl-Users Digest V9 Issue 126
*************************************