[24987] in Perl-Users-Digest
Perl-Users Digest, Issue: 7237 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Oct 12 18:12:06 2004
Date: Tue, 12 Oct 2004 15:10:21 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Tue, 12 Oct 2004 Volume: 10 Number: 7237
Today's topics:
String and Array Programming in Perl (DeveloperGuy)
Re: String and Array Programming in Perl <tadmc@augustmail.com>
Re: String and Array Programming in Perl <usa1@llenroc.ude.invalid>
Re: String and Array Programming in Perl <tadmc@augustmail.com>
Re: String and Array Programming in Perl <usa1@llenroc.ude.invalid>
undef takes forever <karlUNDERSCOREkramsch@yahooPERIODcom.invalid>
Re: undef takes forever <usa1@llenroc.ude.invalid>
Re: Using a variable size with the repetition quantifier (Philippe Aymer)
Re: Using a variable size with the repetition quantifier <pinyaj@rpi.edu>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: 12 Oct 2004 12:20:14 -0700
From: Phillip.Small@gmail.com (DeveloperGuy)
Subject: String and Array Programming in Perl
Message-Id: <64f06fa2.0410121120.4b777d7c@posting.google.com>
I am very very new to Perl and am trying to automate a process on my AIX
Unix box. I issued the command ps -aef and sent it to a file. How do
I get how many different users running programs, the total time for
each user in hours:minutes format, and who is running the longest
process and the program name? I am not familiar with using the loops.
I know that I can probably use the date command to specify the date.
This is where I am stuck thus far. Please help anyone...
#! /usr/bin/perl
use strict;
use warnings;
@users;
@tmpfile = OPEN(DataFileHandle, /home/smallp/data.txt);
$tmpfile[0];
$users[0];
for ($count= 0; $count <= $#users; $count++;) {
If $tmpline[0] eq $users[i]
if TRUE then exit
push(@users, $users[0]);
}
------------------------------
Date: Tue, 12 Oct 2004 15:17:20 -0500
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: String and Array Programming in Perl
Message-Id: <slrncmoeug.vlt.tadmc@magna.augustmail.com>
DeveloperGuy <Phillip.Small@gmail.com> wrote:
> I am very very new to Perl
We will still expect that you use Perl rather than something
merely Perlish-looking.
> and am trying to automate a process on my AIX
> Unix box. I issued the command ps -aef and sent it to a file.
You can do that from within Perl itself, no need for a file.
my @ps_lines = `ps -aef`; # backwards single quotes
or
my @ps_lines = qx/ps -aef/; # backwards single quotes in disguise
or
open PS, 'ps -aef|' or die "could not run ps $!";
while ( <PS> ) ...
> How do
> I get how many different users running programs, the total time for
> each user in hours:minutes format, and who is running the longest
> process and the program name?
By parsing the output of the ps command.
You might want to use Perl's unpack() or substr() functions
to help you with that.
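A minimal sketch of the unpack() approach, assuming a made-up fixed-width
layout (three right-justified 8-character columns followed by the rest of
the line). The template, the parse_ps_line name, and the sample line are
illustrative assumptions; check the widths against the header of your own
ps output before relying on them:

```perl
use strict;
use warnings;

sub parse_ps_line {
    my ($line) = @_;
    # ASSUMPTION: hypothetical column widths, not taken from real AIX output.
    # A8 = 8-character field (trailing blanks stripped), x1 = skip one byte,
    # A* = everything that is left on the line.
    my $template = 'A8 x1 A8 x1 A8 x1 A*';
    my ($uid, $pid, $ppid, $rest) = unpack $template, $line;
    # Right-justified fields still carry leading blanks; remove them.
    s/^\s+// for $uid, $pid, $ppid;
    return ($uid, $pid, $ppid, $rest);
}

# Build a sample line in that layout so the widths are guaranteed to match.
my $sample = sprintf '%8s %8s %8s %s',
    'hbb1', 582167, 1, 'con 16:09:39 /usr/bin/BASH';

print join('|', parse_ps_line($sample)), "\n";
```

If the columns in your output are not fixed-width, split (as used elsewhere
in this thread) is the simpler choice.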
> I am not familiar with using the loops.
Then become familiar with using the loops, they are documented in:
perldoc perlsyn
> I know that I can probably use the date command to specify the date.
You can do that from within Perl too, no need for an external date program.
perldoc -f localtime
perldoc -f gmtime
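A short sketch of what those two functions give you, using strftime() from
the POSIX module that ships with Perl. The local_stamp and utc_stamp names
and the output format are assumptions chosen for illustration:

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Format an epoch time as "YYYY-MM-DD HH:MM" in the local time zone.
sub local_stamp {
    my ($epoch) = @_;
    return strftime('%Y-%m-%d %H:%M', localtime $epoch);
}

# Same, but in UTC, so the result does not depend on the machine's zone.
sub utc_stamp {
    my ($epoch) = @_;
    return strftime('%Y-%m-%d %H:%M', gmtime $epoch);
}

# localtime in list context returns the broken-down fields directly;
# the year is offset by 1900 and the month counts from 0.
my ($min, $hour, $mday, $mon, $year) = (localtime)[1 .. 5];
printf "%04d-%02d-%02d %02d:%02d\n", $year + 1900, $mon + 1, $mday, $hour, $min;

print local_stamp(time), "\n";
print utc_stamp(0), "\n";    # 1970-01-01 00:00
```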
> This is where I am stuck thus far. Please help anyone...
>
> #! /usr/bin/perl
>
> use strict;
When you put that in your programs you are making a promise:
I promise to declare my variables before using their short names.
If you break your promise, then perl will refuse to run your program.
> use warnings;
>
> @users;
You have not declared that variable, so perl refuses to run your program.
my @users;
> @tmpfile = OPEN(DataFileHandle, /home/smallp/data.txt);
Perl does not have an OPEN() function, only an open() function.
Case matters.
Put 'quotes' around your strings.
open() returns a single thing, no need for an array to hold its return value.
It is a convention to use all UPPER CASE for filehandles.
You should always, yes *always*, check the return value from open()
to ensure that you actually got what you asked for:
open DATA_FILEHANDLE, '/home/smallp/data.txt' or
die "could not open '/home/smallp/data.txt' $!";
Your code never makes use of the filehandle. You will need to *read*
from it to get the data to process...
> $tmpfile[0];
> $users[0];
Those are do-nothing statements, they have no useful effect.
What were you hoping those 2 lines of code would do for you?
> for ($count= 0; $count <= $#users; $count++;) {
A more Perlish way to get the same thing is:
foreach my $count ( 0 .. $#users ) {
> If $tmpline[0] eq $users[i]
Perl does not have an "If" keyword, only an "if" keyword.
Case (still) matters.
You need (parentheses) around the condition in an if statement.
> if TRUE then exit
Perl does not even have a "then" keyword, nor a "TRUE" keyword.
This is not Perl code. What language is it?
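For what the original loop seems to be reaching for (collecting each user
name once), a hash is the usual Perl tool. A minimal sketch, assuming the
user name is the first whitespace-separated field of each line; the
unique_users name and the sample lines are made up for illustration:

```perl
use strict;
use warnings;

sub unique_users {
    my @lines = @_;
    my %seen;     # user name => number of lines seen for that user
    my @users;    # user names in order of first appearance
    foreach my $line (@lines) {
        my ($uid) = split ' ', $line;    # first field, leading blanks ignored
        next unless defined $uid;        # skip blank lines
        if ( !$seen{$uid}++ ) {          # true only the first time we meet $uid
            push @users, $uid;
        }
    }
    return @users;
}

# Made-up sample lines; a real run would skip the ps header line first.
my @sample = (
    'hbb1 582167 1 con 16:09:39 /usr/bin/BASH',
    'root      1 0   -  08:00:00 /etc/init',
    'hbb1 565087 1 con 16:09:52 /usr/bin/PS',
);
my @users = unique_users(@sample);
print scalar(@users), " different users: @users\n";
```

With the sample lines above this prints "2 different users: hbb1 root".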
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: 12 Oct 2004 20:22:46 GMT
From: "A. Sinan Unur" <usa1@llenroc.ude.invalid>
Subject: Re: String and Array Programming in Perl
Message-Id: <Xns9580A6E7B4F18asu1cornelledu@132.236.56.8>
On 12 Oct 2004, you wrote in comp.lang.perl.misc:
> I am very very new to Perl and am trying to automate a process on my AIX
> Unix box. I issued the command ps -aef and sent it to a file. How do
> I get how many different users running programs, the total time for
> each user in hours:minutes format, and who is running the longest
> process and the program name? I am not familiar with using the loops.
> I know that I can probably use the date command to specify the date.
> This is where I am stuck thus far. Please help anyone...
This means you need to bite the bullet and actually pay for a book. For
recommendations, go to http://learn.perl.org/
Now:
> #! /usr/bin/perl
>
> use strict;
> use warnings;
Good :)
> @users;
Not good:
D:\Home> perl -c t.pl
Bareword found where operator expected at t.pl line 7, near "/home/smallp"
(Missing operator before allp?)
Global symbol "@users" requires explicit package name at t.pl line 6.
Global symbol "@tmpfile" requires explicit package name at t.pl line 7.
syntax error at t.pl line 7, near "/home/smallp"
Global symbol "@tmpfile" requires explicit package name at t.pl line 9.
Global symbol "@users" requires explicit package name at t.pl line 10.
Global symbol "$count" requires explicit package name at t.pl line 12.
Global symbol "$count" requires explicit package name at t.pl line 12.
Global symbol "@users" requires explicit package name at t.pl line 12.
Global symbol "$count" requires explicit package name at t.pl line 12.
syntax error at t.pl line 12, near "++;"
t.pl has too many errors.
Why did you not fix this stuff before posting?
> @tmpfile = OPEN(DataFileHandle, /home/smallp/data.txt);
perldoc -f open
I think it was Tad who put it most eloquently: You can't just make s**t up
and expect it to work!
> $tmpfile[0];
> $users[0];
Huh?
> for ($count= 0; $count <= $#users; $count++;) {
> If $tmpline[0] eq $users[i]
> if TRUE then exit
Huh???
> push(@users, $users[0]);
> }
I guess I'll take the bait anyway.
I have:
D:\Home> ps -v
PS (cygwin) 1.11
Process Statistics
Copyright 1996, 1997, 1998, 1999, 2000, 2001, 2002 Red Hat, Inc.
Compiled on May 25 2004
and I get:
D:\Home> ps -aef
UID PID PPID TTY STIME COMMAND
hbb1 582167 1 con 16:09:39 /usr/bin/BASH
hbb1 565087 1 con 16:09:52 /usr/bin/PS
#! /usr/bin/perl
use strict;
use warnings;
my $name = 'd:/home/data.txt';
open my $file, '<', $name
or die "Cannot open $name: $!";
<$file>; # assuming first line is header, so skip it
my $ps;
while(<$file>) {
chomp;
s/^\s+//; # strip leading whitespace; whitespace-only lines end up empty
next unless length $_;
my ($uid, $pid, $ppid, $tty, $stime, $command) = split;
$ps->{$pid} = {
uid => $uid,
ppid => $ppid,
tty => $tty,
stime => $stime,
command => $command,
};
}
use Data::Dumper;
print Dumper $ps;
__END__
Output:
D:\Home> perl t.pl
$VAR1 = {
'565087' => {
'uid' => 'hbb1',
'ppid' => '1',
'command' => '/usr/bin/PS',
'stime' => '16:09:52',
'tty' => 'con'
},
'582167' => {
'uid' => 'hbb1',
'ppid' => '1',
'command' => '/usr/bin/BASH',
'stime' => '16:09:39',
'tty' => 'con'
}
};
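To get from a structure like this to per-user totals in hours:minutes, the
time strings have to be converted to seconds first. A rough sketch, assuming
a TIME column in [[days-]hours:]minutes:seconds form (an assumption; the
cygwin output above has no such column). The helper names and the sample
records are hypothetical:

```perl
use strict;
use warnings;

# Convert "[[dd-]hh:]mm:ss" to seconds. ASSUMPTION: that is the TIME format.
sub time_to_seconds {
    my ($t) = @_;
    my $days = $t =~ s/^(\d+)-// ? $1 : 0;    # optional leading "dd-"
    my @parts = split /:/, $t;
    unshift @parts, 0 while @parts < 3;       # pad missing leading fields
    my ($h, $m, $s) = @parts;
    return ( ( $days * 24 + $h ) * 60 + $m ) * 60 + $s;
}

# Format a number of seconds as hours:minutes, dropping leftover seconds.
sub seconds_to_hhmm {
    my ($secs) = @_;
    return sprintf '%d:%02d', int( $secs / 3600 ), int( ( $secs % 3600 ) / 60 );
}

# Made-up (uid, time) pairs standing in for parsed ps records.
my @records = ( [ hbb1 => '0:45' ], [ root => '1:15:30' ], [ hbb1 => '59:30' ] );

my %total;
$total{ $_->[0] } += time_to_seconds( $_->[1] ) for @records;

print "$_ ", seconds_to_hhmm( $total{$_} ), "\n" for sort keys %total;
```

For the sample records this prints "hbb1 1:00" and "root 1:15". The longest
single process can be found the same way, by keeping the record with the
largest time_to_seconds() value while looping.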
------------------------------
Date: Tue, 12 Oct 2004 16:35:09 -0500
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: String and Array Programming in Perl
Message-Id: <slrncmojgd.bn.tadmc@magna.augustmail.com>
A. Sinan Unur <usa1@llenroc.ude.invalid> wrote:
> I think it was Tad who put it most eloquently: You can't just make s**t up
> and expect it to work!
Nope, that was MJD, not me:
http://perl.plover.com/Questions4.html
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: 12 Oct 2004 21:47:16 GMT
From: "A. Sinan Unur" <usa1@llenroc.ude.invalid>
Subject: Re: String and Array Programming in Perl
Message-Id: <Xns9580B53AA4F9Basu1cornelledu@132.236.56.8>
Tad McClellan <tadmc@augustmail.com> wrote in
news:slrncmojgd.bn.tadmc@magna.augustmail.com:
> A. Sinan Unur <usa1@llenroc.ude.invalid> wrote:
>
>> I think it was Tad who put it most eloquently: You can't just make
>> s**t up and expect it to work!
>
>
> Nope, that was MJD, not me:
>
> http://perl.plover.com/Questions4.html
Ah! Apologies for the misattribution and thank you for the link.
Sinan.
------------------------------
Date: Tue, 12 Oct 2004 19:20:55 +0000 (UTC)
From: KKramsch <karlUNDERSCOREkramsch@yahooPERIODcom.invalid>
Subject: undef takes forever
Message-Id: <ckhaqm$ss0$1@reader2.panix.com>
I have a script that, over a period of several *days* gradually
builds a very large Perl hash. Periodically, it saves this large
hash to a file, using the Storable module. In the past, this
storage process has resulted in a corrupted (and unusable) file,
so the current version of the script tests for the soundness of
the stored file by saving the hash to a dummy file first, then
retrieving the hash from that file into a temporary variable $temp,
and making sure that $temp is defined and that %$temp has the right
number of keys. If all this is as it should be, then the dummy
file is used to overwrite the old version of the hash stored on
disk.
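The procedure described above can be sketched roughly as follows. This is
not the poster's script (no code was posted); the checked_store name, the
".tmp" suffix for the dummy file, and the sample data are assumptions:

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

sub checked_store {
    my ($hash, $final) = @_;
    my $dummy = "$final.tmp";    # hypothetical naming scheme for the dummy file

    store( $hash, $dummy ) or die "could not store to '$dummy'";

    # Read the dummy file back; eval traps the die from a corrupted file.
    my $temp = eval { retrieve($dummy) };
    die "'$dummy' failed verification\n"
        unless defined $temp && keys(%$temp) == keys(%$hash);

    # Only now replace the previous copy.
    rename $dummy, $final or die "could not rename '$dummy': $!";
    return 1;
}

my $dir = tempdir( CLEANUP => 1 );
my %big = map { $_ => $_ * 2 } 1 .. 1000;
checked_store( \%big, "$dir/hash.sto" );
print "stored ", scalar( keys %{ retrieve("$dir/hash.sto") } ), " keys\n";
```

Note that $temp is freed when checked_store() returns, which is the
deallocation step discussed below.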
It turns out, however, that this version of the script is about
10x slower than the original version, which did not do this extra
check on the stored hash. Using carefully placed print statements,
I determined that the bottleneck is not due to the extra retrieval
and checking steps, but to the deallocation of %$temp that happens
when $temp goes out of scope. Since %$temp is very large and
useless once the check is done, I don't want it hanging around
longer than necessary, but the deallocation step takes 3-4 minutes!
This is about 100 times slower than the time it takes to allocate
%$temp in the first place! It's crazy. I confirmed this by
inserting an explicit statement "undef $temp" right before the end
of the enclosing scope, and noting (via print statements) that this
step is the script's worst bottleneck by far.
It's the same thing if I make $temp file-global and skip the
explicit deallocation step. Now the bottleneck becomes every time
that I assign a new value to $temp, which (except for the first
time) involves deallocating the last contents of %$temp.
Is there any way to speed up the deallocation of %$temp (and
$temp)?
Thanks!
Karl
--
Sent from a spam-bucket account; I check it once in a blue moon. If
you still want to e-mail me, cut out the extension from my address,
and make the obvious substitutions on what's left.
------------------------------
Date: 12 Oct 2004 19:50:41 GMT
From: "A. Sinan Unur" <usa1@llenroc.ude.invalid>
Subject: Re: undef takes forever
Message-Id: <Xns9580A176D599Easu1cornelledu@132.236.56.8>
KKramsch <karlUNDERSCOREkramsch@yahooPERIODcom.invalid> wrote in
news:ckhaqm$ss0$1@reader2.panix.com:
> I have a script that, over a period of several *days* gradually
> builds a very large Perl hash. Periodically, it saves this large
> hash to a file, using the Storable module. In the past, this
> storage process has resulted in a corrupted (and unusable) file,
> so the current version of the script tests for the soundness of
> the stored file by saving the hash to a dummy file first, then
> retrieving the hash from that file into a temporary variable $temp,
> and making sure that $temp is defined and that %$temp has the right
> number of keys. If all this is as it should be, then the dummy
> file is used to overwrite the old version of the hash stored on
> disk.
You can't really be sure of the "soundness" of the file using this method.
It is hard to come up with a recommendation without knowing how much that
hash really needs to stay in memory at any given time. If, most of the
time, you don't need to reference previously computed elements of the hash,
I'd recommend at least using a tied hash, a DBM module. Alternatively, my
favorite at this point, you can look at SQLite with Class::DBI.
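A minimal sketch of the tied-hash idea, using SDBM_File only because it
ships with Perl; the open_store name and the file location are assumptions
for illustration:

```perl
use strict;
use warnings;
use Fcntl;                       # for O_RDWR and O_CREAT
use SDBM_File;
use File::Temp qw(tempdir);

# Tie a hash to an SDBM database at $base and return a reference to it.
sub open_store {
    my ($base) = @_;
    tie my %h, 'SDBM_File', $base, O_RDWR | O_CREAT, 0666
        or die "could not tie '$base': $!";
    return \%h;
}

my $dir  = tempdir( CLEANUP => 1 );
my $base = "$dir/bighash";       # hypothetical location

my $h = open_store($base);
$h->{$_} = $_ * 2 for 1 .. 1000; # entries go to disk, not into a Perl hash
untie %$h;                       # flush and close the database

my $again = open_store($base);   # a later run can pick the data up again
print "21 => $again->{21}\n";
untie %$again;
```

SDBM limits each key/value pair to roughly 1 KB and stores plain strings
only, so for real data another DBM module may be a better fit.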
> It turns out, however, that this version of the script is about
> 10x slower than the original version, which did not do this extra
> check on the stored hash. Using carefully placed print statements,
> I determined that the bottleneck is not due to the extra retrieval
> and checking steps, but to the deallocation of %$temp that happens
> when $temp goes out of scope. Since %$temp is very large and
> useless once the check is done, I don't want it hanging around
> longer than necessary, but the deallocation step takes 3-4 minutes!
> This is about 100 times slower than the time it takes to allocate
> %$temp in the first place! It's crazy. I confirmed this by
> inserting an explicit statement "undef $temp" right before the end
> of the enclosing scope, and noting (via print statements) that this
> step is the script's worst bottleneck by far.
Again, without code, I have no idea what you are talking about. How big is
this thing?
Over time, most of the memory your script is using is being paged out to
the hard drive. On my Win 98 PIII500 with 128 MB of RAM, I ran the following
script:
#! perl
use strict;
use warnings;
print "Filling the hash now:\n";
my $t0 = time;
{
my $h;
$h->{$_} = $_ for (1 .. 750_000);
print <<EOT;
It took @{[ time - $t0 ]} seconds to fill the hash.
Now let's undef it:
EOT
$t0 = time;
}
print "It took @{[ time - $t0 ]} seconds to undef the hash.\n";
D:\Home> perl t.pl
Filling the hash now:
It took 11 seconds to fill the hash.
Now let's undef it:
It took 65 seconds to undef the hash.
On the other hand, with 500_000 elements instead of 750_000, I get:
D:\Home> perl t.pl
Filling the hash now:
It took 6 seconds to fill the hash.
Now let's undef it:
It took 3 seconds to undef the hash.
So, the solution seems to be to move away from holding all your data in
memory.
Sinan.
------------------------------
Date: 12 Oct 2004 11:42:54 -0700
From: aymerphilippe@hotmail.com (Philippe Aymer)
Subject: Re: Using a variable size with the repetition quantifier
Message-Id: <47971ff0.0410121042.640cad62@posting.google.com>
Great guys! Thank you!
I was sure Perl could do it. I was aware of (??{}), but for a "simple"
pattern I didn't know about the use of '"', which can be useful for more
complex regexes.
Now, I still have a problem, because:
/X(\d)((??{"\\w{$1}"}))/
works, but in my string, I also have to match newline. So I did:
/X(\d)(??{"\\w{$1}"})/s
which doesn't work (seems to apply only to //, not things within
(?..)), then:
/X(\d)(??{"[\\w\n]{$1}"})/
which doesn't work either... (?)
Any ideas?
Thanks again for your response, quick and clean!
Phil.
Brian McCauley <nobull@mail.com> wrote in message news:<ck60m1$549$1@sun3.bham.ac.uk>...
> Philippe Aymer wrote:
> >
> > I'm looking at a PERL regex (if possible) that will be able to use a
> > repetition quantifier metachar, but the number of repetition is
> > unknown until runtime.
>
> In general if you want a regex that adapts itself during its own
> execution you want (??{}).
>
> > For example:
> >
> > X3xyz...
> >
> > the number 3 give me the number of "repetition" for the next chars
> > (length of string), something like:
> >
> > /X(\d)(\w{\1})/
> >
> > but \1 is not possible within {} the repetition quantifier.
> >
> > Is there a way to use {} with the repetition number only known from
> > the regex ?
>
> /X(\d)((??{"\\w{$1}"}))/
------------------------------
Date: Tue, 12 Oct 2004 15:25:40 -0400
From: Jeff 'japhy' Pinyan <pinyaj@rpi.edu>
Subject: Re: Using a variable size with the repetition quantifier
Message-Id: <Pine.SOL.3.96.1041012152434.11683A-100000@vcmr-86.server.rpi.edu>
On 12 Oct 2004, Philippe Aymer wrote:
>Now, I still have a problem, because:
>
>/X(\d)((??{"\\w{$1}"}))/
>
>works, but in my string, I also have to match newline. So I did:
>
>/X(\d)(??{"\\w{$1}"})/s
>
>which doesn't work (seems to apply only to //, not things within
>(?..)), then:
The /s modifier only affects the '.' metacharacter. \w doesn't match \n.
>/X(\d)(??{"[\\w\n]{$1}"})/
>
>which doesn't work either... (?)
This should work:
/X(\d)((??{ "[\\w\\n]{$1}" }))/
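A small test harness for that pattern, plus a two-step alternative that
avoids (??{}) altogether. The sub names and sample strings are made up for
illustration, and the two-step version is a different technique with
slightly different behaviour: it only considers the count from the first
X<digit> in the string.

```perl
use strict;
use warnings;

# The (??{ ... }) form: the quantifier is built from $1 while matching.
sub counted_regex {
    my ($s) = @_;
    return $s =~ /X(\d)((??{ "[\\w\\n]{$1}" }))/ ? $2 : undef;
}

# Two steps: read the count first, then interpolate it into a second regex.
sub counted_two_step {
    my ($s) = @_;
    return undef unless $s =~ /X(\d)/;
    my $n = $1;
    return $s =~ /X$n([\w\n]{$n})/ ? $1 : undef;
}

for my $s ( 'X3xyz...', "X4ab\ncd" ) {
    my $got = counted_regex($s);
    # Show embedded newlines as \n so each result stays on one line.
    ( my $shown = defined $got ? $got : '(no match)' ) =~ s/\n/\\n/g;
    print "$shown\n";
}
```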
--
Jeff "japhy" Pinyan % How can we ever be the sold short or
RPI Acacia Brother #734 % the cheated, we who for every service
Senior Dean, Fall 2004 % have long ago been overpaid?
RPI Corporation Secretary %
http://japhy.perlmonk.org/ % -- Meister Eckhart
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 7237
***************************************