[29533] in Perl-Users-Digest
Perl-Users Digest, Issue: 777 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Aug 20 18:09:42 2007
Date: Mon, 20 Aug 2007 15:09:08 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Mon, 20 Aug 2007 Volume: 11 Number: 777
Today's topics:
Re: A question about regex <rvtol+news@isolution.nl>
Mail::Sender problem <jismagic@gmail.com>
MI5 Persecution: Channel Four TV News - 12/Feb/1999 (58 MI5Victim@mi5.gov.uk
MI5 Persecution: Dimbleby / John Major, April 1997 (462 MI5Victim@mi5.gov.uk
MI5 Persecution: Ken Clarke (2), April 1997 (3397) MI5Victim@mi5.gov.uk
Re: On redhat, different users = different @INC <glex_no-spam@qwest-spam-no.invalid>
Re: optimizing text file searches <wahab@chemie.uni-halle.de>
Re: optimizing text file searches <wahab@chemie.uni-halle.de>
Re: optimizing text file searches <mbuttner@gmail.com>
Re: optimizing text file searches <jgibson@mail.arc.nasa.gov>
Re: optimizing text file searches <mbuttner@gmail.com>
Re: optimizing text file searches <wahab@chemie.uni-halle.de>
Re: optimizing text file searches <mbuttner@gmail.com>
Re: Perl sum of array and help with sorting <jurgenex@hotmail.com>
Perldoc and the pipe "|" character <jkstill@gmail.com>
Re: set enviorment varibale <noreply@gunnar.cc>
simple perl regex question <steve.logan@gmail.com>
Re: simple perl regex question <wahab@chemie.uni-halle.de>
Re: simple perl regex question <wahab@chemie.uni-halle.de>
Re: Symbolic representation of logical operators anno4000@radom.zrz.tu-berlin.de
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Mon, 20 Aug 2007 22:26:43 +0200
From: "Dr.Ruud" <rvtol+news@isolution.nl>
Subject: Re: A question about regex
Message-Id: <fad4fp.1ak.1@news.isolution.nl>
J. Gleixner schreef:
> Henry Law:
>> Madhusudhanan Chandrasekaran:
>>> I am a perl newbie. I am reading from a file line by line and
>>> matching it for a particular regex. If the match is found, I want
>>> to print the previous and the next few lines.
>>
>> I'm all for using Perl, but if you're on a UNIX system then grep
>> -A1 -a1
>
> grep -A 1 -B 1
I also like grep's -P a lot.
:)
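For anyone who would rather stay in Perl, the context behaviour of grep -A/-B can be sketched roughly as below. This is only an illustration and not code from the thread: grep_context and the sample lines are made-up names, and it assumes the file is small enough to hold in memory.

```perl
use strict;
use warnings;

# Return every line that lies within $before lines ahead of, or $after
# lines behind, a matching line. Overlapping windows are merged, so no
# line is reported twice.
sub grep_context {
    my ($re, $before, $after, @lines) = @_;
    my %keep;
    for my $i (0 .. $#lines) {
        next unless $lines[$i] =~ $re;
        my $lo = $i - $before;
        $lo = 0 if $lo < 0;
        my $hi = $i + $after;
        $hi = $#lines if $hi > $#lines;
        $keep{$_} = 1 for $lo .. $hi;
    }
    return map { $lines[$_] } sort { $a <=> $b } keys %keep;
}

# Hypothetical sample input: ten numbered lines.
my @lines = map { "line $_\n" } 1 .. 10;
print grep_context(qr/line 5/, 1, 1, @lines);
```

With one line of context on each side, the sample prints lines 4 to 6.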
--
Affijn, Ruud
"Gewoon is een tijger."
------------------------------
Date: Mon, 20 Aug 2007 21:53:29 -0000
From: jis <jismagic@gmail.com>
Subject: Mail::Sender problem
Message-Id: <1187646809.323644.233520@50g2000hsm.googlegroups.com>
Hi,
I wrote a program to send email. It looks as below
use strict;
use Mail::Sender;
my $smtp = 'IN****.A***.D*****.net';
my $subj = 'my first automated mail message ';
my $from = 'j********a@d****i.com';
my $to   = 'j*********a@d****.com';
# $to replaces the undeclared $admn, which does not compile under strict
my $sender = new Mail::Sender {smtp => $smtp, from => $from, to => $to,
                               subject => $subj};
$sender->Open({
    to      => $to,
    subject => $subj,
    auth    => 'LOGIN',
    authid  => '*****',
    authpwd => '********'});
$sender->Close();
if ($Mail::Sender::Error) {
    print "Error sending mail: $Mail::Sender::Error \n";
}
else { print "Sent ok $Mail::Sender::Error \n"; }
This works fine for me without any issues.
I converted this script to an exe using p2x-8.80-Win32. It initially gave
these warnings:
1. Config file missing.
2. Langinfo.pm missing.
3. Digest::Perl::MD5 missing
I reinstalled Mail::Sender to make sure that there is a config file. I
manually copied Langinfo.pm to the I18N folder under C:\perl\lib.
I also installed MD5 from CPAN.
After that it converted to an exe without any warnings.
But when I run the exe it throws the error
" -3Error Sending
mail: Connect failed:An established connection was aborted by
software
in your host machine". I do not have problems when I go back and run
my Perl file.
I use Windows XP, ActivePerl 5.8.8 Build 820, Mail::Sender 0.8.13.
I use p2x-8.80-Win32 to convert .pl to .exe. I did not have any
problems while converting to exe before. Of course, I had never used
Mail::Sender before.
Please shed some light.
Cheers,
jis
------------------------------
Date: 20 Aug 2007 21:00:45 GMT
From: MI5Victim@mi5.gov.uk
Subject: MI5 Persecution: Channel Four TV News - 12/Feb/1999 (5843)
Message-Id: <m07072021004114@4ax.com>
Channel Four TV News - 12/Feb/1999
Certainty level: 100%
I am positively, utterly, completely sure this item is about me. It's a bit subtle so the objective reader might not understand my certainty. Here is
what happened. I was watching Channel Four News with Jon Snow, on the day Clinton "got off" (as it were) in the Monica-gate scandal. Snow said;
"[and we're anticipating that the President himself will make his] first
comments after the trial in which he has now been cleared at around half-
past seven. So we'll have more on the historic judgment, we'll also be
considering [starts smiling] the winners and losers in this whole sorry
saga. Now further doubts have been...."
When Snow said "half-past seven", I looked at the clock on the mantelpiece above the TV. Snow saw my glance, and in reaction to my glance at the clock,
smiled. I think he was smiling at what he perceived as my self-importance.
Usually when newscasters or radio presenters laugh at me, there is an excuse for their laughter; usually they manage to find some reason for my
being "funny"; their amusement is blamed on me; it is my "fault" they are laughing. And so it was with this instance; Jon Snow thought it was funny
that I should be so interested in seeing Clinton, and his half-smile while reading the words "winners and losers" expresses that.
I wrote to Snow at ITN shortly after this broadcast to ask him about his behaviour; of course, he didn't reply.
5843
--
Posted via NewsDemon.com - Premium Uncensored Newsgroup Service
------->>>>>>http://www.NewsDemon.com<<<<<<------
Unlimited Access, Anonymous Accounts, Uncensored Broadband Access
------------------------------
Date: 20 Aug 2007 20:43:26 GMT
From: MI5Victim@mi5.gov.uk
Subject: MI5 Persecution: Dimbleby / John Major, April 1997 (4620)
Message-Id: <m07072020432256@4ax.com>
Dimbleby / John Major, April 1997
Certainty level: 90%
Dimbleby interviews John Major during the election campaign. Here is
the exchange, regarding Neil Hamilton, the "sleazy" Tory candidate;
Dimbleby: "It's a direct quotation from what he said to Sir Gordon Downey
and what he said to his local newspaper, the Knutsford Guardian".
Major: "Well, heaven forfend that I should doubt what the Knutsford
Guardian actually said"
Dimbleby: "Well I should hope so"
Major: "Absolutely, it would be quite unforgivable to doubt the
Knutsford Guardian"
What I find in this segment is an assonance between "Knutsford" and "Nut"
or "Nuts". I think Dimbleby deliberately invents the play on words,
and says it clearly enough for Major to pick it up and repeat it.
You can gauge Dimbleby's intent from his facial expression of false
honesty when he uses the words "Knutsford Guardian"; and Major's reaction
follows from his recognition of Dimbleby's intent; Major smiles and says
it would be "unforgivable to doubt the Knutsford Guardian".
These are much clearer on the original video than on this Quicktime clip.
4620
------------------------------
Date: 20 Aug 2007 20:25:47 GMT
From: MI5Victim@mi5.gov.uk
Subject: MI5 Persecution: Ken Clarke (2), April 1997 (3397)
Message-Id: <m07072020254454@4ax.com>
Ken Clarke (2), April 1997
Certainty level: 30%
The second of two video segments from the same programme, this shows
Ken Clarke during the election campaign. He makes the following statement;
"we have a party, and we have a Cabinet which has produced a manifesto
consistent with our policy of the last five years;
people are challenging us on the basis there's been some monstrous
conspiracy between all the politicians to guide us we know not where,
we know where we're going, we wish to be a leading influence in the European Union"
It should be fairly obvious what I read into in the above sentences.
The "monstrous conspiracy between all the politicians" could of course
apply to the public perception of closer European integration, but
I took it as referring to the conspiracy of politicians and
media people against me. Again, facial expression comes into it; as Clarke
speaks of the "monstrous conspiracy" his eyebrows twitch and it is clear
that something or somebody is being made fun of.
3397
------------------------------
Date: Mon, 20 Aug 2007 15:45:11 -0500
From: "J. Gleixner" <glex_no-spam@qwest-spam-no.invalid>
Subject: Re: On redhat, different users = different @INC
Message-Id: <46c9fd57$0$505$815e3792@news.qwest.net>
Paul Lalli wrote:
> On Aug 20, 3:00 pm, Russ <russell.bro...@perdue.com> wrote:
>
>> We have RedHat 4EL and perl 5.8.5. Per a user's request I
>> installed Date::Simple, using perl -MCPAN -e shell
>> as the root user.
>> Now root can find Date::Simple, but other users cannot. They
>> do not want to include a lib statement in their scripts or
>> invoke with a -I. The @INC libraries are close, but not
>> identical.
>>
>> Does anyone know how to correct or resolve this?
>> Any suggestions would be appreciated.
>
> In their .profile (or .bash_profile, or whatever), set the PERL5LIB
> variable to the path of the installed modules.
>
> export PERL5LIB=/path/to/modules/
> or
> setenv PERL5LIB /path/to/modules
> (depending on the shell in use...)
Also, are they using the same 'perl' as root is using?
root# which perl
someuser% which perl
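To compare what each account actually sees, a small script along these lines could be run once as root and once as the affected user. It is a sketch only: locate_module is a made-up helper, and it simply walks @INC looking for the module file.

```perl
use strict;
use warnings;

# Return the first path in @INC where the module file exists, or an
# empty list / undef if it is not found anywhere.
sub locate_module {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Date::Simple -> Date/Simple.pm
    for my $dir (grep { !ref $_ } @INC) {      # skip code-ref hooks in @INC
        return "$dir/$file" if -e "$dir/$file";
    }
    return;
}

print "perl binary: $^X\n";
print "\@INC:\n";
print "  $_\n" for grep { !ref $_ } @INC;

my $path = locate_module('Date::Simple');
print defined $path
    ? "Date::Simple found at $path\n"
    : "Date::Simple not found in \@INC\n";
```

If the two runs report different binaries or different @INC entries, that difference is the place to start looking.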
------------------------------
Date: Mon, 20 Aug 2007 22:06:25 +0200
From: Mirco Wahab <wahab@chemie.uni-halle.de>
Subject: Re: optimizing text file searches
Message-Id: <facsk6$13mj$1@nserver.hrz.tu-freiberg.de>
bivity wrote:
> On Aug 20, 1:36 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
>> Is this (above) *one line*, eg.:
>> "885_Addm Un Lse 0867.pdf","885","ELM 111 N BOBBY AVE","Addm Un Lse 0867","Addm Un Lse 0867.pdf","Elmhurst","651","885","885_Addm Un Lse 0867"
>> delimited by '\n' from the following line?
>
> Each line in both files only contains one instance and yes it's
> delimited by \n, Solaris - SunOs 5.8 environment, Perl version 5.8, I
> can't use any extended libraries as well.
OK, I created a 500_000 line "full input file" according to your "long line"
template (with 2 random numbers [0-9999] in each record) and made a 5_000 line
search file according to the same rule. This ran on my old Athlon/2500+ (non-X64)
in (user) 1min 52sec (and surprisingly led to ~30 random hits).
I used the "good old long regex" built from or-ed alternatives (the "search terms")
and didn't consider printing non-matches:
---
use strict;
use warnings;
my $full_fn = shift;
my $srch_fn = shift;
my $pull_fn = 'lines_that_pulled.txt';
my $err_fn  = 'did_not_find.txt';
open my $fh, '<', $srch_fn or die $!; # file with entries to use in search
my $ts ='^"('; # start big regex here
while( <$fh> ) {
    if(length) {
        tr/)($\n/..../;
        $ts .= "(?:$_)|"
    }
}
close $fh;
chop $ts;
my $text_q = qr/$ts)/; # build regex
print "Size of content: (unknown) \n";
print "Searching for: (unknown) instances\n";
my @final;
open $fh, '<', $full_fn or die $!; # the 400+K lines
while( <$fh> ) {
    push @final, (split /,/,$_)[0]
        if /$text_q/
}
close $fh;
open $fh, '>', $pull_fn or die $!;
print $fh join "\n", @final;
close $fh;
---
(FWIW)
I don't really have an idea to make this faster,
maybe someone has *the big flash* ;-)
Regards
M.
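The script above neutralises ")", "(" and "$" by turning them into "." before building the alternation. An alternative, shown here only as a sketch, is to escape each search term with quotemeta (the function form of \Q) so that every term matches itself literally. build_matcher and the sample terms are illustrative names, not part of the posted script.

```perl
use strict;
use warnings;

# Build one anchored regex from literal search terms. quotemeta escapes
# every regex metacharacter, so "." and "(" in a file name stay literal.
sub build_matcher {
    my @terms = @_;
    my $alt = join '|', map { quotemeta($_) } @terms;
    return qr/^"(?:$alt)"/;
}

my $re = build_matcher('885_Addm Un Lse 0867.pdf', 'a(b)$.pdf');

for my $line ('"885_Addm Un Lse 0867.pdf","885"',
              '"885_Addm Un Lse 0867xpdf","885"') {
    print $line =~ $re ? "hit:  $line\n" : "miss: $line\n";
}
```

The second sample line is a miss because the escaped "." no longer matches an arbitrary character.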
------------------------------
Date: Mon, 20 Aug 2007 22:09:57 +0200
From: Mirco Wahab <wahab@chemie.uni-halle.de>
Subject: Re: optimizing text file searches
Message-Id: <facsqr$13mj$2@nserver.hrz.tu-freiberg.de>
Mirco Wahab wrote:
> OK, I created a 500_000 line "full input file" according to your "long line"
> template (with 2 random numbers [0-9999] in each record) and made a
> 5_000 line search file according to the same rule.
FWIW, this is the script used for these files:
use strict;
use warnings;
open my $fh, '>', 'bigfile.txt' or die $!;
for(1..500_000) {
    my $num1 = sprintf "%04d", int(rand(10000));
    my $num2 = sprintf "%d", int(rand(10000));
    print $fh
        qq{"${num2}_Addm Un Lse ${num1}.pdf","${num2}","ELM 111 N BOBBY AVE","Addm Un Lse ${num1}","Addm Un Lse ${num1}.pdf","Elmhurst","651","${num2}","${num2}_Addm Un Lse ${num1}"},"\n"
}
close $fh;
open $fh, '>', 'searchfile.txt' or die $!;
for(1..5_000) {
    my $num1 = sprintf "%04d", int(rand(10000));
    my $num2 = sprintf "%d", int(rand(10000));
    print $fh
        qq{${num2}_Addm Un Lse ${num1}.pdf}, "\n"
}
close $fh;
Regards
M.
------------------------------
Date: Mon, 20 Aug 2007 20:51:11 -0000
From: bivity <mbuttner@gmail.com>
Subject: Re: optimizing text file searches
Message-Id: <1187643071.441843.164800@e9g2000prf.googlegroups.com>
On Aug 20, 2:34 pm, Brian Helterline <brian.helterl...@hp.com> wrote:
> bivity wrote:
> > My work requires a lot of index lookups for large amounts of files
> > daily. Recently, we have been receiving files with one document per
> > line along with all its attributes. This file will have around 400,000
> > entries. I then receive another file, with just the file name, and I
> > am told, look for each one of these files in this 400,000 entry list.
> > There are about 5000 in the file.
>
> > I just wrote a quick script to meet my needs, where I read in both
> > files, and grep (search_item, content_file). It works pretty well.
> > Except, it takes about 25 minutes for 5000 entries. I can't use a hash
> > implementation here, is there a way I can make this search faster?
>
> why not a hash? It is the right tool for the job.
>
>
You're right, I took another quick look at it, and I was able to
create a hash out of the content file. Thanks for reminding me of the \Q
literal strings; that always escapes my memory.
Well here is what I got in the end, night and day results.
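#!/usr/local/bin/perl
use strict;
use warnings;
my (%vals, @final);
# index the content file by its first quoted field
open(FILE, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
my @text_rep = <FILE>;
close FILE;
foreach my $t (@text_rep) {
    $vals{$1} = $t if $t =~ /^"(.*?)",/;
}
open(FILE, '<', $ARGV[1]) or die "Cannot open $ARGV[1]: $!";
my @text_search = <FILE>;
close FILE;
open(OUT, '>', 'did_not_find_beta.txt') or die $!;
foreach my $query (@text_search) {
    chomp($query);
    # was: my $tmp=$vals{$query} ne '' ("ne" binds tighter than "=",
    # so $tmp held a boolean and $vals{$tmp} never found the record)
    if (exists $vals{$query}) {
        push(@final, $vals{$query});
    }
    else {
        print OUT $query."\n";
    }
}
close OUT;
open(OUT, '>', 'lines_that_pulled_beta.txt') or die $!;
print OUT @final;
close OUT;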
Thanks for all the help.
------------------------------
Date: Mon, 20 Aug 2007 13:52:41 -0700
From: Jim Gibson <jgibson@mail.arc.nasa.gov>
Subject: Re: optimizing text file searches
Message-Id: <200820071352410799%jgibson@mail.arc.nasa.gov>
In article <1187633735.495944.223950@i13g2000prf.googlegroups.com>,
bivity <mbuttner@gmail.com> wrote:
> My work requires a lot of index lookups for large amounts of files
> daily. Recently, we have been receiving files with one document per
> line along with all its attributes. This file will have around 400,000
> entries. I then receive another file, with just the file name, and I
> am told, look for each one of these files in this 400,000 entry list.
> There are about 5000 in the file.
>
> I just wrote a quick script to meet my needs, where I read in both
> files, and grep (search_item, content_file). It works pretty well.
> Except, it takes about 25 minutes for 5000 entries. I can't use a hash
> implementation here, is there a way I can make this search faster?
A hash implementation would be faster. Why can you not use a hash?
>
> Below is a sample search term and what the index line looks like,
> along with the entire script. I know its rough, but i wrote it in a
> hurry and I would like to refine it now, make it faster.
>
> Search File Term:
> 885_Addm Un Lse 0867.pdf
>
> Large File to be searched, its matching index:
> "885_Addm Un Lse 0867.pdf","885","ELM 111 N BOBBY AVE","Addm Un Lse
> 0867","Addm Un Lse 0867.pdf","Elmhurst","651","885","885_Addm Un Lse
> 0867"
>
> script:
>
[snipped]
Suggestions:
1. If all of your data looks like this example, then you are looking
for a fixed string in the first field of each record. Extract only the
first field into your @text_rep array and use the eq operator to
compare with your search strings. Check out the Text::CSV module.
2. Do not use grep. Instead, loop over your text_rep array and stop
when you get a match. Consider using the first() function from the
List::Util module (perldoc -q first).
3. For faster searching, sort your text_rep array and do a binary
search. See, for example, Search::Binary (I have not used it).
4. For even faster searching, use a hash.
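Suggestions 1, 2 and 4 can be sketched side by side as follows. This is an illustration under assumptions, not the poster's script: first_field, find_linear and build_index are made-up names, the records are shortened samples, and a plain regex stands in for Text::CSV.

```perl
use strict;
use warnings;
use List::Util qw(first);

# Shortened sample records in the format shown in the thread.
my @records = (
    '"885_Addm Un Lse 0867.pdf","885","ELM 111 N BOBBY AVE"',
    '"12_Addm Un Lse 0001.pdf","12","ELM 111 N BOBBY AVE"',
);

# Suggestion 1: extract only the first quoted field of a record.
sub first_field {
    my ($rec) = @_;
    return $rec =~ /^"([^"]*)"/ ? $1 : undef;
}

# Suggestion 2: linear scan with "eq" that stops at the first match.
sub find_linear {
    my ($term, $recs) = @_;
    return first { my $f = first_field($_); defined $f && $f eq $term } @$recs;
}

# Suggestion 4: index the records once, then each lookup is one hash access.
sub build_index {
    my ($recs) = @_;
    my %index;
    for my $rec (@$recs) {
        my $f = first_field($rec);
        $index{$f} = $rec if defined $f;
    }
    return \%index;
}

my $index = build_index(\@records);
for my $term ('885_Addm Un Lse 0867.pdf', 'missing.pdf') {
    my $linear = find_linear($term, \@records);
    my $hashed = $index->{$term};
    printf "%-26s linear=%s hash=%s\n", $term,
        defined $linear ? 'hit' : 'miss',
        defined $hashed ? 'hit' : 'miss';
}
```

Both approaches return the same record; the difference is that the linear scan repeats its work for each of the 5000 search terms, while the hash is built once.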
--
Jim Gibson
Posted Via Usenet.com Premium Usenet Newsgroup Services
----------------------------------------------------------
** SPEED ** RETENTION ** COMPLETION ** ANONYMITY **
----------------------------------------------------------
http://www.usenet.com
------------------------------
Date: Mon, 20 Aug 2007 20:59:09 -0000
From: bivity <mbuttner@gmail.com>
Subject: Re: optimizing text file searches
Message-Id: <1187643549.871741.105870@l22g2000prc.googlegroups.com>
On Aug 20, 3:52 pm, Jim Gibson <jgib...@mail.arc.nasa.gov> wrote:
> [snipped]
>
> Suggestions:
>
> 1. If all of your data looks like this example, then you are looking
> for a fixed string in the first field of each record. Extract only the
> first field into your @text_rep array and use the eq operator to
> compare with your search strings. Check out the Text::CSV module.
>
> 2. Do not use grep. Instead, loop over your text_rep array and stop
> when you get a match. Consider using the first() function from the
> List::Util module (perldoc -q first).
>
> 3. For faster searching, sort your text_rep array and do a binary
> search. See, for example, Search::Binary (I have not used it).
>
> 4. For even faster searching, use a hash.
2. I tried the first() function last week, but my company doesn't have
many modules installed.
4. I went with this; my 25-minute process went down to a mere 7
seconds... Got to love it.
------------------------------
Date: Mon, 20 Aug 2007 23:10:18 +0200
From: Mirco Wahab <wahab@chemie.uni-halle.de>
Subject: Re: optimizing text file searches
Message-Id: <fad0lc$14p3$1@nserver.hrz.tu-freiberg.de>
bivity wrote:
> You're right, I took another quick look at it, and I was able to
> create a hash out of the content file. Thanks for reminding of the \Q
> literal strings, always escapes my memory.
From what I understood, you tried deliberately to
convert ')($' to dot '.', meaning "any char"
in a regular expression.
> Well here is what I got in the end, night and day results.
Nice!
> [script snipped]
You don't need these arrays at all. By
reducing the complexity, you might cut the
run time by another 50%. The hash lookup
in particular can be optimized:
use strict;
use warnings;
my $full_fn = shift;
my $srch_fn = shift;
my $pull_fn = 'lines_that_pulled_beta.txt';
my $err_fn  = 'did_not_find.txt';
my (%vals, @final);
open my $fh, '<', $full_fn or die $!;
while ( <$fh> ) {
    chomp; # no need to use a temp array
    $vals{$1} = $_ if /"([^"]+)/
}
close $fh;
open my $fh_err, '>', $err_fn or die $!;
open $fh, '<', $srch_fn or die $!;
while( <$fh> ) {
    chomp; # now do a hash lookup for existence of the key term
    exists $vals{$_} ? push @final, $vals{$_} : print $fh_err "$_\n"
}
close $fh, close $fh_err;
open $fh, '>', $pull_fn or die $!;
print $fh join "\n", @final;
close $fh;
But why did you, in your original post (or one later),
stress that you *can't* use a hash? Did you assume it
would be too large?
Regards
M.
------------------------------
Date: Mon, 20 Aug 2007 21:42:40 -0000
From: bivity <mbuttner@gmail.com>
Subject: Re: optimizing text file searches
Message-Id: <1187646160.760072.163320@i38g2000prf.googlegroups.com>
On Aug 20, 4:10 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
> [snipped]
>
> But why did you, in your original post (or one later),
> stress that you *can't* use a hash? Did you assume it
> would be too large?
That's great stuff, thanks!
Originally, the index file was not going to match the file name:
the file's "Title Index" was being provided, e.g. "Document.pdf".
However, the Title Index was not found in the large file, and we had
to do a lookup based on the storage location, e.g.
"21392_2394993Document.pdf".
It got hairy quickly. Now that I think about it, a hash could still have
been applied; I just needed to clean up the storage location with a
regex. But that was why at first I just wanted to make sure it worked
before getting into a hash implementation. (It can be tricky to
explain to non-coders what the script is doing on the fly when your
superiors want to see the code and a demo, but that is a different
story.)
I can't imagine life without Perl :P
------------------------------
Date: Mon, 20 Aug 2007 20:11:41 GMT
From: "Jürgen Exner" <jurgenex@hotmail.com>
Subject: Re: Perl sum of array and help with sorting
Message-Id: <1Amyi.1912$wr3.806@trndny04>
elroyerni wrote:
> I have a array of a list of numbers:
>
> 7.9216
> 8.7583
> 12.675
> 0.8028
> 6.9230
> 1.1403
> 6.0083
> 0.1454
>
> I wrote a sub-routine to add the list, but i'm getting these syntax
> errors, and was wondering if someone could tell me what i'm doing
> wrong. here's my code for the sub-routine:
> sub sum_array {
> my($sum) = 0; # initialize the sum to 0
> foreach $i (@array_data) {
> $sum = $sum + $i;
> }
> return($sum);
> }
>
> It's returning this error: isn't numeric in addition (+)
I cannot repro this behaviour. Using your code and sample data I am getting
a clear 44.3747 as return value. Can you please post a _COMPLETE_ script
that demonstrates this message, such that we can copy and run the script?
On a side note: a more perlish way would be
foreach (@array_data) {
    $sum += $_;
}
An even simpler way is to use sum() from List::Util
return sum(@array_data);
> Also I'm trying to write a sub-routine that will go through each
> element in the array and tell me how many elements in the array are
> less than a given value.
That is a typical application for grep().
> For example in the array above say i want to
> return the amount of elements that are less than 5 seconds. From the
> array above I'd return 3.
sub less_five {
return scalar grep ($_ < 5, @array_data);
}
> Here's what I have so far:
> sub less_five {
> foreach $r(@array_data){
> count=0;
You are resetting your counter to 0 for each element of the array.
> while ($count<5)
> {
> $count++;
Why do you loop to increment the counter until the value of the counter
reaches 5? That doesn't make any sense to me.
Maybe you meant
if ($r < 5) {$count++}
instead?
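Put together as one complete script, the two suggestions might look like the sketch below. The data is the list from the question; sum_array is reworked to take its values as arguments, and count_below is a made-up name for a version of less_five that accepts any limit.

```perl
use strict;
use warnings;
use List::Util qw(sum);

my @array_data = (7.9216, 8.7583, 12.675, 0.8028,
                  6.9230, 1.1403, 6.0083, 0.1454);

# Total of all elements. sum() returns undef for an empty list,
# hence the "|| 0" fallback.
sub sum_array {
    return sum(@_) || 0;
}

# Number of elements strictly below $limit; grep in scalar context
# returns the count of matching elements.
sub count_below {
    my ($limit, @values) = @_;
    return scalar grep { $_ < $limit } @values;
}

printf "sum     = %.4f\n", sum_array(@array_data);
printf "below 5 = %d\n",   count_below(5, @array_data);
```

For the sample data this reports a sum of 44.3747 and three elements below 5.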
jue
------------------------------
Date: Mon, 20 Aug 2007 20:59:32 -0000
From: jkstill <jkstill@gmail.com>
Subject: Perldoc and the pipe "|" character
Message-Id: <1187643572.688539.106540@l22g2000prc.googlegroups.com>
While cutting and pasting an example from a script viewed via perldoc,
the example would not work.
The problem was that perldoc was displaying the pipe chr(124) and a
similar character, chr(226).
I have reviewed the docs on perldoc, but have been unable to get any
method to work so that the pipe symbol is properly displayed.
This is on RH Linux ES 4, 2.6 kernel.
Perl is 5.8.8
Here is a snippet from the displayed documentation:
ps -e -o user --no-headers │ sort -u│
As you can see those are not pipes - |
What might be the reason for this, and a solution to get the | to
display properly?
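The character displayed is U+2502 (BOX DRAWINGS LIGHT VERTICAL), whose UTF-8 encoding is the byte sequence E2 94 82, rather than chr(124). One likely cause, offered as a guess rather than a diagnosis, is the groff formatter rendering the bar as a box-drawing glyph under a UTF-8 locale; viewing the page with perldoc -t (plain-text output) may sidestep it. For text that has already been pasted, a small filter can map the character back. fix_pipes is a made-up helper name.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Translate the box-drawing vertical bar (U+2502) back to an ASCII pipe.
sub fix_pipes {
    my ($s) = @_;
    $s =~ tr/\x{2502}/|/;
    return $s;
}

# The pasted bytes E2 94 82 decode to the single character U+2502.
my $pasted = "ps -e -o user --no-headers \xE2\x94\x82 sort -u";
my $text   = decode('UTF-8', $pasted);

print fix_pipes($text), "\n";
```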
------------------------------
Date: Mon, 20 Aug 2007 22:56:43 +0200
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: set enviorment varibale
Message-Id: <5iudglF3mc74aU1@mid.individual.net>
chinmoy.chittaranjan@gmail.com wrote:
> Many thanks, Gunnar, it is working fine. But if I want to set this
> environment variable to nothing (i.e. undo "set build=1"), what
> procedure do I have to follow?
1. Enter "perldoc perlfunc".
2. Scroll downwards til you find a section with "Functions for real
%HASHes".
3. The function "delete" sounds promising, right? Enter
"perldoc -f delete" to get more info about it.
4. Yep, that seemed to be it. :)
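In short, the steps above lead to something like this minimal sketch. The variable name build comes from the question; everything else is illustrative.

```perl
use strict;
use warnings;

$ENV{build} = 1;                       # what "set build=1" amounted to
print "build is set to $ENV{build}\n" if exists $ENV{build};

# delete removes the key from %ENV, so the variable is unset for this
# process and for any child process started afterwards.
delete $ENV{build};
print "build is no longer set\n" unless exists $ENV{build};
```

Note that this only affects the running Perl process and its children, not the shell that started it.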
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
------------------------------
Date: Mon, 20 Aug 2007 14:27:36 -0700
From: "steve.logan@gmail.com" <steve.logan@gmail.com>
Subject: simple perl regex question
Message-Id: <1187645256.692881.298210@57g2000hsv.googlegroups.com>
disclaimer - perl and shell scripting is not my area of expertise, but
I've been asked to help in updating a ton of DNS zone files. I've got
my scripts down to one last issue - updating the serial numbers in the
files.
Basically, the serials are all 10-digit numbers followed by a ;
I'm trying to write a regex to find 10 digits ([0-9]{10}), followed by
a semi-colon ([;]) and then just replace it with 2007083100;
Here's where I'm at - it runs ok, but the file isn't being updated:
perl -pi -e 's/\/([0-9]{10})([;])\//2007083100;/g' domain.com.db
I think I'm close here - maybe?
------------------------------
Date: Mon, 20 Aug 2007 23:23:08 +0200
From: Mirco Wahab <wahab@chemie.uni-halle.de>
Subject: Re: simple perl regex question
Message-Id: <fad1de$14u1$1@nserver.hrz.tu-freiberg.de>
steve.logan@gmail.com wrote:
> disclaimer - perl and shell scripting is not my area of expertise, but
> I've been asked to help in updating a ton of DNS zone files. I've got
> my scripts down to one last issue - updating the serial numbers in the
> files.
>
> Basically, the serials are all 10-digit numbers followed by a ;
>
> I'm trying to write a regex to find 10 digits ([0-9]{10}), followed by
> a semi-colon ([;]) and then just replace it with 2007083100;
>
> Here's where I'm at - it runs ok, but the file isn't being updated:
>
>
> perl -pi -e 's/\/([0-9]{10})([;])\//2007083100;/g' domain.com.db
>
> I think I'm close here - maybe?
Yes, by a hair's width (IMHO):
$> perl -i -pe 's/(?<!\d)\d{10}(?=\s+;)/2007083100/g' domain.com.db
Regards
M.
------------------------------
Date: Mon, 20 Aug 2007 23:25:43 +0200
From: Mirco Wahab <wahab@chemie.uni-halle.de>
Subject: Re: simple perl regex question
Message-Id: <fad1i9$14u1$3@nserver.hrz.tu-freiberg.de>
steve.logan@gmail.com wrote:
> Here's where I'm at - it runs ok, but the file isn't being updated:
> perl -pi -e 's/\/([0-9]{10})([;])\//2007083100;/g' domain.com.db
> I think I'm close here - maybe?
Yes, by a hair's width (IMHO). But I've seen zone files with a \s in front
of the ';' ...
$> perl -i -pe 's/(?<!\d)\d{10}(?=\s*;)/2007083100/g' domain.com.db
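The original attempt fails because the escaped "\/" at each end of the pattern demands literal slashes around the serial, which a zone file does not have. The corrected substitution can be tried on a few lines before touching real files. This is a sketch: bump_serial and the sample zone-file lines are made up for illustration.

```perl
use strict;
use warnings;

# Replace a run of exactly 10 digits that is followed by ';', with
# optional whitespace in between. The lookbehind stops the pattern
# from matching the tail of a longer number.
sub bump_serial {
    my ($line, $serial) = @_;
    $line =~ s/(?<!\d)\d{10}(?=\s*;)/$serial/g;
    return $line;
}

# Hypothetical zone-file lines.
for my $line ("\t2007081501; serial\n",
              "\t2007081501 ; serial\n",
              "\t86400 ; refresh\n") {
    print bump_serial($line, '2007083100');
}
```

Only the two serial lines change; the refresh value is left alone because it is not 10 digits long.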
Regards
M.
------------------------------
Date: 20 Aug 2007 20:31:13 GMT
From: anno4000@radom.zrz.tu-berlin.de
Subject: Re: Symbolic representation of logical operators
Message-Id: <5iuc0hF3rmrv3U1@mid.dfncis.de>
Randal L. Schwartz <merlyn@stonehenge.com> wrote in comp.lang.perl.misc:
> >>>>> "Ruud" == Ruud <rvtol+news@isolution.nl> writes:
>
> Ruud> I guess you could use:
>
> Ruud> !!$v1 ^ !!$v2
>
> Or more simply:
>
> $v1 ? !$v2 : $v2
>
> This will have the proper truth-iness, although there's no easy way to
> preserve "last expression evaluated" as with the other short-circuit
> operators.
Well, you can't expect too much. If the result of the xor is true,
that's fine: there's exactly one true argument, which should be
returned. If the result is false, if we're lucky both arguments
are false, in which case either one can be returned, though there
is no canonical way to choose. If both arguments are true, neither
of them can be used, so an "artificial" false has to be returned.
!$v1 ? $v2 :
!$v2 ? $v1 :
!1;
That should be as close as it gets. It arbitrarily returns $v2 in
the false-false case. The final "!1" might as well have been "0",
but this way it returns the Perl-specific Janus-faced false.
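Wrapped in a sub, the expression can be checked against all four cases. xor_value is just an illustrative name for this sketch.

```perl
use strict;
use warnings;

# Logical xor that returns one of its operands where that is possible:
#   false, X     -> X   (the true operand, or the second false one)
#   true,  false -> the first operand
#   true,  true  -> Perl's own false value (!1)
sub xor_value {
    my ($v1, $v2) = @_;
    return !$v1 ? $v2
         : !$v2 ? $v1
         :        !1;
}

for my $pair ([0, 'b'], ['a', 0], [0, ''], ['a', 'b']) {
    my $result = xor_value(@$pair);
    printf "'%s' xor '%s' -> '%s'\n", $pair->[0], $pair->[1], $result;
}
```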
Anno
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 777
**************************************