[28194] in Perl-Users-Digest
Perl-Users Digest, Issue: 9558 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Aug 3 21:10:16 2006
Date: Thu, 3 Aug 2006 18:10:10 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Thu, 3 Aug 2006 Volume: 10 Number: 9558
Today's topics:
Linux find and grep to pure Perl fourfour2@gmail.com
Re: Linux find and grep to pure Perl <1usa@llenroc.ude.invalid>
Prototyping Subs as func expr, list As In map? <vtatila@mail.student.oulu.fi>
Re: Recursion xhoster@gmail.com
Regex...HTML::Parser...Getting webpage data? <wbresson@gmail.com>
Re: Regex...HTML::Parser...Getting webpage data? <mritty@gmail.com>
Re: Regex...HTML::Parser...Getting webpage data? <wbresson@gmail.com>
Re: Regex...HTML::Parser...Getting webpage data? xhoster@gmail.com
Re: Regex...HTML::Parser...Getting webpage data? <wbresson@gmail.com>
Re: Regex...HTML::Parser...Getting webpage data? xhoster@gmail.com
Scanning Multiple Log files for patterns continously tambekp@gmail.com
Re: Scanning Multiple Log files for patterns continousl <1usa@llenroc.ude.invalid>
Re: Scanning Multiple Log files for patterns continousl tambekp@gmail.com
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: 3 Aug 2006 16:58:16 -0700
From: fourfour2@gmail.com
Subject: Linux find and grep to pure Perl
Message-Id: <1154649496.292186.248170@s13g2000cwa.googlegroups.com>
I have been looking at find2perl...
I want this linux command converted to perl code, I guess using Perl
grep and Perl find, that will run on any OS:
find . -type f -exec egrep -i 'xbox|ps2' {} \; -print
I don't want any OS system calls - just pure clean Perl.
Any help appreciated for the best way to do this ...
------------------------------
Date: Fri, 04 Aug 2006 00:07:50 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude.invalid>
Subject: Re: Linux find and grep to pure Perl
Message-Id: <Xns9814CCE07FCB3asu1cornelledu@127.0.0.1>
fourfour2@gmail.com wrote in news:1154649496.292186.248170
@s13g2000cwa.googlegroups.com:
> I have been looking at find2perl...
>
> I want this linux command converted to perl code, I guess using Perl
> grep and Perl find, that will run on any OS:
>
> find . -type f -exec egrep -i 'xbox|ps2' {} \; -print
>
> I don't want any OS system calls - just pure clean Perl.
>
> Any help appreciated for the best way to do this ...
First, make an attempt. Second, post your attempt here in accordance with
the guidelines. Third, participate in discussions about your code.
See
perldoc File::Find
Sinan
--
A. Sinan Unur <1usa@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)
comp.lang.perl.misc guidelines on the WWW:
http://augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
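
For readers following the File::Find pointer, here is a minimal sketch of how the find/egrep pipeline might be expressed with core modules only. The helper name grep_tree and its return shape are assumptions made for illustration, not anything posted in the thread, and binary files are not treated specially.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: walk $dir and collect, for every plain file,
# the lines that match $re. Returns a hash ref: path => [ matching lines ].
sub grep_tree {
    my ($dir, $re) = @_;
    my %hits;
    find(
        {
            no_chdir => 1,    # $_ is then the same as $File::Find::name
            wanted   => sub {
                my $path = $_;
                return unless -f $path;               # like "-type f"
                open my $fh, '<', $path or return;    # skip unreadable files
                while (my $line = <$fh>) {
                    push @{ $hits{$path} }, $line if $line =~ $re;
                }
                close $fh;
            },
        },
        $dir,
    );
    return \%hits;
}

# Usage: perl grep_tree.pl DIR
# Prints the matching lines, then the file name, roughly like the find command.
if (@ARGV) {
    my $hits = grep_tree($ARGV[0], qr/xbox|ps2/i);
    for my $path (sort keys %$hits) {
        print @{ $hits->{$path} };
        print "$path\n";
    }
}
```

The /i flag stands in for egrep's -i, and the alternation xbox|ps2 carries over unchanged.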
------------------------------
Date: Thu, 3 Aug 2006 22:15:42 +0300
From: "Veli-Pekka Tätilä" <vtatila@mail.student.oulu.fi>
Subject: Prototyping Subs as func expr, list As In map?
Message-Id: <eati1a$8j5$1@news.oulu.fi>
Hi,
I think I've noticed a discrepancy between user-defined and built-in functions
taking code refs. Whereas, say, List::Util::reduce (prototyped &@) lets one
pass a code ref, map and grep also let one pass an expression as the first
argument. So both
map { lc } @list;
and
map lc, @list;
do the same thing. Conceptually, the expression is passed as a whole and
evaluated lazily later on as though it were a subroutine, I suppose.
But in this snippet dealing with a user function, trying to use an
expression terminated by a comma throws a syntax error:
reduce { $a * $b } @list;
reduce $a * $b, @list; # Won't compile.
That is:
Type of arg 1 to List::Util::reduce must be block or sub {} (not
multiplication (*))
Apparently the built-ins don't parse quite the same way as the
user-functions prototyped with the & character. This notion is backed up by
the prototype function. Asking it for the map prototype just hands me undef
unless I've typoed:
print prototype 'CORE::map';
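
A minimal sketch of the difference being described, using a made-up function apply_to_all (an assumption for illustration, not part of List::Util) that carries the same &@ prototype as reduce:

```perl
use strict;
use warnings;

# Hypothetical user function prototyped like List::Util::reduce.
sub apply_to_all(&@) {
    my ($code, @list) = @_;
    return map { $code->($_) } @list;
}

# The block form compiles; an "EXPR," form would be rejected at compile time.
my @lower = apply_to_all { lc $_[0] } qw(FOO Bar);

my $user_proto = prototype \&apply_to_all;    # the declared prototype string
my $map_proto  = prototype 'CORE::map';       # undefined for map

print "apply_to_all: $user_proto\n";
print "CORE::map:    ", (defined $map_proto ? $map_proto : 'undef'), "\n";
```

The undefined result for CORE::map is consistent with the observation above: map's calling syntax is handled specially by the parser and is not expressible as a prototype string.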
Not being able to prototype reduce and my own list functions as
func EXPR, LIST
is only slightly annoying. However, I'd like to ask why this difference
exists. That is, why not interpret a first argument declared & the way map
and grep do when it isn't a code ref? From the user function's point of view
it could be indistinguishable from a normal code ref.
The only downside I can see is that one could then no longer pass something
totally different from a code ref as the first argument of a user sub. I'm not
sure how common that is, though. Maybe Perl could make a heuristic guess as
to whether you meant an expression to be evaluated as a code ref (operands and
operators) or wanted to pass around a simple variable in a user function. A
safer way would be reserving a different prototype character for the
map-like behavior, I suppose. These are just some vague suggestions that
occurred to me, as I don't know all that much about Perl parsing.
Speaking of functions in which the next comma in the list has "special
significance", unary list operators come to mind:
print lc 'FOO', 'bar';
lc's argument list is ended by the first comma in print's arguments, somewhat
like how the first comma in map separates its expression and list parts.
I suppose working exactly like map or grep might cause additional problems
with some other Perl constructs I have not thought of. Nevertheless, it
would be the expected behavior for me, or maybe I've got unusual
expectations, <grin>. At any rate, I've got a feeling I've overlooked
something essential which would prevent the func EXPR, LIST construct from
working well for user functions. I wonder what that might be, or have I
answered my own question already? That is: too much ambiguity, the
possibility of breaking old code, and serious limitations on the polymorphism
of subroutine arguments. And all this for a mere minor inconvenience that's
easily fixed by using braces.
--
With kind regards Veli-Pekka Tätilä (vtatila@mail.student.oulu.fi)
Accessibility, game music, synthesizers and programming:
http://www.student.oulu.fi/~vtatila/
------------------------------
Date: 03 Aug 2006 18:00:06 GMT
From: xhoster@gmail.com
Subject: Re: Recursion
Message-Id: <20060803140819.424$dp@newsreader.com>
"kokolo" <koko_loko_0@yahoo.co.uk> wrote:
> <xhoster@gmail.com> wrote in message
> news:20060802124247.030$Bg@newsreader.com...
> >
> > Well, you have hit one of those famous tradeoffs. There is no doubt
> > that your partitioning method is simpler, less error prone, less subtle,
> > etc. than one of the traditional in-place pivot methods. But it is
> > also slower due to all the allocation and copying going on.
> >
> > Xho
>
> I tried referencing like this:
> ........
> my $l_ref=\@smaller_numbers;
> my $r_ref=\@larger_numbers;
>
> if ($#smaller_numbers > 0){@smaller_numbers = &qs(@$l_ref)}
> if ($#larger_numbers > 0) {@larger_numbers = &qs(@$r_ref)}
This doesn't do anything at all. You are still doing just as much
allocating and copying and passing as you were before.
> Was that referencing ok?
Well, there was nothing wrong with it per se, but it doesn't accomplish
anything. You would need to change the qs to take a reference, rather than
an array, and then pass the reference ($l_ref, not @$l_ref) into the sub
call. But that would only solve one quarter of the problem. You would
still be allocating @smaller_numbers and @larger_numbers and copying
everything into them. And you would still be taking the results of the
recursive call and copying them back into a big array to be returned, and
then copying it again in the return itself.
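
For contrast, a minimal sketch of an in-place variant that takes an array reference and allocates no sub-arrays. It assumes numeric data and uses the Lomuto partition scheme with the last element as pivot; that choice is an assumption for illustration, not the poster's code. The name qs_inplace is hypothetical.

```perl
use strict;
use warnings;

# Sort the array behind $aref in place between indices $lo and $hi.
sub qs_inplace {
    my ($aref, $lo, $hi) = @_;
    $lo = 0       unless defined $lo;
    $hi = $#$aref unless defined $hi;
    return if $lo >= $hi;

    # Partition: everything smaller than the pivot ends up left of index $i.
    my $pivot = $aref->[$hi];
    my $i     = $lo;
    for my $j ($lo .. $hi - 1) {
        if ($aref->[$j] < $pivot) {
            @$aref[$i, $j] = @$aref[$j, $i];    # swap via array slice
            $i++;
        }
    }
    @$aref[$i, $hi] = @$aref[$hi, $i];          # move pivot into place

    qs_inplace($aref, $lo, $i - 1);
    qs_inplace($aref, $i + 1, $hi);
    return;
}

my @numbers = (5, 3, 8, 1, 9, 2, 7);
qs_inplace(\@numbers);
print "@numbers\n";
```

With a last-element pivot, already-sorted input degrades to N**2 and recurses deeply, so a real implementation would pick the pivot more carefully.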
That said, your program does seem to be N**2, rather than NlogN, and I see
no reason that it should be. Even with all the badness built in, it should
still be NlogN, or maybe N(logN)**2, not N**2. I don't quite get it.
> I wonder how good QuickSort can be in Perl so it will show me how bad or
> good my algorithm is.
It can be at least 100 times faster. ("use sort '_quicksort';" gives me
a built-in sort that takes <1 second to sort 305780 numbers, versus 140
seconds for your sort). OK, so that may not qualify as being "in Perl",
because I'm sure much of it is in C, but it is available from Perl and still
uses the Perl variable access methods and such.
Xho
--
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB
------------------------------
Date: 3 Aug 2006 11:31:19 -0700
From: "Wesley Bresson" <wbresson@gmail.com>
Subject: Regex...HTML::Parser...Getting webpage data?
Message-Id: <1154629879.359487.155630@m73g2000cwd.googlegroups.com>
I'm pretty new to Perl. My past experience has been in modifying other
people's code to do what I want, but now I'm trying to write my own to do
a specific task that I can't find code for, and I am having issues. I am
trying to retrieve data from a webpage, say
http://www.apmex.com/shop/buy/Silver_American_Eagles.asp?orderid=0 for
example, the price of a 2006 1oz Silver American Eagle in the 20-99 price
break quantity. Should I use a regex to do that, or would I be better off
with HTML::Parser? I've attempted a regex since I seem to understand it
better, but haven't had much success in getting it to pull the right price.
I understand HTML::Parser even less than regexes, but I've read that it's a
more reliable way of pulling webpage data. I can't seem to find
easy-to-understand documentation on it, though, so I'm even farther from
getting it to work than the regex. Any advice?
------------------------------
Date: 3 Aug 2006 11:35:24 -0700
From: "Paul Lalli" <mritty@gmail.com>
Subject: Re: Regex...HTML::Parser...Getting webpage data?
Message-Id: <1154630124.806235.227130@p79g2000cwp.googlegroups.com>
Wesley Bresson wrote:
> I'm pretty new to Perl, my past experience has been in modifying other
> peoples code in order to do what I want it to do but now I'm trying to
> write my own to do a specific task that I can't find code for and am having
> issues. I am trying to retrieve data from a webpage, say
> http://www.apmex.com/shop/buy/Silver_American_Eagles.asp?orderid=0 for
> example, the price of a 2006 1oz Silver American Eagle in the 20-99
> price break quantity. Should I use Regex to do that
No. Regular Expressions are notoriously unable to parse "real" HTML.
> or would I be better off with HTML::Parser ?
Well, you'd be better off than with regular expressions...
> I've attempted Regex since I seem to understand it better
> but haven't had much success in getting it to pull the right price.
> HTML::Parser I understand even less than Regex
I agree. I don't like HTML::Parser's interface at all. I suggest you
give HTML::TokeParser a shot, though. After a few tries, I'm generally
able to get it to do what I want. I find the interface much more
understandable than HTML::Parser's.
Good luck,
Paul Lalli
------------------------------
Date: 3 Aug 2006 11:53:20 -0700
From: "Wesley Bresson" <wbresson@gmail.com>
Subject: Re: Regex...HTML::Parser...Getting webpage data?
Message-Id: <1154631200.868054.264190@s13g2000cwa.googlegroups.com>
....
> I agree. I don't like HTML::Parser's interface at all. I suggest you
> give HTML::TokeParser a shot, though. After a few tries, I'm generally
> able to get it to do what I want. I find the interface much more
> understandable than HTML::Parser's.
>
> Good luck,
> Paul Lalli
Thanks, I'll look into that. It looks like my provider does have it
installed (http://links.1and1faqs.com/perldiver.cgi), so I'll start
looking up documentation on it.
------------------------------
Date: 03 Aug 2006 19:20:46 GMT
From: xhoster@gmail.com
Subject: Re: Regex...HTML::Parser...Getting webpage data?
Message-Id: <20060803152900.679$es@newsreader.com>
"Wesley Bresson" <wbresson@gmail.com> wrote:
> I'm pretty new to Perl, my past experience has been in modifying other
> peoples code in order to do what I want it to do but now I'm trying to
> write my own to do a specific task that I can't find code for and am having
> issues. I am trying to retrieve data from a webpage, say
> http://www.apmex.com/shop/buy/Silver_American_Eagles.asp?orderid=0 for
> example, the price of a 2006 1oz Silver American Eagle in the 20-99
> price break quantity.
What do you mean by "say" and "for example"? Are all the examples going
to be extremely similar to that one, or not? If not, I don't think there
is a magic bullet for you.
> Should I use Regex to do that or would I be better off with
> HTML::Parser ?
If I just wanted to parse that page every day to see how the price changes,
I would do it with a regex. If you want to parse a lot of pages that are
kind of, but not exactly, like that, then I would probably use some kind
of HTML parsing module.
> I've attempted Regex since I seem to understand it better but
> haven't had much success in getting it to pull the right price.
$ perl -0777 -lne 's/\s+/ /g;
/2006 1oz Silver American Eagles.+?20 - 99.*?\$(\d{1,5}\.\d\d)/
and print "$1\n";' Silver_American_Eagles.html
13.95
If 20 - 99 is no longer offered for 2006 1oz Silver American Eagles, but
is for something further down on the list, you will get the price for that
thing further down on the list. Similarly, if the price for 20 - 99 is
somehow malformed, it will silently move on to the next price that
is formatted the way this pattern expects, and report that one.
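
The first pitfall can be reproduced with a minimal sketch. The two text fragments below are made up to stand in for the saved page, and the helper name price_20_99 is hypothetical. The pattern itself is the one from the one-liner.

```perl
use strict;
use warnings;

# Same whitespace collapsing and pattern as the one-liner, wrapped in a helper.
# Returns the captured price, or undef if the pattern does not match.
sub price_20_99 {
    my ($html) = @_;
    (my $text = $html) =~ s/\s+/ /g;
    return $text =~ /2006 1oz Silver American Eagles.+?20 - 99.*?\$(\d{1,5}\.\d\d)/
        ? $1
        : undef;
}

# Invented sample data: the 2006 row has a 20 - 99 price break.
my $normal = '2006 1oz Silver American Eagles 1 - 19 $14.45 20 - 99 $13.95';

# Invented sample data: the 2006 row has no 20 - 99 break, but a later row does.
my $missing = '2006 1oz Silver American Eagles 1 - 19 $14.45 '
            . '2005 1oz Silver American Eagles 20 - 99 $15.10';

print price_20_99($normal),  "\n";    # price from the intended row
print price_20_99($missing), "\n";    # price from the later row: the pitfall
```

In the second case the lazy .+? simply keeps extending until it finds a "20 - 99" anywhere later in the text, which is why the wrong row's price is reported without any warning.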
Xho
--
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB
------------------------------
Date: 3 Aug 2006 13:42:06 -0700
From: "Wesley Bresson" <wbresson@gmail.com>
Subject: Re: Regex...HTML::Parser...Getting webpage data?
Message-Id: <1154637726.581933.79000@b28g2000cwb.googlegroups.com>
> What do you mean by "say" and "for example"? Are all the examples going
> to be extremely similar to that one, or not? If not, I don't think there
> is a magic bullet for you.
>
> > Should I use Regex to do that or would I be better off with
> > HTML::Parser ?
>
> If I just wanted to parse that page every day to see how the price changes,
> I would do it with a regex. If you want to parse a lot of pages that are
> kind of, but not exactly, like that, then I would probably use some kind
> of HTML parsing module.
>
> > I've attempted Regex since I seem to understand it better but
> > haven't had much success in getting it to pull the right price.
>
> $ perl -0777 -lne 's/\s+/ /g;
> /2006 1oz Silver American Eagles.+?20 - 99.*?\$(\d{1,5}\.\d\d)/
> and print "$1\n";' Silver_American_Eagles.html
>
> 13.95
>
> If 20 - 99 is no longer offered for 2006 1oz Silver American Eagles, but
> is for something further down on the list, you will get the price for that
> thing further down on the list. Similarly, if the price for 20 - 99 is
> somehow malformed, it will silently move on to the next price that
> is formatted the way this pattern expects, and report that one.
>
>
> Xho
>
> --
> -------------------- http://NewsReader.Com/ --------------------
> Usenet Newsgroup Service $9.95/Month 30GB
By "say" and "for example" I mean that yes, that is one page that I want
to start on, but there are others that would be nice too once that one
is figured out. I tried your code for this page and it errored out, but
I'm assuming it's either my Windows Perl that is messing it up or extra
spaces in the copy/paste. I saved the page to the same dir that I was
running from, but no go. I'll look at it more later. Thanks for your
help.
C:\Users\Me\Desktop>perl -0777 -lne 's/\s+/ /g;/2006 1oz Silver American Eagles.+?20 - 99.*?\$(\d{1,5}\.\d\d)/and print "$1\n";' Silver_American_Eagles.html
Can't find string terminator "'" anywhere before EOF at -e line 1.
------------------------------
Date: 03 Aug 2006 21:51:04 GMT
From: xhoster@gmail.com
Subject: Re: Regex...HTML::Parser...Getting webpage data?
Message-Id: <20060803175919.394$Cx@newsreader.com>
"Wesley Bresson" <wbresson@gmail.com> wrote:
> I tried your code for this page and it errored out but
> I'm assuming its either my windows perl that is messing it up or extra
> spaces in the copy/paste, I saved the page to the same dir that I was
> running from but no go. I'll look at it more later, thanks for your
> help
>
> C:\Users\Me\Desktop>perl -0777 -lne 's/\s+/ /g;/2006 1oz Silver American Eagles.+?20 - 99.*?\$(\d{1,5}\.\d\d)/and print "$1\n";' Silver_American_Eagles.html
> Can't find string terminator "'" anywhere before EOF at -e line 1.
On Windows, you need to wrap your -e program in double quotes rather
than single quotes, which means you need to change any double quotes
occurring inside the script to something else, like qq'$1\n'.
Or just put the program into a file:
#!/usr/bin/perl
use strict;
use warnings;
$/=undef; # same as the -0777 command line
$_=<>; # slurp
s/\s+/ /g;
/2006 1oz Silver American Eagles.+?20 - 99.*?\$(\d{1,5}\.\d\d)/
and print "$1\n";
__END__
--
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB
------------------------------
Date: 3 Aug 2006 15:21:44 -0700
From: tambekp@gmail.com
Subject: Scanning Multiple Log files for patterns continously
Message-Id: <1154643704.384937.101190@s13g2000cwa.googlegroups.com>
Hello,
I want to scan multiple log files for multiple patterns continuously. I
am limited to the default Perl modules (can't install any extra
modules). Here is what I am attempting to do. Any guidance or pointers to
achieve this are highly appreciated.
- I want to make a hash of patterns and the files they need to be searched
against, and every pattern has a set threshold. If the threshold
is exceeded in any of the files, the script should alert. I don't want to
pass patterns/files as arguments but want to put them in a config file
or hash.
- I want to have the script run in an infinite loop so that it scans
the files from the position where the last scan left off.
- I have tried doing a hash of patterns=>filenames but this fails (and
rightly so) if the same pattern has to be searched in different files, as
the key of the hash does not remain unique in that case.
Any suggestions on how to go about implementing this?
Thanks in advance for all the help.
-k
------------------------------
Date: Thu, 03 Aug 2006 23:10:30 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude.invalid>
Subject: Re: Scanning Multiple Log files for patterns continously
Message-Id: <Xns9814C327250F9asu1cornelledu@127.0.0.1>
tambekp@gmail.com wrote in news:1154643704.384937.101190
@s13g2000cwa.googlegroups.com:
> - I want to make a hash of patterns and files they need to be searched
> against and then every pattern has a set threshold. If the threshold
> exceeds in any of the files the script should alert. I dont want to
> pass patterns/files as arguments but want to put them in a config file
> or hash
>
> - I want to have the script run in an infinite loop so that it scans
> the files from the position it left the last scan.
The first task is to write this for scanning only one file at a time.
First off, I do not think keeping the file open is a good idea: The file
may actually be deleted and re-created while your script is running (I
don't think this is possible on Windows, but AFAIK, it is allowed on
various *nix flavored OS). In that case, you would be scanning the wrong
file.
Instead, check out the discussion in
perldoc -f seek
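
A minimal sketch of the reopen-and-seek idea using only core modules. The helper name scan_from, its arguments, and its return values are assumptions made for illustration. It reopens the file on every pass, restarts from the top if the file has shrunk (for example after being recreated), and reports where it stopped so the next pass can resume there.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Hypothetical helper: scan $path starting at byte $offset and count, per
# pattern, how many new lines match. Returns the new offset and a hash ref
# keyed by the stringified pattern.
sub scan_from {
    my ($path, $offset, $patterns) = @_;
    my %count;

    open my $fh, '<', $path or die "Cannot open $path: $!";

    # A file smaller than the remembered offset was truncated or recreated.
    my $size = (stat $fh)[7];
    $offset = 0 if $size < $offset;

    seek $fh, $offset, SEEK_SET or die "Cannot seek in $path: $!";
    while (my $line = <$fh>) {
        for my $re (@$patterns) {
            $count{$re}++ if $line =~ $re;
        }
    }
    my $new_offset = tell $fh;
    close $fh;

    return ($new_offset, \%count);
}
```

A caller would keep one offset per file and call scan_from inside a loop with a sleep between passes. This sketch does not handle a line that the writer has only partly flushed, nor a file recreated at a larger size than before.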
> - I have tried doing a hash of patterns=>filenames but this fails (and
> rightly so) if same pattern has to be searched in different files as
> the key of the hash does not remain unique in that case.
I am not sure what you are talking about here. Hash keys are always
strings, and cannot be patterns.
The most obvious structure would be:
my %patterns_by_file = (
log1 => [ qr/^ERROR/, qr/^WARNING/ ],
log2 => [ qr/^INFO/, qr/^ALERT/ ],
# etc
);
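
To carry the per-pattern thresholds from the original question as well, the structure could be extended as in this minimal sketch. The field names pattern and threshold, the sample values, and the helper check_lines are all assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical config: file name => list of rules. The same pattern can
# appear under several files because the file name is the hash key.
my %rules_by_file = (
    log1 => [
        { pattern => qr/^ERROR/,   threshold => 2 },
        { pattern => qr/^WARNING/, threshold => 5 },
    ],
    log2 => [
        { pattern => qr/^ERROR/,   threshold => 1 },
    ],
);

# Return one alert message for each rule whose match count among @lines
# exceeds its threshold.
sub check_lines {
    my ($file, @lines) = @_;
    my @alerts;
    for my $rule (@{ $rules_by_file{$file} || [] }) {
        my $hits = grep { $_ =~ $rule->{pattern} } @lines;
        push @alerts, "$file: $rule->{pattern} matched $hits times"
            if $hits > $rule->{threshold};
    }
    return @alerts;
}
```

Keying on the file name rather than the pattern sidesteps the uniqueness problem raised in the question.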
> Any suggestions on how to go about implementing this?
Get the one file version working first. You might want to use or refer to
File::Tail:
http://search.cpan.org/~mgrabnar/File-Tail-0.99.3/Tail.pm for help.
Then, possibly use Parallel::ForkManager to run multiple scanners
simultaneously:
http://search.cpan.org/~dlux/Parallel-ForkManager-0.7.5/ForkManager.pm
Now, it is your turn to attack the problem, fill in the blanks, and come
up with some code.
In the meantime, please read the posting guidelines (esp. the section on
posting code).
Sinan
--
A. Sinan Unur <1usa@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)
comp.lang.perl.misc guidelines on the WWW:
http://augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
------------------------------
Date: 3 Aug 2006 16:18:06 -0700
From: tambekp@gmail.com
Subject: Re: Scanning Multiple Log files for patterns continously
Message-Id: <1154647086.858597.80920@m73g2000cwd.googlegroups.com>
Thanks for the pointers Sinan.
The readings you suggested should be a good start for me.
-k
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 9558
***************************************