[29757] in Perl-Users-Digest
Perl-Users Digest, Issue: 1000 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Nov 2 21:09:41 2007
Date: Fri, 2 Nov 2007 18:09:09 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Fri, 2 Nov 2007 Volume: 11 Number: 1000
Today's topics:
How Do You Tell Time in Perl? kvnsmnsn@hotmail.com
Re: Mod_Perl / Apache Issue <superman183@hotmail.com>
Re: Mod_Perl / Apache Issue <jimsgibson@gmail.com>
Re: PID of exec hendedav@gmail.com
Re: Script to find largest files <hjp-usenet2@hjp.at>
Re: Simple string search <5502109103600001@t-online.de>
Re: Simple string search <jordilin@gmail.com>
Re: Simple string search <glex_no-spam@qwest-spam-no.invalid>
Re: Simple string search <jordilin@gmail.com>
Re: Simple string search (Doug Miller)
Re: Simple string search <jurgenex@hotmail.com>
Re: Simple string search <jordilin@gmail.com>
Re: Simple string search <jurgenex@hotmail.com>
Re: Simple string search <jordilin@gmail.com>
Re: Simple string search <jurgenex@hotmail.com>
Re: Simple string search <5502109103600001@t-online.de>
Re: Simple string search (Doug Miller)
Re: Simple string search <jordilin@gmail.com>
Re: string matching doesn't work <jimsgibson@gmail.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Fri, 02 Nov 2007 17:58:14 -0700
From: kvnsmnsn@hotmail.com
Subject: How Do You Tell Time in Perl?
Message-Id: <1194051494.309489.161940@q5g2000prf.googlegroups.com>
I need to write a couple of Perl scripts that operate on a cache, and
the page replacement policy for this cache is Least Recently Used,
which means I need to record a time value for each entry in the cache.
How can one tell what the time is in Perl, so that each time I access
one of these entries I can modify its timestamp? Any information on
this would be greatly appreciated.
I know that I can find out how Perl works with the "perldoc" command,
but that command expects a module name as an argument, and I have no
idea which module has to do with telling time.
---Kevin Simonson
"You'll never get to heaven, or even to LA,
if you don't believe there's a way."
from _Why Not_
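
A minimal sketch of one way to keep LRU timestamps, assuming a plain hash is used as the cache. The built-in time() returns seconds since the epoch, and "perldoc -f time" shows the documentation for a built-in function, so no module name is needed. The names %cache, touch_entry and lru_key are made up for this illustration and are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical cache layout: key => { value => ..., atime => epoch seconds }.
my %cache;

# Store or refresh an entry and stamp it with an access time.
# The optional third argument lets a caller supply the time explicitly.
sub touch_entry {
    my ($key, $value, $now) = @_;
    $now = time() unless defined $now;    # built-in: seconds since the epoch
    $cache{$key}{value} = $value if defined $value;
    $cache{$key}{atime} = $now;
    return $now;
}

# Return the key with the oldest access time (the LRU eviction candidate).
sub lru_key {
    my ($oldest) = sort { $cache{$a}{atime} <=> $cache{$b}{atime} } keys %cache;
    return $oldest;
}

touch_entry('a', 1, 100);
touch_entry('b', 2, 200);
touch_entry('a', undef, 300);    # 'a' is accessed again, so 'b' is now oldest
print lru_key(), "\n";
print scalar(localtime(time())), "\n";    # human-readable current time
```

Since time() only has one-second resolution, entries touched within the same second tie. If that matters, the core Time::HiRes module or a simple incrementing counter may be a better fit than wall-clock time.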
------------------------------
Date: Fri, 02 Nov 2007 11:09:31 -0700
From: Jane <superman183@hotmail.com>
Subject: Re: Mod_Perl / Apache Issue
Message-Id: <1194026971.591436.174030@z9g2000hsf.googlegroups.com>
>
> http://perl.apache.org/download/index.html
>
> The above link states that mod_perl 2.0 is for use with Apache 2.0.x/
> 2.2.x.
Well, on the ppm installer there are a couple of downloads, and it
says:
mod_perl-2.0 - Embed a Perl interpreter in the Apache/2.0 HTTP server
mod_perl - Embed a Perl interpreter in the Apache/2.2 HTTP server
... so I chose the former. (don't forget this is on Windows).
Thanks,
Jane
------------------------------
Date: Fri, 02 Nov 2007 11:24:39 -0700
From: Jim Gibson <jimsgibson@gmail.com>
Subject: Re: Mod_Perl / Apache Issue
Message-Id: <021120071124399814%jimsgibson@gmail.com>
In article <1194026971.591436.174030@z9g2000hsf.googlegroups.com>, Jane
<superman183@hotmail.com> wrote:
[stuff about getting mod_perl working with Apache]
Thankfully, I do not do Windows, but you might be interested in
IndigoPerl, which has been mentioned in this forum before:
<http://www.indigostar.com/indigoperl.htm>
It is free and includes an Apache server with mod_perl installed.
--
Jim Gibson
Posted Via Usenet.com Premium Usenet Newsgroup Services
----------------------------------------------------------
** SPEED ** RETENTION ** COMPLETION ** ANONYMITY **
----------------------------------------------------------
http://www.usenet.com
------------------------------
Date: Fri, 02 Nov 2007 18:43:44 -0000
From: hendedav@gmail.com
Subject: Re: PID of exec
Message-Id: <1194029024.897864.216340@19g2000hsx.googlegroups.com>
The test script was called using the same frequency as the other
script. Maybe it is the web server that is causing the problem. Has
anyone heard of a problem with calling the same script too frequently
(i.e., the previous run hasn't had time to finish before another instance
is called)?
Dave
------------------------------
Date: Fri, 2 Nov 2007 23:28:08 +0100
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: Script to find largest files
Message-Id: <slrnfin93o.qme.hjp-usenet2@zeno.hjp.at>
On 2007-11-02 14:05, bugbear <bugbear@trim_papermule.co.uk_trim> wrote:
> groups.user@gmail.com wrote:
>> i'm looking for a script to find the largest files in a filesystem,
>> ordered by size.
>>
>
> du -s /* | sort -rn
man du
hp
--
_ | Peter J. Holzer | It took a genius to create [TeX],
|_|_) | Sysadmin WSR | and it takes a genius to maintain it.
| | | hjp@hjp.at | That's not engineering, that's art.
__/ | http://www.hjp.at/ | -- David Kastrup in comp.text.tex
------------------------------
Date: Fri, 02 Nov 2007 19:36:58 +0100
From: Josef Moellers <5502109103600001@t-online.de>
Subject: Re: Simple string search
Message-Id: <fgfqoh$gov$02$1@news.t-online.com>
jordilin wrote:
> On 1 nov, 18:25, Josef Moellers <5502109103600...@t-online.de> wrote:
>> Jack wrote:
>>> hi guys,
>>> A little problem here. I am very new to perl and i am having a problem
>>> search for a substring in a file. So here is a sample
>>> (this is my id for id="wksOI*84sk_")
>>> (this is my id for id="@s3dSSos_")
>>> (this is my id for id="dksWDkps_")
>>> So i have page with 20 of these lines. all I am interested in the id
>>> part of each line ie, wksOl*84sk_ . As you maybe able to tell the id
>>> part of each line is 12 char and it always ends with _"). I think the
>>> regex must be for an expression that starts with id=" and ends with
>>> ") with 12 letters in the middle. So once this id has been found I
>>> need to write it in a file.
>> This is somewhat inconsistent. The example you gave (wksOl*84sk_) is
>> only 11 characters long.
>>
>>> I know with m/regex/ I can find stuff, but I don' t how to return the
>>> cryptic id.
>>> Any solutions.
>> if ($string =~ m/id="(.{12})"/) {
>> $desired_id = $1;
>>
>> }
>>
>> --
>> Mails please to josef dot moellers
>> and I'm on gmx dot de.
>
> I have quickly written the following and tested it successfully:
>
> while (<>) {
> if (/^\(.* id="(.*)"\)/) {
> print "$1\n";
> }
> }
>
> This works.
Fine. TMTOWTDI.
This regex also works:
\(this is my id for id="(.*_)"\)
It's a question of the requirement: how is the input structured and how
much of the input has to be matched in order to avoid false positives.
--
Mails please to josef dot moellers
and I'm on gmx dot de.
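
A short sketch of the false-positive point, comparing the looser pattern quoted above with the stricter one on the sample lines from the original post. The fourth input line and the helper extract_ids are invented for this illustration.

```perl
use strict;
use warnings;

my @lines = (
    '(this is my id for id="wksOI*84sk_")',
    '(this is my id for id="@s3dSSos_")',
    '(this is my id for id="dksWDkps_")',
    '(see also id="draft")',    # hypothetical line that should be rejected
);

my $loose  = qr/^\(.* id="(.*)"\)/;
my $strict = qr/\(this is my id for id="(.*_)"\)/;

# Collect the first capture group from every line that matches $re.
sub extract_ids {
    my ($re, @input) = @_;
    my @ids;
    for my $line (@input) {
        push @ids, $1 if $line =~ $re;
    }
    return @ids;
}

my @loose_ids  = extract_ids($loose,  @lines);
my @strict_ids = extract_ids($strict, @lines);

print "loose:  @loose_ids\n";
print "strict: @strict_ids\n";
```

The looser pattern also picks up "draft" from the made-up line, while the stricter one returns only the three ids. Which behaviour is wanted depends on how the real input is structured.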
------------------------------
Date: Fri, 02 Nov 2007 19:50:24 -0000
From: jordilin <jordilin@gmail.com>
Subject: Re: Simple string search
Message-Id: <1194033024.617604.170330@19g2000hsx.googlegroups.com>
On 1 nov, 18:25, Josef Moellers <5502109103600...@t-online.de> wrote:
> Jack wrote:
> > hi guys,
> > A little problem here. I am very new to perl and i am having a problem
> > search for a substring in a file. So here is a sample
>
> > (this is my id for id="wksOI*84sk_")
> > (this is my id for id="@s3dSSos_")
> > (this is my id for id="dksWDkps_")
>
> > So i have page with 20 of these lines. all I am interested in the id
> > part of each line ie, wksOl*84sk_ . As you maybe able to tell the id
> > part of each line is 12 char and it always ends with _"). I think the
> > regex must be for an expression that starts with id=" and ends with
> > ") with 12 letters in the middle. So once this id has been found I
> > need to write it in a file.
>
> This is somewhat inconsistent. The example you gave (wksOl*84sk_) is
> only 11 characters long.
>
> > I know with m/regex/ I can find stuff, but I don' t how to return the
> > cryptic id.
>
> > Any solutions.
>
> if ($string =~ m/id="(.{12})"/) {
> $desired_id = $1;
>
> }
>
> --
> Mails please to josef dot moellers
> and I'm on gmx dot de.
Yes, there is more than one way to do it. This is Perl, isn't it? By
the way, can you explain the following regex?
m/id="(.{12})"/
I am not sure about this 12 between brackets.
Thanks in advance,
Jordi
------------------------------
Date: Fri, 02 Nov 2007 14:54:34 -0500
From: "J. Gleixner" <glex_no-spam@qwest-spam-no.invalid>
Subject: Re: Simple string search
Message-Id: <472b807a$0$497$815e3792@news.qwest.net>
jordilin wrote:
[...]
> By the way, can you explain the following regex?
> m/id="(.{12})"/
> I am not sure about this 12 between brackets.
When in doubt, read the documentation, or a book.
perldoc perlretut
Look for 'Matching repetitions'.
------------------------------
Date: Fri, 02 Nov 2007 13:10:47 -0700
From: jordilin <jordilin@gmail.com>
Subject: Re: Simple string search
Message-Id: <1194034247.332579.212190@y42g2000hsy.googlegroups.com>
On 2 nov, 19:54, "J. Gleixner" <glex_no-s...@qwest-spam-no.invalid>
wrote:
> jordilin wrote:
>
> [...]
>
> > By the way, can you explain the following regex?
> > m/id="(.{12})"/
> > I am not sure about this 12 between brackets.
>
> When in doubt, read the documentation, or a book.
>
> perldoc perlretut
>
> Look for 'Matching repetitions'.
The reason I am asking is because I have tried this particular regex
and it does not work in this particular example and I want the
explanation of the author.
It is very easy to say "look at the docs". By the way, I recommend Mastering
Regular Expressions from O'Reilly.
best regards,
Jordi
------------------------------
Date: Fri, 02 Nov 2007 21:39:02 GMT
From: spambait@milmac.com (Doug Miller)
Subject: Re: Simple string search
Message-Id: <HVLWi.2943$%Z2.2713@nlpi068.nbdc.sbc.com>
In article <1194033024.617604.170330@19g2000hsx.googlegroups.com>, jordilin <jordilin@gmail.com> wrote:
>Yes, there is more than one way to do it. This is Perl, isn't it? By
>the way, can you explain the following regex?
>m/id="(.{12})"/
>I am not sure about this 12 between brackets.
. means "any character"
{12} means "whatever came just before this, we're looking for 12 of it".
So .{12} means "any sequence of exactly 12 characters", and (.{12}) means
"open paren, followed by any sequence of exactly 12 characters, followed by
close paren".
--
Regards,
Doug Miller (alphageek at milmac dot com)
It's time to throw all their damned tea in the harbor again.
------------------------------
Date: Fri, 02 Nov 2007 21:02:26 GMT
From: "Jürgen Exner" <jurgenex@hotmail.com>
Subject: Re: Simple string search
Message-Id: <CfMWi.45$lx.21@trndny05>
Doug Miller wrote:
> So .{12} means "any sequence of exactly 12 characters",
So far so good
> and (.{12})
> means "open paren, followed by any sequence of exactly 12 characters,
> followed by close paren".
Aehmmm, no.
jue
------------------------------
Date: Fri, 02 Nov 2007 14:06:14 -0700
From: jordilin <jordilin@gmail.com>
Subject: Re: Simple string search
Message-Id: <1194037574.002121.303560@o3g2000hsb.googlegroups.com>
On 2 nov, 21:39, spamb...@milmac.com (Doug Miller) wrote:
> In article <1194033024.617604.170...@19g2000hsx.googlegroups.com>, jordilin <jordi...@gmail.com> wrote:
>
> >Yes, there is more than one way to do it. This is Perl, isn't it? By
> >the way, can you explain the following regex?
> >m/id="(.{12})"/
> >I am not sure about this 12 between brackets.
>
> . means "any character"
> {12} means "whatever came just before this, we're looking for 12 of it".
>
> So .{12} means "any sequence of exactly 12 characters", and (.{12}) means
> "open paren, followed by any sequence of exactly 12 characters, followed by
> close paren".
>
> --
> Regards,
> Doug Miller (alphageek at milmac dot com)
>
> It's time to throw all their damned tea in the harbor again.
Well, I understand. The problem is that, in this example the ids
differ in length, so it does not work here. We should write sth like
m/id="(.{7,})"/
match at least 7 times, taking into account there are no ids with less
than 7 chars.
Thanks
jordi
------------------------------
Date: Fri, 02 Nov 2007 21:10:03 GMT
From: "Jürgen Exner" <jurgenex@hotmail.com>
Subject: Re: Simple string search
Message-Id: <LmMWi.13$4I.2@trndny03>
jordilin wrote:
> Well, I understand. The problem is that, in this example the ids
> differ in length, so it does not work here. We should write sth like
>
> m/id="(.{7,})"/
>
> match at least 7 times, taking into account there are no ids with less
> than 7 chars.
Taking into account that HTML is not a regular language only a fool would
try to parse HTML using Regular Expressions. Even with the non-regular
enhancements in Perl REs are the wrong tool to parse HTML. This has been
discussed in this NG gazillions of times.
Or do you also use a hammer to fasten a screw? It works, ... sort of.
Use a tool that is meant to parse HTML if you want to parse HTML, e.g.
HTML::Parse.
jue
------------------------------
Date: Fri, 02 Nov 2007 14:16:33 -0700
From: jordilin <jordilin@gmail.com>
Subject: Re: Simple string search
Message-Id: <1194038193.826213.271380@d55g2000hsg.googlegroups.com>
On 2 nov, 21:10, "Jürgen Exner" <jurge...@hotmail.com> wrote:
> jordilin wrote:
> > Well, I understand. The problem is that, in this example the ids
> > differ in length, so it does not work here. We should write sth like
>
> > m/id="(.{7,})"/
>
> > match at least 7 times, taking into account there are no ids with less
> > than 7 chars.
>
> Taking into account that HTML is not a regular language only a fool would
> try to parse HTML using Regular Expressions. Even with the non-regular
> enhancements in Perl REs are the wrong tool to parse HTML. This has been
> discussed in this NG gazillions of times.
> Or do you also use a hammer to fasten a screw? It works, ... sort of.
>
> Use a tool that is meant to parse HTML if you want to parse HTML, e.g.
> HTML::Parse.
>
> jue
I think you have posted in the wrong thread mate. This is not about
html,
Best regards,
Jordi
------------------------------
Date: Fri, 02 Nov 2007 21:25:45 GMT
From: "Jürgen Exner" <jurgenex@hotmail.com>
Subject: Re: Simple string search
Message-Id: <tBMWi.10$sN.0@trndny02>
jordilin wrote:
> On 2 nov, 21:10, "Jürgen Exner" <jurge...@hotmail.com> wrote:
>> Taking into account that HTML is not a regular language only a fool
>> would try to parse HTML using Regular Expressions.
> I think you have posted in the wrong thread mate. This is not about
> html,
Oooops, indeed.
Sorry, I got two threads confused. You are right.
jue
------------------------------
Date: Fri, 02 Nov 2007 22:32:37 +0100
From: Josef Moellers <5502109103600001@t-online.de>
Subject: Re: Simple string search
Message-Id: <fgg51t$4ts$03$1@news.t-online.com>
jordilin wrote:
> On 2 nov, 21:39, spamb...@milmac.com (Doug Miller) wrote:
>> In article <1194033024.617604.170...@19g2000hsx.googlegroups.com>, jordilin <jordi...@gmail.com> wrote:
>>
>>> Yes, there is more than one way to do it. This is Perl, isn't it? By
>>> the way, can you explain the following regex?
>>> m/id="(.{12})"/
>>> I am not sure about this 12 between brackets.
>> . means "any character"
>> {12} means "whatever came just before this, we're looking for 12 of it".
>>
>> So .{12} means "any sequence of exactly 12 characters", and (.{12}) means
>> "open paren, followed by any sequence of exactly 12 characters, followed by
>> close paren".
>>
>> --
>> Regards,
>> Doug Miller (alphageek at milmac dot com)
>>
>> It's time to throw all their damned tea in the harbor again.
>
> Well, I understand. The problem is that, in this example the ids
> differ in length, so it does not work here. We should write sth like
>
> m/id="(.{7,})"/
>
> match at least 7 times, taking into account there are no ids with less
> than 7 chars.
But "Jack" writes in the original post "all I am interested in the id
part of each line ie, wksOl*84sk_ . As you maybe able to tell the id
part of each line is 12 char and it always ends with _")."
So I thought that whatever is between the "" is the id and it's supposed
to be 12 characters long.
If you now state that it should have been 7 or more, please re-read the
original post.
If the requirement is "at least 7", then, indeed, ".{7,}" is correct, as
can be found in "perldoc perlre". If the requirement is "12", then ".{12}"
is correct. If the requirement were "anything between the quote signs,
no matter how much", then ".*" would be correct.
I was under the assumption that the OP wanted to filter out illegal ids
which are not 12 characters long.
YMMV,
Josef
--
Mails please to josef dot moellers
and I'm on gmx dot de.
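
To make the three requirements concrete, here is a small sketch that counts how many sample strings each quantifier accepts. The first two samples are ids from the original post; the last two, and the helper count_matches, are made up for this illustration.

```perl
use strict;
use warnings;

my %pattern = (
    'exactly 12' => qr/id="(.{12})"/,
    'at least 7' => qr/id="(.{7,})"/,
    'any length' => qr/id="(.*)"/,
);

my @samples = (
    'id="wksOI*84sk_"',     # 11 characters, from the original post
    'id="@s3dSSos_"',       # 9 characters, from the original post
    'id="abcdefghijk_"',    # 12 characters, made up for this sketch
    'id="abc_"',            # 4 characters, made up for this sketch
);

# Count how many of the samples match the given pattern.
sub count_matches {
    my ($re) = @_;
    my $count = 0;
    for my $sample (@samples) {
        $count++ if $sample =~ $re;
    }
    return $count;
}

for my $name (sort keys %pattern) {
    printf "%-10s matches %d of %d samples\n",
        $name, count_matches($pattern{$name}), scalar @samples;
}
```

With these samples, ".{12}" accepts only the 12-character id, ".{7,}" accepts the three ids of 9 or more characters, and ".*" accepts all four.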
------------------------------
Date: Fri, 02 Nov 2007 22:42:57 GMT
From: spambait@milmac.com (Doug Miller)
Subject: Re: Simple string search
Message-Id: <LRMWi.3491$%Y6.2246@nlpi061.nbdc.sbc.com>
In article <CfMWi.45$lx.21@trndny05>, "Jürgen Exner" <jurgenex@hotmail.com> wrote:
>Doug Miller wrote:
>> So .{12} means "any sequence of exactly 12 characters",
>
>So far so good
>
>> and (.{12})
>> means "open paren, followed by any sequence of exactly 12 characters,
>> followed by close paren".
>
>Aehmmm, no.
>
My fault -- you're right. It *would* mean that if the parens were escaped,
i.e. \( and \). As is, it just means a sequence of 12 characters.
>jue
>
>
--
Regards,
Doug Miller (alphageek at milmac dot com)
It's time to throw all their damned tea in the harbor again.
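
A small sketch of the distinction settled above: unescaped parentheses group and capture, while escaped ones match literal parenthesis characters. The test string and variable names are invented for this illustration.

```perl
use strict;
use warnings;

my $text = 'id="abcdefghijkl" and (mnopqrstuvwx)';

# Unescaped parentheses capture the 12 characters between the quotes;
# they do not match any literal "(" or ")" in the text.
my ($captured) = $text =~ /id="(.{12})"/;

# Escaped parentheses match a literal "(" and ")". The inner unescaped
# pair is still needed to capture the 12 characters between them.
my ($literal) = $text =~ /\((.{12})\)/;

print "captured: $captured\n";
print "literal:  $literal\n";
```

In list context a successful match returns the captured groups, which is why each result is assigned through a one-element list.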
------------------------------
Date: Fri, 02 Nov 2007 16:03:18 -0700
From: jordilin <jordilin@gmail.com>
Subject: Re: Simple string search
Message-Id: <1194044598.797880.204270@y42g2000hsy.googlegroups.com>
On 2 nov, 21:32, Josef Moellers <5502109103600...@t-online.de> wrote:
> jordilin wrote:
> > On 2 nov, 21:39, spamb...@milmac.com (Doug Miller) wrote:
> >> In article <1194033024.617604.170...@19g2000hsx.googlegroups.com>, jordilin <jordi...@gmail.com> wrote:
>
> >>> Yes, there is more than one way to do it. This is Perl, isn't it? By
> >>> the way, can you explain the following regex?
> >>> m/id="(.{12})"/
> >>> I am not sure about this 12 between brackets.
> >> . means "any character"
> >> {12} means "whatever came just before this, we're looking for 12 of it".
>
> >> So .{12} means "any sequence of exactly 12 characters", and (.{12}) means
> >> "open paren, followed by any sequence of exactly 12 characters, followed by
> >> close paren".
>
> >> --
> >> Regards,
> >> Doug Miller (alphageek at milmac dot com)
>
> >> It's time to throw all their damned tea in the harbor again.
>
> > Well, I understand. The problem is that, in this example the ids
> > differ in length, so it does not work here. We should write sth like
>
> > m/id="(.{7,})"/
>
> > match at least 7 times, taking into account there are no ids with less
> > than 7 chars.
>
> But "Jack" writes in the original post "all I am interested in the id
> part of each line ie, wksOl*84sk_ . As you maybe able to tell the id
> part of each line is 12 char and it always ends with _")."
> So I thought that whatever is between the "" is the id and it's supposed
> to be 12 characters long.
> If you now state that it should have been 7 or more, please re-read the
> original post.
>
> If the requirement is "at least 7", then, indeed, ".{7,}" is correct, as
> can be found in "perldoc perlre". If the requirement is "12", then ".{12}"
> is correct. If the requirement were "anything between the quote signs,
> no matter how much", then ".*" would be correct.
> I was under the assumption that the OP wanted to filter out illegal ids
> which are not 12 characters long.
>
> YMMV,
>
> Josef
> --
> Mails please to josef dot moellers
> and I'm on gmx dot de.
Well, you are absolutely right. The original poster should state
clearly what he wants, but he doesn't.
In any case, I think we have already answered several options that the
original poster can take to solve his problem.
regards,
jordi
------------------------------
Date: Fri, 02 Nov 2007 11:19:24 -0700
From: Jim Gibson <jimsgibson@gmail.com>
Subject: Re: string matching doesn't work
Message-Id: <021120071119240899%jimsgibson@gmail.com>
In article <1194024730.026878.43090@z24g2000prh.googlegroups.com>, Jack
<accpactec@hotmail.com> wrote:
> I used Mirco Wahab's solution and it worked for some of the cases but
> now I am stuck at parsing this line.
>
> <td class="profile">Address<br>City State V2X
> 6J3<br>Canada<br>
>
> </td><td align="right" class="profile"><nobr><b>Phone: </
> b>555-555-5555</nobr>
> </td>
>
>
> I want to parse each section into a variable using qr. so for the
> Address I used
> <td[^>]+> ([^<]+) <br> # this line works fine and I get the address
> portion
> .+?
> <br> ([^&]+)  \; # this part doesn't work meaning that I
> don't get anything for city
>
> Any help would be much appreciated.
If you want help with a program, you need to post a complete, working,
short-as-possible version that demonstrates the problem you are having.
You are running up against the problem of parsing HTML with regular
expressions. Mirco's code was for a very simple case. As your actual
cases become more complex, you have to add more complexity to your REs.
Maybe it is time to consider Mirco's first recommendation: learn to use
an HTML-parsing module, such as HTML::Parser.
--
Jim Gibson
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 1000
***************************************