[32803] in Perl-Users-Digest


Perl-Users Digest, Issue: 4067 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Oct 31 09:09:35 2013

Date: Thu, 31 Oct 2013 06:09:03 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Thu, 31 Oct 2013     Volume: 11 Number: 4067

Today's topics:
    Re: Can this be combined into one statement? <ben@morrow.me.uk>
    Re: Can this be combined into one statement? <rweikusat@mobileactivedefense.com>
    Re: Can this be combined into one statement? <ben@morrow.me.uk>
    Re: Can this be combined into one statement? <jblack@nospam.com>
    Re: Can this be combined into one statement? <ben@morrow.me.uk>
    Re: readdir <*@eli.users.panix.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Tue, 29 Oct 2013 22:14:25 +0000
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Can this be combined into one statement?
Message-Id: <1jo5ka-is12.ln1@anubis.morrow.me.uk>


Quoth "Peter J. Holzer" <hjp-usenet3@hjp.at>:
> On 2013-10-29 18:36, John Black <jblack@nospam.com> wrote:
> > In article <878uxc6mee.fsf@sable.mobileactivedefense.com>,
> > rweikusat@mobileactivedefense.com
> >
> > Yep, I thought of this after posting.  Thanks.  I like this.  I bet
> > it's faster than using split which ends up extracting a bunch of
> > fields that are never used here.
> 
> OTOH the regexp probably needs to do a lot of backtracking, so you might
> lose that bet. 
> 
> Let's see:
[...]
>        Rate match split
> match 208/s    --  -67%
> split 625/s  200%    --
> 
> 
> Yup, split is about 3 times faster for this particular set of strings
> (may be wildly different for other strings).

Interestingly, perl is much better at optimising /.*\s(\S+)/ (it only
has to backtrack over the last word, instead of the whole string), so
that comes out faster again:

             Rate \S+\s*$   split .*\s\S+
    \S+\s*$ 274/s      --    -66%    -66%
    split   794/s    190%      --     -2%
    .*\s\S+ 812/s    197%      2%      --

Not much, though.
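
For anyone wanting to reproduce this kind of comparison, a minimal
sketch follows. The benchmark code itself is elided above, so the input
lines, the labels and the one-second run time here are all assumptions,
and the rates it prints will not match the table.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed test data: lines of 5, 20 and 80 filler words plus a final
# word, each with trailing whitespace. Not the strings used above.
my @lines = map { join(' ', ('word') x $_, "last$_") . ' ' } 5, 20, 80;

# Three ways to pull out the last whitespace-delimited word.
my %last_word = (
    'anchored' => sub { $_[0] =~ /(\S+)\s*$/ ? $1 : undef },
    'greedy'   => sub { $_[0] =~ /.*\s(\S+)/ ? $1 : undef },
    'split'    => sub { (split ' ', $_[0])[-1] },
);

# Make sure all candidates agree before comparing their speed.
for my $line (@lines) {
    my @got = map { $last_word{$_}->($line) } sort keys %last_word;
    die "candidates disagree on '$line'\n" if grep { $_ ne $got[0] } @got;
}

# Wrap each candidate so one benchmark iteration processes every line.
my %bench;
for my $name (keys %last_word) {
    my $extract = $last_word{$name};
    $bench{$name} = sub { $extract->($_) for @lines };
}

# -1: run each candidate for at least 1 CPU second.
cmpthese(-1, \%bench);
```

Note that the greedy pattern needs at least one whitespace character
before the last word, so it is not a drop-in replacement for one-word
lines.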

Ben



------------------------------

Date: Tue, 29 Oct 2013 23:10:24 +0000
From: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Subject: Re: Can this be combined into one statement?
Message-Id: <87r4b365cv.fsf@sable.mobileactivedefense.com>

Ben Morrow <ben@morrow.me.uk> writes:
> Quoth "Peter J. Holzer" <hjp-usenet3@hjp.at>:
>> On 2013-10-29 18:36, John Black <jblack@nospam.com> wrote:
>> > In article <878uxc6mee.fsf@sable.mobileactivedefense.com>,
>> > rweikusat@mobileactivedefense.com
>> >
>> > Yep, I thought of this after posting.  Thanks.  I like this.  I bet
>> > it's faster than using split which ends up extracting a bunch of
>> > fields that are never used here.
>> 
>> OTOH the regexp probably needs to do a lot of backtracking, so you might
>> lose that bet. 
>> 
>> Let's see:
> [...]
>>        Rate match split
>> match 208/s    --  -67%
>> split 625/s  200%    --
>> 
>> 
>> Yup, split is about 3 times faster for this particular set of strings
>> (may be wildly different for other strings).
>
> Interestingly, perl is much better at optimising /.*\s(\S+)/ (it only
> has to backtrack over the last word, instead of the whole string), so
> that comes out faster again:
>
>              Rate \S+\s*$   split .*\s\S+
>     \S+\s*$ 274/s      --    -66%    -66%
>     split   794/s    190%      --     -2%
>     .*\s\S+ 812/s    197%      2%      --
>
> Not much, though.

I tried this as well: The more words there are on such a line, the
better the "Don't backtrack!" match performs.


------------------------------

Date: Tue, 29 Oct 2013 23:26:54 +0000
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Can this be combined into one statement?
Message-Id: <uqs5ka-lf22.ln1@anubis.morrow.me.uk>


Quoth John Black <jblack@nospam.com>:
> In article <slrnl70910.e1m.hjp-usenet3@hrunkner.hjp.at>,
> hjp-usenet3@hjp.at says...
> > 
> > cmpthese(-5,
> 
> BTW, what is the -5 option doing in the cmpthese function?  I thought
> the first param was the number of iterations, but then negative
> doesn't make sense?

RTFM. -5 means 'run each sub for at least 5 CPU seconds', rather than a
fixed number of iterations.
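
A small sketch of the difference, using timethis from the same
Benchmark module, which takes the same kind of count. The workload sub
and the counts 2000 and -1 are arbitrary assumptions picked for
illustration.

```perl
use strict;
use warnings;
use Benchmark qw(timethis);

# Arbitrary workload, chosen only so there is something to time.
my $work = sub { my $s = join ',', 1 .. 50; return length $s };

# Positive count: run exactly that many iterations.
# (Benchmark may print a "too few iterations" warning for short runs.)
my $fixed = timethis(2000, $work, 'fixed', 'none');

# Negative count: keep running until at least 1 CPU second is used.
my $timed = timethis(-1, $work, 'timed', 'none');

printf "fixed count: %d iterations\n", $fixed->iters;
printf "timed run:   %d iterations in %.2f CPU seconds\n",
    $timed->iters, $timed->cpu_a;
```

The timed form is usually the more convenient one for cmpthese, since
you don't have to guess an iteration count that runs long enough to be
meaningful.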

Ben



------------------------------

Date: Wed, 30 Oct 2013 11:06:11 -0500
From: John Black <jblack@nospam.com>
Subject: Re: Can this be combined into one statement?
Message-Id: <MPG.2cdaea585e8bdc37989796@news.eternal-september.org>

In article <87r4b365cv.fsf@sable.mobileactivedefense.com>, rweikusat@mobileactivedefense.com 
says...
> 
> Ben Morrow <ben@morrow.me.uk> writes:
> > Quoth "Peter J. Holzer" <hjp-usenet3@hjp.at>:
> >> On 2013-10-29 18:36, John Black <jblack@nospam.com> wrote:
> >> > In article <878uxc6mee.fsf@sable.mobileactivedefense.com>,
> >> > rweikusat@mobileactivedefense.com
> >> >
> >> > Yep, I thought of this after posting.  Thanks.  I like this.  I bet
> >> > it's faster than using split which ends up extracting a bunch of
> >> > fields that are never used here.
> >> 
> >> OTOH the regexp probably needs to do a lot of backtracking, so you might
> >> lose that bet. 
> >> 
> >> Let's see:
> > [...]
> >>        Rate match split
> >> match 208/s    --  -67%
> >> split 625/s  200%    --
> >> 
> >> 
> >> Yup, split is about 3 times faster for this particular set of strings
> >> (may be wildly different for other strings).
> >
> > Interestingly, perl is much better at optimising /.*\s(\S+)/ (it only
> > has to backtrack over the last word, instead of the whole string), so
> > that comes out faster again:
> >
> >              Rate \S+\s*$   split .*\s\S+
> >     \S+\s*$ 274/s      --    -66%    -66%
> >     split   794/s    190%      --     -2%
> >     .*\s\S+ 812/s    197%      2%      --
> >
> > Not much, though.
> 
> I tried this as well: The more words there are on such a line, the
> better the "Don't backtrack!" match performs.

Why does /(\S+)\s*$/ have to backtrack over "the whole string" whereas
/.*\s(\S+)/ does not?  I'm sure I don't understand regex backtracking...

John Black


------------------------------

Date: Wed, 30 Oct 2013 18:29:16 +0000
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Can this be combined into one statement?
Message-Id: <sov7ka-e3r2.ln1@anubis.morrow.me.uk>


Quoth John Black <jblack@nospam.com>:
> 
> Why does /(\S+)\s*$/ have to backtrack over "the whole string" whereas
> /.*\s(\S+)/ does not?
> I'm sure I don't understand regex backtracking...

Consider a string like "foo bar baz ". For /\S+\s*$/ perl tries the
following sequence of matches:

    \S+         \s*         $
    "foo"       " "         no match, backtrack
    "fo"        ""          no match, backtrack
    "f"         ""          no match, backtrack

Now perl has tried all the matches starting at the beginning of the
string, so it has to move along the string and try again. It skips over
characters matching \S, since it's already tried all possible end-points
for \S+ in this word, then it skips over characters not matching \S,
since they can't possibly match, and starts again with:

    "bar"       " "         no match, backtrack
    "ba"        ""          no match, backtrack
    "b"         ""          no match, backtrack

And again:

    "baz"       " "         match

With more words in the string, or longer words, this would take more
attempts. OTOH, with /.*\s\S+/ it tries these matches:

    .*                  \s                      \S+
    "foo bar baz "      no match, backtrack
    "foo bar baz"       " "                     no match, backtrack
    "foo bar ba"        no match, backtrack
    "foo bar b"         no match, backtrack
    "foo bar "          no match, backtrack
    "foo bar"           " "                     "baz"

which only ever has to backtrack over the last word. In the specific
case of a very long last word preceded by a small number of short words
it would come out slower than the first match, but otherwise it comes
out faster.

You can see what perl is doing by running something like

    perl -Mre=debug -e'"foo bar baz " =~ /.*\s\S+/'

though it takes a bit of practice to get used to interpreting the
output.
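
If the debug output is too noisy, a rough alternative is to count
attempts with an embedded code block. This is only a sketch under
assumptions: the sub names and the generated test lines are made up,
and adding (?{ ... }) to a pattern may change which optimisations perl
applies, so treat the counts as illustrative rather than as an exact
trace of the plain patterns.

```perl
use strict;
use warnings;

our $tries;    # package variable so the regex code blocks can see it

# /(\S+)\s*$/ with a counter after \S+: counts every candidate word
# (at every length) the engine gets as far as trying.
sub count_anchored {
    my ($s) = @_;
    local $tries = 0;
    my ($word) = $s =~ /(\S+)(?{ $tries++ })\s*$/;
    my $n = $tries;
    return ($word, $n);
}

# /.*\s(\S+)/ with a counter after \s: counts every whitespace
# character that .* backs off to.
sub count_greedy {
    my ($s) = @_;
    local $tries = 0;
    my ($word) = $s =~ /.*\s(?{ $tries++ })(\S+)/;
    my $n = $tries;
    return ($word, $n);
}

# Hypothetical inputs: 3, 10 and 30 short words plus a trailing space.
for my $words (3, 10, 30) {
    my $line = join(' ', map "w$_", 1 .. $words) . ' ';
    my ($w1, $n1) = count_anchored($line);
    my ($w2, $n2) = count_greedy($line);
    printf "%2d words: anchored %3d tries (%s), greedy %3d tries (%s)\n",
        $words, $n1, $w1, $n2, $w2;
}
```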

Ben



------------------------------

Date: Tue, 29 Oct 2013 22:10:31 +0000 (UTC)
From: Eli the Bearded <*@eli.users.panix.com>
Subject: Re: readdir
Message-Id: <eli$1310291807@qz.little-neck.ny.us>

In comp.lang.perl.misc, George Mpouras  <gravitalsun@hotmail.foo> wrote:
> Is there any way to have readdir return files by modification time?

No.

> I do not want to keep their dates in an array and sort it. I want one
> pass, like ls -ltr

How do you think ls solves this problem? I'll give you a hint: it stat()s
all the files and creates an array...
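
A minimal sketch of that approach in Perl, assuming plain files only
and oldest-first order as with ls -tr. The sub name files_by_mtime is
made up for illustration.

```perl
use strict;
use warnings;

# Return the plain files in $dir, oldest first (like `ls -tr`).
sub files_by_mtime {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @names = grep { -f "$dir/$_" } readdir $dh;
    closedir $dh;

    # Stat each file once, sort on the cached mtime (name breaks ties),
    # then throw the mtime away again.
    return map  { $_->[0] }
           sort { $a->[1] <=> $b->[1] or $a->[0] cmp $b->[0] }
           map  { [ $_, (stat "$dir/$_")[9] // 0 ] } @names;
}

print "$_\n" for files_by_mtime('.');
```

readdir itself still hands the names back in whatever order the
filesystem uses; the ordering only exists in the array built afterwards.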

Elijah
------
bets there is no guarantee that the filesystem even supports modification times


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests. 

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 4067
***************************************

