[30355] in Perl-Users-Digest

Perl-Users Digest, Issue: 1598 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat May 31 14:09:46 2008

Date: Sat, 31 May 2008 11:09:08 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sat, 31 May 2008     Volume: 11 Number: 1598

Today's topics:
    Re: FAQ 4.17 How do I find yesterday's date? <hjp-usenet2@hjp.at>
        new CPAN modules on Sat May 31 2008 (Randal Schwartz)
        OT: Thank you for the link! <bill@ts1000.us>
    Re: Speed comparison of regex versus index, lc, and / / <benkasminbullock@gmail.com>
    Re: Speed comparison of regex versus index, lc, and / / <someone@example.com>
    Re: Speed comparison of regex versus index, lc, and / / <benkasminbullock@gmail.com>
    Re: Speed comparison of regex versus index, lc, and / / <abigail@abigail.be>
    Re: Speed comparison of regex versus index, lc, and / / <simon.chao@gmail.com>
    Re: Speed comparison of regex versus index, lc, and / / <1usa@llenroc.ude.invalid>
    Re: Speed comparison of regex versus index, lc, and / / <ben@morrow.me.uk>
    Re: Speed comparison of regex versus index, lc, and / / <jurgenex@hotmail.com>
    Re: Speed comparison of regex versus index, lc, and / / <jurgenex@hotmail.com>
    Re: The Importance of Terminology's Quality <szrRE@szromanMO.comVE>
    Re: The Importance of Terminology's Quality <NpOeStPeAdM@nnowslpianmk.com>
    Re: The Importance of Terminology's Quality <szrRE@szromanMO.comVE>
    Re: The Importance of Terminology's Quality <jurgenex@hotmail.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Sat, 31 May 2008 11:28:36 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: FAQ 4.17 How do I find yesterday's date?
Message-Id: <slrng426i4.fp6.hjp-usenet2@hrunkner.hjp.at>

On 2008-05-30 18:39, Gunnar Hjalmarsson <noreply@gunnar.cc> wrote:
> Peter J. Holzer wrote:
>> On 2008-05-29 00:54, David Combs <dkcombs@panix.com> wrote:
>>> I'm still not sure, from what you guys have discovered,
>>> what the faq *should* say?  
>> 
>> The FAQ should say what it actually says (well, it could also mention
>> localtime/mktime - I'm not sure if the two modules it mentions are part
>> of the core).
>
> They are not part of the core.
>
> The core module Time::Local + localtime() are sufficient to answer the 
> FAQ question safely.
>
>      use Time::Local;
>      my $today = timelocal 0, 0, 12, ( localtime )[3..5];
>      my ($d, $m, $y) = ( localtime $today-86400 )[3..5];
>      printf "Yesterday: %d-%02d-%02d\n", $y+1900, $m+1, $d;


or - as I wrote - localtime and mktime (in POSIX). mktime lets you do
arithmetic on the day field, so you can directly write "the day before"
instead of "the day that was 86400 seconds before noon of the current
day" as you did above.


#!/usr/bin/perl
use warnings;
use strict;
use POSIX qw(mktime strftime);

before_and_after(time);
before_and_after(1206914460);
before_and_after(1225061940);

sub before_and_after {
    my ($now) = @_;
    print strftime("%Y-%m-%d %H:%M:%S%z\n", localtime($now));

    my @today = localtime($now);

    my @yesterday = @today; $yesterday[3]--; $yesterday[8] = -1;
    my $yesterday = mktime(@yesterday);

    print strftime("%Y-%m-%d %H:%M:%S%z\n", localtime($yesterday));

    my @tomorrow = @today; $tomorrow[3]++; $tomorrow[8] = -1;
    my $tomorrow = mktime(@tomorrow);

    print strftime("%Y-%m-%d %H:%M:%S%z\n", localtime($tomorrow));
    print "\n";
}
__END__


2008-05-31 11:24:30+0200
2008-05-30 11:24:30+0200
2008-06-01 11:24:30+0200

2008-03-31 00:01:00+0200
2008-03-30 00:01:00+0100
2008-04-01 00:01:00+0200

2008-10-26 23:59:00+0100
2008-10-25 23:59:00+0200
2008-10-27 23:59:00+0100
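
The DST case in the output above can be reproduced in isolation. The
following is a minimal sketch, not code from the thread: it pins the
timezone to a Central European POSIX TZ rule (an assumption, made so the
result does not depend on the machine's settings), and the helper names
yesterday_mktime and yesterday_naive are hypothetical. It contrasts
day-field arithmetic via mktime with plain subtraction of 86400 seconds
from a timestamp just after midnight.

```perl
#!/usr/bin/perl
use warnings;
use strict;
use POSIX qw(mktime strftime tzset);

# Assumption: Central European rules, given as a POSIX TZ string so that
# no timezone database is needed.
$ENV{TZ} = 'CET-1CEST,M3.5.0,M10.5.0/3';
tzset();

# "The day before", letting mktime normalize the decremented day field.
sub yesterday_mktime {
    my ($now) = @_;
    my @t = localtime($now);
    $t[3]--;       # day of month; mktime normalizes 0 to the previous month
    $t[8] = -1;    # let mktime work out whether DST applies
    return strftime("%Y-%m-%d", localtime(mktime(@t)));
}

# Naive variant: exactly 86400 seconds earlier.
sub yesterday_naive {
    my ($now) = @_;
    return strftime("%Y-%m-%d", localtime($now - 86400));
}

my $now = 1206914460;    # 2008-03-31 00:01:00 +0200, right after the DST switch
print "mktime: ", yesterday_mktime($now), "\n";
print "naive:  ", yesterday_naive($now), "\n";
```

Under the assumed timezone the naive version skips a day, because
2008-03-30 was only 23 hours long; anchoring the calculation at noon, as
in the Time::Local version quoted above, avoids that.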




------------------------------

Date: Sat, 31 May 2008 04:42:19 GMT
From: merlyn@stonehenge.com (Randal Schwartz)
Subject: new CPAN modules on Sat May 31 2008
Message-Id: <K1puEJ.rDp@zorch.sf-bay.org>

The following modules have recently been added to or updated in the
Comprehensive Perl Archive Network (CPAN).  You can install them using the
instructions in the 'perlmodinstall' page included with your Perl
distribution.

Acme-CPANAuthors-0.03
http://search.cpan.org/~ishigaki/Acme-CPANAuthors-0.03/
We are CPAN authors 
----
AnyEvent-4.11
http://search.cpan.org/~mlehmann/AnyEvent-4.11/
----
Archive-Lha-0.03_01
http://search.cpan.org/~ishigaki/Archive-Lha-0.03_01/
extract .LZH archives 
----
Audio-M4P-0.45
http://search.cpan.org/~billh/Audio-M4P-0.45/
Perl QuickTime / MP4 / iTunes Music Store audio / video file tools 
----
ClearPress-161
http://search.cpan.org/~rpettett/ClearPress-161/
Simple, fresh & fruity MVC framework 
----
Coro-4.741
http://search.cpan.org/~mlehmann/Coro-4.741/
coroutine process abstraction 
----
DB2-Admin-3.0
http://search.cpan.org/~hbiersma/DB2-Admin-3.0/
Support for DB2 Administrative API from perl 
----
ExtUtils-Manifest-1.52
http://search.cpan.org/~rkobes/ExtUtils-Manifest-1.52/
utilities to write and check a MANIFEST file 
----
ExtUtils-Manifest-1.53
http://search.cpan.org/~rkobes/ExtUtils-Manifest-1.53/
utilities to write and check a MANIFEST file 
----
Games-Sudoku-CPSearch-0.13
http://search.cpan.org/~martyloo/Games-Sudoku-CPSearch-0.13/
Solve Sudoku problems quickly. 
----
HTML-Truncate-0.14
http://search.cpan.org/~ashley/HTML-Truncate-0.14/
(beta software) truncate HTML by percentage or character count while preserving well-formedness. 
----
HTML-Truncate-0.15
http://search.cpan.org/~ashley/HTML-Truncate-0.15/
(beta software) truncate HTML by percentage or character count while preserving well-formedness. 
----
HTML-Truncate-0.16
http://search.cpan.org/~ashley/HTML-Truncate-0.16/
(beta software) truncate HTML by percentage or character count while preserving well-formedness. 
----
IO-Lambda-0.19
http://search.cpan.org/~karasik/IO-Lambda-0.19/
non-blocking I/O in lambda style 
----
Language-Prolog-Yaswi-0.15
http://search.cpan.org/~salva/Language-Prolog-Yaswi-0.15/
Yet another interface to SWI-Prolog 
----
Lingua-StarDict-Gen-0.02_3
http://search.cpan.org/~jjoao/Lingua-StarDict-Gen-0.02_3/
----
Lingua-Stardict-Gen-0.02_2
http://search.cpan.org/~jjoao/Lingua-Stardict-Gen-0.02_2/
Stardict dictionary generator 
----
MooseX-Plaggerize-0.02
http://search.cpan.org/~tokuhirom/MooseX-Plaggerize-0.02/
plagger like plugin feature for Moose 
----
Net-Abuse-Utils-0.06
http://search.cpan.org/~mikegrb/Net-Abuse-Utils-0.06/
Routines useful for processing network abuse 
----
Net-Abuse-Utils-0.07
http://search.cpan.org/~mikegrb/Net-Abuse-Utils-0.07/
Routines useful for processing network abuse 
----
ORLite-Mirror-0.06
http://search.cpan.org/~adamk/ORLite-Mirror-0.06/
Extend ORLite to support remote SQLite databases 
----
POE-Component-CPAN-YACSmoke-1.28
http://search.cpan.org/~bingos/POE-Component-CPAN-YACSmoke-1.28/
Bringing the power of POE to CPAN smoke testing. 
----
POE-Component-IRC-5.78
http://search.cpan.org/~bingos/POE-Component-IRC-5.78/
a fully event-driven IRC client module. 
----
Pg-Pcurse-0.18
http://search.cpan.org/~ioannis/Pg-Pcurse-0.18/
Monitors a Postgres cluster 
----
Pod-HtmlEasy-1.0102
http://search.cpan.org/~gleach/Pod-HtmlEasy-1.0102/
Generate personalized HTML from PODs. 
----
Proc-SafeExec-1.4
http://search.cpan.org/~bilbo/Proc-SafeExec-1.4/
Convenient utility for executing external commands in various ways. 
----
SMS-Send-US-Ipipi-0.01
http://search.cpan.org/~amoore/SMS-Send-US-Ipipi-0.01/
An SMS::Send driver for the ipipi.com website 
----
Test-Deep-0.102
http://search.cpan.org/~fdaly/Test-Deep-0.102/
Extremely flexible deep comparison 
----
Time-Duration-Parse-0.06
http://search.cpan.org/~miyagawa/Time-Duration-Parse-0.06/
Parse string that represents time duration 
----
URI-Escape-XS-0.02
http://search.cpan.org/~dankogai/URI-Escape-XS-0.02/
Drop-In replacement for URI::Escape 
----
URI-FromHash-0.03
http://search.cpan.org/~drolsky/URI-FromHash-0.03/
Build a URI from a set of named parameters 
----
iCal-Parser-1.15
http://search.cpan.org/~rfrankel/iCal-Parser-1.15/
Parse iCalendar files into a data structure 
----
iCal-Parser-1.16
http://search.cpan.org/~rfrankel/iCal-Parser-1.16/
Parse iCalendar files into a data structure 
----
pQuery-0.07
http://search.cpan.org/~ingy/pQuery-0.07/
Perl Port of jQuery.js 


If you're an author of one of these modules, please submit a detailed
announcement to comp.lang.perl.announce, and we'll pass it along.

This message was generated by a Perl program described in my Linux
Magazine column, which can be found on-line (along with more than
200 other freely available past column articles) at
  http://www.stonehenge.com/merlyn/LinuxMag/col82.html

print "Just another Perl hacker," # the original

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion


------------------------------

Date: Sat, 31 May 2008 09:58:42 -0700 (PDT)
From: Bill H <bill@ts1000.us>
Subject: OT: Thank you for the link!
Message-Id: <14dbdd1e-8ddc-4ad2-ad60-1a9eb1dc85bd@y21g2000hsf.googlegroups.com>

Not sure who posted the link www.thedailywtf.com in one of their
messages, but thank you! A hilarious site that I have already killed
about 16 hours reading!

Bill H


------------------------------

Date: Sat, 31 May 2008 05:21:07 +0000 (UTC)
From: Ben Bullock <benkasminbullock@gmail.com>
Subject: Re: Speed comparison of regex versus index, lc, and / /i
Message-Id: <g1qn82$3i2$1@ml.accsnet.ne.jp>

On Sat, 31 May 2008 01:49:14 +0000, A. Sinan Unur wrote:

> There is one change I would make to both routines. Cache length $ss
> before the loop. On my machine, that cut down the time by about 40%.

In the real situation almost every search for the search string ($ss)
in the text ($text) is going to fail, so I think it would make much
less difference than in the fake situation of repeating with the same
string over and over again.
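
For reference, the change A. Sinan Unur describes (computing the length
of $ss once, before the loop) might look like the sketch below. The name
index_find_cached and the argument passing are assumptions, not code
from the thread.

```perl
use warnings;
use strict;

# Hypothetical variant of an index-based search with the substring
# length computed once, before the loop.
sub index_find_cached {
    my ($text, $ss) = @_;
    my $len = length $ss;     # cached: not recomputed on every match
    return [] unless $len;    # an empty search string would never advance
    my @finds;
    my $found = 0;
    while (($found = index($text, $ss, $found)) != -1) {
        push @finds, $found;
        $found += $len;
    }
    return \@finds;
}

print join(", ", @{ index_find_cached("abcabc", "abc") }), "\n";
```

When most searches fail, the loop body hardly ever runs, so the saving
from caching the length shrinks accordingly.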

I modified the program to use /o that A. Sinan Unur added, and also
added a routine to test Ben Morrow's idea and reran it using the
Benchmark module:

#!/usr/local/bin/perl
use warnings;
use strict;

my $text = <<EOF;
xhoster is the coolest perl programmer ever. xhoster is the
greatest. xhoster is the champion. xhoster is a babe magnet.
EOF
my $ss = "xhoster";

sub index_find
{
    my @finds;
    my $found = 0;
    while (($found = index ($text, $ss, $found)) != -1) {
	push @finds, $found;
	$found += length ($ss);
    }
    return \@finds;
}

sub regex_find
{
    my @finds;
    while ($text =~ /\Q$ss\E/g) {
	push @finds, pos ($text) - length($ss);
    }
    return \@finds;
}

sub regex_unur_find
{
    my @finds;
    while ($text =~ /\Q$ss/go) {
	push @finds, pos ($text) - length($ss);
    }
    return \@finds;
}

my $re = qr{\Q$ss};

sub regex_morrow_find
{
    my @finds;
    while ($text =~ /$re/g) {
	push @finds, pos ($text) - length($ss);
    }
    return \@finds;
}

# check it all works

for (\&index_find, \&regex_find, \&regex_unur_find, \&regex_morrow_find) {
    print "String found at ", (join ", ",@{&{$_}($text, $ss)}),"\n";
}

use Benchmark qw( cmpthese );

my $count = 100000;

cmpthese $count, {
    'index' => sub {index_find($text,$ss)},
    'regex' => sub {regex_find($text,$ss)},
    'regex_unur' => sub {regex_unur_find($text,$ss)},
    'regex_morrow' => sub {regex_morrow_find($text,$ss)},
};

Here are the results of five runs:

                 Rate regex_morrow        regex   regex_unur        index
regex_morrow 116279/s           --         -31%         -40%         -41%
regex        169492/s          46%           --         -12%         -14%
regex_unur   192308/s          65%          13%           --          -2%
index        196078/s          69%          16%           2%           --

                 Rate regex_morrow        regex   regex_unur        index
regex_morrow 119048/s           --         -30%         -40%         -40%
regex        169492/s          42%           --         -15%         -15%
regex_unur   200000/s          68%          18%           --          -0%
index        200000/s          68%          18%           0%           --

                 Rate regex_morrow        regex        index   regex_unur
regex_morrow 120482/s           --         -29%         -40%         -40%
regex        169492/s          41%           --         -15%         -15%
index        200000/s          66%          18%           --           0%
regex_unur   200000/s          66%          18%           0%           --

                 Rate regex_morrow        regex   regex_unur        index
regex_morrow 119048/s           --         -31%         -42%         -42%
regex        172414/s          45%           --         -16%         -16%
regex_unur   204082/s          71%          18%           --          -0%
index        204082/s          71%          18%           0%           --

                 Rate regex_morrow        regex        index   regex_unur
regex_morrow 114943/s           --         -33%         -40%         -44%
regex        172414/s          50%           --         -10%         -16%
index        192308/s          67%          12%           --          -6%
regex_unur   204082/s          78%          18%           6%           --


This supports my original statement that the regex version is almost
exactly the same speed as "index". But I was surprised that Ben Morrow's
version is the slowest of all. Does anyone see anything wrong with my
methodology?
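
One variation on the methodology, given the earlier remark that real
searches almost always fail, is to benchmark against a text that does
not contain the search string at all. The sketch below is only an
illustration: the sample text and the sub names index_miss, regex_miss
and qr_miss are made up for it, and no timings are claimed.

```perl
use warnings;
use strict;
use Benchmark qw( cmpthese );

# Assumption: a text that does NOT contain the search string, to mimic
# the "almost every search fails" case.
my $text = "the quick brown fox jumps over the lazy dog. " x 4;
my $ss   = "xhoster";
my $re   = qr{\Q$ss};

sub index_miss { return index($text, $ss) != -1 ? 1 : 0 }
sub regex_miss { return $text =~ /\Q$ss\E/ ? 1 : 0 }
sub qr_miss    { return $text =~ $re ? 1 : 0 }

# A negative count asks Benchmark to run each sub for at least that
# many CPU seconds instead of a fixed number of iterations.
cmpthese(-1, {
    'index' => \&index_miss,
    'regex' => \&regex_miss,
    'qr'    => \&qr_miss,
});
```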



------------------------------

Date: Sat, 31 May 2008 07:19:24 GMT
From: "John W. Krahn" <someone@example.com>
Subject: Re: Speed comparison of regex versus index, lc, and / /i
Message-Id: <0_60k.140$Gn.101@edtnps92>

Ben Bullock wrote:
> On Sat, 31 May 2008 01:49:14 +0000, A. Sinan Unur wrote:
> 
>> There is one change I would make to both routines. Cache length $ss
>> before the loop. On my machine, that cut down the time by about 40%.
> 
> In the real situation almost every search for the search string ($ss)
> in the text ($text) is going to fail, so I think it would make much
> less difference than in the fake situation of repeating with the same
> string over and over again.
> 
> I modified the program to use /o that A. Sinan Unur added, and also
> added a routine to test Ben Morrow's idea and reran it using the
> Benchmark module:
> 
> #!/usr/local/bin/perl
> use warnings;
> use strict;
> 
> my $text = <<EOF;
> xhoster is the coolest perl programmer ever. xhoster is the
> greatest. xhoster is the champion. xhoster is a babe magnet.
> EOF
> my $ss = "xhoster";
> 
> sub index_find
> {
>     my @finds;
>     my $found = 0;
>     while (($found = index ($text, $ss, $found)) != -1) {
>         push @finds, $found;
>         $found += length ($ss);
>     }
>     return \@finds;
> }

On my machine and version of Perl I get a speed improvement by using a
C-style for loop instead:

sub index_find
{
     my @finds;
     for ( my $found = 0; ( $found = index( $text, $ss, $found ) ) != -1; $found += length $ss ) {
         push @finds, $found;
     }
     return \@finds;
}

> sub regex_find
> {
>     my @finds;
>     while ($text =~ /\Q$ss\E/g) {
> 	push @finds, pos ($text) - length($ss);
>     }
>     return \@finds;
> }
> 
> sub regex_unur_find
> {
>     my @finds;
>     while ($text =~ /\Q$ss/go) {
> 	push @finds, pos ($text) - length($ss);
>     }
>     return \@finds;
> }
> 
> my $re = qr{\Q$ss};
> 
> sub regex_morrow_find
> {
>     my @finds;
>     while ($text =~ /$re/g) {
> 	push @finds, pos ($text) - length($ss);
>     }
>     return \@finds;
> }
> 
> # check it all works
> 
> for (\&index_find, \&regex_find, \&regex_unur_find, \&regex_morrow_find) {
>     print "String found at ", (join ", ",@{&{$_}($text, $ss)}),"\n";
> }
> 
> use Benchmark qw( cmpthese );
> 
> my $count = 100000;
> 
> cmpthese $count, {
>     'index' => sub {index_find($text,$ss)},
>     'regex' => sub {regex_find($text,$ss)},
>     'regex_unur' => sub {regex_unur_find($text,$ss)},
>     'regex_morrow' => sub {regex_morrow_find($text,$ss)},
> };
> 
> Here are the results of five runs:
> 
>                  Rate regex_morrow        regex   regex_unur        index
> regex_morrow 116279/s           --         -31%         -40%         -41%
> regex        169492/s          46%           --         -12%         -14%
> regex_unur   192308/s          65%          13%           --          -2%
> index        196078/s          69%          16%           2%           --

On my machine and version of Perl regex_unur_find() and 
regex_morrow_find() are statistically the same:

                  Rate       regex regex_unur regex_morrow index_while index_for
regex        144442/s          --        -5%          -6%        -16%      -17%
regex_unur   151952/s          5%         --          -1%        -11%      -13%
regex_morrow 154059/s          7%         1%           --        -10%      -12%
index_while  171325/s         19%        13%          11%          --       -2%
index_for    174905/s         21%        15%          14%          2%        --



John
-- 
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order.                            -- Larry Wall


------------------------------

Date: Sat, 31 May 2008 11:13:39 +0000 (UTC)
From: Ben Bullock <benkasminbullock@gmail.com>
Subject: Re: Speed comparison of regex versus index, lc, and / /i
Message-Id: <g1rbt3$94p$1@ml.accsnet.ne.jp>

On Sat, 31 May 2008 07:19:24 +0000, John W. Krahn wrote:

> On my machine and version of Perl I get a speed improvement by using a
> C-style for loop instead:

I think this is just an artifact of using a string which matches
several times.

> On my machine and version of Perl regex_unur_find() and
> regex_morrow_find() are statistically the same:
> 
>                   Rate       regex regex_unur regex_morrow index_while index_for
> regex        144442/s          --        -5%          -6%        -16%      -17%
> regex_unur   151952/s          5%         --          -1%        -11%      -13%
> regex_morrow 154059/s          7%         1%           --        -10%      -12%
> index_while  171325/s         19%        13%          11%          --       -2%
> index_for    174905/s         21%        15%          14%          2%        --

I repeated the test with Perl 5.8 and got similar results to you.

It seems like the Ben Morrow version has been de-optimized, and the
A. Sinan Unur version optimized, in Perl 5.10.

Perl 5.10:

regex_morrow 138889/s           --         -32%         -40%         -40%
regex        204082/s          47%           --         -12%         -12%
regex_unur   232558/s          67%          14%           --          -0%
index        232558/s          67%          14%           0%           --

Perl 5.8:

regex        178571/s           --          -7%          -7%         -21%
regex_unur   192308/s           8%           --          -0%         -15%
regex_morrow 192308/s           8%           0%           --         -15%
index        227273/s          27%          18%          18%           --

(repeating the test several times gave both faster and slower times
for unur and morrow versions, with both being roughly the same.)


------------------------------

Date: 31 May 2008 11:59:21 GMT
From: Abigail <abigail@abigail.be>
Subject: Re: Speed comparison of regex versus index, lc, and / /i
Message-Id: <slrng42fco.6ns.abigail@alexandra.abigail.be>

                                       _
Ben Morrow (ben@morrow.me.uk) wrote on VCCCLXXXVII September MCMXCIII in
<URL:news:mas6h5-nt11.ln1@osiris.mauzo.dyndns.org>:
""  
""  Quoth "A. Sinan Unur" <1usa@llenroc.ude.invalid>:
"" > 
"" > Changing this to
"" > 
"" >     	while ( $text =~ /\Q$ss\E/og ) {
"" > 
"" > makes regex_find faster by about 1%.
""  
""      my $rx = qr/\Q$ss/;
""  
""      ...
""  
""      while ($text =~ /$rx/g) {
""  
""  is both clearer and safer. If the program is modified so that $ss is
""  actually variable (say, the whole thing is made into a sub) then /o
""  would cause it to fail in ways that are rather hard to diagnose.
""  
""  Note that m// will only use the precompiled form of the qr// if the
""  $rx is the only thing in the match. Something like /^$rx/ or
""  /$rx1|$rx2/ or even / $rx/x will cause the regex to be recompiled
""  every time all over again.


That hasn't been the case for over a decade or so:

     perl -Mre=debug -wE '$re = qr /foo/;
                           for (qw [bar baz]) {/ $re/}' 2>&1| grep '^Compiling' 
     Compiling REx "foo"
     Compiling REx " (?-xism:foo)"


It only compiles twice, once for the qr //, and once for the m //.
It used to be that way back (before we had qr//), Perl would recompile
a regexp like / $re/, in which case /o was useful.

Nowadays the slight improvement of using /o (there's a slight improvement
in the sense that when /o is used, Perl doesn't have to check whether
the variables interpolated have changed), IMO, doesn't weigh up against
the risk of introducing hard to find errors.

Out of obfuscated code, I would never use /o.


Abigail
-- 
use   lib sub {($\) = split /\./ => pop; print $"};
eval "use Just" || eval "use another" || eval "use Perl" || eval "use Hacker";


------------------------------

Date: Sat, 31 May 2008 06:38:59 -0700 (PDT)
From: nolo contendere <simon.chao@gmail.com>
Subject: Re: Speed comparison of regex versus index, lc, and / /i
Message-Id: <d82b55f8-170d-4881-b21c-cf18f243345e@a70g2000hsh.googlegroups.com>

On May 31, 7:59 am, Abigail <abig...@abigail.be> wrote:
>                                        _
> Out of obfuscated code, I would never use /o.
>
> Abigail
> --
> use   lib sub {($\) = split /\./ => pop; print $"};
> eval "use Just" || eval "use another" || eval "use Perl" || eval "use Hacker";

you should use a bunch in your japhs then :-).


------------------------------

Date: Sat, 31 May 2008 14:44:04 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude.invalid>
Subject: Re: Speed comparison of regex versus index, lc, and / /i
Message-Id: <Xns9AAF6D323253Easu1cornelledu@127.0.0.1>

Ben Bullock <benkasminbullock@gmail.com> wrote in
news:g1qn82$3i2$1@ml.accsnet.ne.jp: 

> On Sat, 31 May 2008 01:49:14 +0000, A. Sinan Unur wrote:
> 
>> There is one change I would make to both routines. Cache length $ss
>> before the loop. On my machine, that cut down the time by about 40%.
> 
> In the real situation almost every search for the search string ($ss)
> in the text ($text) is going to fail, so I think it would make much
> less difference than in the fake situation of repeating with the same
> string over and over again.

True.

> I modified the program to use /o that A. Sinan Unur added, and also
> added a routine to test Ben Morrow's idea and reran it using the
> Benchmark module:

Thank you for collecting all these ideas and running and reporting the 
test results.

Let me clarify one point: I wasn't advocating the use of /o. I wanted to 
point out that the 4% speed advantage of index which I had observed 
disappeared with the use of /o. Of course, correct usage is more 
important than speed and as Abigail noted, the use of /o could create 
nasty behavior in this case.
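
The kind of nasty behavior meant here can be shown with a short sketch.
It assumes the search has been moved into a sub that takes the search
string as an argument; the names count_with_o and count_with_qr are
hypothetical and do not come from the thread.

```perl
use warnings;
use strict;

# Hypothetical helpers: count occurrences of $ss in $text.
sub count_with_o {
    my ($text, $ss) = @_;
    # /o: the pattern is interpolated and compiled on first use only,
    # so later calls keep matching the first $ss they were given.
    my $n = () = $text =~ /\Q$ss\E/go;
    return $n;
}

sub count_with_qr {
    my ($text, $ss) = @_;
    my $rx = qr/\Q$ss/;    # built afresh from $ss on every call
    my $n = () = $text =~ /$rx/g;
    return $n;
}

for my $pair ([ "aaa", "a" ], [ "bbb", "b" ]) {
    my ($text, $ss) = @$pair;
    printf "%s in %s: /o=%d qr=%d\n", $ss, $text,
        count_with_o($text, $ss), count_with_qr($text, $ss);
}
```

On the second call the /o version still searches for "a" and reports no
matches in "bbb", with no warning of any kind, while the qr// version
follows the new search string.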

I am not going to change my practice of searching for or checking the 
existence of literal strings in text using index because that makes 
intuitive sense to me.

Sinan

-- 
A. Sinan Unur <1usa@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)

comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/


------------------------------

Date: Sat, 31 May 2008 16:14:13 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Speed comparison of regex versus index, lc, and / /i
Message-Id: <5j68h5-eq3.ln1@osiris.mauzo.dyndns.org>


Quoth abigail@abigail.be:
> Ben Morrow (ben@morrow.me.uk) wrote on VCCCLXXXVII September MCMXCIII in
> <URL:news:mas6h5-nt11.ln1@osiris.mauzo.dyndns.org>:
> ""  
> ""  Note that m// will only use the precompiled form of the qr// if the
> ""  $rx is the only thing in the match. Something like /^$rx/ or
> ""  /$rx1|$rx2/ or even / $rx/x will cause the regex to be recompiled
> ""  every time all over again.
> 
> That hasn't been the case for over a decade or so:
> 
>      perl -Mre=debug -wE '$re = qr /foo/;
>                      for (qw [bar baz]) {/ $re/}' 2>&1| grep '^Compiling' 
>      Compiling REx "foo"
>      Compiling REx " (?-xism:foo)"
> 
> It only compiles twice, once for the qr //, and once for the m //.

You know, I did actually test that :).

    ~% perl -Mre=debug -e'$qr=qr/a/; /$qr/' 2>&1 | grep Comp
    Compiling REx `a'
    ~% perl -Mre=debug -e'$qr=qr/a/; / $qr/x' 2>&1 | grep Comp
    Compiling REx `a'
    Compiling REx ` (?-xism:a)'

If it's actually using the compiled form of the qr//, it doesn't need to
compile for the m// at all. The fact your example only compiles the m//
once is the 'if a variable hasn't changed, don't recompile'
optimization, which applies regardless of qr//:

    ~% perl -Mre=debug -e'$qr=qr/a/; / $qr/ for 1, 2' 2>&1 | grep Comp
    Compiling REx `a'
    Compiling REx ` (?-xism:a)'
    ~% perl -Mre=debug -e'$qr=q/a/; / $qr/ for 1, 2' 2>&1 | grep Comp 
    Compiling REx ` a'

> Out of obfuscated code, I would never use /o.

Me either.

Ben

-- 
   If you put all the prophets,   |   You'd have so much more reason
   Mystics and saints             |   Than ever was born
   In one room together,          |   Out of all of the conflicts of time.
ben@morrow.me.uk                                    The Levellers, 'Believers'


------------------------------

Date: Sat, 31 May 2008 16:01:30 GMT
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: Speed comparison of regex versus index, lc, and / /i
Message-Id: <gbs244h79doa07dg6a8q9o9attfad6nh23@4ax.com>

"A. Sinan Unur" <1usa@llenroc.ude.invalid> wrote:
>I am not going to change my practice of searching for or checking the 
>existence of literal strings in text using index because that makes 
>intuitive sense to me.

I second that. While REs are certainly very powerful and can do many
things, sometimes a simpler tool is all you need. 

In a way it is similar to using double versus single quotes. If the
programmer used REs (or double quotes) then I expect that there was a
need for them, and I start looking for special RE constructs (or items
that interpolate). When I don't find any, the question becomes
"oops, what did I miss?" In short: don't use the big gun unless there
is a need for it.

An extreme example of this overkill could be seen in what I believe may
have triggered Ben's curiosity and his commendable effort to gather some
hard data. Although the OP never confirmed his actual intentions, he
anchored his RE (which he apparently wanted to be a literal match) to
the beginning and end of the string, giving a very strong indication
that in reality he was looking for a trivial 'eq'.

Sure, instead of 
	$foo eq 'bar'
you can use
	$foo =~ m/^\Q$foo\E$/
But why would you want to? 

And IMO the same applies to pattern matching when a simpler index() does
the job just as well.
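
The three tools can be put side by side in a small sketch (illustrative
only; the sample strings are made up). It also shows one way the
anchored regex is not quite an 'eq': $ matches before a trailing
newline as well.

```perl
use warnings;
use strict;

my $foo = "bar";

# Exact comparison: the simplest tool for the job.
print "eq matches\n" if $foo eq 'bar';

# Anchored literal regex: nearly the same test, but $ also matches
# before a trailing newline, so "bar\n" passes here and fails eq.
print "regex matches\n"        if $foo    =~ m/^\Qbar\E$/;
print "regex matches bar\\n\n" if "bar\n" =~ m/^\Qbar\E$/;
print "eq rejects bar\\n\n"    if "bar\n" ne 'bar';

# Literal substring test without a regex.
print "index finds it\n" if index("foobarbaz", 'bar') != -1;
```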

jue


------------------------------

Date: Sat, 31 May 2008 16:06:21 GMT
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: Speed comparison of regex versus index, lc, and / /i
Message-Id: <jmt244deir5t29ca30mi8u2v565apnvr5n@4ax.com>

Jürgen Exner <jurgenex@hotmail.com> wrote:

>Sure, instead of 
>	$foo eq 'bar'
>you can use
>	$foo =~ m/^\Q$foo\E$/

Oooops, make that 
	$foo =~ m/^\Qbar\E$/
of course.

jue


------------------------------

Date: Fri, 30 May 2008 22:40:03 -0700
From: "szr" <szrRE@szromanMO.comVE>
Subject: Re: The Importance of Terminology's Quality
Message-Id: <g1qobk02jbj@news4.newsguy.com>

Arne Vajhøj wrote:
> Stephan Bour wrote:
>> Lew wrote:
>> } John Thingstad wrote:
>> } > Perl is solidly based in the UNIX world on awk, sed, bash and C.
>> } > I don't like the style, but many do.
>> }
>> } Please exclude the Java newsgroups from this discussion.
>>
>> Did it ever occur to you that you don't speak for entire news groups?
>
> Did it occur to you that there is nothing about Java in the above?

Looking at the original post, it doesn't appear to be about any specific 
language.

-- 
szr 




------------------------------

Date: Sat, 31 May 2008 00:36:27 -0700
From: "Peter Duniho" <NpOeStPeAdM@nnowslpianmk.com>
Subject: Re: The Importance of Terminology's Quality
Message-Id: <op.ub0cq1fy8jd0ej@petes-computer.local>

On Fri, 30 May 2008 22:40:03 -0700, szr <szrRE@szromanMO.comVE> wrote:

> Arne Vajhøj wrote:
>> Stephan Bour wrote:
>>> Lew wrote:
>>> } John Thingstad wrote:
>>> } > Perl is solidly based in the UNIX world on awk, sed, bash and C.
>>> } > I don't like the style, but many do.
>>> }
>>> } Please exclude the Java newsgroups from this discussion.
>>>
>>> Did it ever occur to you that you don't speak for entire news groups?
>>
>> Did it occur to you that there is nothing about Java in the above?
>
> Looking at the original post, it doesn't appear to be about any specific
> language.

Indeed.  That suggests it's probably off-topic in most, if not all, of the  
newsgroups to which it was posted, inasmuch as they exist for topics  
specific to a given programming language.

Regardless, unless you are actually reading this thread from the c.l.j.p  
newsgroup, I'm not sure I see the point in questioning someone who _is_  
about whether the thread belongs there or not.  Someone who is actually  
following the thread from c.l.j.p can speak up if they feel that Lew is  
overstepping his bounds.  Anyone else has even less justification for  
"speaking for the entire newsgroup" than Lew does, and yet that's what  
you're doing when you question his request.

And if it's a vote you want, mark me down as the third person reading  
c.l.j.p that doesn't feel this thread belongs.  I don't know whether Lew  
speaks for the entire newsgroup, but based on comments so far, it's pretty  
clear that there is unanimous agreement among those who have expressed an  
opinion.

If you all in the other newsgroups are happy having the thread there,  
that's great.  Please feel free to continue with your discussion.  But  
please, drop comp.lang.java.programmer from the cross-posting.

Pete


------------------------------

Date: Sat, 31 May 2008 09:27:11 -0700
From: "szr" <szrRE@szromanMO.comVE>
Subject: Re: The Importance of Terminology's Quality
Message-Id: <g1ru8v0v3l@news4.newsguy.com>

Peter Duniho wrote:
> On Fri, 30 May 2008 22:40:03 -0700, szr <szrRE@szromanMO.comVE> wrote:
>
>> Arne Vajhøj wrote:
>>> Stephan Bour wrote:
>>>> Lew wrote:
>>>> } John Thingstad wrote:
>>>> } > Perl is solidly based in the UNIX world on awk, sed, bash and C.
>>>> } > I don't like the style, but many do.
>>>> }
>>>> } Please exclude the Java newsgroups from this discussion.
>>>>
>>>> Did it ever occur to you that you don't speak for entire news
>>>> groups?
>>>
>>> Did it occur to you that there is nothing about Java in the above?
>>
>> Looking at the original post, it doesn't appear to be about any
>> specific language.
>
> Indeed.  That suggests it's probably off-topic in most, if not all,
> of the newsgroups to which it was posted, inasmuch as they exist for
> topics specific to a given programming language.

Perhaps - comp.programming might have been a better place, but not all 
people who follow groups for specific languages follow a general group 
like that - but let me ask you something. What is it you really have 
against discussing topics with people of neighboring groups? Keep in 
mind you don't have to read anything you do not want to read. [1]

> Regardless, unless you are actually reading this thread from the
> c.l.j.p newsgroup, I'm not sure I see the point in questioning
> someone who _is_ about whether the thread belongs there or not.

I would rather have the OP comment about that, as he started the thread. 
But what gets me is why you are against that specific group being 
included but not others? What is so special about the Java group and why 
are you so sure people there don't want to read this thread? [1] What 
right do you or I or anyone have to make decisions for everyone in a 
news group? Isn't this why most news readers allow one to block a 
thread?

> And if it's a vote you want, mark me down as the third person reading
> c.l.j.p that doesn't feel this thread belongs.  I don't know whether
> Lew speaks for the entire newsgroup, but based on comments so far,
> it's pretty clear that there is unanimous agreement among those who have
> expressed an opinion.

Ok, so, perhaps 3 people out of what might be several hundred, if not 
thousands (there is no way to really know, but there are certainly a lot 
of people who read that group, and as with any group, there are far more 
readers than there are people posting). So, again, just because you and 
two or so other people don't want to read a topic or dislike it, you 
feel you can decide for EVERYONE that they mustn't read it? Again, this is 
why readers allow you to ignore threads. Please don't force your views 
on others; let them decide for themselves. [1]


[1] I do not mean this topic specifically,  but in general,  if one
    dislikes a thread, they are free to ignore it. I find it rather
    inconsiderate to attempt to force a decision for everyone, when
    one has the ability to simply ignore the thread entirely.


-- 
szr 




------------------------------

Date: Sat, 31 May 2008 16:48:11 GMT
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: The Importance of Terminology's Quality
Message-Id: <kjv2441a9im6bcistqcav6c87j4024s2pj@4ax.com>

"szr" <szrRE@szromanMO.comVE> wrote:
>I would rather have the OP comment about that, as he started the thread. 

The OP is a very well-known troll who has the habit of spitting out a
borderline OT article to a bunch of loosely related NGs every so often and
then sits back and enjoys the complaints and counter-complaints of the
regulars. He doesn't provide anything useful in any of the groups he
targets (at least AFAIK) and he doesn't participate in the resulting
mayhem himself, either. 
He will only go away if everyone just ignores him.

With this in mind another reminder:

         +-------------------+             .:\:\:/:/:.
         |   PLEASE DO NOT   |            :.:\:\:/:/:.:
         |  FEED THE TROLLS  |           :=.' -   - '.=:
         |                   |           '=(\ 9   9 /)='
         |   Thank you,      |              (  (_)  )
         |       Management  |              /`-vvv-'\
         +-------------------+             /         \
                 |  |        @@@          / /|,,,,,|\ \
                 |  |        @@@         /_//  /^\  \\_\
   @x@@x@        |  |         |/         WW(  (   )  )WW
   \||||/        |  |        \|           __\,,\ /,,/__
    \||/         |  |         |      jgs (______Y______)
/\/\/\/\/\/\/\/\//\/\\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
==============================================================

Follow-up adjusted.

jue


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 1598
***************************************

