[25554] in Perl-Users-Digest
Perl-Users Digest, Issue: 7798 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Feb 18 11:06:07 2005
Date: Fri, 18 Feb 2005 08:05:39 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Fri, 18 Feb 2005 Volume: 10 Number: 7798
Today's topics:
Re: [perl-python] exercise: partition a list by equival <antonmuhin@rambler.ru>
cperl-mode and emacs-21.4 brocken? <mike53@moocow.tu-bs.de>
How Can I Find/Remove a Null from a String? msargent100@hotmail.com
Re: How Can I Find/Remove a Null from a String? <1usa@llenroc.ude.invalid>
Re: How Can I Find/Remove a Null from a String? <jurgenex@hotmail.com>
Re: How Can I Find/Remove a Null from a String? <anomousty@webwonders.org>
Re: How Can I Find/Remove a Null from a String? <1usa@llenroc.ude.invalid>
How can I get the exit status? laredotornado@zipmail.com
Re: ithreads + signals on modern Unices brianr@liffe.com
Re: Low level data manipulation in Perl nospam@geniegate.com
Re: Modify keys in a %hash using tr/// or s/// <bart.lateur@pandora.be>
Need help with an advanced? regular expression <nospam@nospam.net>
Re: Need help with an advanced? regular expression <bernard.el-haginDODGE_THIS@lido-tech.net>
Re: Newbie Perl programming help (RSS & IRC) <cmw@tulpje.co.uk>
Re: Newbie Perl programming help (RSS & IRC) <spamtrap@dot-app.org>
Re: Perl script timeout problem <jurgenex@hotmail.com>
Re: Perl script timeout problem <nobull@mail.com>
problem with system(@args) <npritchard@mail.com>
Regex combining /(foo|bar)/ slower than using foreach ( jolly@tavern.de
Re: Regex combining /(foo|bar)/ slower than using forea jolly@tavern.de
Re: Regex combining /(foo|bar)/ slower than using forea <jolly@tavern.de>
Re: Regex combining /(foo|bar)/ slower than using forea xhoster@gmail.com
Re: Regex combining /(foo|bar)/ slower than using forea <noreply@gunnar.cc>
Re: Regex combining /(foo|bar)/ slower than using forea <noreply@gunnar.cc>
Re: Regex combining /(foo|bar)/ slower than using forea <nobull@mail.com>
Re: regexp: read ip address <lawshouse.public@btconnect.com>
Re: SET Operations in Perl <jurgenex@hotmail.com>
Re: simple encryption/decryption nospam@geniegate.com
simple map query <john1976@hotmail.com>
Re: simple map query <phaylon@dunkelheit.at>
Re: use strict; and O_WRONLY <spamtrap@dot-app.org>
Re: Why aren't 'warnings' on by default? nospam@geniegate.com
Re: Why aren't 'warnings' on by default? (Anno Siegel)
Re: Why aren't 'warnings' on by default? xhoster@gmail.com
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Fri, 18 Feb 2005 17:51:05 +0300
From: anton muhin <antonmuhin@rambler.ru>
Subject: Re: [perl-python] exercise: partition a list by equivalence
Message-Id: <37mdmqF5fcvdpU1@individual.net>
Xah Lee wrote:
> here's another interesting algorithmic exercise, again from part of a
> larger program in the previous series.
>
> Here's the original Perl documentation:
>
> =pod
>
> merge($pairings) takes a list of pairs, each pair indicates the
> sameness
> of the two indexes. Returns a partitioned list of same indexes.
>
> For example, if the input is
> merge( [ [1,2], [2,4], [5,6] ] );
>
> that means 1 and 2 are the same. 2 and 4 are the same. Therefore
> 1==2==4. The result returned is
>
> [[4,2,1],[6,5]];
>
> (ordering of the returned list and sublists are not specified.)
>
> =cut
Almost a joke:
from numarray import *

def merge(*pairs):
    flattened = reduce(tuple.__add__, pairs, tuple())
    m, M = min(flattened), max(flattened)

    d = M - m + 1
    matrix = zeros((d, d), type = Bool)

    for x, y in pairs:
        X, Y = x - m, y - m

        matrix[X, X] = 1
        matrix[X, Y] = 1
        matrix[Y, X] = 1
        matrix[Y, Y] = 1

    while True:
        next = greater(dot(matrix, matrix), 0)
        if alltrue(ravel(next == matrix)):
            break
        matrix = next

    results = []
    for i in range(d):
        eqls, = nonzero(matrix[i])
        if eqls.size():
            if i == eqls[0]:
                results.append(tuple(x + m for x in eqls))

    return results
Disclaimer: I'm not an expert in numarray, and I suppose that something
here can be dramatically improved.
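For comparison, the same partition can be sketched in Perl with a plain
union-find instead of repeated matrix products. This is only a sketch under
assumptions: the merge() below is a hypothetical re-implementation that
follows the quoted documentation, not Xah Lee's original code, and it
assumes the indexes are numbers.

```perl
use strict;
use warnings;

# Hypothetical re-implementation of merge() from the quoted POD,
# using union-find. Indexes are assumed to be numeric.
sub merge {
    my ($pairs) = @_;
    my %parent;

    # Return the representative of $x (with path halving).
    my $find = sub {
        my ($x) = @_;
        $parent{$x} = $x unless exists $parent{$x};
        while ($parent{$x} != $x) {
            $parent{$x} = $parent{ $parent{$x} };
            $x = $parent{$x};
        }
        return $x;
    };

    # Union the two classes named by each pair.
    for my $pair (@$pairs) {
        my ($ra, $rb) = map { $find->($_) } @$pair;
        $parent{$ra} = $rb if $ra != $rb;
    }

    # Group every index under its representative.
    my %class;
    push @{ $class{ $find->($_) } }, $_ for sort { $a <=> $b } keys %parent;
    return [ sort { $a->[0] <=> $b->[0] } values %class ];
}

my $result = merge([ [1, 2], [2, 4], [5, 6] ]);
print join(' ', map { '[' . join(',', @$_) . ']' } @$result), "\n";
```

The sorting is there only to make the output stable; the quoted
documentation leaves the ordering unspecified.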
------------------------------
Date: 18 Feb 2005 12:09:13 GMT
From: Mike Dowling <mike53@moocow.tu-bs.de>
Subject: cperl-mode and emacs-21.4 brocken?
Message-Id: <slrnd1bmnm.q7.mike53@moocow.localhost>
I had no problems with cperl-mode with emacs-21.3.
With emacs-21.4, it still works, but only if started manually (M-x
cperl-mode). Starting automatically with
(autoload 'perl-mode "cperl-mode" "alternate mode for editing Perl programs" t)
in the .emacs file fails. Instead, I get the message:
File mode specification error: (error "Autoloading failed to define function perl-mode")
Any clues, anybody?
Cheers,
M. Dowling
--
Sorry, after years of fighting spam, I've given in. I changed
my email address, and don't want to reveal it to spammers. I
understand any wrath thereby incurred, but I'm no longer in any
position to continue the fight.
------------------------------
Date: 18 Feb 2005 05:18:55 -0800
From: msargent100@hotmail.com
Subject: How Can I Find/Remove a Null from a String?
Message-Id: <1108732735.621070.66630@f14g2000cwb.googlegroups.com>
Perl newbie question -
I am having a problem finding the end of a string. I determined there
are nulls at the end but I do not know how many nor how to get rid of
them. I did a reverse on the variable and then looking at substr($x, 0,
1), I get a null returned.
What happened -
I am reading from a binary file that has fixed length fields but the
contents of the field may be variable length strings. To get the data I
need I pull the contents of the field using the substr function with an
offset and length of the field. So the text field is always
64-characters long, but the values in the field will vary considerably.
So my question is: How can I find the end of the characters and assign
only the valid characters to the variable and drop the nulls?
I'm trying to use the data extracted from the binary file to
access/insert into a database. SQL does not like nulls at the end of
the values being inserted.
Thanks,
Mike
------------------------------
Date: Fri, 18 Feb 2005 13:36:03 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude.invalid>
Subject: Re: How Can I Find/Remove a Null from a String?
Message-Id: <Xns9601577E09809asu1cornelledu@127.0.0.1>
msargent100@hotmail.com wrote in news:1108732735.621070.66630
@f14g2000cwb.googlegroups.com:
> I am having a problem finding the end of a string. I determined there
> are nulls at the end but I do not know how many nor how to get rid of
> them. I did a reverse on the variable and then looking at substr($x,
> 0, 1), I get a null returned.
I am not sure what information you are trying to convey by the last
statement.
#! /usr/bin/perl
my $s = qq{This is a test\000\000\000\000\000\000\000};
print 'length $s = ', length $s, "\n";
$s = substr $s, 0, index $s, "\000";
print 'length $s = ', length $s, "\n"
__END__
D:\Home\asu1\UseNet\clpmisc> n
length $s = 21
length $s = 14
Sinan
------------------------------
Date: Fri, 18 Feb 2005 13:58:12 GMT
From: "Jürgen Exner" <jurgenex@hotmail.com>
Subject: Re: How Can I Find/Remove a Null from a String?
Message-Id: <UvmRd.10363$uc.8133@trnddc09>
msargent100@hotmail.com wrote:
> I am having a problem finding the end of a string. I determined there
> are nulls at the end but I do not know how many nor how to get rid of
> them. I did a reverse on the variable and then looking at substr($x,
> 0, 1), I get a null returned.
[...]
> So my question is: How can I find the end of the characters and
> assign only the valid characters to the variable and drop the nulls?
Many different ways:
- use index() to find the first \000 and then use substr() to extract the
good part of the string from offset 0 up to that position
- use s/// to replace the trailing \000s with nothing
- use tr/// to transliterate (actually delete) \000s
- "while" the last character of the string is \000 (can easily be checked
with a negative offset argument to substr) shorten the string by one
character
- reverse the string, "while" the first character is \000 remove it, reverse
the string
- split() the string into an array of single characters, grep() for
non-\000, and join() again
- split() the string into an array of single characters, map all \000 into
an empty string, and join() again
- ...
I am sure there are many more obscure ways to do it.
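A few of the approaches listed above can be sketched as follows. The field
contents are made up for illustration, and the unpack 'Z*' variant is an
addition that is not in the list. Note that the index() variant needs a
guard: index() returns -1 when there is no \000 at all, and a length of -1
passed to substr() would silently drop the last character.

```perl
use strict;
use warnings;

# A 64-byte fixed-width field padded with NULs (made-up sample data).
my $field = "This is a test" . ("\0" x 50);

# index() + substr(): keep everything before the first NUL.
my $pos      = index $field, "\0";
my $by_index = $pos >= 0 ? substr($field, 0, $pos) : $field;

# s///: remove only the trailing run of NULs.
(my $by_subst = $field) =~ s/\0+\z//;

# tr///: delete every NUL, wherever it occurs.
(my $by_tr = $field) =~ tr/\0//d;

# unpack 'Z*': treat the field as a NUL-terminated string.
my $by_unpack = unpack 'Z*', $field;

print length($_), "\n" for $by_index, $by_subst, $by_tr, $by_unpack;
```

The variants differ when a NUL appears in the middle of the data: index()
and unpack stop at the first NUL, s/// keeps embedded NULs, and tr///
removes them all.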
jue
------------------------------
Date: Fri, 18 Feb 2005 09:15:59 -0600
From: m <anomousty@webwonders.org>
Subject: Re: How Can I Find/Remove a Null from a String?
Message-Id: <cv50s7$2m$1@nntp.msstate.edu>
A. Sinan Unur wrote:
> I am not sure what information you are trying to convey by the last
> statement.
>
> #! /usr/bin/perl
>
> my $s = qq{This is a test\000\000\000\000\000\000\000};
>
> print 'length $s = ', length $s, "\n";
>
> $s = substr $s, 0, index $s, "\000";
>
> print 'length $s = ', length $s, "\n"
> __END__
>
> D:\Home\asu1\UseNet\clpmisc> n
> length $s = 21
> length $s = 14
>
> Sinan
#! /usr/bin/perl
my $s = qq{This is a test\000\000\000\000\000\000\000};
print "length of string is ",length $s,"\n" ;
$s=~/\000/;
print "length is ",length $`,"\n" ;
Is this a "worse" form when compared to the above program? I am just
starting off with Perl!
------------------------------
Date: 18 Feb 2005 15:50:08 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude.invalid>
Subject: Re: How Can I Find/Remove a Null from a String?
Message-Id: <Xns96016E3A151D5asu1cornelledu@132.236.56.8>
m <anomousty@webwonders.org> wrote in news:cv50s7$2m$1@nntp.msstate.edu:
> A. Sinan Unur wrote:
>> I am not sure what information you are trying to convey by the last
>> statement.
>>
>> #! /usr/bin/perl
>>
>> my $s = qq{This is a test\000\000\000\000\000\000\000};
>>
>> print 'length $s = ', length $s, "\n";
>>
>> $s = substr $s, 0, index $s, "\000";
>>
>> print 'length $s = ', length $s, "\n"
>> __END__
>>
>> D:\Home\asu1\UseNet\clpmisc> n
>> length $s = 21
>> length $s = 14
>>
>> Sinan
>
> #! /usr/bin/perl
> my $s = qq{This is a test\000\000\000\000\000\000\000};
> print "length of string is ",length $s,"\n" ;
> $s=~/\000/;
> print "length is ",length $`,"\n" ;
>
> is this a "worse" form when compared to the above program?am jst
> starting off with perl!
They do different things.
I try to write what I mean. If the purpose of the code is to get rid of
everything after the first \000, then that's what it should do.
Sinan
------------------------------
Date: 18 Feb 2005 07:15:27 -0800
From: laredotornado@zipmail.com
Subject: How can I get the exit status?
Message-Id: <1108739727.195814.98050@o13g2000cwo.googlegroups.com>
Hello,
I have a perl script that launches a Unix process that calls another
Perl script:
`perl /myscripts/gen_ecom_xml.pl $ckit_file > $xml_file`;
and I am wondering how I can get the exit status value and any
potential error messages?
Thanks in advance, - Dave
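No reply to this question appears in this digest. One common approach,
sketched here under assumptions, is to read $? after the backticks return
and to fold stderr into the captured output with 2>&1. The child command
below is a stand-in one-liner, since gen_ecom_xml.pl itself is not shown,
and a Unix shell is assumed.

```perl
use strict;
use warnings;

# Stand-in for the real child script: prints to stdout and stderr,
# then exits with status 3. $^X is the path of the running perl.
my $cmd = $^X . q{ -e 'print "out\n"; warn "oops\n"; exit 3'};

# 2>&1 sends the child's stderr to the same pipe as its stdout.
my $output = `$cmd 2>&1`;

if ($? == -1) {
    die "could not run child: $!\n";
}
my $status = $? >> 8;     # high byte of $? is the child's exit code
my $signal = $? & 127;    # low bits hold a signal number, if any

print "exit status: $status\n";
print "signal: $signal\n";
print "captured output:\n$output";
```

In the original command the output is already redirected to $xml_file, so
only stderr would be captured; appending 2>&1 after that redirection would
instead put the error messages into the XML file.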
------------------------------
Date: Fri, 18 Feb 2005 13:50:58 +0000
From: brianr@liffe.com
Subject: Re: ithreads + signals on modern Unices
Message-Id: <vt7jl6rn7h.fsf@ssdevws28.admin.liffe.com>
Thomas Jahns <Thomas.Jahns@epost.de> writes:
> Thomas Jahns <Thomas.Jahns@epost.de> writes:
>> I wish to make a background application I told to 'use threads;' also
>> react nicely to a SIGHUP (and reread configuration). perldoc perlthrtut
>> tells me not to mix signals and ithreads but the other aspects to
>> consider are as follows:
>>
>> - I don't care for portability to Win32, pre-X MacOS, MVS or whatever
>> platforms may also provide a Perl implementation. I just need the
>> program to run on relatively modern Unices (i.e. pthreads and POSIX
>> sigaction will be available).
>>
>> - perlthrtut also tells me 'use Thread;' will break real soon and isn't
>> so great to begin with, and Thread::Queue which my program already
>> uses is--surprise--not meant to work with Thread but threads anyway.
>>
>> So I seek a description of signal semantics for the systems outlined
>> when using ithreads. Is there such documentation available? I searched
>> but apart from the Perl source couldn't find anything useful (not that I
>> didn't get many google hits, but what I got was either outdated or a
>> repetition of the message from perlthrtut).
>>
>> Since the I really like the ease at which Perl allows me to write
>> programs for Unix/Linux I'd really hate to turn my program into ten
>> times the number of code lines of C.
>
> So does the lack of answers mean, that I
>
> - I did not describe my intended application clearly enough?
> - I should take this question to a Unix programming group?
> - I violated etiquette really badly?
> - noone except me cares about using threads and still handling
> signals, after all it has to work every time one calls system(), or
> not?
>
> Please, any pointer will do, even if it means I'll have to either dig
> through the Perl or rather rewrite in C.
If it is any of those reasons, it is likely to be the last, as I would
expect (hope?) that not many people want to combine signals and
threads. It is generally considered to be a bad idea. David Butenhof
in his book "Programming with POSIX Threads" gives a good explanation
of why.
If you do have to use threads and signals together the usual advice is
to use signal masks to block signals from all threads, and collect
signals in a dedicated thread which blocks on one of the sigwait
variants. Unfortunately, I believe that this is part of the POSIX API
that Perl doesn't provide (as mentioned in perlthrtut).
I would recommend finding a different way of doing what you want.
HTH
--
Brian Raven
I don't like this official/unofficial distinction. It sound, er, officious.
-- Larry Wall in <199702221943.LAA20388@wall.org>
------------------------------
Date: Fri, 18 Feb 2005 15:37:08 GMT
From: nospam@geniegate.com
Subject: Re: Low level data manipulation in Perl
Message-Id: <Lucy1108738569137890xef2054@air.tunestar.net>
In: <csdo6v$60e$1@newsg2.svr.pol.co.uk>, "Leonard Challis" <perl@lennychallis.co.uk> wrote:
>Hi everyone,
>
>I have spent a few hours looking on Google, Perl.com, CPAN etc to try find
>some information on messing about with low leveldata in Perl. I am talking
>about opening files and looking at them in their very simplest format, 1s
>and 0s.
>
>What I have noticed from my searches so far is things like pack(), unpack(),
>binmode() and some other stuff, but not really what I'm looking for, AFAIK.
Actually, you're headed in the right direction: pack(), unpack(), ord(),
chr(), vec() and sometimes sprintf() (sprintf mostly for the "looking at" part).
I look at it like this: there is the letter "1", and then there is a binary
1 (0x31 and ^A respectively, I believe). Unless you tell it with unpack()
(or, for single-byte ASCII, ord()) that you're really interested in the data
numerically, perl will treat the data as text.
$a = 'A';
$a + 1;       # is 1: 'A' is not numeric, so it counts as 0
ord($a) + 1;  # is 66: ord('A') is 65
Sometimes I'll add a zero to something to force numeric context:
$a += 0;
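To actually look at the 1s and 0s, unpack() with the 'B*' template and
sprintf() with %b are probably the shortest route. A minimal sketch, with
made-up sample data:

```perl
use strict;
use warnings;

# The letter A, the digit "1", and a byte whose value is 1 (^A).
my $data = "A1\x01";

# 'B*' renders every byte as eight bits, most significant bit first.
my $bits = unpack 'B*', $data;
print "$bits\n";

# ord() gives the numeric value of one character; printf formats it
# as decimal, hex and binary.
for my $char (split //, $data) {
    my $n = ord $char;
    printf "%3d  0x%02x  %08b\n", $n, $n, $n;
}
```

When the bytes come from a file rather than a literal, call binmode() on
the filehandle first so that no newline translation happens on platforms
that do it.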
Jamie
--
http://www.geniegate.com Custom web programming
guhzo_42@lnubb.pbz (rot13) User Management Solutions
------------------------------
Date: Fri, 18 Feb 2005 11:04:30 GMT
From: Bart Lateur <bart.lateur@pandora.be>
Subject: Re: Modify keys in a %hash using tr/// or s///
Message-Id: <lqib11tm4ond8geeddnufknkal7knok8j0@4ax.com>
Fred Hare wrote:
>I have a script for comparing a dir of MP3-files to a DB-file. It works
>OK unless there are files with high-ascii characters in the dir or in
>the DB. I tried to add a conversion but I get errors like "uninitialized
>value in pattern-match..."
>Is there a way to alter the hash-keys using tr/// or s/// ?
No, you have to do the conversion before you put the stuff in the hash.
A hash key is a read-only property.
You can use
$hash{$new} = delete $hash{$old};
but preferably only after you make sure that $new ne $old, so you know
this step is required.
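Applied to a whole hash, the delete-and-reinsert step might look like the
sketch below. The hash contents and the tr/// character mapping are made
up for illustration; the real mapping depends on the encoding of the file
names.

```perl
use strict;
use warnings;

# Made-up sample data: file name => size in bytes.
my %size = (
    "Caf\xe9 del Mar.mp3" => 4_200_000,
    "Intro.mp3"           => 900_000,
);

# keys() returns a copy of the key list, so changing the hash
# inside the loop is safe.
for my $old (keys %size) {
    (my $new = $old) =~ tr/\xe0\xe8\xe9\xf6\xfc/aeeou/;
    next if $new eq $old;                 # nothing to rename
    warn "key collision on '$new'\n" if exists $size{$new};
    $size{$new} = delete $size{$old};
}

print "$_\n" for sort keys %size;
```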
--
Bart.
------------------------------
Date: Fri, 18 Feb 2005 14:55:06 +0000
From: Martin Gill <nospam@nospam.net>
Subject: Need help with an advanced? regular expression
Message-Id: <42160012$1_1@baen1673807.greenlnk.net>
Hi,
I'm trying to write a regular expression which parses the following string:
blah blah items 1234, 4567, 4345, and 3245 blah blah blah
I want to be able to pick up the numbers following the "items" label.
I thought the following might work, but it doesn't seem to
/ORs (\b(\d+)\b)+/
i want it to match:
1234
4567
4345
3245
Any help is greatly appreciated.
--
--
Martin Gill
------------------------------
Date: Fri, 18 Feb 2005 16:11:33 +0100
From: "Bernard El-Hagin" <bernard.el-haginDODGE_THIS@lido-tech.net>
Subject: Re: Need help with an advanced? regular expression
Message-Id: <Xns9601A4B7AE1BAelhber1lidotechnet@62.89.127.66>
Martin Gill <nospam@nospam.net> wrote:
> Hi,
>
> I'm trying to write a regular expression which parses the
> following string:
>
> blah blah items 1234, 4567, 4345, and 3245 blah blah blah
>
> I want to be able to pick up the numbers following the "items"
> label.
>
> I thought the following might work, but it doesn't seem to
>
> /ORs (\b(\d+)\b)+/
^^^
What is that supposed to do?
> i want it to match:
> 1234
> 4567
> 4345
> 3245
With the input and specification you've provided this will work for
you:
print "$_\n" for m/(\d+)/g;
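The one-liner above picks up every number in the line. If other numbers
can appear before the "items" label, a two-step match is one option: first
capture the list that follows "items", then pull the digits out of it. This
is a sketch under the assumption that the list always looks like the sample
(numbers separated by commas, with an optional "and"); the sample line has
an extra 99 added to show the difference.

```perl
use strict;
use warnings;

my $line = 'blah 99 blah items 1234, 4567, 4345, and 3245 blah blah blah';

my @items;
# Step 1: capture the comma/"and" separated list right after "items".
if ($line =~ /\bitems\s+((?:(?:and\s+)?\d+,?\s*)+)/) {
    my $list = $1;
    # Step 2: extract each run of digits from the captured list.
    @items = $list =~ /\d+/g;
}

print "$_\n" for @items;
```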
--
Cheers,
Bernard
------------------------------
Date: 18 Feb 2005 03:11:43 -0800
From: "Chris" <cmw@tulpje.co.uk>
Subject: Re: Newbie Perl programming help (RSS & IRC)
Message-Id: <1108725103.524748.206630@l41g2000cwc.googlegroups.com>
Thanks for the help. It is much appreciated. A couple more things.
So would declaring the variables with the "local" command to make them
global variables work? What I am doing in this program is monitoring
the output of an IRC channel. Every public message that is sent
from the nick "eZebra" and contains an ed2k link should be written to an
RSS file. So I think I need the $rss and $latest variables
to be global to the whole program, so that I can read their contents
wherever I am.
I do find programming Net::IRC a little odd. I am not used to event
driven perl + I am new to perl anyway.
thanks
Chris
------------------------------
Date: Fri, 18 Feb 2005 07:41:27 -0500
From: Sherm Pendley <spamtrap@dot-app.org>
Subject: Re: Newbie Perl programming help (RSS & IRC)
Message-Id: <lO-dneLxDb7lf4jfRVn-1Q@adelphia.com>
Chris wrote:
> So would declaring the variables with the "local" command to make them
> global variables work?
"local" doesn't do what most new Perl programmers expect it to do. Read up
on it - "perldoc -q scoping", "perldoc -f local".
> the the output of an irc channel. On every public message that is sent
> from the nick "eZebra", that contains an ed2k link to write it to an
> rss file. So what I think I need is the $rss and the $latest variable
> to be global to the whole program.
So declare them that way. When you use my() inside a subroutine or other
block, you're declaring a new lexical variable that exists only inside that
block.
Instead of doing that, declare a variable at the top of the file with my(),
so it's visible to all the code in that file. Or go a step further and use
our(), so that it's visible to any code within the same package.
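A minimal sketch of that advice follows. The handler name on_public and
the sample link are made up; only $rss and $latest come from the original
question.

```perl
use strict;
use warnings;

my  $latest = 'none';   # file-scoped lexical: visible to all code below
our $rss    = '';       # package variable: also reachable as $main::rss

# Hypothetical event handler, called once per public message.
sub on_public {
    my ($text) = @_;
    $latest = $text;                      # updates the file-level variable
    $rss   .= "<item>$text</item>\n";
}

sub shadowed {
    my $latest = 'only inside this sub';  # a new, unrelated variable
    return $latest;
}

on_public('ed2k://example');
shadowed();

print "latest: $latest\n";
print "rss: $rss";
```

The my() inside shadowed() creates a second variable that happens to share
the name, which is why the file-level $latest is untouched by that call.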
> I do find programming Net::IRC a little odd. I am not used to event
> driven perl + I am new to perl anyway.
This has nothing at all to do with your app being event-driven.
Although, it *is* important to understand early on. I think you should take
a step back from the app you're working on, go to <http://learn.perl.org>
and study variable scoping until you're comfortable with it.
Several sections of "perldoc perlsub" discuss scoping as well.
sherm--
--
Cocoa programming in Perl: http://camelbones.sourceforge.net
Hire me! My resume: http://www.dot-app.org
------------------------------
Date: Fri, 18 Feb 2005 13:20:13 GMT
From: "Jürgen Exner" <jurgenex@hotmail.com>
Subject: Re: Perl script timeout problem
Message-Id: <hYlRd.41715$uc.34427@trnddc04>
sipitai wrote:
> Brian McCauley wrote...
>
>> I shall assume this is a stealth CGI question.
>>
>> Also, this does not appear to be a Perl question at all.
>>
>> Ask yourself this: Would you expect the answer to be any different if
>> your CGI script were in python, C, pasacal, bash...?
>
> Maybe, maybe not. I figure this problem could originate from either
> the script itself, or the environment its being executed in.
Are you using alarm() in your program?
If not then it's the environment that is triggering the timeout.
jue
------------------------------
Date: Fri, 18 Feb 2005 14:01:30 +0000
From: Brian McCauley <nobull@mail.com>
Subject: Re: Perl script timeout problem
Message-Id: <cv4s5e$j1j$1@sun3.bham.ac.uk>
sipitai wrote:
> Brian McCauley wrote...
>
>
>>Do not have the script send the file. Have it perform an internal
>>redirect to the file. You may or may not be able to configure your web
>>server to prevent people bypassing the script but even if you can't you
>>can just make sure that the directory name is obscure.
>
>
> Unfortunately there are a number of reasons why this wouldnt work,
Are any of them valid?
> one of which is that the "key" for the file needs to be able to expire.
That one, for example, is not a valid reason. I suspect you misread
"internal" as "external".
Pseudo-code:
if ( key_has_expired ) {
display_error_page;
} else {
perform_internal_redirect; # Client never sees the real URL
}
------------------------------
Date: 18 Feb 2005 06:52:17 -0800
From: "npritchard@mail.com" <npritchard@mail.com>
Subject: problem with system(@args)
Message-Id: <1108738337.054901.251840@o13g2000cwo.googlegroups.com>
i'm using Win32::GuiTest to programmatically uninstall an
application. for that, i need to launch the uninstaller (setup.exe) as
a separate process *and* return back to the script so i can use the
SendKeys function. this works great for the compressed installer where
i issue a simple:
system ("start c:\\guitest\\test\\myapp.exe");
Win32::GuiTest::SendKeys("{ENTER}");
however, i'm running into problems while uninstalling.
=======================
@args = ("C:\\Program Files\\InstallShield Installation Information\\{71A2182D-A59E-4560-80BD-71E3D21A13F3}\\setup.exe",
         "-forced_uninstall");
#system(@args) == 0 or die "crap";
system(@args);
===============================
this doesn't work because the uninstaller is launched in the same
process, so the next command is never executed
also, the following does nothing. no error but the uninstaller is never
launched
==============================
system("start /D \"C:\\Program Files\\InstallShield Installation Information\\\{71A2182D-A59E-4560-80BD-71E3D21A13F3\}\\\" setup.exe -forced_uninstall");
=============================
i can chdir to the installshield root directory and:
=============================
system("start $installShieldGUID/setup.exe -forced_uninstall");
=============================
this works but doesn't seem like the best solution. it also forces me
to reboot my machine, which is normally unnecessary.
any hints/ideas are appreciated.
------------------------------
Date: 18 Feb 2005 03:15:14 -0800
From: jolly@tavern.de
Subject: Regex combining /(foo|bar)/ slower than using foreach (/foo/,/bar/) ???
Message-Id: <1108725314.956450.106740@c13g2000cwb.googlegroups.com>
I've got a problem with the perl regex compiler. It seems that
compilation of combined regexes (alternation, or whatever you call it)
is not optimized.
Using a /(foo|bar)/ regex on strings is slower than using a foreach
loop doing the matching one after another. I've written a test program
and looked at the perl source to find out why. Now I know: it seems
that the DFA doesn't get optimised for the alternation.
As I have neither the time nor the knowledge and skill to optimise the
perl regex compiler from scratch, what can I do? Programming such
foreach loops gives me headaches; it's so 'awk'ward.
Here's the testprogram for those of you that don't think it's true:
#!/bin/perl
use strict;
use Digest::MD5 qw(md5 md5_hex md5_base64);
use Time::HiRes qw(time );
#use re 'debug' ;

foreach my $regexcount (1,5,10)
{
    foreach my $regexlength (2,5,10,20)
    {
        my @items = map{ createRandomTextWithLength($regexlength); } (1..$regexcount);
        my $regexstr = join('|',@items);
        my $regex = qr /(?:$regexstr)/;
        foreach my $stringlength (100,1000,10000,100000)
        {
            print localtime()." Stringlength: $stringlength Number of Regexes:$regexcount Length of each Regex:$regexlength\n";
            my $teststring = createRandomTextWithLength($stringlength);
            my $timer;
            {
                my $test=$teststring;
                $timer =time;
                $test =~ s/$regex/foobar/g;
                printf("ElapsedTime:%5.4f %20s %20s\n",time-$timer,md5_hex($test),$regex);
            }
            {
                my $test=$teststring;
                $timer =time;
                foreach my $oneregex (@items)
                {
                    $test =~ s/$oneregex/foobar/g;
                }
                printf("ElapsedTime:%5.4f %20s %20s\n",time-$timer,md5_hex($test),' for loop over '.join(',',@items));
            }
            print "\n";
        }
    }
}

sub createRandomTextWithLength($)
{
    my($count) = (@_);
    my $string;
    for (1.. $count)
    {
        $string.=chr(ord('a')+rand(20));
    }
    return $string;
}
------------------------------
Date: 18 Feb 2005 03:36:43 -0800
From: jolly@tavern.de
Subject: Re: Regex combining /(foo|bar)/ slower than using foreach (/foo/,/bar/) ???
Message-Id: <1108726603.857300.66710@z14g2000cwz.googlegroups.com>
Yep, it's an optimisation issue. I always thought that using
/(foo|bar)/ would be the quickest way, so lots of code has
already been written with that in mind.
It has become a problem lately as the strings I do regexes on tend to
get larger (e.g. XML files) and the performance penalty is HUGE.
I thought that maybe someone has a solution for the problem. I haven't
taken a look at how parrot works with regexes. I'm thinking of writing a
module with an optimised parser for such regexes, but that would be my
last resort.
Jolly
------------------------------
Date: 18 Feb 2005 06:33:40 -0800
From: "JollyJinx" <jolly@tavern.de>
Subject: Re: Regex combining /(foo|bar)/ slower than using foreach (/foo/,/bar/) ???
Message-Id: <1108737220.645118.65920@g14g2000cwa.googlegroups.com>
A pity that foo and bar are NOT plain strings. The real thing looks more
like:
$self->{MATCHCACHE}= '([^[:alnum:]\xc0-\xff]('.join('|', map{
RockBottom::buildMatchFromString($_) }(sort
{length($b)<=>length($a)}(@matcharray)) ).')[^[:alnum:]\xc0-\xff])';
buildMatchFromString builds multiple versions of a string, and it builds
regexes, not just plain strings.
The array contains thousands of matches, and the replacement is looked up
in a hash of strings (actually a function). Anyway, the XML files are not
small (> 100 kBytes).
index isn't an option here ;-(
--
Jolly
------------------------------
Date: 18 Feb 2005 15:12:46 GMT
From: xhoster@gmail.com
Subject: Re: Regex combining /(foo|bar)/ slower than using foreach (/foo/,/bar/) ???
Message-Id: <20050218101246.211$XW@newsreader.com>
jolly@tavern.de wrote:
> I've got a problem with the perl regex compiler. It seems that
> compliation of combined regexes ( or alternation whatever you call it
> ) is not optimized.
Maybe, but I don't think you've demonstrated that.
> Using a /(foo|bar)/ regex on strings is slower than using a foreach
> loop doing the matching one after another.
Well, they also aren't doing the same thing. So that makes any comparison
rather meaningless.
> I've written a testprogramm
> and looked at the perl source to find out why. Now I know. It seems
> that DFA won't get optimised for the alternation.
What part in the perl source tipped you off that they aren't optimized?
> As I have no time and knowledge and skill for optimising the perlregex
> compiler from scratch, what can I do. Programming such foreach loops
> gives me headaches - it such 'awk'ward.
You can write one subroutine or module, and then use it over and over.
That way you only have to program the foreach loop once.
>
> Here's the testprogram for those of you that don't think it's true:
Try adding an assertion to your code to check that the md5_hex of each
$test are actually equal.
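A tiny case shows why the two forms can disagree. This is a sketch with
made-up patterns: the replacement text "foobar" itself contains "bar", so
the second pass of the loop rewrites text that the first pass inserted,
while the single-pass alternation never rescans its own replacements. The
random patterns in the test program can hit the replacement text in the
same way.

```perl
use strict;
use warnings;

my @items = ('foo', 'bar');
my $input = 'foo';

# One pass with an alternation built from the items.
my $alt = join '|', map { quotemeta } @items;
(my $combined = $input) =~ s/(?:$alt)/foobar/g;

# One pass per item, as in the foreach version.
my $looped = $input;
$looped =~ s/$_/foobar/g for @items;

print "combined: $combined\n";
print "looped:   $looped\n";
print "results ", ($combined eq $looped ? "agree" : "differ"), "\n";
```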
Xho
--
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB
------------------------------
Date: Fri, 18 Feb 2005 12:12:34 +0100
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: Regex combining /(foo|bar)/ slower than using foreach (/foo/,/bar/) ???
Message-Id: <37m1nbF5ge7taU1@individual.net>
jolly@tavern.de wrote:
> I've got a problem with the perl regex compiler. It seems that
> compliation of combined regexes ( or alternation whatever you call it
> ) is not optimized.
>
> Using a /(foo|bar)/ regex on strings is slower than using a foreach
> loop doing the matching one after another. I've written a testprogramm
> and looked at the perl source to find out why. Now I know. It seems
> that DFA won't get optimised for the alternation.
>
> As I have no time and knowledge and skill for optimising the perlregex
> compiler from scratch, what can I do. Programming such foreach loops
> gives me headaches - it such 'awk'ward.
Do you need to optimise the program in this respect? If not, why would
you have a problem? ;-)
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
------------------------------
Date: Fri, 18 Feb 2005 13:21:37 +0100
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: Regex combining /(foo|bar)/ slower than using foreach (/foo/,/bar/) ???
Message-Id: <37m5pfF5ejibiU1@individual.net>
[ Please provide some context when replying to a message. ]
jolly@tavern.de wrote:
> Gunnar Hjalmarsson wrote:
>> jolly@tavern.de wrote:
>>> I've got a problem with the perl regex compiler. It seems that
>>> compliation of combined regexes ( or alternation whatever you call it
>>> ) is not optimized.
>>
>> Do you need to optimise the program in this respect? If not, why would
>> you have a problem? ;-)
>
> Yep, it's an optimisation issue. I always thought that using the
> /(foo|bar)/ would be the quickest way. So lot's of code has been
> already written with that in mind.
> It has become a problem lately as the strings I do regexes on tend to
> get larger ( e.q. xml-files ) and the performance penalty is HUGE.
XML files are seldom very large.
Anyway, are "foo" and "bar" plain strings? If they are, you may want to
try using the index() function instead of the regex engine for better
efficiency.
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
------------------------------
Date: Fri, 18 Feb 2005 14:10:32 +0000
From: Brian McCauley <nobull@mail.com>
Subject: Re: Regex combining /(foo|bar)/ slower than using foreach (/foo/,/bar/) ???
Message-Id: <cv4smd$j6m$1@sun3.bham.ac.uk>
jolly@tavern.de wrote:
> Using a /(foo|bar)/ regex on strings is slower than using a foreach
> loop doing the matching one after another.
Yes, this is even mentioned in the FAQ, albeit in an oblique way.
The solution to FAQ "How do I efficiently match many regular expressions
at once?" does not mention joining them with '|'. It would IMNSHO be
better if it were to explicitly mention that joining was not efficient.
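The approach in that FAQ entry, as far as I recall it, is to compile each
pattern once with qr// and then loop over the compiled patterns. A minimal
sketch with made-up patterns and input lines:

```perl
use strict;
use warnings;

# Compile each pattern once; qw() keeps the backslashes literal.
my @patterns = map { qr/\b$_\b/ } qw(foo bar\d+ baz);

my @lines = ('a foo here', 'bar42 there', 'nothing to see');
my @hits;

for my $line (@lines) {
    for my $re (@patterns) {
        if ($line =~ $re) {
            push @hits, $line;
            last;               # first matching pattern is enough
        }
    }
}

print "matched: $_\n" for @hits;
```

Whether this beats a joined alternation depends on the patterns, the input
and the perl version, so it is worth timing both on real data.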
------------------------------
Date: Fri, 18 Feb 2005 13:57:49 +0000
From: Henry Law <lawshouse.public@btconnect.com>
Subject: Re: regexp: read ip address
Message-Id: <mmsb11pasgq1decc9a1m1ne1r8nil0fkje@4ax.com>
On Thu, 17 Feb 2005 14:34:10 +0100, vertigo <ax178@wp.pl> wrote:
>Hello
>I have line with several words, and on ip address in it. How can i read
>it ? Could anybody help ?
It's not clear what you're trying to do. I think that English is not
your first language, which may be your problem. Here is what you
should do:
(0) Read the posting guidelines for this group. They are posted twice
a week, and Google will find them for you if all else fails.
(1) Make a simple example of your input stream
(2) Be clear about what it is you want to do with it
(3) Do the best you can to write your program in Perl; debug it so it
runs without warnings.
(4) If it doesn't give you the output that you expect, come back here
and post, explaining what you're trying to do and what happened
instead. Help will come!
Last hint: be sure to follow the posting guidelines in respect of (a)
using warnings and strict; and (b) posting a small complete program
that demonstrates your problem.
Please forgive Sinan's obstructive reply: he's not very tolerant of
people who haven't taught themselves English as well as he has. But
he's a mine of information on Perl so we forgive his occasional
waspishness.
------------------------------
Date: Fri, 18 Feb 2005 13:23:43 GMT
From: "Jürgen Exner" <jurgenex@hotmail.com>
Subject: Re: SET Operations in Perl
Message-Id: <z%lRd.41716$uc.37361@trnddc04>
George wrote:
> Bernard El-Hagin wrote:
>
>> "George" <georgekinley@hotmail.com> wrote:
>> > [Lot of crap]
>>
>> *plonk*
>
> does it make you feel any better
<quote>
PROUD TO PLONK
Life's too short to be trolled, spammed or get upset about lame
posters! Make the Net a better place by ignoring those who annoy you.
Don't be annoyed; keep your calm. Be proud to plonk.
</quote>
*PLONK*
Yes, this did feel good
jue
------------------------------
Date: Fri, 18 Feb 2005 15:37:10 GMT
From: nospam@geniegate.com
Subject: Re: simple encryption/decryption
Message-Id: <Lucy1108740050137890xf16c40@air.tunestar.net>
In: <csj6h7$nc7$1@oden.abc.se>, stig <_nospam_stigerikson@yahoo.se> wrote:
>hi.
>which perl-module(s) can be used to implement very simple
>encryption/decryption of arbitrary length strings?
>it does not need to very secure but must be able to encrypt and decrypt
>(not only one way).
>
>for various reasons Crypt::Simple will not be a possible choice, are
>there any other modules that you can recommend?
>
>many thanks
>stig
y/A-Za-z/N-ZA-Mn-za-m/;
Works both ways, plus it's really simple! :-)
You could also just XOR the data with a key, and XOR it again to get it
back. That is a bit more "secure" than the above, but still fairly simple
and easily broken. (It is kid-sister scrambling unless you've got a secure
channel to transfer the XOR key back and forth, AND are able to generate a
new key each time, AND those keys are identical in length to the original
data.) It'd probably work to keep grep from finding your string, though.
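Both ideas can be sketched as follows. The helper names rot13 and xor_scramble and the key 'k3y' are made up for illustration. The XOR version repeats a short key over the data, which is weaker than the one-key-per-message scheme described above and is obfuscation, not encryption:

```perl
use strict;
use warnings;

# ROT13: applying the same transliteration twice restores the input.
sub rot13 {
    my ($s) = @_;
    (my $out = $s) =~ tr/A-Za-z/N-ZA-Mn-za-m/;
    return $out;
}

# XOR with a repeating key (hypothetical helper, NOT real encryption).
# The key is repeated to the length of the data; XORing twice with the
# same key gives the original bytes back.
sub xor_scramble {
    my ($data, $key) = @_;
    die "key must not be empty\n" unless length $key;
    my $repeats = int(length($data) / length($key)) + 1;
    my $pad     = substr($key x $repeats, 0, length $data);
    return $data ^ $pad;    # string XOR, byte by byte
}

my $secret = 'Hello, World';
print rot13($secret), "\n";
print xor_scramble(xor_scramble($secret, 'k3y'), 'k3y'), "\n";
```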
Jamie
--
http://www.geniegate.com Custom web programming
guhzo_42@lnubb.pbz (rot13) User Management Solutions
------------------------------
Date: Fri, 18 Feb 2005 15:50:09 GMT
From: John H <john1976@hotmail.com>
Subject: simple map query
Message-Id: <R8oRd.38032$k4.741027@news1.nokia.com>
I have been trying to write a very simple statement using map.
It should check a condition and, if the condition fails, not push the
element into the result:
@Myarray= map {if($_!~/echo/),@fromarray
I am totally confused now. I have tried every damn combination of
brackets and it does not work.
It's pathetic now.
------------------------------
Date: Fri, 18 Feb 2005 16:57:52 +0100
From: phaylon <phaylon@dunkelheit.at>
Subject: Re: simple map query
Message-Id: <pan.2005.02.18.15.57.52.67116@dunkelheit.at>
John H wrote:
> @Myarray= map {if($_!~/echo/),@fromarray
perldoc -f map has many examples, such as:
%hash = map { getkey($_) => $_ } @array;
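therefore your statement, with the syntax fixed, would be:
@myarray = map { $_ !~ /echo/ } @fromarray;
but that returns one true/false value per element, not the elements
themselves. I think you're looking for perldoc -f grep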
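A short sketch of the difference, using made-up sample data. grep keeps the elements for which the block is true; map can also filter, but only if the block returns an empty list for the elements you want to drop:

```perl
use strict;
use warnings;

my @fromarray = ('echo hello', 'ls -l', 'echo bye', 'pwd');   # sample data

# grep keeps the elements for which the block is true.
my @kept = grep { $_ !~ /echo/ } @fromarray;

# map can filter too, by returning an empty list for unwanted elements.
my @same = map { /echo/ ? () : $_ } @fromarray;

print join(', ', @kept), "\n";
```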
hth,phay
--
http://www.dunkelheit.at/
I want, therefore I can.
------------------------------
Date: Fri, 18 Feb 2005 08:12:18 -0500
From: Sherm Pendley <spamtrap@dot-app.org>
Subject: Re: use strict; and O_WRONLY
Message-Id: <M66dnVXULuApdIjfRVn-3w@adelphia.com>
Thanks for the extended tour.
Tassilo v. Parseval wrote:
> const-xs.inc is the XS implementation of the 'constant' function as
> available from AUTOLOAD. 'XS_constant' then calls the C function
> 'constant' (defined in const-c.inc).
When I was dealing with constants in my XS code, I managed to follow things
up to this point...
> This one then works a bit like a
> finite state machine that does pattern matching on the name of the
> requested constant.
But this is where I got lost. When I created a new XS module with h2xs, it
had skeletons of both *.inc files, but I couldn't figure out how those were
generated, nor how to update them. I suppose there's a discussion about it
to be found in the p5p archives, but I didn't want to dive into Perl's
guts, all I wanted to do is export some constants.
Eventually, I gave up - I wrote a simple text file with the names of the
constants I wanted to export to Perl. For each name in it I call a short C
function that uses dlsym() to check for the existence and location of that
symbol. If it exists, I pass its value to newCONSTSUB() to create the
constant in Perl.
sherm--
--
Cocoa programming in Perl: http://camelbones.sourceforge.net
Hire me! My resume: http://www.dot-app.org
------------------------------
Date: Fri, 18 Feb 2005 13:37:39 GMT
From: nospam@geniegate.com
Subject: Re: Why aren't 'warnings' on by default?
Message-Id: <Lucy1108732493135340x1cd461c@air.tunestar.net>
In: <Jm_Md.148136$K7.19555@news-server.bigpond.net.au>, "Peter Wyzl" <wyzelli@yahoo.com> wrote:
>and look at the resultant errors/warnings.
>
>A recent example is a publicly available 'blogging' script that does use
>'my' declarations, but inappropriately (ie, declares the same variable name
>three times in the same scope - 'declaration masks earlier in same scope').
>That gives some idea about the types of problems to be encountered
>elsewhere. Gives you some idea of what the original programmer does or
>doesn't know.
That's actually one of the reasons I don't care for the -w switch (*EXCEPT* in
debugging). I don't know if it's still the case or not, but this used to
generate a warning:
    sub something {
        my($i);
        {
            my($i);
            # ... use inner $i ...
        }
        # ... use outer $i ...
    }
The way I see it, the above should be legal, and maybe even encouraged for
small index variables.
I see so much code with:
    my($var) = '';  # squelch warning about undefined variable
even when the program knows very well the value could be undefined and
doesn't care. Extra junk to silence warnings can sometimes make the code
harder to follow.
I use -w pretty much only when hunting down a bug. (I use it so seldom
that I sometimes forget it's available. I shouldn't do that, because it
is sometimes useful.)
OTOH, I'm quite fond of 'use strict'. I like the way it catches
problems in advance and makes you think twice before using a global
variable. Once in a while you have to do a
    {
        no strict 'refs';
        ...
    }
But, overall, the advantages of strict are well worth it.
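A typical case where that block is needed is installing subs by name. The sketch below is an illustration only; the field names and the get_ prefix are assumptions, not anything from the thread:

```perl
use strict;
use warnings;

# Generate simple accessor subs by name. Assigning to a glob through a
# string is a symbolic reference, so strict 'refs' is lifted only inside
# the smallest block that needs it.
for my $field (qw(name colour)) {      # hypothetical field names
    no strict 'refs';
    *{"main::get_$field"} = sub { my ($obj) = @_; return $obj->{$field} };
}

my $thing = { name => 'widget', colour => 'blue' };
print get_name($thing), ' is ', get_colour($thing), "\n";
```

Everything outside the loop body still runs under full strict.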
Just my opinion, of course. :-)
Jamie
--
http://www.geniegate.com Custom web programming
guhzo_42@lnubb.pbz (rot13) User Management Solutions
------------------------------
Date: 18 Feb 2005 14:00:18 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: Why aren't 'warnings' on by default?
Message-Id: <cv4sdi$f20$2@mamenchi.zrz.TU-Berlin.DE>
<nospam@geniegate.com> wrote in comp.lang.perl.misc:
> That's actually one of the reasons I don't care for the -w switch (*EXCEPT* in
> debugging). I don't know if it's still the case or not, but this used to
> generate a warning:
>
>     sub something {
>         my($i);
>         {
>             my($i);
>             # ... use inner $i ...
>         }
>         # ... use outer $i ...
>     }
It doesn't warn (why didn't you test it?) and it never did. That is
no reason not to use warnings.
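For anyone who wants to test it, here is one possible sketch. It compiles two snippets at run time and counts the warnings each produces; warnings_for is a hypothetical helper written for this demonstration:

```perl
use strict;
use warnings;

# Compile a snippet at run time and collect any warnings it produces.
# (warnings_for is a hypothetical helper for this demonstration.)
sub warnings_for {
    my ($code) = @_;
    my @seen;
    local $SIG{__WARN__} = sub { push @seen, @_ };
    eval $code;
    die $@ if $@;
    return @seen;
}

# Inner block redeclares $i: a new scope, so nothing is masked.
my @nested = warnings_for('my $i = 1; { my $i = 2; } 1;');

# Same scope redeclares $i: this is the case that warns.
my @same = warnings_for('my $i = 1; my $i = 2; 1;');

print "nested block: ", scalar(@nested), " warning(s)\n";
print "same scope:   ", scalar(@same),   " warning(s)\n";
print @same;
```

Only the redeclaration in the same scope should report that the variable masks an earlier declaration.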
Anno
------------------------------
Date: 18 Feb 2005 15:27:44 GMT
From: xhoster@gmail.com
Subject: Re: Why aren't 'warnings' on by default?
Message-Id: <20050218102744.624$Bt@newsreader.com>
nospam@geniegate.com wrote:
>
> I see so much code with:
>
> my($var) = '';  # squelch warning about undefined variable
Why is the comment necessary? Do you put a comment after every
semicolon saying "# To prevent syntax errors"?
If I see a variable being initialized to '' or 0, that gives me valuable
information about how the variable is going to be used. Not distracting
at all. (The unnecessary parentheses, on the other hand, do throw me
off.)
Xho
--
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 7798
***************************************