[24472] in Perl-Users-Digest
Perl-Users Digest, Issue: 6655 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat Jun 5 14:05:54 2004
Date: Sat, 5 Jun 2004 11:05:07 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Sat, 5 Jun 2004 Volume: 10 Number: 6655
Today's topics:
$NF for perl <usenet@molen.thuis.net>
Re: $NF for perl (Walter Roberson)
Re: $NF for perl <jurgenex@hotmail.com>
Re: a windows registry monitor (Malcolm Dew-Jones)
Re: Advice: hiding sensitive info used in devel ctcgag@hotmail.com
Re: Can one set perlvars on command line? <jeffrey.schwab@comcast.net>
Re: customize unix (Thorsten Gottschalk)
Re: Cute bit of Perl to Assign $1,$2 to named variables <jkrugman345@yahbitoo.com>
Re: Delete a line out of a flat file database <uri.guttman@fmr.com>
Re: Escaping single quotes with sql <lgoddard@cpan.org>
Re: How to get Win32 mouse pointer style (Bart Van der Donck)
How to use fork correctly! <luke@program.com.tw>
Re: How to use fork correctly! <usenet@morrow.me.uk>
Re: Kill a system process within the script <lgoddard@cpan.org>
Memory problem with XML::DOM::Parser??? <markus.mohr@mazimoi.de>
Re: Memory problem with XML::DOM::Parser??? <usenet@morrow.me.uk>
Re: Memory problem with XML::DOM::Parser??? <usenet@morrow.me.uk>
Re: perl style and returning from function? (Bill)
Re: perl style and returning from function? (Bill)
Re: perl style and returning from function? <usenet@morrow.me.uk>
Re: perl style and returning from function? <uri.guttman@fmr.com>
regexp problem in perl 5.6.1 and 5.8.4 <tstauffer@cas.org>
Re: Regexp: Lazy match workaround? (R. Rajesh Jeba Anbiah)
Re: Regexp: Lazy match workaround? nobull@mail.com
Simple regexp question (Marko)
Re: Simple regexp question <gnari@simnet.is>
Re: Simple regexp question <tadmc@augustmail.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Sat, 05 Jun 2004 13:19:48 GMT
From: Tabe Kooistra <usenet@molen.thuis.net>
Subject: $NF for perl
Message-Id: <pan.2004.06.05.13.19.32.286128@molen.thuis.net>
perldoc split reads:
"In scalar context, returns the number of fields found and
splits into the @_ array. Use of split in scalar context
is deprecated, however, because it clobbers your subroutine
arguments."
perl v5.8.2
somebody has a pointer or a hint to achieve this?
regards,
Tabe
------------------------------
Date: 5 Jun 2004 13:32:04 GMT
From: roberson@ibd.nrc-cnrc.gc.ca (Walter Roberson)
Subject: Re: $NF for perl
Message-Id: <c9si0k$brb$1@canopus.cc.umanitoba.ca>
In article <pan.2004.06.05.13.19.32.286128@molen.thuis.net>,
Tabe Kooistra <usenet@molen.thuis.net> wrote:
:perldoc split reads:
:"In scalar context, returns the number of fields found and
:splits into the @_ array.
:somebody has a pointer or a hint to achieve this?
You have not indicated what it is you want to achieve.
If what you want to get out is *just* the number of fields, then
just ensure that you are in a list context at the point the split
is evaluated, and then scalar() that context.
A couple of months ago, I saw someone here use an idiom about that.
It was something *like*
my $number_of_fields = scalar( () = split ... )
The assignment to () provided the list context.
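A minimal sketch of one way to get awk-style NF and $NF in Perl, assuming that is what the original question was after (the post does not say). It avoids scalar-context split by splitting into an array first; the sub name field_info is made up for illustration. One caveat about the idiom recalled above: as far as perlfunc describes it, a split assigned to a list gets an implicit LIMIT of one more than the number of variables, so with an empty list an explicit LIMIT of -1 would be needed to count all fields.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (number of fields, last field),
# roughly what awk calls NF and $NF.
sub field_info {
    my ($line) = @_;
    my @fields = split ' ', $line;   # awk-like whitespace splitting
    return (scalar @fields, $fields[-1]);
}

my ($nf, $last) = field_info("alpha beta  gamma");
print "NF=$nf last=$last\n";   # prints "NF=3 last=gamma"
```

Because the fields land in a lexical array, @_ is never touched.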
--
Contents: 100% recycled post-consumer statements.
------------------------------
Date: Sat, 05 Jun 2004 13:39:17 GMT
From: "Jürgen Exner" <jurgenex@hotmail.com>
Subject: Re: $NF for perl
Message-Id: <92kwc.6520$AU1.792@nwrddc01.gnilink.net>
Tabe Kooistra wrote:
> perldoc split reads:
>
> "In scalar context, returns the number of fields found and
> splits into the @_ array. Use of split in scalar context
> is deprecated, however, because it clobbers your subroutine
> arguments."
>
> perl v5.8.2
>
> somebody has a pointer or a hint to achieve this?
What is "this"? You didn't tell us what you want to achieve.
jue
------------------------------
Date: 5 Jun 2004 00:15:12 -0800
From: yf110@vtn1.victoria.tc.ca (Malcolm Dew-Jones)
Subject: Re: a windows registry monitor
Message-Id: <40c17300@news.victoria.tc.ca>
justme (eight02645999@yahoo.com) wrote:
: hi
: i am trying to code a small perl program to monitor the windows
: registry. The idea is to create a baseline on some keys like
: LOCAL_MACHINE or USERS, ( the whole registry would be too big ), where
: the RUN and RUNONCE keys are located.
: Then i would poll these registry locations and see if there are
: suspicious keys added by comparing it against the baseline. The script
: will be scheduled to check every once in a while. I have checked CPAN
: for Win32::Registry. I wonder if it is the right tool to help me in
: this purpose...?
: thanks
Actually, regedit can provide a text dump, .ini file style, of the
registry, and possibly portions of it. You might try just diff'ing one
dump with a previous. The output would be easy to archive, is self
documenting, and is in the required format to restore the original
settings.
(Of course that doesn't use perl except to glue the parts together.)
------------------------------
Date: 05 Jun 2004 16:57:24 GMT
From: ctcgag@hotmail.com
Subject: Re: Advice: hiding sensitive info used in devel
Message-Id: <20040605125724.928$rF@newsreader.com>
kj <socyl@987jk.com> wrote:
> I'm writing a library that is supposed to be customized with
> potentially sensitive info (passwords, etc.). All these variables
> are defined in a file MyModule/Config.pm:
>
> package MyModule::Config;
>
> our %Config = (
> user => 'yours_truly',
> password => 'topsecret',
> # etc., etc.
> );
>
> During development, my working copy of MyModule/Config.pm holds
> real values for various variables, which obviously I don't want to
> publicize. This means that, in order to build the distribution
> package for release, one of the things I must do is change all the
> values of these variables. Conversely, if I want to test a released
> version of our software, as stored in our CVS repository, I first
> must change the values of these variables back to those that make
> sense for our system. There is always a mismatch between what we
> release and what we use locally , and at least one of these must
> necessarily be different from what is stored in our CVS repository.
Make two copies of MyModule/Config.pm: one that has dummy data and is kept in
CVS and in your ordinary dev source tree, and another with the real data.
Make sure the path to the one with the real data comes before the path to
the dev source tree in @INC, so perl will find the right one.
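The @INC ordering described above can be sketched as follows. This is only an illustration under stated assumptions: the two temporary directories stand in for the CVS checkout and a private directory, and write_config and config_password are hypothetical helpers, not part of any real MyModule.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Hypothetical helper: write a MyModule/Config.pm under $root.
sub write_config {
    my ($root, $password) = @_;
    make_path("$root/MyModule");
    open my $fh, '>', "$root/MyModule/Config.pm" or die "open: $!";
    print {$fh} "package MyModule::Config;\n",
                "our %Config = (user => 'yours_truly', password => '$password');\n",
                "1;\n";
    close $fh or die "close: $!";
}

# Hypothetical accessor, so the lookup lives in one place.
sub config_password {
    no warnings 'once';
    return $MyModule::Config::Config{password};
}

my $dev_tree = tempdir(CLEANUP => 1);   # stands in for the CVS checkout
my $private  = tempdir(CLEANUP => 1);   # stands in for the private copy
write_config($dev_tree, 'dummy');
write_config($private,  'topsecret');

# Order matters: the private directory must come first in @INC.
unshift @INC, $private, $dev_tree;
require MyModule::Config;
print "password in use: ", config_password(), "\n";   # prints "password in use: topsecret"
```

In a real setup the private directory would more likely be added with "use lib" or the PERL5LIB environment variable on the development machines only.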
Xho
------------------------------
Date: Sat, 05 Jun 2004 12:00:52 -0400
From: Jeff Schwab <jeffrey.schwab@comcast.net>
Subject: Re: Can one set perlvars on command line?
Message-Id: <lP6dnRx94u-Nc1zdRVn-vA@comcast.com>
J Krugman wrote:
> Is there any trick to set a Perl variable (or any global variable,
> for that matter) on the command line, as in, for example:
>
> % perl {{{set $| to 1}}} somescript.pl --opt --flag arg
>
> so that myscript.pl started out with $| having value 1?
It's *definitely* cheating, but you could use the -i flag to set $^I...
------------------------------
Date: 5 Jun 2004 10:19:03 -0700
From: iqrity@web.de (Thorsten Gottschalk)
Subject: Re: customize unix
Message-Id: <beb38f7.0406050919.130dd0aa@posting.google.com>
BM> I don't usually say this, but this strikes me as a better job for a
BM> shell script than Perl...
Yes, I also thought this at first, but this script includes a kind of user
dialog, and I like perl for text user dialogs. The changes I
described above are only a small part of the whole script.
Also, I install both solaris and linux, so I hoped to make things easier by
using "a common" interface (perl) to access the files, for example.
lostriver <vladimir@NoSpamPLZ.net> wrote in message news:<vK%vc.65846$F75.715777@weber.videotron.net>...
> On 4 Jun 2004 05:46:38 -0700, Thorsten Gottschalk wrote:
> > Hello all,
> >
> > I normal install 1-2 unix systems per week.
>
> So you basicaly doing a monkey job over and over and over.....
> Do not waste time and learn how to setup hands off install
> servers. Search on www.google.com for 'jumpstart' if your
> 'unix' is Solaris and 'kickstart' if it is Linux.
>
> good luck.
I never said that I always install the same system. The systems are
for different purposes. I do use kickstart and jumpstart, but I
couldn't find any information on how to change /etc/system with jumpstart
WITHOUT writing a shell script and integrating this script into
jumpstart. Any additional information would be very helpful.
Thanks.
ciao
Thorsten
------------------------------
Date: Sat, 5 Jun 2004 17:31:19 +0000 (UTC)
From: J Krugman <jkrugman345@yahbitoo.com>
Subject: Re: Cute bit of Perl to Assign $1,$2 to named variables
Message-Id: <c9t017$eql$1@reader2.panix.com>
In <jm01c0195sg795e2e3dedtiv9eu2nuqcup@4ax.com> zzapper <david@tvis.co.uk> writes:
>Hey guys I surrender!!!
I hear you, zzapper! I love reading clpm, but one thing that has
always puzzled me about it is all the gratuitous aggressiveness
one comes across here. It's peculiar to clpm; I don't detect
anything like it in other computer language newsgroups that I read.
This is not to say that all the posts in clpm (or even a majority
of them) are nasty; on the contrary, there's *tons* of helpful,
friendly (not to mention knowledgeable and clever) stuff posted
here, for which I'm immensely grateful.
What I'm saying is just that the number of unnecessarily aggressive
posts seems to me well above average for this type of newsgroup, at
least in my experience.
In fewer words, people: chill a little.
(Of course, it goes without saying, I'll get roasted for saying
this.)
jill
--
To s&e^n]d me m~a}i]l r%e*m?o\v[e bit from my a|d)d:r{e:s]s.
------------------------------
Date: Fri, 04 Jun 2004 17:36:48 -0400
From: Uri Guttman <uri.guttman@fmr.com>
Subject: Re: Delete a line out of a flat file database
Message-Id: <liwu2nkntb.fsf@fmr.com>
>>>>> "KC" == Kevin Collins <spamtotrash@toomuchfiction.com> writes:
>>> >
>>> > None of those parens are necessary.
>>>
>>> But they do add clarity
>>
>> That's debatable.
KC> If you use parens, there can be no ambiguity...
>>> and help make precedence more obvious.
>>
>> "Learn the language". :-)
KC> Well, then why bother with indenting, white-space, comments, etc?
KC> If you "know the language", why use any of those constructs - they
KC> are mostly not required?
then you might as well put parens around everything and become lisp.
parens are best used when needed and they can hide (or make noisy) stuff
when they are not. i still do push( @foo, $bar ) for style reasons and
not from any ambiguity thing. the code (and reader) should be expected
to know basic perl precedence rules. baby code is ok when you start out
but you should graduate to cleaner code where you assume the reader has
enough skill to follow basic perl.
uri
------------------------------
Date: Sat, 05 Jun 2004 17:52:48 +0100
From: lee <lgoddard@cpan.org>
Subject: Re: Escaping single quotes with sql
Message-Id: <40C1FA60.6030403@cpan.org>
David Irving wrote:
> This is what I do:
>
> $myvariable =~ s/\\\'/\\\'\\\'/g;
I'm really lazy: I use $dbh->quote().
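For readers without a database handle to hand, here is a rough sketch of the effect $dbh->quote() has on a simple string. The sub sql_quote_sketch is hypothetical and only imitates the common behaviour (double each single quote, wrap in quotes, map undef to NULL); real drivers may quote differently, so the DBI method or placeholders remain the thing to use in actual code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for what $dbh->quote() does for simple strings:
# double every single quote and wrap the result in single quotes.
sub sql_quote_sketch {
    my ($value) = @_;
    return 'NULL' unless defined $value;
    (my $escaped = $value) =~ s/'/''/g;
    return "'$escaped'";
}

print sql_quote_sketch("O'Reilly"), "\n";   # prints 'O''Reilly'
```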
Lee Goddard
------------------------------
Date: 5 Jun 2004 04:03:28 -0700
From: bart@nijlen.com (Bart Van der Donck)
Subject: Re: How to get Win32 mouse pointer style
Message-Id: <b5884818.0406050303.5cd15625@posting.google.com>
Tom wrote...
> Does anyone know how to query the mouse pointer style (hourglass,
> pointer, hand, etc) from a perl program on Win32? I assume there's a
> Win32 API function for this, but I can't find it.
There is a module Tk::CursorControl, info at:
http://cpan.uwinnipeg.ca/search?query=CursorControl&mode=module
If you are outputting HTML, you can use a CSS sheet to set cursors in IE.
Bart
------------------------------
Date: Sat, 5 Jun 2004 15:48:16 +0800
From: "news.hinet.net" <luke@program.com.tw>
Subject: How to use fork correctly!
Message-Id: <c9rtt9$d7e@netnews.hinet.net>
When I fork 5 children, the process terminates correctly.
But if I fork 25 children to do the same thing, the program
does not terminate correctly.
How do I use fork correctly?
ps: OS(win2000)
========================================
use IO::Socket::INET;
foreach my $child ((0..25)) {
if (fork() == 0) {
$| = 1;
print "Child $child trying to connect\n";
my $sock = IO::Socket::INET->new("www.hinet.net:80")
or die "Could not create connection\n";
$header=<<"ENDSTR";
GET / HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
application/vnd.ms-powerpoint, application/vnd.ms-excel, application/msword,
application/x-shockwave-flash, */*
Accept-Language: zh-tw
Accept-Encoding: gzip
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; Q312461)
Connection: close
ENDSTR
print $sock "$header";
@res=<$sock>;
open (handle,">$child");
print handle join('',@res);
close(handle);
$sock->close;
#print "Child $child exiting\n";
exit 0;
}
}
wait foreach ((0..25));
print "Parent exiting\n";
exit 0;
------------------------------
Date: Sat, 5 Jun 2004 14:35:16 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: How to use fork correctly!
Message-Id: <c9sln4$rar$1@wisteria.csv.warwick.ac.uk>
Quoth "news.hinet.net" <luke@program.com.tw>:
> When I fork 5 children, the process terminates correctly.
> But if I fork 25 children to do the same thing, the program
> does not terminate correctly.
>
> How do I use fork correctly?
>
> ps: OS(win2000)
You may have trouble with fork under Win32: it is not a real fork, perl
emulates it using threads. You may (or may not) have more luck with
5.8's ithreads mechanism, which provides more control.
> ========================================
> use IO::Socket::INET;
>
> foreach my $child ((0..25)) {
> if (fork() == 0) {
You need to remember the pid, and also to check whether fork failed.
> $| = 1;
> print "Child $child trying to connect\n";
> my $sock = IO::Socket::INET->new("www.hinet.net:80")
> or die "Could not create connection\n";
>
> $header=<<"ENDSTR";
> GET / HTTP/1.1
Don't do this: use LWP.
> wait foreach ((0..25));
You would be better using waitpid to check that each child has in fact
terminated.
> print "Parent exiting\n";
> exit 0;
I think you want something more like (untested):
    use strict;
    use warnings;

    use LWP::UserAgent;

    my $UA = LWP::UserAgent->new(
        agent =>
            'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; Q312461)',
    );

    my @Accept = qw[
        image/gif image/x-xbitmap image/jpeg image/pjpeg
        application/vnd.ms-powerpoint application/vnd.ms-excel application/msword
        application/x-shockwave-flash */*
    ];

    my %headers = (
        accept_language => 'zh-tw',
        accept_encoding => 'gzip',
        accept          => \@Accept,
    );

    my @kids;

    for my $child (0..25) {
        my $kid = fork;
        defined $kid or warn "fork failed: $!" and last;

        # parent: remember the child's pid and carry on forking
        if ($kid) {
            $kids[$child] = $kid;
            next;
        }

        # child: fetch the page into a file named after the child number
        my $res = $UA->get( 'http://www.hinet.net/',
            ':content_file' => $child,
            %headers,
        );
        $res->is_success or die "Request failed: " . $res->message;
        exit 0;
    }

    # parent: wait for each child by pid and report any failure
    for (0..$#kids) {
        my $pid = waitpid $kids[$_], 0;
        $pid == 0 and warn "child $_ appears to have vanished...";
        $pid < 0  and warn "can't wait for child $_: $!";
        $?        and warn "child $_ failed: $?";
    }

    __END__
Ben
--
Joy and Woe are woven fine,
A Clothing for the Soul divine William Blake
Under every grief and pine 'Auguries of Innocence'
Runs a joy with silken twine. ben@morrow.me.uk
------------------------------
Date: Sat, 05 Jun 2004 17:51:32 +0100
From: lee <lgoddard@cpan.org>
Subject: Re: Kill a system process within the script
Message-Id: <40C1FA14.7080100@cpan.org>
Mav wrote:
> I am trying to launch a command in my perl script using a system call.
> I wonder whether there is a way that, when someone hits Ctrl-Y, it will
> kill my script and also kill that system call process as well.
Yes.
When you launch the process, note its process ID.
Then have your script "trap" the control character in question,
and when that character is received, call the system 'kill' command
on the previously noted PID, as described by someone else on
this thread.
See, for example, perldoc -q signal:
Found in .... pod/perlfaq8 :
How do I trap control characters/signals?
    $Interrupted = 0;   # to ensure it has a value
    $SIG{INT} = sub {
        $Interrupted++;
        syswrite(STDERR, "ouch\n", 5);
    };
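The steps described above (note the PID, trap the signal, kill the child) might be put together like this. It is a sketch under assumptions: a Unix-like system, fork/exec used instead of system() so that the PID is known, and SIGINT (Ctrl-C) as the trapped signal, since how Ctrl-Y is delivered depends on the platform. The sub names start_child and stop_child are invented for the example.

```perl
use strict;
use warnings;
use POSIX ();

my $child_pid;   # PID of the command currently running, if any

# Start an external command without waiting for it; returns its PID.
sub start_child {
    my @cmd = @_;
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        exec { $cmd[0] } @cmd or POSIX::_exit(127);   # only reached if exec fails
    }
    return $pid;
}

# Send TERM to the child, reap it, and return the signal that ended it.
sub stop_child {
    my ($pid) = @_;
    kill 'TERM', $pid;
    waitpid $pid, 0;
    return $? & 127;
}

# On SIGINT, take the child down before exiting ourselves.
$SIG{INT} = sub {
    stop_child($child_pid) if defined $child_pid;
    exit 1;
};

# Typical use:
#   $child_pid = start_child('sleep', '60');
#   waitpid $child_pid, 0;
```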
Lee Goddard
------------------------------
Date: Sat, 05 Jun 2004 15:43:53 +0200
From: Markus Mohr <markus.mohr@mazimoi.de>
Subject: Memory problem with XML::DOM::Parser???
Message-Id: <8gj3c0tptu0rqlr9pr45oskhq6gpf007ov@4ax.com>
He, everybody,
I'm having a big problem when it comes to parsing a large file with
the ActiveState XML-DOM 1.43 XML-Parser: It consumes a hell of a lot
of memory, raises the CPU of the computer to 100 % and takes a very
long time to handle files of "merely" 500 kB size.
Is there any way to speed things up?
Sincerely
Markus Mohr
------------------------------
Date: Sat, 5 Jun 2004 14:48:51 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: Memory problem with XML::DOM::Parser???
Message-Id: <c9smgj$rl0$1@wisteria.csv.warwick.ac.uk>
Quoth markus.mohr@mazimoi.de:
> He, everybody,
>
> I'm having a big problem when it comes to parsing a large file with
> the ActiveState XML-DOM 1.43 XML-Parser: It consumes a hell of a lot
> of memory, raises the CPU of the computer to 100 % and takes a very
> long time to handle files of "merely" 500 kB size.
>
> Is there any way to speed things up?
I would have a look to see if XML::LibXML2 or XML::Xerces could be used
instead. Unfortunately their APIs are both different from XML::DOM's,
but they should be substantially faster. XML::DOM does its DOM
processing in Perl, based on the callbacks provided by the Expat XML
parser; the other two libraries parse, build the DOM and manipulate it
directly in C(++).
Now, what Perl could really do with is a standard DOM API like
XML::SAX... :)
Ben
--
"If a book is worth reading when you are six, * ben@morrow.me.uk
it is worth reading when you are sixty." - C.S.Lewis
------------------------------
Date: Sat, 5 Jun 2004 14:50:53 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: Memory problem with XML::DOM::Parser???
Message-Id: <c9smkd$rl0$2@wisteria.csv.warwick.ac.uk>
Quoth Ben Morrow <usenet@morrow.me.uk>:
>
> I would have a look to see if XML::LibXML2
                                          ^^
Of course, I just meant XML::LibXML... the C lib is called libxml2.
Ben
--
The cosmos, at best, is like a rubbish heap scattered at random.
- Heraclitus
ben@morrow.me.uk
------------------------------
Date: 4 Jun 2004 21:34:24 -0700
From: wherrera@lynxview.com (Bill)
Subject: Re: perl style and returning from function?
Message-Id: <239ce42f.0406042034.57684381@posting.google.com>
Ben Morrow <usenet@morrow.me.uk> wrote in message news:<c9q45s$9cb$2@wisteria.csv.warwick.ac.uk>...
> Quoth Uri Guttman <uri@stemsystems.com>:
> > >>>>> "BM" == Ben Morrow <usenet@morrow.me.uk> writes:
> >
> > BM> Quoth wherrera@lynxview.com (Bill):
> > >> What do you think is better:
> > >>
> > >> ...
> > >> return ($rc and $rc == 1) ? 1 : 0;
> > >> }
> > >>
> > >> or
> > >>
> > >> $rc and $rc == 1 and return 1;
> > >> return 0;
> > >> }
> >
> > BM> return $rc == 1 ? 1 : 0;
> >
> > why not just return $rc == 1? the OP didn't specify what the false value
> > must be.
>
> Yes he did. He specified 0. (IMO that is a bad specification, but... :)
>
> > BM> Perl is not C. A value of undef will quite happily be != 0.
> >
> > but it will trigger an uninitialized warning.
>
> True. I tend to turn those off, so I tend to forget about them... undef
> is too useful to be warned about :).
>
> > but the OP wasn't testing for undef.
>
> No... I was puzzled as to what he thought he *was* testing for, though.
To clarify, this is a database access function that should modify
exactly one row. In an error state relative to the intended outcome,
no rows or more than one row could have been modified.
$rc could be undef, or defined but not one, which would still be an
error. The function is specified to return a boolean, defined value
that is 1 if $rc is 1 and false (but defined) otherwise. I guess
either 0 or '' should work?
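One way to meet that specification without tripping the uninitialized warning is to test definedness first. A minimal sketch; the sub name modified_exactly_one is hypothetical, and $rc is assumed to be whatever row count the database layer hands back.

```perl
use strict;
use warnings;

# Hypothetical name; $rc is the row count reported by the database layer.
# Returns 1 when exactly one row was modified, 0 (defined) otherwise.
sub modified_exactly_one {
    my ($rc) = @_;
    return (defined $rc && $rc == 1) ? 1 : 0;
}

print modified_exactly_one(1), modified_exactly_one(0),
      modified_exactly_one(undef), modified_exactly_one(2), "\n";   # prints 1000
```

Returning '' instead of 0 would be just as false and just as defined; 0 merely prints more readably.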
------------------------------
Date: 4 Jun 2004 21:47:57 -0700
From: wherrera@lynxview.com (Bill)
Subject: Re: perl style and returning from function?
Message-Id: <239ce42f.0406042047.2c82430a@posting.google.com>
anno4000@lublin.zrz.tu-berlin.de (Anno Siegel) wrote in message news:<c9qple$qv8$1@mamenchi.zrz.TU-Berlin.DE>...
> >
> > $rc == 1 and return 1;
> > return;
> >
> > which will always return false, even if called in list context.
>
> I'm not sure about this rule in this case. When a function is described
> as returning a boolean value, it should return one, imho. I'd like
> "map boolfunc( $_), @list" to return as many "true"s and "false"s as
> @list has elements, not an indefinite number of all "true"s.
>
> The rule is well applied where the blank return indicates failure to
> deliver the normal result, but not when a result is expected in any case.
>
> Anno
It seems that Perl 6 will have a 'boolean context' operator, so that
(I think)
? $rc and $rc == 1
will always return a boolean, which I guess means it will return
either 1 or 0. Can Perl 5 be stretched onto that rack? :)
------------------------------
Date: Sat, 5 Jun 2004 05:22:41 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: perl style and returning from function?
Message-Id: <c9rlb1$cnv$1@wisteria.csv.warwick.ac.uk>
Quoth wherrera@lynxview.com (Bill):
> anno4000@lublin.zrz.tu-berlin.de (Anno Siegel) wrote in message news:<c9qple$qv8$1@mamenchi.zrz.TU-Berlin.DE>...
> > >
> > > $rc == 1 and return 1;
> > > return;
> > >
> > > which will always return false, even if called in list context.
> >
> > I'm not sure about this rule in this case. When a function is described
> > as returning a boolean value, it should return one, imho. I'd like
> > "map boolfunc( $_), @list" to return as many "true"s and "false"s as
> > @list has elements, not an indefinite number of all "true"s.
> >
> > The rule is well applied where the blank return indicates failure to
> > deliver the normal result, but not when a result is expected in any case.
>
> It seems that Perl 6 will have a 'boolean context' operator, so that
> (I think)
>
> ? $rc and $rc == 1
>
> will always return a boolean, which I guess means it will return
> either 1 or 0.
^^ undef, not 0, I'd guess.
> Can Perl 5 be stretched onto that rack? :)
!! $rc will boolify $rc. What would be more useful in this context would
be if (I don't know whether this will happen or not) the explicit numify
operator '+' *didn't* warn about conversion from undef, so that
+$rc == 1
would do what is required with no warning.
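To make the two points in this subthread concrete, the sketch below contrasts a bare return with a !!-boolified return when the function is called in list context, as in Anno's map example. The sub names are invented for illustration.

```perl
use strict;
use warnings;

# Bare return: yields the empty list in list context.
sub bare_return   { my ($rc) = @_; return 1 if defined $rc && $rc == 1; return; }

# !! boolifies: always exactly one value, either 1 or the empty string.
sub always_scalar { my ($rc) = @_; return !!(defined $rc && $rc == 1); }

my @rcs    = (1, 0, undef, 1);
my @bare   = map { bare_return($_) }   @rcs;   # false results vanish
my @scalar = map { always_scalar($_) } @rcs;   # one value per input

print scalar(@bare), " vs ", scalar(@scalar), "\n";   # prints "2 vs 4"
```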
Ben
--
Although few may originate a policy, we are all able to judge it.
- Pericles of Athens, c.430 B.C.
ben@morrow.me.uk
------------------------------
Date: Fri, 04 Jun 2004 11:59:37 -0400
From: Uri Guttman <uri.guttman@fmr.com>
Subject: Re: perl style and returning from function?
Message-Id: <liy8n3iaae.fsf@fmr.com>
>>>>> "BM" == Ben Morrow <usenet@morrow.me.uk> writes:
>> i prefer:
>>
>> return 1 if $rc == 1 ;
>> return ;
>>
>> then the returns line up prettily :)
BM> Yes. I would also be inclined to use
BM> $rc == 1 ? return 1
BM> : return;
gack! side effects (in this case flow control) inside ?: is fugly and
nasty! and the return token is duplicated. my rule is to only use ?:
for a value which is its intent. we see newbies all the time doing
assignments in ?: and not getting the precedence issues.
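The precedence trap mentioned above can be seen in a few lines. This is an illustrative sketch with made-up sub names: because ?: binds tighter than assignment, the first sub does not do what it appears to.

```perl
use strict;
use warnings;

# Looks like "set $p to 1 if $flag, else set $q to 2", but parses as
#   ($flag ? ($p = 1) : $q) = 2;
sub surprising {
    my ($flag) = @_;
    my ($p, $q) = (0, 0);
    $flag ? $p = 1 : $q = 2;
    return ($p, $q);
}

# Plain if/else says what was meant.
sub intended {
    my ($flag) = @_;
    my ($p, $q) = (0, 0);
    if ($flag) { $p = 1 } else { $q = 2 }
    return ($p, $q);
}

print join(',', surprising(1)), " ", join(',', intended(1)), "\n";   # prints "2,0 1,0"
```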
uri
------------------------------
Date: Fri, 04 Jun 2004 14:48:02 -0400
From: Thomas Stauffer <tstauffer@cas.org>
Subject: regexp problem in perl 5.6.1 and 5.8.4
Message-Id: <40C0C3E2.3090400@cas.org>
I have done some Perl programming in the past but I am by no means an
expert. I am currently working on changing some code written some time
ago by an employee no longer with the company. The code is currently
running under 5.005.02. I am making changes and adding some ucs2 ->
utf8 conversion. I want to run the code under Perl 5.8.4 to take
advantage of Perl's internal Unicode support. At any rate, there is a
regular expression in the code that works fine under 5.005.02 but loops
under 5.6.1 and above. The following code illustrates the problem:
$orig_string = 'JKXXAF';
$regex = qr {\G
# Match as many characters as possible
# that can be passed thru as-is
([^\x00-\xFF]+)
# Then try to match $A1 and next two bytes
| (@..)
# Otherwise just get the next byte
| (.)
}sx;
print "regex = $regex\n";
while ($orig_string =~ /$regex/g) {
print "\$1=$1\n";
print "\$2=$2\n";
print "\$3=$3\n";
}
The problem seems to be with the use of the \G attribute. If I take it
out, the regular expression works the same in all versions of Perl.
However, since I did not write the code and the programmer who did was
considerably more experienced using Perl than I am, I am hesitant just
to remove it. Anyhow, I have been looking at this for several days
without success. My Perl expert suggested I post it to this forum. Any
help would be greatly appreciated.
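One thing that may be worth trying, offered as a guess rather than a diagnosis: in the pattern above \G belongs only to the first alternative, and perlop notes that \G is only properly supported when anchored at the very beginning of the pattern. The sketch below groups the alternatives so the anchor applies to all of them. The sub name tokenize and the token labels are invented, and the at-sign is escaped to keep it from being read as an array.

```perl
use strict;
use warnings;

# Hypothetical tokenizer: same three alternatives, but grouped so that
# \G anchors every one of them.
sub tokenize {
    my ($string) = @_;
    my $regex = qr{\G(?:
          ([^\x00-\xFF]+)   # run of characters above 0xFF, passed through as-is
        | (\@..)            # at-sign plus the next two characters
        | (.)               # otherwise a single character
    )}sx;
    my @tokens;
    while ($string =~ /$regex/g) {
        if    (defined $1) { push @tokens, [ wide => $1 ] }
        elsif (defined $2) { push @tokens, [ at   => $2 ] }
        else               { push @tokens, [ byte => $3 ] }
    }
    return @tokens;
}

print join(' ', map { "$_->[0]:$_->[1]" } tokenize('JK@AFX')), "\n";
# prints "byte:J byte:K at:@AF byte:X"
```

Testing which capture is defined also avoids printing undefined $1, $2 or $3 on each pass.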
Following are the details of the version of Perl I'm using:
Summary of my perl5 (revision 5 version 8 subversion 4) configuration:
Platform:
osname=solaris, osvers=2.8, archname=sun4-solaris
uname='sunos cwu21awu 5.8 generic_108528-29 sun4u sparc
sunw,sun-blade-100 '
config_args=''
hint=recommended, useposix=true, d_sigaction=define
usethreads=undef use5005threads=undef useithreads=undef
usemultiplicity=undef
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='/opt/SUNWspro/bin/cc', ccflags =' -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64',
optimize='-O',
cppflags=''
ccversion='Sun WorkShop 6 update 2 C 5.3 Patch 111679-08
2002/05/09', gccversion='', gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=4321
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='/opt/SUNWspro/bin/cc', ldflags =' -L/usr/lib -L/usr/ccs/lib
-L/opt/SUNWspro/WS6U2/lib -L/usr/local/lib '
libpth=/usr/lib /usr/ccs/lib /opt/SUNWspro/WS6U2/lib /usr/local/lib
libs=-lsocket -lnsl -ldl -lm -lc
perllibs=-lsocket -lnsl -ldl -lm -lc
libc=/lib/libc.so, so=so, useshrplib=false, libperl=libperl.a
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
cccdlflags='-KPIC', lddlflags='-G -L/usr/lib -L/usr/ccs/lib
-L/opt/SUNWspro/WS6U2/lib -L/usr/local/lib'
Characteristics of this binary (from libperl):
Compile-time options: USE_LARGE_FILES
Built under solaris
Compiled at Apr 22 2004 16:07:19
@INC:
/usr/local/perl5/lib/5.8.4/sun4-solaris
/usr/local/perl5/lib/5.8.4
/usr/local/perl5/lib/site_perl/5.8.4/sun4-solaris
/usr/local/perl5/lib/site_perl/5.8.4
/usr/local/perl5/lib/site_perl
------------------------------
Date: 5 Jun 2004 00:24:27 -0700
From: ng4rrjanbiah@rediffmail.com (R. Rajesh Jeba Anbiah)
Subject: Re: Regexp: Lazy match workaround?
Message-Id: <abc4d8b8.0406042324.2e93ecfc@posting.google.com>
Brian McCauley <nobull@mail.com> wrote in message news:<u98yf3e0cx.fsf@wcl-l.bham.ac.uk>...
> Brian McCauley <nobull@mail.com> writes:
>
> > Consider
> >
> > ' A C !' =~ /(A.*?)+.*!/;
> >
> > Here the repeated group matches only 'A'. It does not match the 'C'
> > because the non-greedyness of the '*?' is more important than the
> > greedyness of the '+'.
>
> I meant, of course, consider
>
> ' A C !' =~ /(\w.*?)+.*!/;
>
> Obviously /A/ won't match 'C' ever!
Again, many thanks to all the experts. I understand what you mean,
for example in the following case:
Target string: XabcABCX
Regex Pattern: /X(abc)+X/i
Matches : XabcABCX, ABC
NOT: XabcABCX, abc, ABC
^^^
Here, only 'ABC' gets matched, but not the first 'abc'. This
behavior is indeed a bit difficult to understand :-(
------------------------------
Date: 5 Jun 2004 09:52:06 -0700
From: nobull@mail.com
Subject: Re: Regexp: Lazy match workaround?
Message-Id: <4dafc536.0406050852.58675c7e@posting.google.com>
ng4rrjanbiah@rediffmail.com (R. Rajesh Jeba Anbiah) wrote in message news:<abc4d8b8.0406042324.2e93ecfc@posting.google.com>...
> Brian McCauley <nobull@mail.com> wrote in message news:<u98yf3e0cx.fsf@wcl-l.bham.ac.uk>...
> > consider
> >
> > ' A C !' =~ /(\w.*?)+.*!/;
> >
> > Here the repeated group matches only 'A'. It does not match the 'C'
> > because the non-greedyness of the '*?' is more important than the
> > greedyness of the '+'.
>
> Again, many thanks to all the experts. I understand what you mean,
> for example in the following case:
> Target string: XabcABCX
> Regex Pattern: /X(abc)+X/i
> Matches : XabcABCX, ABC
> NOT: XabcABCX, abc, ABC
> ^^^
> Here, only 'ABC' gets matched, but not the first 'abc'. This
> behavior is indeed a bit difficult to understand :-(
Indeed it would be - but that is not what happens. Go back and
re-read what Anno said.
The repeated capturing subexpression /(abc)/i does indeed match and
capture both 'abc' and then also 'ABC'. But upon completion of the
pattern match the special variable $1 ( or the first element of the
list context value of the m// operator ) will contain the _last_ thing
to be captured (i.e. 'ABC').
The only way you could see that 'abc' had been captured would be to
look at the value of $1 part way through the pattern match operation.
This is where (?{}) would come in.
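The "last capture wins" behaviour is easy to check. The sketch below does not use (?{}); it swaps in a simpler technique, matching the group on its own with /g, to show that both repetitions are there to be found.

```perl
use strict;
use warnings;

my $target = 'XabcABCX';

# A repeated group keeps only its final repetition in $1.
my ($last) = $target =~ /X(abc)+X/i;

# Matching the group on its own with /g returns every occurrence.
my @all = $target =~ /(abc)/gi;

print "last=$last all=@all\n";   # prints "last=ABC all=abc ABC"
```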
------------------------------
Date: 5 Jun 2004 01:39:57 -0700
From: marko_1978@suomi24.fi (Marko)
Subject: Simple regexp question
Message-Id: <f998eb80.0406050039.6cfd4dfd@posting.google.com>
I have a very tiny problem with regexp and I'm totally stuck.
If I have several sentences of text, how can I extract (for
example) the first two sentences? Assume that a sentence can end
with a dot (one "." or several, ".." or "..."), a question mark ("?") or
an exclamation mark ("!").
Thanks for help.
------------------------------
Date: Sat, 5 Jun 2004 09:08:04 -0000
From: "gnari" <gnari@simnet.is>
Subject: Re: Simple regexp question
Message-Id: <c9s2e7$8vh$1@news.simnet.is>
"Marko" <marko_1978@suomi24.fi> wrote in message
news:f998eb80.0406050039.6cfd4dfd@posting.google.com...
> I have a very tiny problem with regexp and I'm totally stuck.
> If I have several sentences of text, how can I extract (for
> example) the first two sentences? Assume that a sentence can end
> with a dot (one "." or several, ".." or "..."), a question mark ("?") or
> an exclamation mark ("!").
split /\.+|\?|!/;
gnari
------------------------------
Date: Sat, 5 Jun 2004 08:39:00 -0500
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: Simple regexp question
Message-Id: <slrncc3j7k.1p1.tadmc@magna.augustmail.com>
Marko <marko_1978@suomi24.fi> wrote:
> If I have several sentences of text, how can I extract (for
> example) the first two sentences? Assume that a sentence can end
> with a dot (one "." or several, ".." or "..."), a question mark ("?") or
> an exclamation mark ("!").
my @sentences = $text =~ /\s*([^.?!]+[.?!]+)/g;
Will fail with a sentence like
Mr. Marko does Perl.
and many others...
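Wrapped in a small sub, the pattern above gives the first N sentences directly. A minimal sketch: first_sentences is a hypothetical name, and the same limitation applies (abbreviations such as "Mr." are treated as sentence ends).

```perl
use strict;
use warnings;

# Hypothetical helper: return up to $count leading sentences of $text,
# where a sentence ends in one or more of . ? !
sub first_sentences {
    my ($text, $count) = @_;
    my @sentences = $text =~ /\s*([^.?!]+[.?!]+)/g;
    my $last = $count < @sentences ? $count - 1 : $#sentences;
    return @sentences[0 .. $last];
}

print join('|', first_sentences("Hello there. How are you?? Fine...", 2)), "\n";
# prints "Hello there.|How are you??"
```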
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 6655
***************************************