[31093] in Perl-Users-Digest
Perl-Users Digest, Issue: 2338 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Apr 14 14:14:30 2009
Date: Tue, 14 Apr 2009 11:14:18 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Tue, 14 Apr 2009 Volume: 11 Number: 2338
Today's topics:
Regex question. Oh I so cannot do regular expression ma <cdalten@gmail.com>
Re: Regex question. Oh I so cannot do regular expressio <Ansher.M@gmail.com>
Re: Regex question. Oh I so cannot do regular expressio <Ansher.M@gmail.com>
Re: Regex question. Oh I so cannot do regular expressio <uri@stemsystems.com>
Re: Regex question. Oh I so cannot do regular expressio (Darren Dunham)
Re: Regex question. Oh I so cannot do regular expressio <jurgenex@hotmail.com>
Re: Regex question. Oh I so cannot do regular expressio <cdalten@gmail.com>
Re: Regex question. Oh I so cannot do regular expressio <cdalten@gmail.com>
Re: Regex question. Oh I so cannot do regular expressio <uri@stemsystems.com>
Re: Regex question. Oh I so cannot do regular expressio <cdalten@gmail.com>
Re: Regex question. Oh I so cannot do regular expressio <uri@stemsystems.com>
Split Function <Ansher.M@gmail.com>
Re: Split Function <rvtol+usenet@xs4all.nl>
Re: Split Function <Ansher.M@gmail.com>
Re: Split Function <rvtol+usenet@xs4all.nl>
Re: XML::LibXML UTF-8 toString() -vs- nodeValue() <hjp-usenet2@hjp.at>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Tue, 14 Apr 2009 09:08:37 -0700 (PDT)
From: grocery_stocker <cdalten@gmail.com>
Subject: Regex question. Oh I so cannot do regular expression matching.
Message-Id: <47cc8d10-4633-4ad5-94a0-29ef6e255e89@x29g2000prf.googlegroups.com>
I can't seem to get perl to match both the words 'chad' AND 'party'
in the string "chad ttyp0 party". Below is what I attempted.
[cdalten@localhost oakland]$ more match.pl
#!/usr/bin/perl
use warnings;
#$string = `w | grep cdalten | grep telnet`;
$test = "chad ttyp0 party";
if ("$test" =~/(\bchad\b)(\bparty\b)/) {
print "true \n";
}
[cdalten@localhost oakland]$ ./match.pl
[cdalten@localhost oakland]$
What am I doing wrong?
------------------------------
Date: Tue, 14 Apr 2009 09:19:56 -0700 (PDT)
From: perl Newbie <Ansher.M@gmail.com>
Subject: Re: Regex question. Oh I so cannot do regular expression matching.
Message-Id: <a33a2bb6-0fc5-45b5-a2d6-65a513f7797d@r8g2000yql.googlegroups.com>
On Apr 14, 9:08 pm, grocery_stocker <cdal...@gmail.com> wrote:
> I can't seem to get perl to match both the words 'chad' AND 'party'
> in the string "chad ttyp0 party". Below is what I attempted.
>
> [cdalten@localhost oakland]$ more match.pl
> #!/usr/bin/perl
> use warnings;
>
> #$string = `w | grep cdalten | grep telnet`;
>
> $test = "chad ttyp0 party";
>
> if ("$test" =~/(\bchad\b)(\bparty\b)/) {
>     print "true \n";}
>
> [cdalten@localhost oakland]$ ./match.pl
> [cdalten@localhost oakland]$
>
> What am I doing wrong?
Add OR condition in your code
if ("$test" =~/(\bchad\b) || (\bparty\b)/)
------------------------------
Date: Tue, 14 Apr 2009 09:25:24 -0700 (PDT)
From: perl Newbie <Ansher.M@gmail.com>
Subject: Re: Regex question. Oh I so cannot do regular expression matching.
Message-Id: <57b40c3f-5620-4e5d-be50-5259706ed7a0@s20g2000yqh.googlegroups.com>
On Apr 14, 9:08 pm, grocery_stocker <cdal...@gmail.com> wrote:
> I can't seem to get perl to match both the words 'chad' AND 'party'
> in the string "chad ttyp0 party". Below is what I attempted.
>
> [cdalten@localhost oakland]$ more match.pl
> #!/usr/bin/perl
> use warnings;
>
> #$string = `w | grep cdalten | grep telnet`;
>
> $test = "chad ttyp0 party";
>
> if ("$test" =~/(\bchad\b)(\bparty\b)/) {
>     print "true \n";}
>
> [cdalten@localhost oakland]$ ./match.pl
> [cdalten@localhost oakland]$
>
> What am I doing wrong?
You can use the OR or AND operator, as per your requirement:
1. if ("$test" =~/(\bchad\b)/ || "$test" =~/(\bparty\b)/ ) {
2. if ("$test" =~/(\bchad\b)/ && "$test" =~/(\bparty\b)/ ) {
------------------------------
Date: Tue, 14 Apr 2009 12:27:35 -0400
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: Regex question. Oh I so cannot do regular expression matching.
Message-Id: <x74owrb3pk.fsf@mail.sysarch.com>
>>>>> "pN" == perl Newbie <Ansher.M@gmail.com> writes:
pN> On Apr 14, 9:08 pm, grocery_stocker <cdal...@gmail.com> wrote:
>> I can't seem to get perl to match both the words 'chad' AND 'party'
>> in the string "chad ttyp0 party". Below is what I attempted.
>>
>> [cdalten@localhost oakland]$ more match.pl
>> #!/usr/bin/perl
>> use warnings;
>>
>> #$string = `w | grep cdalten | grep telnet`;
>>
>> $test = "chad ttyp0 party";
>>
>> if ("$test" =~/(\bchad\b)(\bparty\b)/) {
perldoc -q var
don't quote scalar vars
>> print "true \n";}
>>
>> [cdalten@localhost oakland]$ ./match.pl
>> [cdalten@localhost oakland]$
>>
>> What am I doing wrong?
pN> Add OR condition in your code
pN> if ("$test" =~/(\bchad\b) || (\bparty\b)/)
huh? have you tried that yourself? also the OP wanted both words to
match and your attempt implies OR. besides there is no boolean logic
INSIDE regexes.
but i have several questions for the OP. why are you grabbing both words
when you only print true if you found them? if that is all you want then
an external boolean test with two separate regexes:
if ( $test =~ /\bchad\b/ && $test =~ /\bparty\b/ ) {
if you want that in one regex you need to account for text between the
two words:
if ( $test =~ /\bchad\b.*\bparty\b/ ) {
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Free Perl Training --- http://perlhunter.com/college.html ---------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
------------------------------
Date: Tue, 14 Apr 2009 16:27:07 GMT
From: ddunham@taos.com (Darren Dunham)
Subject: Re: Regex question. Oh I so cannot do regular expression matching.
Message-Id: <vP2Fl.13536$%54.2308@nlpi070.nbdc.sbc.com>
grocery_stocker <cdalten@gmail.com> wrote:
> I can't seem to get perl to match both the words 'chad' AND 'party'
> in the string "chad ttyp0 party". Below is what I attempted.
>
> [cdalten@localhost oakland]$ more match.pl
> #!/usr/bin/perl
> use warnings;
>
> #$string = `w | grep cdalten | grep telnet`;
>
> $test = "chad ttyp0 party";
>
> if ("$test" =~/(\bchad\b)(\bparty\b)/) {
Why are you quoting $test?
This will only work if there were a string with "chad" and "party" with
nothing between them but a wordbreak. Such a string doesn't exist
(because you'd need a character to create the wordbreak).
So you probably want either...
if ($test =~/(\bchad\b)/ and
$test =~/(\bparty\b)/) {
(But that won't preserve $1 properly if you want to capture both items)
or
if ($test =~ /(\bchad\b).*(\bparty\b)/) {
because there are actually characters between them.
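The remark about $1 can be seen with a short sketch. It uses the OP's sample string; the variable names ($last_capture, $first, $second) are only for illustration and are not from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $test = "chad ttyp0 party";

# Two separate matches: every successful match resets $1, so once the
# second match has succeeded $1 holds 'party' and 'chad' is gone.
my $last_capture;
if ($test =~ /(\bchad\b)/ and $test =~ /(\bparty\b)/) {
    $last_capture = $1;
}

# One regex with .* between the words keeps both captures ($1 and $2).
my ($first, $second) = $test =~ /(\bchad\b).*(\bparty\b)/;

print "after two matches, \$1 = $last_capture\n";
print "single regex captured: $first, $second\n";
```

If both words are needed after two separate matches, each capture has to be copied out before the next match runs.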
--
Darren
------------------------------
Date: Tue, 14 Apr 2009 09:30:16 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: Regex question. Oh I so cannot do regular expression matching.
Message-Id: <ovd9u492e24rctn74lv7nuijc191mvfn6m@4ax.com>
grocery_stocker <cdalten@gmail.com> wrote:
>I can't seem to get perl to match both the words 'chad' AND 'party'
>in the string "chad ttyp0 party". Below is what I attempted.
>$test = "chad ttyp0 party";
>if ("$test" =~/(\bchad\b)(\bparty\b)/) {
Why are you quoting $test? Please see 'perldoc -q quoting':
What's wrong with always quoting "$vars"?
Your RE is trying to match a word boundary, followed by 'chad', followed
by a word boundary, immediately followed by a word boundary, then
'party', then another word boundary.
Obviously your test data contains other characters between the word
boundary behind 'chad' and the word boundary in front of 'party',
therefore it cannot match.
Depending upon what you want to achieve you can either split the RE into
two
if ($test =~/\bchad\b/ and $test =~/\bparty\b/)
or insert some RE between those two word boundaries that will suck up
the additional characters, e.g.
if ($test =~/(\bchad\b).*(\bparty\b)/)
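A rough comparison of the three patterns discussed in this thread, as a sketch only. The second test string (the same words in reverse order) is an assumption added here to show that the two alternatives are not equivalent: the .* version expects 'chad' to come first, while the two-regex version does not care about order.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @results;
for my $test ("chad ttyp0 party", "party ttyp0 chad") {
    # \b\b between the words: nothing may sit between 'chad' and 'party'
    my $adjacent = ($test =~ /\bchad\b\bparty\b/) ? 1 : 0;
    # two independent tests: order and distance do not matter
    my $both     = ($test =~ /\bchad\b/ and $test =~ /\bparty\b/) ? 1 : 0;
    # .* soaks up the text in between, but 'chad' must come first
    my $ordered  = ($test =~ /\bchad\b.*\bparty\b/) ? 1 : 0;
    push @results, "$adjacent $both $ordered";
    print "'$test': adjacent=$adjacent both=$both ordered=$ordered\n";
}
```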
jue
------------------------------
Date: Tue, 14 Apr 2009 09:39:55 -0700 (PDT)
From: grocery_stocker <cdalten@gmail.com>
Subject: Re: Regex question. Oh I so cannot do regular expression matching.
Message-Id: <b97f52a8-773b-48db-8cfb-4d550f77b2c6@s22g2000prg.googlegroups.com>
On Apr 14, 9:27 am, Uri Guttman <u...@stemsystems.com> wrote:
> >>>>> "pN" == perl Newbie <Anshe...@gmail.com> writes:
>
> pN> On Apr 14, 9:08 pm, grocery_stocker <cdal...@gmail.com> wrote:
> >> I can't seem to get perl to match both the words 'chad' AND 'party'
> >> in the string "chad ttyp0 party". Below is what I attempted.
> >>
> >> [cdalten@localhost oakland]$ more match.pl
> >> #!/usr/bin/perl
> >> use warnings;
> >>
> >> #$string = `w | grep cdalten | grep telnet`;
> >>
> >> $test = "chad ttyp0 party";
> >>
> >> if ("$test" =~/(\bchad\b)(\bparty\b)/) {
>
> perldoc -q var
>
> don't quote scalar vars
>
> >> print "true \n";}
> >>
> >> [cdalten@localhost oakland]$ ./match.pl
> >> [cdalten@localhost oakland]$
> >>
> >> What am I doing wrong?
>
> pN> Add OR condition in your code
>
> pN> if ("$test" =~/(\bchad\b) || (\bparty\b)/)
>
> huh? have you tried that yourself? also the OP wanted both words to
> match and your attempt implies OR. besides there is no boolean logic
> INSIDE regexes.
>
> but i have several questions for the OP. why are you grabbing both words
> when you only print true if you found them? if that is all you want then
> an external boolean test with two separate regexes:
>
> if ( $test =~ /\bchad\b/ && $test =~ /\bparty\b/ ) {
>
> if you want that in one regex you need to account for text between the
> two words:
>
> if ( $test =~ /\bchad\b.*\bparty\b/ ) {
>
The question stems from a much larger side/site specific project that
I'm working on. I just couldn't figure out how to search for multiple
words in a single line. I figured it would have been just easier to
post the part of the code that was giving me grief.
------------------------------
Date: Tue, 14 Apr 2009 09:44:25 -0700 (PDT)
From: grocery_stocker <cdalten@gmail.com>
Subject: Re: Regex question. Oh I so cannot do regular expression matching.
Message-Id: <1d6b1f3f-0f01-470a-9a9c-ba1ceaa1b6e8@l16g2000pra.googlegroups.com>
On Apr 14, 9:27 am, ddun...@taos.com (Darren Dunham) wrote:
> grocery_stocker <cdal...@gmail.com> wrote:
> > I can't seem to get perl to match both the words 'chad' AND 'party'
> > in the string "chad ttyp0 party". Below is what I attempted.
>
> > [cdalten@localhost oakland]$ more match.pl
> > #!/usr/bin/perl
> > use warnings;
>
> > #$string = `w | grep cdalten | grep telnet`;
>
> > $test = "chad ttyp0 party";
>
> > if ("$test" =~/(\bchad\b)(\bparty\b)/) {
>
> Why are you quoting $test?
>
Because in the full size script, $test is actually...
$test = `w | grep cdalten | grep party`;
> This will only work if there were a string with "chad" and "party" with
> nothing between them but a wordbreak. Such a string doesn't exist
> (because you'd need a character to create the wordbreak).
>
> So you probably want either...
>
> if ($test =~/(\bchad\b)/ and
> $test =~/(\bparty\b)/) {
>
> (But that won't preserve $1 properly if you want to capture both items)
>
> or
>
> if ($test =~ /(\bchad\b).*(\bparty\b)/) {
>
> because there are actually characters between them.
>
------------------------------
Date: Tue, 14 Apr 2009 12:47:36 -0400
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: Regex question. Oh I so cannot do regular expression matching.
Message-Id: <x7hc0r9o7r.fsf@mail.sysarch.com>
>>>>> "gs" == grocery stocker <cdalten@gmail.com> writes:
gs> On Apr 14, 9:27 am, ddun...@taos.com (Darren Dunham) wrote:
>>
>> Why are you quoting $test?
>>
gs> Because in the full size script, $test is actually...
gs> $test = `w | grep cdalten | grep party`;
so?? perl isn't the shell so it doesn't need quoting around single
scalars even if they have blanks in them.
uri
------------------------------
Date: Tue, 14 Apr 2009 10:58:21 -0700 (PDT)
From: grocery_stocker <cdalten@gmail.com>
Subject: Re: Regex question. Oh I so cannot do regular expression matching.
Message-Id: <c0ca75ee-792e-448a-b9e4-2460ccd599a0@c18g2000prh.googlegroups.com>
On Apr 14, 9:47 am, Uri Guttman <u...@stemsystems.com> wrote:
> >>>>> "gs" == grocery stocker <cdal...@gmail.com> writes:
>
> gs> On Apr 14, 9:27 am, ddun...@taos.com (Darren Dunham) wrote:
> >>
> >> Why are you quoting $test?
> >>
>
> gs> Because in the full size script, $test is actually...
>
> gs> $test = `w | grep cdalten | grep party`;
>
> so?? perl isn't the shell so it doesn't need quoting around single
> scalars even if they have blanks in them.
>
Actually, the script borks when I try to use backticks.
[cdalten@localhost oakland]$ more match.pl
#!/usr/bin/perl
use warnings;
#$string = `w | grep cdalten | grep telnet`;
$test = `w | grep cdalten | grep telnet`;
print $test;
if ($test =~/(\bchad\b)/ && $test =~/(\btelnet\b)/ ) {
print "true \n";
}
[cdalten@localhost oakland]$ ./match.pl
cdalten pts/7 :0.0 Mon12 14:55 0.62s 0.24s telnet
[cdalten@localhost oakland]$
Ideas why?
------------------------------
Date: Tue, 14 Apr 2009 14:07:41 -0400
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: Regex question. Oh I so cannot do regular expression matching.
Message-Id: <x7bpqz85xu.fsf@mail.sysarch.com>
>>>>> "gs" == grocery stocker <cdalten@gmail.com> writes:
gs> Actually, the script borks when I try to use backticks.
gs> [cdalten@localhost oakland]$ more match.pl
gs> #!/usr/bin/perl
gs> use warnings;
use strict too.
gs> $test = `w | grep cdalten | grep telnet`;
perl can do the grep for you and faster than forking two external greps.
gs> print $test;
gs> if ($test =~/(\bchad\b)/ && $test =~/(\btelnet\b)/ ) {
gs> print "true \n";
gs> }
gs> [cdalten@localhost oakland]$ ./match.pl
gs> cdalten pts/7 :0.0 Mon12 14:55 0.62s 0.24s telnet
do you see the word 'chad' there? i don't. it helps if you search for
things that are actually in the text.
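A minimal sketch of letting perl do the filtering instead of piping through two external greps. The @lines array stands in for the output of `w` so the example is repeatable; only the first line comes from the output shown above, and the other two are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for:  my @lines = `w`;
# The first line is the one from the post; the other two are hypothetical.
my @lines = (
    "cdalten  pts/7  :0.0  Mon12  14:55  0.62s  0.24s  telnet\n",
    "cdalten  pts/8  :0.0  Mon12   2:10  0.10s  0.05s  bash\n",
    "root     tty1   -     Mon09   3days 0.01s  0.01s  -bash\n",
);

# Keep only the lines that contain both words, in any order.
my @hits = grep { /\bcdalten\b/ && /\btelnet\b/ } @lines;

print @hits;
```

Note that the words tested are ones that actually occur in the line ('cdalten' and 'telnet'), which is the point made above about 'chad'.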
uri
------------------------------
Date: Tue, 14 Apr 2009 09:17:15 -0700 (PDT)
From: perl Newbie <Ansher.M@gmail.com>
Subject: Split Function
Message-Id: <745a969e-31cc-41d5-a485-cb6d3ed0c50c@3g2000yqk.googlegroups.com>
Hi,
I want to split text whenever there is a semi-colon. But my condition
is split function should ignore \; in line.
I have the following script to achieve the task. My query is
1. Is there a better way to do this or Is it possible to give
condition along with split
2. In the script if I use $_=~s/$ignore_txts[$i]/##/; instead of $_=~s/
\\;/###/; it replaces first occurrences of semi-colon instead of
replacing \;. I could not understand why is this happening?
DATA FILE
-------------------
*include q1.qin;col(a)=100;txt1=Ablah Ablah\; Bblah Bblah \; Cblah
cblah ;txt2=New Text;txt3=blah blah \; blah
SCRIPT
-------------------
use strict;
use warnings;
open(DATA,"< text") or die "Unable to open file\n";
my @normal_splits;
my @special_splits;
my @ignore_txts;
my $line;
while(<DATA>){
print "Text: $_\n";
@normal_splits=split(/;/,$_);
print "Texts after regular split function:\n";
foreach my $l(@normal_splits){
print $l, "\n";
}
if ($_=~/\\;/){
@ignore_txts = /(\\;)/g;
}
for (my $i=0;$i<$#ignore_txts+1;$i++) {
##$_=~s/$ignore_txts[$i]/##/;
$_=~s/\\;/###/;
}
print "$_\n";
@special_splits=split(/;/,$_);
print "Texts after replace\n";
foreach my $l(@special_splits){
$l=~s/###/\\;/g;
print $l, "\n";
}
}
close(DATA);
OUTPUT
-------------------
Text: *include q1.qin;col(a)=100;txt1=Ablah Ablah\; Bblah Bblah \;
Cblah cblah ;txt2=New Text;txt3=blah blah \; blah
Texts after regular split function:
*include q1.qin
col(a)=100
txt1=Ablah Ablah\
Bblah Bblah \
Cblah cblah
txt2=New Text
txt3=blah blah \
blah
*include q1.qin;col(a)=100;txt1=Ablah Ablah### Bblah Bblah ### Cblah
cblah ;txt2=New Text;txt3=blah blah ### blah
Texts after replace
*include q1.qin
col(a)=100
txt1=Ablah Ablah\; Bblah Bblah \; Cblah cblah
txt2=New Text
txt3=blah blah \; blah
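On question 2 above, a likely explanation, offered as a sketch: the array element holds the two characters backslash and semicolon, and when that is interpolated into a pattern the regex engine reads \; as an escaped semicolon, which matches a plain ';'. Quoting the interpolated text with \Q...\E avoids that. The names $needle and $line below are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $needle = '\\;';        # two characters: backslash, semicolon
my $line   = 'a;b\\;c';    # the text  a;b\;c

# Interpolated as-is, the pattern is \; which matches a bare ';'
(my $unquoted = $line) =~ s/$needle/##/;

# With \Q...\E the backslash is itself escaped, so only '\;' matches
(my $quoted = $line) =~ s/\Q$needle\E/##/;

print "$unquoted\n";    # a##b\;c
print "$quoted\n";      # a;b##c
```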
------------------------------
Date: Tue, 14 Apr 2009 19:46:29 +0200
From: "Dr.Ruud" <rvtol+usenet@xs4all.nl>
Subject: Re: Split Function
Message-Id: <49e4cbf5$0$194$e4fe514c@news.xs4all.nl>
perl Newbie wrote:
> I want to split text whenever there is a semi-colon. But my condition
> is split function should ignore \; in line.
Try a split on /(?<!\\);/.
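A short sketch of that split, using a cut-down version of the data line from the original post (the shortened $line is an assumption made to keep the example small):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = 'col(a)=100;txt1=Ablah Ablah\; Bblah;txt2=New Text';

# (?<!\\) is a zero-width negative lookbehind: the ';' only counts as a
# separator when the character just before it is not a backslash.
my @fields = split /(?<!\\);/, $line;

print "$_\n" for @fields;
```

One caveat with this approach: it only looks at a single preceding character, so a field that ends in an escaped backslash followed by a real separator would be misread.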
--
Ruud
------------------------------
Date: Tue, 14 Apr 2009 11:01:25 -0700 (PDT)
From: perl Newbie <Ansher.M@gmail.com>
Subject: Re: Split Function
Message-Id: <643d5487-d9da-476e-a1b0-eb207b163d85@k2g2000yql.googlegroups.com>
On Apr 14, 10:46 pm, "Dr.Ruud" <rvtol+use...@xs4all.nl> wrote:
> perl Newbie wrote:
> > I want to split text whenever there is a semi-colon. But my condition
> > is split function should ignore \; in line.
>
> Try a split on /(?<!\\);/.
>
> --
> Ruud
Thanks! It works fine. Could you please explain the logic of ?<!
------------------------------
Date: Tue, 14 Apr 2009 20:04:07 +0200
From: "Dr.Ruud" <rvtol+usenet@xs4all.nl>
Subject: Re: Split Function
Message-Id: <49e4d018$0$189$e4fe514c@news.xs4all.nl>
perl Newbie wrote:
> On Apr 14, 10:46 pm, "Dr.Ruud" <rvtol+use...@xs4all.nl> wrote:
>> perl Newbie wrote:
>>> I want to split text whenever there is a semi-colon. But my condition
>>> is split function should ignore \; in line.
>>
>> Try a split on /(?<!\\);/.
>
> Thanks ! It works fine, could you please explain the logic, ?<!
perlre
--
Ruud
------------------------------
Date: Tue, 14 Apr 2009 09:38:21 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: XML::LibXML UTF-8 toString() -vs- nodeValue()
Message-Id: <slrngu8fbd.d3g.hjp-usenet2@hrunkner.hjp.at>
On 2009-04-14 05:14, Mark <liarafan@xs4all.nl> wrote:
> From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
>> On the other hand, I think you don't know what a stream is:
>>
>> open(my $fh, '<', 'test.xml');
>>
>> Now $fh refers to a stream. Please show me how you can apply a regexp to
>> this stream. Solutions which don't count:
>>
>> * reading chunks from the stream into a scalar variable and then
>> applying the regexp to this variable (because then you apply it to a
>> string (as I wrote), not a stream).
>> * writing your own regexp engine (since Perl is a general purpose
>> programming language, you can of course write that but we were
>> talking about Perl's builtin regexp).
>
> Regexes in 'split' can be done on a stream; for example:
>
> open (SMTPD, "$out_name") or return undef;
> undef $/;
> my ($header, $body) = split (/\n{2,}/, <SMTPD>, 2);
That reads the complete contents from the stream into a
(temporary) scalar and then passes the scalar to split. The split
function still works on this string, not a stream.
awk does apply a regexp to a stream in one specific instance: You can
specify the record separator as a regexp and getline will read from the
stream until it finds a match (or EOF). perldoc perlvar mentions this:
Remember: the value of $/ is a string, not a regex. awk has to
be better for something. :-)
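A small sketch of what "a string, not a regex" means in practice. It reads from an in-memory filehandle so it is self-contained; the helper records() and the sample $data are made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $data = "a1b22c|d";

# Read all records from $data using the given record separator.
sub records {
    my ($sep) = @_;
    local $/ = $sep;    # the separator is used as a literal string
    open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
    my @recs = <$fh>;
    close $fh;
    return @recs;
}

my @by_pipe  = records('|');      # splits after the literal '|'
my @by_regex = records('\d+');    # looks for the literal text \d+, not digits

print scalar(@by_pipe), " ", scalar(@by_regex), "\n";    # 2 1
```

The second call returns the whole input as one record, because the three characters \d+ never occur in the data.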
A more general example would be the code generated by lex: You describe
your tokens with regexps and the generated code reads one token after
the other from the stream by applying those regexps to the stream.
hp
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 2338
***************************************