[29015] in Perl-Users-Digest
Perl-Users Digest, Issue: 259 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Mar 23 16:17:04 2007
Date: Fri, 23 Mar 2007 13:16:56 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Fri, 23 Mar 2007 Volume: 11 Number: 259
Today's topics:
Truncating text from a string with beginning text from <google@markginsburg.com>
Re: Truncating text from a string with beginning text f <nobull67@gmail.com>
Re: Truncating text from a string with beginning text f <nobull67@gmail.com>
Re: Truncating text from a string with beginning text f usenet@DavidFilmer.com
Re: Truncating text from a string with beginning text f <nobull67@gmail.com>
Re: Truncating text from a string with beginning text f usenet@DavidFilmer.com
Re: Truncating text from a string with beginning text f <nobull67@gmail.com>
Re: Truncating text from a string with beginning text f anno4000@radom.zrz.tu-berlin.de
Re: Truncating text from a string with beginning text f anno4000@radom.zrz.tu-berlin.de
Re: Truncating text from a string with beginning text f (Gary E. Ansok)
Re: Truncating text from a string with beginning text f <glennj@ncf.ca>
Re: Truncating text from a string with beginning text f <wahab-mail@gmx.de>
Re: Truncating text from a string with beginning text f <wahab-mail@gmx.de>
Re: Urgent requirement in perl for a US based CMM Level <dha@panix.com>
Re: Using @ARGV in object oriented script <klaus03@gmail.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: 23 Mar 2007 10:44:12 -0700
From: "Mark" <google@markginsburg.com>
Subject: Truncating text from a string with beginning text from another string
Message-Id: <1174671852.658156.20910@p15g2000hsd.googlegroups.com>
From a line of arbitrary text, possibly followed by some amount of
text from the beginning of the string ' Reference #\d+', where \d+
represents one or more digit characters, I want to output the line
without the ending ' Reference...' string. For example, the input line
'some arbitrary text Refer' would become 'some arbitrary text'.
Here are two programs that seem to do what I want, but they seem
overly complicated for this task. I'm looking for a simpler solution,
possibly by using a better regular expression than I have chosen in my
first sample code.
First sample:
use strict ;
use warnings ;
my $re = qr'^(.*)\ ( (R$)|
                     (Re$)|
                     (Ref$)|
                     (Refe$)|
                     (Refer$)|
                     (Refere$)|
                     (Referen$)|
                     (Referenc$)|
                     (Reference\ {0,1}$)|
                     (Reference\ \#\d{0,}$)
                   )'x ;
while(<DATA>) {
    chomp ;
    print "in : >$_<\n" ;
    if (my($result) = /$re/g) {
        print "out: >$result<\n" ;
    }
    else {
        print "out: >$_<\n" ;
    }
}
__DATA__
Refer
One Referenc
two three Reference
xx yy Reference Reference
def Refere Reference #xx
abc the def Refere Reference #
abc the def Refere Reference #12
Second sample:
use strict ;
use warnings ;
my $PATTERN = 'Reference #000000' ;
my $pos ;
while (<DATA>) {
    chomp ;
    $pos = -1 ;
    # find the last ' R' in the line; $pos ends up at the 'R'
    while ((my $ind = index($_,' R',$pos)) != -1) {
        $pos = $ind + 1 ;
    }
    print "in : >$_<\n" ;
    my $result = $_ ;
    if ($pos > 0) {
        my $re = substr($_,$pos) ;
        $re =~ s/\d+$/\\d+/ ;
        $re = qr/^$re/ ;
        if ($PATTERN =~ /$re/) {
            $result = substr($_,0,$pos-1) ;
        }
    }
    print "out: >$result<\n" ;
}
__DATA__
Refer
One Referenc
two three Reference
xx yy Reference Reference
def Refere Reference #xx
abc the def Refere Reference #
abc the def Refere Reference #12
------------------------------
Date: 23 Mar 2007 11:34:36 -0700
From: "Brian McCauley" <nobull67@gmail.com>
Subject: Re: Truncating text from a string with beginning text from another string
Message-Id: <1174674876.739053.226730@o5g2000hsb.googlegroups.com>
On Mar 23, 5:44 pm, "Mark" <goo...@markginsburg.com> wrote:
[ An interesting problem ]
> I'm looking for a simpler solution,
> possibly by using a better regular expression than I have chosen in my
> first sample code.
Wow! What a brilliant post. Clear, well thought out, interesting.
Just wish I had an answer. I'll think about that one tonight. I'll
probably be up all night thinking about it!
------------------------------
Date: 23 Mar 2007 11:44:06 -0700
From: "Brian McCauley" <nobull67@gmail.com>
Subject: Re: Truncating text from a string with beginning text from another string
Message-Id: <1174675445.505098.172310@p15g2000hsd.googlegroups.com>
On Mar 23, 5:44 pm, "Mark" <goo...@markginsburg.com> wrote:
> use strict ;
> use warnings ;
>
> my $re = qr'^(.*)\ ( (R$)|
> (Re$)|
> (Ref$)|
> (Refe$)|
> (Refer$)|
> (Refere$)|
> (Referenc$)|
> (Reference\ {0,1}$)|
> (Reference\ \#\d{0,}$)
> )'x ;
>
> while(<DATA>) {
> chomp ;
> print "in : >$_<\n" ;
> if (my($result) = /$re/g) {
> print "out: >$result<\n" ;
> }
> else {
> print "out: >$_<\n" ;
> }
>
> }
Just being picky but...
As far as I can see the /g in the match does nothing useful.
Nor do most of the (...) in the regex.
{0,1} and {0,} in regex are so commonly used that they have one-
character shorthands: ? and * respectively.
BTW are you perhaps trying to implement something like File::Stream?
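A minimal sketch of those remarks applied to the quoted regex: one capture group, no /g, and ? and * instead of {0,1} and {0,}. The name $re_tidy and the sample lines are made up for illustration. The sketch also spells out every prefix from R through Reference, including Referen, which the quoted list does not have.

```perl
use strict;
use warnings;

# $re_tidy is an illustrative name, not from the original post.
my $re_tidy = qr{
    ^(.*)\                  # capture 1: text to keep, then one literal space
    (?: R | Re | Ref | Refe | Refer | Refere | Referen | Referenc
      | Reference\ ?        # full word, optional trailing space
      | Reference\ \#\d*    # full word, hash sign, optional digits
    )$
}x;

for my $line ('One Referenc', 'two three Reference', 'def Refere Reference #xx') {
    my ($result) = $line =~ $re_tidy;       # list context: capture 1 on a match
    $result = $line unless defined $result; # no match: leave the line alone
    print "in : >$line<\nout: >$result<\n";
}
```

The last sample line ends in '#xx', so no alternative reaches the end of the string and the line is printed unchanged.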
------------------------------
Date: 23 Mar 2007 11:44:42 -0700
From: usenet@DavidFilmer.com
Subject: Re: Truncating text from a string with beginning text from another string
Message-Id: <1174675480.824959.193970@b75g2000hsg.googlegroups.com>
On Mar 23, 10:44 am, "Mark" <goo...@markginsburg.com> wrote:
> my $re = qr'^(.*)\ ( (R$)|
> (Re$)|
> (Ref$)|
> (Refe$)|
> (Refer$)|
> (Refere$)|
> (Referenc$)|
> (Reference\ {0,1}$)|
> (Reference\ \#\d{0,}$)
> )'x ;
Try this instead; results are identical to your regex except what
happens to $2, which you don't use anyway (and you could avoid setting
$2, but extra complexity for no real gain):
$re = qr{^(.*) Re?f?e?r?e?n?c?e? ?(\#\d*)$}x;
--
The best way to get a good answer is to ask a good question.
David Filmer (http://DavidFilmer.com)
------------------------------
Date: 23 Mar 2007 11:46:56 -0700
From: "Brian McCauley" <nobull67@gmail.com>
Subject: Re: Truncating text from a string with beginning text from another string
Message-Id: <1174675616.793351.164270@y80g2000hsf.googlegroups.com>
On Mar 23, 6:44 pm, "Brian McCauley" <nobul...@gmail.com> wrote:
>
> BTW are you perhaps trying to implement something like
> File::Stream?
I thought I had déjà-vu
http://groups.google.com/group/comp.lang.perl.misc/browse_frm/thread/6b7d06f61ea9f640
------------------------------
Date: 23 Mar 2007 11:49:07 -0700
From: usenet@DavidFilmer.com
Subject: Re: Truncating text from a string with beginning text from another string
Message-Id: <1174675746.881193.208240@e65g2000hsc.googlegroups.com>
On Mar 23, 11:44 am, use...@DavidFilmer.com wrote:
> $re = qr{^(.*) Re?f?e?r?e?n?c?e? ?(\#\d*)$}x;
Then again, it would be possible to "fool" this regex where your
original would not be fooled (for example, by dropping a middle
character). Needs more thought....
--
The best way to get a good answer is to ask a good question.
David Filmer (http://DavidFilmer.com)
------------------------------
Date: 23 Mar 2007 11:51:32 -0700
From: "Brian McCauley" <nobull67@gmail.com>
Subject: Re: Truncating text from a string with beginning text from another string
Message-Id: <1174675892.837486.83210@n59g2000hsh.googlegroups.com>
On Mar 23, 6:44 pm, use...@DavidFilmer.com wrote:
> On Mar 23, 10:44 am, "Mark" <goo...@markginsburg.com> wrote:
>
> > my $re = qr'^(.*)\ ( (R$)|
> > (Re$)|
> > (Ref$)|
> > (Refe$)|
> > (Refer$)|
> > (Refere$)|
> > (Referenc$)|
> > (Reference\ {0,1}$)|
> > (Reference\ \#\d{0,}$)
> > )'x ;
>
> Try this instead; results are identical to your regex except what
> happens to $2, which you don't use anyway (and you could avoid setting
> $2, but extra complexity for no real gain):
>
> $re = qr{^(.*) Re?f?e?r?e?n?c?e? ?(\#\d*)$}x;
No, that matches "Rernc 10" etc too.
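To make the objection concrete, here is a small sketch. It assumes a variant of the quoted pattern in which the spaces are escaped and the '#' group is optional (under /x, unescaped whitespace in a pattern is ignored, so the pattern as posted behaves differently). The names $loose and $full are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical variant of the quoted pattern: every letter is optional on its own.
my $loose = qr{^(.*)\ Re?f?e?r?e?n?c?e?\ ?(?:\#\d*)?$}x;
my $full  = 'Reference #';

for my $tail ('Refer', 'Rernc') {
    my $accepted  = "abc $tail" =~ $loose    ? 'yes' : 'no';
    my $is_prefix = index($full, $tail) == 0 ? 'yes' : 'no';
    print "$tail: accepted by loose regex: $accepted, real prefix: $is_prefix\n";
}
```

'Rernc' is accepted because each ? lets a letter drop out independently of its neighbours, although it is not a prefix of 'Reference #'.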
------------------------------
Date: 23 Mar 2007 18:59:30 GMT
From: anno4000@radom.zrz.tu-berlin.de
Subject: Re: Truncating text from a string with beginning text from another string
Message-Id: <56imciF29eq4vU1@mid.dfncis.de>
Brian McCauley <nobull67@gmail.com> wrote in comp.lang.perl.misc:
> On Mar 23, 5:44 pm, "Mark" <goo...@markginsburg.com> wrote:
>
> [ An interesting problem ]
>
> > I'm looking for a simpler solution,
> > possibly by using a better regular expression than I have chosen in my
> > first sample code.
>
> Wow! What a brilliant post. Clear, well thought out, interesting.
...plus runnable code, including a convincing set of test data.
I quite agree.
> Just wish I had an answer. I'll think about that one tonight. I'll
> probably be up all night thinking about it!
Ah, it won't take all night. Here is my take:
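{
    my $fix = ' Reference #';
    my $pat = "$fix\\d+";
    # every prefix of $fix, shortest first
    my @parts = map substr( $fix, 0, $_), 1 .. length $fix;
    sub rem_ref {
        my $str = shift;
        $str =~ s/$pat$// and return $str;
        # try the longest prefix first, so a trailing ' Reference ' is
        # removed whole rather than losing only its final space
        $str =~ s/$_$// and return $str for reverse @parts;
        return $str;
    }
}
while ( <DATA> ) {
    chomp;
    print "in : >$_<\n";
    print "out: >", rem_ref( $_), "<\n";
}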
Anno
------------------------------
Date: 23 Mar 2007 19:08:27 GMT
From: anno4000@radom.zrz.tu-berlin.de
Subject: Re: Truncating text from a string with beginning text from another string
Message-Id: <56imtbF29eq4vU2@mid.dfncis.de>
<usenet@DavidFilmer.com> wrote in comp.lang.perl.misc:
> On Mar 23, 10:44 am, "Mark" <goo...@markginsburg.com> wrote:
>
> > my $re = qr'^(.*)\ ( (R$)|
> > (Re$)|
> > (Ref$)|
> > (Refe$)|
> > (Refer$)|
> > (Refere$)|
> > (Referenc$)|
> > (Reference\ {0,1}$)|
> > (Reference\ \#\d{0,}$)
> > )'x ;
>
> Try this instead; results are identical to your regex except what
> happens to $2, which you don't use anyway (and you could avoid setting
> $2, but extra complexity for no real gain):
>
> $re = qr{^(.*) Re?f?e?r?e?n?c?e? ?(\#\d*)$}x;
No, that would also match things like "gaga Refe #12".
Anno
------------------------------
Date: Fri, 23 Mar 2007 19:29:15 +0000 (UTC)
From: ansok@alumni.caltech.edu (Gary E. Ansok)
Subject: Re: Truncating text from a string with beginning text from another string
Message-Id: <eu19qa$nt$1@naig.caltech.edu>
<anno4000@radom.zrz.tu-berlin.de> wrote:
> <usenet@DavidFilmer.com> wrote in comp.lang.perl.misc:
>> On Mar 23, 10:44 am, "Mark" <goo...@markginsburg.com> wrote:
>>
>> > my $re = qr'^(.*)\ ( (R$)|
>> > (Re$)|
>> > (Ref$)|
>> > (Refe$)|
>> > (Refer$)|
>> > (Refere$)|
>> > (Referenc$)|
>> > (Reference\ {0,1}$)|
>> > (Reference\ \#\d{0,}$)
>> > )'x ;
>>
>> Try this instead; results are identical to your regex except what
>> happens to $2, which you don't use anyway (and you could avoid setting
>> $2, but extra complexity for no real gain):
>>
>> $re = qr{^(.*) Re?f?e?r?e?n?c?e? ?(\#\d*)$}x;
>
>No, that would also match things like "gaga Refe #12".
You could write something like this
$re = qr{^(.*)\ (R(?:e(?:f(?:e(?:r(?:e(?:n(?:c(?:e(?:\ (?:\#\d*)
?)?)?)?)?)?)?)?)?)?)$}x;
but that's not clear at all to the human reader, and I don't think
adding more whitespace would help much in this case.
Depending on your needs, it might be more clear to use a simpler regex like
$re = qr{^(.*) ((R[a-z #]+) \d*)$};
and then test ($3 eq substr('Reference #', 0, length $3))
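A runnable sketch of that second idea, under a few assumptions: the helper name strip_ref is made up, the regex is loosened to [a-z #]* with the digits captured separately (as written above, a space would be required before the digits), and digits are accepted only after the complete 'Reference #'.

```perl
use strict;
use warnings;

my $FULL = 'Reference #';    # the fixed part; digits may follow it

# strip_ref is a hypothetical helper name used only in this sketch.
sub strip_ref {
    my ($line) = @_;
    # Greedy (.*) prefers the rightmost space that starts a candidate suffix.
    if ( my ($keep, $word, $digits) = $line =~ /^(.*) (R[a-z #]*)(\d*)$/ ) {
        my $is_prefix = $word eq substr($FULL, 0, length $word);
        # Digits are only valid once the whole fixed part is present.
        return $keep if $is_prefix && ($digits eq '' || $word eq $FULL);
    }
    return $line;    # no (partial) reference suffix found
}

print ">", strip_ref($_), "<\n"
    for 'One Referenc', 'xx yy Reference Reference',
        'abc the def Refere Reference #12', 'def Refere Reference #xx';
```

The regex only finds a candidate; the substr comparison decides whether it really is a prefix, which keeps the pattern short at the cost of a second step.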
Gary Ansok
--
3M suggests that to obtain the best results, one should make the bond
"while the adhesive is wet, aggressively tacky." I did not know what
"aggressively tacky" meant until I saw a recent notice in the Bboard.
------------------------------
Date: 23 Mar 2007 19:29:30 GMT
From: Glenn Jackman <glennj@ncf.ca>
Subject: Re: Truncating text from a string with beginning text from another string
Message-Id: <slrnf08akr.s5p.glennj@smeagol.ncf.ca>
At 2007-03-23 02:51PM, "Brian McCauley" wrote:
> On Mar 23, 6:44 pm, use...@DavidFilmer.com wrote:
> > On Mar 23, 10:44 am, "Mark" <goo...@markginsburg.com> wrote:
> >
> > > my $re = qr'^(.*)\ ( (R$)|
> > > (Re$)|
> > > (Ref$)|
> > > (Refe$)|
> > > (Refer$)|
> > > (Refere$)|
> > > (Referenc$)|
> > > (Reference\ {0,1}$)|
> > > (Reference\ \#\d{0,}$)
> > > )'x ;
> >
> > Try this instead; results are identical to your regex except what
> > happens to $2, which you don't use anyway (and you could avoid setting
> > $2, but extra complexity for no real gain):
> >
> > $re = qr{^(.*) Re?f?e?r?e?n?c?e? ?(\#\d*)$}x;
>
> No, that matches "Rernc 10" etc too.
So instead you'd want...
$re = qr{^(.*) R(e(f(e(r(e(n(c(e( (#\d*)?)?)?)?)?)?)?)?)?)?$}
or
$re = qr{^(.*) R(?:e(?:f(?:e(?:r(?:e(?:n(?:c(?:e(?: (?:#\d*)?)?)?)?)?)?)?)?)?)?$}
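Since the parentheses are easy to miscount by hand, the nested form can also be generated from the literal text. A sketch, assuming a made-up helper name nested_prefix_re and '\d+' as the part that may follow the '#':

```perl
use strict;
use warnings;

# nested_prefix_re is a hypothetical helper: it builds the nested optional
# groups from the literal text, so no parenthesis has to be counted by hand.
sub nested_prefix_re {
    my ($fixed, $tail) = @_;    # literal text, then a sub-pattern that may follow it
    my ($first, @rest) = split //, $fixed;
    my $pat = "(?:$tail)?";
    # Wrap from the last character inwards: each character is allowed
    # only if everything before it matched.
    $pat = '(?:' . quotemeta($_) . $pat . ')?' for reverse @rest;
    return quotemeta($first) . $pat;
}

my $suffix = nested_prefix_re('Reference #', '\d+');
my $re     = qr{^(.*) $suffix$};

for my $line ('One Referenc', 'abc the def Refere Reference #12', 'abc Rernc') {
    my ($result) = $line =~ $re;
    print "in : >$line<\nout: >", (defined $result ? $result : $line), "<\n";
}
```

quotemeta keeps the space and the '#' literal, so the generated pattern does not depend on whether /x is in effect.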
--
Glenn Jackman
"You can only be young once. But you can always be immature." -- Dave Barry
------------------------------
Date: Fri, 23 Mar 2007 19:23:43 +0100
From: Mirco Wahab <wahab-mail@gmx.de>
Subject: Re: Truncating text from a string with beginning text from another string
Message-Id: <eu17pl$1qq$1@mlucom4.urz.uni-halle.de>
Mark wrote:
> Here are two programs that seem to do what I want, but they seem
> overly complicated for this task. I'm looking for a simpler solution,
> possibly by using a better regular expression than I have chosen in my
> first sample code.
> First sample:
> [...]
> Second sample:
> [...]
I don't really know what all this
should give, but why wouldn't
a simple:
while(<DATA>) {
    chomp && print "$1 ==> from [$_]\n" if /(.+?)Refer/
}
do all you want? In your explanations you
mentioned you'd truncate all subsequent
occurrences of 'refer' 'reference' and all
following stuff.
Regards
M.
------------------------------
Date: Fri, 23 Mar 2007 20:23:44 +0100
From: Mirco Wahab <wahab-mail@gmx.de>
Subject: Re: Truncating text from a string with beginning text from another string
Message-Id: <eu19rb$2bt$1@mlucom4.urz.uni-halle.de>
Mark wrote:
> Here are two programs that seem to do what I want, but they seem
> overly complicated for this task. I'm looking for a simpler solution,
> possibly by using a better regular expression than I have chosen in my
> first sample code.
After making the wrong turn first,
I think this can't be solved very
differently from your solution.
Of course, one can write it somehow 'different', like:
...
my @end = split //, 'Reference #000000';
my $key = '(' . join('|', map join('', @$_), map [ @end[0..$_] ], 0..$#end) . ')';
...
while(<DATA>) {
    print "$1\t\t$2\n"
        if /^(.+?)($key)$/
}
__DATA__
...
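In the same spirit, a self-contained sketch of the generated-alternation idea. It differs from the fragment above in that any digits may follow the '#' rather than literal zeros, and each prefix is passed through quotemeta. The variable names and the sample lines are invented for illustration.

```perl
use strict;
use warnings;

my $fixed = 'Reference #';
# Every non-empty prefix of the fixed text, longest first, quoted literally.
my @prefixes = map { quotemeta(substr($fixed, 0, $_)) } reverse 1 .. length $fixed;
# The complete form with digits, then the bare prefixes.
my $key = join '|', quotemeta($fixed) . '\d+', @prefixes;
my $re  = qr{^(.*) (?:$key)$};

for my $line ('Two Refe', 'two three Reference', 'abc the def Refere Reference #12') {
    my ($result) = $line =~ $re;
    print "in : >$line<\nout: >", (defined $result ? $result : $line), "<\n";
}
```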
Regards
M.
------------------------------
Date: Fri, 23 Mar 2007 19:07:39 +0000 (UTC)
From: "David H. Adler" <dha@panix.com>
Subject: Re: Urgent requirement in perl for a US based CMM Level 4 company
Message-Id: <slrnf089br.g3g.dha@panix2.panix.com>
On 2007-03-16, josh.arni@gmail.com <josh.arni@gmail.com> wrote:
> Hi,
> We currently have an urgent requirement
You have posted a job posting or a resume in a technical group.
Longstanding Usenet tradition dictates that such postings go into
groups with names that contain "jobs", like "misc.jobs.offered", not
technical discussion groups like the ones to which you posted.
Had you read and understood the Usenet user manual posted frequently to
"news.announce.newusers", you might have already known this. :) (If
n.a.n is quieter than it should be, the relevant FAQs are available at
http://www.faqs.org/faqs/by-newsgroup/news/news.announce.newusers.html)
Another good source of information on how Usenet functions is
news.newusers.questions (information from which is also available at
http://www.geocities.com/nnqweb/).
Please do not explain your posting by saying "but I saw other job
postings here". Just because one person jumps off a bridge, doesn't
mean everyone does. Those postings are also in error, and I've
probably already notified them as well.
If you have questions about this policy, take it up with the news
administrators in the newsgroup news.admin.misc.
http://jobs.perl.org may be of more use to you
Yours for a better usenet,
dha
--
David H. Adler - <dha@panix.com> - http://www.panix.com/~dha/
Perl gives you enough rope to hang yourself and your neighbor.
- Randal L. Schwartz
------------------------------
Date: 23 Mar 2007 10:43:34 -0700
From: "Klaus" <klaus03@gmail.com>
Subject: Re: Using @ARGV in object oriented script
Message-Id: <1174671814.658638.261390@y66g2000hsf.googlegroups.com>
On Mar 21, 11:09 pm, Ben Morrow <b...@morrow.me.uk> wrote:
> Quoth "Klaus" <klau...@gmail.com>:
> > print do{ local $" = "', '"; "Before sub test(): \@ARGV = ('@ARGV')\n"; };
>
> I realise this is not the point of your post :), but this is
> unnecessarily ugly. Since it's only a tiny program, just set $" and have
> done:
>
> $" = "', '";
> warn "Before sub test(): \@ARGV = ('@ARGV')";
I was in the habit of always localising $"
(see perldoc perlvar : "[...] In most cases you want to localize these
variables before changing them [...]")
But I agree with you, localising $" in this simple example is
overkill.
> > test_ARGV(\@ARGV);
>
> > print "Test program finished.\n";
>
> > sub test_ARGV {
> > my @Array = @{$_[0]};
>
> I don't know if you realise, but this makes a *copy* of the passed in
> array. With large arrays, or if you are wanting to modify the array
> in-place, you want something more like
>
> my ($Array) = @_;
> warn "\@Array = ('@$Array')";
>
> > print do{ local $" = "', '"; "Inside sub test(): \@Array = ('@Array')\n"; }
> > }
Point taken. Thanks for the hint.
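For reference, a short sketch of both points: the scope of a localised $" and the difference between copying an array and working through a reference. @args stands in for @ARGV, and append_in_place is a made-up helper name.

```perl
use strict;
use warnings;

my @args = ('a', 'b');    # stand-in for @ARGV in this sketch

# local restores the previous value of $" when the do-block ends.
my $shown = do { local $" = "', '"; "('@args')" };

# Working through the reference avoids a copy; changes reach the caller's array.
sub append_in_place {     # hypothetical helper name
    my ($array) = @_;
    push @$array, 'c';
    return scalar @$array;
}

append_in_place(\@args);
print "$shown\n";         # built before the push, with the custom separator
print "now: @args\n";     # separator is back to the default single space
```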
--
Klaus
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 259
**************************************