[23182] in Perl-Users-Digest


Perl-Users Digest, Issue: 5403 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Aug 21 06:10:32 2003

Date: Thu, 21 Aug 2003 03:10:11 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Thu, 21 Aug 2003     Volume: 10 Number: 5403

Today's topics:
        RegEx to match names? <pdo@myrealbox.com>
    Re: RegEx to match names? (Sam Holden)
    Re: RegEx to match names? (Sam Holden)
        Regular Expression - BackReferences Question <kasp@epatra.com>
    Re: Regular Expression - BackReferences Question (Jay Tilton)
    Re: Regular expression, getting href which is followed  (fatted)
    Re: retaining data in memory <tassilo.parseval@rwth-aachen.de>
    Re: Strange INC error when using perl 5.8.0 as a cgi sc (Tony)
        undeclaring multiple arrays (Aaron)
    Re: undeclaring multiple arrays <krahnj@acm.org>
    Re:  <bwalton@rochester.rr.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Thu, 21 Aug 2003 05:09:43 GMT
From: "Patrick D." <pdo@myrealbox.com>
Subject: RegEx to match names?
Message-Id: <Xns93DDE17198926pdopdo@204.127.204.17>

There are a few different name formats out there, and I'm having trouble 
incorporating them into one regex.

e.g.

George Walker Bush
George W. Bush
George Bush
G. Walker Bush
G. W. Bush
G.W. Bush

What I want is a way to always be able to identify the first+middle and 
last names. I'd like the $1 to be "George Walker" or "George W." or "G. 
Walker", or just "George" if there is no middle. With $2, hopefully it 
could contain *only* "Bush".

Can anyone help me make a regex for all the above?


------------------------------

Date: 21 Aug 2003 05:25:52 GMT
From: sholden@flexal.cs.usyd.edu.au (Sam Holden)
Subject: Re: RegEx to match names?
Message-Id: <slrnbk8lv0.bvt.sholden@flexal.cs.usyd.edu.au>

On Thu, 21 Aug 2003 05:09:43 GMT, Patrick D. <pdo@myrealbox.com> wrote:
> There are a few different name formats out there, and I'm having trouble 
> incorporating them into one regex.
> 
> e.g.
> 
> George Walker Bush
> George W. Bush
> George Bush
> G. Walker Bush
> G. W. Bush
> G.W. Bush
> 
> What I want is a way to always be able to identify the first+middle and 
> last names. I'd like the $1 to be "George Walker" or "George W." or "G. 
> Walker", or just "George" if there is no middle. With $2, hopefully it 
> could contain *only* "Bush".
> 
> Can anyone help me make a regex for all the above?

/(.*)\s+(\S+)/;

For something more robust you could try a module like Lingua::EN::NameParse.
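
As a quick check of the pattern above, here is a minimal sketch that
runs it over the six formats from the original post. The split_name
helper is a name made up for this illustration; only the regex itself
comes from the post.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): wraps the posted regex.
sub split_name {
    my ($name) = @_;
    # Greedy .* backtracks to the last run of whitespace, so $2 is
    # always the final whitespace-free token.
    return $name =~ /(.*)\s+(\S+)/ ? ($1, $2) : ();
}

for my $name ('George Walker Bush', 'George W. Bush', 'George Bush',
              'G. Walker Bush', 'G. W. Bush', 'G.W. Bush') {
    my ($first, $last) = split_name($name);
    print "[$first] [$last]\n";
}
```

A surname of more than one word, or a trailing suffix such as "Jr.",
ends up split in the wrong place with this approach, since $2 only
ever holds the last token.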

-- 
Sam Holden



------------------------------

Date: 21 Aug 2003 05:29:41 GMT
From: sholden@flexal.cs.usyd.edu.au (Sam Holden)
Subject: Re: RegEx to match names?
Message-Id: <slrnbk8m65.plb.sholden@flexal.cs.usyd.edu.au>

On 21 Aug 2003 05:25:52 GMT, Sam Holden <sholden@flexal.cs.usyd.edu.au> wrote:
> 
> /(.*)\s+(\S+)/;

The + in \s+ might as well be dropped, since if there are multiple
spaces, the .* is going to grab them leaving just one for the \s+ anyway.

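Or the .* could be replaced with .*?, if trailing spaces in $1 aren't
wanted (in which case the pattern needs a trailing $ anchor, or the lazy
.*? will stop at the first space).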
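
One thing worth checking when trying these variants: a lazy .*? gives
up at the first whitespace it can unless the pattern is anchored at the
end. A small sketch comparing the forms, using a test string with
irregular spacing that is made up for the purpose:

```perl
use strict;
use warnings;

my $name = "George  W.   Bush";    # deliberately irregular spacing

# Single \s: same split point as \s+, but $1 keeps the extra spaces.
my @greedy = $name =~ /(.*)\s(\S+)/;

# Lazy .*? without an anchor stops at the first whitespace it can.
my @lazy = $name =~ /(.*?)\s+(\S+)/;

# Lazy .*? with ^ and $ anchors: \s+ absorbs all the separating spaces.
my @anchored = $name =~ /^(.*?)\s+(\S+)$/;

print "[$_->[0]] [$_->[1]]\n" for \@greedy, \@lazy, \@anchored;
```

Only the anchored form yields "Bush" in $2 with no trailing spaces
left in $1.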

-- 
Sam Holden



------------------------------

Date: Thu, 21 Aug 2003 14:29:51 +0530
From: "Kasp" <kasp@epatra.com>
Subject: Regular Expression - BackReferences Question
Message-Id: <bi21md$f00$1@newsreader.mailgate.org>

I have a file containing the following URL in it
http://www.somesite.com/folder/1.gif

Now, everyday I need to run a script so that 1.gif in the URL is replaced by
2.gif and so on each day.

I tried this substitution using backreference...and it did not work

$line =~ s#(\d+)\.gif#$1++\.gif#gsi;

and neither did
$line =~ s#(\d+)\.gif#($1+1)\.gif#gsi;

How can I increment the number found in URL by one?
Thanks
--
"Accept that some days you are the pigeon and some days the statue."
"A pat on the back is only a few inches from a kick in the butt." - Dilbert.





------------------------------

Date: Thu, 21 Aug 2003 09:58:35 GMT
From: tiltonj@erols.com (Jay Tilton)
Subject: Re: Regular Expression - BackReferences Question
Message-Id: <3f449629.44661415@news.erols.com>

"Kasp" <kasp@epatra.com> wrote:

: I have a file containing the following URL in it
: http://www.somesite.com/folder/1.gif
: 
: Now, everyday I need to run a script so that 1.gif in the URL is replaced by
: 2.gif and so on each day.
: 
: I tried this substitution using backreference...and it did not work
: 
: $line =~ s#(\d+)\.gif#$1++\.gif#gsi;

As written (no /e), "$1++" is just interpolated as text. Evaluated as
code, it will cause a "Modification of a read-only value" error.

: and neither did
: $line =~ s#(\d+)\.gif#($1+1)\.gif#gsi;

Almost there.  You just need to make perl treat the replacement
portion as code, so "$1+1" will actually perform some math.
The /e modifier gets it done.
 
    $line =~ s#(\d+)\.gif# $1+1 . ".gif" #e;

Did you have reasons for including the /g and /s modifiers?
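
To see the difference the /e modifier makes, here is a minimal sketch
using the URL from the question. The variable names are only for
illustration.

```perl
use strict;
use warnings;

my $url = 'http://www.somesite.com/folder/1.gif';

# Without /e the replacement is an ordinary interpolated string.
(my $literal = $url) =~ s#(\d+)\.gif#($1+1)\.gif#;

# With /e the replacement is evaluated as a Perl expression.
(my $bumped = $url) =~ s#(\d+)\.gif# $1+1 . ".gif" #e;

print "$literal\n";   # http://www.somesite.com/folder/(1+1).gif
print "$bumped\n";    # http://www.somesite.com/folder/2.gif
```

Because the capture is treated as a number, 9.gif becomes 10.gif
without any special handling.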



------------------------------

Date: 21 Aug 2003 01:32:27 -0700
From: fatted@yahoo.com (fatted)
Subject: Re: Regular expression, getting href which is followed by img tag with specific src
Message-Id: <4eb7646d.0308210032.3668d99d@posting.google.com>

tadmc@augustmail.com (Tad McClellan) wrote in message news:<slrnbk802c.867.tadmc@magna.augustmail.com>...
> Fatted <fatted@yahoo.com> wrote:
> > "Tad McClellan" <tadmc@augustmail.com> wrote in message
> > news:slrnbk6u8t.73q.tadmc@magna.augustmail.com...
> >> fatted <fatted@yahoo.com> wrote:
>  
> >> You should use a module that understands HTML for processing HTML data.
> > 
> > Unfortunately I don't think that will help me with my problem, 
> 
> 
> Yes it will. That is why I suggested it.

Perhaps I meant that I couldn't see *how* it would help with my
problem :)
 
> 
> > I want to
> > extract the value of a href, for an <a> tag, preceding an <img> tag which
> > has an attribute src with a specific value. I'm not sure what module does
> > this. (I'm going to look again though!)
> 
> 
> I understood what you wanted to do quite clearly, that's why the
> code that I already posted does just what you describe above!
> 
> Did you run the program?

I did, but some idiot copy pasted incorrectly :) When I catch that
guy...

> 
> >> "lines" do not matter in HTML.
> > 
> > Thanks for the reminder :) 
> 
> 
> But you are going to forget it again before you get to the
> end of your followup...

Just put the gun down, son... No, I really do understand how HTML works.
I talked about a line because I am absolutely sure that the <a><img
/></a> tags I'm interested in are always on one text line of the HTML
file.

> > I
> > first wanted to find the line 
> 
> 
> If you think of "lines" when processing HTML you aren't thinking
> correctly, and it will hurt you at some point.
> 
> So don't do that. :-)

No more please :)

> 
> 
> > which contained the <img src="importantimage.gif" (there just happens to be
> > lots of tags on this line), and then try to find the preceding value of the
> ><a> tags href.
> 

<snip>

 
> You can do it in less than 10 lines of code with HTML::Tree
> 
>    http://search.cpan.org/author/SBURKE/HTML-Tree-3.17/
> ---------------------------------------------------------
> #!/usr/bin/perl
> use strict;
> use warnings;
> use HTML::TreeBuilder;
> 
> my $html = '
> <a class="red" href="uninteresting.html" target="_new">Not so exciting
> text</a><a href="equallyboring.html" class = "blue">yawn</a><a
> class="green" href="IwantThis.html"><img border="0"
> src="importantimage.gif" alt="MeMe"></a>
> ';
> 
> # $html =~ s/\n/ /g;   # make it all on one line
> 
> my $tree = HTML::TreeBuilder->new();
> $tree->parse($html);
> 
> # find elements containing:   src="importantimage.gif"
> foreach my $img ( $tree->look_down('src', 'importantimage.gif') ) {
>    next unless $img->tag eq 'img';        # ensure the "src" attr was on
>                                           # an <img> element
> 
>    next unless $img->parent->tag eq 'a';  # ensure parent is an <a> element
>    my $href = $img->parent->attr('href'); # grab its "href" attr value
> 
>    print "$href\n";
> }
> 
> $tree->delete;
> ---------------------------------------------------------

Thanks.

I also figured out what was wrong (Keep the list short :) with the
regular expression in my original post. I had:

if($line =~ /<a.+?href="(.+?)".+?src="importantimage\.gif".+?><\/a>/)

But if I'd tried:

if($line =~ /<a.+href="(.+?)".+?src="importantimage\.gif".+><\/a>/)

I would have managed. Although I'll have to think about that a bit
more.
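
A sketch of why the two patterns behave differently, run against the
sample markup from the quoted program joined onto one line (the
variable names are made up for this illustration):

```perl
use strict;
use warnings;

# The sample markup from the quoted program, joined onto one line.
my $line = '<a class="red" href="uninteresting.html" target="_new">Not so exciting text</a>'
         . '<a href="equallyboring.html" class = "blue">yawn</a>'
         . '<a class="green" href="IwantThis.html"><img border="0" src="importantimage.gif" alt="MeMe"></a>';

# Lazy .+? after "<a": the match still begins at the FIRST "<a",
# so the capture is the first href on the line.
my ($lazy)   = $line =~ /<a.+?href="(.+?)".+?src="importantimage\.gif".+?><\/a>/;

# Greedy .+ after "<a": runs to the end of the line and backtracks,
# so the capture is the last href that still precedes the image.
my ($greedy) = $line =~ /<a.+href="(.+?)".+?src="importantimage\.gif".+><\/a>/;

print "lazy:   $lazy\n";
print "greedy: $greedy\n";
```

A lazy quantifier only limits how far each piece stretches; it does
not move the starting point of the match, which is always the leftmost
position that can succeed. Both patterns also depend on the markup
sitting on a single line, which the parser-based program does not.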


------------------------------

Date: 21 Aug 2003 06:06:47 GMT
From: "Tassilo v. Parseval" <tassilo.parseval@rwth-aachen.de>
Subject: Re: retaining data in memory
Message-Id: <bi1nhn$7ng$1@nets3.rz.RWTH-Aachen.DE>

Also sprach anonymous@coolgroups.com:

> does anyone know how i might write a perl program to store data in
> memory so that another perl program can access it at a later time?
> 
> basically, i want to write a script that takes an encoded image and
> stores the decoded version in memory.  then, sometime later, another
> program will be called to retrieve the decoded version from memory.

You didn't tell us which platform this is supposed to run on, so I
conveniently assume one with the shmget(2) syscall. In that case you can
use IPC::ShareLite (or a similar module from the IPC:: namespace; this
one, however, has always turned out to work best for me):

    use IPC::ShareLite;
    use Fcntl qw/:flock/;
    
    my $mem = IPC::ShareLite->new( -key     => 181079,
                                   -create  => 'yes',
                                   -destroy => 'no' );
    $mem->lock(LOCK_EX);
    $mem->store($img_data);
    $mem->unlock;
    
    exit;

And another script just has to do
    
    use IPC::ShareLite;
    use Fcntl qw/:flock/;
    
    my $mem = IPC::ShareLite->new( -key     => 181079,
                                   -destroy => 'no' );
    $mem->lock(LOCK_SH);
    my $img_data = $mem->fetch;
    $mem->unlock;

There are usually limits on the size of a shared memory segment. If your
data exceeds this limit, you have to create multiple segments (of which
there is no limit I know of) and split the data between them.
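
The splitting step could look roughly like the sketch below. It only
shows cutting the data into fixed-size pieces and joining them again;
split_chunks, the 300-byte size and the stand-in payload are all
assumptions for illustration, and how each piece is mapped to its own
segment key is left open.

```perl
use strict;
use warnings;

# Hypothetical helper: cut a (possibly binary) string into pieces of at
# most $size bytes. Each piece could then go into its own segment.
sub split_chunks {
    my ($data, $size) = @_;
    return unpack "(a$size)*", $data;
}

my $img_data = join '', map { chr($_ % 256) } 0 .. 999;   # stand-in payload
my @chunks   = split_chunks($img_data, 300);              # 300 is arbitrary

printf "%d chunks, last one %d bytes\n", scalar @chunks, length $chunks[-1];

# The reader fetches the pieces in the same order and joins them.
my $restored = join '', @chunks;
print "round trip ok\n" if $restored eq $img_data;
```

The reading script needs to know how many pieces there are, so the
count (or the total length) has to be stored somewhere both scripts
agree on.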

Tassilo
-- 
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval


------------------------------

Date: 21 Aug 2003 00:47:11 -0700
From: ts@relson.net (Tony)
Subject: Re: Strange INC error when using perl 5.8.0 as a cgi script
Message-Id: <63073ce9.0308202347.65c41603@posting.google.com>

Lou Moran <ellem52@mac.com> wrote in message news:<nbv6kvgab0ga7i7kp6q95rt8rg1tr4o5f7@4ax.com>...
> On 20 Aug 2003 05:53:25 -0700, ts@relson.net (Tony) wrote wonderful
> things about me:
> 
> >Hi All, 
> >
> >	I have just compiled perl 5.8.0 on aix 4.3.3. I got the stable.tar
> >from cpan. It compiled fine without any errors.  The trouble starts
> >when I try to use it to write cgi scripts.
> 
> 5.8.0 & 5.6.1 are not binary compatible.  This is known and
> documented.  Google is your friend.  Go get new modules.

Hi,

I have been doing a bit more digging and found that it fails from the
command line as well for all non-root users, so I think it's a
permissions problem somewhere. If I get it working I will let you know.

Tony


------------------------------

Date: 20 Aug 2003 21:24:48 -0700
From: Chewy2426@aol.com (Aaron)
Subject: undeclaring multiple arrays
Message-Id: <7036ffb9.0308202024.3c4e7f28@posting.google.com>

I've looked on deja a little but couldn't find a definite answer. I
created an array of hashes to store a lot of data. I have everything
declared as locally as possible with MYs but I'm still taking up too
much memory.

Here's a shortened version of my code:

foreach $key (sort { $top10talkTemp{$b} <=> $top10talkTemp{$a} }
keys(%top10talkTemp)) {

   foreach (@{$source{$key}}) {
       #Stuff in here
   }

   @{$source{$key}} = undef;
}

The @{%hash} is new to me. Is setting @{$source{$key}} = undef the
best way to clear the memory space, or can I do @{%source} = undef
after the foreach loop? Or is there even a better method?

Thanks in advance,
Aaron


------------------------------

Date: Thu, 21 Aug 2003 07:47:46 GMT
From: "John W. Krahn" <krahnj@acm.org>
Subject: Re: undeclaring multiple arrays
Message-Id: <3F447902.831F2DD5@acm.org>

Aaron wrote:
> 
> I've looked on deja a little but couldn't find a definite answer. I
> created an array of hashes to store a lot of data. I have everything
> declared as locally as possible with MYs but I'm still taking up too
> much memory.
> 
> Here's a shortened version of my code:
> 
> foreach $key (sort { $top10talkTemp{$b} <=> $top10talkTemp{$a} }
> keys(%top10talkTemp)) {
> 
>    foreach (@{$source{$key}}) {
>        #Stuff in here
>    }
> 
>    @{$source{$key}} = undef;
> }
> 
> The @{%hash} is new to me. Is setting @{$source{$key}} = undef the
> best way to clear the memory space, or can I do @{%source} = undef
> after the foreach loop? Or is there even a better method?

If you just want to delete the key then use delete:

delete $source{$key};

However if you want to keep the key and just clear the array for that
key:

@{$source{$key}} = ();
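
To make the difference between these concrete, here is a small sketch
with made-up data. It also shows what the original "= undef" line
actually does. The final line, which empties the whole hash after the
loop, is an addition for illustration and not part of the reply above.

```perl
use strict;
use warnings;

my %source = (a => [1, 2, 3], b => [4, 5], c => [6]);

@{ $source{a} } = undef;   # NOT empty: one element, which is undef
@{ $source{b} } = ();      # empty array, key "b" still present
delete $source{c};         # key "c" and its array are gone

printf "a has %d element(s)\n", scalar @{ $source{a} };
printf "b has %d element(s)\n", scalar @{ $source{b} };
printf "c %s\n", exists $source{c} ? 'exists' : 'deleted';

my $keys_before = keys %source;
%source = ();              # drops every key at once, after the loop
```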


John
-- 
use Perl;
program
fulfillment


------------------------------

Date: Sat, 19 Jul 2003 01:59:56 GMT
From: Bob Walton <bwalton@rochester.rr.com>
Subject: Re: 
Message-Id: <3F18A600.3040306@rochester.rr.com>

Ron wrote:

> Tried this code get a server 500 error.
> 
> Anyone know what's wrong with it?
> 
> if $DayName eq "Select a Day" or $RouteName eq "Select A Route") {

(---^


>     dienice("Please use the back button on your browser to fill out the Day
> & Route fields.");
> }
 ...
> Ron

 ...
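
With the missing parenthesis restored, the test compiles. A minimal
sketch follows; the dienice stub and the check_form wrapper are
stand-ins invented for this example, since the rest of the script was
not posted.

```perl
use strict;
use warnings;

# Stand-in for Ron's dienice(); the real one presumably prints an
# error page and exits.
sub dienice { my ($msg) = @_; return "ERROR: $msg" }

sub check_form {
    my ($DayName, $RouteName) = @_;
    if ($DayName eq "Select a Day" or $RouteName eq "Select A Route") {
        return dienice("Please use the back button on your browser "
                     . "to fill out the Day & Route fields.");
    }
    return "ok";
}

print check_form("Select a Day", "Route 7"), "\n";
print check_form("Monday", "Route 7"), "\n";
```

Running the script with "perl -c" from the command line reports this
kind of syntax error directly, rather than as a server 500.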
-- 
Bob Walton



------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 5403
***************************************
