[22170] in Perl-Users-Digest

Perl-Users Digest, Issue: 4391 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sun Jan 12 18:10:37 2003

Date: Sun, 12 Jan 2003 15:10:10 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sun, 12 Jan 2003     Volume: 10 Number: 4391

Today's topics:
    Re: Match-time code evaluation (Ben Morrow)
    Re: Match-time code evaluation (Jay Tilton)
    Re: Match-time code evaluation (Tad McClellan)
    Re: Net::Mysql - capture error messages (Mike Solomon)
        Newbie / cgi script doesn't take no option. <estrunk@home.nl>
    Re: Newbie / cgi script doesn't take no option. <tony_curtis32@yahoo.com>
    Re: Parsing /(terminated|non-terminated)/ records from  (Kevin Newman)
        reading commandline parameters <nomail@mail.com>
    Re: Recreating directory hierarchy <bik.mido@tiscalinet.it>
    Re: undef of large Hashes/Arrays took a very long time <pa@panix.com>
    Re: Using tr/// - Am I barking up the wrong tree? <Jodyman@hotmail.com>
    Re: Using tr/// - Am I barking up the wrong tree? <uri@stemsystems.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Sun, 12 Jan 2003 16:29:54 +0000 (UTC)
From: mauzo@mimosa.csv.warwick.ac.uk (Ben Morrow)
Subject: Re: Match-time code evaluation
Message-Id: <avs562$hqa$1@wisteria.csv.warwick.ac.uk>

ramm <invalid@invalid.com> wrote:
>Hello everybody,
>	I am reading chapter 5 of "Programming Perl 3rd" and I am
>stuck on this example:

A page reference would have been helpful. It's on p211.

>$_ = 'lothlorien';
>m/  (?{ $i = 0 })
>    (.    (?{ $i++ })    )*
>    lori
> /x;
>
>I cannot see why "i" holds 10 after this code is executed. Doesn't
                   ^
$i please. Otherwise you'll get confused when you start with @i, etc.

>backtracking decrease "i" each time a character is given back?

No, it doesn't. Because the code in (?{}) is arbitrary, Perl has no way of
knowing how to reverse its effects, so it doesn't try. So $i gets up to 10
and never comes back down.

The next paragraph explains how to use local to actually count the chars
matched, if you need to.
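
A minimal sketch of that local idiom, modelled on the pattern shown in the
perlre documentation rather than quoted from the book. The names $cnt and
$result are illustrative only:

```perl
use strict;
use warnings;

# Package variables, so that local() can be applied to them.
our ($cnt, $result);

$_ = 'lothlorien';

# The increment inside the loop is localized, so the regex engine undoes it
# whenever it backtracks out of an iteration. The last code block copies the
# count into $result before the localizations are unwound at the end.
m/  (?{ $cnt = 0 })
    (.    (?{ local $cnt = $cnt + 1 })    )*
    lori
    (?{ $result = $cnt })
 /x;

print "characters matched before 'lori': $result\n";
```

With this version the count should come out as 4 (the length of 'loth')
rather than 10.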

>I played around with the code, and ended up with the following:
>
>$_ = 'lothlorien';
>m/  (?{ $i = 0 })
>    .*
>    (?{ $i++ })
>    lori
> /x;
>
>Why does this yield a different result (=1)?

Uh? I get 7...

First (?{ $i = 0 }) matches at the start of the string.
Then .* gobbles 10 characters.
Then (?{ $i++ }) matches nothing.
Then 'l' doesn't match '', so we backtrack and steal a char from .*.
(?{ $i++ }) matches nothing again.
'l' doesn't match 'n', so we backtrack.
(?{ $i++ }).
'l' doesn't match 'en', backtrack.
(?{ $i++ }).
'l' doesn't match 'ien', backtrack.
(?{ $i++ }).
'l' !~ 'rien', backtrack.
(?{ $i++ }).
'l' !~ 'orien', backtrack.
(?{ $i++ }).
'lori' does match 'lorien', so we're done.

That makes $i == 7.
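
One way to check a trace like this on a particular perl is to record pos()
each time the code block runs. This is only a sketch, and @seen is an
illustrative name. How many entries it collects can differ between perl
versions, but the last one should be 4, the offset just before 'lori':

```perl
use strict;
use warnings;

# Offsets at which the code block was executed, in order.
our @seen;

$_ = 'lothlorien';

# Inside (?{ }), pos() reports the current match position in $_.
if ( m/ .* (?{ push @seen, pos() }) lori /x ) {
    print "code block ran ", scalar(@seen), " time(s) at offsets: @seen\n";
}
```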

Ben


------------------------------

Date: Sun, 12 Jan 2003 16:57:57 GMT
From: tiltonj@erols.com (Jay Tilton)
Subject: Re: Match-time code evaluation
Message-Id: <3e219a8f.222083858@news.erols.com>

 mauzo@mimosa.csv.warwick.ac.uk (Ben Morrow) wrote:

: ramm <invalid@invalid.com> wrote:
:
: >I played around with the code, and ended up with the following:
: >
: >$_ = 'lothlorien';
: >m/  (?{ $i = 0 })
: >    .*
: >    (?{ $i++ })
: >    lori
: > /x;
: >
: >Why does this yield a different result (=1)?
: 
: Uh? I get 7...

Different versions of Perl.  5.6 vs 5.8, probably.



------------------------------

Date: Sun, 12 Jan 2003 10:56:55 -0600
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: Match-time code evaluation
Message-Id: <slrnb237im.9je.tadmc@magna.augustmail.com>

ramm <invalid@invalid.com> wrote:

> 	I am reading chapter 5 of "Programming Perl 3rd" 


That is an awfully big book...


> and I am
> stuck on this example:


 ... so a page number would have been helpful.   (p211)


> $_ = 'lothlorien';
> m/  (?{ $i = 0 })
>     (.    (?{ $i++ })    )*
>     lori
>  /x;
> 
> I cannot see why "i" holds 10 after this code is executed. Doesn't
> backtracking decrease "i" each time a character is given back?


Obviously not.  :-)

You seem to be expecting that perl can "invert/undo" the code
somehow. That might seem a reasonable expectation in the case
of a simple increment, but it wouldn't generalize for something
like   (?{  big_ol_subroutine() }).

big_ol_subroutine() might write something to a file for instance.
We can't really expect perl to unwrite what has been written.
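
To make that concrete, here is a small sketch in which the code block
appends to a string instead of writing to a file. The variable $written is
an assumption standing in for any external side effect:

```perl
use strict;
use warnings;

# Stands in for "something written to a file".
our $written = '';

$_ = 'lothlorien';

# Each pass through the loop appends the character just consumed.
# Backtracking gives characters back to the engine, but nothing is
# ever removed from $written.
m/  (.    (?{ $written .= substr($_, pos() - 1, 1) })    )*
    lori
 /x;

print "match consumed: $&\n";
print "side effect   : $written\n";
```

The match itself ends after 'lothlori', yet $written still holds whatever
was appended on the paths the engine later abandoned.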


> I played around with the code, and ended up with the following:
> 
> $_ = 'lothlorien';
> m/  (?{ $i = 0 })
>     .*
>     (?{ $i++ })
>     lori
>  /x;
> 
> Why does this yield a different result (=1)?


I don't know (yet). Let's try and find out.

As with "all things regex" we turn to "Mastering Regular Expressions"
(2e) for help.  :-)

Let's add some strategically placed print()s for debugging, as
suggested on page 332.

--------------------------------------------
#!/usr/bin/perl
#use strict;
use warnings;

$_ = 'lothlorien';

### 1) CODE inside, greedy
m/  (?{ $i = 0 })
    (.    
       (?{ $i++; print "[$`<$&>$'] (i=$i)\n" })    
    )*
    lori
    (?{ print "final: [$`<$&>$'] (i=$i)\n\n" })
 /x;

### 2) CODE outside, greedy
m/  (?{ $i = 0 })
    (.    
    )*
       (?{ $i++; print "[$`<$&>$'] (i=$i)\n" })    
    lori
    (?{ print "final: [$`<$&>$'] (i=$i)\n\n" })
 /x;

### 3) CODE inside, non-greedy
m/  (?{ $i = 0 })
    (.    
       (?{ $i++; print "[$`<$&>$'] (i=$i)\n" })    
    )*?
    lori
    (?{ print "final: [$`<$&>$'] (i=$i)\n\n" })
 /x;

### 4) CODE outside, non-greedy
m/  (?{ $i = 0 })
    (.    
    )*?
       (?{ $i++; print "[$`<$&>$'] (i=$i)\n" })    
    lori
    (?{ print "final: [$`<$&>$'] (i=$i)\n\n" })
 /x;
--------------------------------------------

output:

[<l>othlorien] (i=1)
[<lo>thlorien] (i=2)
[<lot>hlorien] (i=3)
[<loth>lorien] (i=4)
[<lothl>orien] (i=5)
[<lothlo>rien] (i=6)
[<lothlor>ien] (i=7)
[<lothlori>en] (i=8)
[<lothlorie>n] (i=9)
[<lothlorien>] (i=10)
final: [<lothlori>en] (i=10)

[<loth>lorien] (i=1)
final: [<lothlori>en] (i=1)

[<l>othlorien] (i=1)
[<lo>thlorien] (i=2)
[<lot>hlorien] (i=3)
[<loth>lorien] (i=4)
final: [<lothlori>en] (i=4)

[<>lothlorien] (i=1)
[<loth>lorien] (i=2)
final: [<lothlori>en] (i=2)


I think the regex engine is doing the "Pre-check of required
character/substring optimization" (MRE p244) because the
"lori" substring is required by the pattern.
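
A guess like this can be checked with the re pragma's debug mode. It dumps
the compiled pattern, including any substring the engine has decided to
look for up front, and then a trace of the match. A minimal sketch follows;
the exact output format varies between perl versions, so none is shown:

```perl
use strict;
use warnings;

# Lexically scoped: patterns compiled in this file report their compiled
# form and an execution trace on STDERR.
use re 'debug';

my $matched = 'lothlorien' =~ /.*lori/ ? 1 : 0;
print "matched: $matched\n";
```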


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: 12 Jan 2003 13:47:06 -0800
From: mike_solomon@lineone.net (Mike Solomon)
Subject: Re: Net::Mysql - capture error messages
Message-Id: <56568be5.0301121347.6ab0938a@posting.google.com>

"Andrey Tapkin" <tapkin@rol.ru> wrote in message news:<avndc2$nal$1@news.sovam.com>...
> Use MysqlPP instead of Mysql:
> ...DBI->connect("DBI:mysqlPP:database...
> Don't close ur eyes when you write something :-))
> 
Andrew,

Thanks

I will try and keep my eyes open in future

Regards

Mike


------------------------------

Date: Sun, 12 Jan 2003 17:50:41 +0100
From: "estrunk" <estrunk@home.nl>
Subject: Newbie / cgi script doesn't take no option.
Message-Id: <K0hU9.18956$J6.1762249@zwoll1.home.nl>

Hello all,

I'm really new to this, and I found an older script that is useful for my
site.
It works, but it always prints the line from
if [ "$FOUND" = = "no" ]; then
      echo "<H3>No names found!</H3>"
I know that "no" is the default, but when something is found it should be "yes".
So it shows a lot of information AND a line which tells me that nothing has
been found.

Please help!!
And please also send it to my normal e-mail address as a cc.

-----------script--------------
FOUND="no"
cd $DOCROOT
$GREP -i $VALUE *.* |
   $AWK -F':' ' \
      $1 == prev { \
         print($2) \
      }; \
      $1 != prev { \
         print("</PRE><P><B>Bestandsnaam:</B><BR><I>",$1,"</I>\n<PRE>"); \
         print("</PRE><BR><PRE>"); \
         print($2); \
         FOUND="yes"; \
         prev = $1 \
      }';

if [ "$FOUND" = = "no" ]; then
      echo "<H3>No names found!</H3>"

fi
echo "</BODY></HTML>"
-------------end script---------------
--
Kind Regards,
Met Vriendelijke Groeten,
E. Strunk





------------------------------

Date: Sun, 12 Jan 2003 10:53:29 -0600
From: Tony Curtis <tony_curtis32@yahoo.com>
Subject: Re: Newbie / cgi script doesn't take no option.
Message-Id: <87lm1qs41i.fsf@limey.hpcc.uh.edu>

>> On Sun, 12 Jan 2003 17:50:41 +0100,
>> "estrunk" <estrunk@home.nl> said:

> Hello all, I'm really new to this, and I found an older
> script that is useful for my site.  It works, but it
> always prints the line from if [ "$FOUND" = = "no" ]; then echo
> "<H3>No names found!</H3>" I know that "no" is the default, but
> when something is found it should be "yes". So it shows a lot of
> information AND a line which tells me that nothing has
> been found.

> -----------script--------------
> FOUND="no" cd $DOCROOT

That is a shell script.

This is comp.lang.perl.misc.

hth
t


------------------------------

Date: 12 Jan 2003 14:04:36 -0800
From: knewman00@earthlink.net (Kevin Newman)
Subject: Re: Parsing /(terminated|non-terminated)/ records from a file
Message-Id: <4c8e4398.0301121404.5894d065@posting.google.com>

"Jodyman" <Jodyman@hotmail.com> wrote in message news:<PaZT9.5021$Qr4.486776@newsread1.prod.itd.earthlink.net>...

>         If the files aren't too large, you can slurp them into a single
> variable and then break up the individual records or fields (i.e. apply
> rules) after having removed all \r\n|\n|\r.  I've used this to parse phone
> books before and it works great.  I'm not telling you how to do it but
> perhaps a different method.

It's okay, you can tell me how to do it.  This problem has challenged
my Perl Kung Fu, so I'm seeking any help I can get. :-)

Good suggestion, but there are a couple of immediate problems that
arise.

1.  The size of the files that I work with is routinely in the
100MB-500MB range.  On occasion, the files are > 1GB.  That might be
okay for one user, but multiple users on the same system may use this
utility.
2.  Files with logical record lengths are a very difficult problem
because the logical length often does not agree with the physical
length.  So, I would be unable to determine the start and end of each
record unless each record is perfectly formed.

However, if you have an example of what you are describing I would
like to see it.  I'm sure there is an "easy" way to handle this data
transparently.

Thanks,

kln


------------------------------

Date: Sun, 12 Jan 2003 22:37:55 GMT
From: Tommi <nomail@mail.com>
Subject: reading commandline parameters
Message-Id: <77mU9.648$0m4.142@read3.inet.fi>


How can I read commandline parameters?


------------------------------

Date: Sun, 12 Jan 2003 19:28:01 +0100
From: Michele Dondi <bik.mido@tiscalinet.it>
Subject: Re: Recreating directory hierarchy
Message-Id: <8gb32v4rntkc82l8ps9ia5t50dgb70991g@4ax.com>

On Thu, 09 Jan 2003 22:47:44 +0100, I (Michele Dondi)
<bik.mido@tiscalinet.it> wrote:

>On Wed, 08 Jan 2003 18:39:20 -0500, Benjamin Goldberg
><goldbb2@earthlink.net> wrote:

>>I suppose... you want to provide some modified code?
[...]
>  warn("'$dest' exists: refusing to clobber it"), return if -e $dest;

>I was just about to send this post when I realized, on a second
>thought, that if '-e $dest' and $_ is not the toplevel (source) dir,
[...]
>But maybe I'm missing something obvious and all this is utter
>nonsense...

On a third thought I realized that my second thought was indeed utter
nonsense! This is probably why you didn't even bother following up. OTOH
I want to acknowledge my gross and embarrassing mistake - memento to
myself: never follow up to Benjamin Goldberg (or anybody) without
thinking at least three times!! :-)

However, the first remark still partly applies, only I realize that the
modification I suggested is more annoying than useful with dirs, but I
guess that discussing this doesn't make sense any more.


Michele
-- 
>It's because the universe was programmed in C++.
No, no, it was programmed in Forth.  See Genesis 1:12:
"And the earth brought Forth ..."
- Robert Israel on sci.math, thread "Why numbers?"


------------------------------

Date: Sun, 12 Jan 2003 17:16:09 +0000 (UTC)
From: Pierre Asselin <pa@panix.com>
Subject: Re: undef of large Hashes/Arrays took a very long time
Message-Id: <avs7sm$8bk$2@reader1.panix.com>

In <3E20B47D.56246AB6@earthlink.net> Benjamin Goldberg <goldbb2@earthlink.net> writes:

>Why are you undefing, instead of declaring the array as a lexical
>variable and merely letting it go out of scope?  Or assigning an empty
>list to it?

I no longer have the script.  I don't remember if I undef'd the stuff
or assigned empties.  The data structures were at (main) package
scope.  I know that one version of the script did *nothing* to reclaim
memory, and it took forever to exit after it was done (confirmed by
unbuffered prints).


>When you undef() an array or hash, you're doing something very special,
>which often has bad consequences on performance.

Ok, so the OP should try assigning () and see if that helps.
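
The three alternatives mentioned in the quoted text can be sketched as
follows. The variable names and sizes are arbitrary, and the sketch only
shows the syntax. It makes no claim about which one is faster on a given
perl:

```perl
use strict;
use warnings;

# 1) Lexical variable: the array is released when the block ends.
{
    my @scoped = (1 .. 100_000);
    print "inside block: ", scalar(@scoped), " elements\n";
}

# 2) Assigning the empty list: the variable stays usable and is now empty.
my @data = (1 .. 100_000);
@data = ();
print "after \@data = (): ", scalar(@data), " elements\n";

# 3) undef: also empties the container and releases its storage.
my %table = map { $_ => 1 } 1 .. 100_000;
undef %table;
print "after undef %table: ", scalar(keys %table), " keys\n";
```

Timing each variant with the real data (for instance with the Benchmark
module) is the only reliable way to see whether it makes a difference.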


------------------------------

Date: Sun, 12 Jan 2003 18:40:32 GMT
From: "Jodyman" <Jodyman@hotmail.com>
Subject: Re: Using tr/// - Am I barking up the wrong tree?
Message-Id: <AEiU9.6101$Dq.654277@newsread2.prod.itd.earthlink.net>

"Uri Guttman" <uri@stemsystems.com> wrote in message

> >>>>> "J" == Jodyman  <Jodyman@hotmail.com> writes:
>
>   J> "R. Charles Henry" <trapforcannedmeatproduct@hotmail.com> wrote in message
>
>   >> Trying to convert individual letters in a string, to gif image HTML tags:
>   >> e.g.
>   >> from:  jo@jo.com to  <img src="j.gif"><img src="o.gif"><img
>   >> src="at.gif"><img src="j.gif"> etc.
>   >> is there an easier way of doing this?
>
>   J> my $in = 'jo@jo.com';
>
>   J> my @in = split //, $in;
>
>   J> foreach (@in) {
>   J> $html .= "<img src='$_.gif'>";
>   J> }
>
>   J> print $html;
>
> why the split and loop? a simple s/// will do it.
>
>   J> BTW, ..gif, @.gif etc are valid filenames so you can use the easier way.
>
> but / is not a valid filename in unix, : is not in macos.
>
> to handle special chars, you make a hash. you also can handle alpha
> chars in there.
>
> my %char2file = (
> '@' => 'at',
> '#' => 'hash',
> map { $_ => $_ } 'a' .. 'z' ) ;
>
> my $html = 'jo@jo.com';
> $html =~ s/(.)/<img src="$char2hash{$1}.gif">/g ;

### slight embellishment:

#!c:\perl\bin\perl -w
use strict;

my %char2file = (
 '@' => 'at',
 '#' => 'hash',
 map { $_ => $_ } 'a' .. 'z' ) ;

 my $html = 'jo@jo.com';
 $html =~ s/(.)/<img src="$char2hash{$1}.gif">/g ;


Global symbol "%char2hash" requires explicit package name at mapchar.pl line 10.
Execution of mapchar.pl aborted due to compilation errors.

Seems this code is broken. ;-)

Jody




------------------------------

Date: Sun, 12 Jan 2003 19:13:55 GMT
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: Using tr/// - Am I barking up the wrong tree?
Message-Id: <x71y3idvv1.fsf@mail.sysarch.com>

>>>>> "J" == Jodyman  <Jodyman@hotmail.com> writes:

  J> "Uri Guttman" <uri@stemsystems.com> wrote in message
  >> 
  >> my %char2file = (
  >> $html =~ s/(.)/<img src="$char2hash{$1}.gif">/g ;


  J> Global symbol "%char2hash" requires explicit package name at
  J> mapchar.pl line

  J> Seems this code is broken. ;-)

i didn't claim it was tested. a simple name mistake caught by the use of
strict. so it is a good example to those who don't use strict. i don't
even count that as a bug as it is trivial to find and fix.
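
For completeness, a sketch with the hash name made consistent. The 'dot'
entry and the pass-through for unmapped characters are additions for
illustration and were not in the code posted above:

```perl
use strict;
use warnings;

my %char2file = (
    '@' => 'at',
    '#' => 'hash',
    '.' => 'dot',                  # assumed mapping, not in the posted code
    map { $_ => $_ } 'a' .. 'z',
);

my $html = 'jo@jo.com';

# /e evaluates the replacement as code; characters without a mapping are
# left as they are instead of producing a tag with an empty file name.
$html =~ s/(.)/exists $char2file{$1} ? qq{<img src="$char2file{$1}.gif">} : $1/ge;

print "$html\n";
```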

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  -------- http://www.stemsystems.com
----- Stem and Perl Development, Systems Architecture, Design and Coding ----
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org
Damian Conway Perl Classes - January 2003 -- http://www.stemsystems.com/class


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 4391
***************************************

