[29595] in Perl-Users-Digest


Perl-Users Digest, Issue: 839 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Sep 10 21:09:42 2007

Date: Mon, 10 Sep 2007 18:09:10 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Mon, 10 Sep 2007     Volume: 11 Number: 839

Today's topics:
    Re: (?{ code }) block works fine in child rule but not  <clint.olsen@gmail.com>
    Re: (?{ code }) block works fine in child rule but not  xhoster@gmail.com
    Re: (?{ code }) block works fine in child rule but not  xhoster@gmail.com
    Re: (?{ code }) block works fine in child rule but not  <clint.olsen@gmail.com>
    Re: (?{ code }) block works fine in child rule but not  <ben@morrow.me.uk>
    Re: (?{ code }) block works fine in child rule but not  <nospam-abuse@ilyaz.org>
        append to the right instead of the bottom <jiehuang001@gmail.com>
    Re: append to the right instead of the bottom xhoster@gmail.com
    Re: append to the right instead of the bottom <noreply@gunnar.cc>
        Replacing every character in a string with individual <  PaddyPerl@gmail.com
    Re: Replacing every character in a string with individu <mritty@gmail.com>
    Re: Replacing every character in a string with individu  usenet@DavidFilmer.com
    Re: Replacing every character in a string with individu <mritty@gmail.com>
    Re: Replacing every character in a string with individu <noreply@gunnar.cc>
    Re: Replacing every character in a string with individu  PaddyPerl@gmail.com
    Re: Replacing every character in a string with individu <noreply@gunnar.cc>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Mon, 10 Sep 2007 15:30:51 -0500
From: Clint Olsen <clint.olsen@gmail.com>
Subject: Re: (?{ code }) block works fine in child rule but not in parent
Message-Id: <slrnfebabr.5mi.clint.olsen@belle.0lsen.net>

On 2007-09-10, Ben Morrow <ben@morrow.me.uk> wrote:
> This last regex contains both interpolation and code escapes which do not
> come from a qr//. This is what is forbidden unless you have re 'eval'.
> The documentation does say this, but it is not entirely clear.  (Patches
> welcome :) ).
>
> The solution (if you want to avoid re 'eval', which is a good idea) is
> to precompile the code assertion into a qr// as well:
>
>     my $escape_id = qr/\\(\S+)(?=\s)/;
>     my $simple_id = qr/([a-zA-Z_][a-zA-Z0-9_\$]*)/;
>     my $capture   = qr/(?{ $foo = $^N })/;
>     my $id        = qr/ ($simple_id | $escape_id) $capture /x;
>
> Note that all of your '/o's are redundant, as you are using qr//.

Yeah, this is an old habit.  Thanks for the reminder.  What I'd like to
do is use combinations of $^N and $^R to roll up
lexical subexpressions in a nice way so that I don't have to rewrite stuff,
and also to have an entire RE that describes every lexical
possibility, à la what lex/flex would do if you were constructing a lexical
analyzer.  So, that would also mean using the minimum required capture
buffers to extract the tokens and any state necessary.
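
A minimal sketch of the quoted suggestion, for anyone who wants to try it. The four patterns are the ones from Ben's reply (with the closing parenthesis in $simple_id restored); the first_identifier sub and the sample inputs are made up for illustration. It uses a package variable because lexicals inside (?{ }) were unreliable in perls of this era.

```perl
use strict;
use warnings;

our $foo;   # package variable assigned from inside the regex

my $escape_id = qr/\\(\S+)(?=\s)/;
my $simple_id = qr/([a-zA-Z_][a-zA-Z0-9_\$]*)/;
my $capture   = qr/(?{ $foo = $^N })/;          # code block lives in its own qr//
my $id        = qr/ ($simple_id | $escape_id) $capture /x;

# Hypothetical helper: return the first identifier found in $text,
# as recorded by the code block, or undef if nothing matched.
sub first_identifier {
    my ($text) = @_;
    undef $foo;
    return $text =~ $id ? $foo : undef;
}

print first_identifier('foo_bar = 1'), "\n";
print first_identifier('\bus[3] = 1'), "\n";
```

$^N is the text of the most recently closed group, which here is the outer alternation group, so the escaped form comes back with its leading backslash.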

> FWIW, insulting Perl in a Perl newsgroup is not likely to be a way to get
> useful advice...

Point taken.  However, I often subscribe to the notion that I both love
and hate Perl simultaneously.  It's really awesome when it works well, and
a sonofabitch to debug otherwise :)

Thanks to all of you for your help and suggestions.

-Clint


------------------------------

Date: 10 Sep 2007 20:45:14 GMT
From: xhoster@gmail.com
Subject: Re: (?{ code }) block works fine in child rule but not in parent
Message-Id: <20070910164516.841$ux@newsreader.com>

Ben Morrow <ben@morrow.me.uk> wrote:
> Quoth Clint Olsen <clint.olsen@gmail.com>:
> > On 2007-09-10, xhoster@gmail.com <xhoster@gmail.com> wrote:
> > > Apparently the problem is that you think the answer is to start out
> > > by reading the source code rather than the documentation.
> > >
> > > The behavior you describe is documented in both perldoc perlre and
> > > perldoc re.
> >
> > Yes, I read that section, but I'm not relying on any runtime
> > interpolation to get my work done (or did I misread something?).
>
> But you are. From your original post:
>
> | my $escaped_identifier = qr/\\(\S+)(?=\s)/o;
> | my $simple_identifier = qr/([a-zA-Z_][a-zA-Z0-9_\$]*)/o;
> | my $identifier = qr/  ($simple_identifier
> |                     | $escaped_identifier)
> |                       (?{ $foo = $^N })
> |                    /xo;
>
> This last regex contains both interpolation and code escapes which do
> not come from a qr//. This is what is forbidden unless you have re
> 'eval'.

Does this requirement make sense?  Why does it matter if some part
of the regex which *isn't* the code part comes from an interpolation?
Was this just easier to implement than whatever makes more sense
would be?

> The documentation does say this, but it is not entirely clear.

I would argue that it is entirely anti-clear.

       For the purpose of this pragma, interpolation of precompiled
       regular expressions (i.e., the result of "qr//") is not
       considered variable interpolation.


                 For reasons of security, this construct is forbidden
                 if the regular expression involves run-time
                 interpolation of variables, unless the perilous
                 "use re 'eval'" pragma has been used (see re), or the
                 variables contain results of "qr//" operator (see
                 "qr/STRING/imosx" in perlop).

But all of the interpolated variables in his example do contain the results
of qr//.

It seems clear but the apparently clear meaning is not correct.

> (Patches welcome :) ).

I'm no longer confident that I know what it does, so I don't know
what it should say.

                 For reasons of security, this construct is forbidden
                 if the regular expression involves run-time
                 interpolation of variables, unless the perilous
                 "use re 'eval'" pragma has been used (see re), even
                 if those variables are results of "qr//".  However,
                 variables containing qr// compiled forms of this
                 construct can themselves be interpolated into other
                 regular expressions which involve other interpolations.



Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.


------------------------------

Date: 10 Sep 2007 20:53:28 GMT
From: xhoster@gmail.com
Subject: Re: (?{ code }) block works fine in child rule but not in parent
Message-Id: <20070910165330.968$RF@newsreader.com>

Clint Olsen <clint.olsen@gmail.com> wrote:
> On 2007-09-10, xhoster@gmail.com <xhoster@gmail.com> wrote:
> > Apparently the problem is that you think the answer is to start out by
> > reading the source code rather than the documentation.
> >
> > The behavior you describe is documented in both perldoc perlre and
> > perldoc re.
>
> Yes, I read that section, but I'm not relying on any runtime
> interpolation to get my work done (or did I misread something?).

Ah, now I see.  Now I prefer your reading of the docs to my reading of the
docs (or I would if it weren't for the sad fact that the worse reading
seems to be the correct one).

> I want
> to avoid using switches that are 'perilous' when it isn't required.

Maybe I'm missing something here, but I would argue that if you are running
in a hostile environment, it isn't enough to refuse to use re 'eval', you
should also run under taint.  And once you use taint appropriately, I don't
see why use re 'eval' would be perilous.

Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.


------------------------------

Date: Mon, 10 Sep 2007 16:52:33 -0500
From: Clint Olsen <clint.olsen@gmail.com>
Subject: Re: (?{ code }) block works fine in child rule but not in parent
Message-Id: <slrnfebf51.5mi.clint.olsen@belle.0lsen.net>

On 2007-09-10, xhoster@gmail.com <xhoster@gmail.com> wrote:
> Ah, now I see.  Now I prefer your reading of the docs to my reading of
> the docs (or I would if it weren't for the sad fact that the worse
> reading seems to be the correct one).
>
> Maybe I'm missing something here, but I would argue that if you are
> running in a hostile environment, it isn't enough to refuse to use re
> 'eval', you should also run under taint.  And once you use taint
> appropriately, I don't see why use re 'eval' would be perilous.

Well, what I'm writing is just a parser (lexer for this part).  It's not
technically a 'hostile' environment in the sense of worrying about someone
trying to execute rogue code, but I just hesitate when I see warnings like
this.

I don't think taint applies in my application.  At least intuitively I
don't believe it does.  The example uses $^X, and it isn't immediately
obvious why $x is tainted.  I definitely want to extract substrings from my
substitutions since this is the lexical analysis phase of the parsing.  I'm
only using s/// because it performs much much better than m/// with the \G
assertion.
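
For comparison, the m//gc plus \G style mentioned above can be sketched as follows. The tokenize sub and the token names (ID, ESC_ID, CHAR) are made up for illustration; the two identifier patterns are the ones from earlier in the thread. This says nothing about how its speed compares with the s/// approach.

```perl
use strict;
use warnings;

# Hypothetical lexer: walk $text left to right, anchoring each attempt
# at the current pos() with \G; /c keeps pos() when an attempt fails.
sub tokenize {
    my ($text) = @_;
    my @tokens;
    while (1) {
        if ( $text =~ /\G\s+/gc ) {
            next;                                   # skip whitespace
        }
        elsif ( $text =~ /\G([a-zA-Z_][a-zA-Z0-9_\$]*)/gc ) {
            push @tokens, [ ID => $1 ];
        }
        elsif ( $text =~ /\G\\(\S+)(?=\s)/gc ) {
            push @tokens, [ ESC_ID => $1 ];
        }
        elsif ( $text =~ /\G(.)/gcs ) {
            push @tokens, [ CHAR => $1 ];           # anything else, one char
        }
        else {
            last;                                   # end of input
        }
    }
    return @tokens;
}

print join( ' ', map { join ':', @$_ } tokenize('foo \bar[1] +') ), "\n";
```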

Thanks,

-Clint


------------------------------

Date: Tue, 11 Sep 2007 00:57:44 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: (?{ code }) block works fine in child rule but not in parent
Message-Id: <o82hr4-b56.ln1@osiris.mauzo.dyndns.org>


Quoth xhoster@gmail.com:
> Ben Morrow <ben@morrow.me.uk> wrote:
> >
> > This last regex contains both interpolation and code escapes which do
> > not come from a qr//. This is what is forbidden unless you have re
> > 'eval'.
> 
> Does this requirement make sense?  Why does it matter if some part
> of the regex which *isn't* the code part comes from an interpolation?

It doesn't... :)

> Was this just easier to implement than whatever makes more sense
> would be?

AFAICS (and the guts of the regex engine are *very* hard to follow) it
is a consequence of Perl's regexen doing two-fold interpolation. First
the qr is stringified and interpolated, and then the result is compiled.
Perl has no way of knowing which bits came from where. (Any ideas anyone
may have had about qrs being more efficient when interpolated into other
regexen later are, unfortunately, false.[1]) 

However, when a qr is interpolated, it makes a record of how many eval
groups it contained; then when the regex engine compiles an eval group,
it checks to see whether it has met more eval groups so far than have
been interpolated from qrs; if so, it throws the 'Eval-group not
allowed' error. This is, of course, horribly crude, but it's hard to see
what else could be done without completely re-working the way the regex
engine operates.

[1] Although this may change in 5.10. I understand a lot of work has
gone into the regex engine; in part making qrs re-use their compiled
form more often. I'm afraid I don't know the details...

> > The documentation does say this, but it is not entirely clear.
> 
> I would argue that it is entirely anti-clear.

Heh. Yes, I agree. That was an understatement... :) 

> > (Patches welcome :) ).
> 
> I'm no longer confident that I know what it does, so I don't know
> what it should say.
> 
>                  For reasons of security, this construct is forbidden
>                  if the regular expression involves run-time
>                  interpolation of variables, unless the perilous
>                  "use re 'eval'" pragma has been used (see re), even
>                  if those variables are results of "qr//".  However,
>                  variables containing qr// compiled forms of this
>                  construct can themselves be interpolated into other
>                  regular expressions which involve other interpolations.

    For reasons of security, this construct is forbidden if the regular
    expression contains variable interpolations, unless it results from
    the interpolation of a C<qr//>, or C<use re 'eval'> is in effect.
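
A small sketch of the case the restriction is aimed at: a code block that arrives through a plain string rather than a qr//. The variable names are made up for illustration. The mixed qr//-plus-literal case discussed in this thread may behave differently on other perl versions, so the sketch sticks to the plain-string case.

```perl
use strict;
use warnings;

our $hits = 0;

# A plain string (not a qr//) that happens to contain a code block.
my $pattern_text = '(?{ $main::hits++ })a';

# Without the pragma, compiling the interpolated pattern dies at run time.
my $ok_plain = eval { my $m = 'a' =~ /$pattern_text/; 1 };
my $error    = $@;
chomp $error;

# With the pragma in lexical scope, the same pattern compiles and runs.
my $ok_pragma = eval {
    use re 'eval';
    my $m = 'a' =~ /$pattern_text/;
    1;
};

print "without re 'eval': ", ( $ok_plain  ? "compiled" : "died: $error" ), "\n";
print "with re 'eval':    ", ( $ok_pragma ? "compiled" : "died" ), "\n";
```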

Ben



------------------------------

Date: Tue, 11 Sep 2007 01:05:49 +0000 (UTC)
From:  Ilya Zakharevich <nospam-abuse@ilyaz.org>
Subject: Re: (?{ code }) block works fine in child rule but not in parent
Message-Id: <fc4pld$ckp$1@agate.berkeley.edu>

[A complimentary Cc of this posting was sent to
<xhoster@gmail.com>], who wrote in article <20070910164516.841$ux@newsreader.com>:
> > This last regex contains both interpolation and code escapes which do
> > not come from a qr//. This is what is forbidden unless you have re
> > 'eval'.
> 
> Does this requirement make sense?  Why does it matter if some part
> of the regex which *isn't* the code part comes from an interpolation?
> Was this just easier to implement than whatever makes more sense
> would be?

Right.  Basically, it was "either I make this feature secure quick, or
it won't make it into v5.6".  The "proper" solution would mean a MAJOR
rehaul of how REx engine interacts with the lexer.

Hope this helps,
Ilya


------------------------------

Date: Mon, 10 Sep 2007 13:43:59 -0700
From:  Jie <jiehuang001@gmail.com>
Subject: append to the right instead of the bottom
Message-Id: <1189457039.754685.115170@g4g2000hsf.googlegroups.com>


I am wondering if there is a way to append strings to the right in
perl.

for example, i want to create an output file like below:
A X
B Y
C Z

based on the following 2 arrays.
@array_1 = ("A", "B", "C");
@array_2 = ("X", "Y", "Z");

I am thinking to write some code like below:
#############################
@arrays = ("array_1", "array_2")
foreach $array (@arrays) {
    open OUTPUT ">>final_file.txt";
    $this_column ="";
    foreach (@{$array}) {
        $this_column .= $_;
    }
    print OUTPUT $this_column;
}
##############################

however, apparently, the above code will generate something like
below, instead of what I desired....  So, please help!!!! A temporary
2-dimensional array might not be a good idea though, because my real
data is really big....

A
B
C
X
Y
Z



------------------------------

Date: 10 Sep 2007 21:08:00 GMT
From: xhoster@gmail.com
Subject: Re: append to the right instead of the bottom
Message-Id: <20070910170803.119$Nd@newsreader.com>

Jie <jiehuang001@gmail.com> wrote:
> I am thinking if there is a way to append strings to the right in
> perl.

That is the only way to append strings.  That is what append means.  (Well,
assuming that by "right" you mean what most people do when talking about
strings, the end.)


>
> for example, i want to create an output file like below:
> A X
> B Y
> C Z
>
> based on the following 2 arrays.
> @array_1 = ("A", "B", "C");
> @array_2 = ("X", "Y", "Z");
>
> I am thinking to write some code like below:
> #############################
> @arrays = ("array_1", "array_2")
> foreach $array (@arrays) {
>     open OUTPUT ">>final_file.txt";

If you are going to post example code, please make sure it works
first.  Unless syntax errors are what your example code is supposed
to be an example of.

>     $this_column ="";
>     foreach (@{$array}) {
>         $this_column .= $_;
>     }
>     print OUTPUT $this_column;
> }
> ##############################
>
> however, apparently, the above code will generates something like
> below, instead of what i desired....  So, please help!!!! A temporary
> 2-dimentional array might not be a good idea though, because my real
> data is really big....

You already have a 2-dimensional array, albeit one in which one of
the dimensions is in the symbol table.  Your questions are more likely to
be taken seriously here if you follow the guide-lines by, for example,
using "strict" and thus using lexical variables rather than the symbol
table, and posting *real* code.

Anyway, the issues involved here seem to be morally equivalent to the "how
to transpose a huge text file" you posted last month.

Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.


------------------------------

Date: Tue, 11 Sep 2007 00:09:37 +0200
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: append to the right instead of the bottom
Message-Id: <5kltleF4d7caU1@mid.individual.net>

Jie wrote:
> i want to create an output file like below:
> A X
> B Y
> C Z
> 
> based on the following 2 arrays.
> @array_1 = ("A", "B", "C");
> @array_2 = ("X", "Y", "Z");

     while ( my $col_1 = shift @array_1 ) {
         print "$col_1 ", shift @array_2, "\n";
     }
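
The shift loop above empties both arrays and stops early at a false value such as "0". A non-destructive variant under strict, using array references rather than symbolic names, might look like the sketch below. The rows_from_columns sub is made up for illustration, and missing cells are simply printed as empty strings.

```perl
use strict;
use warnings;

my @array_1 = ( "A", "B", "C" );
my @array_2 = ( "X", "Y", "Z" );

# Hypothetical helper: take any number of array references (one per
# column) and return one space-joined line per row index.
sub rows_from_columns {
    my @columns = @_;
    my $last = -1;
    for my $col (@columns) {
        $last = $#$col if $#$col > $last;       # longest column wins
    }
    my @rows;
    for my $i ( 0 .. $last ) {
        push @rows, join ' ', map { defined $_->[$i] ? $_->[$i] : '' } @columns;
    }
    return @rows;
}

my @rows = rows_from_columns( \@array_1, \@array_2 );
print "$_\n" for @rows;
```

Printing to a filehandle opened once, outside the loop, would give the output file the original poster asked for.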

> I am thinking to write some code like below:
> #############################
> @arrays = ("array_1", "array_2")
> foreach $array (@arrays) {
>     open OUTPUT ">>final_file.txt";
>     $this_column ="";
>     foreach (@{$array}) {
>         $this_column .= $_;
>     }
>     print OUTPUT $this_column;
> }
> ##############################
> 
> however, apparently, the above code will generates something like

<snip>

> A
> B
> C
> X
> Y
> X

Apparently?? On my computer, and leaving the syntax errors aside, your 
code rather outputs

ABCXYZ

-- 
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl


------------------------------

Date: Mon, 10 Sep 2007 22:05:26 -0000
From:  PaddyPerl@gmail.com
Subject: Replacing every character in a string with individual <img> tags
Message-Id: <1189461926.385823.215380@r29g2000hsg.googlegroups.com>

Hi everybody!

OK, so what I want to do is to display a string in html as images
rather than text. On the server I have individual image files a.gif,
b.gif, c.gif ... 0.gif, 1.gif, 2.gif, 3.gif etc...
So if the string is Hello, I would like to get the following output:
<img src="H.gif"><img src="e.gif"><img src="l.gif"><img src="l.gif"><img src="o.gif">

I would like to translate all characters, numbers and special
characters as well.

Shouldn't this be quite easy? I just don't know how to do it though...
Thanks a lot for your help!

/Paddy



------------------------------

Date: Mon, 10 Sep 2007 15:10:35 -0700
From:  Paul Lalli <mritty@gmail.com>
Subject: Re: Replacing every character in a string with individual <img> tags
Message-Id: <1189462235.441157.176830@57g2000hsv.googlegroups.com>

On Sep 10, 6:05 pm, PaddyP...@gmail.com wrote:
> Hi everybody!
>
> OK, so what I want to do is to display a string in html as images
> rather than text. On the server I have individual image files a.gif,
> b.gif, c.gif ... 0.gif, 1.gif, 2.gif, 3.gif etc...
> So if the string is Hello, I would like to get the following output:
> <img src="H.gif"><img src="e.gif"><img src="l.gif"><img src="l.gif"><img src="o.gif">
>
> I would like to translate all characters, numbers and special
> characters as well.
>
> Shouldn't this be quite easy? I just don't know how to do it
> though...

In general, you should always post your best attempt when asking for
help.

However, I'm bored.

$string = s/(.)/<img src="$1.gif">/g;

Paul Lalli



------------------------------

Date: Mon, 10 Sep 2007 23:47:22 -0000
From:  usenet@DavidFilmer.com
Subject: Re: Replacing every character in a string with individual <img> tags
Message-Id: <1189468042.319831.180820@19g2000hsx.googlegroups.com>

On Sep 10, 3:10 pm, Paul Lalli <mri...@gmail.com> wrote:
> $string = s/(.)/<img src="$1.gif">/g;

Surely Paul meant
   $string =~ s/(.)/<img src="$1.gif">/g;

But you need to consider if double-quote is one of the "special
characters" that you want to allow...
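
One way to keep the generated markup well-formed when the input contains a double quote (or &, <, >) is to entity-escape the character before it goes into the attribute. This is only a sketch: the img_tag and to_img_tags subs are made up, and whether files with such names exist on the server is a separate question. Characters like / ? # or % would additionally need URL escaping, which is not shown here.

```perl
use strict;
use warnings;

# Characters that must not appear raw inside a double-quoted attribute.
my %html_entity = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
);

# Hypothetical helper: build the <img> tag for a single character.
sub img_tag {
    my ($char) = @_;
    my $name = exists $html_entity{$char} ? $html_entity{$char} : $char;
    return qq{<img src="$name.gif">};
}

# Unlike s/(.)/.../g, split // also yields newline characters.
sub to_img_tags {
    my ($string) = @_;
    return join '', map { img_tag($_) } split //, $string;
}

print to_img_tags('Hi"'), "\n";
```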



--
David Filmer (http://DavidFilmer.com)




------------------------------

Date: Mon, 10 Sep 2007 16:53:16 -0700
From:  Paul Lalli <mritty@gmail.com>
Subject: Re: Replacing every character in a string with individual <img> tags
Message-Id: <1189468396.821343.192610@d55g2000hsg.googlegroups.com>

On Sep 10, 7:47 pm, use...@DavidFilmer.com wrote:
> On Sep 10, 3:10 pm, Paul Lalli <mri...@gmail.com> wrote:
>
> > $string = s/(.)/<img src="$1.gif">/g;
>
> Surely Paul meant
>    $string =~ s/(.)/<img src="$1.gif">/g;

Took me five reads to realize you're pointing out the fact that I used
= where I meant =~

Thanks,
Paul Lalli



------------------------------

Date: Tue, 11 Sep 2007 01:59:54 +0200
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: Replacing every character in a string with individual <img> tags
Message-Id: <5km446F4g98nU1@mid.individual.net>

usenet@DavidFilmer.com wrote:
> 
>    $string =~ s/(.)/<img src="$1.gif">/g;
> 
> But you need to consider if double-quote is one of the "special
> characters" that you want to allow...

Special? In what sense?

-- 
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl


------------------------------

Date: Tue, 11 Sep 2007 00:07:06 -0000
From:  PaddyPerl@gmail.com
Subject: Re: Replacing every character in a string with individual <img> tags
Message-Id: <1189469226.920386.77070@r29g2000hsg.googlegroups.com>

>    $string =~ s/(.)/<img src="$1.gif">/g;
>

Great! Thanks a lot to all of you!
But... I've now realized I have another problem...
With special characters such as Ä, I decided to go for filenames such
as AE.gif to avoid problems with encoding, and first tried a
$string =~ s/Ä/AE/g;
prior to the above, but then of course I'm left with two images (A and
E) instead of the desired AE.gif. Is there a way to avoid a lot of if
statements here?

Paddy



------------------------------

Date: Tue, 11 Sep 2007 02:30:39 +0200
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: Replacing every character in a string with individual <img> tags
Message-Id: <5km5tsF4heb4U1@mid.individual.net>

PaddyPerl@gmail.com wrote:
>>    $string =~ s/(.)/<img src="$1.gif">/g;
> 
> Great! Thanks a lot to all of you!
> But... I've now realized I have another problem...
> With special characters such as Ä, I decided to go for filenames such
> as AE.gif to avoid problems with encoding, and first tried a
> $string =~ s/Ä/AE/g;
> prior to the above, but then of course I'm left with two images (A and
> E) instead of the desired AE.gif. Is there a way to avoid a lot of if
> statements here?

Use a hash.

     my %special = (
         'Ä' => 'AE',
         'Ö' => 'OE',
     );

     s/(.)/'<img src="'.($special{$1} or $1).'.gif">'/eg;
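
A self-contained version of the hash approach, as a sketch. The characters are written as \x{..} escapes so the example does not depend on the encoding of the script file. The second key (O with diaeresis for OE) is an assumption, and to_img_tags is a made-up name. The substitution only sees one character per (.) if the input string is already decoded into characters; with raw UTF-8 bytes each accented letter would be two matches.

```perl
use strict;
use warnings;

my %special = (
    "\x{C4}" => 'AE',    # A with diaeresis
    "\x{D6}" => 'OE',    # O with diaeresis (assumed pairing)
);

# Replace every character with an <img> tag, using the lookup table
# for characters that have a multi-letter filename.
sub to_img_tags {
    my ($string) = @_;
    $string =~ s/(.)/'<img src="' . (exists $special{$1} ? $special{$1} : $1) . '.gif">'/eg;
    return $string;
}

print to_img_tags("\x{C4}B0"), "\n";
```

Using exists rather than "or" keeps the lookup correct even if a table entry were ever mapped to a false value.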

-- 
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 839
**************************************
