[32396] in Perl-Users-Digest


Perl-Users Digest, Issue: 3663 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Apr 11 14:09:24 2012

Date: Wed, 11 Apr 2012 11:09:06 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Wed, 11 Apr 2012     Volume: 11 Number: 3663

Today's topics:
    Re: f python? (Seymour J.)
    Re: f python? <kaz@kylheku.com>
    Re: f python? <rweikusat@mssgmbh.com>
    Re: f python? <pjb@informatimago.com>
    Re: Help with pattern matching <ben@morrow.me.uk>
    Re: Help with pattern matching <artmerar@yahoo.com>
    Re: Help with pattern matching <NoSpamPleaseButThisIsValid3@gmx.net>
    Re: Help with pattern matching <justin.1203@purestblue.com>
    Re: Help with pattern matching <rweikusat@mssgmbh.com>
    Re: Help with pattern matching <rweikusat@mssgmbh.com>
    Re: Help with pattern matching <rweikusat@mssgmbh.com>
    Re: Help with pattern matching <artmerar@yahoo.com>
    Re: Help with pattern matching <m@rtij.nl.invlalid>
    Re: Help with pattern matching <ben@morrow.me.uk>
    Re: Help with pattern matching <rweikusat@mssgmbh.com>
    Re: Help with pattern matching (Tim McDaniel)
    Re: Help with pattern matching <rweikusat@mssgmbh.com>
    Re: Help with perl special variable <ben@morrow.me.uk>
    Re: Help with perl special variable riccardo.marini@gmail.com
    Re: Help with perl special variable riccardo.marini@gmail.com
    Re: Help with perl special variable <ben@morrow.me.uk>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Tue, 10 Apr 2012 21:09:20 -0400
From: Shmuel (Seymour J.) Metz <spamtrap@library.lspace.org.invalid>
Subject: Re: f python?
Message-Id: <4f84d9c0$9$fuzhry+tra$mr2ice@news.patriot.net>

In <87wr5nl54w.fsf@sapphire.mobileactivedefense.com>, on 04/10/2012
   at 09:10 PM, Rainer Weikusat <rweikusat@mssgmbh.com> said:

>'car' and 'cdr' refer to cons cells in Lisp, not to strings. How the
>first/rest terminology can be sensibly applied to 'C strings' (which
>are similar to linked-lists in the sense that there's a 'special
>termination value' instead of an explicit length)

A syringe is similar to a sturgeon in the sense that they both start
with S. LISP doesn't have arrays, and C doesn't allow you to insert
into the middle of an array.

-- 
Shmuel (Seymour J.) Metz, SysProg and JOAT  <http://patriot.net/~shmuel>

Unsolicited bulk E-mail subject to legal action.  I reserve the
right to publicly post or ridicule any abusive E-mail.  Reply to
domain Patriot dot net user shmuel+news to contact me.  Do not
reply to spamtrap@library.lspace.org



------------------------------

Date: Wed, 11 Apr 2012 14:06:53 +0000 (UTC)
From: Kaz Kylheku <kaz@kylheku.com>
Subject: Re: f python?
Message-Id: <20120411065129.546@kylheku.com>

["Followup-To:" header set to comp.lang.lisp.]
On 2012-04-11, Shmuel Metz <spamtrap@library.lspace.org.invalid> wrote:
> In <87wr5nl54w.fsf@sapphire.mobileactivedefense.com>, on 04/10/2012
>    at 09:10 PM, Rainer Weikusat <rweikusat@mssgmbh.com> said:
>
>>'car' and 'cdr' refer to cons cells in Lisp, not to strings. How the
>>first/rest terminology can be sensibly applied to 'C strings' (which
>>are similar to linked-lists in the sense that there's a 'special
>>termination value' instead of an explicit length)
>
> A syringe is similar to a sturgeon in the sense that they both start
> with S. LISP doesn't have arrays, and C doesn't allow you to insert
> into the middle of an array.

Lisp, however, has arrays. (Not to mention hash tables, structures, and
classes). Where have you been since 1960-something?

  (let ((array #(1 2 3 4)))
    (aref array 3)) ;; -> 4, O(1) access


------------------------------

Date: Wed, 11 Apr 2012 15:49:26 +0100
From: Rainer Weikusat <rweikusat@mssgmbh.com>
Subject: Re: f python?
Message-Id: <877gxmz5l5.fsf@sapphire.mobileactivedefense.com>

Shmuel (Seymour J.) Metz <spamtrap@library.lspace.org.invalid> writes:
> In <87wr5nl54w.fsf@sapphire.mobileactivedefense.com>, on 04/10/2012
>    at 09:10 PM, Rainer Weikusat <rweikusat@mssgmbh.com> said:
>
>>'car' and 'cdr' refer to cons cells in Lisp, not to strings. How the
>>first/rest terminology can be sensibly applied to 'C strings' (which
>>are similar to linked-lists in the sense that there's a 'special
>>termination value' instead of an explicit length)
>
> A syringe is similar to a sturgeon in the sense that they both start
> with S.

And the original definition of 'idiot' is 'a guy who cannot learn
because he is too cocksure to already know everything'. Not that this
would matter in the given context ...

> LISP doesn't have arrays,

Lisp has arrays.

> and C doesn't allow you to insert
> into the middle of an array.

Well, of course it does: You just have to move the content of all
memory cells 'after' the new insert 'one up'. But unless I'm very much
mistaken, the topic was "first and rest" (car and cdr), as the terms
could be used with a C string and not "whatever Shmuel happens to
believe to know" ...


------------------------------

Date: Wed, 11 Apr 2012 17:32:42 +0200
From: "Pascal J. Bourguignon" <pjb@informatimago.com>
Subject: Re: f python?
Message-Id: <87aa2iz3l1.fsf@kuiper.lan.informatimago.com>

Shmuel (Seymour J.) Metz <spamtrap@library.lspace.org.invalid> writes:

> In <87wr5nl54w.fsf@sapphire.mobileactivedefense.com>, on 04/10/2012
>    at 09:10 PM, Rainer Weikusat <rweikusat@mssgmbh.com> said:
>
>>'car' and 'cdr' refer to cons cells in Lisp, not to strings. How the
>>first/rest terminology can be sensibly applied to 'C strings' (which
>>are similar to linked-lists in the sense that there's a 'special
>>termination value' instead of an explicit length)
>
> A syringe is similar to a sturgeon in the sense that they both start
> with S. LISP doesn't have arrays, and C doesn't allow you to insert
> into the middle of an array.

You're confused. C doesn't have arrays.  Lisp has arrays.
C only has vectors (Lisp has vectors too).

That C calls its vectors "array", or its bytes "char" doesn't change the
fact that C has no array and no character.


cl-user> (make-array '(3 4 5) :initial-element 42)
#3A(((42 42 42 42 42) (42 42 42 42 42) (42 42 42 42 42) (42 42 42 42 42))
    ((42 42 42 42 42) (42 42 42 42 42) (42 42 42 42 42) (42 42 42 42 42))
    ((42 42 42 42 42) (42 42 42 42 42) (42 42 42 42 42) (42 42 42 42 42)))

cl-user> (make-array 10 :initial-element 42)
#(42 42 42 42 42 42 42 42 42 42)



-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
A bad day in () is better than a good day in {}.


------------------------------

Date: Wed, 11 Apr 2012 14:25:38 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Help with pattern matching
Message-Id: <ibdg59-9ut.ln1@anubis.morrow.me.uk>


Quoth Wolf Behrenhoff <NoSpamPleaseButThisIsValid3@gmx.net>:
> On 11.04.2012 11:51, Justin C wrote:
> > On 2012-04-11, ExecMan <artmerar@yahoo.com> wrote:
> > 
> >>   $count = grep { /$url/ } <FILE>;
> >>
> >> The $url contains slashes, how can I get around this??
> > 
> > Don't use / as a regex delimiter. See perlretut and search for
> > 'delimiters'.
> 
> What is wrong with / as delimiter?

I suspect Justin is making the same mistake as the OP, and confusing

    my $url = "http://foo";
    /$url/;

, which works just fine, with

    /http://foo/;

, which doesn't, and needs to be rewritten as

    m!http://foo!;

Perl's string parsing rules are not as simple as they appear at first
glance (...to say the least).
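
A small sketch of the three cases above, in case it helps. The sample
`$line` is made up for illustration; everything else is the same
"http://foo" pattern. The slashes inside `$url` never meet the parser,
so only the literal pattern needs another delimiter or escaping.

```perl
use strict;
use warnings;

# Hypothetical input line, only for illustration.
my $line = 'GET http://foo/index.html';

# The parser finds the closing '/' of /$url/ at compile time;
# the slashes inside $url are interpolated later, at run time.
my $url = "http://foo";
my $via_variable = $line =~ /$url/ ? 1 : 0;

# A literal pattern containing '/' needs another delimiter ...
my $via_bang = $line =~ m!http://foo! ? 1 : 0;

# ... or escaped slashes.
my $via_escape = $line =~ /http:\/\/foo/ ? 1 : 0;

print "$via_variable $via_bang $via_escape\n";   # prints "1 1 1"
```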

Ben



------------------------------

Date: Wed, 11 Apr 2012 06:31:28 -0700 (PDT)
From: ExecMan <artmerar@yahoo.com>
Subject: Re: Help with pattern matching
Message-Id: <f1642993-65b7-4707-8c2d-5bff7cc69a53@y13g2000yqj.googlegroups.com>

On Apr 11, 5:11 am, Ben Morrow <b...@morrow.me.uk> wrote:
> Quoth ExecMan <artme...@yahoo.com>:
>
> > I have a file containing URL's, and I am trying to scan a log and get
> > a count of the matching string.  But, I think because the input
> > contains slashes I am not getting a match.  Any help please??  I'm
> > pretty new to this:
>
> > #!/usr/bin/perl
> > open (FILE,"monday.csv") or die $!;
> > while(<FILE>) {
> >   chomp($_);
> >   ($tag, $url) = split(',', $_);
> >   $url_tags{$tag} = $url;
> >   $url_counts{$tag} = 0;
> > }
> > close(FILE);
>
> > open (FILE,"<","/home/httpdlogs/apache2/access_log") or die "Can't open apache log!";
> > foreach $tag (keys(%url_tags)) {
> >   $url = $url_tags{$tag};
> >   $count = grep { /$url/ } <FILE>;
>
> The slashes are not the problem. Perl isn't like shell, which expands
> variables before doing word splitting: perl finds the end of /$url/ at
> compile time, before it knows whether $url will contain slashes or not.
>
> You have two problems here. The first and most obvious is that <FILE>,
> in list context, reads the file to the end, breaks it into lines, and
> *leaves the file pointer at the end of the file*. That means that next
> time round the loop, the file pointer is already at the end, and <FILE>
> returns the empty list.
>
> There are several ways to fix this. The simplest is to read the file
> once into an array, and run the grep over the array instead:
>
>     open (FILE, "<", "/...") or die ...;
>     @log = <FILE>;
>     close FILE;
>
>     foreach $tag (keys(%url_tags)) {
>         $url = $url_tags{$tag};
>         $count = grep { /$url/ } @log;
>         ...
>     }
>
> (I'm deliberately omitting several important corrections to that code
> I'll mention below, so you can see which changes are relevant here.)
>
> The second problem is that while perl looks for the closing slash before
> interpolating $url, it looks for regex metacharacters afterwards. This
> means that if one of your URLs contains, say, '+', perl will interpret
> it as a 'match at least once' pattern character. To fix this you need
> the \Q escape, which says 'quote everything from here until \E':
>
>     $count = grep { /\Q$url/ } @log;
>
> Some more general remarks:
>
> You don't appear to be using 'warnings' or 'strict'. Until you know
> enough to know better, start *every* Perl program with
>
>     use warnings;
>     use strict;
>
> This will then start yelling at you about 'Global symbol requires
> explicit package name': this means you need to go through and declare
> all your variables with 'my'. The point of this is that it makes it much
> less likely you'll reuse a variable without meaning to by mistake, or
> that you'll misspell a variable name and get a completely new variable
> with no warning.
>
> You should also be keeping your filehandles in 'my' variables, for the
> same reason. As your program gets longer, it becomes increasingly likely
> you will use FILE for something else somewhere else, and you'll get a
> mess. 'my' variables aren't visible outside the block they're declared
> in, so that can't happen.
>
> If that file 'monday.csv' is actually CSV, generated from some other
> program, you can't safely parse it like that. CSV has (rather
> ill-defined) quoting rules, to allow entries to contain ',', and a lot
> of programs randomly quote CSV when they didn't really need to. Reading
> it is a lot harder than it seems, and you should use a module from CPAN,
> such as Text::CSV.
>
> Ben


Hi,

I got around what I thought was a slash issue like this.  Not sure if
it is the fastest thing:

foreach $tag (keys(%url_tags)) {
  open (FILE,"/home/httpdlogs/apache2/access_log") or die "Can't open apache log!";
  $url = $url_tags{$tag};
  $url =~ s/([\\\/\^\$\*\+\?\=\@\{\}\[\]\(\)\<\>])/\\$&/g;
  $count = grep { /$url/ } <FILE>;
  $url_counts{$tag} = $count;
  close(FILE);
}

Also, about reading the file into an array.  Problem is the file could
be a couple of million lines long.  Isn't that a lot to be reading
into an array?  If not, and it would be faster, then maybe I'll change
the code.


------------------------------

Date: Wed, 11 Apr 2012 16:07:06 +0200
From: Wolf Behrenhoff <NoSpamPleaseButThisIsValid3@gmx.net>
Subject: Re: Help with pattern matching
Message-Id: <4f85900b$0$6625$9b4e6d93@newsspool2.arcor-online.net>

On 11.04.2012 15:31, ExecMan wrote:
> Hi,
> 
> I got around what I thought was a slash issue like this.  Not sure if
> it is the fastest thing:
> 
> foreach $tag (keys(%url_tags)) {
>   open (FILE,"/home/httpdlogs/apache2/access_log") or die "Can't open
> apache log!";
>   $url = $url_tags{$tag};
>   $url =~ s/([\\\/\^\$\*\+\?\=\@\{\}\[\]\(\)\<\>])/\\$&/g;

What are you doing?!!! Way too many slashes to be able to read this.
I guess you are trying to achieve what just a \Q would do.

Did you even read Ben's and/or my answer?
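
For comparison, a minimal sketch of what \Q does. The URL and log line
here are invented for illustration and contain '+' and '?' on purpose:
unquoted, those act as quantifiers and the literal text is not found.

```perl
use strict;
use warnings;

# Hypothetical URL containing regex metacharacters ('.', '+', '?').
my $url  = 'http://example.com/a+b?x=1';
my $line = 'GET http://example.com/a+b?x=1 HTTP/1.1';

# Unquoted: 'a+' means "one or more a" and 'b?' means "optional b",
# so the literal '+' and '?' in the line are never matched.
my $raw_match = $line =~ /$url/ ? 1 : 0;

# \Q...\E backslash-escapes every non-word character of $url.
my $quoted_match = $line =~ /\Q$url\E/ ? 1 : 0;

print "$raw_match $quoted_match\n";   # prints "0 1"
```

No hand-written character class is needed; the built-in quotemeta
function does the same escaping if you want the escaped string itself.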

>   $count = grep { /$url/ } <FILE>;
>   $url_counts{$tag} = $count;
>   close(FILE);
> }
> 
> Also, about reading the file into an array.  Problem is the file could
> be a couple of million lines long.  Isn't that a lot to be reading
> into an array?  If not, and it would be faster, then maybe I'll change
> the code.

Now you are reading the file multiple times. Do you really think that is
better?

If the log file is really too large (probably it isn't) then read it
line by line as suggested in my previous posting in b2).

- Wolf


------------------------------

Date: Wed, 11 Apr 2012 15:04:51 +0100
From: Justin C <justin.1203@purestblue.com>
Subject: Re: Help with pattern matching
Message-Id: <3lfg59-fla.ln1@zem.masonsmusic.co.uk>

On 2012-04-11, Wolf Behrenhoff <NoSpamPleaseButThisIsValid3@gmx.net> wrote:
> On 11.04.2012 11:51, Justin C wrote:
>> On 2012-04-11, ExecMan <artmerar@yahoo.com> wrote:
>> 
>> [snip]
>> 
>>>   $count = grep { /$url/ } <FILE>;
>>>
>>> The $url contains slashes, how can I get around this??
>> 
>> Don't use / as a regex delimiter. See perlretut and search for
>> 'delimiters'.
>
> What is wrong with / as delimiter?

Apparently nothing, I was misunderstanding where his problem was.

   Justin.

-- 
Justin C, by the sea.


------------------------------

Date: Wed, 11 Apr 2012 15:31:10 +0100
From: Rainer Weikusat <rweikusat@mssgmbh.com>
Subject: Re: Help with pattern matching
Message-Id: <87mx6iz6fl.fsf@sapphire.mobileactivedefense.com>

Wolf Behrenhoff <NoSpamPleaseButThisIsValid3@gmx.net> writes:
> On 11.04.2012 11:51, Justin C wrote:
>> On 2012-04-11, ExecMan <artmerar@yahoo.com> wrote:
>> 
>> [snip]
>> 
>>>   $count = grep { /$url/ } <FILE>;
>>>
>>> The $url contains slashes, how can I get around this??
>> 
>> Don't use / as a regex delimiter. See perlretut and search for
>> 'delimiters'.
>
> What is wrong with / as delimiter?

Nothing. There's 'something wrong'/ an inherent limitation with the
concept of 'a delimiter character', namely, that any occurrence of this
character inside a pattern will need to be escaped. As an alternative,
Perl supports using arbitrary delimiter characters so that a character
which doesn't appear inside the pattern can be used if such a
character exists. As hinted at by the second part of this sentence,
this doesn't really solve the 'problem' because it is conceivable that
no suitable character can be found. Further drawbacks are that it adds
significant 'optical noise' to the text and that it is not compatible
with the regular expression syntax used by other UNIX(*) tools
anymore. Because of this, I have so far just continued to use the
/-separator + /-escaping syntax also supported by, say, sed.



------------------------------

Date: Wed, 11 Apr 2012 15:32:21 +0100
From: Rainer Weikusat <rweikusat@mssgmbh.com>
Subject: Re: Help with pattern matching
Message-Id: <87iph6z6dm.fsf@sapphire.mobileactivedefense.com>

Ben Morrow <ben@morrow.me.uk> writes:

[...]


> I suspect Justin is making the same mistake as the OP, and confusing
>
>     my $url = "http://foo";
>     /$url/;
>
> , which works just fine, with
>
>     /http://foo/;
>
> , which doesn't, and needs to be rewritten as
>
>     m!http://foo!;

/http:\/\/foo/ would also work.


------------------------------

Date: Wed, 11 Apr 2012 15:43:46 +0100
From: Rainer Weikusat <rweikusat@mssgmbh.com>
Subject: Re: Help with pattern matching
Message-Id: <87ehruz5ul.fsf@sapphire.mobileactivedefense.com>

ExecMan <artmerar@yahoo.com> writes:

[...]

> Also, about reading the file into an array.  Problem is the file could
> be a couple of million lines long.  Isn't that a lot to be reading
> into an array?  If not, and it would be faster, then maybe I'll change
> the code.

Whether it is 'a lot' depends on how much memory you want to dedicate
to this task and if you're reasonably sure that the size of your
input will never exceed that. The latter is especially problematic
because an out-of-memory failure happening because of 'a large input'
is a very bad situation for rewriting the code supposed to process
that input. There's also the question if this task is so much more
important than all other tasks running on the same computer that
you're willing to maximize its resource usage in order to minimize the
wallclock time it needs to complete.

Except in cases where it is known that the input file will always be
'rather small', eg, if it is a configuration file, the safe,
conservative choice is to process it line-by-line and assume that the
buffering layer between the Perl code and the system I/O facilities
will employ a 'sensible' buffering strategy.
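
A minimal sketch of that line-by-line approach, counting all URLs in a
single pass. The hash names %url_tags and %url_counts are taken from the
original poster's code; the sample data and the in-memory filehandle
standing in for the access log are assumptions made so the example runs
on its own.

```perl
use strict;
use warnings;

# Hypothetical tag => URL table and log contents, for illustration only.
my %url_tags   = (home => 'http://foo/', faq => 'http://foo/faq?id=1');
my %url_counts = map { $_ => 0 } keys %url_tags;

my $log = join '',
    "GET http://foo/ 200\n",
    "GET http://foo/faq?id=1 200\n",
    "GET http://bar/ 404\n";

# An in-memory filehandle stands in for the real access log.
open my $fh, '<', \$log or die "Can't open log: $!";

# One pass over the input; only the current line is held in memory.
while (my $line = <$fh>) {
    for my $tag (keys %url_tags) {
        # Count the line if it contains the URL as literal text.
        $url_counts{$tag}++ if $line =~ /\Q$url_tags{$tag}\E/;
    }
}
close $fh;

print "$_: $url_counts{$_}\n" for sort keys %url_counts;
```

Like the grep version, this counts matching lines, so 'http://foo/' is
also counted on the line that contains 'http://foo/faq?id=1'.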


------------------------------

Date: Wed, 11 Apr 2012 07:38:39 -0700 (PDT)
From: ExecMan <artmerar@yahoo.com>
Subject: Re: Help with pattern matching
Message-Id: <26c54a67-b7c2-4c0e-ab79-81140cb223c9@h4g2000yqj.googlegroups.com>

On Apr 11, 9:07 am, Wolf Behrenhoff
<NoSpamPleaseButThisIsVal...@gmx.net> wrote:
> On 11.04.2012 15:31, ExecMan wrote:
>
> > Hi,
>
> > I got around what I thought was a slash issue like this.  Not sure if
> > it is the fastest thing:
>
> > foreach $tag (keys(%url_tags)) {
> >   open (FILE,"/home/httpdlogs/apache2/access_log") or die "Can't open apache log!";
> >   $url = $url_tags{$tag};
> >   $url =~ s/([\\\/\^\$\*\+\?\=\@\{\}\[\]\(\)\<\>])/\\$&/g;
>
> What are you doing?!!! Way too many slashes to be able to read this.
> I guess you are trying to achieve what just a \Q would do.
>
> Did you even read Ben's and/or my answer?
>
> >   $count = grep { /$url/ } <FILE>;
> >   $url_counts{$tag} = $count;
> >   close(FILE);
> > }
>
> > Also, about reading the file into an array.  Problem is the file could
> > be a couple of million lines long.  Isn't that a lot to be reading
> > into an array?  If not, and it would be faster, then maybe I'll change
> > the code.
>
> Now you are reading the file multiple times. Do you really think that is
> better?
>
> If the log file is really too large (probably it isn't) then read it
> line by line as suggested in my previous posting in b2).
>
> - Wolf


Ok, your solution seems to work.   Nice:


open (FILE,"<","/home/httpdlogs/apache2/access_log") or die "Can't open log!";
@log = <FILE>;
close (FILE);

foreach $tag (keys(%url_tags)) {
  $url = $url_tags{$tag};
  $count = grep { /\Q$url/ } @log;
  $url_counts{$tag} = $count;
}

I'm just worried about a 4 million line file going into an array.  As
long as it does not take up too many resources.  If the file is say,
300MB, that is a lot to put into an array......




------------------------------

Date: Wed, 11 Apr 2012 17:19:40 +0200
From: Martijn Lievaart <m@rtij.nl.invlalid>
Subject: Re: Help with pattern matching
Message-Id: <c1kg59-quf.ln1@news.rtij.nl>

On Wed, 11 Apr 2012 15:43:46 +0100, Rainer Weikusat wrote:

> Except in cases where it is known that the input file will always be
> 'rather small', eg, if it is a configuration file, the safe,
> conservative choice is to process it line-by-line and assume that the
> buffering layer between the Perl code and the system I/O facilities will
> employ a 'sensible' buffering strategy.

This is such important advice. Absolutely spot on. Well worded too. 
Should go in a FAQ somewhere as this subject actually comes up fairly 
often.

M4


------------------------------

Date: Wed, 11 Apr 2012 17:47:55 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Help with pattern matching
Message-Id: <r6pg59-k801.ln1@anubis.morrow.me.uk>


Quoth Rainer Weikusat <rweikusat@mssgmbh.com>:
> 
> Nothing. There's 'something wrong'/ an inherent limitation with the
> concept of 'a delimiter character', namely, that occurrence of this
> character inside a pattern will need to be escaped.

Yes. However, any pattern included in some larger body of text will need
to be delimited somehow, and making human programmers write manual
Pascal strings isn't kind. In the general case (when generating code)
you can use here-docs and Cantor's Diagonal Argument, at least until you
hit perl's line length limit (if there is one: I'm not sure).

> Further drawbacks are that it adds significant 'optical noise' to the
> text and that it is not compatible with the regular expression syntax
> used by other UNIX(*) tools anymore.

Perl's regexes are not compatible with those of other Unix tools in any
case. This is a feature: egrep-style regexen rapidly become unreadable,
especially when used from a language which makes you quote them again.
('...Now you have two problems.')

Ben



------------------------------

Date: Wed, 11 Apr 2012 18:19:51 +0100
From: Rainer Weikusat <rweikusat@mssgmbh.com>
Subject: Re: Help with pattern matching
Message-Id: <878vi2xk20.fsf@sapphire.mobileactivedefense.com>

Ben Morrow <ben@morrow.me.uk> writes:
> Quoth Rainer Weikusat <rweikusat@mssgmbh.com>:

[...]


>> Further drawbacks are that it adds significant 'optical noise' to the
>> text and that it is not compatible with the regular expression syntax
>> used by other UNIX(*) tools anymore.
>
> Perl's regexes are not compatible with those of other Unix tools in any
> case.

Contrived counter-example:

[rw@sapphire]~ $echo 'a/b' | sed 's/\//\/\//'
a//b
[rw@sapphire]~ $echo 'a/b' | perl -pe 's/\//\/\//'
a//b


> This is a feature: egrep-style regexen rapidly become unreadable,
> especially when used from a language which makes you quote them
> again.

That's your opinion, not mine. My opinion is that code which uses
non-uniform ad hoc syntax is much harder to read than code which
consistently uses one syntax.


------------------------------

Date: Wed, 11 Apr 2012 17:26:05 +0000 (UTC)
From: tmcd@panix.com (Tim McDaniel)
Subject: Re: Help with pattern matching
Message-Id: <jm4erc$2n7$1@reader1.panix.com>

In article <878vi2xk20.fsf@sapphire.mobileactivedefense.com>,
Rainer Weikusat  <rweikusat@mssgmbh.com> wrote:
>Ben Morrow <ben@morrow.me.uk> writes:
>> Quoth Rainer Weikusat <rweikusat@mssgmbh.com>:
>
>[...]
>
>
>>> Further drawbacks are that it adds significant 'optical noise' to
>>> the text and that it is not compatible with the regular expression
>>> syntax used by other UNIX(*) tools anymore.

I believe that there has never been "the" regular expression syntax in
UNIX tools: I believe that most tools chose their own implementations.
Heck, even egrep wasn't compatible with grep.

>> Perl's regexes are not compatible with those of other Unix tools in any
>> case.
>
>Contrived counter-example:
>
>[rw@sapphire]~ $echo 'a/b' | sed 's/\//\/\//'
>a//b
>[rw@sapphire]~ $echo 'a/b' | perl -pe 's/\//\/\//'
>a//b

An experienced Perl person should know what he meant: *in general*,
Perl's regexps are not identical with those of other tools (except
those that use Perl or libraries designed to be Perl regexp).

$ echo 'a(b)' | sed -e 's/(b)/{B}/'
a{B}
$ echo 'a(b)' | perl -pe 's/(b)/{B}/'
a({B})

Or any other case where Perl's metacharacters differ from sed, ed,
grep, egrep, or whatnot.
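
The same difference seen from the Perl side, as a small sketch: to get
what sed's basic regexes do with bare parentheses, Perl needs them
escaped. The variable names are mine, not from the thread.

```perl
use strict;
use warnings;

my $text = 'a(b)';

# In Perl, bare parentheses form a capture group, so only 'b' is replaced.
(my $grouped = $text) =~ s/(b)/{B}/;

# Escaped parentheses match literally, as unescaped ones do in sed's
# basic regular expressions.
(my $literal = $text) =~ s/\(b\)/{B}/;

print "$grouped $literal\n";   # prints "a({B}) a{B}"
```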

-- 
Tim McDaniel, tmcd@panix.com


------------------------------

Date: Wed, 11 Apr 2012 19:05:43 +0100
From: Rainer Weikusat <rweikusat@mssgmbh.com>
Subject: Re: Help with pattern matching
Message-Id: <87r4vuw3d4.fsf@sapphire.mobileactivedefense.com>

tmcd@panix.com (Tim McDaniel) writes:
>>Ben Morrow <ben@morrow.me.uk> writes:
>>> Quoth Rainer Weikusat <rweikusat@mssgmbh.com>:

[...]


>>> Perl's regexes are not compatible with those of other Unix tools in any
>>> case.
>>
>>Contrived counter-example:
>>
>>[rw@sapphire]~ $echo 'a/b' | sed 's/\//\/\//'
>>a//b
>>[rw@sapphire]~ $echo 'a/b' | perl -pe 's/\//\/\//'
>>a//b
>
> An experienced Perl person should know what he meant: *in general*,

Yes. And as 'experienced Perl persons' both of you should know that I
know that the Perl regular expression (sub-)language is not identical
to the regular expression language used by sed (or anything else) and
that this is beside the point since the topic of conversation was
'delimiter characters for regular expressions': Somebody who uses
anything except Perl has to deal with the // convention anyway;
consequently, keeping it for Perl doesn't make things worse than they
already are.


------------------------------

Date: Wed, 11 Apr 2012 14:20:54 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Help with perl special variable
Message-Id: <m2dg59-apt.ln1@anubis.morrow.me.uk>


Quoth Ricky <riccardo.marini@gmail.com>:
>
> I m reading a regexp and a string from an external file (actually my
> program configuration file).
> After, I need to do a string matching, using previously loaded regex,
> and print the "destination" result, using special variable $2,
> hardcoded in my conf.

This is a FAQ: run 'perldoc -q variable' for the answer.

Ben



------------------------------

Date: Wed, 11 Apr 2012 07:14:06 -0700 (PDT)
From: riccardo.marini@gmail.com
Subject: Re: Help with perl special variable
Message-Id: <4201298.1901.1334153646530.JavaMail.geo-discussion-forums@vbpp14>

On Wednesday, 11 April 2012 14:30:23 UTC+2, Ricky wrote:
> Hi gurus,
> I m reading a regexp and a string from an external file (actually my
> program configuration file).
> After, I need to do a string matching, using previously loaded regex,
> and print the "destination" result, using special variable $2,
> hardcoded in my conf.
>
> I have no luck..please give me some hints:
>
> Here is my configuration line:
>
> rule       (^.*blabla-dir/blabla-file/)(.*$)       /somedirectory/somefile/$2     LABEL
>
> perl code extract:
> ...
>         /(^rule)\s+(.[^\s]+)\s+([\/\w\$]+)\s+(\w+)/
>        $regex[$i]=$2;
>        $destination[$i]=$3;
> ..
> ..
>        $_="/blabla-dir/blabla-file/OIAC-ciao";
>        if ( /$regex[$i]/ ) {
>            print "$destination[$i]\n";
>         }
> ..
>
> The print only show:
> /somedir/somepath/$2
> but I need to "explode" that $2. Note that if i modify the print as
> follows:
>
>  print "$2 $destination[$i]\n";
>
> it returns:
> OIAC-ciao /somedir/somepath/$2
>
> I'm sure there is a way to do this..but I am a noob programmer
>
> Thanks,
> perlnoob

I tried the example in the FAQ with string substitution operator and
modifier "eeg". No luck.
I also tried to use "eval", and tried another approach to the problem.
Still no luck.


------------------------------

Date: Wed, 11 Apr 2012 08:12:14 -0700 (PDT)
From: riccardo.marini@gmail.com
Subject: Re: Help with perl special variable
Message-Id: <6407373.186.1334157134546.JavaMail.geo-discussion-forums@vbjl30>

On Wednesday, 11 April 2012 16:14:06 UTC+2, riccard...@gmail.com wrote:
> On Wednesday, 11 April 2012 14:30:23 UTC+2, Ricky wrote:
> > Hi gurus,
> > I m reading a regexp and a string from an external file (actually my
> > program configuration file).
> > After, I need to do a string matching, using previously loaded regex,
> > and print the "destination" result, using special variable $2,
> > hardcoded in my conf.
> >
> > I have no luck..please give me some hints:
> >
> > Here is my configuration line:
> >
> > rule       (^.*blabla-dir/blabla-file/)(.*$)       /somedirectory/somefile/$2     LABEL
> >
> > perl code extract:
> > ...
> >         /(^rule)\s+(.[^\s]+)\s+([\/\w\$]+)\s+(\w+)/
> >        $regex[$i]=$2;
> >        $destination[$i]=$3;
> > ..
> > ..
> >        $_="/blabla-dir/blabla-file/OIAC-ciao";
> >        if ( /$regex[$i]/ ) {
> >            print "$destination[$i]\n";
> >         }
> > ..
> >
> > The print only show:
> > /somedir/somepath/$2
> > but I need to "explode" that $2. Note that if i modify the print as
> > follows:
> >
> >  print "$2 $destination[$i]\n";
> >
> > it returns:
> > OIAC-ciao /somedir/somepath/$2
> >
> > I'm sure there is a way to do this..but I am a noob programmer
> >
> > Thanks,
> > perlnoob
>
> I tried the example in the FAQ with string substitution operator and
> modifier "eeg". No luck.
> I also tried to use "eval", and tried another approach to the problem.
> Still no luck.

I saw the light..
 ..
my $test=eval "\"$destination[$i]\"";
print "$test\n";
 ..

finally shows:
/somedir/somepath/OIAC-ciao

I really can't understand why it needs that kind of quoting ..
Without it, it doesn't work..
Can someone explain??
tks


------------------------------

Date: Wed, 11 Apr 2012 17:37:43 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Help with perl special variable
Message-Id: <njog59-k801.ln1@anubis.morrow.me.uk>


Quoth riccardo.marini@gmail.com:
> 
> I saw the light..
> ..
> my $test=eval "\"$destination[$i]\"";
> print "$test\n"; 
> ..
> 
> finally shows:
> /somedir/somepath/OIAC-ciao 
> 
> I really can't understand why it needs that kind of quoting ..
> Without it, it doesn't work..
> Can someone explain?? 

I can try... :).

The first thing that happens is this

    "\"$destination[$i]\""

string is evaluated. Since it's double-quoted, it interprets backslashes
specially and interpolates variables, so the result is the string

    '"/somedir/somepath/$2"'

I've single-quoted that string to show that it won't be expanded again
(at least, not until you ask for it).

The next thing that happens is that string is passed to eval. This looks
at the string and interprets it as a Perl expression, in this case the
expression

    "/somedir/somepath/$2"

 . Notice I've removed the single quotes, since this is no longer a
string *containing* a Perl expression, it's an actual Perl expression.
(This distinction is a little confusing, but is at the heart of what
eval does.)

This expression is another double-quoted string, so again variables are
expanded and the result is

    '/somedir/somepath/OIAC-ciao'

(I've single-quoted it for the same reason as before), which string is
then assigned to $test.
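
The two stages can be made visible by keeping the intermediate string
around. This is only a sketch: the scalar names ($regex, $destination,
$input, $code) are simplified stand-ins for the array elements in the
original code, with the values taken from the post.

```perl
use strict;
use warnings;

# Values as they would arrive from the configuration file: the '$2'
# in the destination is still literal text at this point.
my $regex       = '(^.*blabla-dir/blabla-file/)(.*$)';
my $destination = '/somedir/somepath/$2';
my $input       = '/blabla-dir/blabla-file/OIAC-ciao';

my ($code, $test);
if ($input =~ /$regex/) {
    # Stage 1: ordinary interpolation wraps the text in double quotes.
    # $code now holds the characters "/somedir/somepath/$2", quotes included.
    $code = "\"$destination\"";

    # Stage 2: eval compiles that string as Perl source, i.e. as a
    # double-quoted string literal, and that is where $2 gets expanded.
    $test = eval $code;
}

print "$code\n$test\n";
```

Bear in mind that eval runs whatever Perl code the configuration line
happens to contain, not just variable references.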

eval is much too powerful a tool for the job you are doing. Even if you
aren't worried about the security implications, the multiple layers of
quoting required get very confusing. Assuming you only want to support
numbered variables in your substitutions, I would use something like
this:

    if ( my @capture = /$regex[$i]/ ) {
        my $result = $destination[$i];
        # A list-context match returns ($1, $2, ...), so $N is at index N-1.
        $result =~ s/\$([0-9]+)/$capture[$1 - 1]/eg;
        print "$result\n";
    }

This expands the variables in the result explicitly, rather than trying
to get perl to do it for you with eval.
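
A self-contained sketch of the same idea wrapped in a helper. The name
expand_captures and the scalar variables are hypothetical, chosen for
the example. Note that a match in list context returns ($1, $2, ...),
so the text for $N sits at index N-1 of the capture list; references to
groups that don't exist are replaced with the empty string here, which
is a design choice of this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: expand $1, $2, ... in a template from a list
# of captures, without using eval.
sub expand_captures {
    my ($template, @capture) = @_;
    $template =~ s{\$([0-9]+)}{
        # $N is stored at index N-1; $0 and missing groups become ''.
        ($1 >= 1 && defined $capture[$1 - 1]) ? $capture[$1 - 1] : ''
    }eg;
    return $template;
}

my $regex       = '(^.*blabla-dir/blabla-file/)(.*$)';
my $destination = '/somedir/somepath/$2';
my $input       = '/blabla-dir/blabla-file/OIAC-ciao';

my $result = '';
if (my @capture = $input =~ /$regex/) {
    $result = expand_captures($destination, @capture);
}

print "$result\n";   # prints "/somedir/somepath/OIAC-ciao"
```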

Ben



------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests. 

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 3663
***************************************

