[31359] in Perl-Users-Digest


Perl-Users Digest, Issue: 2611 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Sep 24 21:09:42 2009

Date: Thu, 24 Sep 2009 18:09:08 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Thu, 24 Sep 2009     Volume: 11 Number: 2611

Today's topics:
        FAQ 5.34 How do I close a file descriptor by number? <brian@theperlreview.com>
        FAQ 9.18 How do I decode a MIME/BASE64 string? <brian@theperlreview.com>
    Re: How to check if a module is installed in Strawberry <RedGrittyBrick@spamweary.invalid>
    Re: How to check if a module is installed in Strawberry <tadmc@seesig.invalid>
    Re: Log4perl -- how to copy FATALs to STDERR <Peter@PSDT.com>
        open3 "sh -c" problem with pipes or redirect <vanospam@cox.net>
    Re: open3 "sh -c" problem with pipes or redirect <derykus@gmail.com>
    Re: open3 "sh -c" problem with pipes or redirect <vanospam@cox.net>
        problem with IO:Socket perldba@dba.invalid.com
    Re: Regular expression help <peter@makholm.net>
    Re: Regular expression help <jcombe@gmail.com>
        Trying to parse/match a C string literal <jl_post@hotmail.com>
    Re: Trying to parse/match a C string literal <uri@StemSystems.com>
    Re: Trying to parse/match a C string literal (Randal L. Schwartz)
    Re: Trying to parse/match a C string literal sln@netherlands.com
    Re: Trying to parse/match a C string literal sln@netherlands.com
    Re: Trying to parse/match a C string literal jl_post@hotmail.com
    Re: Using Perl's Bits 'n Pieces  ;-) <RF@NoDen.con>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Thu, 24 Sep 2009 16:00:03 GMT
From: PerlFAQ Server <brian@theperlreview.com>
Subject: FAQ 5.34 How do I close a file descriptor by number?
Message-Id: <7IMum.71139$4t6.49027@newsfe06.iad>

This is an excerpt from the latest version of perlfaq5.pod, which
comes with the standard Perl distribution. These postings aim to 
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

5.34: How do I close a file descriptor by number?

    If, for some reason, you have a file descriptor instead of a filehandle
    (perhaps you used "POSIX::open"), you can use the "close()" function
    from the "POSIX" module:

            use POSIX ();

            POSIX::close( $fd );

    This should rarely be necessary, as the Perl "close()" function is to be
    used for things that Perl opened itself, even if it was a dup of a
    numeric descriptor as with "MHCONTEXT" above. But if you really have to,
    you may be able to do this:

            require 'sys/syscall.ph';
            $rc = syscall(&SYS_close, $fd + 0);  # must force numeric
            die "can't sysclose $fd: $!" if $rc == -1;

    Or, just use the fdopen(3S) feature of "open()":

            {
            open my( $fh ), "<&=$fd" or die "Cannot reopen fd=$fd: $!";
            close $fh;
            }
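
    A minimal sketch of the POSIX::close route, assuming a scratch file
    made with File::Temp stands in for whatever produced the descriptor
    (the file is illustrative, not part of the FAQ). POSIX::close returns
    undef on failure, so closing the same descriptor twice shows both the
    success and the error path:

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY);
use File::Temp ();
use POSIX ();

# Scratch file (illustrative only); removed when $tmp goes out of scope.
my $tmp = File::Temp->new;

# POSIX::open returns a bare descriptor number, not a filehandle.
my $fd = POSIX::open($tmp->filename, O_RDONLY);
die "POSIX::open failed: $!" unless defined $fd;

my $first  = POSIX::close($fd);            # true on success
my $second = POSIX::close($fd);            # undef: descriptor already closed
my $error  = defined $second ? '' : "$!";  # capture errno text right away

print "first close succeeded\n"       if $first;
print "second close failed: $error\n" unless defined $second;
```

    Checking the return value this way avoids the low-level syscall()
    variant entirely on systems where the POSIX module is available.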



--------------------------------------------------------------------

The perlfaq-workers, a group of volunteers, maintain the perlfaq. They
are not necessarily experts in every domain where Perl might show up,
so please include as much information as possible and relevant in any
corrections. The perlfaq-workers also don't have access to every
operating system or platform, so please include relevant details for
corrections to examples that do not work on particular platforms.
Working code is greatly appreciated.

If you'd like to help maintain the perlfaq, see the details in 
perlfaq.pod.


------------------------------

Date: Thu, 24 Sep 2009 22:00:08 GMT
From: PerlFAQ Server <brian@theperlreview.com>
Subject: FAQ 9.18 How do I decode a MIME/BASE64 string?
Message-Id: <IZRum.13876$lR3.9735@newsfe25.iad>

This is an excerpt from the latest version of perlfaq9.pod, which
comes with the standard Perl distribution. These postings aim to 
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

9.18: How do I decode a MIME/BASE64 string?

    The MIME-Base64 package (available from CPAN) handles this as well as
    the MIME/QP encoding. Decoding BASE64 becomes as simple as:

            use MIME::Base64;
            $decoded = decode_base64($encoded);

    The MIME-Tools package (available from CPAN) supports extraction with
    decoding of BASE64 encoded attachments and content directly from email
    messages.

    If the string to decode is short (less than 84 bytes long) a more direct
    approach is to use the unpack() function's "u" format after minor
    transliterations:

            tr#A-Za-z0-9+/##cd;                   # remove non-base64 chars
            tr#A-Za-z0-9+/# -_#;                  # convert to uuencoded format
            $len = pack("c", 32 + 0.75*length);   # compute length byte
            print unpack("u", $len . $_);         # uudecode and print
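
    As a small sketch, the two routes above can be compared on a short
    string. The sub name decode_short_base64 is made up for this example,
    and the test string has no "=" padding, which keeps the length
    arithmetic exact:

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# "" as the second argument suppresses the trailing newline.
my $encoded = encode_base64("Hello World!", "");
my $decoded = decode_base64($encoded);

# The unpack("u") route from the FAQ, wrapped in a sub; short inputs only.
sub decode_short_base64 {
    my ($text) = @_;
    $text =~ tr#A-Za-z0-9+/##cd;                      # remove non-base64 chars
    $text =~ tr#A-Za-z0-9+/# -_#;                     # convert to uuencoded format
    my $len = pack("c", 32 + 0.75 * length($text));   # compute length byte
    return unpack("u", $len . $text);                 # uudecode
}

my $via_unpack = decode_short_base64($encoded);
print "$encoded -> $decoded\n";
print "unpack route agrees\n" if $via_unpack eq $decoded;
```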



--------------------------------------------------------------------



------------------------------

Date: Thu, 24 Sep 2009 15:05:43 +0100
From: RedGrittyBrick <RedGrittyBrick@spamweary.invalid>
Subject: Re: How to check if a module is installed in Strawberry?
Message-Id: <4abb7cba$0$2524$da0feed9@news.zen.co.uk>


Water Lin wrote:
> If I am using Strawberry in my Windows, I need to find out if a module
> is already installed in Strawberry.
> 
> The command
> $ perldoc perllocal
> doesn't work under strawberry.
>       

The word "module" has a special meaning in Perl. Your problem with 
`perldoc perllocal` is probably unrelated.

1) You probably mean `perldoc perllocale`
2) Maybe you don't have your PATH set correctly.

You will usually get your problem solved much faster if you cut and 
paste the error message rather than write "doesn't work".

See http://www.rehabitation.com/clpmisc/clpmisc_guidelines.html
See http://www.catb.org/~esr/faqs/smart-questions.html

-- 
RGB


------------------------------

Date: Thu, 24 Sep 2009 15:53:00 -0500
From: Tad J McClellan <tadmc@seesig.invalid>
Subject: Re: How to check if a module is installed in Strawberry?
Message-Id: <slrnhbnm86.p5b.tadmc@tadmc30.sbcglobal.net>

RedGrittyBrick <RedGrittyBrick@spamweary.invalid> wrote:
>
> Water Lin wrote:
>> If I am using Strawberry in my Windows, I need to find out if a module
>> is already installed in Strawberry.
>> 
>> The command
>> $ perldoc perllocal
>> doesn't work under strawberry.
>>       
>
> The word "module" has a special meaning in Perl. 


And the OP is using "module" with that special meaning in mind...


> Your problem with 
> `perldoc perllocal` is probably unrelated.


Errr, no.

"perldoc perllocal" should list all of the locally installed modules.


-- 
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
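
The question in this thread (is a given module installed?) can also be
answered without perllocal by simply trying to load the module. A
minimal sketch; the sub name module_installed and the module name
No::Such::Module::Here are made up for illustration:

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded by this perl, else 0.
sub module_installed {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $name ('File::Spec', 'No::Such::Module::Here') {
    printf "%-24s %s\n", $name,
        module_installed($name) ? 'installed' : 'not found';
}
```

The same check from a command prompt is `perl -MFile::Spec -e 1`, which
exits silently when the module loads and prints a "Can't locate ..."
error when it does not. Note that this loads the module as a side
effect, which is usually harmless for a one-off check.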


------------------------------

Date: Thu, 24 Sep 2009 13:33:08 GMT
From: Peter Scott <Peter@PSDT.com>
Subject: Re: Log4perl -- how to copy FATALs to STDERR
Message-Id: <oyKum.70427$u76.36114@newsfe10.iad>

On Tue, 22 Sep 2009 17:14:20 -0700, fishfry wrote:
> Apologies if this is simple, I've been looking at the docs and can't
> figure it out. When I get a fatal error I want log4perl to log it to
> STDERR as well as the logfile. How do I do this?

You need a screen appender. See
http://search.cpan.org/~mschilli/Log-Log4perl-1.24/lib/Log/Log4perl/Appender/Screen.pm
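
A minimal configuration sketch of that suggestion, assuming a file
appender is wanted alongside the screen one (the appender names Logfile
and Screen and the filename app.log are illustrative). The Screen
appender writes to STDERR, and its Threshold keeps everything below
FATAL off the screen while the file still receives all messages:

```text
log4perl.rootLogger               = DEBUG, Logfile, Screen

log4perl.appender.Logfile          = Log::Log4perl::Appender::File
log4perl.appender.Logfile.filename = app.log
log4perl.appender.Logfile.layout   = Log::Log4perl::Layout::SimpleLayout

log4perl.appender.Screen           = Log::Log4perl::Appender::Screen
log4perl.appender.Screen.stderr    = 1
log4perl.appender.Screen.Threshold = FATAL
log4perl.appender.Screen.layout    = Log::Log4perl::Layout::SimpleLayout
```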

-- 
Peter Scott
http://www.perlmedic.com/
http://www.perldebugged.com/
http://www.informit.com/store/product.aspx?isbn=0137001274


------------------------------

Date: Thu, 24 Sep 2009 09:30:40 -0700 (PDT)
From: VANOSPAM <vanospam@cox.net>
Subject: open3 "sh -c" problem with pipes or redirect
Message-Id: <dd5ecf55-f086-48e9-99fc-f4462c2d842a@m20g2000vbp.googlegroups.com>

I have a Perl program that has been running for the last 10 or so
years.  It manages other processes, capturing all outputs into a
common logfile and killing dormant processes (no output) after a given
amount of time.  The user can bring processes up or down as needed.
It utilizes the Event.pm module and it works fine.  Now I have the
need to either run a process that utilizes a pipe "x | y" or set up two
separate processes and redirect stdin/stdout to use a named pipe
("x >name_pipe" and "y <name_pipe").  In either case, the open3 call
uses "sh -c" to execute the command.  The pid that is returned is the
pid of the shell command and when I attempt to terminate the process
only the shell command is killed, not the children.

I suspect the answer is to modify the stdin/stdout file handles as
needed and pass those to open3.  Is this the correct solution or is
there another way?

Currently, this is running on Solaris 8 & 10, sparc and Intel.  There
is no need for portability to windows.

Thanx.

Brad


------------------------------

Date: Thu, 24 Sep 2009 11:37:03 -0700 (PDT)
From: "C.DeRykus" <derykus@gmail.com>
Subject: Re: open3 "sh -c" problem with pipes or redirect
Message-Id: <5ae182f2-90eb-4556-bc06-cf61e5294ac4@a39g2000pre.googlegroups.com>

On Sep 24, 9:30 am, VANOSPAM <vanos...@cox.net> wrote:
> I have a Perl program that has been running for the last 10 or so
> years.  It manages other processes, capturing all outputs into a
> common logfile and killing dormant processes (no output) after a given
> amount of time.  The user can bring processes up or down as needed.
> It utilizes the Event.pm module and it works fine.  Now I have the
> need to either run a process that utilizes a pipe "x | y" or set up two
> separate processes and redirect stdin/stdout to use a named pipe
> ("x >name_pipe" and "y <name_pipe").  In either case, the open3 call
> uses "sh -c" to execute the command.  The pid that is returned is the
> pid of the shell command and when I attempt to terminate the process
> only the shell command is killed, not the children.
>
> I suspect the answer is to modify the stdin/stdout file handles as
> needed and pass those to open3.  Is this the correct solution or is
> there another way?
>
> Currently, this is running on Solaris 8 & 10, sparc and Intel.  There
> is no need for portability to windows.
>

Since portability wouldn't be an issue, you could try a negative
signal to target an entire process group. See perldoc -f kill or
perldoc perlipc for details.
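
A rough sketch of the process-group idea. It uses plain fork/exec
rather than open3, purely so there is an obvious place to call
setpgrp in the child; the sub names spawn_in_own_group and kill_group
are made up for this example, and the "sleep 30 | sleep 30" pipeline
stands in for the real "x | y" command:

```perl
use strict;
use warnings;
use POSIX ();

# Start "sh -c $command" as the leader of a new process group; return its PID.
sub spawn_in_own_group {
    my ($command) = @_;
    my $pid = fork();
    die "Cannot fork: $!" unless defined $pid;
    if ($pid == 0) {
        setpgrp(0, 0);    # child: become leader of a new process group
        exec('/bin/sh', '-c', $command) or POSIX::_exit(127);
    }
    setpgrp($pid, $pid);  # parent does the same to close the race; a failure here is harmless
    return $pid;
}

# A negative signal name targets the whole process group, not one process.
sub kill_group {
    my ($pgid) = @_;
    return kill '-TERM', $pgid;
}

my $pid = spawn_in_own_group('sleep 30 | sleep 30');
sleep 1;                  # give the shell time to start the pipeline
kill_group($pid);
waitpid($pid, 0);
my $signal = $? & 127;    # signal that ended the shell, if any
print "shell ended by signal $signal\n";
```

Because the shell and both pipeline members share one process group,
a single kill reaches all of them. With open3 the same effect needs
the group to be set up on the child side, which is the step that is
easy to get wrong.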

--
Charles DeRykus



------------------------------

Date: Thu, 24 Sep 2009 13:09:35 -0700 (PDT)
From: VANOSPAM <vanospam@cox.net>
Subject: Re: open3 "sh -c" problem with pipes or redirect
Message-Id: <889675ee-b50b-48ed-9265-75f444d892c3@2g2000prl.googlegroups.com>

On Sep 24, 2:37 pm, "C.DeRykus" <dery...@gmail.com> wrote:
> On Sep 24, 9:30 am, VANOSPAM <vanos...@cox.net> wrote:
>
> > I have a Perl program that has been running for the last 10 or so
> > years.  It manages other processes, capturing all outputs into a
> > common logfile and killing dormant processes (no output) after a given
> > amount of time.  The user can bring processes up or down as needed.
> > It utilizes the Event.pm module and it works fine.  Now I have the
> > need to either run a process that utilizes a pipe "x | y" or set up two
> > separate processes and redirect stdin/stdout to use a named pipe
> > ("x >name_pipe" and "y <name_pipe").  In either case, the open3 call
> > uses "sh -c" to execute the command.  The pid that is returned is the
> > pid of the shell command and when I attempt to terminate the process
> > only the shell command is killed, not the children.
>
> > I suspect the answer is to modify the stdin/stdout file handles as
> > needed and pass those to open3.  Is this the correct solution or is
> > there another way?
>
> > Currently, this is running on Solaris 8 & 10, sparc and Intel.  There
> > is no need for portability to windows.
>
> Since portability wouldn't be an issue, you could try a negative
> signal to target an entire process group. See perldoc -f kill or
> perldoc perlipc for details.
>
> --
> Charles DeRykus

Charles,
  thank you for the input.  Unfortunately I did try that and it
failed.  I also tried to set up a process group using setpgrp() but I
did not get that to work correctly either.  I may have been doing
something wrong, so that is also a possibility.

Brad


------------------------------

Date: 24 Sep 2009 16:30:55 -0700
From: perldba@dba.invalid.com
Subject: problem with IO:Socket
Message-Id: <h9gvff01ert@drn.newsguy.com>

I am writing a client-server utility in Perl. The idea is for a user to submit a
read query on a production database from development machines. The query will be
passed via TCP to a machine that can connect to the production database and then
pass the result back to the client.

This model works great. Here is the code of my prototype program. My problem and
question follow the code.

Client program
#! /usr/bin/perl -w
use strict ;
use IO::Socket; 
my $sock = new IO::Socket::INET ( 
     PeerAddr => '127.0.0.1', 
     PeerPort => '7070', 
     Proto => 'tcp',
     ); 
die "Could not create socket: $!\n" unless $sock; 
$sock->autoflush(1);
my $sql_no = 0 ;
while (1) {
    $sql_no++;
    print "SQL[$sql_no]: " ;
    my $sql = <STDIN> ;
    print $sock "$sql" ;
    if ( substr($sql,0,4) eq "exit" ) {
        last ;
    }
    while (my $ret_line = <$sock> ) {
        chomp($ret_line);
        last if ($ret_line eq '<END>' ) ;
        print "$ret_line\n" ;
    }
}
print $sock "<EXIT>" . "\n" ;
close($sock);

Server program


#! /usr/bin/perl -w
use strict ;
use warnings;
use DBI ;
my $new_sock ;

use IO::Socket; 
my $sock = new IO::Socket::INET ( 
   LocalHost => '127.0.0.1', 
   LocalPort => '7070', 
   Proto => 'tcp', 
   Listen => 10,
   Reuse => 1);
die "Could not create socket: $!\n" unless $sock;
$sock->autoflush(1);
$SIG{CHLD} = 'IGNORE' ;
while ( $new_sock = $sock->accept()) {
    my $pid = fork();
    die "Cannot fork: $!" unless defined($pid);
    if ($pid == 0) {  # only child process
        $sock->autoflush(1);
        &process_sql();
        system("kill -9 $$") ;
    } 
}
close ($sock);

sub process_sql() {
    my $line ;
    my $ret_line ;
    my $dbh = DBI->connect('dbi:Pg:dbname=testdb','','',
              {'RaiseError' => 1, 'PrintError' => 1});
    $dbh->{AutoCommit} = 0 ;
    $dbh->{RaiseError} = 0 ;
    $dbh->{PrintError} = 0 ;
    while ($line = <$new_sock>) {
       chomp($line) ;
       last if ( $line eq "exit" ) ;
       my $chk_line = uc $line ;
       if ( substr($chk_line,0,6) ne "SELECT") {
           print $new_sock "Error: only select is allowed\n" ;
           print $new_sock "<END>" . "\n" ;
           next ;
       }
       my $sth = $dbh->prepare($line) ;
       $sth->execute() ;
       if ( $DBI::err) {
          print $new_sock "$DBI::errstr\n";
          print $new_sock "<END>" . "\n" ;
          next ;
       }
       while ( my @data = $sth->fetchrow_array() ) {
            $ret_line = join('|',@data);
            print $new_sock $ret_line . "\n" ;
       }
       print $new_sock "<END>" . "\n" ;
       $sth->finish();
    }
    $dbh->disconnect();
}

Now I am facing the need to change the program to 3 tiers. The reason is that
we won't get an open port from dev to any of the machines on the prod
network. So what I need to do is to send the request from the client machine to
a proxy server, which will route the query to another machine, which will connect
to the database and give the result back to the proxy server, which will pass it
back to the client program.

What I did was to change the port of the above-mentioned server program to 7071
and introduce another program. This program will act as a server to the client
programs (on port 7070) and as a client to the db server running on port 7071.

#! /usr/bin/perl -w
use strict ;
use warnings;
use IO::Socket ;
my $new_sock ;

my $ssock = new IO::Socket::INET ( 
       LocalHost => '127.0.0.1', 
       LocalPort => '7071',
       Proto => 'tcp',
      ) ;
die "Could not create socket 7071 for app server: $!\n" unless $ssock;

my $sock = new IO::Socket::INET ( 
   LocalHost => '127.0.0.1', 
   LocalPort => '7070', 
   Proto => 'tcp', 
   Listen => 10,
   Reuse => 1);
die "Could not create socket: $!\n" unless $sock;

The server program on 7071 starts fine (the same code pasted before as "server
program"). But when the above-mentioned new program app.pl tries to start, it
errors out with "Could not create socket 7071 for app server: Address already in use".

Why? I am using the same logic as what is working, except that this new script
app.pl is both a server and a client.

Is there a restriction in IO::Socket so that a script can use only one port?
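
A likely explanation, offered as a sketch rather than a tested fix for
app.pl: LocalHost/LocalPort bind the local end of a socket, so the
first constructor in app.pl tries to claim port 7071, which the db
server already holds. An outgoing connection names the remote end with
PeerAddr/PeerPort instead. The self-contained demo below uses an
OS-chosen port in place of 7071, and the variable names are made up:

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Stand-in for the db server: listen on a free port picked by the OS.
my $db_server = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'tcp',
    Listen    => 10,
) or die "Could not create listener: $!\n";
my $port = $db_server->sockport;

# What app.pl does: Local* tries to bind the port the server already holds.
my $bound = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => $port,
    Proto     => 'tcp',
);
my $bind_error = $bound ? '' : "$!";

# Client side of the proxy: Peer* connects to the server instead.
my $upstream = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $port,
    Proto    => 'tcp',
) or die "Could not connect to port $port: $!\n";

print "bind attempt failed: $bind_error\n" unless $bound;
print "connected to port ", $upstream->peerport, "\n";
```

So there is no one-port-per-script limit; a script can hold a listening
socket on 7070 and a connected client socket to 7071 at the same time.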

TIA.



------------------------------

Date: Thu, 24 Sep 2009 15:15:40 +0200
From: Peter Makholm <peter@makholm.net>
Subject: Re: Regular expression help
Message-Id: <87ocp0mqkj.fsf@vps1.hacking.dk>

Jon Combe <jcombe@gmail.com> writes:

> Validate length to a minimum and maximum number of characters
> Must contain at least one vowel
> Must not contain any numbers
> Must contain no more than 2 adjacent repeated characters (so aa is OK,
> but aaa is not)

You can do it with a lot of look-ahead assertions:

/
  ^
    (?= .* [aeiouy] )  # At least one vowel
    (?! .* [0-9] )     # Does not contain number
    (?! .* (\w)\1\1 )  # no more than 2 adjacent repeated characters
    .{5,8}             # Between 5 and 8 chars
  $
/x

But the variant of regular expressions your application is using must
have both positive and negative lookaheads for this to work.
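
A short sketch that exercises the pattern in Perl itself. The sub name
is_valid and the sample words are made up, and "y" is counted as a
vowel exactly as in the pattern above:

```perl
use strict;
use warnings;

my $valid = qr/
  ^
    (?= .* [aeiouy] )  # at least one vowel
    (?! .* [0-9] )     # no digits anywhere
    (?! .* (\w)\1\1 )  # no character three times in a row
    .{5,8}             # between 5 and 8 characters
  $
/x;

# Return 1 if the word satisfies all four rules, else 0.
sub is_valid { return $_[0] =~ $valid ? 1 : 0 }

for my $word (qw(hello aabbcc heeello hello1 brrrm abc)) {
    printf "%-8s %s\n", $word, is_valid($word) ? 'ok' : 'rejected';
}
```

Each look-ahead starts again from the beginning of the string, which is
why the rules can be listed independently and in any order.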

//Makholm


------------------------------

Date: Thu, 24 Sep 2009 08:24:08 -0700 (PDT)
From: Jon Combe <jcombe@gmail.com>
Subject: Re: Regular expression help
Message-Id: <b4f5a7c3-4ee8-44ae-86a8-50c79c7c67a3@o35g2000vbi.googlegroups.com>

> You can do it with a lot of look-ahead assertions:
>
> /
>   ^
>     (?= .* [aeiouy] )  # At least one vowel
>     (?! .* [0-9] )     # Does not contain number
>     (?! .* (\w)\1\1 )  # no more than 2 adjacent repeated characters
>     .{5,8}             # Between 5 and 8 chars
>   $
> /x
>
> But the variant of regular expressions your application is using must
> have both positive and negative lookaheads for this to work.
>
> //Makholm

Thank you so much, Makholm, that does exactly what I want. I was already
starting to use look-ahead, but for the "no more than 2 adjacent
repeated characters" rule I was getting in a mess trying to use look-behind
assertions, which obviously aren't needed.

Jon.


------------------------------

Date: Thu, 24 Sep 2009 11:43:35 -0700 (PDT)
From: "jl_post@hotmail.com" <jl_post@hotmail.com>
Subject: Trying to parse/match a C string literal
Message-Id: <a351328d-a574-4f47-98ed-9ab0cfde5fcc@h14g2000pri.googlegroups.com>

Dear Perl community,

   I'm trying to write Perl code that scans through C/C++ source and
matches string literals.  I want to use a regular expression for this,
so that if given these inputs, it will extract these outputs:

input1: before "12 34 56" after
output1: 12 34 56

input2: before "12 34" 56" after
output2: 12 34

input3: before "12 34\" 56" after
output3: 12 34\" 56

input4: before "12 34\\" 56" after
output4: 12 34\\

input5: before "12 34\\\" 56" after
output5: 12 34\\\" 56

input6: before "12 34\\\\" 56" after
output6: 12 34\\\\

   Note that inputs 3 through 6 account for the backslash escape
character in that if a double-quote is directly preceded by a non-
escaped backslash, then that double-quote should not be interpreted as
the C string terminator.

   At first, I came up with this simple regular expression:

      m/" (.*) "/x

this puts everything between the first and the last quote into $1.
This works fine for input1, but reads too much with input2.

   Then I changed it to be non-greedy, like this:

      m/" (.*?) "/x

which works great for inputs 1 and 2, but now fails with input3, as it
doesn't account for escaped-out quotes.

   So then I added a negative look-behind to ensure that the last
character matched by the parentheses is not a backslash (I could use
[^\\] instead of the negative look-behind, but then we won't match empty
strings):

      m/" (.*? (?<!\\) ) "/x

This works with inputs 1 through 3, but fails at input4, since the
quote after the double-backslash should be the terminator, but isn't
treated as such (due to the fact that it is preceded by a backslash).

   So then I reasoned that, after the last non-backslash matched, an
even number of backslashes can be matched (as each pair of backslashes
represents one literal backslash), so I changed the expression to
this:

      m/" (.*? (?<!\\) (\\{2})* ) "/x

Now it works for all the inputs I gave.  I then added "?:" to the last
set of parentheses (so it wouldn't offset $2, $3, etc. if I decide to
add more later):

      m/" (.*? (?<!\\) (?:\\{2})* ) "/x

   I tested this out with the following code:

m/" (.*? (?<!\\) (?:\\{2})*) "/x and print "$1\n"  while <>;
before "12 34 56" after  # input 1
12 34 56
before "12 34" 56" after  # input 2
12 34
before "12 34\" 56" after  # input 3
12 34\" 56
before "12 34\\" 56" after  #input 4
12 34\\
before "12 34\\\" 56" after  # input 5
12 34\\\" 56
before "12 34\\\\" 56" after  # input 6
12 34\\\\

   So it looks like it works.  My question is, even though I came up
with a way of parsing a C string literal, is there a better or simpler
way of doing this?

   (Now, I know of the quotewords() function in the Text::ParseWords
module, but I don't think it's what I'm looking for.  I prefer a
regular expression that extracts the string literal (not individual
tokens), and I can embed it into other regular expressions that look
for other pieces of code.)

   I tried "perldoc -q string", but the best advice I could find was
to use Text::ParseWords, which I stated before is probably not what I
need.

   Thanks!

   -- Jean-Luc


------------------------------

Date: Thu, 24 Sep 2009 14:56:11 -0400
From: "Uri Guttman" <uri@StemSystems.com>
Subject: Re: Trying to parse/match a C string literal
Message-Id: <878wg4dvec.fsf@quad.sysarch.com>

>>>>> "jpc" == jl post@hotmail com <jl_post@hotmail.com> writes:

  jpc>    I'm trying to write Perl code that scans through a C/C++ and
  jpc> matches string literals.  I want to use a regular expression for this,
  jpc> so that if given these inputs, it will extract these outputs:

that can't be done easily with a single regex so don't even try. look at
Text::Balanced on cpan which is designed to match c strings and similar things.

uri


------------------------------

Date: Thu, 24 Sep 2009 12:11:28 -0700
From: merlyn@stonehenge.com (Randal L. Schwartz)
Subject: Re: Trying to parse/match a C string literal
Message-Id: <86bpl0yx7j.fsf@blue.stonehenge.com>

>>>>> "jl" == jl post@hotmail com <jl_post@hotmail.com> writes:

jl>    I'm trying to write Perl code that scans through a C/C++ and
jl> matches string literals.  I want to use a regular expression for this,
jl> so that if given these inputs, it will extract these outputs:

jl> input1: before "12 34 56" after
jl> output1: 12 34 56

jl> input2: before "12 34" 56" after
jl> output2: 12 34

jl> input3: before "12 34\" 56" after
jl> output3: 12 34\" 56

jl> input4: before "12 34\\" 56" after
jl> output4: 12 34\\

jl> input5: before "12 34\\\" 56" after
jl> output5: 12 34\\\" 56

jl> input6: before "12 34\\\\" 56" after
jl> output6: 12 34\\\\

[...]

jl>       m/" (.*? (?<!\\) (?:\\{2})* ) "/x

I would just make a regex that does what you want, and ignore all that
fancy newfangled lookbehind/ahead/aside:

m/
  "     # quote
  (
   [^\\"]+  # any non-special characters are cool
   |      # ... or ...
   \\.    # a backslash escaping the following character
  ) *     # repeated zero or more times
  "     # quote
/sx

If you need to remove the quotes from your match, just add an inner set of
parens around the juicy bits.

print "Just another Perl hacker,"; # the original

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion


------------------------------

Date: Thu, 24 Sep 2009 13:51:28 -0700
From: sln@netherlands.com
Subject: Re: Trying to parse/match a C string literal
Message-Id: <7nmnb59k40m5g933l98b4g9dgo0aucldb5@4ax.com>

On Thu, 24 Sep 2009 12:11:28 -0700, merlyn@stonehenge.com (Randal L. Schwartz) wrote:

>>>>>> "jl" == jl post@hotmail com <jl_post@hotmail.com> writes:
>
>jl>    I'm trying to write Perl code that scans through a C/C++ and
>jl> matches string literals.  I want to use a regular expression for this,
>jl> so that if given these inputs, it will extract these outputs:
>
<snip>
>I would just make a regex that does what you want, and ignore all that
>fancy newfangled lookbehind/ahead/aside:
>
>m/
>  "     # quote
>  (
>   [^\"]+  # any non-special  characters are cool
>   |      # ... or ...
>   \.     # a backslash escaping the following character
>  ) *     # repeated zero or more times
>  "     # quote
>/sx
>
>If you need to remove the quotes from your match, just add an inner set of
>parens around the juicy bits.
>
>print "Just another Perl hacker,"; # the original

If it were done this way, it would probably be better with
something like below.

-sln
---------------
use strict;
use warnings;

my $string = <DATA>;
print "\n",$string,"\n";

my $rx = qr/
  "     # quote
  (
    (?:
      [^\\"]*?  # any non-special  characters are cool
      |      # ... or ...
      \\.    # a backslash escaping the following character
    )*       # repeated zero or more times
  )
  "     # quote
/x;

while ($string =~  /$rx/sg)
{ print "<$1>\n"; } 

__DATA__
"  \"\"\"\"\" "  1 "this is one" 2 "this is  tw\o \" isin't it?" ""



------------------------------

Date: Thu, 24 Sep 2009 14:16:11 -0700
From: sln@netherlands.com
Subject: Re: Trying to parse/match a C string literal
Message-Id: <89onb5pb0kjf57svdgtvmgir074bt1ram5@4ax.com>

On Thu, 24 Sep 2009 13:51:28 -0700, sln@netherlands.com wrote:

>On Thu, 24 Sep 2009 12:11:28 -0700, merlyn@stonehenge.com (Randal L. Schwartz) wrote:
>
>>>>>>> "jl" == jl post@hotmail com <jl_post@hotmail.com> writes:
>>
>>jl>    I'm trying to write Perl code that scans through a C/C++ and
>>jl> matches string literals.  I want to use a regular expression for this,
>>jl> so that if given these inputs, it will extract these outputs:
>>
><snip>
>>I would just make a regex that does what you want, and ignore all that
>>fancy newfangled lookbehind/ahead/aside:
>>
>>m/
>>  "     # quote
>>  (
>>   [^\"]+  # any non-special  characters are cool
>>   |      # ... or ...
>>   \.     # a backslash escaping the following character
>>  ) *     # repeated zero or more times
>>  "     # quote
>>/sx
>>
>>If you need to remove the quotes from your match, just add an inner set of
>>parens around the juicy bits.
>>
This works well too.
m/
  "     # quote
  (
    (?:
      [^\\"]+  # any non-special  characters are cool
      |      # ... or ...
      \\.    # a backslash escaping the following character
    )*       # repeated zero or more times
  )
  "     # quote
/sx

-sln


------------------------------

Date: Thu, 24 Sep 2009 17:13:25 -0700 (PDT)
From: jl_post@hotmail.com
Subject: Re: Trying to parse/match a C string literal
Message-Id: <93ad546a-0676-4fdc-b4df-14541fb1d955@m3g2000pri.googlegroups.com>

On Sep 24, 12:43 pm, "jl_p...@hotmail.com" <jl_p...@hotmail.com>
wrote:
>
>    I'm trying to write Perl code that scans through a C/C++ and
> matches string literals.  I want to use a regular expression for
> this,

   Thanks for your responses!  I now have two regular expressions that
work well.  The one I came up with:

   m/" (.*? (?<!\\) (?:\\{2})* ) "/x

and one from Randal Schwartz, with some modification by poster sln and
myself:

   m/" ( (?: [^\\"] | \\. )* ) "/x

   When I tested them in my Perl script, I found that it read in and
processed 6827 C/C++ files in about 13 seconds, no matter which of the
above two regular expressions I used.

   (Actually, they initially clocked in around 45-55 seconds, but
after repeatedly running them, they "slimmed down" to a consistent 13
seconds.  I'm sure caching of some sort is involved somehow.)

   However, the second one you see above I modified a bit.  Randal's
suggestion was to use the '+' modifier after [^\\"] while sln
suggested using '*?'.

   So I experimented with these three variants:

   m/" ( (?: [^\\"] | \\. )* ) "/x
   m/" ( (?: [^\\"]+ | \\. )* ) "/x
   m/" ( (?: [^\\"]* | \\. )* ) "/x

   What I found out was that the version without any modifier took
about 13 seconds (when operating on 6827 files), the version with the
'+' modifier took about 24 seconds, and the version with the '*'
modifier took about 32 seconds.  (I made sure to run them over and
over to make sure caching had taken effect.)

   (I discovered that converting '*' and '+' into their non-greedy
versions '*?' and '+?' didn't seem to have a measurable effect.)

   So oddly enough, inclusion of the modifiers had an 11 to 19 second
penalty, with '*' being worse than '+'.  I'm not sure why this is so,
but it's interesting to point out:

   m/" (.*? (?<!\\) (?:\\{2})* ) "/x  # 13 seconds
   m/" ( (?: [^\\"] | \\. )* ) "/x    # 13 seconds
   m/" ( (?: [^\\"]+ | \\. )* ) "/x   # 24 seconds
   m/" ( (?: [^\\"]* | \\. )* ) "/x   # 32 seconds

(As an aside, converting \\. to (?:\\.)+ and (?:\\.)* didn't seem to
have an effect, probably because escaping a character was relatively
rare.)
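
   A sketch of how such a comparison could be repeated with the core
Benchmark module. The sample source line is made up and stands in for
the real files, so the absolute numbers will not match the timings
above; the names %variant and count_literals are illustrative too:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample: one line of C repeated, standing in for real source files.
# The quoted heredoc keeps every backslash literal.
my $line = <<'END_C';
printf("hello %s\n", name); puts("a \"quoted\" word");
END_C
my $source = $line x 200;

my %variant = (
    plain => qr/" ( (?: [^\\"]  | \\. )* ) "/x,
    plus  => qr/" ( (?: [^\\"]+ | \\. )* ) "/x,
    star  => qr/" ( (?: [^\\"]* | \\. )* ) "/x,
);

# Count the string literals a pattern finds in the sample.
sub count_literals {
    my ($rx) = @_;
    my $count = () = $source =~ /$rx/g;
    return $count;
}

my %count = map { $_ => count_literals($variant{$_}) } keys %variant;
print "$_ found $count{$_} literals\n" for sort keys %count;

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    map { my $rx = $variant{$_}; ($_ => sub { count_literals($rx) }) }
        keys %variant
});
```

   Checking that all variants find the same number of literals first
guards against comparing patterns that do different work.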

   Therefore, if you want to match a C/C++ string literal, I'd
recommend using one of the following two regular expressions:

   m/" (.*? (?<!\\) (?:\\{2})* ) "/x
   m/" ( (?: [^\\"] | \\. )* ) "/x

They both seem to run about as fast.
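
   For reference, a small sketch that runs both expressions over the
six inputs from the start of this thread. The variable names and the
extract sub are made up for the example:

```perl
use strict;
use warnings;

my $lookbehind  = qr/" (.*? (?<!\\) (?:\\{2})* ) "/x;
my $alternation = qr/" ( (?: [^\\"] | \\. )* ) "/x;

# The six inputs from the original question; the quoted heredoc keeps
# every backslash literal.
my @inputs = split /\n/, <<'END_INPUTS';
before "12 34 56" after
before "12 34" 56" after
before "12 34\" 56" after
before "12 34\\" 56" after
before "12 34\\\" 56" after
before "12 34\\\\" 56" after
END_INPUTS

# Return the contents of the first string literal, or undef if none is found.
sub extract {
    my ($rx, $text) = @_;
    return $text =~ $rx ? $1 : undef;
}

for my $input (@inputs) {
    my $first  = extract($lookbehind,  $input);
    my $second = extract($alternation, $input);
    printf "%-32s => %-14s %s\n", $input, $first,
        $first eq $second ? '(both agree)' : "(differs: $second)";
}
```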

   Thanks for all your help!

   -- Jean-Luc Romano


------------------------------

Date: Thu, 24 Sep 2009 17:27:19 -0700
From: RF <RF@NoDen.con>
Subject: Re: Using Perl's Bits 'n Pieces  ;-)
Message-Id: <7i2h13F2vdmrlU1@mid.individual.net>

Ben Morrow wrote:
> Quoth RF <RF@NoDen.con>:
>> Ben Morrow wrote:
>>> Quoth RF <RF@NoDen.con>:
>>>> and the answer was: "'perl' is not recognized as an in operable program 
>>>> or batch file." I guess another Perl module is required]
>>> You either don't have perl installed, or it isn't in your PATH. You can
>>> get perl for Win32 from http://strawberryperl.com/ . If you already have
>>> perl installed, you will have to tell us which version, where you got it
>>> from (ActiveState or Strawberry or somewhere else), and where it is
>>> installed.
>> There was NO Perl of any kind in my computer when I downloaded the 
>> roughly 70 files yesterday. I don't know what's built into that 
>> structure.
> 
> When you say 'that structure' you mean the package you originally
> downloaded? I would be extremely surprised if it included a copy of
> perl.
> 
>> I guess installing could do no harm. I am running Win2K SP4, 
>> so your juicy fruit would not work. Thanks again for your help.
> 
> The latest Strawberry builds have dropped support for 2k, but older
> releases are available from http://strawberryperl.com/releases.html . I
> would recommend using the 5.8.9.1 release from that page.
> 
> (Also, it's not a fruit so much as a flavour of icecream :).)
> 
> Ben

Hi again Ben,

I was hung up for a few weeks and am now having another look at this
problem. Thanks for the strawberryperl link. It installed OK, but then I
found other instructions indicating that the ActivePerl version should be
used. I had no problem installing it and, when I test it, it seems to work:

C:\>perl -v

This is perl, v5.10.1 built for MSWin32-x86-multi-thread
(with 2 registered patches, see perl -V for more detail)

Copyright 1987-2009, Larry Wall

Binary build 1006 [291086] provided by ActiveState
http://www.ActiveState.com
Built Aug 24 2009 13:48:26

This prog has a package manager ppm and the instructions for it were:

C:> ppm
ppm> install Lingua::GA::Gramadoir
ppm> quit

They worked and the last instruction given was:

"You should now have the front-end script gram-ga.pl installed.
Unfortunately, DOS isn't capable of displaying the ANSI control codes
that are used to highlight the errors in color. Also, you need to tell
An Gramadóir to display messages in the default DOS character encoding
(ibm-850). Therefore, to check a language text file called gaeilge.txt,
use the following:
C:> gram-ga.pl --aschod=cp850 --dath=none gaeilge.txt"


I set up the file and here is the response:

"C:\> gram-ga.pl --aschod=cp850 --dath=none gaeilge.txt
'gram-ga.pl' is not recognized as an internal or external command,
operable program or batch file.

C:\>perl -v"

Have you seen this situation before?





------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests. 

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 2611
***************************************

