[31361] in Perl-Users-Digest


Perl-Users Digest, Issue: 2613 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sun Sep 27 00:10:11 2009

Date: Sat, 26 Sep 2009 21:09:06 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sat, 26 Sep 2009     Volume: 11 Number: 2613

Today's topics:
        Complex regex question <tuxedo@mailinator.com>
    Re: Complex regex question <tadmc@seesig.invalid>
    Re: Complex regex question <tuxedo@mailinator.com>
    Re: decimal round off issue <hjp-usenet2@hjp.at>
    Re: decimal round off issue <hjp-usenet2@hjp.at>
    Re: decimal round off issue <nospam-abuse@ilyaz.org>
    Re: decimal round off issue <nospam-abuse@ilyaz.org>
        FAQ 4.77 How do I pack arrays of doubles or floats for  <brian@theperlreview.com>
        FAQ 5.3 How do I count the number of lines in a file? <brian@theperlreview.com>
        FAQ 5.8 How can I make a filehandle local to a subrouti <brian@theperlreview.com>
        FAQ 6.7 How can I make "\w" match national character se <brian@theperlreview.com>
        FAQ 8.21 Where do I get the include files to do ioctl() <brian@theperlreview.com>
        FAQ 8.22 Why do setuid perl scripts complain about kern <brian@theperlreview.com>
        oversized installman filenames <sreservoir@gmail.com>
    Re: Trying to parse/match a C string literal sln@netherlands.com
    Re: Trying to parse/match a C string literal <cwilbur@chromatico.net>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Sat, 26 Sep 2009 18:41:33 +0200
From: Tuxedo <tuxedo@mailinator.com>
Subject: Complex regex question
Message-Id: <h9lg7t$1v3$00$1@news.t-online.com>

Hi,

I use a simple grep procedure munching through a domain zone file to return 
a report of existing domains against any particular keyword, which also 
includes matches in the nameserver field (although it is not meant to). For 
example, these are the first few lines output by 'grep KOMODO zonefile.txt':

KOMODODRAGON NS DNS2.GORGE.NET.
KOMODODRAGON NS SERV.GORGE.NET.
HELIOCENTRIC NS NS1.KOMODOTEK
HELIOCENTRIC NS NS2.KOMODOTEK
DIVEKOMODO NS NS1.PUREHOST
DIVEKOMODO NS NS2.PUREHOST
KOMODO-TECH NS NS1.CISCO
KOMODO-TECH NS NS2.CISCO
KOMODOSYSTEM NS DNS.NETFORCE.IT.
KOMODOSYSTEM NS NS2.IPOINT.IT.
KOMODOISLAND-TOURS NS NS1.BALINTER.NET.
KOMODOISLAND-TOURS NS NS2.BALINTER.NET.

Any domain match, being the first string starting with a new line, may have 
two or more name servers associated with the domain, so the result is one 
line p/match and name server (usually two but sometimes more lines).

However, I would like to output a list with only one line p/domain match, 
regardless of number of nameservers.

I would also like to exclude any occurrence returned from the nameserver 
field, ie. anything after a white space (eg. the third and fourth listing 
in the above example should not occur at all). In other words, only return 
matches that are not preceded by whitespace on their line (does 
this make sense?).

So the result is stripping any matches in the nameserver output altogether 
as well as any duplicate domains. When the above list is processed, the 
result would be simply one domain p/line and one line p/domain:

KOMODODRAGON
DIVEKOMODO
KOMODO-TECH
KOMODOSYSTEM
KOMODOISLAND-TOURS

The purpose is simply to return a list of domains against a particular 
keyword, stripping the irrelevant parts. I'm not quite sure how to do this, 
although I guess Perl is the best tool, being the de-facto regex master! 

Any suggestions or snippet code would be greatly appreciated!

Many thanks,
Tuxedo


------------------------------

Date: Sat, 26 Sep 2009 12:22:16 -0500
From: Tad J McClellan <tadmc@seesig.invalid>
Subject: Re: Complex regex question
Message-Id: <slrnhbsikk.tdm.tadmc@tadmc30.sbcglobal.net>

Tuxedo <tuxedo@mailinator.com> wrote:

> However, I would like to output a list with only one line p/domain match, 
> regardless of number of nameservers.
>
> I would also like to exclude any occurrence returned from the nameserver 
> field, ie. anything after a white space


> So the result is stripping any matches in the nameserver output altogether 
> as well as any duplicate domains. 


    perldoc -q duplicate

        How can I remove duplicate elements from a list or array?


> When the above list is processed, the 
> result would be simply one domain p/line and one line p/domain:
>
> KOMODODRAGON
> DIVEKOMODO
> KOMODO-TECH
> KOMODOSYSTEM
> KOMODOISLAND-TOURS


---------------
#!/usr/bin/perl
use warnings;
use strict;

my $term = 'KOMODO';

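my %seen;    # domains already printed
while ( <DATA> ) {
    # Match only when the keyword is in the first whitespace-delimited
    # field (the domain); \Q...\E quotes regex metacharacters in $term.
    if ( /^(\S*\Q$term\E\S*)/ ) {
        print "$1\n" unless $seen{$1}++;
    }
}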

__DATA__
KOMODODRAGON NS DNS2.GORGE.NET.
KOMODODRAGON NS SERV.GORGE.NET.
HELIOCENTRIC NS NS1.KOMODOTEK
HELIOCENTRIC NS NS2.KOMODOTEK
DIVEKOMODO NS NS1.PUREHOST
DIVEKOMODO NS NS2.PUREHOST
KOMODO-TECH NS NS1.CISCO
KOMODO-TECH NS NS2.CISCO
KOMODOSYSTEM NS DNS.NETFORCE.IT.
KOMODOSYSTEM NS NS2.IPOINT.IT.
KOMODOISLAND-TOURS NS NS1.BALINTER.NET.
KOMODOISLAND-TOURS NS NS2.BALINTER.NET.
---------------
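
The same filter can be pointed at a real zone file instead of the inline
__DATA__ section. What follows is only a minimal sketch, assuming the zone
file has the same whitespace-separated layout as the sample above. The
subroutine name unique_domains and the command-line handling are
illustrative and not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return each domain (first field) containing $term, once, in input order.
# unique_domains() is a hypothetical helper name used for illustration.
sub unique_domains {
    my ($term, @lines) = @_;
    my %seen;
    my @domains;
    for my $line (@lines) {
        # \Q...\E treats $term as literal text; ^(\S*...\S*) confines
        # the match to the first whitespace-delimited field.
        next unless $line =~ /^(\S*\Q$term\E\S*)/;
        push @domains, $1 unless $seen{$1}++;
    }
    return @domains;
}

# Assumed usage: script.pl KEYWORD zonefile.txt
if (@ARGV) {
    my $term = shift @ARGV;
    print "$_\n" for unique_domains($term, <>);
}
```

Keeping the matching in a subroutine that takes a list of lines makes it
easy to try on a few sample lines before running it over a large file.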


-- 
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"


------------------------------

Date: Sat, 26 Sep 2009 21:13:01 +0200
From: Tuxedo <tuxedo@mailinator.com>
Subject: Re: Complex regex question
Message-Id: <h9lp3t$j6e$00$1@news.t-online.com>

Tad J McClellan wrote:

[...]

>     perldoc -q duplicate

[...]

> ---------------
> #!/usr/bin/perl
> use warnings;
> use strict;
> 
> my $term = 'KOMODO';
> 
> my %seen;
> while ( <DATA> ) {
>     if ( /^(\S*$term\S*)/ ) {
>         print "$1\n" unless $seen{$1}++;
>     }
> }
> 

[...]

Thanks for the perldoc tip and for the working regex magic :-)

Tuxedo


------------------------------

Date: Sat, 26 Sep 2009 11:17:22 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: decimal round off issue
Message-Id: <slrnhbrn13.21p.hjp-usenet2@hrunkner.hjp.at>

On 2009-09-24 01:10, Ben Morrow <ben@morrow.me.uk> wrote:
> Quoth "Peter J. Holzer" <hjp-usenet2@hjp.at>:
>> Is there a "standard" C library on Windows which gcc has to use or does
>> it use the glibc? I suspect it's the former (I've seen similar results
>> with Microsofts C compiler).
>
> Huh? glibc is Linux- (well, and Hurd-) only.

Ports to other systems existed. I wouldn't be terribly surprised if a
(partial) port to Windows exists and is bundled with one of the ports of
gcc for Windows.


> gcc on Win32 uses MSVCRT.DLL,

That's what I thought (I wrote "I suspect it's the former"), but I
didn't know it. The last time I wrote C code for a MS platform was in
the MS-DOS days ...

	hp



------------------------------

Date: Sat, 26 Sep 2009 11:42:11 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: decimal round off issue
Message-Id: <slrnhbrofk.21p.hjp-usenet2@hrunkner.hjp.at>

On 2009-09-24 01:58, sln@netherlands.com <sln@netherlands.com> wrote:

[ unexpected results from perl printf on Windows ]

> Still, you have to wonder why MS, who supposedly is ANSI CRT
> would differ from other compilers in its sprintf results.

The C standard is deliberately vague on many aspects of floating point
arithmetic, because in the late 1980s there were still a lot of very
different implementations of fp arithmetic (both in software and
hardware). Mandating IEEE-754 compliant arithmetic would have been a
sure way to prevent the standard from being adopted by major vendors.
Even the C99 standard only contains IEEE-754 arithmetic as an option. 

So the MS printf implementation is almost certainly standard-conforming,
it just isn't as good as it could be. (I haven't checked, but I think
the error is below one ulp, so it is even correct)


> I thought it could be that gcc doesn't use perhaps an optimization
> that MS uses in its compiler that may pertain to floating point.

It is unlikely that this has anything to do with compiler optimizations.
Printf is just implemented differently (note that the MS implementation
apparently has a fixed number of decimal digits and prints only zeros
after that).
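
One way to see the kind of difference being discussed is to ask for more
digits than a double carries. This is a minimal sketch, assuming Perl hands
the conversion to the local C library; the value 0.1 and the 25-digit
precision are arbitrary choices for illustration. A library that converts
exactly prints the full binary expansion, while one that stops after a fixed
number of significant digits pads with zeros, so the tail of the output can
be expected to vary between platforms.

```perl
use strict;
use warnings;

# 0.1 has no exact binary representation, so the digits beyond the
# 17th significant one depend on how the C library's printf converts.
my $text = sprintf '%.25f', 0.1;
print "$text\n";
```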


> On the other hand, there are a lot of defines being passed to the
> Perl source. Are you sure that sprintf/printf from the CRT is
> not being bypassed via a custom Perl implementation?

No. But if Perl had its own implementation I would expect the output on
Linux and Windows to be the same. Since the output is different it is
very likely that Perl just uses the facilities of the local C library.

(The default perl FP->string conversion is a custom implementation
(since the C library doesn't offer the functionality) and it is buggy -
I've ranted about that before, but that isn't an issue here)

	hp



------------------------------

Date: Sun, 27 Sep 2009 02:44:12 +0000 (UTC)
From: Ilya Zakharevich <nospam-abuse@ilyaz.org>
Subject: Re: decimal round off issue
Message-Id: <slrnhbtkbs.c40.nospam-abuse@chorin.math.berkeley.edu>

On 2009-09-24, Ben Morrow <ben@morrow.me.uk> wrote:
>> AFAIK, there are many gcc's on Win32, all (?) using different CRTL...
>
> Really? The only port I've ever seen is the MinGW port, which uses
> MSVCRT as its libc. (I don't count Cygwin/Interix/whatever gccs as
> running on Win32, and neither does perl.)

I saw mentions of djgcc port.  At some time OS/2 EMX port was working
on Win32 with an appropriate syscalls library (RSX-NT, if I remember
correctly) - but later people could not reproduce it; I did not collect
enough incentive to debug.

I know that klibc sources have __WIN32__ defines and subdirectories
scattered about...  Do not know whether klibc actually compiles under
Win32.  I know that Perl compiles - and at least in some respects works
- with klibc.

Yours,
Ilya


------------------------------

Date: Sun, 27 Sep 2009 02:46:45 +0000 (UTC)
From: Ilya Zakharevich <nospam-abuse@ilyaz.org>
Subject: Re: decimal round off issue
Message-Id: <slrnhbtkgl.c40.nospam-abuse@chorin.math.berkeley.edu>

On 2009-09-26, Peter J. Holzer <hjp-usenet2@hjp.at> wrote:
> (The default perl FP->string conversion is a custom implementation
> (since the C library doesn't offer the functionality)

C library definitely offers the functionality.  And it was used for
decades without much problem....

> and it is buggy -

Agreed.

Ilya


------------------------------

Date: Sat, 26 Sep 2009 04:00:05 GMT
From: PerlFAQ Server <brian@theperlreview.com>
Subject: FAQ 4.77 How do I pack arrays of doubles or floats for XS code?
Message-Id: <9lgvm.201589$cf6.114081@newsfe16.iad>

This is an excerpt from the latest version perlfaq4.pod, which
comes with the standard Perl distribution. These postings aim to 
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

4.77: How do I pack arrays of doubles or floats for XS code?

    The arrays.h/arrays.c code in the "PGPLOT" module on CPAN does just
    this. If you're doing a lot of float or double processing, consider
    using the "PDL" module from CPAN instead--it makes number-crunching
    easy.

    See <http://search.cpan.org/dist/PGPLOT> for the code.



--------------------------------------------------------------------

The perlfaq-workers, a group of volunteers, maintain the perlfaq. They
are not necessarily experts in every domain where Perl might show up,
so please include as much information as possible and relevant in any
corrections. The perlfaq-workers also don't have access to every
operating system or platform, so please include relevant details for
corrections to examples that do not work on particular platforms.
Working code is greatly appreciated.

If you'd like to help maintain the perlfaq, see the details in 
perlfaq.pod.


------------------------------

Date: Sat, 26 Sep 2009 16:00:03 GMT
From: PerlFAQ Server <brian@theperlreview.com>
Subject: FAQ 5.3 How do I count the number of lines in a file?
Message-Id: <7Uqvm.202639$cf6.40680@newsfe16.iad>

This is an excerpt from the latest version perlfaq5.pod, which
comes with the standard Perl distribution. These postings aim to 
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

5.3: How do I count the number of lines in a file?

    One fairly efficient way is to count newlines in the file. The following
    program uses a feature of tr///, as documented in perlop. If your text
    file doesn't end with a newline, then it's not really a proper text
    file, so this may report one fewer line than you expect.

            $lines = 0;
            open(FILE, $filename) or die "Can't open `$filename': $!";
            while (sysread FILE, $buffer, 4096) {
                    $lines += ($buffer =~ tr/\n//);
                    }
            close FILE;

    This assumes no funny games with newline translations.
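
    The same approach can be written with a lexical filehandle and a
    three-argument open. This is a minimal sketch rather than the FAQ's own
    code: the name count_lines is hypothetical, and the 4096-byte chunk size
    is simply carried over from the example above.

```perl
use strict;
use warnings;

# Count newline characters in a file, reading it in fixed-size chunks.
# count_lines() is a hypothetical name; the 4096-byte chunk size is
# taken from the FAQ code above.
sub count_lines {
    my ($filename) = @_;
    open(my $fh, '<', $filename) or die "Can't open '$filename': $!";
    binmode $fh;    # count raw "\n" bytes, no I/O layer translation
    my $lines = 0;
    my $buffer;
    while (1) {
        my $read = sysread($fh, $buffer, 4096);
        die "Can't read '$filename': $!" unless defined $read;
        last if $read == 0;
        $lines += ($buffer =~ tr/\n//);    # tr returns the match count
    }
    close $fh;
    return $lines;
}
```

    Checking sysread for undef separates a read error from end of file,
    which the plain while loop cannot do.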





------------------------------

Date: Sun, 27 Sep 2009 04:00:05 GMT
From: PerlFAQ Server <brian@theperlreview.com>
Subject: FAQ 5.8 How can I make a filehandle local to a subroutine?  How do I pass filehandles between subroutines?  How do I make an array of filehandles?
Message-Id: <9rBvm.75610$nQ6.17735@newsfe07.iad>

This is an excerpt from the latest version perlfaq5.pod, which
comes with the standard Perl distribution. These postings aim to 
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

5.8: How can I make a filehandle local to a subroutine?  How do I pass filehandles between subroutines?  How do I make an array of filehandles?

    As of perl5.6, open() autovivifies file and directory handles as
    references if you pass it an uninitialized scalar variable. You can then
    pass these references just like any other scalar, and use them in the
    place of named handles.

            open my    $fh, $file_name;

            open local $fh, $file_name;

            print $fh "Hello World!\n";

            process_file( $fh );

    If you like, you can store these filehandles in an array or a hash. If
    you access them directly, they aren't simple scalars and you need to
    give "print" a little help by placing the filehandle reference in
    braces. Perl can only figure it out on its own when the filehandle
    reference is a simple scalar.

            my @fhs = ( $fh1, $fh2, $fh3 );

            for( $i = 0; $i <= $#fhs; $i++ ) {
                    print {$fhs[$i]} "just another Perl answer, \n";
                    }
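
    A self-contained variant of the loop above, as a minimal sketch: it
    assumes in-memory filehandles (writing into scalars) purely so that it
    runs without any files on disk, and the variable names are
    illustrative.

```perl
use strict;
use warnings;

# Open three in-memory filehandles (writing into scalars rather than
# files, purely so the sketch needs no files on disk).
my @buffers = ('', '', '');
my @fhs;
for my $i (0 .. $#buffers) {
    open(my $fh, '>', \$buffers[$i]) or die "Can't open buffer $i: $!";
    push @fhs, $fh;
}

# An array element is not a simple scalar, so print needs the braces.
for my $fh_index (0 .. $#fhs) {
    print { $fhs[$fh_index] } "just another Perl answer, \n";
}

close $_ for @fhs;
```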

    Before perl5.6, you had to deal with various typeglob idioms which you
    may see in older code.

            open FILE, "> $filename";
            process_typeglob(   *FILE );
            process_reference( \*FILE );

            sub process_typeglob  { local *FH = shift; print FH  "Typeglob!" }
            sub process_reference { local $fh = shift; print $fh "Reference!" }

    If you want to create many anonymous handles, you should check out the
    Symbol or IO::Handle modules.





------------------------------

Date: Sat, 26 Sep 2009 22:00:02 GMT
From: PerlFAQ Server <brian@theperlreview.com>
Subject: FAQ 6.7 How can I make "\w" match national character sets?
Message-Id: <C9wvm.20602$6f4.12331@newsfe08.iad>

This is an excerpt from the latest version perlfaq6.pod, which
comes with the standard Perl distribution. These postings aim to 
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

6.7: How can I make "\w" match national character sets?

    Put "use locale;" in your script. The \w character class is taken from
    the current locale.

    See perllocale for details.





------------------------------

Date: Sat, 26 Sep 2009 10:00:04 GMT
From: PerlFAQ Server <brian@theperlreview.com>
Subject: FAQ 8.21 Where do I get the include files to do ioctl() or syscall()?
Message-Id: <EClvm.75340$nQ6.50950@newsfe07.iad>

This is an excerpt from the latest version perlfaq8.pod, which
comes with the standard Perl distribution. These postings aim to 
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

8.21: Where do I get the include files to do ioctl() or syscall()?

    Historically, these would be generated by the h2ph tool, part of the
    standard perl distribution. This program converts cpp(1) directives in C
    header files to files containing subroutine definitions, like
    &SYS_getitimer, which you can use as arguments to your functions. It
    doesn't work perfectly, but it usually gets most of the job done. Simple
    files like errno.h, syscall.h, and socket.h were fine, but the hard ones
    like ioctl.h nearly always need to be hand-edited. Here's how to install
    the *.ph files:

            1.  become super-user
            2.  cd /usr/include
            3.  h2ph *.h */*.h

    If your system supports dynamic loading, for reasons of portability and
    sanity you probably ought to use h2xs (also part of the standard perl
    distribution). This tool converts C header files to Perl extensions. See
    perlxstut for how to get started with h2xs.

    If your system doesn't support dynamic loading, you still probably ought
    to use h2xs. See perlxstut and ExtUtils::MakeMaker for more information
    (in brief, just use make perl instead of a plain make to rebuild perl
    with a new static extension).





------------------------------

Date: Fri, 25 Sep 2009 22:00:03 GMT
From: PerlFAQ Server <brian@theperlreview.com>
Subject: FAQ 8.22 Why do setuid perl scripts complain about kernel problems?
Message-Id: <D3bvm.443851$Ta5.339834@newsfe15.iad>

This is an excerpt from the latest version perlfaq8.pod, which
comes with the standard Perl distribution. These postings aim to 
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

8.22: Why do setuid perl scripts complain about kernel problems?

    Some operating systems have bugs in the kernel that make setuid scripts
    inherently insecure. Perl gives you a number of options (described in
    perlsec) to work around such systems.





------------------------------

Date: Sat, 26 Sep 2009 16:11:20 -0700 (PDT)
From: s · reservoir <sreservoir@gmail.com>
Subject: oversized installman filenames
Message-Id: <98b46e51-9f95-4ba3-8a58-e82e081d5a55@j19g2000vbp.googlegroups.com>

output is something to the effect of:

> no documentation in lib/Tie/StdHandle.pm
> no documentation in lib/Tie/lib/perl/Tie/StdHandle.pm
> no documentation in lib/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/
StdHandle.pm
> no documentation in lib/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/
lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/StdHandle.pm
> no documentation in lib/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/
Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/
Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/
Tie/lib/perl/Tie/lib/perl/Tie/StdHandle.pm
> no documentation in lib/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/
Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/
Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/
Tie/lib/perl/Tie/lib/perl/Tie/usr/local/lib/perl/Tie/lib/perl/Tie/
lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/
lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/
lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/StdHandle.pm
> no documentation in lib/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/
Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/usr/local/
lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/
lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/StdHandle.pm
> no documentation in lib/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/
Tie/usr/local/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/lib/perl/Tie/
StdHandle.pm
> no documentation in lib/Tie/lib/perl/Tie/usr/local/lib/perl/Tie/
lib/perl/Tie/StdHandle.pm
> no documentation in lib/Tie/usr/local/lib/perl/Tie/StdHandle.pm
>   /usr/local/man/man3/AnyDBM_File.3
>   /usr/local/man/man3/Archive::Extract.3
>   /usr/local/man/man3/Archive::Tar.3
>   /usr/local/man/man3/Archive::Tar::File.3
>   /usr/local/man/man3/Archive::Tar::lib::perl::Archive::Tar::File.3
>   /usr/local/man/man3/Archive::Tar::lib::perl::Archive::Tar::lib::
perl::Archive::Tar::lib::perl::Archive::Tar::File.3
>   /usr/local/man/man3/Archive::Tar::lib::perl::Archive::Tar::lib::
perl::Archive::Tar::lib::perl::Archive::Tar::lib::perl::Archive::Tar::
lib::perl::Archive::Tar::lib::perl::Archive::Tar::lib::perl::Archive::
Tar::File.3
>   /usr/local/man/man3/Archive::Tar::lib::perl::Archive::Tar::lib::
perl::Archive::Tar::lib::perl::Archive::Tar::lib::perl::Archive::
Tar::lib::perl::Archive::Tar::lib::perl::Archive::Tar::lib::perl::
Archive::Tar::lib::perl::Archive::Tar::lib::perl::Archive::Tar::
lib::perl::Archive::Tar::lib::perl::Archive::Tar::lib::perl::
Archive::Tar::lib::perl::Archive::Tar::lib::perl::Archive::Tar::
lib::perl::Archive::Tar::File.3
> Can't write-open /usr/local/man/man3/Archive::Tar::lib::perl::
Archive::Tar::lib::perl::Archive::Tar::lib::perl::Archive::Tar::lib::
perl::Archive::Tar::lib::perl::Archive::Tar::lib::perl::Archive::Tar::
lib::perl::Archive::Tar::lib::perl::Archive::Tar::lib::perl::Archive::
Tar::lib::perl::Archive::Tar::lib::perl::Archive::Tar::lib::perl::
Archive::Tar::lib::perl::Archive::Tar::lib::perl::Archive::Tar::lib::
perl::Archive::Tar::File.tmp: File name too long at installman line
220
> make[1]: *** [install.man] Error 36
> make[1]: Leaving directory `/home/ginkgo/.src/perl'
> make: *** [install] Error 2

which is obviously wrong. What usually causes this kind of problem?
--

  "Six by nine. Forty two."
  "That's it. That's all there is."
  "I always thought something was fundamentally wrong with the
universe"


------------------------------

Date: Fri, 25 Sep 2009 21:02:33 -0700
From: sln@netherlands.com
Subject: Re: Trying to parse/match a C string literal
Message-Id: <e54rb553r76l3026utsm6i67hfhh7cnc3s@4ax.com>

On Thu, 24 Sep 2009 17:13:25 -0700 (PDT), jl_post@hotmail.com wrote:

>On Sep 24, 12:43 pm, "jl_p...@hotmail.com" <jl_p...@hotmail.com>
>wrote:
>>
>>    I'm trying to write Perl code that scans through a C/C++ file and
>> matches string literals.  I want to use a regular expression for
>> this,
>
>   Thanks for your responses!  I now have two regular expressions that
>work well.  The one I came up with:
>
>   m/" (.*? (?<!\\) (?:\\{2})* ) "/x
>
>and one from Randal Schwartz, with some modification by poster sln and
>myself:
>
>   m/" ( (?: [^\\"] | \\. )* ) "/x
>
>   When I tested them in my Perl script, I found that it read in and
>processed 6827 C/C++ files in about 13 seconds, no matter which of the
>above two regular expressions I used.
>
>   (Actually, they initially clocked in around 45-55 seconds, but
>after repeatedly running them, they "slimmed down" to a consistent 13
>seconds.  I'm sure caching of some sort is involved somehow.)
>
>   However, the second one you see above I modified a bit.  Randal's
>suggestion was to use the '+' modifier after [^\\"] while sln
>suggested using '*?'.
>
>   So I experimented with these three variants:
>
>   m/" ( (?: [^\\"] | \\. )* ) "/x
>   m/" ( (?: [^\\"]+ | \\. )* ) "/x
>   m/" ( (?: [^\\"]* | \\. )* ) "/x
>
>   What I found out was that the version without any modifier took
>about 13 seconds (when operating on 6827 files), the version with the
>'+' modifier took about 24 seconds, and the version with the '*'
>modifier took about 32 seconds.  (I made sure to run them over and
>over to make sure caching had taken effect.)
>
>   (I discovered that converting '*' and '+' into their non-greedy
>versions '*?' and '+?' didn't seem to have a measurable effect.)
>
>   So oddly enough, inclusion of the modifiers had an 11 to 19 second
>penalty, with '*' being worse than '+'.  I'm not sure why this is so,
>but it's interesting to point out:
>
>   m/" (.*? (?<!\\) (?:\\{2})* ) "/x  # 13 seconds
>   m/" ( (?: [^\\"] | \\. )* ) "/x    # 13 seconds
>   m/" ( (?: [^\\"]+ | \\. )* ) "/x   # 24 seconds
>   m/" ( (?: [^\\"]* | \\. )* ) "/x   # 32 seconds
>
>(As an aside, converting \\. to (?:\\.)+ and (?:\\.)* didn't seem to
>have an effect, probably because escaping a character was relatively
>rare.)
>
>   Therefore, if you want to match a C/C++ string literal, I'd
>recommend using one of the following two regular expressions:
>
>   m/" (.*? (?<!\\) (?:\\{2})* ) "/x
>   m/" ( (?: [^\\"] | \\. )* ) "/x
>
>They both seem to run about as fast.
>
>   Thanks for all your help!
>
>   -- Jean-Luc Romano

If I had to worry about time I would probably use this
m/ "  ( (?: [^"\\]+ | (?:\\.)+ )* )  " /x

Your results may vary.
-sln
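
For reference, a minimal sketch of how the pattern above might be wrapped
for use on a whole source string. The subroutine name c_string_literals is
hypothetical, and the sketch assumes the input has no comments or character
constants containing a double quote, which this pattern alone does not
handle.

```perl
use strict;
use warnings;

# The pattern from the post; /s is added so that "\\." can also match a
# backslash followed by a newline (a line continuation).
my $literal = qr/ "  ( (?: [^"\\]+ | (?:\\.)+ )* )  " /xs;

# Return the body (without the surrounding quotes) of every C string
# literal found in the source text.
sub c_string_literals {
    my ($src) = @_;
    my @found;
    while ($src =~ /$literal/g) {
        push @found, $1;
    }
    return @found;
}

print "<$_>\n" for c_string_literals(<<'END_OF_C');
puts("this is one"); puts("say \"hi\""); puts("");
END_OF_C
```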

--------------------
Output:
"\"\"\"\"\\\\\\\\\\\\\\\\\" "  1 "this is one" 2 "this is  tw\o \" isin't it?" ""

(?x-ism:" ( (?: \\?. )*? ) ")
<\"\"\"\"\\\\\\\\\\\\\\\\\" >
<this is one>
<this is  tw\o \" isin't it?>
<>
the code took:2.84375 wallclock secs ( 2.84 usr +  0.00 sys =  2.84 CPU)

(?x-ism:" (.*? (?<!\\) (?:\\{2})* ) ")
<\"\"\"\"\\\\\\\\\\\\\\\\\" >
<this is one>
<this is  tw\o \" isin't it?>
<>
the code took:2.62468 wallclock secs ( 2.62 usr +  0.00 sys =  2.62 CPU)

(?x-ism:" ( (?: [^\\"] | \\. )* ) ")
<\"\"\"\"\\\\\\\\\\\\\\\\\" >
<this is one>
<this is  tw\o \" isin't it?>
<>
the code took:2.14033 wallclock secs ( 2.14 usr +  0.00 sys =  2.14 CPU)

(?x-ism: "  ( (?: [^"\\]+ | (?:\\.)+ )* )  " )
<\"\"\"\"\\\\\\\\\\\\\\\\\" >
<this is one>
<this is  tw\o \" isin't it?>
<>
the code took:1.74956 wallclock secs ( 1.75 usr +  0.00 sys =  1.75 CPU)

-----------------------

use strict;
use warnings;
use Benchmark ':hireswallclock';

my $string = <DATA>;
print "\n",$string,"\n";

# Candidate patterns, timed in the order shown in the output above.
my @patterns = (
	qr/" ( (?: \\?. )*? ) "/x,
	qr/" (.*? (?<!\\) (?:\\{2})* ) "/x,
	qr/" ( (?: [^\\"] | \\. )* ) "/x,
	qr/ "  ( (?: [^"\\]+ | (?:\\.)+ )* )  " /x,
);

for my $rx (@patterns) {
	print "\n$rx\n";
	my $t0 = Benchmark->new;
	for (1..100_000) {
		1 while ($string =~ /$rx/sg);
	}
	my $t1 = Benchmark->new;
	while ($string =~ /$rx/sg) { print "<$1> \n"; }
	print "the code took:",timestr(timediff($t1, $t0)),"\n";
}

__DATA__
"\"\"\"\"\\\\\\\\\\\\\\\\\" "  1 "this is one" 2 "this is  tw\o \" isin't it?" ""



------------------------------

Date: Sat, 26 Sep 2009 12:53:08 -0400
From: Charlton Wilbur <cwilbur@chromatico.net>
Subject: Re: Trying to parse/match a C string literal
Message-Id: <861vltabrf.fsf@mithril.chromatico.net>

>>>>> "uri" == Uri Guttman <uri@StemSystems.com> writes:

    uri> but you can have a string literal inside a comment and it needs
    uri> to be skipped. there are other cases i bet. ask damian why it
    uri> is better. :)

If you're actually parsing C, comments turn into whitespace during the
tokenizing in translation phase 3 -- "Each comment is replaced by one
space character." There are no string literals inside comments.

Further, the C90 standard, paragraph 3.1.9 says:

  Except within a character constant, a string literal, or a comment,
  the characters /* introduce a comment.  The contents of a comment are
  examined only to identify multibyte characters and to find the
  characters */ that terminate it.[21]

  [21] Thus, comments do not nest.

At least in C parsing, this really *isn't* a case of balanced text,
because all you are looking for is the closing */, and it can
successfully be handled by regular expressions.
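
As a rough illustration of that point, here is a minimal sketch that
replaces each comment with one space while skipping over string literals
and character constants. The name strip_c_comments is hypothetical, and the
sketch assumes C90 input only: "//" comments, trigraphs and backslash-newline
splices inside the comment delimiters are not handled.

```perl
use strict;
use warnings;

# Replace each /* ... */ comment with one space, leaving string literals
# and character constants untouched. strip_c_comments() is a
# hypothetical name; C99 "//" comments are deliberately not handled.
sub strip_c_comments {
    my ($src) = @_;
    $src =~ s{
        (                               # group 1: text to keep as-is
            " (?: [^"\\]+ | \\. )* "    #   a string literal
          | ' (?: [^'\\]+ | \\. )* '    #   a character constant
        )
      | /\* .*? \*/                     # a comment, up to the first */
    }{ defined $1 ? $1 : ' ' }gsex;
    return $src;
}

print strip_c_comments(qq{int a; /* count */ char *s = "/* kept */";\n});
```

Matching literals in the same alternation as comments is what keeps a
"/*" inside a string from being mistaken for the start of a comment.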

Charlton



-- 
Charlton Wilbur
cwilbur@chromatico.net


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests. 

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 2613
***************************************

