[18642] in Perl-Users-Digest
Perl-Users Digest, Issue: 810 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue May 1 09:11:36 2001
Date: Tue, 1 May 2001 06:10:11 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <988722610-v10-i810@ruby.oce.orst.edu>
Content-Type: text
Perl-Users Digest Tue, 1 May 2001 Volume: 10 Number: 810
Today's topics:
Hacker challenge. Can you break this script for me? <jfreeman@tassie.net.au>
Hacker Challenge. Can you break this script for me? <jfreeman@tassie.net.au>
Re: Hacker challenge. Can you break this script for me? (Gwyn Judd)
Re: Hacker challenge. Can you break this script for me? <jfreeman@tassie.net.au>
How can I find out with perl which processes are runnin <up4u2@hotmail.com>
How to get a list of Disabled user accounts with Win32: <nospam@nospam.com>
Re: How to: Create Regex which extracts N number of wor <notmyrealemail@fake.com>
Re: How to: Create Regex which extracts N number of wor <notmyrealemail@fake.com>
Re: one-line stderr, stdout redirection (Rudolf Polzer)
Re: one-line stderr, stdout redirection <dennis.kowalsk@daytonoh.ncr.com>
possible to dupe STDOUT to a file while still STDOUT-in <webmaster@webdragon.unmunge.net>
Re: possible to dupe STDOUT to a file while still STDOU <bart.lateur@skynet.be>
Re: pretty-printing perl? (Rudolf Polzer)
Re: R-E Perl Code <lauren_smith13@hotmail.com>
Re: Separate syslog file? (Porgie Tirebiter)
Re: Should Perl be first? (Martien Verbruggen)
Re: Strange string -> num conversion (Rudolf Polzer)
Re: XML::RSS and mod_perl <matt@sergeant.org>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Tue, 01 May 2001 21:42:02 +1000
From: Jfreeman <jfreeman@tassie.net.au>
Subject: Hacker challenge. Can you break this script for me?
Message-Id: <3AEEA10A.E2DC57C7@tassie.net.au>
Hi All
It has long been axiomatic that only (the Perl parser) can parse perl.
Stripcomments.pl is a script that will parse a perl script and remove all
comments. It will optionally also crunch the script down JAPH style to a user
defined line length.
First public demo via this newsgroup on 14/3/00 under heading Perl Hackers/Beta
Testers Wanted. This proved to be quite educational, both in terms of how much
perl I don't know, how many bugs a seemingly reliable script can contain, and
correct Usenet etiquette.
Thanks to a large number of suggestions for improvement and a complete rewrite,
this script will now successfully* parse 100% of the perl code supplied in the
standard ActiveState Perl 5.6.0 distribution: around 120,000 lines of code,
in 600 scripts, from a wide variety of authors, in a myriad of styles.
* for the definition of success read the testing protocol part that follows.
This script is not the perl parser. There are a small number of known issues
documented in the known issues section of the pod that indicate things it can
not parse. Practically speaking these seem relatively uncommon as they have not
turned up in the roughly 120 thousand lines of code in the standard perl
distribution. Nonetheless they exist. Abigail's teaser is a beauty for those of
you who are into the more esoteric facets of Perl (see the pod).
Two Requests
1) Help find perl code that is not successfully parsed, as I have exhausted my
available code base. To help you would need some perl scripts that are not part
of the standard perl distribution and about 5 minutes of free time. Details are
below. Alternatively, if you are in that sort of mood, you could look at the
code and write short scripts to exploit any weakness you can see. This is the
hacker challenge! Abigail is currently in the lead by a wide margin.
If anyone can tell me how I might get a burn of the entire CPAN that would be
the ultimate code test base for parsing scripts.
2) Have a look at the code and suggest improvements, if you have the time or
inclination.
Where to find the code
The code, with online pod, is available here:
http://www.dynamicflight.com.au/Perl/stripcomments.pl
You can cut and paste the code, or just download the text (56K) or zip (16K)
from this location.
Why Bother?
Stripping comments per se is probably not the most useful perl parsing task you
could imagine. Shortening scripts for faster loading, making code obtuse and
unmaintainable a la the immortal JAPH style, and serving as a back end for
colour coding about exhaust the obvious practical applications. None of these is
why this script came into being. The need for it came about whilst writing an
indentation engine to fix the indentation of a perl script. It quickly becomes
evident that an important first step when looking at the structure of a piece of
code is to strip off the comments, which can confuse matters enormously as they
may contain { and } chars, which are also the structural delimiters in perl. In
fact, if you identify here-documents and pod, then strip comments, quoted text,
q constructs, regexes, and paired {} () [], you can reduce a piece of code to a
structural skeleton like:
for {
    if {
        <block>
    } else {
        <block>
    }
}
With a structural skeleton like this indentation is a breeze as the structure is
plainly, unambiguously, and completely defined. Each '{' IS the beginning of a
block and each '}' IS the end. Note the <block> represents code that (as it has
been stripped) will not contain the { or } structural delimiters.
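As a rough illustration of why such a skeleton makes indentation easy, here is a minimal sketch. It is not the indentation engine discussed in this post; the sub name indent_skeleton and the four-space default are assumptions made for the example. It simply counts { and }, which is only safe because the skeleton contains no other braces.

```perl
use strict;
use warnings;

# Hypothetical helper: re-indent a structural skeleton by brace depth.
sub indent_skeleton {
    my ($skeleton, $unit) = @_;
    $unit = '    ' unless defined $unit;
    my $depth = 0;
    my @out;
    for my $line (split /\n/, $skeleton) {
        $line =~ s/^\s+|\s+$//g;                 # drop existing indentation
        next unless length($line);
        my $opens  = () = $line =~ /\{/g;        # count of { on this line
        my $closes = () = $line =~ /\}/g;        # count of } on this line
        # A line starting with } closes its block before it is printed.
        my $lead = $line =~ /^\}/ ? 1 : 0;
        $depth -= $lead;
        $depth = 0 if $depth < 0;
        push @out, ($unit x $depth) . $line;
        # Remaining braces on the line change the depth for the next line.
        $depth += $opens - ($closes - $lead);
        $depth = 0 if $depth < 0;
    }
    return join("\n", @out) . "\n";
}

my $skeleton = join "\n",
    'for {', 'if {', '<block>', '} else {', '<block>', '}', '}';
print indent_skeleton($skeleton);
```

On real code the same counting would go wrong as soon as a brace appeared in a comment, string or regex, which is the reason for stripping those first.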
I have a currently working stand alone indentation engine that is available at
http://www.dynamicflight.com.au/perl.htm if you are interested. Please note it
is yet to be rewritten to take advantage of the new improved parsing algorithms
developed for stripcomments.pl, although it is still quite functional on a
variety of code.
Those of you familiar with some other languages may have taken advantage of some
of the code parsers available to tidy things up. It is high time perl has a
stand alone pretty printer too.
How to help find code that breaks the script
First make a copy of some perl scripts to a new directory, say temp_dir. When
stripcomments.pl runs it overwrites the target script(s) although naturally it
writes a verified backup(s). If you don't want to bother restoring files from
these backups make some copies to a temporary directory!
The script has an inbuilt test harness activated if you specify a directory as
the target instead of a single script. If you call the script:
perl stripcomments.pl temp_dir 80
then stripcomments.pl will search the entire directory tree from the
root "temp_dir" down (i.e. including all subdirs) for files with the
extensions .pl .plx .pm and .cgi. It will then tell you it has found xxx perl
files and ask you to confirm that you wish to strip them. If you answer anything
other than 'y' or 'Y' it will abort; otherwise it will sequentially process all
the files. At the end it will report the stats. It takes less than 2 minutes to
process the entire perl distribution on my system.
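For readers who want to assemble a similar test set by hand, the directory search can be approximated with the core File::Find module. This is only a sketch of the idea, not the code in stripcomments.pl; the sub name find_perl_files is made up, and matching the extensions case-insensitively is an assumption.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: list files under $root with a perl-looking extension.
sub find_perl_files {
    my ($root) = @_;
    my @found;
    find(
        sub {
            # $_ is the file name within the directory being visited;
            # $File::Find::name is the path including $root.
            push @found, $File::Find::name
                if -f $_ && /\.(?:pl|plx|pm|cgi)\z/i;
        },
        $root,
    );
    my @sorted = sort @found;
    return @sorted;
}

my @files = find_perl_files('.');
printf "Found %d perl files\n", scalar @files;
```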
Here is a sample session running stripcomments.pl on /perl (a copy of the perl
root dir):
C:\>perl stripcomments.pl perl 80
Checking directory .\perl\ (including all subdirectories) for perl files
Searching..................................................................
...........................................................................
...........................................................................
...........................................................................
....
Found 636 perl files in 315 directories
Confirm you wish to process all perl files in .\perl\ (Y/N)? y
Compile check .\perl\bin\config.pl
Compile check OK
Backup of original script written to .\perl\bin\config.pl.bak
Processing...
Compile check .\perl\bin\config.pl
Compile check OK
Done! Comment stripped file .\perl\bin\config.pl
...
[snip]
...
[cont]
...
Compile check .\perl\site\lib\SOAP\Transport\HTTP\Server.pm
Compile check OK
Backup of original script written to
.\perl\site\lib\SOAP\Transport\HTTP\Server.pm.bak
Processing...
Compile check .\perl\site\lib\SOAP\Transport\HTTP\Server.pm
Compile check OK
Done! Comment stripped file .\perl\site\lib\SOAP\Transport\HTTP\Server.pm
Stripcomments statistics for .\perl\
Elapsed time: 96 seconds
Total lines processed: 118,395
Total comments stripped: 11,155
Total files: 572
Successful: 572 (100.00%)
Failed: 0 (0.00%)
C:\>
I am interested in any broken files that turn up, as these will indicate
deficiencies in the script that may be able to be corrected.
If stripcomments.pl breaks a file(s) then the message at the end would look
like:
Stripcomments statistics for .\perl\
Elapsed time: 10 seconds
Total lines: 20,000
Total files: 100
Successful: 98 (98.00%)
Failed: 2 (2.00%)
Broken Scripts:
Script_breaker.pl
Cant_parse_me.pm
C:\>
If you are generous enough to send me a copy of the broken scripts (in their
unbroken form) I can analyse the reason for failure and make any bug fixes, or
if need be, additions to the known issues list.
Total time ~5 minutes to download the script, make a temp dir containing your
perl scripts, run script, and then email any broken files as an attachment. The
value to me would be huge as I have exhausted all the easily available perl
source code I have without finding any breakages. Yes, of course I did start
with plenty :-( but now there are none :-). I think this script works pretty
well. Please feel free to disillusion me.
There are some small known issues; using them to break the script is cheating,
or at the very least plagiarism!
Testing Protocol
The testing protocol is very simple.
1) First read in the target script and concatenate it into a string. Eval this
string so that it is compiled but not actually run. Check $@ to ensure that no
errors are detected. If there is nothing in $@ then we can be sure that we have
a valid piece of perl that checks out using this eval method.
2) Next process the script and concatenate it (i.e. remove most of the
newlines) to give a line length approximating, and where possible not exceeding,
a user defined length (say 80, 160, or even 42000 if you want to turn a 2000
line script into a serious one liner!).
3) Run the new processed script through exactly the compile check algorithm used
in 1. As it was not initially broken it will only be broken now if
stripcomments.pl has broken it. Note that if it is found to be broken then
stripcomments.pl restores the original file from the backup and unlinks the
backup, with a net effect of no change to the original file other than the
timestamp.
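One way to get a compile-only check out of a string eval is sketched below. It is an assumption about how such a check could be written, not an extract from stripcomments.pl, and the helper name compile_error is made up. Note that BEGIN blocks and use statements in the target still execute, because they run at compile time.

```perl
use strict;
use warnings;

# Hypothetical helper: returns '' if $source compiles, otherwise the error text.
sub compile_error {
    my ($source) = @_;
    no strict;      # the eval'd script should not inherit our pragmas
    no warnings;
    # eval STRING compiles the whole string before running any of it, and the
    # leading "return" leaves the eval before the script's own statements run.
    my $ok = eval "return 1;\n$source";
    return $ok ? '' : ($@ || 'unknown compile error');
}

my $good = 'my @array = (1, 2, 3); for (0..$#array) { my $i = $_ }';
(my $bad = $good) =~ s/#.*//;    # naive strip: cuts at the # of "$#array"

print "good: ", (compile_error($good) ? "broken" : "OK"), "\n";
print "bad:  ", (compile_error($bad)  ? "broken" : "OK"), "\n";
```

The second string is what is left after wrongly treating the # in $#array as a comment marker, so it should be reported as broken.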
Why this logically works.
It is next to impossible to strip a # char and everything following it until the
end of that line (EOL) without causing a compilation error unless the #....EOL
is a real comment.
If you have:
for (0..$#array) {
and incorrectly strip '#array) {' you get a compilation error. This goes for
virtually all cases as you will either strip off a } or a ; at the end of the
line giving a fatal error.
One exception is (c) Abigail:
$_ = "Just another Perl Hacker # No comment, no comment!
# Yes, really!
# I am really a Perl Hacker!
";print;
In this case you could strip all the #...EOL without causing a compilation
error. Although stripcomments.pl did do this once upon a time, it no longer
does. This is one of the few cases that come to mind where you can
haphazardly hack off a non-comment #...EOL without breaking a script.
Essentially you need some block or quoting context where the opening delimiter
and closing delimiter will remain intact after you strip the #...EOL, and I
believe the script has all these covered.
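To make the quoting-context point concrete, here is a deliberately limited sketch of a stripper that leaves quoted strings and the $# sigil alone. It is not how stripcomments.pl works (the name strip_hash_comments is invented), and it knows nothing about regexes, q constructs, here-documents or pod, so it is an illustration rather than a usable tool.

```perl
use strict;
use warnings;

# Hypothetical helper: remove #...EOL unless the # sits inside a quoted
# string or is part of the last-index sigil.
sub strip_hash_comments {
    my ($source) = @_;
    $source =~ s{
        (                               # group 1: text to keep untouched
            " (?: \\. | [^"\\] )* "     # double-quoted string, may span lines
          | ' (?: \\. | [^'\\] )* '     # single-quoted string
          | \$\#                        # last-index sigil, dollar then hash
        )
      | \# [^\n]*                       # any other hash up to end of line
    }{ defined $1 ? $1 : '' }gesx;
    return $source;
}

my $code = join "\n",
    '$_ = "Just another Perl Hacker # No comment, no comment!',
    '# Yes, really!',
    '";print; # this one is a real comment',
    '';
print strip_hash_comments($code);
```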
So I am fairly confident that stripcomments.pl is not stripping anything that is
not a comment. Although there is a small potential that it may be doing so
undetected I feel that this is unlikely with a test base of 119,566 lines and no
breakages. More test material would be good.
The script concatenation forms the second part of the test. If you have
for (@foo) { # iterate over foo
    $_++; # increment each foo
}
and concatenate it you will have
for (@foo) { # iterate over foo $_++; # increment each foo }
This is now a syntax error as the closing } is hidden in the comment.
About the only way you can have comments still present in a script and still
concatenate it is if, when concatenating to a standard 80 char line, some
comments simply happen to fall at the end of a line, thus not breaking the
script. By getting stripcomments.pl to concatenate to a number of different
line lengths (causing any theoretical unstripped comments to move position in
the line) I believe it unlikely (but not of course impossible) that this is
occurring.
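The concatenation step can be pictured as a greedy line packer along the following lines. This is a sketch under simple assumptions, not the algorithm in stripcomments.pl: the name crunch is invented, and joining lines blindly like this would damage here-documents and multi-line strings, which the real script has to recognise.

```perl
use strict;
use warnings;

# Hypothetical helper: join lines so that output lines approach, and where
# possible do not exceed, $width characters.
sub crunch {
    my ($source, $width) = @_;
    my @out;
    my $line = '';
    for my $piece (split /\n/, $source) {
        $piece =~ s/^\s+|\s+$//g;               # trim each input line
        next unless length($piece);
        if (length($line) && length($line) + 1 + length($piece) > $width) {
            push @out, $line;                   # current line is full
            $line = $piece;
        }
        else {
            $line = length($line) ? "$line $piece" : $piece;
        }
    }
    push @out, $line if length($line);
    return join("\n", @out) . "\n";
}

my $loop = join "\n", 'for (@foo) {', '    $_++;', '}';
print crunch($loop, 80);
print crunch($loop, 10);
```

A single input line longer than the requested width is passed through unsplit, which matches the "where possible" wording above.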
Totally esoteric guru question
For the real gurus here there is an interesting anomaly with the script
perl/lib/B/Assembler.pm. This script contains the following subs with 'miscoded'
regexes:
sub strip_comments {
    my $stmt = shift;
    # Comments only allowed in instructions which don't take string arguments
    $stmt =~ s{
        (?sx)   # Snazzy extended regexp coming up. Also, treat
                # string as a single line so .* eats \n characters.
        ^\s*    # Ignore leading whitespace
        (
          [^"]* # A double quote '"' indicates a string argument. If we
                # find a double quote, the match fails and we strip nothing.
        )
        \s*\#   # Any amount of whitespace plus the comment marker...
        .*$     # ...which carries on to end-of-string.
    }{$1};      # Keep only the instruction and optional argument.
    return $stmt;
}
sub parse_statement {
    my $stmt = shift;
    my ($insn, $arg) = $stmt =~ m{
        (?sx)
        ^\s*    # allow (but ignore) leading whitespace
        (.*?)   # Instruction continues up until...
        (?:     # ...an optional whitespace+argument group
            \s+ # first whitespace.
            (.*) # The argument is all the rest (newlines included).
        )?$     # anchor at end-of-line
    };
    #[snip]
}
OK, so the author obviously meant to add a /x modifier but forgot, so his snazzy
regex is not as functional as hoped. The comments are interpreted by the perl
parser as literal strings to be matched literally.
While experimenting with different concatenation lengths this script sometimes
pops up as broken. What is interesting is that while you can concatenate the
whole regex, certain partial concatenations of it cause a syntax error. If you
concat the whole regex onto one line you get no worries, so we are not losing
the closing tokens and ;. However, partial concatenations of these regexes cause
syntax errors in some cases but not in others.
Try it yourself. It seems inexplicably weird to me, as the entire content of
these regexes is a string literal with no comments. A bug in the perl parser?
Cheers
James
------------------------------
Date: Tue, 01 May 2001 12:20:14 GMT
From: tjla@guvfybir.qlaqaf.bet (Gwyn Judd)
Subject: Re: Hacker challenge. Can you break this script for me?
Message-Id: <slrn9etafs.g1.tjla@thislove.dyndns.org>
"mein Luftkissenfahrzeug ist voll von den Aalen"
said Jfreeman (jfreeman@tassie.net.au) in
<3AEEA10A.E2DC57C7@tassie.net.au>:
> $stmt =~ s{
> (?sx) # Snazzy extended regexp coming up. Also, treat
>OK so the author obviously meant to add a /x modifier but forgot so his snazzy
>regex is not as functional as hoped. The comments are interpretted by the perl
>parser as literal strings to be matched literally.
Not so. See the perlre manpage (search for "(?imsx-imsx)").
--
Gwyn Judd (print `echo 'tjla@guvfybir.qlaqaf.bet' | rot13`)
LSD melts your mind, not in your hand.
------------------------------
Date: Tue, 01 May 2001 22:55:41 +1000
From: Jfreeman <jfreeman@tassie.net.au>
Subject: Re: Hacker challenge. Can you break this script for me?
Message-Id: <3AEEB24D.ED72C0FF@tassie.net.au>
Thanks Gwyn
Checked the manpage. I was not aware of that syntax and will arrange to parse for
it. Thank you. It seems to be missed in the Camel book, but maybe I got
inattentive. The little engine who blinked perhaps!
I can now see why this concatenated regex parses:
$stmt =~ s{
(?sx) # Snazzy extended regexp coming up. Also, treat# string as a single line so
.* eats \n characters.^\s* # Ignore leading whitespace
([^"]* # A double quote '"' indicates a string argument. If we# find a double
quote, the match fails and we strip nothing.
)\s*\# # Any amount of whitespace plus the comment marker....*$ # ...which
carries on to end-of-string.}{$1}; # Keep only the instruction and optional
argument.
Although the closing }{$1}; #... appears hidden behind the comment, the parser is
following the rule expressed in the Camel book (don't put your closing delimiter
in the comments of a /x regex) and still finding it. This odd behaviour was
troubling me. Thanks for the info, much appreciated.
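The inline (?x) also offers a plausible reason why some partial concatenations fail while a full one does not: in extended mode a # comment runs to the end of the line, so joining lines can push an opening parenthesis into a comment while its closing partner stays live. The sketch below uses a made-up pattern rather than the B::Assembler one, and the helper name regex_error is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns '' if the pattern compiles, else the error text.
sub regex_error {
    my ($pattern) = @_;
    my $ok = eval { qr/$pattern/; 1 };
    return $ok ? '' : $@;
}

# Pattern as written, one element per line.
my $original = join "\n",
    '(?x)', '^\s*  # leading space', '(     # open group', '\w+', ')';

# First three lines joined: "(" is now inside a comment, ")" is still live.
my $partial = join "\n",
    '(?x) ^\s*  # leading space (     # open group', '\w+', ')';

# Everything joined: all text after the first # is comment, parens included.
my $full = '(?x) ^\s*  # leading space (     # open group \w+ )';

for my $case ([original => $original], [partial => $partial], [full => $full]) {
    my ($name, $pattern) = @$case;
    printf "%-8s %s\n", $name, regex_error($pattern) ? 'fails' : 'compiles';
}
```

Only the partial join should fail, with an unmatched-parenthesis error. The fully joined pattern compiles, although it no longer matches what the original did.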
James
Gwyn Judd wrote:
> "mein Luftkissenfahrzeug ist voll von den Aalen"
> said Jfreeman (jfreeman@tassie.net.au) in
> <3AEEA10A.E2DC57C7@tassie.net.au>:
> > $stmt =~ s{
> > (?sx) # Snazzy extended regexp coming up. Also, treat
>
> >OK so the author obviously meant to add a /x modifier but forgot so his snazzy
> >regex is not as functional as hoped. The comments are interpretted by the perl
> >parser as literal strings to be matched literally.
>
> Not so. See the perlre manpage (search for "(?imsx-imsx)").
>
> --
> Gwyn Judd (print `echo 'tjla@guvfybir.qlaqaf.bet' | rot13`)
> LSD melts your mind, not in your hand.
------------------------------
Date: Tue, 1 May 2001 09:25:00 +0200
From: "Up4U2" <up4u2@hotmail.com>
Subject: How can I find out with perl which processes are running on Windows NT?
Message-Id: <9cloct$ebthn$1@ID-86232.news.dfncis.de>
How can I find out with perl which processes are running on an NT system?
I have already spent a day or two on the Perl FAQ / documentation and Google searches.
Peter
------------------------------
Date: Tue, 01 May 2001 08:01:27 -0400
From: Raj Wurttemberg <nospam@nospam.com>
Subject: How to get a list of Disabled user accounts with Win32::AdminMisc??
Message-Id: <ko8tet8ovammrqgeg1krllamvrh8i9oo24@4ax.com>
I'm trying to use Perl to display a list of users and show if the account is
disabled or active using the code from "Win32::AdminMisc". So far I have been
able to get a list of users but I don't seem to understand how to query the
list and pull the values with "Win32::AdminMisc::GetUserMiscAttributes". This
is the code I have so far:
# uinfo2.pl
# Load modules
use Win32::NetAdmin;
use Win32::AdminMisc;
# Find Windows NT PDC
#
if ( Win32::NetAdmin::GetDomainController("", "", $Server))
{
    print "The primary domain controller is $Server\n";
} else {
    print Win32::FormatMessage( Win32::NetAdmin::GetError() );
}
# Display a list of users in the Domain
#
if( Win32::AdminMisc::GetUsers( $Server, "", \@UserList ) )
{
    print "The list of user accounts are:\n";
    map { print "\t$_\n";} @UserList;
}
I know I'm missing something simple, but my Perl knowledge is not the best. Any
assistance would be appreciated.
Thanks,
/*Raj*/
raj|\|0@SPA|\/|starbase-01.com
------------------------------
Date: Tue, 01 May 2001 11:36:04 GMT
From: "BarryK" <notmyrealemail@fake.com>
Subject: Re: How to: Create Regex which extracts N number of words before target word
Message-Id: <EcxH6.46486$U4.10948483@news1.rdc1.tn.home.com>
Alas, this won't even compile on my machine.
"Garry Williams" <garry@ifr.zvolve.net> wrote in message
news:slrn9equ02.c2k.garry@zfw.zvolve.net...
|> On Mon, 30 Apr 2001 13:17:45 GMT, BarryK <notmyrealemail@fake.com>
|> wrote:
|>
|> Assume you have a target word, e.g. "cat", and you want to extract
|> that word and a certain number of words before it. How is one to do
|> this in a non-literal manner with a regular expression which will
|> support any number of pre-words to be extracted?
|>
|> Following does not work. It should replace target and previous two
|> words, to wit: word1 word2 Z
|>
|> code
|>
|> $_ = "word1 word2 word3 word4 cat";
|>
|> s: \b.+\b{2}?cat :Z:xg;
|>
|> The quantifier ({2}) is quantifying the atom `\b'. I don't think
|> that's what you meant. The `?' following the quantifier {2} makes no
|> sense, since the {2} is not allowed any latitude -- it forces exactly
|> two.
|>
GW:
|> Here's one way to do what I think you want:
|>
|> perl -wle '$_="word1 word2 word3 word4 cat";' \
|> -e 's/\b(?:\w+ +){2}cat/Z/; print'
|> word1 word2 Z
|> $
|>
|> Obligatory mention: see perlre.
|>
|> --
|> Garry Williams
------------------------------
Date: Tue, 01 May 2001 12:27:22 GMT
From: "BarryK" <notmyrealemail@fake.com>
Subject: Re: How to: Create Regex which extracts N number of words before target word
Message-Id: <KYxH6.46511$U4.10972073@news1.rdc1.tn.home.com>
This works for two words:
$_ = "aaahhh dog whale cat male xxx yyy";
s|[A-Za-z0-9\.]+ [A-Za-z0-9\.]+ cat [A-Za-z0-9\.]+ [A-Za-z0-9\.]+ |X|;
print $_;
=======
> |> Assume you have a target word, e.g. "cat", and you want to extract
> |> that word and a certain number of words before it. How is one to do
> |> this in a non-literal manner with a regular expression which will
> |> support any number of pre-words to be extracted?
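
The literal pattern above hard-codes two words on each side. A minimal sketch of the "any number of pre-words" variant follows, built on the quantified group from the earlier reply. The sub name replace_with_prewords and its arguments are made up for illustration, and \w+ is assumed to be an acceptable definition of a word (it does not allow the dots that the character class above does).

```perl
use strict;
use warnings;

# Replace $target and the $n words before it with $replacement.
# $n is interpolated into the quantifier before the pattern is compiled;
# \Q...\E keeps any metacharacters in $target literal.
sub replace_with_prewords {
    my ($text, $target, $n, $replacement) = @_;
    $text =~ s/\b(?:\w+\s+){$n}\Q$target\E\b/$replacement/;
    return $text;
}

print replace_with_prewords("word1 word2 word3 word4 cat", "cat", 2, "Z"), "\n";
print replace_with_prewords("word1 word2 word3 word4 cat", "cat", 3, "Z"), "\n";
```

With $n set to 2 this should give "word1 word2 Z", the same result as the one-liner quoted earlier in the thread; changing $n is the only thing needed to take more or fewer words.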
------------------------------
Date: Tue, 1 May 2001 13:49:10 +0200
From: eins@durchnull.de (Rudolf Polzer)
Subject: Re: one-line stderr, stdout redirection
Message-Id: <slrn9et8lm.2nm.eins@www42.t-offline.de>
Randal L. Schwartz <merlyn@stonehenge.com> wrote:
> >>>>> "Uri" == Uri Guttman <uri@sysarch.com> writes:
>
> RLS> But while I'm not sure what piece of documentation ensures it,
> RLS> I'm very sure that I know somehow that STDIN is always fileno 0,
> RLS> STDOUT always 1, and STDERR 2. Even when reopened as such.
>
> Uri> not exactly.
>
> Uri> perl -le 'close STDIN; open FOO, "/dev/null"; open STDIN, "/dev/tty"; print fileno( \*FOO ), " ", fileno( \*STDIN )'
> Uri> 0 3
>
> You cheated. You closed STDIN, so it no longer has its magical bit in
> the stash.
>
> I'm talking about "reopening STDIN". There's some special code
> that ensures that open STDIN forces no loss of fileno 0.
>
> Uri> the rule is simpler. a new file descriptor (file or socket) uses the
> Uri> lowest available integer in the fd table in the process.
Can you still use IO::Select when you have >30 open sockets? select() does
not work because the bit vector can only hold fileno()s in 0..31.
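
For what it is worth, the bit vectors that Perl's four-argument select() takes are ordinary strings built with vec(), and they grow as bits are set, so they are not confined to one 32-bit word. The sketch below only builds such a mask and never calls select(); the helper name fd_mask and the descriptor numbers are made up for illustration. The practical ceiling is usually whatever the operating system's select() accepts.

```perl
use strict;
use warnings;

# Build a select()-style bit vector for a list of descriptor numbers.
# In real code the numbers would come from fileno($handle).
sub fd_mask {
    my @fds = @_;
    my $mask = '';
    vec($mask, $_, 1) = 1 for @fds;   # the string is extended as needed
    return $mask;
}

my $mask = fd_mask(0, 31, 40);
printf "mask is %d bytes long\n", length $mask;
print "descriptor 40 is set\n" if vec($mask, 40, 1);
print "descriptor 39 is clear\n" unless vec($mask, 39, 1);
```

IO::Select builds the same kind of mask internally, so it should behave the same way for descriptors above 31.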
--
#!/usr/bin/perl -W -- WARNING: This copies a random file from
use strict;my$s;my$n=0;for # the current directory to your
(<*>){++$n;int rand$n or$s # signature file. Use at your
=$_};`cp $s ~/.signature`; # own risk! (c) 2001 Rudolf Polzer
------------------------------
Date: Tue, 1 May 2001 08:27:19 -0400
From: "Dennis Kowalski" <dennis.kowalsk@daytonoh.ncr.com>
Subject: Re: one-line stderr, stdout redirection
Message-Id: <3aeeaba8$1@rpc1284.daytonoh.ncr.com>
OK, I concede the top-posting error. I hit the wrong button.
I did not feel that writing a complete program with all error checking was
required just to get the point of the question across.
If you think the method proposed is bad, complain to Sriram Srinivasan and
O'Reilly & Assoc.
The example I gave came straight off page 49, "I/O Redirection", in the
Advanced Perl Programming book.
I don't pretend to know it all. I trust the authors to know what they are
talking about.
Jon Ericson <Jonathan.L.Ericson@jpl.nasa.gov> wrote in message
news:86bspenfts.fsf@jon_ericson.jpl.nasa.gov...
> "Dennis Kowalski" <dennis.kowalsk@daytonoh.ncr.com> writes:
>
> > Try this
> >
> > open(LOG,">$filename");
> > *STDERR = *LOG;
> >
> > Gerald Shuman <nospam@newsranger.com> wrote in message
> > news:BrgH6.4149$SZ5.335582@www.newsranger.com...
> > > What's the perl equivalent of "exec 2> stderr.txt" in a shell script?
> > >
> > >
>
> Yikes! Top-posting, "Try this", and not checking the return value of
> a system call are three signs of bad advice. I suppose that the
> general idea (opening a filehandle and copying the entire typeglob to
> the STDERR typeglob) isn't terrible, but there are better solutions.
> (perlopentut covers this topic quite well.)
>
> Jon
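
One of the "better solutions" in the spirit of perlopentut is to reopen STDERR itself and check the result, which is roughly what "exec 2> stderr.txt" does in a shell. This is only a sketch: the file name is taken from the original question, and saving and restoring the old STDERR is an extra step added here so that the example cleans up after itself.

```perl
use strict;
use warnings;

my $filename = 'stderr.txt';   # name borrowed from the shell example

# Keep a duplicate of the current STDERR so it can be restored later.
open(my $saved_err, '>&', \*STDERR)
    or die "Cannot dup STDERR: $!";

# Reopen STDERR onto the file. The handle stays on file descriptor 2,
# so child processes started from here inherit the redirection.
open(STDERR, '>', $filename)
    or die "Cannot redirect STDERR to $filename: $!";

print STDERR "this line lands in $filename\n";
my $fd_while_redirected = fileno(STDERR);

# Put the original STDERR back; reopening flushes and closes the file.
open(STDERR, '>&', $saved_err)
    or die "Cannot restore STDERR: $!";

print "STDERR was on descriptor $fd_while_redirected while redirected\n";
```

Unlike the typeglob assignment, this affects the real descriptor rather than only the Perl-level name, which matters as soon as system() or backticks are involved.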
------------------------------
Date: 1 May 2001 09:24:47 GMT
From: "Scott R. Godin" <webmaster@webdragon.unmunge.net>
Subject: possible to dupe STDOUT to a file while still STDOUT-ing? :)
Message-Id: <9clvcv$t7p$1@216.155.32.176>
I'd like to know if it's possible to do such a thing.. continue printing
to STDOUT but also have same echoed to a file without multiple print
statements or shuffling back and forth with select($fh)
I have an instance where, a program I've written to output an html-based
template from form input is sent to the browser for preview, but I'd
also like it dumped to a local file at the same time, so that the user
can download it.
Maybe I'm just being silly about this, but is it even possible?
--
unmunge e-mail here:
#!perl -w
print map {chr(ord($_)-3)} split //, "zhepdvwhuCzhegudjrq1qhw";
# ( damn spammers. *shakes fist* take a hint. =:P )
------------------------------
Date: Tue, 01 May 2001 11:38:21 GMT
From: Bart Lateur <bart.lateur@skynet.be>
Subject: Re: possible to dupe STDOUT to a file while still STDOUT-ing? :)
Message-Id: <oh7tetglrot6amgfp9t9stfa61jd0f8p92@4ax.com>
Scott R. Godin wrote:
>I'd like to know if it's possible to do such a thing.. continue printing
>to STDOUT but also have same echoed to a file without multiple print
>statements or shuffling back and forth with select($fh)
What you're asking for is known as tee-ing. There's even an entry in the
FAQ about it.
Found in perlfaq5.pod
How do I print to more than one file at once?
Unfortunately, the solution is rather Unix-centric:
To connect up to one filehandle to several output filehandles,
it's easiest to use the tee(1) program if you have it, and let
it take care of the multiplexing
Bummer. Possible solutions still include using a tied filehandle, getting
tee from the GNU-for-Win32 toolbox (Cygwin), or using a Perl script tee
clone, like (as per the FAQ)
<http://www.perl.com/CPAN/authors/id/TOMC/scripts/tct.gz>
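
As a rough idea of what the tied-filehandle route could look like, here is a minimal sketch. The package name Tee and the file name preview.html are made up for this example (this is not a CPAN module), and only PRINT and PRINTF are implemented, so write() and syswrite() on the tied handle are not covered.

```perl
use strict;
use warnings;

package Tee;   # hypothetical class, for illustration only

# The tied object is just a list of the real handles to copy to.
sub TIEHANDLE {
    my ($class, @handles) = @_;
    return bless [@handles], $class;
}

# Called for every print on the tied handle.
sub PRINT {
    my $self = shift;
    my $ok = 1;
    for my $fh (@$self) {
        print {$fh} @_ or $ok = 0;
    }
    return $ok;
}

# Called for every printf on the tied handle.
sub PRINTF {
    my $self = shift;
    my $format = shift;
    return $self->PRINT(sprintf $format, @_);
}

package main;

# Duplicate the real STDOUT first, because STDOUT itself gets tied.
open(my $real_stdout, '>&', \*STDOUT) or die "Cannot dup STDOUT: $!";
open(my $copy, '>', 'preview.html')   or die "Cannot write preview.html: $!";

tie *STDOUT, 'Tee', $real_stdout, $copy;
print "<p>hello</p>\n";          # goes to the browser and to the file
untie *STDOUT;

close $copy or die "Cannot close preview.html: $!";
```

The rest of the program keeps using plain print, which is the point of the exercise; output from child processes bypasses the tie, since that only exists at the Perl level.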
>I have an instance where, a program I've written to output an html-based
>template from form input is sent to the browser for preview, but I'd
>also like it dumped to a local file at the same time, so that the user
>can download it.
Isn't that silly? Can't the user just rightclick (or something similar)
and choose "Save As..."?
--
Bart.
------------------------------
Date: Tue, 1 May 2001 13:46:14 +0200
From: eins@durchnull.de (Rudolf Polzer)
Subject: Re: pretty-printing perl?
Message-Id: <slrn9et8g6.2nm.eins@www42.t-offline.de>
Ilya Zakharevich <ilya@math.ohio-state.edu> wrote:
> [A complimentary Cc of this posting was sent to
> Rudolf Polzer
> <eins@durchnull.de>], who wrote in article <slrn9eqak8.nd5.eins@www42.t-offline.de>:
> > > > perldoc -q pretty-printer
> > > >
> > > > "Is there a pretty-printer (formatter) for Perl?"
> > >
> > > Is there any reason to trust the Perl FAQ?
> > >
> > > No.
> > >
> > > See the menu in cperl-mode.
> >
> > You will have to undo some changes, however. Only perl can parse Perl.
> >
> > Try (I did not check it): This should confuse your formatter.
>
> LOL! And you should stop beating your grandchildren. (I did not check it.)
Did you try it? I do not use vi or emacs, so I cannot.
--
#!/usr/bin/perl -- Random sig generator. Editor command in slrn => ~/siggs
$F=shift;open H,"+<$F";$_=join"",<H>;$s=index$_,"\n\n-- \n";$s<0||truncate
H,$s;close H;system"$ENV{EDITOR} $F</dev/tty>/dev/tty";$s=$n=0;for#sichtig
(<~/siggs/*>){++$n;int rand$n or$s=$_};`(echo "\n\n-- ")|cat - $s>>$F`+nan
------------------------------
Date: Tue, 01 May 2001 07:47:41 GMT
From: "Lauren Smith" <lauren_smith13@hotmail.com>
Subject: Re: R-E Perl Code
Message-Id: <xStH6.10529$WZ4.716407@paloalto-snr1.gtei.net>
"Mirek Rewak" <cave@pertus.com.pl> wrote in message
news:t2kset4k4p2obkcvkrvqjf29r8hjvc8s23@4ax.com...
> Hi,
> Is there a tool for reverse-engineering Perl code? I know that there is
> an add-in for WithClass, but it doesn't work in WithClass 2000 (it seems
> that it was built for version 99, but there is only 2000 to download).
> I would prefer a tool for Rational Rose, but I didn't find one.
The source for perl is available at
http://www.perl.com/pub/language/info/software.html#sourcecode. No
reverse engineering necessary!
As far as reverse engineering Perl code goes, you may want to pick up a book
on Perl. Here's a site with good recommendations:
http://www.perl.com/pub/language/critiques/index.html.
Lauren
------------------------------
Date: Tue, 01 May 2001 10:13:40 GMT
From: porgie@fst.fun (Porgie Tirebiter)
Subject: Re: Separate syslog file?
Message-Id: <3aee8b42.4829239@news.giganews.com>
On Sun, 29 Apr 2001 13:49:55 -0400, John Schmidt
<js@saltmine.radix.net> wrote:
>You have a UN*X problem, not a Perl problem. There are predefined
>syslog facilities which *really* *should* be listed in the syslog
>man pages.
When you say "syslog man pages", are you talking about UNIX man pages
or Perl man pages? I don't have any helpful syslog UNIX man pages:
$ cd /usr/man
$ ls */syslog*
man2/syslog.2.gz man3/syslog.3.gz man5/syslog.conf.5.gz
$
Sections 2 and 3 of the manual are C function calls. I checked on
another Linux box and got the same results.
>For a non-standard syslog facility, you'll want to use localX
>where X is a number between 0 and 7 - at least on the above OSes.
This is probably the information that I need to know. Thanks for the
help.
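
For the Perl side of this, a minimal sketch with the standard Sys::Syslog module might look like the following. The ident 'myscript' and the choice of local0 are placeholders, and routing that facility to its own file is assumed to be done separately in syslog.conf (something along the lines of a local0.* selector pointing at a log file). The eval is there only so the sketch does not die on a machine where no syslog daemon is reachable.

```perl
use strict;
use warnings;
use Sys::Syslog qw(openlog syslog closelog);

my $facility = 'local0';   # any of local0 .. local7

# Returns 1 if the message was handed to syslog, 0 otherwise.
sub log_info {
    my ($message) = @_;
    my $ok = eval {
        openlog('myscript', 'pid', $facility);   # placeholder ident
        syslog('info', '%s', $message);
        closelog();
        1;
    };
    return $ok ? 1 : 0;
}

log_info('separate log file test')
    or warn "syslog not reachable: $@";
```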
------------------------------
Date: Tue, 1 May 2001 22:22:37 +1000
From: mgjv@tradingpost.com.au (Martien Verbruggen)
Subject: Re: Should Perl be first?
Message-Id: <slrn9etakd.os0.mgjv@martien.heliotrope.home>
On Sun, 29 Apr 2001 13:35:06 -0400,
David Coppit <newspost@coppit.org> wrote:
>
> My impression of VB is that it's a much simpler language than Perl or
> certainly C++, but probably not as simple as C. There are still a lot
> of annoying aspects, such as scoping issues and declarations. Perl's
> syntax will cause you more grief, but you'll probably find the
> language much more powerful than VB.
Hmmm. I would certainly not call C simple. Most certainly not. The
syntax isn't hard, writing simple stuff in it isn't hard, but writing a
large production-ready library or program in C is certainly hard, and
requires loads of programming discipline. Higher level languages like
VB, Perl, Python, Java and C++ are easier to use as far as memory
management and out-of-bounds access are concerned, and they also provide
many tools to perform tasks that need to be coded explicitly in C.
Besides that, C lives so dangerously close to the machine that it's very
easy to create code that inexplicably fails to produce correct results
when you compile it on a different platform. Many, many subtle pitfalls
and daemons lurk in the depths of C.
To the OP:
I'd have to agree with Tad most, in this thread (at least, I believe it
was Tad):
If you just want to get to work and make money, pick a language, and use
it. Pick a language that's at a suitably high level. Perl is one of my
favourites, and VB is very much down at the bottom of the list. But I've
always had a thing about trying to stay at least vaguely portable. VB
locks you very tightly in to one platform.
If you want to become a professional programmer, a good one, learn some
other languages, and C should probably be in there [1].
Experience is what's going to make you a good programmer. Experience in
more than one language will broaden your horizons, and shift your
perspective a few times. Perl is probably the language that allows for
the widest horizon and most varied perspective of all of them.
Oh, and don't forget to pick up some general language-independent
algorithm stuff. Knuth's got some good stuff on that [2].
Martien
[1] Although I know a few good programmers who don't use C, but they do
generally know COBOL, Lisp, Algol or C++.
[2] Donald Knuth, The Art of Computer Programming.
--
Martien Verbruggen |
Interactive Media Division | Little girls, like butterflies, need
Commercial Dynamics Pty. Ltd. | no excuse - Lazarus Long
NSW, Australia |
------------------------------
Date: Tue, 1 May 2001 13:53:17 +0200
From: eins@durchnull.de (Rudolf Polzer)
Subject: Re: Strange string -> num conversion
Message-Id: <slrn9et8td.2nm.eins@www42.t-offline.de>
Ren Maddox <ren@tivoli.com> wrote:
> On Mon, 30 Apr 2001, eins@durchnull.de wrote:
>
> > so C's atof has the same behaviour as perl's implicit string->num
> > conversion. To those who do _not_ get either 18 or 0 on
>
> Drat... so close. I forgot to look at atof (and strtod) and instead
> looked at atol (and strtol). strtol has a base argument, and atol
> uses base 10. strtod does not have a base argument, nor does the man
> page on my system make any mention of it handling bases in any way:
>
> The expected form of the string is optional leading white
> space as checked by isspace(3), an optional plus (``+'')
> or minus sign (``-'') followed by a sequence of digits
> optionally containing a decimal-point character, optionally
> followed by an exponent. An exponent consists of an
> ``E'' or ``e'', followed by an optional plus or minus
> sign, followed by a non-empty sequence of digits. If the
> locale is not "C" or "POSIX", different formats may be
> used.
>
> That "sequence of digits" part bothers me a bit in comparison to the
> observed behavior. I'd be surprised to find that the POSIX standard
> left the format of the number so up in the air -- guess I need to look
> it up.
>
> Still, the only way I can get the original "2.125" output is by
> including a decimal place:
>
> perl -le 'print 1 + "0x1.2"'
Did you try the C program?
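
Whatever the C library does, the portable way to get a number out of a hex string in Perl is to ask for it explicitly. The sketch below contrasts the implicit conversion with hex() and oct(); the variable names are made up. On current perls the implicit conversion should stop at the "x" and yield 0, but as this thread shows, older builds that lean on the system's atof may differ, so only the explicit calls should be relied on.

```perl
use strict;
use warnings;

my $str = "0x12";

# Implicit string -> number conversion; the warning about a non-numeric
# string is silenced here because the mismatch is the point of the demo.
my $implicit = do { no warnings 'numeric'; $str + 0 };

# hex() and oct() both understand the 0x prefix.
my $via_hex = hex $str;
my $via_oct = oct $str;

print "implicit: $implicit, hex: $via_hex, oct: $via_oct\n";
```

Neither hex() nor oct() accepts a fractional part, so a string like "0x1.2" still needs handling of its own.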
--
#!/usr/bin/perl -W -- WARNING: This copies a random file from
use strict;my$s;my$n=0;for # the current directory to your
(<*>){++$n;int rand$n or$s # signature file. Use at your
=$_};`cp $s ~/.signature`; # own risk! (c) 2001 Rudolf Polzer
------------------------------
Date: Tue, 01 May 2001 10:21:14 +0100
From: Matt Sergeant <matt@sergeant.org>
Subject: Re: XML::RSS and mod_perl
Message-Id: <3AEE800A.1F80EA5A@sergeant.org>
Matt Morton-Allen wrote:
>
> It worked! Thank you for your help. You've almost single-handedly rekindled
> my faith in newsgroups!
Oh, no, please don't take that as standard fare for around here. Instead
join the mod_perl mailing list and the Perl-XML mailing list.
--
<Matt/>
/|| ** Founder and CTO ** ** http://axkit.com/ **
//|| ** AxKit.com Ltd ** ** XML Application Serving **
// || ** http://axkit.org ** ** XSLT, XPathScript, XSP **
// \\| // ** mod_perl news and resources: http://take23.org **
\\//
//\\
// \\
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 810
**************************************