[31514] in Perl-Users-Digest
Perl-Users Digest, Issue: 2773 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Jan 18 06:09:39 2010
Date: Mon, 18 Jan 2010 03:09:04 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Mon, 18 Jan 2010 Volume: 11 Number: 2773
Today's topics:
Re: a defense of ad hoc software development <cartercc@gmail.com>
css parser <catebekensail@yahoo.com>
Re: css parser <OJZGSRPBZVCX@spammotel.com>
Re: css parser <rkb@i.frys.com>
Re: css parser <sreservoir@gmail.com>
Re: css parser <m@rtij.nl.invlalid>
Re: FAQ 9.16 How do I decode a CGI form? <paduille.4061.mumia.w+nospam@earthlink.net>
Subroutines and $_[0] <me@me.com>
Re: Subroutines and $_[0] <OJZGSRPBZVCX@spammotel.com>
Re: Subroutines and $_[0] <derykus@gmail.com>
Re: Subroutines and $_[0] <derykus@gmail.com>
Re: Subroutines and $_[0] <ben@morrow.me.uk>
Re: Subroutines and $_[0] <uri@StemSystems.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Sun, 17 Jan 2010 05:09:14 -0800 (PST)
From: ccc31807 <cartercc@gmail.com>
Subject: Re: a defense of ad hoc software development
Message-Id: <ba1509fc-7944-42a7-bde8-c16548630358@14g2000yqp.googlegroups.com>
On Jan 16, 12:30 am, seeWebInst...@rem.intarweb.org (Robert Maas,
http://tinyurl.com/uh3t) wrote:
> I'm less familiar with military specifications, but I would guess
> these are like InterNet specs, where you might code an API module
> for each set of specs, coding directly from the specs, then write
> the rest of the application in a more experimental style.
http://homepage.mac.com/simon.j.wright/pushface.org/mil_498/index.htm
Read it and weep.
> It is sometimes desirable to quickly write non-functional UI which
> shows the user how it "looks", and get their approval of that,
> before starting the real code under the UI.
Absolutely. However, one hundred percent of my scripts are designed to
be run from the command line, sometimes as cron jobs, sometimes called
by DOS batch files (for users uncomfortable with the CLI). I've
learned to write real specific prompts for the input data, and write
sanity checks throughout (e.g., "You entered the date of August 71,
2009. Do you want to continue? [y|n]")
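
A minimal sketch of the kind of sanity check described above, assuming dates arrive as numeric year, month and day. The names is_valid_date, days_in_month and confirm are hypothetical, not taken from the poster's scripts:

```perl
use strict;
use warnings;

# Hypothetical helper: number of days in a month, allowing for leap years.
sub days_in_month {
    my ($year, $month) = @_;
    my @days = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
    my $leap = ($year % 4 == 0 && $year % 100 != 0) || $year % 400 == 0;
    return $days[$month - 1] + (($month == 2 && $leap) ? 1 : 0);
}

# Hypothetical helper: 1 if the triple names a real calendar date, else 0.
sub is_valid_date {
    my ($year, $month, $day) = @_;
    return 0 unless $month >= 1 && $month <= 12;
    return ($day >= 1 && $day <= days_in_month($year, $month)) ? 1 : 0;
}

# Hypothetical helper: ask a yes/no question and return 1 for "y".
# Reads from STDIN unless another filehandle is passed in.
sub confirm {
    my ($question, $in) = @_;
    $in ||= \*STDIN;
    print "$question [y|n] ";
    my $answer = <$in>;
    return defined $answer && $answer =~ /^\s*y/i ? 1 : 0;
}
```

A script could then call confirm("You entered the date of August 71, 2009. Do you want to continue?") only when is_valid_date(2009, 8, 71) returns 0.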
CC
------------------------------
Date: Sun, 17 Jan 2010 14:26:33 -0800 (PST)
From: cate <catebekensail@yahoo.com>
Subject: css parser
Message-Id: <c0a4b057-e2b6-4e1a-a7dd-51672ba4278d@r19g2000yqb.googlegroups.com>
I would think this is possible - just a couple of hints to get
started.
I have a perl .asp script running in an apache environment, building an
html page. One of the script's input arguments is a css class name. I
can't see the class yet but I do have the location of all the style
files. Is there a module which will parse these files into an xml-css
dom that I can inspect with XML::DOM or a like module?
Any suggestions on how to proceed would help me.
Thank you.
------------------------------
Date: Sun, 17 Jan 2010 23:28:40 +0100
From: "Jochen Lehmeier" <OJZGSRPBZVCX@spammotel.com>
Subject: Re: css parser
Message-Id: <op.u6o7d2hpmk9oye@frodo>
On Sun, 17 Jan 2010 23:26:33 +0100, cate <catebekensail@yahoo.com> wrote:
> Any suggestions on how to proceed would help me.
http://search.cpan.org/search?query=css&mode=all
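
If a full parser from CPAN turns out to be more than is needed, a naive scan may be enough to look up one class. The sketch below is an assumption, not a module's API: css_class_properties is a hypothetical helper, and it ignores @media nesting, braces inside strings, and selector specificity.

```perl
use strict;
use warnings;

# Hypothetical helper: return property => value pairs for every rule whose
# selector list mentions .$class. Later rules overwrite earlier ones.
sub css_class_properties {
    my ($css, $class) = @_;
    $css =~ s!/\*.*?\*/!!gs;                      # strip /* comments */
    my %props;
    while ($css =~ m/([^{}]+)\{([^{}]*)\}/g) {    # selectors { declarations }
        my ($selectors, $body) = ($1, $2);
        # Require the exact class name, so ".note" does not match ".notes".
        next unless $selectors =~ /\.\Q$class\E(?![\w-])/;
        for my $decl (split /;/, $body) {
            my ($name, $value) = $decl =~ /^\s*([\w-]+)\s*:\s*(.*?)\s*$/s
                or next;                          # skip empty or odd pieces
            $props{lc $name} = $value;
        }
    }
    return %props;
}
```

The style files would be read into a string first; my %p = css_class_properties($css_text, 'note') then gives the declarations to inspect.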
------------------------------
Date: Sun, 17 Jan 2010 20:38:55 -0800 (PST)
From: Ron Bergin <rkb@i.frys.com>
Subject: Re: css parser
Message-Id: <7e5ac50c-2e45-4e54-bdd1-eb408e1d90b1@r5g2000yqb.googlegroups.com>
On Jan 17, 2:26 pm, cate <catebekens...@yahoo.com> wrote:
> I would think this is possible - just a couple of hints to get
> started.
>
> I have a perl .asp script
Which is it? Do you have a Perl script or an asp script?
------------------------------
Date: Mon, 18 Jan 2010 00:23:53 -0500
From: sreservoir <sreservoir@gmail.com>
Subject: Re: css parser
Message-Id: <hj0r9n$f5b$1@speranza.aioe.org>
On 1/17/2010 11:38 PM, Ron Bergin wrote:
> On Jan 17, 2:26 pm, cate<catebekens...@yahoo.com> wrote:
>> I would think this is possible - just a couple of hints to get
>> started.
>>
>> I have a perl .asp script
>
> Which is it? Do you have a Perl script or an asp script?
presumably, asp using perlscript.
--
"Six by nine. Forty two."
"That's it. That's all there is."
"I always thought something was fundamentally wrong with the universe"
------------------------------
Date: Mon, 18 Jan 2010 08:48:50 +0100
From: Martijn Lievaart <m@rtij.nl.invlalid>
Subject: Re: css parser
Message-Id: <2cgd27-704.ln1@news.rtij.nl>
On Sun, 17 Jan 2010 20:38:55 -0800, Ron Bergin wrote:
> On Jan 17, 2:26 pm, cate <catebekens...@yahoo.com> wrote:
>> I would think this is possible - just a couple of hints to get started.
>>
>> I have a perl .asp script
>
> Which is it? Do you have a Perl script or an asp script?
Asp is a technology that supports multiple languages, Perl is one of them.
HTH,
M4
------------------------------
Date: Sat, 16 Jan 2010 23:36:18 -0600
From: "Mumia W." <paduille.4061.mumia.w+nospam@earthlink.net>
Subject: Re: FAQ 9.16 How do I decode a CGI form?
Message-Id: <d8SdnaOysbG5P8_WnZ2dnUVZ_uudnZ2d@earthlink.com>
On 01/16/2010 05:04 PM, Helmut Richter wrote:
> [...]
> The documentation http://perldoc.perl.org/CGI.html says:
>
> | -utf8
> |
> |
> | This makes CGI.pm treat all parameters as UTF-8 strings. Use this with care,
> | as it will interfere with the processing of binary uploads. It is better to
> | manually select which fields are expected to return utf-8 strings and
> | convert them using code like this:
> |
> | use Encode;
> | my $arg = decode utf8=>param('foo');
>
> [...]
> e. The second possible interpretation is that, with the -utf8 pragma,
> param() delivers the form input data as textstrings. Then they can also
> be defaulted to textstrings, compared with other textstrings, and
> output to a new form provided that STDOUT is in :utf8 mode. This would
> not only be a reasonable behaviour but an extremely useful one.
> Therefore I find this the most plausible interpretation.
>
> f. Up to now, we have not considered uploading binary files. If behaviour
> (e) is intended, which we do not know due to lack of unambiguous
> documentation, then one could think of setting :utf8 mode also on
> STDIN. However, this would be a very bad idea, as then the entire input
> would have to be UTF-8 which is not the case for embedded binary files.
> An alternative implementation would be to first extract uploaded files
> from the input data, and then interpret the remainder as UTF-8 data
> (which can be guaranteed if properly specified in the accept-charset
> option of the <form> tag). Again: This would not only be a reasonable
> behaviour but an extremely useful one. Therefore I find this the most
> plausible interpretation.
> [...]
After reading your message and testing a little more, I'm a little more
confused than before ;-)
I upgraded to CGI.pm 3.48; I had no idea that I wasn't using the correct
version; CGI.pm 3.29 (Debian Lenny) doesn't complain if you give it an
unused/invalid option:
use CGI qw/-utf44/; # There is no utf44; there is no complaint either.
Strangely, after I upgraded to 3.48, your program worked perfectly. All
I need do is type in "München" for the location, and the file is
accepted and is not corrupted. From my point of view, CGI.pm 3.48 does
option "e" described above--possibly with some magic to exclude binary
files from the utf8 conversion.
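
To make the byte string versus text string distinction concrete, here is a small sketch using only the core Encode module. It does not use CGI.pm; $bytes is an assumed stand-in for what param() would return without -utf8:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Raw bytes as a form field might arrive: "München" encoded as UTF-8.
my $bytes = "M\xC3\xBCnchen";

# decode() turns the byte string into a text string of characters.
my $text = decode('UTF-8', $bytes);

# The umlaut is two bytes but one character.
printf "bytes: %d, characters: %d\n", length $bytes, length $text;

# A decoded text string compares equal to other text strings.
my $same = ($text eq "M\x{FC}nchen") ? 'yes' : 'no';
print "equal to the text string: $same\n";

# Text must be encoded again before it is written out as bytes.
my $round_trip = encode('UTF-8', $text);
```

Uploaded binary data would be left as bytes and never passed through decode().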
The changes I made to your program were modest (no real changes). I
placed a copy here:
http://home.earthlink.net/~mumia.w.18.spam/docs/try-binary1.txt
Perhaps it's a version/library problem; this is my environment:
O/S: Debian Lenny i386
CGI.pm: 3.48
FCGI.pm: 0.67
Apache2: 2.2.9
Firefox 3.5.6 (x86/Linux)
My environment is fully UTF-8: console, Xorg, everything I could set.
------------------------------
Date: Sun, 17 Jan 2010 21:50:59 +0000
From: George <me@me.com>
Subject: Subroutines and $_[0]
Message-Id: <hj00o3$uis$1@canard.ulcc.ac.uk>
Dear All,
I am parsing a web page with the LWP module and then doing some regular
expression matching to print out specific chunks of code. The way I
organised it is:
while (regular expression matches) {
        process_text($1);
}
So far, everything is fine - however I noticed the following with the
subroutine: if I write code as following using $_[0], then the next time
within the same subroutine that I use it to check against another
regular expression it is empty.
$_[0] =~ m{<span class="listing_default">(.*?)</span></a>\s*</div>}si;
  $title=$1;
$mtc1 = $_[0] =~ m{<div class="listing_results_rating">(.*?)</div>}si;
  if ($mtc1) {
        $rating=0;
        }
  else {
        $rating=$1;
        }
But if I use shift and save the value in a variable, it works fine, i.e.:
my $ocontent = shift;
$ocontent =~ m{<span class="listing_default">(.*?)</span></a>\s*</div>}si;
  $title=$1;
$mtc1 = $ocontent =~ m{<div class="listing_results_rating">(.*?)</div>}si;
  if ($mtc1) {
        $rating=0;
        }
  else {
        $rating=$1;
        }
If anyone could shed some light on why this is the case, I would be
grateful.
Regards,
George
------------------------------
Date: Sun, 17 Jan 2010 23:27:20 +0100
From: "Jochen Lehmeier" <OJZGSRPBZVCX@spammotel.com>
Subject: Re: Subroutines and $_[0]
Message-Id: <op.u6o7buwmmk9oye@frodo>
On Sun, 17 Jan 2010 22:50:59 +0100, George <me@me.com> wrote:
> So far, everything is fine - however I noticed the following with the
> subroutine: if I write code as following using $_[0],
> But if I use shift and save the value in a variable, it works fine
The @_ array in a sub is basically just an alias for the parameters you
called the sub with. If you modify it, you are actually modifying the
argument from the caller's point of view. If the argument is $1, then $_[0]
is an alias for $1, so a later successful match inside the sub changes what
$_[0] holds, and you get funny side effects, it looks like.
Look:
~> perl -e '$a="first"; fn($a); print $a,"\n"; sub fn { $_[0]="second" }'
second
~> perl -e '"a"=~m/(.*)/; print "before: $1\n"; fn($1); print "after: $1\n"; exit; sub fn { print "inside 1: @_\n"; "b" =~ m//; print "inside 2: @_\n" }'
before: a
inside 1: a
inside 2: b
after: b
What you usually do in a sub is to somehow extract the values from @_,
because usually you do not want to accidentally modify the arguments (and it
is easier to see what arguments you expect):
sub f
{
    my ($arg1,$arg2,$arg3)=@_;
    ...
}
or
sub f
{
    my $args=shift @_;
    ...
}
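
A sketch that reproduces the symptom from the original question and shows the copy-first fix. The sub names via_alias and via_copy and the sample markup are made up for illustration:

```perl
use strict;
use warnings;

# Works on $_[0] directly. Because the caller passed $1, $_[0] is an alias
# for $1, and the successful match below changes what it reads as.
sub via_alias {
    $_[0] =~ m{<b>(.*?)</b>};
    my $now = $_[0];        # read $_[0] again after the match
    return $now;
}

# Copies the argument before matching, so the text stays intact.
sub via_copy {
    my ($text) = @_;
    $text =~ m{<b>(.*?)</b>};
    return $text;
}

my $page = '<li><b>Title</b> <i>4 stars</i></li>';
my ($seen_alias, $seen_copy);
if ($page =~ m{<li>(.*?)</li>}) { $seen_alias = via_alias($1) }
if ($page =~ m{<li>(.*?)</li>}) { $seen_copy  = via_copy($1)  }

print "alias: $seen_alias\n";
print "copy:  $seen_copy\n";
```

With the alias, the sub is left holding only the first capture, so a second pattern has nothing useful to match against; with the copy, the full chunk is still there.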
------------------------------
Date: Sun, 17 Jan 2010 18:20:06 -0800 (PST)
From: "C.DeRykus" <derykus@gmail.com>
Subject: Re: Subroutines and $_[0]
Message-Id: <e6277564-352d-4c30-99bb-23be6764ec8d@21g2000yqj.googlegroups.com>
On Jan 17, 1:50 pm, George <m...@me.com> wrote:
> ...
> while (regular expression matches) {
>         process_text($1);
>
> }
>
> So far, everything is fine - however I noticed the following with the
> subroutine: if I write code as following using $_[0], then the next time
> within the same subroutine that I use it to check against another
> regular expression it is empty.
>
> $_[0] =~ m{<span class="listing_default">(.*?)</span></a>\s*</div>}si;
>   $title=$1;
It's good form to check that the match succeeds before
assigning to backreferences. At the very least, your
control logic becomes much easier to follow:
    if ( $_[0] =~ m{....} ) {
        $title = $1;
        ...
    }
Note: $title isn't defined if the match fails
> $mtc1 = $_[0] =~ m{<div class="listing_results_rating">(.*?)</div>}si;
The match operator =~ will bind more tightly than =
so that'll be parsed as:
    $mtc1 = ( $_[0] =~ m{...} );
That means $mtc1 will either be false (the empty string) if
the match fails or 1 if the match succeeds. Evidently you
know that but now the code becomes a bit tricky and isn't
as clear.
>   if ($mtc1) {
>         $rating=0;
>         }
>   else {
>         $rating=$1;
>         }
>
As noted, $mtc1 becomes a boolean test of the match
success as written so I suspect the above should be
flipped to read:
    if ( $mtc1 ) {              # succeeds
        $rating = $1;
    } else {
        $rating = 0;            # fails
}
But see how much clearer the following is and now you
don't need $mtc1 (unless $mtc1 is used later in your
code for other purposes):
    if ( $_[0] =~ m{...} ) {    # match succeeds
        $rating = $1;
        ...
    } else {                    # match fails
        $rating = 0;
        ...
    }
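
The same logic as a self-contained sketch. rating_of is a hypothetical helper, and the two assignments at the end only illustrate what a match returns in scalar context:

```perl
use strict;
use warnings;

# Hypothetical helper: pull the rating out of a listing, defaulting to 0.
sub rating_of {
    my ($html) = @_;
    if ($html =~ m{<div class="listing_results_rating">(.*?)</div>}si) {
        return $1;          # only trust $1 after a successful match
    }
    return 0;               # no rating block found
}

# In scalar context the match itself is just a true or false value.
my $ok  = ('abc' =~ /b/);   # true: 1
my $bad = ('abc' =~ /z/);   # false: the empty string, not undef
```

Calling rating_of($chunk) keeps the capture and its default in one place, so no separate flag variable is needed.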
> But if I use shift and save the value in a variable, it works fine, i.e.:
> $mtc1 = $ocontent =~ m{<div class="listing_results_rating">(.*?)</div>}si;
> if ($mtc1) {
> $rating=0;
> }
> else {
> $rating=$1;
> }
Did you get tangled in the web of your logic...?
--
Charles DeRykus
------------------------------
Date: Sun, 17 Jan 2010 18:30:46 -0800 (PST)
From: "C.DeRykus" <derykus@gmail.com>
Subject: Re: Subroutines and $_[0]
Message-Id: <9a3848e1-ea34-4e77-812a-8c8c7fe095e1@m25g2000yqc.googlegroups.com>
On Jan 17, 6:20 pm, "C.DeRykus" <dery...@gmail.com> wrote:
> ...
>
> > $_[0] =~ m{<span class="listing_default">(.*?)</span></a>\s*</div>}si;
> >   $title=$1;
>
> It's good form to check that the match succeeds before
> assigning to backreferences.
            ^^
            from
--
Charles DeRykus
------------------------------
Date: Mon, 18 Jan 2010 04:49:28 +0000
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Subroutines and $_[0]
Message-Id: <or5d27-ild.ln1@osiris.mauzo.dyndns.org>
Quoth "C.DeRykus" <derykus@gmail.com>:
> On Jan 17, 1:50 pm, George <m...@me.com> wrote:
>
> > while (regular expression matches) {
> > process_text($1);
> >
> > }
> >
> > So far, everything is fine - however I noticed the following with the
> > subroutine: if I write code as following using $_[0], then the next time
> > within the same subroutine that I use it to check against another
> > regular expression it is empty.
> >
> > $_[0] =~ m{<span class="listing_default">(.*?)</span></a>\s*</div>}si;
> > $title=$1;
>
> It's good form to check that the match succeeds before
> assigning to backreferences. At the very least, your
> control logic becomes much easier to follow:
>
> if ( $_[0] =~ m{....} ) {
> $title = $1;
> ...
> }
if (my ($title) = $_[0] =~ m{...}) {
...
}
is clearer.
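
A runnable sketch of that list-context form. The sample markup and the variable names $found and $missing are invented for the example:

```perl
use strict;
use warnings;

my $html = '<span class="listing_default">Blue Cafe</span>';

# In list context a successful match returns its captures, so $title is
# declared and filled in one step. A failed match returns an empty list,
# which makes the condition false.
my $found;
if (my ($title) = $html =~ m{<span class="listing_default">(.*?)</span>}si) {
    $found = $title;
}

# On failure the variable is simply left undefined.
my ($missing) = 'no markup here' =~ m{<span>(.*?)</span>};
```

Note that $title is only visible inside the if block, which keeps a stale value from leaking into later code.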
Ben
------------------------------
Date: Mon, 18 Jan 2010 00:31:10 -0500
From: "Uri Guttman" <uri@StemSystems.com>
Subject: Re: Subroutines and $_[0]
Message-Id: <87ocksou75.fsf@quad.sysarch.com>
>>>>> "BM" == Ben Morrow <ben@morrow.me.uk> writes:
BM> Quoth "C.DeRykus" <derykus@gmail.com>:
>> On Jan 17, 1:50 pm, George <m...@me.com> wrote:
>>
>> > while (regular expression matches) {
>> > process_text($1);
>> >
>> > }
>> >
>> > So far, everything is fine - however I noticed the following with the
>> > subroutine: if I write code as following using $_[0], then the next time
>> > within the same subroutine that I use it to check against another
>> > regular expression it is empty.
>> >
>> > $_[0] =~ m{<span class="listing_default">(.*?)</span></a>\s*</div>}si;
>> > $title=$1;
>>
>> It's good form to check that the match succeeds before
>> assigning to backreferences. At the very least, your
>> control logic becomes much easier to follow:
>>
>> if ( $_[0] =~ m{....} ) {
>> $title = $1;
>> ...
>> }
BM> if (my ($title) = $_[0] =~ m{...}) {
BM> ...
BM> }
BM> is clearer.
or even this so you can declare $title and make sure it is set to
something useful
my $title = ( $_[0] =~ m{...} ) ? $1 : '' ;
but i still can't see why he had to save the value in a lexical to make
it work. i think there is unpasted code that affects things.
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 2773
***************************************