[31468] in Perl-Users-Digest
Perl-Users Digest, Issue: 2720 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Dec 14 14:09:42 2009
Date: Mon, 14 Dec 2009 11:09:06 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Mon, 14 Dec 2009 Volume: 11 Number: 2720
Today's topics:
Re: Come on Perl, rescue me one more time! <jl_post@hotmail.com>
Re: Come on Perl, rescue me one more time! <derykus@gmail.com>
Re: Come on Perl, rescue me one more time! <rburbrid@cisco.com>
Re: How big do your programs get before you modularise <m@rtij.nl.invlalid>
Re: How big do your programs get before you modularise <OJZGSRPBZVCX@spammotel.com>
Re: Searching all instances of a pattern across multi-l sln@netherlands.com
Re: Simple loop error (Seymour J.)
Re: Trying to avoid passing params to subs through glob <justin.0912@purestblue.com>
Re: Trying to avoid passing params to subs through glob <justin.0912@purestblue.com>
Re: Trying to avoid passing params to subs through glob <justin.0912@purestblue.com>
Re: Trying to avoid passing params to subs through glob <rburbrid@cisco.com>
Re: Trying to avoid passing params to subs through glob <uri@StemSystems.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Sun, 13 Dec 2009 15:17:47 -0800 (PST)
From: "jl_post@hotmail.com" <jl_post@hotmail.com>
Subject: Re: Come on Perl, rescue me one more time!
Message-Id: <e066a4b2-1f2d-4ece-9f76-c465cb7e83d8@s31g2000yqs.googlegroups.com>
On Dec 12, 2:16 pm, laredotornado <laredotorn...@zipmail.com> wrote:
> I have a text file with the following content pattern ...
>
> (newline)
> line 1
> line 2
> line 3
> (newline)
> line 1
> line 2
> line 3
> line 4
> line 5
> (new line)
> line 1
> line 2
>
> What I would like to do is remove the first line immediately after any
> new line (carriage return) and keep all the other lines. How can I do
> this with perl?
Dear Dave,
I assume by "new line" you really mean a blank line. If I'm right,
then you can remove the line after every blank line with this simple
one-line Perl script:
perl -lpe "length or <>" input.txt > output.txt
(Warning: This is untested.)
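Spelled out as a script, the logic of that one-liner looks roughly like the sketch below. The helper name drop_after_blank and the sample data are made up for illustration; like the one-liner, it keeps the blank line itself and discards whatever line follows it.

```perl
use strict;
use warnings;

# Return the given lines with the line after each blank line removed.
# drop_after_blank is a hypothetical helper name, not from the thread.
sub drop_after_blank {
    my @lines = @_;
    my @kept;
    my $skip_next = 0;
    for my $line (@lines) {
        if ($skip_next) {          # previous line was blank: discard this one
            $skip_next = 0;
            next;
        }
        push @kept, $line;
        $skip_next = 1 if $line eq '';
    }
    return @kept;
}

# Sample data (already chomped), shaped like the file in the question.
my @sample = ('', 'line 1', 'line 2', '', 'line 1', 'line 2', 'line 3');
print "$_\n" for drop_after_blank(@sample);
```

The one-liner gets the same effect by reading one extra line with <> and throwing it away whenever the current line has zero length.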
I hope this helps, Dave.
-- Jean-Luc
------------------------------
Date: Mon, 14 Dec 2009 02:30:33 -0800 (PST)
From: "C.DeRykus" <derykus@gmail.com>
Subject: Re: Come on Perl, rescue me one more time!
Message-Id: <4bf621b9-9a98-4cf3-ab1a-76c5363edea4@u18g2000pro.googlegroups.com>
On Dec 12, 1:16 pm, laredotornado <laredotorn...@zipmail.com> wrote:
> ...
>
> What I would like to do is remove the first line immediately after any
> new line (carriage return) and keep all the other lines. How can I do
> this with perl?
>
perl -pe '<> if /^$/' input
--
Charles DeRykus
------------------------------
Date: Mon, 14 Dec 2009 09:51:06 -0500
From: Sir Robert Burbridge <rburbrid@cisco.com>
Subject: Re: Come on Perl, rescue me one more time!
Message-Id: <1260802226.347493@sj-nntpcache-3.cisco.com>
On 12/12/2009 04:16 PM, laredotornado wrote:
> Hi,
>
> I'm using Perl 5.8.8 on Mac 10.5.6. I have a text file with the
> following content pattern ...
>
> (newline)
> line 1
> line 2
> line 3
> (newline)
> line 1
> line 2
> line 3
> line 4
> line 5
> (new line)
> line 1
> line 2
>
> What I would like to do is remove the first line immediately after any
> new line (carriage return) and keep all the other lines. How can I do
> this with perl?
>
> Thanks, - Dave
Not a perl solution, but in a text editor do this search and replace:
vim filename.txt
:%s/\n\+/\r/g
Then save.
------------------------------
Date: Mon, 14 Dec 2009 18:31:51 +0100
From: Martijn Lievaart <m@rtij.nl.invlalid>
Subject: Re: How big do your programs get before you modularise most of it?
Message-Id: <7d9iv6-5l8.ln1@news.rtij.nl>
On Sun, 13 Dec 2009 22:56:54 +0000, Justin C wrote:
> very interesting, it's certainly made me think differently about what
> I'm working on at the moment, and that I really should have created a
> function (or three) for it - I can see how that would have helped, but I
> think I've gone too far with this to go back... but for next time.
Refactoring your code if you get it wrong is a basic part of writing
code. Don't put it off, just do it. Most of the time it actually pays back
before you are finished with the first release; if not, it definitely
pays off when writing release two!
(I'm currently on a v2 where I wish I had taken my own advice...)
M4
------------------------------
Date: Mon, 14 Dec 2009 18:39:38 +0100
From: "Jochen Lehmeier" <OJZGSRPBZVCX@spammotel.com>
Subject: Re: How big do your programs get before you modularise most of it?
Message-Id: <op.u4xvccbxmk9oye@frodo>
On Mon, 14 Dec 2009 18:31:51 +0100, Martijn Lievaart <m@rtij.nl.invlalid>
wrote:
> Refactoring your code if you get it wrong is a basic part of writing
> code. Don't put it off, just do it.
... while running your unit tests frequently.
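For instance, a tiny test file using the core Test::More module can be rerun after every refactoring step. This is only a sketch: the function total() is a hypothetical stand-in for whatever code is being restructured.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical function under refactoring; it sums its arguments.
sub total {
    my $sum = 0;
    $sum += $_ for @_;
    return $sum;
}

# Rerun these after each change; a failure shows the refactoring broke something.
is(total(1, 2, 3), 6, 'sums a list');
is(total(),        0, 'empty list sums to zero');
```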
------------------------------
Date: Sun, 13 Dec 2009 15:43:13 -0800
From: sln@netherlands.com
Subject: Re: Searching all instances of a pattern across multi-lines
Message-Id: <55sai59gvj4000ud1a5ih2l2li8eloj16i@4ax.com>
On Sun, 13 Dec 2009 13:22:15 -0800 (PST), laredotornado <laredotornado@zipmail.com> wrote:
>Hi,
>
>I'm using Perl 5.8.8 on Mac 10.5.6. I found this script online for
>matching a pattern across multiple lines. The problem is, it only
>prints out one instance of the expression, and I would like it to
>print out all instances. What can I change so that it will print out
>all instances?
>
>
>#!/usr/bin/perl
>use strict;
>use warnings;
>
>open(my $file, "<", "myfile.txt")
> or die "Can't open file: $!";
>my $text = do { local $/; <$file> };
>
>if ($text =~ /(<\s*script[^<]*>.*?<\/script>)/gs) {
> print $1;
>}
>
>
>
>Thanks, - Dave
'while()' should work as others have said.
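A minimal sketch of that change, keeping the original poster's regex as-is: in scalar context a /g match resumes where the previous one ended, so the loop visits every match. The sub name find_script_blocks and the sample text are assumptions for illustration.

```perl
use strict;
use warnings;

# Collect every <script>...</script> block in $text.
# find_script_blocks is a hypothetical name, not from the original script.
sub find_script_blocks {
    my ($text) = @_;
    my @found;
    # Each pass of the loop continues from the end of the previous match.
    while ($text =~ /(<\s*script[^<]*>.*?<\/script>)/gs) {
        push @found, $1;
    }
    return @found;
}

my $sample = "<script>a();</script>\n<p>text</p>\n<script type=\"x\">\nb();\n</script>\n";
print "$_\n---\n" for find_script_blocks($sample);
```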
The above regex should take into account these forms:
<tag>
<tag/>
<tag attr> content </tag>
<tag attr/>
Try this. It takes into account all the above forms
plus handles attributes fairly well, without the need for
[^<]*, even where the actual character '<' exists in the value
part. Handling attrib/vals correctly and taking account of all
valid forms is important; it all goes toward partitioning the
data.
Also, this is a complex parse. It includes multiple atomic
markup units, which is debatably <tag> style and content.
Content being the current state that is not markup.
Ideally, the unit is parsed to find the start element 'script',
recording is turned on, then off at the end element 'script'.
As it is now, the regex you are using won't correctly parse the
$text string below.
Good luck!
-sln
------------
use strict;
use warnings;
my $text = <<HTML;
<script />
<notme>
<script attr = "asdf" attr = 'wafsd'/>
<script a = "asdf" b= 'wafsd'>
use strict;
use warnings;
print "hello world, I'm a <tag>\\n";
</script>
<script>
// comment me out c++ style
/* now c style
*/
</script>
HTML
my $name = 'script';
my $rx = qr /
(
< $name (?: \s+ (?: ".*?" | '.*?' | [^>]*? )+ )? \s* \/ >
|
< $name (?: \s+ (?: ".*?" | '.*?' | [^>]*? )+ )* \s* > .*? <\/$name\s*>
)
/xs;
while ( $text =~ /$rx/g) {
print '-'x20,"\n",$1,"\n";
}
__END__
Output:
--------------------
<script />
--------------------
<script attr = "asdf" attr = 'wafsd'/>
--------------------
<script a = "asdf" b= 'wafsd'>
use strict;
use warnings;
print "hello world, I'm a <tag>\n";
</script>
--------------------
<script>
// comment me out c++ style
/* now c style
*/
</script>
------------------------------
Date: Sun, 13 Dec 2009 13:06:54 -0500
From: Shmuel (Seymour J.) Metz <spamtrap@library.lspace.org.invalid>
Subject: Re: Simple loop error
Message-Id: <4b252d3e$3$fuzhry+tra$mr2ice@news.patriot.net>
In <hg2pnp$4gf$1@reader1.panix.com>, on 12/13/2009
at 01:20 PM, bks@panix.com (Bradley K. Sherman) said:
>Yes, he did,
No; "<=" is not an equality operator.
>See original message.
Oddly enough, you quoted the relevant text from the original message,
which clearly reads "<=".
--
Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>
Unsolicited bulk E-mail subject to legal action. I reserve the
right to publicly post or ridicule any abusive E-mail. Reply to
domain Patriot dot net user shmuel+news to contact me. Do not
reply to spamtrap@library.lspace.org
------------------------------
Date: Sun, 13 Dec 2009 23:23:02 +0000
From: Justin C <justin.0912@purestblue.com>
Subject: Re: Trying to avoid passing params to subs through globals
Message-Id: <mj9gv6-ta1.ln1@purestblue.com>
In article <868wda63lk.fsf@blue.stonehenge.com>, Randal L. Schwartz wrote:
>>>>>> "Justin" == Justin C <justin.0911@purestblue.com> writes:
>
>Justin> Further to me recent problem of unblessed references, I'm re-writing my
>Justin> code, putting more of it into subroutines so that the main thread is
>Justin> more obvious. I'm finding that where I was using global variables there
>Justin> appear to be a lot of variables that need passing to some sub-routines.
>
> Sounds like related state correlated with related behavior.
>
> That's called an "object".
>
> Might wanna look into that. :)
>
> If you need, I can recommend a good book (or two :).
I bought it already, stop bugging me! ... Though I'm finding it harder
work than The Llama - but that may be because I don't have as much time
to sink into it as I did The Llama.
I think I'm finding the concepts more difficult to grasp, the exercises
are easy enough. It seems I've been "doing" some of the stuff for a
while (following examples in modules), but not understanding how it's
been working, or what's going on. The book[1] is making me think about
what it is that I've been doing - and how I can use those techniques in
my own code. Interesting stuff. I hope that my Christmas break will
allow more time for study.
> print "Just another Perl hacker,"; # the original
Don't they all say that? ;-) One day I want to submit a useful module to
CPAN, and on that day I'll treat myself to a JAPH T shirt from Think
Geek in celebration.
Justin.
1. For those who don't know what I'm talking about, it's The Alpaca
<URL:http://oreilly.com/catalog/9780596102067/>
--
Justin C, by the sea.
------------------------------
Date: Sun, 13 Dec 2009 23:43:58 +0000
From: Justin C <justin.0912@purestblue.com>
Subject: Re: Trying to avoid passing params to subs through globals
Message-Id: <uqagv6-me1.ln1@purestblue.com>
In article <slrnhi22gu.ci2.tadmc@tadbox.sbcglobal.net>, Tad McClellan wrote:
> Justin C <justin.0911@purestblue.com> wrote:
>
>> I'm thinking of creating a few hashes of references to these variables,
>> and then passing the whole hash (or a reference to it) to the
>> subroutine.
>>
>> my %excel = (
>> "worksheet" => \$ws,
>> "workbook" => \$wb,
>> "format_1" => \$format_1,
>> "format_2" => \$format_2,
>> );
>
>
> You have 5 variables to keep track of there.
>
> Consider using a hash _instead_ of the individual variables:
>
> my %excel = (
> ws => function_that_returns_worksheet_object(),
> wb => function_that_returns_workbook_object(),
> ...
> );
This is *very* interesting. Very, very interesting. Concrete examples
using code that I'm using make *so* much difference to my understanding.
> Then you would have only one variable (the hash) to keep track of.
>
>> The other thing is, if $ws, for example, was returned as a reference (I
>> think I really need to see where I should return references, and where I
>> should return the actual thing...
>
>
> If it is "small" return the actual thing, returning a reference instead
> might be a good idea for performance reasons (rather than maintainability
> reasons) if the thing is "large".
>
>
>> if it's an object return the thing, if
>> it's scalar, array or hash return a ref?
>
>
> Scalar references are not needed very often. At this point you
> should try pretending that they do not even exist.
>
> Note that you cannot return an array or a hash "thing", you can only
> return a list or a scalar.
I *know* this yet having it actually stated (several times) helps to
hammer it home. If I pass a hash to a sub (function, must get used to
terminology - it's called a function because it performs 'some
function') the function receives a list in @_, if I want to use it as a
hash within the function I have to get it back into a hash - that's why
I should pass-by-reference, that way it's still a hash. (I know you
don't need telling the above, it just reinforces it in my mind).
Passing a function only what it needs, rather than %excel I suppose I
should use:
populate_worksheet($excel{worksheet}, $data{$current_line});
(I'm thinking aloud, please excuse me). Just pass the "bits" that I'm
going to use.
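To make the difference concrete, here is a small sketch of both calling styles. The sub names are hypothetical and the string values are placeholders standing in for the real worksheet and workbook objects.

```perl
use strict;
use warnings;

# Passed as a hash: it is flattened into a key/value list in @_.
sub count_keys_flat {
    my %copy = @_;                  # rebuild a (copied) hash from the list
    return scalar keys %copy;
}

# Passed by reference: one scalar in @_, still pointing at the caller's hash.
sub add_key_by_ref {
    my ($excel) = @_;
    $excel->{format_1} = 'bold';    # changes the caller's hash
    return scalar keys %$excel;
}

# Placeholder strings stand in for real objects here.
my %excel = (worksheet => 'ws-object', workbook => 'wb-object');
print count_keys_flat(%excel), "\n";
print add_key_by_ref(\%excel), "\n";
print exists $excel{format_1} ? "caller sees format_1\n" : "unchanged\n";
```

The flattened call works on a copy, so changes made inside the function are lost; the by-reference call works on the caller's own hash.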
I can see a complete re-write of the current project coming up. I have
to say, Perl is getting interesting again. It wasn't getting boring, but
I'm getting that "I want to know more" bug again.
Justin.
--
Justin C, by the sea.
------------------------------
Date: Mon, 14 Dec 2009 00:03:49 +0000
From: Justin C <justin.0912@purestblue.com>
Subject: Re: Trying to avoid passing params to subs through globals
Message-Id: <50cgv6-ig1.ln1@purestblue.com>
In article <vh88v6-nb42.ln1@osiris.mauzo.dyndns.org>, Ben Morrow wrote:
>
> Quoth Justin C <justin.0911@purestblue.com>:
[snip]
>
> One way to enforce discipline (until you get into the habit of doing
> things right) is to put the 'main body' of your script into a 'sub
> main', and then have a call to 'main()' as the only statement outside a
> subroutine. That way you won't find yourself declaring variables as
> globals when they could actually just be locals in 'main'.
That's it, make my Perl look like C.
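As a rough sketch of that layout (the sub sum_list and the data are invented for illustration), the script below has exactly one statement outside a subroutine, and everything else is a lexical inside main:

```perl
use strict;
use warnings;

# The only statement outside a subroutine.
main();

sub main {
    my @numbers = (1, 2, 3);           # lexical to main, not a file-scoped global
    my $total   = sum_list(@numbers);  # data goes in as parameters...
    print "total: $total\n";           # ...and comes back as a return value
    return;
}

# Hypothetical helper: it sees only what it is passed.
sub sum_list {
    my $sum = 0;
    $sum += $_ for @_;
    return $sum;
}
```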
>> To avoid a whole bunch of vars being global I can see that I'm going to
>> have to nest a lot of sub-routines so that I can keep scope minimal.
>
> I'm not sure what you mean by 'nest subroutines'. If you mean putting
> one named sub inside another named sub, like this
>
> sub foo {
> sub bar { ... }
> }
No, I mean (using your suggestion):
main();
sub main {
eeni();
}
sub eeni {
meeni();
}
sub meeni {
myni();
}
sub myni {
mo();
}
sub mo {
...;
}
> As a general rule, you should be
> passing subs the data they need to work on as parameters, and passing
> the results back as the return value.
I'm starting to get the idea. It's gonna make a helluva mess of what I'm
working on when I get to work tomorrow!
>> The other thing is, if $ws, for example, was returned as a reference (I
>> think I really need to see where I should return references, and where I
>> should return the actual thing... if it's an object return the thing, if
>> it's scalar, array or hash return a ref? maybe that's too crude)
>
> Objects are refs already, remember, so returning an object *is*
> returning a ref. (Strictly speaking, what you get back from a
> constructor like ->new isn't the object itself, but a reference to the
> object.)
Objects are refs, don't supply a reference to a reference to function
(doh!).
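A minimal illustration, with a made-up Worksheet class standing in for the real spreadsheet module: the constructor returns a blessed reference, so it can go straight into the hash without a backslash.

```perl
use strict;
use warnings;

package Worksheet;    # hypothetical class, for illustration only

sub new {
    my ($class, %args) = @_;
    my $self = { name => $args{name} };   # a plain hash reference
    return bless $self, $class;           # now an object, and still a reference
}

sub name {
    my ($self) = @_;
    return $self->{name};
}

package main;

my $ws    = Worksheet->new(name => 'Sheet1');
my %excel = (worksheet => $ws);           # store the object itself, not \$ws
print ref($ws), "\n";                     # the class the reference is blessed into
print $excel{worksheet}->name(), "\n";
```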
I bet that in about two weeks I'm going to look back at threads like
this and think that I've been a complete idiot.
> Returning an explicit ref to a local variable like $ws is
> rather rare. About the only usual case is when you've built a complex
> data structure and you return a ref to avoid needing to take it apart
> into a list and put it back together again in the caller.
Nope, you've lost me - please don't explain now, I'm sure it won't help.
Baby steps only for the hard of understanding.
>> I
>> surely shouldn't be creating a reference to it (in the hash
>> assignment), so it should be "worksheet" => $ws instead of "\$ws",
>> shouldn't it?
>
> Err... it's not clear why you think this might be necessary, but
> populating a hash with refs-to-refs-to-objects would be a little
> peculiar.
Putting it like that did make me laugh out loud. I think that it must be
time to re-read perldoc perlref.
>> Sorry if the above sounds a bit random, it's hitting the keyboard as
>> it's coming to mind, and I shouldn't really be doing this now, I'm
>> supposed to be working... though the work I'm supposed to be doing is
>> getting this program to do what we want - it's a little chicken and egg
>> here at the moment.
>
> The compromise between 'learning to program better, which will save time
> in the future' and 'solving the immediate problem in front of me right
> now' is not always easy to find. In your case I would say that working on
> the fundamentals is worth it, at this point.
Definitely. I feel close to bridging one or two major gaps in my
understanding. Thank you for the reply.
Justin.
--
Justin C, by the sea.
------------------------------
Date: Mon, 14 Dec 2009 09:45:00 -0500
From: Sir Robert Burbridge <rburbrid@cisco.com>
Subject: Re: Trying to avoid passing params to subs through globals
Message-Id: <1260801860.950006@sj-nntpcache-3.cisco.com>
On 12/13/2009 07:03 PM, Justin C wrote:
> In article<vh88v6-nb42.ln1@osiris.mauzo.dyndns.org>, Ben Morrow wrote:
>> Quoth Justin C<justin.0911@purestblue.com>:
>>> To avoid a whole bunch of vars being global I can see that I'm going to
>>> have to nest a lot of sub-routines so that I can keep scope minimal.
>> I'm not sure what you mean by 'nest subroutines'. If you mean putting
>> one named sub inside another named sub, like this
>>
>> sub foo {
>> sub bar { ... }
>> }
>
> No, I mean (using your suggestion):
>
> main();
>
> sub main {
> eeni();
> }
> sub eeni {
> meeni();
> }
Although, I've actually found this a very helpful paradigm:
sub do_something {
...
my $handler = sub {
...
};
if ($cond_a) {
return $handler->(do_a());
} elsif ($cond_b) {
return $handler->(do_b());
} else {
return $handler->(do_c());
}
}
-Sir
------------------------------
Date: Mon, 14 Dec 2009 13:46:14 -0500
From: "Uri Guttman" <uri@StemSystems.com>
Subject: Re: Trying to avoid passing params to subs through globals
Message-Id: <874ont2y2x.fsf@quad.sysarch.com>
>>>>> "SRB" == Sir Robert Burbridge <rburbrid@cisco.com> writes:
SRB> Although, I've actually found this a very helpful paradigm:
actually that is a poor paradigm.
SRB> sub do_something {
SRB> ...
SRB> my $handler = sub {
SRB> ...
SRB> };
SRB> if ($cond_a) {
SRB> return $handler->(do_a());
SRB> } elsif ($cond_b) {
why the else (of the if) when you just returned? make it cleaner by just
falling through to a plain if.
SRB> return $handler->(do_b());
SRB> } else {
SRB> return $handler->(do_c());
SRB> }
SRB> }
better yet, use a dispatch table. i hate elsif's and try to never use
them. they are a marker for poor logic. hell, i eschew else as well. in
over 10k lines in one project i had about 3 elsif's and 11 else's. and
the code is very readable. it is just done with very clean logic and
code flow.
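a rough sketch of what a dispatch table could look like for the example above. the keys, the handler bodies, and the default are invented for illustration; the quoted code only shows placeholders.

```perl
use strict;
use warnings;

# Map each condition to the code that handles it (hypothetical handlers).
my %dispatch = (
    a => sub { return "did a: @_" },
    b => sub { return "did b: @_" },
);

# Used when no key matches, replacing the final else branch.
my $default = sub { return "did c: @_" };

sub do_something {
    my ($kind, @args) = @_;
    my $action = $dispatch{$kind} || $default;   # look up instead of if/elsif
    return $action->(@args);
}

print do_something('a', 1, 2), "\n";
print do_something('zzz'), "\n";
```

adding a new case is then one more entry in the hash rather than another branch.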
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 2720
***************************************