[31474] in Perl-Users-Digest

home help back first fref pref prev next nref lref last post

Perl-Users Digest, Issue: 2726 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Dec 16 21:34:56 2009

Date: Wed, 16 Dec 2009 18:34:49 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Wed, 16 Dec 2009     Volume: 11 Number: 2726

Today's topics:
    Re: Trying to avoid passing params to subs through glob <uri@StemSystems.com>
    Re: Trying to avoid passing params to subs through glob <jl_post@hotmail.com>
    Re: Trying to avoid passing params to subs through glob <source@netcom.com>
    Re: Trying to avoid passing params to subs through glob <jurgenex@hotmail.com>
    Re: Trying to avoid passing params to subs through glob <jurgenex@hotmail.com>
    Re: Trying to avoid passing params to subs through glob <uri@StemSystems.com>
    Re: Trying to avoid passing params to subs through glob <jl_post@hotmail.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Wed, 16 Dec 2009 16:35:47 -0500
From: "Uri Guttman" <uri@StemSystems.com>
Subject: Re: Trying to avoid passing params to subs through globals
Message-Id: <87iqc6d2ks.fsf@quad.sysarch.com>

>>>>> "DH" == David Harmon <source@netcom.com> writes:

  DH> On Tue, 15 Dec 2009 22:00:07 -0500 in comp.lang.perl.misc, "Uri Guttman"
  DH> <uri@StemSystems.com> wrote,
  >>>>>>> "BM" == Ben Morrow <ben@morrow.me.uk> writes:
  >> 
  BM> Quoth "Uri Guttman" <uri@StemSystems.com>:
  >> >> 
  >> >> i have seen too many subs which return undef on fail. in older perls
  >> >> that is problematic with ||.
  >> 
  BM> Can you explain how? AFAIK undef has always been false...
  >> 
  >> because you can't tell undef from '' or 0 or '0' with ||. 5.10 has //
  >> which solves that problem.

  DH> But that's not a problem in this thread, where undef is completely
  DH> acceptable as a false result from a sub indicating failure.  When you
  DH> wrote that undef was problematic with || I took that as meaning that it
  DH> didn't always work as documented in older perls.  After searching, I can
  DH> find no indication of that being true.  So it is not problematic; it
  DH> just doesn't always do everything you might imagine.

  DH> In another thread you suggest:
  DH>     my $month = $ARGV[0] || '' ;

you can't get undef from @ARGV.

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  --------  http://www.sysarch.com --
-----  Perl Code Review , Architecture, Development, Training, Support ------
---------  Gourmet Hot Cocoa Mix  ----  http://bestfriendscocoa.com ---------


------------------------------

Date: Wed, 16 Dec 2009 13:48:10 -0800 (PST)
From: "jl_post@hotmail.com" <jl_post@hotmail.com>
Subject: Re: Trying to avoid passing params to subs through globals
Message-Id: <49efdbef-b165-4eda-9c00-dd5d351486fb@a10g2000pre.googlegroups.com>

On Dec 14, 3:44 pm, "Jochen Lehmeier" <OJZGSRPBZ...@spammotel.com>
wrote to Uri:
> I am with you in that your second code is better than
> the first, you still have to convince me that your second
> code is "better" than (just to name an example)
>
>    my $result=-1;
>    if (valid())
>    {
>      lots of code
>    }
>    return $result;
>
> You seem to be very focused on "else". I still fail to
> see why. There are many ways to write code with or
> without "else", all with different lengths, your
> measurement of code simply seems to consist of the
> number of bytes and number of "else"s.


   What Uri is trying to avoid is not "else"s per se, but rather
unnecessary blocks to begin with.  Every time you employ a block (with
proper indentation, hopefully), you leave the maintainer necessarily
having to figure out how the logic differs between code that takes
that block and code that skips that block.

   In the example you gave as you wrote it, it's easy to see that if
the if() block is not taken, then the function is returned from
immediately.  But what if, in place of the "lots of code", there was
literally a hundred lines of code?  If I was looking at that if()
condition, I'd know that in some cases the logic enters that block,
but I wouldn't know that in others the logic is done with that
function altogether.

   In order to know that, in certain conditions, that we're done with
the function and we can return, I necessarily would have to scroll all
the way down to the end of the if() block, where I would notice that
the $result is returned untouched.  It's much easier for me ("me"
being the maintainer, not the writer of the code) to know that when I
see:

   return -1  if not valid();
   lots of code
   return result;

that non-valid inputs results in an immediate return.

   Of course, this is just a simple example.  In real life, I often
come across functions with the following skeleton:

sub f
{
   my (...) = @_;

   my $found = 0;
   my $result;

   if (...)
   {
      # do something quick here

      if (...)
      {
         # do something quick here

         if (...)
         {
            while (... && !$found)
            {
               if (...)
               {
                  # do something quick here

                  if (...)
                  {
                     # lots of code

                     if (...)
                     {
                        # We found what we're looking for!
                        $result = ...;
                        $found = 1;
                     }
                  }
               }
            }
         }
      }
   }

   return $result;
}

   All this function does is search an input set for something we
want, and when we find it, it gets returned.  However, when we know
the inputs aren't what we want at the beginning, we still in the
function until the very end.  And when we know the element we're at in
the while() loop isn't what we want, we're still in it for the whole
loop.

   And when we finally do get the answer we want, we have to employ
the use of a $found variable to jump out of the loop to get to the end
of the function, where only then we can return.

   And notice how many '}' we have at the end.  Which matches to
what?  Can you tell at a glance?  I wrote the code myself, and I can't
immediately tell.

   When maintaining, when I come across code like the above, I'll see
the first if() and wonder, "What happens to code that doesn't take
this route? <scroll, scroll, scroll>  Oh, nothing."  Then I'll come
across the second if(), scroll, and think the same thing.  Same for
the third if().

   Then I come across the first if() in the while() loop and wonder,
"What happens to code that doesn't take this route? <scroll, scroll>
Oh, this element has nothing more to do with this iteration of the
loop."  Same with the next if().

   And when I come across the final if() (the one that reaches the
what we're looking for), I think, how does setting the $found and
$result variables affect future behavior of this function?  <scroll>
Oh, nothing more of consequence happens except that $result gets
returned."

   Instead of using the code above as I wrote it, how about I re-write
it like this:

sub f
{
   my (...) = @_;

   return  if not ...;

   # do something quick here

   return  unless ...;

   # do something quick here

   return  if not ...;

   while (...)
   {
      next  if not ...;

      # do something quick here

      next  unless ...;

      # lots of code

      # Return the result if we found what we're looking for:
      return $result  if ...;
   }
}

   Is this new code any better?  I think so.  For one thing, it
doesn't depend on the $found and $result variables, as they aren't
needed here.

   Also, look at those early "return" statements.  They tell the
maintainer at-a-glance that the rest of the function is no longer
necessary if certain conditions are met.  (You might think that's true
of the previous code, but if you have to scroll down hundreds of lines
to be greeted with half a dozen (or more) of '}'s (sometimes code
between them), you might find yourself wishing that there were a lot
less blocks to navigate through.

   Note the "next" statements in the while() loop.  They let you know
right away when the element you're examining is one you're interested
in and want to keep around.  If you know you don't need it, you can
just invoke "next" and skip right to the next element.

   Because of these "next" and extra "return" statements, I no longer
find myself wondering what follows code that doesn't meet the if()
conditions.  I know right away what happens, without the need to jump
around to the end of the loop (all the while hoping that the '}' that
I think is the end of the loop really is the right one).

   Also note that, with the exception of the top-level block that
defines the f() function, there is only one block, and it's just for
the while() loop.  A maintainer can see that there is only one place
where the logic of the code diverges (excepting, of course, the early
exits and skips caused by the "return" and "next" statements).

   Have you ever read a recipe in a culinary cookbook?  If you have,
you might have noticed that most recipes consist of instructions that
are mostly nicely lined up with the page.  Occasionally an instruction
will include multiple "sub-instructions" that are bulleted and/or
indented.  However, excessive (and nested) bulleting and indeting is
frowned upon, since it tends to turn the recipe into a logic puzzle,
which is not the original intent of the recipe writer.

   Likewise, I freely use early "return"s and "next"s, because I don't
want my code to become a puzzle to the maintainers of my code.  I want
to make it clear when the rest of the function is no longer of any use
(by using a "return" statement).  The maintainer shouldn't have to
scroll down dozens of lines and match up the right brace just to
figure that out for him/herself.

   Similarly, If I'm looping through a list or data structure and
determine that the current element won't be needed, I'll just use a
"next" statemnet to skip it.  Otherwise, the maintainer will
necessarily have to scroll down to the bottom of the loop (and match
up the proper braces) just to see that nothing else happens.

   Nevertheless, some coders I've worked with still insist on using
the overly-indented style, adamantly avoiding the early "return"s and
"next"s.  When asked why, one coder stated it's because he was taught
in a college C course that "every function should have exactly one
entrance point and exactly one exit point."

   While that may be good avice for C (in C you often need to perform
"clean up" code to free up memory and close file handles), that advice
is not so good for languages like C++ and Perl, which encourage the
use of self-cleaning objects.  (You can read up on "RAII" if you want
to learn more about it for C++.)

   In properly written C++ and Perl, if you exit a block or a routine
early, the variables created in the scope(s) you just left simply
clean themselves up!  With this kind of convenience, there's no reason
to prohibit "next"s and multiple "return" statements.

   In truth, some coders will avoid "next"s and early "return"s simply
because they've never used them and are not used to them (and
sometimes they'll even use that argument to justify why nobody else
should use them, either!).  But if this were a valid argument, then
everybody would necessarily have to code to the lowest common
denominator of knowledge, which is just plain silly.  Imagine that,
whenever a new coder is hired, a company would have to examine the new
coder's knowledge to determine what keywords and coding styles should
never be used again.  (You probably see where this is going, and why
avoiding code styles just because they're unfamiliar is a terrible
idea.)

   The use of less blocks is also less error-prone.  I've seen code
where someone accidentally put a line of code inbetween the wrong pair
of '}' braces (among half a dozen braces).  When I was debugging the
code, I noticed that all but two of the '}' could have been eliminate
(by judicial use of "next" and "return").  If there had only been two
'}' to begin with, then the bug caused by the misplaced code probably
would have never existed, as the code probably wouldn't have been
misplaced in the first place.

   So I think that Uri's point (and mine, definitely) is to try toward
the minimal number of blocks, since minimizing the number of blocks
tends to reduce code errors and is easier to maintain.  And "next"s
and multiple "return"s may FEEL wrong to some coders (especially to
those who have been taught not to use them), but if used correctly
they are a real blessing to the maintainer.

   I hope this clarifies things, Jochen.

   -- Jean-Luc Romano


------------------------------

Date: Wed, 16 Dec 2009 14:22:13 -0800
From: David Harmon <source@netcom.com>
Subject: Re: Trying to avoid passing params to subs through globals
Message-Id: <f6ednR3COY9OwLTWnZ2dnUVZ_hadnZ2d@earthlink.com>

On Wed, 16 Dec 2009 16:35:47 -0500 in comp.lang.perl.misc, "Uri Guttman"
<uri@StemSystems.com> wrote,
>>>>>> "DH" == David Harmon <source@netcom.com> writes:
>  DH> In another thread you suggest:
>  DH>     my $month = $ARGV[0] || '' ;
>
>you can't get undef from @ARGV.

use Data::Dumper;
my $month = $ARGV[0];
print Dumper \$month;

yields:
$VAR1 = \undef;


------------------------------

Date: Wed, 16 Dec 2009 14:55:32 -0800
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: Trying to avoid passing params to subs through globals
Message-Id: <qvoii5lir82qcibqd9lt88mooh24dlnthf@4ax.com>

"jl_post@hotmail.com" <jl_post@hotmail.com> wrote:
>On Dec 14, 3:44 pm, "Jochen Lehmeier" <OJZGSRPBZ...@spammotel.com>
>wrote to Uri:
>> I am with you in that your second code is better than
>> the first, you still have to convince me that your second
>> code is "better" than (just to name an example)
>>
>>    my $result=-1;
>>    if (valid())
>>    {
>>      lots of code
>>    }
>>    return $result;
>>
>> You seem to be very focused on "else". I still fail to
>> see why. There are many ways to write code with or
>> without "else", all with different lengths, your
>> measurement of code simply seems to consist of the
>> number of bytes and number of "else"s.
>
>
>   What Uri is trying to avoid is not "else"s per se, but rather
>unnecessary blocks to begin with.  Every time you employ a block (with
>proper indentation, hopefully), you leave the maintainer necessarily
>having to figure out how the logic differs between code that takes
>that block and code that skips that block.
>
>   In the example you gave as you wrote it, it's easy to see that if
>the if() block is not taken, then the function is returned from
>immediately.  But what if, in place of the "lots of code", there was
>literally a hundred lines of code? 

While I agree with you in general this particular argument backfires.

If you have a block with hundred lines of code then it is overdue to be
split up into 6-10 subs. Yes, there are certainly examples where
exceedingly long blocks of code are appropriate, but in general the old
rule of thumb "No sub longer than a single [VT100] screen" still has its
merrits.

jue


------------------------------

Date: Wed, 16 Dec 2009 14:59:08 -0800
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: Trying to avoid passing params to subs through globals
Message-Id: <3epii553aaddt44nadnu2h1cltf5kjgbg1@4ax.com>

"jl_post@hotmail.com" <jl_post@hotmail.com> wrote:
>   Of course, this is just a simple example.  In real life, I often
>come across functions with the following skeleton:
>
>sub f
>{
>   my (...) = @_;
>
>   my $found = 0;
>   my $result;
>
>   if (...)
>   {
>      # do something quick here
>
>      if (...)
>      {
>         # do something quick here
>
>         if (...)
>         {
>            while (... && !$found)
>            {
>               if (...)
>               {
>                  # do something quick here
>
>                  if (...)
>                  {
>                     # lots of code
>
>                     if (...)
>                     {
>                        # We found what we're looking for!
>                        $result = ...;
>                        $found = 1;
>                     }
>                  }
>               }
>            }
>         }
>      }
>   }
>
>   return $result;
>}
>
[...]
>   When maintaining, when I come across code like the above, 

 ... then I wonder, if the original programmer has never heard of subs.

jue


------------------------------

Date: Wed, 16 Dec 2009 18:12:43 -0500
From: "Uri Guttman" <uri@StemSystems.com>
Subject: Re: Trying to avoid passing params to subs through globals
Message-Id: <87ljh2bjis.fsf@quad.sysarch.com>

>>>>> "DH" == David Harmon <source@netcom.com> writes:

  DH> On Wed, 16 Dec 2009 16:35:47 -0500 in comp.lang.perl.misc, "Uri Guttman"
  DH> <uri@StemSystems.com> wrote,
  >>>>>>> "DH" == David Harmon <source@netcom.com> writes:
  DH> In another thread you suggest:
  DH> my $month = $ARGV[0] || '' ;
  >> 
  >> you can't get undef from @ARGV.

  DH> use Data::Dumper;
  DH> my $month = $ARGV[0];
  DH> print Dumper \$month;

  DH> yields:
  DH> $VAR1 = \undef;

that is because you didn't test @ARGV for elements. please make
intelligent tests. if @ARGV has elements, those elements cannot be undef
because the shell can't pass undef via exec. checking $ARGV[0] directly
is bad coding but i didn't address that in this thread.

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  --------  http://www.sysarch.com --
-----  Perl Code Review , Architecture, Development, Training, Support ------
---------  Gourmet Hot Cocoa Mix  ----  http://bestfriendscocoa.com ---------


------------------------------

Date: Wed, 16 Dec 2009 15:35:32 -0800 (PST)
From: "jl_post@hotmail.com" <jl_post@hotmail.com>
Subject: Re: Trying to avoid passing params to subs through globals
Message-Id: <e0ed9fcd-c7a9-431e-b8b1-99536a378e1b@k32g2000prb.googlegroups.com>

On Dec 16, 3:55 pm, J=FCrgen Exner <jurge...@hotmail.com> wrote:
> If you have a block with hundred lines of code then it is
> overdue to be split up into 6-10 subs. Yes, there are certainly
> examples where exceedingly long blocks of code are appropriate,
> but in general the old rule of thumb "No sub longer than a
> single [VT100] screen" still has its merrits.


   (For the record, I was going to type "hundreds of lines of code,"
but then I thought better of it and decided to write "a hundred lines
of code" instead.)

   But my point is that I don't control how other code is written, and
if it I run across a function I didn't write that has literally has
hundreds of lines of code, then I'm stuck debugging it regardless.
(In other words, I can't refuse to debug it just because it doesn't
look well-written.)

   The reason I encourage coders to use prodigious use of "next" and
"return" statements in Perl is because they are almost always done
correctly when used.  In other words, their use almost always makes my
job as a maintainer easier.

   However, I tend not to encourage breaking up long code into smaller
functions quite as often, for three main reasons:

   1.  Sometimes the algorithm is a valid -- yet long --
       algorithm that works well as a single function and
       would compromise the flow of the main logic if
       broken up.

   2.  For some reason, breaking up functions causes
       many programmers to resort to using global
       variables (instead of correctly implementing
       function parameter passing) and this is a HUGE
       pet peeve of mine.  (I'd rather deal with
       overly-long functions than with unnecessary
       global variables.)

   3.  Even if the function should be broken up,
       a long function that doesn't rely on global
       variables is not that much harder to debug
       than one that is broken up into smaller
       functions (especially if the function is
       well written, with easily distinguishable
       sections of code).  In other words, a long
       function might be a coding sin, but there
       are plenty worse sins out there (in my opinion
       as a maintainer).

   I'm not saying that large functions shouldn't be broken up into
smaller ones; what I'm saying is that, if I were to write a
"Maintainer's Wish List", my wish to have coders use "next" and
"return" prodigiously (which cuts down on the over-use of bracketed
blocks) ranks much higher than my wish to break up large functions
into smaller ones (especially since many codes mistakenly think that
breaking up functions gives them permission to use global variables).

   -- Jean-Luc


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests. 

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 2726
***************************************


home help back first fref pref prev next nref lref last post