[32200] in Perl-Users-Digest
Perl-Users Digest, Issue: 3465 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Aug 5 09:09:49 2011
Date: Fri, 5 Aug 2011 06:09:11 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Fri, 5 Aug 2011 Volume: 11 Number: 3465
Today's topics:
Choice of data structure <ela@yantai.org>
Re: Delaying interpolation in a qr <bernie@fantasyfarm.com>
Re: Delaying interpolation in a qr <uri@StemSystems.com>
Posting Guidelines for comp.lang.perl.misc ($Revision: tadmc@seesig.invalid
Re: seeking advice on problem difficulty <ben.usenet@bsb.me.uk>
Re: seeking advice on problem difficulty <ela@yantai.org>
Re: seeking advice on problem difficulty <jurgenex@hotmail.com>
Sorting hash %seen <ela@yantai.org>
Re: Sorting hash %seen <justin.1104@purestblue.com>
Re: Sorting hash %seen <tadmc@seesig.invalid>
Re: Sorting hash %seen <tadmc@seesig.invalid>
which is faster ? <nospam.gravitalsun@hotmail.com.nospam>
Re: which is faster ? <glex_no-spam@qwest-spam-no.invalid>
Re: which is faster ? <rvtol+usenet@xs4all.nl>
Re: which is faster ? <jurgenex@hotmail.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Fri, 5 Aug 2011 19:18:59 -0700
From: "ela" <ela@yantai.org>
Subject: Choice of data structure
Message-Id: <j1gfui$bv3$1@ijustice.itsc.cuhk.edu.hk>
"Rainer Weikusat" <rweikusat@mssgmbh.com> wrote in message
news:87ei11uzui.fsf@sapphire.mobileactivedefense.com...
> "ela" <ela@yantai.org> writes:
>> I've been working on this problem for 4 days and still cannot come out a
>> good solution and would appreciate if you could comment on the problem.
>>
>> Given a table containing cells delimited by tab like this
>
> [ please see original for the indeed gory details ]
>
> Provided I understood the problem correctly, a possible solution could
> look like this (this code has had very little testing): First, you
> define your groups by associating array references containing the group
> members with the 'group ID' with the help of a hash:
>
> $grp{1} = [1, 2];
>
> Then, you create a hash mapping the column name to the column value
> for each ID and put these hashes into an id hash associated with the
> ID:
>
> $id{1} = { F1 => 'SuperC1', F2 => 'C1', F3 => 'subC4' };
> $id{2} = { F1 => 'SuperC1', F2 => 'C1', F3 => 'subC3' };
While revising the code, I found that my over-reliance on hashes is what
complicates my problem. What made you decide to use an array for "group"
but a hash for "id"?
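A minimal sketch of how the two structures from the quoted suggestion might be used together, assuming the sample values shown above. The group is an ordered list of member IDs that only needs to be walked, so an array reference is enough, while each record is looked up by column name, which is what a hash is for. The loop and the choice of column F3 are illustrative assumptions, not Rainer's code.

```perl
use strict;
use warnings;

my (%grp, %id);

# Group 1 is just a list of member IDs: an array reference.
$grp{1} = [1, 2];

# Each ID maps column names to column values: a hash reference.
$id{1} = { F1 => 'SuperC1', F2 => 'C1', F3 => 'subC4' };
$id{2} = { F1 => 'SuperC1', F2 => 'C1', F3 => 'subC3' };

# Walk the group members in order and look up one column by name.
my @f3_values;
for my $member (@{ $grp{1} }) {
    push @f3_values, $id{$member}{F3};
}
print "@f3_values\n";
```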
------------------------------
Date: Thu, 04 Aug 2011 17:24:20 -0400
From: Bernie Cosell <bernie@fantasyfarm.com>
Subject: Re: Delaying interpolation in a qr
Message-Id: <th2m37dl9pjrenjlsa57r5c8h5njao6tuv@library.airnews.net>
"Uri Guttman" <uri@StemSystems.com> wrote:
} >>>>> "R" == Ruud <rvtol+usenet@xs4all.nl> writes:
}
} R> On 2011-08-04 20:30, Bernie Cosell wrote:
} >> I have a regular expression that includes a
} >> 'runtime' variable. [...] I need the interpolation of
} >> the variable to be deferred until the RE is *used*.
} but that still doesn't fix his request. i suspect an xy problem
} here. why does the interpolation have to happen so late?
It happens so late because I have pretty much what you suggest:
> .. why not build
} up a hash of qr's and select the needed one in the actual regex.
I actually have that..:o). Problem is that several of the qr's really need
to have a variable interpolated.
} R> But the example you gave, only needs eq, not any regular expression.
} R> See also perldoc -f index.
}
} agreed, but it may have been a too simplistic example.
I know: it was an example to show what I wanted from the RE (which is why I
emphasized which of the three 'warns' actually matched). The actual code
is similar in that I do a pattern match in a subroutine and what is
happening is that I'd like to late-bind the interpolated vbl because the
value I need for the RE is coming in as an arg to the subroutine.
As a simple [almost real] example from a different program, I have a fairly
complicated RE that parses a line from one of our system logfiles.
Depending on what I'm doing, I need that pattern to have an actual runtime
value put in. Very much simplified, I have:
my $pat = qr{(...)\ +(\d+)\ (\d\d:\d\d:\d\d).*rip=$targetip,} ;
to pick out the records from the logfile from a particular IP addr. I
don't actually know the IP addr I want until I've processed some other
logfiles, and so I have the RE set up with the "placeholder" variable. If
I bury the RE down into the subroutine, of course it all works just fine:
sub scan
{   my $targetip = $_[0] ;
    my $pat = qr[that stuff above] ;
    # and now I can do =~ /$pat/ and it works just fine
}
but I would *really* like to have the [several similar] REs together at the
head of the program where it is easy to find them and tweak them and keep
them in sync, etc.
/Bernie\
--
Bernie Cosell Fantasy Farm Fibers
bernie@fantasyfarm.com Pearisburg, VA
--> Too many people, too few sheep <--
------------------------------
Date: Thu, 04 Aug 2011 17:48:00 -0400
From: "Uri Guttman" <uri@StemSystems.com>
Subject: Re: Delaying interpolation in a qr
Message-Id: <87zkjozw4v.fsf@quad.sysarch.com>
>>>>> "BC" == Bernie Cosell <bernie@fantasyfarm.com> writes:
BC> "Uri Guttman" <uri@StemSystems.com> wrote:
BC> } >>>>> "R" == Ruud <rvtol+usenet@xs4all.nl> writes:
BC> }
BC> } R> On 2011-08-04 20:30, Bernie Cosell wrote:
BC> } >> I have a regular expression that includes a
BC> } >> 'runtime' variable. [...] I need the interpolation of
BC> } >> the variable to be deferred until the RE is *used*.
BC> } but that still doesn't fix his request. i suspect an xy problem
BC> } here. why does the interpolation have to happen so late?
BC> It happens so late because I have pretty much what you suggest:
>> .. why not build
BC> } up a hash of qr's and select the needed one in the actual regex.
BC> I actually have that..:o). Problem is that several of the qr's
BC> really need to have a variable interpolated.
but when? you haven't really given a time flow here and explained the
need for such a delay. i really do expect a simpler solution once i
understand the reasoning behind the delay.
BC> } R> But the example you gave, only needs eq, not any regular
BC> } R> expression. See also perldoc -f index.
BC> }
BC> } agreed, but it may have been a too simplistic example.
BC> I know: it was an example to show what I wanted from the RE (which
BC> is why I emphasized which of the three 'warns' actually matched).
BC> The actual code is similar in that I do a pattern match in a
BC> subroutine and what is happening is that I'd like to late-bind the
BC> interpolated vbl because the value I need for the RE is coming in
BC> as an arg to the subroutine.
so just interpolate that variable into the regex after you get it. no
need to have a qr for that. you do know you can mix qr's and vars into
one regex?
BC> As a simple [almost real] example from a different program, I have
BC> a fairly complicated RE that parses a line from one of our system
BC> logfiles. Depending on what I'm doing, I need that pattern to
BC> have an actual runtime value put in. Very much simplified, I
BC> have:
BC> my $pat = qr{(...)\ +(\d+)\ (\d\d:\d\d:\d\d).*rip=$targetip,} ;
BC> to pick out the records from the logfile from a particular IP
BC> addr. I don't actually know the IP addr I want until I've
BC> processed some other logfiles, and so I have the RE set up with
BC> the "placeholder" variable. If I bury the RE down into the
BC> subroutine, of course it all works just fine:
BC> sub scan
BC> { my $targetip = $_[0] ;
BC> my $pat = qr[that stuff above] ;
BC> # and now I can do =~ /$pat/ and it works just fine
BC> }
BC> but I would *really* like to have the [several similar] REs
BC> together at the head of the program where it is easy to find them
BC> and tweak them and keep them in sync, etc.
you can do that. just keep the dynamic late part out of those qr's. when
the actual value is passed to the sub, make a new regex with the proper
qr and the new value. it works fine and you get the late building of the
regex as you want. you can't delay the interpolation with just qr (other
than using string eval). another idea is to make a hash of anon subs
which return a qr built with one of the sub args. then your code will
get the desired anon sub, pass in the var and get back a qr you can
use. but again, this is overkill. if you know the value only just before
you need to use it in a regex, just interpolate it then. if it is inside
a larger regex, create a pre- and post qr for those parts. or do the sub
idea i mentioned. either solution is much better than trying to do the
late interpolation as it will never work (again, string eval excepted
but that is evil and never to be used unless a last resort).
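The "hash of anon subs" idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names %pattern_for, by_ip and scan, the sample log lines, and the use of \Q...\E to quote the address are not from the thread. The patterns stay together at the top of the program, and each qr is built only when the value is known.

```perl
use strict;
use warnings;

# All patterns live here, at the head of the program. Each entry is an
# anonymous sub that builds the qr once the runtime value is available.
# (%pattern_for and by_ip are hypothetical names.)
my %pattern_for = (
    by_ip => sub {
        my ($ip) = @_;
        # \Q...\E quotes the dots in the address so they match literally.
        return qr{(...)\ +(\d+)\ (\d\d:\d\d:\d\d).*rip=\Q$ip\E,};
    },
);

sub scan {
    my ($targetip, @lines) = @_;
    my $pat = $pattern_for{by_ip}->($targetip);   # late interpolation
    return grep { /$pat/ } @lines;
}

# Made-up sample log lines, for illustration only.
my @log = (
    'Aug  5 06:09:11 imapd: Login rip=10.0.0.1, lip=10.0.0.9',
    'Aug  5 06:09:12 imapd: Login rip=10.0.0.11, lip=10.0.0.9',
);
my @hits = scan('10.0.0.1', @log);
print scalar(@hits), " matching line(s)\n";
```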
uri
--
Uri Guttman -- uri AT perlhunter DOT com --- http://www.perlhunter.com --
------------ Perl Developer Recruiting and Placement Services -------------
----- Perl Code Review, Architecture, Development, Training, Support -------
------------------------------
Date: Fri, 05 Aug 2011 02:17:43 -0500
From: tadmc@seesig.invalid
Subject: Posting Guidelines for comp.lang.perl.misc ($Revision: 1.9 $)
Message-Id: <ib2dnXjFreKKBKbTnZ2dnUVZ5rCdnZ2d@giganews.com>
Outline
Before posting to comp.lang.perl.misc
Must
- Check the Perl Frequently Asked Questions (FAQ)
- Check the other standard Perl docs (*.pod)
Really Really Should
- Lurk for a while before posting
- Search a Usenet archive
If You Like
- Check Other Resources
Posting to comp.lang.perl.misc
Is there a better place to ask your question?
- Question should be about Perl, not about the application area
How to participate (post) in the clpmisc community
- Carefully choose the contents of your Subject header
- Use an effective followup style
- Speak Perl rather than English, when possible
- Ask perl to help you
- Do not re-type Perl code
- Provide enough information
- Do not provide too much information
- Do not post binaries, HTML, or MIME
Social faux pas to avoid
- Asking a Frequently Asked Question
- Asking a question easily answered by a cursory doc search
- Asking for emailed answers
- Beware of saying "doesn't work"
- Sending a "stealth" Cc copy
Be extra cautious when you get upset
- Count to ten before composing a followup when you are upset
- Count to ten after composing and before posting when you are upset
-----------------------------------------------------------------
Posting Guidelines for comp.lang.perl.misc ($Revision: 1.9 $)
This newsgroup, commonly called clpmisc, is a technical newsgroup
intended to be used for discussion of Perl related issues (except job
postings), whether it be comments or questions.
As you would expect, clpmisc discussions are usually very technical in
nature and there are conventions for conduct in technical newsgroups
going somewhat beyond those in non-technical newsgroups.
The article at:
http://www.catb.org/~esr/faqs/smart-questions.html
describes how to get answers from technical people in general.
This article describes things that you should, and should not, do to
increase your chances of getting an answer to your Perl question. It is
available in POD, HTML and plain text formats at:
http://www.rehabitation.com/clpmisc.shtml
For more information about netiquette in general, see the "Netiquette
Guidelines" at:
http://andrew2.andrew.cmu.edu/rfc/rfc1855.html
A note to newsgroup "regulars":
Do not use these guidelines as a "license to flame" or other
meanness. It is possible that a poster is unaware of things
discussed here. Give them the benefit of the doubt, and just
help them learn how to post, rather than assume that they do
know and are being the "bad kind" of Lazy.
A note about technical terms used here:
In this document, we use words like "must" and "should" as
they're used in technical conversation (such as you will
encounter in this newsgroup). When we say that you *must* do
something, we mean that if you don't do that something, then
it's unlikely that you will benefit much from this group.
We're not bossing you around; we're making the point without
lots of words.
Do *NOT* send email to the maintainer of these guidelines. It will be
discarded unread. The guidelines belong to the newsgroup so all
discussion should appear in the newsgroup. I am just the secretary that
writes down the consensus of the group.
Before posting to comp.lang.perl.misc
Must
This section describes things that you *must* do before posting to
clpmisc, in order to maximize your chances of getting meaningful replies
to your inquiry and to avoid getting flamed for being lazy and trying to
have others do your work.
The perl distribution includes documentation that is copied to your hard
drive when you install perl. Also installed is a program for looking
things up in that (and other) documentation named 'perldoc'.
You should either find out where the docs got installed on your system,
or use perldoc to find them for you. Type "perldoc perldoc" to learn how
to use perldoc itself. Type "perldoc perl" to start reading Perl's
standard documentation.
Check the Perl Frequently Asked Questions (FAQ)
Checking the FAQ before posting is required in Big 8 newsgroups in
general; there is nothing clpmisc-specific about this requirement.
You are expected to do this in nearly all newsgroups.
You can use the "-q" switch with perldoc to do a word search of the
questions in the Perl FAQs.
Check the other standard Perl docs (*.pod)
The perl distribution comes with much more documentation than is
available for most other newsgroups, so in clpmisc you should also
see if you can find an answer in the other (non-FAQ) standard docs
before posting.
It is *not* required, or even expected, that you actually *read* all of
Perl's standard docs, only that you spend a few minutes searching them
before posting.
Try doing a word-search in the standard docs for some words/phrases
taken from your problem statement or from your very carefully worded
"Subject:" header.
Really Really Should
This section describes things that you *really should* do before posting
to clpmisc.
Lurk for a while before posting
This is very important and expected in all newsgroups. Lurking means
to monitor a newsgroup for a period to become familiar with local
customs. Each newsgroup has specific customs and rituals. Knowing
these before you participate will help avoid embarrassing social
situations. Consider yourself to be a foreigner at first!
Search a Usenet archive
There are tens of thousands of Perl programmers. It is very likely
that your question has already been asked (and answered). See if you
can find where it has already been answered.
One such searchable archive is:
http://groups.google.com/advanced_search
If You Like
This section describes things that you *can* do before posting to
clpmisc.
Check Other Resources
You may want to check in books or on web sites to see if you can
find the answer to your question.
But you need to consider the source of such information: there are a
lot of very poor Perl books and web sites, and several good ones
too, of course.
Posting to comp.lang.perl.misc
There can be 200 messages in clpmisc in a single day. Nobody is going to
read every article. They must decide somehow which articles they are
going to read, and which they will skip.
Your post is in competition with 199 other posts. You need to "win"
before a person who can help you will even read your question.
These sections describe how you can help keep your article from being
one of the "skipped" ones.
Is there a better place to ask your question?
Question should be about Perl, not about the application area
It can be difficult to separate out where your problem really is,
but you should make a conscious effort to post to the most
applicable newsgroup. That is, after all, where you are the most
likely to find the people who know how to answer your question.
Being able to "partition" a problem is an essential skill for
effectively troubleshooting programming problems. If you don't get
that right, you end up looking for answers in the wrong places.
It should be understood that you may not know that the root of your
problem is not Perl-related (the two most frequent ones are CGI and
Operating System related), so off-topic postings will happen from
time to time. Be gracious when someone helps you find a better place
to ask your question by pointing you to a more applicable newsgroup.
How to participate (post) in the clpmisc community
Carefully choose the contents of your Subject header
You have 40 precious characters of Subject to win out and be one of
the posts that gets read. Don't waste them. Take care while
composing them, they are the key that opens the door to getting an
answer.
Spend them indicating what aspect of Perl others will find if they
should decide to read your article.
Do not spend them indicating "experience level" (guru, newbie...).
Do not spend them pleading (please read, urgent, help!...).
Do not spend them on non-Subjects (Perl question, one-word
Subject...)
For more information on choosing a Subject see "Choosing Good
Subject Lines":
http://www.cpan.org/authors/id/D/DM/DMR/subjects.post
Part of the beauty of newsgroup dynamics is that you can contribute
to the community with your very first post! If your choice of
Subject leads a fellow Perler to find the thread you are starting,
then even asking a question helps us all.
Use an effective followup style
When composing a followup, quote only enough text to establish the
context for the comments that you will add. Always indicate who
wrote the quoted material. Never quote an entire article. Never
quote a .signature (unless that is what you are commenting on).
Intersperse your comments *following* each section of quoted text to
which they relate. Unappreciated followup styles are referred to as
"top-posting", "Jeopardy" (because the answer comes before the
question), or "TOFU" (Text Over, Fullquote Under).
Reversing the chronology of the dialog makes it much harder to
understand (some folks won't even read it if written in that style).
For more information on quoting style, see:
http://web.presby.edu/~nnqadmin/nnq/nquote.html
Speak Perl rather than English, when possible
Perl is much more precise than natural language. Saying it in Perl
instead will avoid misunderstanding your question or problem.
Do not say: I have variable with "foo\tbar" in it.
Instead say: I have $var = "foo\tbar", or I have $var = 'foo\tbar',
or I have $var = <DATA> (and show the data line).
Ask perl to help you
You can ask perl itself to help you find common programming mistakes
by doing two things: enable warnings (perldoc warnings) and enable
"strict"ures (perldoc strict).
You should not bother the hundreds/thousands of readers of the
newsgroup without first seeing if a machine can help you find your
problem. It is demeaning to be asked to do the work of a machine. It
will annoy the readers of your article.
You can look up any of the messages that perl might issue to find
out what the message means and how to resolve the potential mistake
(perldoc perldiag). If you would like perl to look them up for you,
you can put "use diagnostics;" near the top of your program.
Do not re-type Perl code
Use copy/paste or your editor's "import" function rather than
attempting to type in your code. If you make a typo you will get
followups about your typos instead of about the question you are
trying to get answered.
Provide enough information
If you do the things in this item, you will have an Extremely Good
chance of getting people to try and help you with your problem!
These features are a really big bonus toward your question winning
out over all of the other posts that you are competing with.
First make a short (less than 20-30 lines) and *complete* program
that illustrates the problem you are having. People should be able
to run your program by copy/pasting the code from your article. (You
will find that doing this step very often reveals your problem
directly, leading to an answer much more quickly and reliably than
posting to Usenet.)
Describe *precisely* the input to your program. Also provide example
input data for your program. If you need to show file input, use the
__DATA__ token (perldata.pod) to provide the file contents inside of
your Perl program.
Show the output (including the verbatim text of any messages) of
your program.
Describe how you want the output to be different from what you are
getting.
If you have no idea at all of how to code up your situation, be sure
to at least describe the 2 things that you *do* know: input and
desired output.
Do not provide too much information
Do not just post your entire program for debugging. Most especially
do not post someone *else's* entire program.
Do not post binaries, HTML, or MIME
clpmisc is a text only newsgroup. If you have images or binaries
that explain your question, put them in a publicly accessible
place (like a Web server) and provide a pointer to that location. If
you include code, cut and paste it directly in the message body.
Don't attach anything to the message. Don't post vcards or HTML.
Many people (and even some Usenet servers) will automatically filter
out such messages. Many people will not be able to easily read your
post. Plain text is something everyone can read.
Social faux pas to avoid
The first two below are symptoms of lots of FAQ asking here in clpmisc.
It happens so often that folks will assume that it is happening yet
again. If you have looked but not found, or found but didn't understand
the docs, say so in your article.
Asking a Frequently Asked Question
It should be understood that you may have missed the applicable FAQ
when you checked, which is not a big deal. But if the Frequently
Asked Question is worded similar to your question, folks will assume
that you did not look at all. Don't become indignant at pointers to
the FAQ, particularly if it solves your problem.
Asking a question easily answered by a cursory doc search
If folks think you have not even tried the obvious step of reading
the docs applicable to your problem, they are likely to become
annoyed.
If you are flamed for not checking when you *did* check, then just
shrug it off (and take the answer that you got).
Asking for emailed answers
Emailed answers benefit one person. Posted answers benefit the
entire community. If folks can take the time to answer your
question, then you can take the time to go get the answer in the
same place where you asked the question.
It is OK to ask for a *copy* of the answer to be emailed, but many
will ignore such requests anyway. If you munge your address, you
should never expect (or ask) to get email in response to a Usenet
post.
Ask the question here, get the answer here (maybe).
Beware of saying "doesn't work"
This is a "red flag" phrase. If you find yourself writing that,
pause and see if you can't describe what is not working without
saying "doesn't work". That is, describe how it is not what you
want.
Sending a "stealth" Cc copy
A "stealth Cc" is when you both email and post a reply without
indicating *in the body* that you are doing so.
Be extra cautious when you get upset
Count to ten before composing a followup when you are upset
This is recommended in all Usenet newsgroups. Here in clpmisc, most
flaming sub-threads are not about any feature of Perl at all! They
are most often for what was seen as a breach of netiquette. If you
have lurked for a bit, then you will know what is expected and won't
make such posts in the first place.
But if you get upset, wait a while before writing your followup. I
recommend waiting at least 30 minutes.
Count to ten after composing and before posting when you are upset
After you have written your followup, wait *another* 30 minutes
before committing yourself by posting it. You cannot take it back
once it has been said.
AUTHOR
Tad McClellan and many others on the comp.lang.perl.misc newsgroup.
--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
The above message is a Usenet post.
I don't recall having given anyone permission to use it on a Web site.
------------------------------
Date: Thu, 04 Aug 2011 22:25:02 +0100
From: Ben Bacarisse <ben.usenet@bsb.me.uk>
Subject: Re: seeking advice on problem difficulty
Message-Id: <0.5fece175b59ace0c2e2b.20110804222502BST.87bow4uaxd.fsf@bsb.me.uk>
"ela" <ela@yantai.org> writes:
> "Ben Bacarisse" <ben.usenet@bsb.me.uk> wrote in message
> news:0.1974a0af27d9bc0648a0.20110804140923BST.87ty9xtjb0.fsf@bsb.me.uk...
>> my ($most_freq, %count) = ('', '' => 0);
> I guess the above line is doing some sort of initialization of zero's, but
> why is ", " used? Making both $most_freq and %count separated by comma to be
> zero?
It's equivalent to
my $most_freq = '';
my %count;
$count{$most_freq} = 0;
but that's not a very good choice. I should have done what one does
with functions like max and min and set $most_freq to the first data
item as Rainer suggested.
>> for my $item (@_) {
> How does this @_ correspond to @{$column[$col]}[@group] below?
That's how subroutines and argument passing work in Perl. I don't know
what else to say. It's too long since I read a Perl text so if I try to
explain it I'll probably use the wrong terms but let me try...
Subroutine arguments are evaluated in list context so any arrays you
write there get joined together (along with non-array arguments) into
one big argument array. Inside the subroutine, the name for this array
is @_. In this case there is only one argument: the list that results
from slicing the array @{$column[$col]} with the @group array.
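To make the flattening concrete, here is a small sketch. It is not the original most_frequent routine from earlier in the thread; the body, the sample @column data and the @group indices are assumptions made for illustration. It starts from the first data item, as suggested above, rather than from an empty string.

```perl
use strict;
use warnings;

# Sketch of a most_frequent-style routine (not the original code).
# Returns the most common value in its argument list and its count.
sub most_frequent {
    return unless @_;              # nothing to count
    my %count;
    my $most_freq = $_[0];         # start from the first data item
    for my $item (@_) {
        $count{$item}++;
        $most_freq = $item if $count{$item} > $count{$most_freq};
    }
    return ($most_freq, $count{$most_freq});
}

# Hypothetical data: one column holding five cells.
my @column = ( [qw(C1 C1 C2 C1 C2)] );
my @group  = (0, 1, 2);            # row indices belonging to one group
my $col    = 0;

# The slice yields the list ('C1', 'C1', 'C2'); inside the sub that
# list is what @_ contains.
my ($value, $freq) = most_frequent(@{ $column[$col] }[@group]);
print "$value appears $freq time(s)\n";
```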
Did you catch the bit where I said to ignore what I wrote if the
duplicate IDs were not a typo? I really meant it. I don't think
slicing a column array is the right way to go with the data you have.
You might still find most_frequent a useful function, but that's about
it.
<snip>
--
Ben.
------------------------------
Date: Fri, 5 Aug 2011 18:55:21 -0700
From: "ela" <ela@yantai.org>
Subject: Re: seeking advice on problem difficulty
Message-Id: <j1gei8$bft$1@ijustice.itsc.cuhk.edu.hk>
"Ben Bacarisse" <ben.usenet@bsb.me.uk> wrote in message
news:0.1974a0af27d9bc0648a0.20110804140923BST.87ty9xtjb0.fsf@bsb.me.uk...
> Functions are crucial to managing complexity. I'd want a function
> 'most_frequent' that can take an array of values and find the frequency
> of the most common value among them. It could return both that value
> and the frequency. Something like:
>
I'd appreciate it if I could learn more from you about the thinking behind
this. As I said previously, I only thought of a lot of "if"s and hashes and
was never able to use a function to wrap up some of the concepts. Would you
mind telling me which cues prompt you to think about using a function?
------------------------------
Date: Fri, 05 Aug 2011 04:42:49 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: seeking advice on problem difficulty
Message-Id: <g2jn37t7igeteletoka9jp9abjk93f7eh9@4ax.com>
"ela" <ela@yantai.org> wrote:
>"Ben Bacarisse" <ben.usenet@bsb.me.uk> wrote in message
>news:0.1974a0af27d9bc0648a0.20110804140923BST.87ty9xtjb0.fsf@bsb.me.uk...
>> Functions are crucial to managing complexity. I'd want a function
>> 'most_frequent' that can take an array of values and find the frequency
>> of the most common value among them. It could return both that value
>> and the frequency. Something like:
>
>I'd appreciate if I can learn more from you about the thinking philosophy.
>As said previously, I only thought of a lot of "if"'s and "hash"'s and never
>able to use function to wrap up some of the concepts. Would you mind telling
>me by which cues trigger you to think about using function?
That is actually a good question. There are many factors and I don't
think there are any hard rules, although learning how to split a
problem into smaller parts is the most crucial skill in software
engineering.
Some indicators for when to split code into smaller units and/or create
a function:
- abstract data types: when you design an abstract data type, there
are standard operations which you need with any data type, e.g.
creating, initializing, and deleting elements and modifying values. For
example, for any list of xyz you need to create an empty list, append to
a list, access an element of the list, very likely concatenate two lists,
apply a function to each element of the list, .... So just go ahead and
write a function for each; sooner or later you will need it.
- application domain: often the application domain already provides for
a set of functions, e.g. if I were to write a module for statistics
obviously I need functions for mean, average, standard deviation, maybe
min and max, and so forth.
- code reuse: if I find myself typing the same code or very similar code
multiple times, then check if it can be broken out into a function,
possibly with parameters to account for the minor differences between
almost the same code.
- code complexity: if there are more than 2 to a max of 3 levels of
indentation (nested loops and if's) then usually that is an indication
that part of the code should probably be split off into a function.
- code complexity: an odd rule of thumb which I nevertheless still find
a very useful guideline: a VT220 had 24 lines of text. If your function
grew longer than about 2/3 of the screen (i.e. more than ~16 lines) it
was time to consider breaking it up. And if it didn't fit on a single
screen any more, then it was definitely time to check why this function
was so long.
- algorithmic considerations: whenever the algorithm is asking for
recursion then obviously you should create a function.
- code complexity: whenever you are working with higher order functions
consider naming the arguments, i.e. create a named function for
wanted() in File::Find or the filter function in grep() and so on. Of
course for simple cases you can use anonymous functions, but whenever it
is a non-trivial function it helps to name it.
- algorithmic consideration: sometimes loops are better written as
recursions. Obviously in such cases you will need a function.
- algorithmic consideration: return values. Whenever I find myself
thinking 'and now all I need is to compute foobar from these 4 values'
then this is a strong indication that I should write a function foobar()
that takes 4 arguments and returns the desired value.
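As an illustration of the point about naming the filter function for grep(), here is a minimal sketch. The function is_ipv4 and the sample strings are made up for this example; the idea is only that a non-trivial test reads better, and can be reused and tested, once it has a name.

```perl
use strict;
use warnings;

# A named filter: true if the text looks like a dotted-quad IPv4 address.
# (is_ipv4 is a hypothetical helper, not from the thread.)
sub is_ipv4 {
    my ($text) = @_;
    return 0 unless $text =~ /\A(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\z/;
    for my $octet ($1, $2, $3, $4) {
        return 0 if $octet > 255;
    }
    return 1;
}

my @candidates = ('10.0.0.1', '300.1.1.1', 'not an address');

# The call site stays short because the logic lives in the named function.
my @addresses = grep { is_ipv4($_) } @candidates;
print "@addresses\n";
```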
I am sure there are many more indications. But I am also sure that part
of it is personal preference. And most of all it is a matter of
experience. After all, software engineering is still one of the major
fundamentals of computer science and functions are its smallest building
blocks.
jue
------------------------------
Date: Fri, 5 Aug 2011 20:02:53 -0700
From: "ela" <ela@yantai.org>
Subject: Sorting hash %seen
Message-Id: <j1gigr$cup$1@ijustice.itsc.cuhk.edu.hk>
"Rainer Weikusat" <rweikusat@mssgmbh.com> wrote in message
> Then, you'll need something similar to Ben's most_frequent routine
I managed to count the frequencies with:
$seen{$_}++ for (map { $id{$_}{$col} } @{$grp{$grpid}});
but when I try to sort them with:
my @sorted_keys = sort { $seen{$b} <=> $seen{$a}}
the error "Use of uninitialized value in hash element at test.pl" appears,
so how do I sort the values then?
------------------------------
Date: Fri, 5 Aug 2011 13:02:27 +0100
From: Justin C <justin.1104@purestblue.com>
Subject: Re: Sorting hash %seen
Message-Id: <jn2tg8-fi2.ln1@zem.masonsmusic.co.uk>
On 2011-08-06, ela <ela@yantai.org> wrote:
>
> "Rainer Weikusat" <rweikusat@mssgmbh.com> wrote in message
>
>> Then, you'll need something similar to Ben's most_frequent routine
>
> I managed to take frequency by:
>
> $seen{$_}++ for (map { $id{$_}{$col} } @{$grp{$grpid}});
I've not tried it (and, TBH, I have trouble getting my head round it,
map is still new to me) but that doesn't look right to me.
AIUI, the $_ in $seen{$_}++ takes whatever is in $_ before you do the
'map', the $_ in the map function is, I believe, local to the map
function and therefore not available outside the function.
Justin.
--
Justin C, by the sea.
------------------------------
Date: Fri, 05 Aug 2011 07:18:32 -0500
From: Tad McClellan <tadmc@seesig.invalid>
Subject: Re: Sorting hash %seen
Message-Id: <slrnj3nneb.c9o.tadmc@tadbox.sbcglobal.net>
ela <ela@yantai.org> wrote:
> my @sorted_keys = sort { $seen{$b} <=> $seen{$a}}
You have left off the list of things to be sorted.
Please post complete code.
> the error "Use of uninitialized value in hash element at test.pl" appears,
> so how to sort the values then?
The uninitialized value comes from the list of things to be sorted.
Since you have not shown where this list comes from, we cannot
help you...
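
For reference, a minimal sketch of what a complete sort statement might look like, assuming the intent was to sort the keys of %seen by descending count. The @values list is a hypothetical stand-in for whatever the original map returned. The undef in it mimics one plausible source of the warning: an undefined value used as a hash key.

```perl
use strict;
use warnings;

# Hypothetical values standing in for what the map would return;
# the undef mimics a record with no entry for the column.
my @values = ('x', 'y', 'x', undef, 'z', 'x', 'y');

my %seen;
$seen{$_}++ for grep { defined } @values;   # skip undef before using it as a key

# The list to sort goes after the block: here, the keys of %seen.
# Ties on the count are broken alphabetically.
my @sorted_keys = sort { $seen{$b} <=> $seen{$a} || $a cmp $b } keys %seen;

print "$_ => $seen{$_}\n" for @sorted_keys;
```

Whether filtering with grep is appropriate depends on the data; it may be better to find out why a value is undefined in the first place.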
--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
The above message is a Usenet post.
I don't recall having given anyone permission to use it on a Web site.
------------------------------
Date: Fri, 05 Aug 2011 07:22:08 -0500
From: Tad McClellan <tadmc@seesig.invalid>
Subject: Re: Sorting hash %seen
Message-Id: <slrnj3nnl4.c9o.tadmc@tadbox.sbcglobal.net>
Justin C <justin.1104@purestblue.com> wrote:
> On 2011-08-06, ela <ela@yantai.org> wrote:
>> "Rainer Weikusat" <rweikusat@mssgmbh.com> wrote in message
>>
>>> Then, you'll need something similar to Ben's most_frequent routine
>>
>> I managed to take frequency by:
>>
>> $seen{$_}++ for (map { $id{$_}{$col} } @{$grp{$grpid}});
>
> I've not tried it (and, TBH, I have trouble getting my head round it,
> map is still new to me) but that doesn't look right to me.
>
> AIUI, the $_ in $seen{$_}++ takes whatever is in $_ before you do the
> 'map',
No, it takes on the values returned from the map.
Rewrite it, and it may become more clear:
foreach (map { $id{$_}{$col} } @{$grp{$grpid}}) {
    $seen{$_}++
}
$_ in the 1st line is local to the map.
$_ in the 2nd line is local to the foreach.
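
A small self-contained sketch of the same point. The contents of %id and %grp are hypothetical, shaped only to resemble the structures in the thread, and the outer $_ is set just to show that neither the map nor the foreach uses or clobbers it.

```perl
use strict;
use warnings;

# Hypothetical data shaped like the structures in the thread.
my %id  = (a => { col => 'red' }, b => { col => 'blue' }, c => { col => 'red' });
my %grp = (g1 => [qw(a b c)]);
my ($col, $grpid) = ('col', 'g1');

my %seen;
$_ = 'outer';    # a value sitting in $_ before the loop

foreach (map { $id{$_}{$col} } @{ $grp{$grpid} }) {
    # Inside the map, $_ was 'a', 'b', 'c'.
    # Here, $_ is each value the map returned: 'red', 'blue', 'red'.
    $seen{$_}++;
}

# foreach restores the previous $_ once the loop is finished.
print "red=$seen{red} blue=$seen{blue} \$_=$_\n";
```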
--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
The above message is a Usenet post.
I don't recall having given anyone permission to use it on a Web site.
------------------------------
Date: Fri, 05 Aug 2011 00:30:12 +0300
From: George Mpouras <nospam.gravitalsun@hotmail.com.nospam>
Subject: which is faster ?
Message-Id: <j1f313$2iq4$1@news.ntua.gr>
$#array
scalar @array or
my $n = @array ?
------------------------------
Date: Thu, 04 Aug 2011 22:08:25 +0000
From: "J. Gleixner" <glex_no-spam@qwest-spam-no.invalid>
Subject: Re: which is faster ?
Message-Id: <4e3b1859$0$75663$815e3792@news.qwest.net>
On 08/04/11 21:30, George Mpouras wrote:
> $#array
> scalar @array or
> my $n = @array ?
perldoc Benchmark
Sounds like premature optimization, to me.
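
A minimal sketch of such a measurement with the core Benchmark module, assuming a hypothetical 1000-element array. The labels are arbitrary, and the numbers will vary by machine and perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data; the size is arbitrary.
my @array = (1 .. 1000);

# A negative count means "run each sub for at least that many
# CPU seconds", then print a comparison table.
cmpthese(-1, {
    last_index => sub { my $n = $#array },
    scalar     => sub { my $n = scalar @array },
    assign     => sub { my $n = @array },
});
```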
------------------------------
Date: Fri, 05 Aug 2011 00:09:13 +0200
From: "Dr.Ruud" <rvtol+usenet@xs4all.nl>
Subject: Re: which is faster ?
Message-Id: <4e3b1889$0$23934$e4fe514c@news2.news.xs4all.nl>
On 2011-08-04 23:30, George Mpouras wrote:
> which is faster ?
>
> $#array
> scalar @array or
> my $n = @array ?
If magic or tie or operator-overload is not involved, there will hardly
be any difference in speed. Just benchmark to find out.
$#array is the highest index of @array. The other two are about the
number of elements. $[ gives the minimum index which is normally 0.
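
To make the difference concrete, a short sketch with hypothetical arrays:

```perl
use strict;
use warnings;

my @array = qw(a b c d);
my @empty;

my $last_index = $#array;         # highest index
my $count      = scalar @array;   # number of elements
my $n          = @array;          # array in scalar context: number of elements
my $last_empty = $#empty;         # -1 for an empty array

print join(' ', $last_index, $count, $n, $last_empty), "\n";
```

With the default minimum index of 0, the highest index is always one less than the element count, which is why an empty array gives -1.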
--
Ruud
------------------------------
Date: Thu, 04 Aug 2011 15:58:14 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: which is faster ?
Message-Id: <5s8m37tl417verfmu5rv7rthsrmo50gj8n@4ax.com>
George Mpouras <nospam.gravitalsun@hotmail.com.nospam> wrote:
>$#array
>scalar @array or
>my $n = @array ?
IMNSHO you are asking the wrong question. Each of these three pieces of
code has different semantics, so the right question would be "Which
piece of code has the right semantics for my algorithm?"
jue
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 3465
***************************************