[30334] in Perl-Users-Digest
Perl-Users Digest, Issue: 1577 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon May 26 00:09:47 2008
Date: Sun, 25 May 2008 21:09:09 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Sun, 25 May 2008 Volume: 11 Number: 1577
Today's topics:
blogs, longbets.org, and education of sociology <xahlee@gmail.com>
Re: blogs, longbets.org, and education of sociology sln@netherlands.co
Re: blogs, longbets.org, and education of sociology <Pidgeot18@verizon.invalid>
Re: blogs, longbets.org, and education of sociology <conrad@lewscanon.com.invalid>
Re: HTML Parsing issues - Part II chadda@lonemerchant.com
Re: HTML Parsing issues - Part II <noreply@gunnar.cc>
LWP::Parallel concerns chadda@lonemerchant.com
Re: LWP::Parallel concerns <ben@morrow.me.uk>
Re: LWP::Parallel concerns chadda@lonemerchant.com
Re: maintaining order in a hash (without Tie::IxHash) <m@rtij.nl.invlalid>
Re: problem upgrading Bundle::CPAN <ben@morrow.me.uk>
Why reading the FAQs is good (example) <dragnet\_@_/internalysis.com>
Re: Why reading the FAQs is good (example) <uri@stemsystems.com>
Re: Why reading the FAQs is good (example) <dragnet\_@_/internalysis.com>
Re: Why reading the FAQs is good (example) <dragnet\_@_/internalysis.com>
Re: Why reading the FAQs is good (example) <uri@stemsystems.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Sun, 25 May 2008 16:25:33 -0700 (PDT)
From: "xahlee@gmail.com" <xahlee@gmail.com>
Subject: blogs, longbets.org, and education of sociology
Message-Id: <1c97ba8a-b758-4891-a3c4-20e462361934@z16g2000prn.googlegroups.com>
For about the past 10 years, i have been concerned in the programing
community's level of education in social issues.
I have found that recently, a news that would be of interest to
programers.
There was a bet at longbets.org (run by Long Now Foundation) regarding
the importance of blogs. The bet was made in 2002. The prediction has
a resolution date in 2007.
In 2008, the bet is resolved. See
“Decision: Blogs vs. New York Times” (2008-02-01) by Alexander Rose
http://blog.longnow.org/2008/02/01/decision-blogs-vs-new-york-times/
I'd like encourage, for many of you, who have lots of opinions on
technical issues or social issues surrounding software, to make use of
longbets.org. It can help shape your thoughts from blog fart to
something more refined. In any case, your money will benefit society.
here's some examples you could try:
• I bet that Java will be out of the top 10 programing languages by
2020.
• I bet that the top 10 programing languages in 2015 (as determined by
requirement from job search engine), the majority will be those
characterized as dynamic languages (e.g. php, perl, python,
javascript, tcl, lisp. (as opposed to: C, Java, C++, C#, F#,
Haskell)).
• You bet that Linux as a desktop system will or will not have a
market share of such and such by the year xyz.
(I'm not sure the above “predictions” are candidates on longbets.org,
since one of their rule is that the “predictions” should be socially
important. Looking at existing entries on their site, the social
importance of the above items pale in comparison. (however, many of
their existing “predictions” are somewhat fringe))
* * *
Note, in almost all online forums where tech geekers gather (e.g.
newsgroups, slashdot, irc, etc), often they are anonymous, each fart
ignorant cries and gripes and heated arguments, often in a
irresponsible and carefree way.
One of the longbets.org's goal is to foster RESPONSIBILITY.
In recent years, i have often made claims that the Python's
documentation, it's writing quality and its documentation quality in
whole, is one of the worst.
Among all the wild claims in our modern world, from the sciences to
social or political issues, my claim about Python's technical writing
quality or its whole quality as a technical documentation, is actualy
trivial to verify by any standards. When presented to intellectuals of
the world at large, the claim's verifiability is trivial, almost as a
matter of fact checking (which are done by interns or newbie grads of
communinication/journalism/literature majors, working for journalism
houses). However, when i voiced my opinion on Python doc among
programing geekers online, it is often met with a bunch of wild cries.
Some of these beer drinking fuckheads are simply being a asshole,
which are expected by the nature of online tech geeking communities (a
significance percentage are bored young males). However, many others,
many with many years of programing experience as a professional,
sincerely tried to say something to the effect of “in my opinion it's
good”, or voice other stupid remarks to the effect of “why don't you
fix it”, and in fact find my claim, and its tone too fantastical, to
the point thinking i'm a youngling who are bent on to do nothing but
vandalism. (the tech geekers use in-group slang for this: “troll”.)
The case of the Python doc is just one example. I have also, in the
past decade, in _appropriate_ online communties (e.g. newsgroups,
mailing lists), voiced opinions on Perl's doc, emacs's doc, criticism
on lisp nested syntax, “software engineering” issues (e.g. OOP),
various issues of jargons and naming (e.g. currying, lisp1 vs lisp2,
tail recursion, closure), emacs's user interface issues, criticism on
the phenomenon of Open Source community's fervor for bug reporting,
criticism on IT industry celebrities such as Larry Wall and Guido von
Rossum, opinions on cross-posting, ... and others. Some of my claims
are indeed controversial by nature. By that i mean that there is no
consensus on the subject among its experts, and the issue is complex,
and has political implications. However, many trivially verifiable, or
even simple facts, are wildly debated or raised a ruckus, because the
programers are utterly ignorant of basic social knowledge, or due to
their political banding (e.g. a language faction, Open Source) or
current trends and fashions (e.g. OOP, Java, “Patterns”, “eXtreme
Programing”, ... , OpenSource and “Free” software movement, ...).
I think, the founding of Long Now Foundation with its longbets.org,
shares a concern i have on the tech geeking communities. In
particular, tech geekers need to have a broader education on social
sciences, needs to think in long term, and needs to foster personal
responsibility, when they act or voice opinions on their love of
technology. (note: not reading more motherfucking slashdot or
motherfucking groklaw or more great podcasts on your beatific language
or your postmodernistic fuckhead idols)
(One thing you can do, is actually take a course on philosophy,
history, law, economics, in your local community college.)
Disclaimer: I have no affiliation with Long Now Foundation.
* * *
See also:
“Responsible Software Licensing” (2003-07) by Xah Lee
http://xahlee.org/UnixResource_dir/writ/responsible_license.html
“On Microsoft Hatred” (2002-02-23) Xah Lee
http://xahlee.org/UnixResource_dir/writ/mshatred155.html
Xah
xah@xahlee.org
∑ http://xahlee.org/
☄
------------------------------
Date: Sun, 25 May 2008 17:30:16 -0700
From: sln@netherlands.co
Subject: Re: blogs, longbets.org, and education of sociology
Message-Id: <6t0k345vvochm8t3bsv7l1v38tq40j9oep@4ax.com>
Your opinion of yourself is only surpassed by your monumental display
of mastery of the English language.
sln
On Sun, 25 May 2008 16:25:33 -0700 (PDT), "xahlee@gmail.com" <xahlee@gmail.com> wrote:
>For about the past 10 years, i have been concerned in the programing
>community's level of education in social issues.
>
>I have found that recently, a news that would be of interest to
>programers.
>
>There was a bet at longbets.org (run by Long Now Foundation) regarding
>the importance of blogs. The bet was made in 2002. The prediction has
>a resolution date in 2007.
>
>In 2008, the bet is resolved. See
>
>“Decision: Blogs vs. New York Times” (2008-02-01) by Alexander Rose
> http://blog.longnow.org/2008/02/01/decision-blogs-vs-new-york-times/
>
>I'd like encourage, for many of you, who have lots of opinions on
>technical issues or social issues surrounding software, to make use of
>longbets.org. It can help shape your thoughts from blog fart to
>something more refined. In any case, your money will benefit society.
>
>here's some examples you could try:
>
>• I bet that Java will be out of the top 10 programing languages by
>2020.
>
>• I bet that the top 10 programing languages in 2015 (as determined by
>requirement from job search engine), the majority will be those
>characterized as dynamic languages (e.g. php, perl, python,
>javascript, tcl, lisp. (as opposed to: C, Java, C++, C#, F#,
>Haskell)).
>
>• You bet that Linux as a desktop system will or will not have a
>market share of such and such by the year xyz.
>
>(I'm not sure the above “predictions” are candidates on longbets.org,
>since one of their rule is that the “predictions” should be socially
>important. Looking at existing entries on their site, the social
>importance of the above items pale in comparison. (however, many of
>their existing “predictions” are somewhat fringe))
>
> * * *
>
>Note, in almost all online forums where tech geekers gather (e.g.
>newsgroups, slashdot, irc, etc), often they are anonymous, each fart
>ignorant cries and gripes and heated arguments, often in a
>irresponsible and carefree way.
>
>One of the longbets.org's goal is to foster RESPONSIBILITY.
>
>In recent years, i have often made claims that the Python's
>documentation, it's writing quality and its documentation quality in
>whole, is one of the worst.
>
>Among all the wild claims in our modern world, from the sciences to
>social or political issues, my claim about Python's technical writing
>quality or its whole quality as a technical documentation, is actualy
>trivial to verify by any standards. When presented to intellectuals of
>the world at large, the claim's verifiability is trivial, almost as a
>matter of fact checking (which are done by interns or newbie grads of
>communinication/journalism/literature majors, working for journalism
>houses). However, when i voiced my opinion on Python doc among
>programing geekers online, it is often met with a bunch of wild cries.
>Some of these beer drinking fuckheads are simply being a asshole,
>which are expected by the nature of online tech geeking communities (a
>significance percentage are bored young males). However, many others,
>many with many years of programing experience as a professional,
>sincerely tried to say something to the effect of “in my opinion it's
>good”, or voice other stupid remarks to the effect of “why don't you
>fix it”, and in fact find my claim, and its tone too fantastical, to
>the point thinking i'm a youngling who are bent on to do nothing but
>vandalism. (the tech geekers use in-group slang for this: “troll”.)
>
>The case of the Python doc is just one example. I have also, in the
>past decade, in _appropriate_ online communties (e.g. newsgroups,
>mailing lists), voiced opinions on Perl's doc, emacs's doc, criticism
>on lisp nested syntax, “software engineering” issues (e.g. OOP),
>various issues of jargons and naming (e.g. currying, lisp1 vs lisp2,
>tail recursion, closure), emacs's user interface issues, criticism on
>the phenomenon of Open Source community's fervor for bug reporting,
>criticism on IT industry celebrities such as Larry Wall and Guido von
>Rossum, opinions on cross-posting, ... and others. Some of my claims
>are indeed controversial by nature. By that i mean that there is no
>consensus on the subject among its experts, and the issue is complex,
>and has political implications. However, many trivially verifiable, or
>even simple facts, are wildly debated or raised a ruckus, because the
>programers are utterly ignorant of basic social knowledge, or due to
>their political banding (e.g. a language faction, Open Source) or
>current trends and fashions (e.g. OOP, Java, “Patterns”, “eXtreme
>Programing”, ... , OpenSource and “Free” software movement, ...).
>
>I think, the founding of Long Now Foundation with its longbets.org,
>shares a concern i have on the tech geeking communities. In
>particular, tech geekers need to have a broader education on social
>sciences, needs to think in long term, and needs to foster personal
>responsibility, when they act or voice opinions on their love of
>technology. (note: not reading more motherfucking slashdot or
>motherfucking groklaw or more great podcasts on your beatific language
>or your postmodernistic fuckhead idols)
>
>(One thing you can do, is actually take a course on philosophy,
>history, law, economics, in your local community college.)
>
>Disclaimer: I have no affiliation with Long Now Foundation.
>
> * * *
>
>See also:
>
>“Responsible Software Licensing” (2003-07) by Xah Lee
> http://xahlee.org/UnixResource_dir/writ/responsible_license.html
>
>“On Microsoft Hatred” (2002-02-23) Xah Lee
> http://xahlee.org/UnixResource_dir/writ/mshatred155.html
>
> Xah
> xah@xahlee.org
>∑ http://xahlee.org/
>
>☄
------------------------------
Date: Mon, 26 May 2008 03:12:54 GMT
From: Joshua Cranmer <Pidgeot18@verizon.invalid>
Subject: Re: blogs, longbets.org, and education of sociology
Message-Id: <WUp_j.5882$ED6.2250@trnddc02>
xahlee@gmail.com wrote:
> For about the past 10 years, i have been concerned in the programing
> community's level of education in social issues.
[ Adjusts killfile as necessary. ]
> I have found that recently, a news that would be of interest to
> programers.
>
> There was a bet at longbets.org (run by Long Now Foundation) regarding
> the importance of blogs. The bet was made in 2002. The prediction has
> a resolution date in 2007.
>
> In 2008, the bet is resolved. See
>
> “Decision: Blogs vs. New York Times” (2008-02-01) by Alexander Rose
> http://blog.longnow.org/2008/02/01/decision-blogs-vs-new-york-times/
^^^^^^^^^^
Recently? Also, work on that spelling of yours.
> I'd like encourage, for many of you, who have lots of opinions on
> technical issues or social issues surrounding software, to make use of
> longbets.org. It can help shape your thoughts from blog fart to
> something more refined. In any case, your money will benefit society.
I am getting this sense that you have some sort of monetary connection
to said site.
> • I bet that Java will be out of the top 10 programing languages by
> 2020.
FORTRAN was first used in the 1950s. IIRC, it's still in the top 10.
Languages die hard.
> • I bet that the top 10 programing languages in 2015 (as determined by
> requirement from job search engine), the majority will be those
> characterized as dynamic languages (e.g. php, perl, python,
> javascript, tcl, lisp. (as opposed to: C, Java, C++, C#, F#,
> Haskell)).
Right, once again Java-bashing in a Java forum. There's one (actually
two, but that's a different story) too many trolls in here!
I'd also like to point out that determining language use by "job search
engine" requirements is setting one up to certain biases and is not
sufficiently representative of the true patterns.
> Note, in almost all online forums where tech geekers gather (e.g.
> newsgroups, slashdot, irc, etc), often they are anonymous, each fart
> ignorant cries and gripes and heated arguments, often in a
> irresponsible and carefree way.
Okay, we already know that most /. users tend to act immature, but that
can hardly be said about newsgroups or IRC. Just read c.l.j.p's postings
for the last month to disprove your proposition.
> One of the longbets.org's goal is to foster RESPONSIBILITY.
How does making a bet make one responsible?
> In recent years, i have often made claims that the Python's
> documentation, it's writing quality and its documentation quality in
> whole, is one of the worst.
... Are you trying to be ironic on purpose?
> Among all the wild claims in our modern world, from the sciences to
> social or political issues, my claim about Python's technical writing
> quality or its whole quality as a technical documentation, is actualy
> trivial to verify by any standards.
Quality is subjective, so it's not trivial to verify.
> Some of these beer drinking fuckheads are simply being a asshole,
> which are expected by the nature of online tech geeking communities (a
> significance percentage are bored young males). However, many others,
> many with many years of programing experience as a professional,
> sincerely tried to say something to the effect of “in my opinion it's
> good”, or voice other stupid remarks to the effect of “why don't you
> fix it”, and in fact find my claim, and its tone too fantastical, to
> the point thinking i'm a youngling who are bent on to do nothing but
> vandalism. (the tech geekers use in-group slang for this: “troll”.)
Right, so when you complain that something is poor, anyone who (IMHO
validly) claims otherwise, or suggests that you take the initiative to
change the status quo, is a blithering idiot.
Although I'm sure that I have already lost all credibility with you, I
would like to point out one of the defining features of open source: if
you don't like it, you can change it. No one is pointing a gun at your
head and forcing you to use python's documentation.
Besides, you claim that longbets.org is fostering "responsibility." If
you want to change the world, take some responsibility and do it yourself.
> By that i mean that there is no
> consensus on the subject among its experts, and the issue is complex,
> and has political implications.
I think all concerned would agree that crossposting a message to several
groups (one of your examples) with the intent of criticizing those in
one group and providing information at best tangential to the charters
of other groups is of no merit, and is bad form.
> I think, the founding of Long Now Foundation with its longbets.org,
> shares a concern i have on the tech geeking communities. In
> particular, tech geekers need to have a broader education on social
> sciences, needs to think in long term, and needs to foster personal
Lesson 1: in public fora, screaming and using the most vulgar language
at someone is poor social form. In olden times, such language as you
have presented here might merit punishments like lashings, but in our
more modern egalitarian society, the worst punishment you will receive
is a stern glare.
Besides, I think in the long term. I'm already sorting out my retirement
funds and I've not received a college diploma yet.
> (note: not reading more motherfucking slashdot or
> motherfucking groklaw or more great podcasts on your beatific language
> or your postmodernistic fuckhead idols)
I read /. more to amuse myself on the idiots there, I don't read
groklaw, and I don't listen to podcasts. What I do do is program, read,
espouse my opinions on the current economic and political conditions,
read, check my email, read, read the newspaper, read, and pick up
another of McCullough's wonderful books and read some more.
> (One thing you can do, is actually take a course on philosophy,
> history, law, economics, in your local community college.)
And you should also take a course on Manners 101 at your local community
college.
I would finally like to add that you seem to put yourself on the
pedestal of being the sole person who is righteous in a quagmire of a
world, while nothing could be further from the truth. Anyone who must resort to
base name-calling and mere obscenities when criticizing others has
problems of their own. (In my defense, I do not place myself on such a
pedestal: I respect the opinions of others in this newsgroup far above
myself and would also like to add that they are capable of restraining
themselves when reading provocative banter while I am not).
--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
------------------------------
Date: Sun, 25 May 2008 23:28:50 -0400
From: Lew <conrad@lewscanon.com.invalid>
Subject: Re: blogs, longbets.org, and education of sociology
Message-Id: <x8udnUDt0YLvs6fVnZ2dnUVZ_uidnZ2d@comcast.com>
xahlee@gmail.com wrote:
> For about the past 10 years, i [sic] have been
saying things like
> ... fart ignorant ..., often in a
> irresponsible and carefree way.
...
> Some of these beer drinking f**kheads are simply being a asshole,
...
> vandalism. (the tech geekers use in-group slang for this: “troll”.)
Actually, it's argot, not slang. The definition is fairly narrow and well
understood.
more obscenity:
> technology. (note: not reading more motherf**king slashdot or
> motherf**king groklaw or more great podcasts on your beatific language
> or your postmodernistic f**khead idols)
And the temerity of:
> (One thing you can do, is actually take a course on philosophy,
> history, law, economics, in your local community college.)
Yeah, those bastions of intellectual elitism.
Plonk, plonk, plonk.
--
Lew
------------------------------
Date: Sun, 25 May 2008 14:04:03 -0700 (PDT)
From: chadda@lonemerchant.com
Subject: Re: HTML Parsing issues - Part II
Message-Id: <59a2724f-7f97-478e-ace7-c6ed069aefc2@w8g2000prd.googlegroups.com>
On May 25, 5:15 am, Gunnar Hjalmarsson <nore...@gunnar.cc> wrote:
> cha...@lonemerchant.com wrote:
> > I'm trying to have the following script parse
>
> > <table class="item_description">
> > <tr>
> > <td>Acer Aspire AS5610-2089 Notebook, Intel Pentium Dual Core
> > T2080, 1.6 GHz, 1024GB, 160GB, DVD+/-R DL/DVD+RW Drive, 15.4" TFT,
> > WebCam, 56K Modem, Wireless, NIC, Vista Home Premium, Refurbished with
> > 90 Day Warranty</td>
> > </tr>
> > </table>
>
> > From the following url
>
> >http://www.doba.com/catalog/2988526.html
>
> use LWP::Simple;
> use HTML::TokeParser;
>
> my $html = get 'http://www.doba.com/catalog/2988526.html';
> my $p = HTML::TokeParser->new( \$html );
>
> while ( my $table = $p->get_tag('table') ) {
> last if $table->[1]{class} and
> $table->[1]{class} eq 'item_description';
> }
> print $p->get_trimmed_text('/td');
>
> --
How did you know to use $table->[1]{class} and say not
$table->[0]{class}? Is there something in the documentation that I missed?
------------------------------
Date: Sun, 25 May 2008 23:26:54 +0200
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: HTML Parsing issues - Part II
Message-Id: <69u3pbF34org9U1@mid.individual.net>
chadda@lonemerchant.com wrote:
> On May 25, 5:15 am, Gunnar Hjalmarsson <nore...@gunnar.cc> wrote:
>> cha...@lonemerchant.com wrote:
>>> I'm trying to have the following script parse
>>> <table class="item_description">
>>> <tr>
>>> <td>Acer Aspire AS5610-2089 Notebook, Intel Pentium Dual Core
>>> T2080, 1.6 GHz, 1024GB, 160GB, DVD+/-R DL/DVD+RW Drive, 15.4" TFT,
>>> WebCam, 56K Modem, Wireless, NIC, Vista Home Premium, Refurbished with
>>> 90 Day Warranty</td>
>>> </tr>
>>> </table>
>>> From the following url
>>> http://www.doba.com/catalog/2988526.html
>> use LWP::Simple;
>> use HTML::TokeParser;
>>
>> my $html = get 'http://www.doba.com/catalog/2988526.html';
>> my $p = HTML::TokeParser->new( \$html );
>>
>> while ( my $table = $p->get_tag('table') ) {
>> last if $table->[1]{class} and
>> $table->[1]{class} eq 'item_description';
>> }
>> print $p->get_trimmed_text('/td');
>
> How did you know to use $table->[1]{class} and say not
> $table->[0]{class}? Is there something in the documentation that I missed?
I played with Data::Dumper to figure it out. Don't know if it can be
derived from the docs.
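
For what it's worth, the HTML::TokeParser documentation does describe the return value of get_tag: for a start tag it is an array reference of the form [$tag, $attr, $attrseq, $text], so element 0 is the tag name and element 1 is the hash of attributes. A minimal sketch of the same lookup, run against an inline HTML string instead of the live page (the sub name item_description and the shortened sample markup are assumptions made for illustration):

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: return the text of the first <td> inside the
# table whose class attribute is 'item_description'.
sub item_description {
    my ($html) = @_;
    my $p = HTML::TokeParser->new( \$html );

    while ( my $tag = $p->get_tag('table') ) {
        # For a start tag, get_tag returns [ $tag, $attr, $attrseq, $text ]:
        # $tag->[0] is the tag name, $tag->[1] is the attribute hash.
        my ( $name, $attr ) = @$tag;
        next unless defined $attr->{class}
            and $attr->{class} eq 'item_description';

        # Collect all text up to the next </td>, whitespace trimmed.
        return $p->get_trimmed_text('/td');
    }
    return;
}

my $html = <<'HTML';
<table class="other"><tr><td>skip me</td></tr></table>
<table class="item_description">
<tr>
<td>Acer Aspire AS5610-2089 Notebook</td>
</tr>
</table>
HTML

print item_description($html), "\n";
```

Unpacking the array reference into named variables makes it harder to confuse the tag name with the attribute hash.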
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
------------------------------
Date: Sun, 25 May 2008 10:21:52 -0700 (PDT)
From: chadda@lonemerchant.com
Subject: LWP::Parallel concerns
Message-Id: <a91e70bd-7f34-4340-82ce-1bd3942cfd31@v26g2000prm.googlegroups.com>
Maybe I'm wrong, but if I were to use LWP::Parallel to parse a remote
site for a few hours, couldn't this be interpreted as a Denial of
Service attack? And if it could be, what could I do to avoid it?
------------------------------
Date: Sun, 25 May 2008 18:39:01 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: LWP::Parallel concerns
Message-Id: <lqkog5-6pb.ln1@osiris.mauzo.dyndns.org>
Quoth chadda@lonemerchant.com:
> Maybe I'm wrong, but if I were to use LWP::Parallel to parse a remote
> site for a few hours, couldn't this be interpreted as a Denial of
> Service attack? And if it could be, what could I do to avoid it?
By default LWP::Parallel won't make more than 5 requests to any given
host at a time. If you are worried that even this many over a sustained
period would be considered abuse, you could reduce it with ->max_req; or
you could use ->on_connect to do some more sophisticated rate-limiting.
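
As an illustration of what "more sophisticated rate-limiting" might mean, here is a minimal per-host throttle written in plain Perl. It does not use LWP::Parallel at all: the helper seconds_to_wait, the host name, and the 2-second gap are assumptions made for this sketch, and wiring it into a callback such as the one mentioned above is left out.

```perl
use strict;
use warnings;

{
    my %last_start;    # host => time at which the last request may start

    # Hypothetical helper: given the current time and a minimum gap in
    # seconds, return how long to wait before the next request to $host
    # may start, and record that start time.
    sub seconds_to_wait {
        my ( $host, $now, $min_gap ) = @_;
        my $earliest = exists $last_start{$host}
            ? $last_start{$host} + $min_gap
            : $now;
        my $start = $earliest > $now ? $earliest : $now;
        $last_start{$host} = $start;
        return $start - $now;
    }
}

# Three back-to-back requests to one host at time 100, 2-second gap.
my @waits = map { seconds_to_wait( 'www.example.com', 100, 2 ) } 1 .. 3;
print "@waits\n";    # prints "0 2 4"
```

A caller would sleep for the returned number of seconds before starting the request, which spaces requests to one host regardless of how many run in parallel overall.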
Ben
--
The Earth is degenerating these days. Bribery and corruption abound.
Children no longer mind their parents, every man wants to write a book,
and it is evident that the end of the world is fast approaching.
Assyrian stone tablet, c.2800 BC ben@morrow.me.uk
------------------------------
Date: Sun, 25 May 2008 10:54:44 -0700 (PDT)
From: chadda@lonemerchant.com
Subject: Re: LWP::Parallel concerns
Message-Id: <90995788-90d4-4490-803c-ca7359cf0362@v26g2000prm.googlegroups.com>
On May 25, 10:39 am, Ben Morrow <b...@morrow.me.uk> wrote:
> Quoth cha...@lonemerchant.com:
>
> > Maybe I'm wrong, but if I were to use LWP::Parallel to parse a remote
> > site for a few hours, couldn't this be interpreted as a Denial of
> > Service attack? And if it could be, what could I do to avoid it?
>
> By default LWP::Parallel won't make more than 5 requests to any given
> host at a time. If you are worried that even this many over a sustained
> period would be considered abuse, you could reduce it with ->max_req; or
> you could use ->on_connect to do some more sophisticated rate-limiting.
>
Okay. Thanks.
------------------------------
Date: Sun, 25 May 2008 23:03:53 +0200
From: Martijn Lievaart <m@rtij.nl.invlalid>
Subject: Re: maintaining order in a hash (without Tie::IxHash)
Message-Id: <pan.2008.05.25.21.03.52@rtij.nl.invlalid>
On Thu, 22 May 2008 19:50:16 +0000, Jürgen Exner wrote:
> nolo contendere <simon.chao@fmr.com> wrote:
>>However, regarding the problem of maintaining sort order of a hash, and
>>the Tie::IxHash module, I have a question.
>
> Maybe I am old-fashioned but to me 'sorted' and 'hash' is a
> contradiction in terms. A hash is a (partial) mapping from string to
> scalar and mappings do not have a sequence. Arrays are different because
> their domain (natural numbers) does have a natural order, therefore
> arrays inherit this order.
Perl has a relatively limited set of data structures. As far as mapping
structures, Perl only provides a hash.
There are many applications for a mapping structure that maintains order
(although insertion order is better done by an array).
In C++ you can use a map<>. One of the arguments is the sorting order (a
less-than routine, taking two keys). In fact, in standard C++, a
Perl-like hash structure is missing, but is provided by, for instance, the
boost library and many vendors. This is widely seen as a deficiency in
the standard.
Similarly, although Perl can simulate most data structures with ease and
elegance (like using a hash as a set), an ordered map is something Perl
could very well use.
A real world example? Storing IP(v4 only) networks and checking if an IPA
is contained in a network stored in the map. Trivial with a sorted map
(store networks as 5 byte integers, 4 for the network address 1 for the
mask, search first key for less-than-or-equal ipa/32, check if ipa in
network found).
Probably not too difficult to do in Perl as well, but I cannot
immediately come up with an easy and efficient (both in time and
memory) way like the above.
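
One way to approximate the sorted-map lookup described above in plain Perl is a sorted array plus a binary search. This is only a sketch under stated assumptions: the sub names (ip2int, build_table, find_network) are invented here, the networks are assumed not to overlap, and each entry is kept as a small array rather than a packed 5-byte key.

```perl
use strict;
use warnings;

# Convert a dotted quad to a 32-bit integer.
sub ip2int { return unpack 'N', pack 'C4', split /\./, shift }

# Build a table of [ network_address, mask_length, label ] entries,
# sorted by network address. Input strings look like '10.0.0.0/8'.
sub build_table {
    my @nets = map {
        my ( $addr, $len ) = split m{/}, $_;
        [ ip2int($addr), $len, $_ ]
    } @_;
    return [ sort { $a->[0] <=> $b->[0] } @nets ];
}

# Find the last entry whose network address is <= the IP (binary
# search), then check whether the IP really lies inside that network.
sub find_network {
    my ( $table, $ip ) = @_;
    my $n = ip2int($ip);
    my ( $lo, $hi ) = ( 0, $#$table );
    my $found;
    while ( $lo <= $hi ) {
        my $mid = int( ( $lo + $hi ) / 2 );
        if ( $table->[$mid][0] <= $n ) { $found = $mid; $lo = $mid + 1 }
        else                           { $hi    = $mid - 1 }
    }
    return undef unless defined $found;

    my ( $start, $len, $label ) = @{ $table->[$found] };
    my $mask = $len == 0 ? 0 : ( 0xFFFFFFFF << ( 32 - $len ) ) & 0xFFFFFFFF;
    return ( $n & $mask ) == $start ? $label : undef;
}

my $table = build_table( '192.168.1.0/24', '10.0.0.0/8' );
my $hit   = find_network( $table, '192.168.1.77' );
print defined $hit ? $hit : 'no match', "\n";
```

Lookups are O(log n), but inserting a network means re-sorting or splicing the array, which is where a real ordered map would do better.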
M4
------------------------------
Date: Sun, 25 May 2008 15:01:08 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: problem upgrading Bundle::CPAN
Message-Id: <428og5-lo1.ln1@osiris.mauzo.dyndns.org>
Quoth si <smcbutler@hotmail.com>:
>
> hi, i've been using cpan for several years with very few problems, i
> just tried to upgrade my Bundle::CPAN install and got a lot of
> failures. I'm running FC6 on an AMD opteron machine. could someone
> give me a hint how i might fix this?
>
> thx in advance.
>
>
> DIED. FAILED tests 1-29
> Failed 29/29 tests, 0.00% okay
> t/14gzopen......Use of uninitialized value in concatenation (.) or
> string at /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/Scalar/
> Util.pm line 30.
You need to install the XS version of Scalar::Util (for which you need a
C compiler). For some reason Fedora comes with the pure-Perl version,
which is Not Good Enough.
Ben
--
I have two words that are going to make all your troubles go away.
"Miniature". "Golf".
[ben@morrow.me.uk]
------------------------------
Date: Sun, 25 May 2008 21:10:03 -0500
From: Marc Bissonnette <dragnet\_@_/internalysis.com>
Subject: Why reading the FAQs is good (example)
Message-Id: <Xns9AA9E172266E9dragnetinternalysisc@216.196.97.131>
So I was commenting some code I wrote for a friend who's new to perl and
I came across the following in my code:
# replace decimals with 999999999 in order to check for non-numerical
# data, then switch it back (this is a lazyman's shortcut)
$in{hours} =~ s/\./999999999/;
if ($in{hours} =~ /\D/) {
push @missing,'Hours contains non-numerical data';
$missing = 1;
}
$in{hours}=~s/999999999/\./;
And I realized I've been too lazy with this for too long. So my first
thought was to post here with "how do I test for non-numerical in decimal
number data?" but, of course, that violates the "How to ask intelligent
questions in CLPM", so I googled it.
Came up with a bunch of non-relevant results until I saw a similar
question and the answer was "Your answer is in perldoc perldata"
So I did exactly that. Of course, the answer *is* right there:
warn "not a decimal number" unless /^-?\d+\.?\d*$/;
Which got me thinking - I take waaaay too many long-trips when there is
stuff in the regex language that would make my life easier, so I realized
that I need to understand the above, rather than just copy/paste into
code.
So, here's my understanding of
/^-?\d+\.?\d*$/;
/ start the search pattern
^ match the beginning of the line
-? match a minus sign once or not at all
\d+ match a digit character zero or more times
\.? match a decimal once or not at all
\d match a digit character
*$ all to the end of the line
/ end the search pattern
Erm... I'm not so sure it's all stuck in my head.
I *think* this means
"Warn {text} unless the input *only* matches minus signs, digit
characters and decimals, from the beginning to the end of the string"
Is that about right ?
I tried the above with the following in test.pl just to try to reinforce
it in my mind:
$foo = '3.5'; # no warning
$foo = '5'; # no warning
$foo = '-1'; # no warning
$foo = '4x4'; # warning - non-digit
$foo = '--3.2'; # warning - more than one minus
$foo = '3.5.5'; # warning - more than one decimal
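
The six cases above can also be checked in one loop rather than by editing $foo each time. A minimal sketch, assuming a helper named looks_decimal (not part of the original test.pl) that wraps the perldata pattern:

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the pattern quoted from perldoc perldata.
sub looks_decimal { return $_[0] =~ /^-?\d+\.?\d*$/ ? 1 : 0 }

for my $foo ( '3.5', '5', '-1', '4x4', '--3.2', '3.5.5' ) {
    printf "%-6s %s\n", $foo, looks_decimal($foo) ? 'no warning' : 'warning';
}
```

The first three values pass and the last three are rejected, matching the comments in the list above.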
Now that section of my code is:
if ($in{hours} !~ /^-?\d+\.?\d*$/) {
push @missing,'Hours contains non-numerical data';
$missing = 1;
}
Comments and general pointing and laughing welcome. :)
--
Marc Bissonnette
Looking for a new ISP? http://www.canadianisp.com
Largest ISP comparison site across Canada.
------------------------------
Date: Mon, 26 May 2008 03:09:29 GMT
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: Why reading the FAQs is good (example)
Message-Id: <x7y75xsqvx.fsf@mail.sysarch.com>
>>>>> "MB" == Marc Bissonnette <dragnet\_@_/internalysis.com> writes:
MB> So, here's my understanding of
MB> /^-?\d+\.?\d*$/;
MB> / start the search pattern
MB> ^ match the beginning of the line
beginning of string in this case.
MB> -? match a minus sign once or not at all
MB> \d+ match a digit character zero or more times
one or more times. you must have digits before the decimal point.
MB> \.? match a decimal once or not at all
MB> \d match a digit character
MB> *$ all to the end of the line
that is \d* which is zero or more digits. then comes $ which is end of
the string (or before an ending newline).
MB> / end the search pattern
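One way to keep the corrected reading next to the pattern is the /x modifier, which lets a regex carry whitespace and comments. This is only a sketch of the same perldata pattern; the variable name $decimal and the sample inputs are made up for illustration.

```perl
use strict;
use warnings;

# The perldata pattern, spelled out with /x so each piece can carry a comment.
my $decimal = qr/
    ^       # beginning of the string
    -?      # an optional minus sign
    \d+     # one or more digits (required)
    \.?     # an optional decimal point
    \d*     # zero or more digits
    $       # end of the string, or just before a trailing newline
/x;

for my $input ('3.5', '5.', '.9', "7\n") {
    (my $shown = $input) =~ s/\n/\\n/;   # make the newline visible when printing
    print "'$shown' ", ($input =~ $decimal ? 'matches' : 'does not match'), "\n";
}
```

Of these inputs only '.9' should fail to match, since \d+ demands at least one digit before the decimal point.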
MB> I *think* this means
MB> "Warn {text} unless the input *only* matches minus signs, digit
MB> characters and decimals, from the beginning to the end of the string"
well it has its flaws. it matches most numbers but what about just
fractional numbers like .9? it fails there since it requires digits
before any decimal point. also it doesn't allow a leading + sign.
look at Regexp::Common on cpan. i am sure it has a number validation
regex in there. it is trickier than your example here as it allows all
number formats (including exponents).
MB> Now that section of my code is:
MB> if ($in{hours} !~ /^-?\d+\.?\d*$/) {
MB> push @missing,'Hours contains non-numerical data';
MB> $missing = 1;
MB> }
the $missing = 1 is a red flag as boolean flags are poor coding IMO. you
have @missing which supposedly contains error strings so just check
whether it is non-empty.
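a rough sketch of that idea, assuming %in holds the form fields as in the quoted code (the sample value '4x4' and the printed wording are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical form data standing in for the %in hash from the quoted code.
my %in = (hours => '4x4');

my @missing;
if ($in{hours} !~ /^-?\d+\.?\d*$/) {
    push @missing, 'Hours contains non-numerical data';
}

# An array in boolean context is true when it has at least one element,
# so no separate $missing flag is needed.
if (@missing) {
    print "Please fix the following:\n";
    print "  $_\n" for @missing;
}
```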
MB> Comments and general pointing and laughing welcome. :)
MUAHAHAHAHAHAHHAH!!!
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Free Perl Training --- http://perlhunter.com/college.html ---------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
------------------------------
Date: Sun, 25 May 2008 22:31:01 -0500
From: Marc Bissonnette <dragnet\_@_/internalysis.com>
Subject: Re: Why reading the FAQs is good (example)
Message-Id: <Xns9AA9EF2C4CC97dragnetinternalysisc@216.196.97.131>
Uri Guttman <uri@stemsystems.com> fell face-first on the keyboard. This
was the result: news:x7y75xsqvx.fsf@mail.sysarch.com:
>>>>>> "MB" == Marc Bissonnette <dragnet\_@_/internalysis.com> writes:
>
> MB> So, here's my understanding of
>
> MB> /^-?\d+\.?\d*$/;
>
> MB> / start the search pattern
> MB> ^ match the beginning of the line
>
> beginning of string in this case.
>
> MB> -? match a minus sign once or not at all
> MB> \d+ match a digit character zero or more times
>
> one or more times. you must have digits before the decimal point.
> MB> \.? match a decimal once or not at all
> MB> \d match a digit character
> MB> *$ all to the end of the line
>
> that is \d* which is zero or more digits. then comes $ which is end of
> the string (or before an ending newline.
>
> MB> / end the search pattern
> MB> I *think* this means
> MB> "Warn {text} unless the input *only* matches minus signs, digit
> MB> characters and decimals, from the beginning to the end of the
> string"
>
> well it has its flaws. it matches most numbers but what about just
> fractional numbers like .9? it fails there since it requires digits
> before any decimal point. also it doesn't allow a leading + sign.
I noticed that when I was goofing around with it locally;
In reading your points and thinking about it (now that I understand it a
bit better), this seems to work (leading pluses or minuses, as well as
leading decimals, such as .5):
/^\+?-?\d?\.?\d*$/
> look at Regexp::Common on cpan. i am sure it has a number validation
> regex in there. it is trickier than your example here as it allows all
> number formats (including exponents).
I'll definitely take a peek in there when the need arises - For now, I'm
helping a friend out by trying to automate some of their purchase orders,
packing slips and job accounting summaries, so I think the numerical data
will suffice being limited to positives and negatives :)
>
> MB> Now that section of my code is:
>
> MB> if ($in{hours} !~ /^-?\d+\.?\d*$/) {
> MB> push @missing,'Hours contains non-numerical data';
> MB> $missing = 1;
> MB> }
>
> the $missing = 1 is a red flag as boolean flags are poor coding IMO.
> you have @missing which supposedly contains error strings so just
> check that if it isn't empty.
Ya know, that thought honestly popped in my mind as soon as I hit send :)
It's an ingrained bad habit I'll have to work myself out of.
> MB> Comments and general pointing and laughing welcome. :)
>
> MUAHAHAHAHAHAHHAH!!!
:)
Better a MUAHAHAHAHHAHA than a RTFM :)
------------------------------
Date: Sun, 25 May 2008 22:37:43 -0500
From: Marc Bissonnette <dragnet\_@_/internalysis.com>
Subject: Re: Why reading the FAQs is good (example)
Message-Id: <Xns9AA9F04F0E702dragnetinternalysisc@216.196.97.131>
Marc Bissonnette <dragnet\_@_/internalysis.com> fell face-first on the
keyboard. This was the result:
news:Xns9AA9EF2C4CC97dragnetinternalysisc@216.196.97.131:
> Uri Guttman <uri@stemsystems.com> fell face-first on the keyboard.
> This was the result: news:x7y75xsqvx.fsf@mail.sysarch.com:
(snip)
>> well it has its flaws. it matches most numbers but what about just
>> fractional numbers like .9? it fails there since it requires digits
>> before any decimal point. also it doesn't allow a leading + sign.
>
> I noticed that when I was goofing around with it locally;
> In readin your points and thinking about it (now that I understand it
> a bit better), this seems to work (leading pluses or minuses, as well
> as leading decimals, such as .5
>
> /^\+?-?\d?\.?\d*$/
Apologies for following up my own post: I just realized the above has a
flaw: It matches one or zero beginning digits (.4 or 0.4) but not two digits
or more (22.4)
This works better:
/^\+?-?(\d?|\d+)\.?\d*$/
------------------------------
Date: Mon, 26 May 2008 04:04:03 GMT
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: Why reading the FAQs is good (example)
Message-Id: <x7tzglsocs.fsf@mail.sysarch.com>
>>>>> "MB" == Marc Bissonnette <dragnet\_@_/internalysis.com> writes:
MB> I noticed that when I was goofing around with it locally;
MB> In readin your points and thinking about it (now that I understand it a
MB> bit better), this seems to work (leading pluses or minuses, as well as
MB> leading decimals, such as .5
MB> /^\+?-?\d?\.?\d*$/
that allows +-3
use [] to allow one char from a set:
/^[+-]?\d?\.?\d*$/
that allows either a single leading + or - but not both.
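every piece of that pattern is still optional, though, so it also accepts an empty string or a lone '.', and \d? still limits the integer part to one digit before a decimal point (22.4 fails). a possible sketch that requires at least one digit, assuming forms like '5.' and '.5' should both be accepted (that choice and the name $number are assumptions for illustration, not something settled in this thread):

```perl
use strict;
use warnings;

# Optional single sign, then either digits with an optional fraction,
# or a bare fraction such as .5
my $number = qr/^[+-]?(?:\d+\.?\d*|\.\d+)$/;

for my $input ('22.4', '.5', '+3', '-0.4', '+-3', '.', '') {
    printf "%-6s %s\n", "'$input'", ($input =~ $number ? 'ok' : 'rejected');
}
```

the alternation makes the two shapes explicit: digits first, or decimal point first, but never neither.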
uri
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 1577
***************************************