[30335] in Perl-Users-Digest
Perl-Users Digest, Issue: 1578 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon May 26 11:10:03 2008
Date: Mon, 26 May 2008 08:09:15 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Mon, 26 May 2008 Volume: 11 Number: 1578
Today's topics:
Re: Dividing a problem into subproblems and using mulit <peter@makholm.net>
Dividing a problem into subproblems and using mulitiple mathematisch@gmail.com
Re: Dividing a problem into subproblems and using mulit mathematisch@gmail.com
Re: initialize object permanently (only once) <yingun@gmail.com>
Re: LWP::Parallel concerns <1usa@llenroc.ude.invalid>
new CPAN modules on Mon May 26 2008 (Randal Schwartz)
Partial e-mail body receiving by SMTP Server? <martin.rysanek@gmail.com>
Re: Partial e-mail body receiving by SMTP Server? <ramprasad.ap@gmail.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Mon, 26 May 2008 15:48:15 +0200
From: Peter Makholm <peter@makholm.net>
Subject: Re: Dividing a problem into subproblems and using mulitiple CPU's and perl in linux
Message-Id: <87tzglch28.fsf@hacking.dk>
mathematisch@gmail.com writes:
> processing? Each should use one of the four CPUs on the same machine
> and somehow report to a main program upon finishing, so that the
> partial results can be "merged".
The right solution depends on how you need to communicate between the
master process and the children. If you don't need any communication
during the processing of a job and your return value is a simple
success/failure flag, then I would prefer to use something simple like
Parallel::ForkManager:
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(8);

$pm->run_on_finish( sub {
    my ($pid, $exit_code, $ident) = @_;
    # $ident is whatever was passed to start(); $exit_code is the
    # value the child handed to finish(), so 0 conventionally means success
    print "$ident ended with ",
        ( $exit_code == 0 ? "success" : "failure" ), "\n";
} );

while ( defined( my $job = shift @jobqueue ) ) {
    $pm->start($job) and next;   # parent loops on; child continues below
    my $result = process($job);
    $pm->finish($result);        # child exits, reporting $result
}

$pm->wait_all_children;
> As you see, this is more of a parallel programming question using
> Perl. Is the answer related to "forking child processes"? Would
> forking child processes and giving each child a chunk be the best
> solution? Or am I wrong? I am not very familiar with Unix and Perl.
The above is the "forking child" solution. Whether this is the right
solution depends on your problem. If you need more complicated
feedback, the "forking child" approach becomes more difficult. One
solution would be to pass each child a filehandle and have it write
its result there; your run_on_finish hook would then read the result back.
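A minimal sketch of that filehandle idea, with one pipe per child
(here process() is just a stand-in for the real work, as in the
example above):

```perl
use strict;
use warnings;

sub process { my ($job) = @_; return $job * $job }   # stand-in workload

my @jobqueue = (1 .. 4);
my %reader;                    # pid => read end of that child's pipe

for my $job (@jobqueue) {
    pipe(my $r, my $w) or die "pipe: $!";
    my $pid = fork();
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {                       # child
        close $r;
        print {$w} process($job), "\n";    # report result to the parent
        close $w;
        exit 0;
    }
    close $w;                              # parent keeps the read end
    $reader{$pid} = $r;
}

my @results;
for my $pid (keys %reader) {
    my $line = readline $reader{$pid};
    chomp $line;
    push @results, $line;
    close $reader{$pid};
    waitpid $pid, 0;                       # reap the child
}

print "merged: @{[ sort { $a <=> $b } @results ]}\n";
```

This only works for results small enough to fit in the pipe buffer
before the parent reads them; for anything bigger you would read the
pipes as the children run (e.g. with select) rather than afterwards.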
But I don't think this will ever be true parallel programming.
//Makholm
------------------------------
Date: Mon, 26 May 2008 06:25:04 -0700 (PDT)
From: mathematisch@gmail.com
Subject: Dividing a problem into subproblems and using mulitiple CPU's and perl in linux
Message-Id: <616d36ea-882e-42d4-a096-4ac3bc04f283@l42g2000hsc.googlegroups.com>
Dear Sir/Madam,
I have a couple of questions regarding multi-tasking with Perl in a
multi-CPU Unix environment.
I have a Unix system with 4 CPUs. If I can divide the solution of a
problem into 4 independent subproblems, would there be a way to run 4
Perl programs, each assigned one of the 4 subproblems for processing?
Each should use one of the four CPUs on the same machine and somehow
report to a main program upon finishing, so that the partial results
can be "merged".
As you see, this is more of a parallel programming question using
Perl. Is the answer related to "forking child processes"? Would
forking child processes and giving each child a chunk be the best
solution? Or am I wrong? I am not very familiar with Unix and Perl.
Do you have any information about a good tutorial on multi-tasking
with Perl?
Thanks a lot for your response.
M.
------------------------------
Date: Mon, 26 May 2008 08:03:21 -0700 (PDT)
From: mathematisch@gmail.com
Subject: Re: Dividing a problem into subproblems and using mulitiple CPU's and perl in linux
Message-Id: <4eb737e5-e684-4dc9-a31d-a72d48ace3a9@59g2000hsb.googlegroups.com>
On May 26, 3:48 pm, Peter Makholm <pe...@makholm.net> wrote:
> [Parallel::ForkManager example and explanation snipped]
Thanks a lot, Makholm.
------------------------------
Date: Mon, 26 May 2008 07:51:42 -0700 (PDT)
From: Keenlearner <yingun@gmail.com>
Subject: Re: initialize object permanently (only once)
Message-Id: <9d115472-3335-411c-8043-e8e74809bc71@w8g2000prd.googlegroups.com>
On May 17, 2:08 am, xhos...@gmail.com wrote:
> Keenlearner <yin...@gmail.com> wrote:
> > Hello, I am using WordNet::QueryData, which allows access to a very
> > huge dictionary data set. The initialization of the object
> > my $wn = WordNet::QueryData->new;
>
> > took
> > 2 wallclock secs ( 2.36 usr + 0.07 sys = 2.43 CPU)
>
> > Then the subsequent requests for the data are extremely fast
>
> ...
> >   print "Noun count: ", scalar($wn->listAllWords("noun")), "\n";
>
> This one is problematic (i.e. slow) with the method I will outline below,
> so I commented it out and tested only the other requests.
>
> > I am developing a web application; is there a way to keep the
> > initialized object permanently in memory?
>
> Under straightforward CGI, the process ends when the request is done, so
> the object ends with it. So the real answer here is to use mod_perl
> or one of the other ways of running CGI with a persistent perl process.
>
> > I tried to use the Storable module, but that only gave me a little
> > increase in performance. Anybody's idea is very much appreciated.
> > Thank you.
>
> Again, the best answer is almost certainly to use mod_perl or the like.
> But just for kicks, I made it work with DBM::Deep, version 0.983. This
> way the data is stored on disk and only the parts actually used are read
> into memory. As alluded to above, listAllWords defeats the purpose of
> this, as it goes through so much stuff, causing it all to be read into
> memory.
>
> First one needs to prepare the database, to be held in "file.dbm". This
> only needs to be done one time, or whenever the data files are updated:
>
> perl -e 'use WordNet::QueryData; my $x = WordNet::QueryData->new(); \
>    delete $x->{data_fh}; use DBM::Deep; my $h = DBM::Deep->new("file.dbm"); \
>    $h->{foo} = $x;'
>
> Once that is done, you can replace "my $wn = WordNet::QueryData->new;"
> with:
>
> use DBM::Deep;
> my $x = DBM::Deep->new("file.dbm")->{foo};
> my $wn = {}; %$wn = %$x;   # convert the top level to a regular hash
>                            # (so it can be blessed)
> delete $wn->{data_fh};     # I don't know why this is necessary,
>                            # as this entry should not be here to begin with
> bless $wn, 'WordNet::QueryData';
> $wn->openData;             ## restore contents of $wn->{data_fh}
>
> And from here use $wn just as before. This is very similar to the
> "Storable" method demonstrated a few months ago in
> comp.lang.perl.modules, except using this method data is read from disk
> only when requested.
>
> Here are timings on my machine for your "subsequent requests", except for
> the listAllWords ones:
>
> original:
>   0:01.73
> Storable:
>   0:00.50
> DBM::Deep:
>   0:00.07
>
> I have compared the output of all methods and they are identical, but
> that doesn't constitute a rigorous test, so use with caution.
>
> Xho
>
> --
> --------------------http://NewsReader.Com/--------------------
> The costs of publication of this article were defrayed in part by the
> payment of page charges. This article must therefore be hereby marked
> advertisement in accordance with 18 U.S.C. Section 1734 solely to
> indicate this fact.
Thanks. After a few days of digging into mod_perl, I think mod_perl is
so far the best solution, using PerlRequire to load a startup.pl
script.
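For reference, a minimal startup.pl along those lines might look like
the following (the file path and the MyApp::WordNet package name are
illustrative, not from this thread):

```perl
# httpd.conf:
#   PerlRequire /etc/apache2/startup.pl

# startup.pl -- runs once when Apache starts under mod_perl, so the
# expensive WordNet::QueryData object is built a single time and then
# shared by every request the persistent interpreter handles.
use strict;
use warnings;

use WordNet::QueryData;

package MyApp::WordNet;            # hypothetical package holding the handle
our $wn = WordNet::QueryData->new; # the ~2 s initialization, paid once

1;
```

Handlers can then use $MyApp::WordNet::wn directly instead of
constructing a new object on every request.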
------------------------------
Date: Mon, 26 May 2008 14:26:20 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude.invalid>
Subject: Re: LWP::Parallel concerns
Message-Id: <Xns9AAA6A2FB83EEasu1cornelledu@127.0.0.1>
Ben Morrow <ben@morrow.me.uk> wrote in
news:lqkog5-6pb.ln1@osiris.mauzo.dyndns.org:
> Quoth chadda@lonemerchant.com:
>> Maybe I'm wrong, but if I were to use LWP::Parallel to parse a remote
>> site for a few hours, then couldn't this be possibly interpreted as
>> a Denial of Service? And if could be interpreted as a possibly Denial
>> of Service attack, what could I do to possibly avoid it?
>
> By default LWP::Parallel won't make more than 5 requests to any given
> host at a time. If you are worried that even this many over a
> sustained period would be considered abuse,
On the other hand, this would not help at all if what you are doing is
against the web site's terms of use and you are scraping to reconstruct
a data set to which you do not have commercial use rights.
If you then go and set up a commercial operation of any sort using the
data you obtained that way, well, I am no lawyer, but I think a decent
case could be made against you.
Sinan
--
A. Sinan Unur <1usa@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)
comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/
------------------------------
Date: Mon, 26 May 2008 04:42:20 GMT
From: merlyn@stonehenge.com (Randal Schwartz)
Subject: new CPAN modules on Mon May 26 2008
Message-Id: <K1GL2K.F2L@zorch.sf-bay.org>
The following modules have recently been added to or updated in the
Comprehensive Perl Archive Network (CPAN). You can install them using the
instructions in the 'perlmodinstall' page included with your Perl
distribution.
API-Plesk-1.03
http://search.cpan.org/~nrg/API-Plesk-1.03/
OOP interface to the Plesk XML API (http://www.parallels.com/en/products/plesk/).
----
Abstract-Meta-Class-0.06
http://search.cpan.org/~adrianwit/Abstract-Meta-Class-0.06/
Simple meta object protocol implementation.
----
Apache-ASP-2.61
http://search.cpan.org/~chamas/Apache-ASP-2.61/
Active Server Pages for Apache with mod_perl
----
Catalyst-Runtime-5.7014
http://search.cpan.org/~mramberg/Catalyst-Runtime-5.7014/
Catalyst Runtime version
----
Class-MOP-0.56
http://search.cpan.org/~stevan/Class-MOP-0.56/
A Meta Object Protocol for Perl 5
----
Config-INI-Reader-Ordered-0.011
http://search.cpan.org/~hdp/Config-INI-Reader-Ordered-0.011/
.ini-file parser that returns sections in order
----
Coro-4.72
http://search.cpan.org/~mlehmann/Coro-4.72/
coroutine process abstraction
----
Crypt-Skip32-0.07
http://search.cpan.org/~esh/Crypt-Skip32-0.07/
32-bit block cipher based on Skipjack
----
Fey-DBIManager-0.05
http://search.cpan.org/~drolsky/Fey-DBIManager-0.05/
Manage a set of DBI handles
----
Fey-ORM-0.06
http://search.cpan.org/~drolsky/Fey-ORM-0.06/
A Fey-based ORM
----
Games-Go-Coordinate-0.04
http://search.cpan.org/~marcel/Games-Go-Coordinate-0.04/
represents a board coordinate in the game of Go
----
Games-Go-Rank-0.05
http://search.cpan.org/~marcel/Games-Go-Rank-0.05/
represents a player's rank in the game of Go
----
Gtk2-Ex-Clock-3
http://search.cpan.org/~kryde/Gtk2-Ex-Clock-3/
simple digital clock widget
----
HTML-Menu-TreeView-1.00
http://search.cpan.org/~lze/HTML-Menu-TreeView-1.00/
----
HTTP-Server-Simple-Er-v0.0.2
http://search.cpan.org/~ewilhelm/HTTP-Server-Simple-Er-v0.0.2/
lightweight server and interface
----
IO-Lambda-0.18
http://search.cpan.org/~karasik/IO-Lambda-0.18/
non-blocking I/O in lambda style
----
Log-Fine-0.10
http://search.cpan.org/~cfuhrman/Log-Fine-0.10/
Yet another logging framework
----
MARC-Charset-0.99
http://search.cpan.org/~esummers/MARC-Charset-0.99/
convert MARC-8 encoded strings to UTF-8
----
MARC-Errorchecks-1.14
http://search.cpan.org/~eijabb/MARC-Errorchecks-1.14/
Collection of MARC 21/AACR2 error checks
----
Module-Install-0.74
http://search.cpan.org/~adamk/Module-Install-0.74/
Standalone, extensible Perl module installer
----
Moose-0.45
http://search.cpan.org/~stevan/Moose-0.45/
A postmodern object system for Perl 5
----
MooseX-AttributeHelpers-0.09
http://search.cpan.org/~stevan/MooseX-AttributeHelpers-0.09/
Extend your attribute interfaces
----
MooseX-Daemonize-0.07
http://search.cpan.org/~stevan/MooseX-Daemonize-0.07/
Role for daemonizing your Moose based application
----
MooseX-Getopt-0.13
http://search.cpan.org/~stevan/MooseX-Getopt-0.13/
A Moose role for processing command line options
----
MooseX-MetaDescription-0.03
http://search.cpan.org/~stevan/MooseX-MetaDescription-0.03/
A framework for adding additional metadata to Moose classes
----
MooseX-Storage-0.13
http://search.cpan.org/~stevan/MooseX-Storage-0.13/
A serialization framework for Moose classes
----
MySQL-Log-ParseFilter-1.00
http://search.cpan.org/~dnichter/MySQL-Log-ParseFilter-1.00/
Parse and filter MySQL slow, general and binary logs
----
Net-FriendFeed-0.81
http://search.cpan.org/~kappa/Net-FriendFeed-0.81/
Perl interface to FriendFeed.com API
----
PerlIO-Util-0.16
http://search.cpan.org/~gfuji/PerlIO-Util-0.16/
A selection of general PerlIO utilities
----
Persistence-Entity-0.02
http://search.cpan.org/~adrianwit/Persistence-Entity-0.02/
Persistence API for perl classes.
----
Regexp-Common-time-0.03
http://search.cpan.org/~roode/Regexp-Common-time-0.03/
Date and time regexps.
----
Test-File-1.24_03
http://search.cpan.org/~bdfoy/Test-File-1.24_03/
test file attributes
----
Test-GlassBox-Heavy-0.04
http://search.cpan.org/~oliver/Test-GlassBox-Heavy-0.04/
Non-invasive testing of subroutines within Perl programs
----
Time-Normalize-0.06
http://search.cpan.org/~roode/Time-Normalize-0.06/
Convert time and date values into standardized components.
----
WWW-YourFileHost-0.04
http://search.cpan.org/~yusukebe/WWW-YourFileHost-0.04/
Get video information from YourFileHost
----
WebService-Backlog-0.04
http://search.cpan.org/~yamamoto/WebService-Backlog-0.04/
Perl interface to Backlog.
----
WebService-Simple-0.12
http://search.cpan.org/~yusukebe/WebService-Simple-0.12/
Simple Interface To Web Services APIs
----
WebService-Simple-Google-Chart-0.03
http://search.cpan.org/~yusukebe/WebService-Simple-Google-Chart-0.03/
Get Google Chart URL and image file
----
ack-1.84
http://search.cpan.org/~petdance/ack-1.84/
grep-like text finder
----
mobirc-0.10
http://search.cpan.org/~tokuhirom/mobirc-0.10/
modern IRC to HTTP gateway
----
mobirc-0.11
http://search.cpan.org/~tokuhirom/mobirc-0.11/
modern IRC to HTTP gateway
----
smtm_1.6.10
http://search.cpan.org/~edd/smtm_1.6.10/
Display and update a configurable ticker of global stock quotes
If you're an author of one of these modules, please submit a detailed
announcement to comp.lang.perl.announce, and we'll pass it along.
This message was generated by a Perl program described in my Linux
Magazine column, which can be found on-line (along with more than
200 other freely available past column articles) at
http://www.stonehenge.com/merlyn/LinuxMag/col82.html
print "Just another Perl hacker," # the original
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion
------------------------------
Date: Mon, 26 May 2008 01:28:02 -0700 (PDT)
From: Grouppy <martin.rysanek@gmail.com>
Subject: Partial e-mail body receiving by SMTP Server?
Message-Id: <a1c42eda-f4db-4f4f-bb3c-a158d0fe83f4@a70g2000hsh.googlegroups.com>
I would like to set up an SMTP server for processing e-mails (my own
SMTP server). E-mails will be forwarded from well-known Internet mail
servers to my SMTP server, but only a fragment of the body is
important for the processing (say the first 10 kB of the e-mail; just
text, not attachments). The rest of the transferred e-mail should be
discarded during the communication, otherwise it could exhaust the
communication capacity of my server (which has 1M e-mails to deliver
per hour).
Is it possible, at the protocol level, for my SMTP server to receive
only roughly the first 10 kB of a message and discard the rest, while
the sending Internet mail server still believes the e-mail was
delivered correctly?
Any ideas or protocol remarks to be considered would be welcome.
Regards
Martin Rysanek
------------------------------
Date: Mon, 26 May 2008 04:38:12 -0700 (PDT)
From: "ram@pragatee.com" <ramprasad.ap@gmail.com>
Subject: Re: Partial e-mail body receiving by SMTP Server?
Message-Id: <d8b35087-fc39-414a-899e-53c7b32870db@f24g2000prh.googlegroups.com>
On May 26, 1:28 pm, Grouppy <martin.rysa...@gmail.com> wrote:
> [question about accepting only the first 10 kB of each incoming
> message snipped]
What are you trying to do? Read a spamtrap mailbox?
That may not be relevant here, but it is usually not the best idea to
write your own SMTP server. YMMV.
Use a Postfix server and pipe the mail to a Perl script if you like;
the script can then read the mail up to 10 kB and discard the rest.
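A sketch of that pipe script's core, assuming Postfix's pipe(8)
delivers the message on STDIN (keep_head and the 10 kB limit are
illustrative; the demo uses an in-memory handle in place of STDIN):

```perl
use strict;
use warnings;

# Read at most $limit bytes from $fh, then drain the remainder, so the
# delivering side still sees the whole message accepted.
sub keep_head {
    my ($fh, $limit) = @_;
    my $kept = '';
    while ( read($fh, my $buf, 4096) ) {
        my $need = $limit - length $kept;
        if ( length($buf) >= $need ) {
            $kept .= substr($buf, 0, $need);
            last;
        }
        $kept .= $buf;
    }
    1 while read($fh, my $junk, 65536);   # discard the rest
    return $kept;
}

# Demo with an in-memory handle standing in for STDIN:
my $message = "x" x 20_000;
open my $fh, '<', \$message or die $!;
my $head = keep_head($fh, 10 * 1024);
print length($head), " bytes kept\n";     # prints "10240 bytes kept"
```

In the real pipe script you would call keep_head(\*STDIN, 10 * 1024)
and hand $head to your processing code; exiting 0 tells Postfix the
delivery succeeded.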
Thanks
Ram
PS:
Note to Spammers: Go ahead , send me spam
ram@pragatee.com
http://pragatee.com
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 1578
***************************************