[30582] in Perl-Users-Digest


Perl-Users Digest, Issue: 1825 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Aug 29 00:09:48 2008

Date: Thu, 28 Aug 2008 21:09:07 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Thu, 28 Aug 2008     Volume: 11 Number: 1825

Today's topics:
        [Q] How to seach an unmapped network drive <LeeHwasoo@gmail.com>
    Re: perl multithreading performance <cartercc@gmail.com>
    Re: perl multithreading performance <tzz@lifelogs.com>
    Re: perl multithreading performance <glex_no-spam@qwest-spam-no.invalid>
    Re: perl multithreading performance <bugbear@trim_papermule.co.uk_trim>
    Re: perl multithreading performance <fawaka@gmail.com>
    Re: perl multithreading performance <fawaka@gmail.com>
    Re: perl multithreading performance <m@rtij.nl.invlalid>
    Re: perl multithreading performance xhoster@gmail.com
    Re: perl threads <zentara@highstream.net>
    Re: perl threads <nitte.sudhir@gmail.com>
    Re: Question in Perl <news1234@free.fr>
    Re: Question in Perl <news1234@free.fr>
    Re: recursive filehandle <w.nijs@alf4all.demon.nl>
        Soap:lite does it support DIME (attachments) qa4ever@gmail.com
    Re: subprocesses lifecycle <hansmu@xs4all.nl>
    Re: subprocesses lifecycle <whynot@pozharski.name>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Thu, 28 Aug 2008 12:15:36 -0700 (PDT)
From: Back9 <LeeHwasoo@gmail.com>
Subject: [Q] How to seach an unmapped network drive
Message-Id: <ed231f68-3e76-4b25-956d-5f8cf897a04b@25g2000hsx.googlegroups.com>

Hi,

Does anyone know how to search for an unmapped network drive in Windows
XP in order to map it?

Thanks,
Hwasoo


------------------------------

Date: Thu, 28 Aug 2008 04:45:56 -0700 (PDT)
From: cartercc <cartercc@gmail.com>
Subject: Re: perl multithreading performance
Message-Id: <f0393fd2-e2e4-4b6a-81ca-90de7e672538@d77g2000hsb.googlegroups.com>

On Aug 27, 5:53 pm, Martijn Lievaart <m...@rtij.nl.invlalid> wrote:

> Perl threading, well frankly, sucks. You may want to switch to another
> language with re support that meets your needs. I would go for C++ (with
> boost), but then I know that language very well.

I've been playing with Erlang. In this case, you could probably spawn
separate threads per line and have them all run concurrently. I
haven't done a 'real' project (yet) but I've written some toy scripts
that tear through large files in fractions of milliseconds.

CC


------------------------------

Date: Thu, 28 Aug 2008 08:39:57 -0500
From: Ted Zlatanov <tzz@lifelogs.com>
Subject: Re: perl multithreading performance
Message-Id: <868wuhgswi.fsf@lifelogs.com>

On Wed, 27 Aug 2008 23:53:09 +0200 Martijn Lievaart <m@rtij.nl.invlalid> wrote: 

ML> On Wed, 27 Aug 2008 14:15:34 -0700, dniq00 wrote:
>> Nope, it doesn't :( I already have the single-threaded script, which has
>> been working for years now, but the amount of logs it needs to process
>> keeps growing, and I'm basically at the point where it can only keep up
>> with the speed with which logs are being written, so if there's back-log
>> for whatever reason - it might not catch up, so I'm looking into how I
>> can improve its performance.

ML> Perl threading, well frankly, sucks. You may want to switch to another 
ML> language with re support that meets your needs. I would go for C++ (with 
ML> boost), but then I know that language very well.

Hadoop is a nice non-Perl framework for this kind of work.

Ted


------------------------------

Date: Thu, 28 Aug 2008 10:32:02 -0500
From: "J. Gleixner" <glex_no-spam@qwest-spam-no.invalid>
Subject: Re: perl multithreading performance
Message-Id: <48b6c4f3$0$89384$815e3792@news.qwest.net>

dniq00@gmail.com wrote:
> Hello, oh almighty perl gurus!
> 
> I'm trying to implement multithreaded processing for the humongous
> amount of logs that I'm currently processing in 1 process on a 4-CPU
> server.
> 
> What the script does is for each line it checks if the line contains
> GET request, and if it does - goes through a list of pre-compiled
> regular expressions, trying to find a matching one. [...]

> Any ideas why in the world it's so slow? I did some research and
> couldn't find a lot of info, other than the way I do it pretty much
> the way it should be done, unless I'm missing something...

Another, much easier/faster approach, would be:

grep ' GET ' file | your_script.pl

The earlier you can filter out the work that's needed, the better, and 
you're not going to get much faster than grep.  The more refined you
can make that initial filtering of data to only send lines you're
interested in, to your program, the better.
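
The same idea can also be applied inside the script itself. What follows
is only a minimal sketch: the pattern list, the sample line layout and
the name classify() are made up for illustration and are not the OP's
code. It runs a cheap fixed-string test before any regex is tried.

```perl
use strict;
use warnings;

# Hypothetical patterns standing in for the OP's pre-compiled list.
my @patterns = (qr{/index\.html}, qr{/images/}, qr{\.cgi\?});

# Return the index of the first matching pattern, or undef when the line
# fails the cheap test or matches none of the patterns.
sub classify {
    my ($line) = @_;

    # Cheap test first: a fixed-string search with index() costs far less
    # than running every regex against a line that cannot match anyway.
    return if index($line, ' GET ') < 0;

    for my $i (0 .. $#patterns) {
        return $i if $line =~ $patterns[$i];
    }
    return;
}

# Made-up sample lines; a real log format will differ.
for my $line ('host1 GET /images/logo.png 200',
              'host2 POST /form.cgi?x=1 200',
              'host3 GET /about.txt 200') {
    my $hit = classify($line);
    print defined $hit ? "pattern $hit" : "skipped", ": $line\n";
}
```

Whether this beats an external grep depends on the data, so it is worth
timing both on a real log.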


------------------------------

Date: Thu, 28 Aug 2008 17:07:39 +0100
From: bugbear <bugbear@trim_papermule.co.uk_trim>
Subject: Re: perl multithreading performance
Message-Id: <SMidncb4f9zWUCvVnZ2dnUVZ8uWdnZ2d@posted.plusnet>

J. Gleixner wrote:
> Another, much easier/faster approach, would be:
> 
> grep ' GET ' file | your_script.pl
> 
> The earlier you can filter out the work that's needed, the better, and 
> you're not going to get much faster than grep.  The more refined you
> can make that initial filtering of data to only send lines you're
> interested in, to your program, the better.

As Jon Bentley summarised it:

protect expensive tests with cheap tests.

   BugBear


------------------------------

Date: 28 Aug 2008 17:49:51 GMT
From: Leon Timmermans <fawaka@gmail.com>
Subject: Re: perl multithreading performance
Message-Id: <48b6e53f$0$194$e4fe514c@news.xs4all.nl>

On Wed, 27 Aug 2008 14:25:32 -0700, dniq00 wrote:
> Thanks for the link - trying to figure out whattahellisgoingon there :)
> Looks like he's basically mmaps the input and begins reading it starting
> at different points. Thing is, I'm using <> as input, which can contain
> hundreds of gigabytes of data, so I'm not sure how's that going to work
> out...

Is your computer 64 or 32 bits? In the former case mmap will work for 
such large files, but in the latter it won't. In that case it may not be a 
bad idea to split the log files into chunks that do fit into your memory 
space. An additional advantage of that would be that you may not need to 
use threads at all.

Regards,

Leon


------------------------------

Date: 28 Aug 2008 19:26:28 GMT
From: Leon Timmermans <fawaka@gmail.com>
Subject: Re: perl multithreading performance
Message-Id: <48b6fbe4$0$194$e4fe514c@news.xs4all.nl>

On Wed, 27 Aug 2008 23:53:09 +0200, Martijn Lievaart wrote:

> Perl threading, well frankly, sucks. You may want to switch to another
> language with re support that meets your needs.

Some would say all threading sucks. All approaches are either hard to get
proper performance from or hard to get correct. At least the queue
approach Perl promotes gets one of them right.

Also, let's not forget that Perl at least supports preemptive threading.
Ruby doesn't at all, and Python has a giant interpreter lock, making it
useless for this kind of problem.

Regards,

Leon Timmermans


------------------------------

Date: Thu, 28 Aug 2008 22:28:30 +0200
From: Martijn Lievaart <m@rtij.nl.invlalid>
Subject: Re: perl multithreading performance
Message-Id: <pan.2008.08.28.20.28.30@rtij.nl.invlalid>

On Thu, 28 Aug 2008 19:26:28 +0000, Leon Timmermans wrote:

> On Wed, 27 Aug 2008 23:53:09 +0200, Martijn Lievaart wrote:
> 
>> Perl threading, well frankly, sucks. You may want to switch to another
>> language with re support that meets your needs.
> 
> Some would say all threading sucks. All approaches are either hard to
> get a proper performance from or hard to get correct. At least the queue
> approach perl promotes gets one of them right.

Well, Perl threading has its uses (and maybe this use case is one of 
them), but it has severe limitations. For instance, signals are out. That 
alone was the killer in each and every case I thought I could use threads 
in Perl.

Threading in general doesn't suck. It's hard to get right until you get 
some basic understanding, but after that I find threading a valuable tool 
in the toolbox.

Perl threading does suck in my opinion; I didn't know Python threading 
sucked harder.

M4



------------------------------

Date: 28 Aug 2008 23:27:56 GMT
From: xhoster@gmail.com
Subject: Re: perl multithreading performance
Message-Id: <20080828192759.156$Ws@newsreader.com>

Leon Timmermans <fawaka@gmail.com> wrote:
> On Wed, 27 Aug 2008 23:53:09 +0200, Martijn Lievaart wrote:
>
> > Perl threading, well frankly, sucks. You may want to switch to another
> > language with re support that meets your needs.
>
> Some would say all threading sucks. All approaches are either hard to get
> a proper performance from or hard to get correct. At least the queue
> approach perl promotes gets one of them right.
>
> Also lets not forget that Perl at least supports preemptive threading.
> Ruby doesn't at all and python has a giant interpreter lock, making it
> useless for this kind of problem.

I fleshed out the OP's example code to make it runnable, using a simple
foreach (1..400) {}; to simulate the processing of each line in the
consumer threads (400 because that is what provided a throughput of 30_000
per second in a simple non-threaded model) and was pleasantly surprised.
I got a substantial speed up by using threading, with a factor of 3
improvement in throughput by using $cpu_count=4 (4 consumer threads, plus
main thread).

I still wouldn't use threads on my own code for something like this,
though. I'd just start 4 processes assigning each a different chunk of the
data.
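
A rough sketch of that multi-process approach, assuming the input is a
regular file that can be seeked (so it would not work on a stream read
through <>). The names process_chunk() and count_in_parallel(), the
' GET ' test and the temporary demo file are all invented for
illustration. Each worker takes one byte range and handles the lines
whose first byte falls inside it.

```perl
use strict;
use warnings;
use POSIX ();
use File::Temp qw(tempfile);

# Count lines containing ' GET ' among the lines that start in the byte
# range [$start, $end) of $file.
sub process_chunk {
    my ($file, $start, $end) = @_;
    open my $fh, '<', $file or die "open $file: $!";
    if ($start > 0) {
        # Step back one byte and discard through the next newline, so
        # reading begins at the first line starting at or after $start.
        seek $fh, $start - 1, 0 or die "seek: $!";
        <$fh>;
    }
    my $count = 0;
    while (tell($fh) < $end) {
        my $line = <$fh>;
        last unless defined $line;
        $count++ if index($line, ' GET ') >= 0;
    }
    close $fh;
    return $count;
}

# Fork $workers children, one per byte range, and add up their counts.
sub count_in_parallel {
    my ($file, $workers) = @_;
    my $size  = (-s $file) || 0;
    my $chunk = int($size / $workers) + 1;
    my @readers;
    for my $i (0 .. $workers - 1) {
        my $start = $i * $chunk;
        my $end   = $start + $chunk;
        $end = $size if $end > $size;
        # Forking open: the child's STDOUT becomes the parent's $reader.
        my $pid = open my $reader, '-|';
        die "fork: $!" unless defined $pid;
        if ($pid == 0) {
            local $| = 1;    # unbuffered, because _exit skips the flush
            print process_chunk($file, $start, $end), "\n";
            POSIX::_exit(0);
        }
        push @readers, $reader;
    }
    my $total = 0;
    for my $reader (@readers) {
        my $n = <$reader>;
        die "worker returned nothing" unless defined $n;
        chomp $n;
        $total += $n;
        close $reader;       # also reaps that child
    }
    return $total;
}

# Demo on a small temporary log in which every second line has ' GET '.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} ($_ % 2 ? "host GET /page$_ 200\n" : "host POST /form$_ 200\n")
    for 1 .. 1000;
close $tmp;
my $total = count_in_parallel($path, 4);
print "matching lines: $total\n";
```

Because the workers share nothing, the only coordination needed is
collecting one number from each of them at the end.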


Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.


------------------------------

Date: Thu, 28 Aug 2008 08:57:42 -0400
From: zentara <zentara@highstream.net>
Subject: Re: perl threads
Message-Id: <ck7db41uaaqhfr4hmet1eg69jo23cepe8o@4ax.com>

On Thu, 28 Aug 2008 02:02:38 -0700 (PDT), kath <nitte.sudhir@gmail.com>
wrote:

>know 'Perl ithreads are not lightweight!', i am using carefully, in
>the sense, I'm not using any shared variables between threads. Each
>thread will do dependency calculation, which will produces an output
>in a file separately. Later i parse those files to get dependency list
>for each project.

First, I don't use win32; I use linux, but the advice should be the same
in this case.
If you don't share variables between threads, and you are writing
results to a file, to be processed later, you don't need threads.
Forking an independent process is better. (I realize on win32 it's all
threads, but Perl imposes its own weight when using threads, so you
are better off with independent processes.)

See the part on Win32::Process in
 http://perlmonks.org?node_id=500663


>
>Problem:
>Sometimes* the script waits forever or the script just hangs. That is
>waiting for threads. And cant continue with other scenario.

Threads must reach the end of their code blocks, or return, in order
for them to be joined. Maybe try detached threads, or use a shared
variable to signal them to return.
>
>[code]
>#this way i create threads
>#foreach scenario() ...{
>map {my $th = threads->create(\&worker, $_)} @$proj_arr; #proj_arr has
>projects of a scenario
># ...}
>
># This is how i wait for all threads to finish its job
>print "waiting for threads to finish";
>map {my $k = $_->join} threads->list;
>[/code]
>
>Is there a way i can overcome this? Or my perception about cause for

Somewhere in your worker code block, you must listen for a shared
variable, telling the thread to immediately return. Later threads
versions have kill signals you can send to threads, but I don't know if
they work on win32. A thread will not end, just by telling it to join.
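
A minimal sketch of the shared-variable approach, assuming a perl built
with ithreads. The worker body is a stand-in, not the poster's
dependency calculation, and the names $stop and worker() are
hypothetical.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Shared flag: the main thread sets it to ask every worker to return.
my $stop :shared = 0;

sub worker {
    my ($id) = @_;
    my $iterations = 0;
    until ($stop) {
        $iterations++;                        # stand-in for one unit of work
        select(undef, undef, undef, 0.01);    # sleep for 10 ms
    }
    return $iterations;                       # returning makes join() succeed
}

my @workers = map { threads->create(\&worker, $_) } 1 .. 3;

sleep 1;                      # let the workers run for a moment
{
    lock($stop);
    $stop = 1;                # every worker sees this on its next check
}

# join() blocks only until each worker notices the flag and returns.
my @results = map { $_->join } @workers;
print "joined ", scalar(@results), " threads\n";
```

The worker has to check the flag often enough; a worker stuck in one
long blocking call will still keep join() waiting.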

zentara

-- 
I'm not really a human, but I play one on earth.
http://zentara.net/Remember_How_Lucky_You_Are.html 


------------------------------

Date: Thu, 28 Aug 2008 19:58:04 -0700 (PDT)
From: kath <nitte.sudhir@gmail.com>
Subject: Re: perl threads
Message-Id: <ec1582c5-8a21-41ae-99e7-b0d5893e053f@z11g2000prl.googlegroups.com>

Hi,
First, thanks for your time.

On Aug 28, 5:57 pm, zentara <zent...@highstream.net> wrote:
> Somewhere in your worker code block, you must listen for a shared
> variable, telling the thread to immediately return. Later threads
> versions have kill signals you can send to threads, but I don't know if
> they work on win32.
Will work on this.

>A thread will not end, just by telling it to join.
I didn't know this. Maybe my understanding was wrong.

Thanks again,
katharnakh.


------------------------------

Date: Fri, 29 Aug 2008 01:10:21 +0200
From: nntpman68 <news1234@free.fr>
To: Sherm Pendley <spamtrap@dot-app.org>
Subject: Re: Question in Perl
Message-Id: <48B7305D.4070602@free.fr>

And for those who are not used to the command line, but who are always
online:


http://perldoc.perl.org/ However, be aware that this is the
documentation for perl 5.10.0, and the version on your host might be older.

bye


N



Sherm Pendley wrote:
> bugbear <bugbear@trim_papermule.co.uk_trim> writes:
> 
>> Jürgen Exner wrote:
>>> leejinhoo@gmail.com wrote:
>>>> On Aug 27, 10:27 am, bugbear <bugbear@trim_papermule.co.uk_trim>
>>>> wrote:
>>>>> leejin...@gmail.com wrote:
>>>>>> Where can I find "perldoc perldebug"? I keep searching them but cannot
>>>>>> find them. Thanks!
>>>>> just type it on your command line.
>>>> Is there any "Help" function that I can use to search for explanations
>>>> for variables? 
>>> See 'perldoc perlvar':
>>>
>>> 	NAME
>>>     		perlvar - Perl predefined variables
>>>
>> And probably 'perldoc perldoc'
>>
>>  :-)
> 
> And, of course, "perldoc perl" for a list of all the available docs.
> 
> Also note that, if you're using ActiveState's Perl on Windows, there
> are (or were, when I last looked - it's been a while) links to the
> included docs in the Start menu.
> 
> sherm--
> 


------------------------------

Date: Thu, 28 Aug 2008 21:33:56 +0200
From: Wijnand Nijs <w.nijs@alf4all.demon.nl>
Subject: Re: recursive filehandle
Message-Id: <48b6fda9$0$192$e4fe514c@news.xs4all.nl>

Randal L. Schwartz schreef:
>>>>>> "Wijnand" == Wijnand Nijs <w.nijs@alf4all.demon.nl> writes:
> 
> Wijnand> I have a problem with the next recursive subroutine used in a mailing
> Wijnand> script:
> 
> That's almost identical to one of my 254 Perl columns at
> http://www.stonehenge.com/merlyn/UnixReview/col19.html - you might want to
> read those, and others, for further insight.
> 

Thanks Randal, your columns are very helpful.

Greetings...
Wijnand


------------------------------

Date: Thu, 28 Aug 2008 04:17:03 -0700 (PDT)
From: qa4ever@gmail.com
Subject: Soap:lite does it support DIME (attachments)
Message-Id: <5c84008f-dea3-4ac9-b777-8db2c45742c3@m3g2000hsc.googlegroups.com>

Quick question: does SOAP::Lite support DIME (attachments)?
Has anybody got some example code exercising it against a public
internet service?

If not, is there any other Perl library that does?

Thanks,
QA4Ever


------------------------------

Date: Thu, 28 Aug 2008 19:43:27 +0200
From: Hans Mulder <hansmu@xs4all.nl>
Subject: Re: subprocesses lifecycle
Message-Id: <48b6e449$0$199$e4fe514c@news.xs4all.nl>

Matthieu Imbert wrote:

> But it does not explain why in your example the parent script returns
> immediately when calling die, while in my case the parent script waits
> for children to end before returning. I thought that this could be
> related to the way you create child processes (with fork), whereas i
> create then with open. But this little test script returns immediately:
> 
> perl -e '
> open (CHILD,"sleep 30 |");
> die "byebye";
> '

By contrast, if I do this:

perl -e '
open my $child ,"sleep 30 |";
die "byebye";
'
, then I have to wait 30 seconds.

It looks like when my $child goes out of scope, perl closes the handle
and this implies waiting for the child to finish and then setting $?.

I would have thought your example should behave the same, but it doesn't
(not on my machine anyway).

Perhaps you need a double fork.  That is, your child could fork and then
the original child exits immediately, letting the grandchild do the real
work.  That way, your script won't have to wait when it decides to close
the $child handle.

What you'd really want, is a way to tell C<open> that you don't want
C<close> to wait for this child.  As far as I know, there is currently
no simple way to achieve that.
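
Here is a sketch of the double fork, under the assumption that a
forking open ('-|') fits your script; "sleep 3" stands in for the real
work. I would expect close to return promptly here, since the only
process it waits for is the intermediate child.

```perl
use strict;
use warnings;
use POSIX ();
use Time::HiRes qw(time);

# Forking open: returns the child's pid in the parent and 0 in the
# child, whose STDOUT is connected to $child.
my $pid = open my $child, '-|';
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # First child: fork again and leave at once.
    my $grandchild = fork;
    POSIX::_exit(1) unless defined $grandchild;
    POSIX::_exit(0) if $grandchild;      # the intermediate child exits

    # Grandchild: does the slow work, still holding the pipe as STDOUT.
    exec('sleep', '3') or POSIX::_exit(1);
}

my $t0 = time;
close $child;                # reaps only the intermediate child
my $elapsed = time - $t0;
printf "close returned after %.2f seconds\n", $elapsed;
```

The grandchild is no longer a child of the script, so the script cannot
wait for it or collect its exit status. If the grandchild writes to the
pipe after the parent has closed it, it gets SIGPIPE.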

Hope this helps,

-- HansM




> 
> So the problem must come from something else. i have to understand why
> it behaves differently in my first script (i'll try to isolate the
> simplest reproducible demonstration code of the problem).
> 
> Currently, as a workaround, i added code that finds all subprocesses of
> my script and sends TERM, then wait 10s, then send KILL to all of them
> 
> 
> Matthieu


------------------------------

Date: Fri, 29 Aug 2008 00:06:54 +0300
From: Eric Pozharski <whynot@pozharski.name>
Subject: Re: subprocesses lifecycle
Message-Id: <ekgjo5xdk8.ln2@carpet.zombinet>

Matthieu Imbert <breafk@remove.this.gmail.com> wrote:
> Eric Pozharski wrote:
*SKIP*
> In your example code, the child process stays alive after the end of
> parent process. As there are probably 30 to 40 lines in /etc/passwd
> and it sleeps 1 second for each line, it's not surprising that it
> takes about half a minute to end and die.

Positive.  My fault.  I've moved B<sleep> in child before B<while> and
the child exits immediately (with regard to B<sleep> of course).  What I
don't understand is why the child successfully writes to the pipe.  Isn't
the pipe closed if the reader exits?  I obviously don't grok pipes.

*SKIP*
> So the problem must come from something else. i have to understand why
> it behaves differently in my first script (i'll try to isolate the
> simplest reproducible demonstration code of the problem).

Consider reviewing the list of modules loaded.  There is B<waitpid> or
B<wait> somewhere.  Consider reviewing bugreports of B<perl> for your
distribution (there's such thing as distribution specific quirks, you
know).

Anyway, I wish you good luck (hunting for such things is a big
challenge).  At the very least, your understanding of Perl will improve a lot.

*CUT*

-- 
Torvalds' goal for Linux is very simple: World Domination


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 1825
***************************************

