
Perl-Users Digest, Issue: 1829 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sun Aug 31 14:09:51 2008

Date: Sun, 31 Aug 2008 11:09:14 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sun, 31 Aug 2008     Volume: 11 Number: 1829

Today's topics:
        Multiple processes and tie'd files <tuctboh@gmail.com>
    Re: Multiple processes and tie'd files xhoster@gmail.com
    Re: Multiple processes and tie'd files <tuctboh@gmail.com>
    Re: Multiple processes and tie'd files xhoster@gmail.com
    Re: Multiple processes and tie'd files <alex@digriz.org.uk>
    Re: Multiple processes and tie'd files <fawaka@gmail.nl>
    Re: Multiple processes and tie'd files <tuctboh@gmail.com>
    Re: Multiple processes and tie'd files <tuctboh@gmail.com>
    Re: Multiple processes and tie'd files <tuctboh@gmail.com>
        new CPAN modules on Sun Aug 31 2008 (Randal Schwartz)
    Re: perl multithreading performance <nospam-abuse@ilyaz.org>
    Re: subprocesses lifecycle <whynot@pozharski.name>
    Re: subprocesses lifecycle xhoster@gmail.com
    Re: subprocesses lifecycle <whynot@pozharski.name>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Sat, 30 Aug 2008 18:29:39 -0700 (PDT)
From: Tuc <tuctboh@gmail.com>
Subject: Multiple processes and tie'd files
Message-Id: <50f03f0b-796e-4b7e-937c-951fbc169e14@p25g2000hsf.googlegroups.com>

Hi,

      I'm running into an issue when using a file I've tied and there
are multiple long-running processes. I first ran into it with Squid as
a redirection program (never resolved it), and now with MimeDefang.

      When I tie to a DB_File, if one of the processes or even an
external process updates the file, the persistent processes aren't
seeing the update. I have to stop them and restart them for that to
happen. Sorta defeats the whole reason for using a tie'd file, I could
just put it into a hash.

     I've tried using the "sync" method on the handle for the tie,
before and after every read, still with no luck.

     Short of going to mysql (Which is like trying to swat a fly with
the supercollider) is there another option?

                     Thanks, Tuc



------------------------------

Date: 31 Aug 2008 02:20:18 GMT
From: xhoster@gmail.com
Subject: Re: Multiple processes and tie'd files
Message-Id: <20080830222022.210$uq@newsreader.com>

Tuc <tuctboh@gmail.com> wrote:
> Hi,
>
>       I'm running into an issue when using a file I've tied and there
> are multiple long-running processes. I first ran into it with Squid as
> a redirection program (never resolved it), and now with MimeDefang.
>
>       When I tie to a DB_File, if one of the processes or even an
> external process updates the file, the persistent processes aren't
> seeing the update. I have to stop them and restart them for that to
> happen.

Have you read the documentation for DB_File?

> Sorta defeats the whole reason for using a tie'd file, I could
> just put it into a hash.

If that is the "whole" reason you are using DB_File, then you shouldn't
be using DB_File in the first place.

>
>      I've tried using the "sync" method on the handle for the tie,
> before and after every read, still with no luck.

sync syncs up memory changes to the disk.  I don't think it is supposed to
sync disk changes back to memory.

>
>      Short of going to mysql (Which is like trying to swat a fly with
> the supercollider) is there another option?

Mysql is not a super-collider, it is a very light-weight fly swatter.  What
you are trying to do with DB_File is like trying to swat a fly with a
pencil sharpener.

Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.


------------------------------

Date: Sat, 30 Aug 2008 20:22:27 -0700 (PDT)
From: Tuc <tuctboh@gmail.com>
Subject: Re: Multiple processes and tie'd files
Message-Id: <87a1e799-6950-4307-a35b-c5851845e79b@b1g2000hsg.googlegroups.com>

On Aug 30, 10:20 pm, xhos...@gmail.com wrote:
> Tuc <tuct...@gmail.com> wrote:
> >     When I tie to a DB_File, if one of the processes or even an
> > external process updates the file, the persistent processes aren't
> > seeing the update. I have to stop them and restart them for that to
> > happen.
>
> Have you read the documentation for DB_File?
>
     I did, way back. Then 10 minutes after I posted I read it again
and found the section that said "Hey, Tuc, you can't do that with
DB_File".
>
> > Sorta defeats the whole reason for using a tie'd file, I could
> > just put it into a hash.
>
> If that is the "whole" reason you are using DB_File, then you shouldn't
> be using DB_File in the first place.
>
     What should I be using then? I need something that I can query by
key and get data back. It needs to be accessible from multiple programs,
and easily updated without modifying the program. I need it to be
fast/lightweight/not require any additional processes running.
>
>
> >      I've tried using the "sync" method on the handle for the tie,
> > before and after every read, still with no luck.
>
> sync syncs up memory changes to the disk.  I don't think it is supposed to
> sync disk changes back to memory.
>
     Had hoped.
>
>
> >      Short of going to mysql (Which is like trying to swat a fly with
> > the supercollider) is there another option?
>
> Mysql is not a super-collider, it is a very light-weight fly swatter.  What
> you are trying to do with DB_File is like trying to swat a fly with a
> pencil sharpener.
>
      Doesn't make sense to start an instance of Mysql for a table
that will probably be 75-100 entries.

      So what do you suggest to be able to do this? Just "open, while,
close" a text file?

      I was also trying to keep with DB_File since another program was
actually generating it, DB_File being the only available format. I might
be able to (and it looks like I might have to, unless I want to keep 2
copies) remove the usage of the file from the other program.

                           Tuc


------------------------------

Date: 31 Aug 2008 04:15:41 GMT
From: xhoster@gmail.com
Subject: Re: Multiple processes and tie'd files
Message-Id: <20080831001545.045$tG@newsreader.com>

Tuc <tuctboh@gmail.com> wrote:
> On Aug 30, 10:20 pm, xhos...@gmail.com wrote:
> > Tuc <tuct...@gmail.com> wrote:
> > >     When I tie to a DB_File, if one of the processes or even an
> > > external process updates the file, the persistent processes aren't
> > > seeing the update. I have to stop them and restart them for that to
> > > happen.
> >
> > Have you read the documentation for DB_File?
> >
>      I did, way back. Then 10 minutes after I posted I read it again
> and found the section that said "Hey, Tuc, you can't do that with
> DB_File".

You might be able to use DB_File; you would just need to untie and retie
each time you want to sync.  But if you have multiple concurrent accesses
(which you do, otherwise the problem wouldn't exist), then you need to do
locking as well or your database file will be corrupted.

From the DB_File docs, it sounds like Tie::DB_LockFile might be just
what you need, except that no module by that name actually seems to exist
on CPAN or anywhere else I can find.
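
For what it's worth, a minimal sketch of the retie-per-lookup approach
(the path, lock file and helper name are all invented; a sidecar lock file
is used because the DB_File docs warn against flock()ing the database's
own fd):

use strict;
use DB_File;
use Fcntl qw(:flock O_RDONLY);

# Hypothetical helper: tie, look up one key, untie.  Readers take a
# shared lock; the writer should take LOCK_EX on the same lock file.
sub db_lookup {
    my ($file, $key) = @_;
    open my $lock, '>>', "$file.lock" or die "open $file.lock: $!";
    flock $lock, LOCK_SH or die "flock $file.lock: $!";
    tie my %db, 'DB_File', $file, O_RDONLY, 0644, $DB_HASH
        or die "tie $file: $!";
    my $value = $db{$key};
    untie %db;
    close $lock;                  # releases the shared lock
    return $value;
}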


> >
> > > Sorta defeats the whole reason for using a tie'd file, I could
> > > just put it into a hash.
> >
> > If that is the "whole" reason you are using DB_File, then you shouldn't
> > be using DB_File in the first place.
> >
>      What should I be using then? I need something that I can query by
> key and get data back. It needs to be accessible from multiple programs,
> and easily updated without modifying the program. I need it to be
> fast/lightweight/not require any additional processes running.

You will probably have to compromise somewhere along that list.  But
without knowing your usage patterns, it is hard to say where.


 ...
> >
> > >      Short of going to mysql (Which is like trying to swat a fly with
> > > the supercollider) is there another option?
> >
> > Mysql is not a super-collider, it is a very light-weight fly swatter.
> > What you are trying to do with DB_File is like trying to swat a fly
> > with a pencil sharpener.
> >
>       Doesn't make sense to start an instance of Mysql for a table
> that will probably be 75-100 entries.

Database servers aren't just about size.  Allowing multiple connections to
access data quickly and concurrently without causing corruption or needless
slowness is the very reason that database servers exist.  Saying "I don't
need a database because it is only 100 rows" is like saying "I don't need
to put engine oil in my engine because I'm only going to drive 30 mph".

>       So what do you suggest to be able to do this? Just "open, while,
> close" a text file?

I don't see how this would get the job done.  There would have to be a
"print" in there someplace, or else the whole premise of your question
would be void.  And then there would have to be locking, or corruption
would happen.


>
>       I was also trying to keep with DB_File since another program was
> actually generating it, DB_File being the only available format. I might
> be able to (and it looks like I might have to, unless I want to keep 2
> copies) remove the usage of the file from the other program.

If this other program doesn't do locking and can't be made to do it
in a way compatible with your program, then you are already playing with
fire by having them touch the same DB_File file.

Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.


------------------------------

Date: Sun, 31 Aug 2008 13:58:37 +0100
From: Alexander Clouter <alex@digriz.org.uk>
Subject: Re: Multiple processes and tie'd files
Message-Id: <t4hqo5-qap.ln1@woodchuck.wormnet.eu>

xhoster@gmail.com wrote:
> Tuc <tuctboh@gmail.com> wrote:
>> Hi,
>>
>>       I'm running into an issue when using a file I've tied and there
>> are multiple long-running processes. I first ran into it with Squid as
>> a redirection program (never resolved it), and now with MimeDefang.
>>
>>       When I tie to a DB_File, if one of the processes or even an
>> external process updates the file, the persistent processes aren't
>> seeing the update. I have to stop them and restart them for that to
>> happen.
> 
> Have you read the documentation for DB_File?
>
The documentation for DB_File has *nothing* useful to say on this; it's
just general unix'y know-how that you never really get to pick up easily.
 
When I wrote a squid-based url filtering/blacklisting mcwhatsit I used
DB_File.  The important thing is to have *one* writer and many readers;
that means you can forget about locking altogether.

UNIX has this rather nice feature where, once a file is open, that FD's
'view' of the file does not change even if you delete or replace the file.
To see the changes you have to close the FD and reopen the DB_File.  In a
long-running process this is easy: just check '(stat($file))[9]' at regular
intervals to see whether the modification timestamp has changed.  If it
has, untie and retie the file and you will see your updates.

For the regular interval I would use alarm() and put the check in a handler
function; that should keep things clean without messing up the logic of
your core code.
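
Something along these lines, say (path invented; perl 5.8's deferred
signals keep the handler from firing mid-opcode, but keep it short anyway):

use strict;
use DB_File;
use Fcntl qw(O_RDONLY);

my $file = '/etc/mail/mailid.db';              # invented path
tie my %map, 'DB_File', $file, O_RDONLY, 0644, $DB_HASH
    or die "tie $file: $!";
my $mtime = (stat $file)[9];

$SIG{ALRM} = sub {
    my $now = (stat $file)[9];
    if (defined $now and $now != $mtime) {     # file was rewritten
        untie %map;
        tie %map, 'DB_File', $file, O_RDONLY, 0644, $DB_HASH
            or die "retie $file: $!";
        $mtime = $now;
    }
    alarm 60;                                  # re-arm the timer
};
alarm 60;                                      # first check in a minute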

>>      Short of going to mysql (Which is like trying to swat a fly with
>> the supercollider) is there another option?
> 
> Mysql is not a super-collider, it is a very light-weight fly swatter.  What
> you are trying to do with DB_File is like trying to swat a fly with a
> pencil sharpener.
> 
No no no, MySQL is horrible!  Putting any network-based database into the
critical loop of a realtime interactive service is a bad bad idea.  You
might get away with using sqlite but probably would still feel dirty from
the experience; DB_Files are great for this kind of task.

Cheers

Alex


------------------------------

Date: 31 Aug 2008 13:26:46 GMT
From: Leon Timmermans <fawaka@gmail.nl>
Subject: Re: Multiple processes and tie'd files
Message-Id: <48ba9c16$0$190$e4fe514c@news.xs4all.nl>

On Sat, 30 Aug 2008 20:22:27 -0700, Tuc wrote:
>      What should I be using then? I need something that I can query by
> key and get data back. It needs to be accessible from multiple programs,
> and easily updated without modifying the program. I need it to be
> fast/lightweight/not require any additional processes running.

>       Doesn't make sense to start an instance of Mysql for a table
> that will probably be 75-100 entries.
> 
>       So what do you suggest to be able to do this? Just "open, while,
> close" a text file?
> 

My advice would be to either use the BerkeleyDB module or SQLite, 
depending on your exact needs. 
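
With BerkeleyDB in particular, a "concurrent data store" environment
handles the one-writer/many-readers case without hand-rolled locking.
A minimal sketch, with invented paths and key:

use strict;
use BerkeleyDB;

# CDS environment: the library does the read/write locking internally.
my $env = BerkeleyDB::Env->new(
    -Home  => '/var/db/mailid',                        # invented directory
    -Flags => DB_CREATE | DB_INIT_CDB | DB_INIT_MPOOL,
) or die "env: $BerkeleyDB::Error";

tie my %map, 'BerkeleyDB::Hash',
    -Filename => 'mailid.db',
    -Flags    => DB_CREATE,
    -Env      => $env
    or die "tie: $BerkeleyDB::Error";

my $value = $map{'some@example.org'};                  # invented key
print defined $value ? $value : "(no match)", "\n";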

Regards,

Leon Timmermans


------------------------------

Date: Sun, 31 Aug 2008 09:01:03 -0700 (PDT)
From: Tuc <tuctboh@gmail.com>
Subject: Re: Multiple processes and tie'd files
Message-Id: <c1e062ce-a631-403d-8b4c-50c122a7b42d@8g2000hse.googlegroups.com>

On Aug 31, 12:15 am, xhos...@gmail.com wrote:
>
> You might be able to use DB_File; you would just need to untie and retie
> each time you want to sync.  But if you have multiple concurrent accesses
> (which you do, otherwise the problem wouldn't exist), then you need to do
> locking as well or your database file will be corrupted.
>
> From the DB_File docs, it sounds like Tie::DB_LockFile might be just
> what you need, except that no module by that name actually seems to exist
> on CPAN or anywhere else I can find.
>
     I was hoping not to have to incur the expense of untie/tie every
time, but it seems that for a quick/easy solution, that'll be it.

     The long-running processes are read only. An external program will
be the only one with write/update capability. (Actually, when the file
gets rebuilt it gets REBUILT; it basically looks like it re-writes the
whole file from scratch. No "delete" of records, just "open, insert*X,
close".)
>
> >      What should I be using then? I need something that I can query by
> > key and get data back. It needs to be accessible from multiple programs,
> > and easily updated without modifying the program. I need it to be
> > fast/lightweight/not require any additional processes running.
>
> You will probably have to compromise somewhere along that list.  But
> without knowing your usage patterns, it is hard to say where.
>
     The upshot is that this is part of a sendmail milter. Every mail in
or out gets run through the milter. On outbound mail, it checks to see if
the recipient is the key to a record. If so, the sender of the email is
changed to the value for that key and the mail is then sent along its way.
If there isn't a match, it checks the sender against another file, and if
there is a key match, the sender is changed to the value for that key and
sent along its way. If neither matches, it's untouched. The files are
created with sendmail's "makemap hash DBNAME < TEXTFILE".
>
> Database servers aren't just about size.  Allowing multiple connections to
> access data quickly and concurrently without causing corruption or needless
> slowness is the very reason that database servers exist.  Saying "I don't
> need a database because it is only 100 rows" is like saying "I don't need
> to put engine oil in my engine because I'm only going to drive 30 mph".
>
> >       So what do you suggest to be able to do this? Just "open, while,
> > close" a text file?
>
> I don't see how this would get the job done.  There would have to be a
> "print" in there someplace, or else the whole premise of your question
> would be void.  And then there would have to be locking, or corruption
> would happen.
>

   Be reasonable, you know there was more to it than what was said; it
was just a way to convey the idea of always opening a file, having a
while loop to go line by line through the file, and then being able to
find the key and use the data. If you need the real code:


#previous programming above here, including shebang to perl interpreter
undef $value;

open(MAILID, '<', '/etc/mail/mailid') or die "open /etc/mail/mailid: $!";
while (<MAILID>) {
  chomp;                                   # don't let the newline end up in $value
  ($key, $value) = split /\t/, $_, 2;
  last if defined $key && $key eq $lookingfor;   # exact match, not a regex
  undef $value;                            # no match on this line; discard its value
}
close(MAILID);

if (defined $value)
{
     #rest of processing here
}

>
> If this other program doesn't do locking and can't be made to do it
> in a way compatible with your program, then you are already playing with
> fire by having them touch the same DB_File file.
>
     sendmail only uses the file read-only too, though I do know it opens
the file for every email that comes through.

Tuc


------------------------------

Date: Sun, 31 Aug 2008 09:07:35 -0700 (PDT)
From: Tuc <tuctboh@gmail.com>
Subject: Re: Multiple processes and tie'd files
Message-Id: <4b2d0026-666d-4b74-aaa1-628d92044665@d1g2000hsg.googlegroups.com>

On Aug 31, 8:58 am, Alexander Clouter <a...@digriz.org.uk> wrote:
>
> The documentation for DB_File has *nothing* useful to say on this; it's
> just general unix'y know-how that you never really get to pick up easily.
>
     It does tell you to look at other options, one of which doesn't seem
to exist. :)
>
> When I wrote a squid-based url filtering/blacklisting mcwhatsit I used
> DB_File.  The important thing is to have *one* writer and many readers;
> that means you can forget about locking altogether.
>
     Exactly the first place I ever ran into this. :)  And yes, all my
processes are readers in this case. (It wasn't so in the squid case; the
first time it saw a user from a new IP it redirected them to a "Welcome"
page, then updated a file so the next request wasn't redirected.)
>
> UNIX has this rather nice feature where, once a file is open, that FD's
> 'view' of the file does not change even if you delete or replace the file.
> To see the changes you have to close the FD and reopen the DB_File.  In a
> long-running process this is easy: just check '(stat($file))[9]' at regular
> intervals to see whether the modification timestamp has changed.  If it
> has, untie and retie the file and you will see your updates.
>
> For the regular interval I would use alarm() and put the check in a handler
> function; that should keep things clean without messing up the logic of
> your core code.
>

     Interesting idea, thanks. It's probably less expensive to do that
than constantly untie/tie.
>
> No no no, MySQL is horrible!  Putting any network-based database into the
> critical loop of a realtime interactive service is a bad bad idea.  You
> might get away with using sqlite but probably would still feel dirty from
> the experience; DB_Files are great for this kind of task.
>
     Never used sqlite, but it seems like more and more people are using
it. Might be worth looking at just as a reference point.

                    Thanks, Tuc


------------------------------

Date: Sun, 31 Aug 2008 09:09:23 -0700 (PDT)
From: Tuc <tuctboh@gmail.com>
Subject: Re: Multiple processes and tie'd files
Message-Id: <519204b7-f10d-44ac-bf24-d299a6a6a471@f36g2000hsa.googlegroups.com>

On Aug 31, 9:26 am, Leon Timmermans <faw...@gmail.nl> wrote:
> On Sat, 30 Aug 2008 20:22:27 -0700, Tuc wrote:
>
> My advice would be to either use the BerkeleyDB module or SQLite,
> depending on your exact needs.
>

     I downloaded/installed BerkeleyDB shortly after reading (again) the
DB_File page. Now I have to figure out exactly how they do what I'm
looking for.

Thanks, Tuc


------------------------------

Date: Sun, 31 Aug 2008 04:42:20 GMT
From: merlyn@stonehenge.com (Randal Schwartz)
Subject: new CPAN modules on Sun Aug 31 2008
Message-Id: <K6G7qK.nw0@zorch.sf-bay.org>

The following modules have recently been added to or updated in the
Comprehensive Perl Archive Network (CPAN).  You can install them using the
instructions in the 'perlmodinstall' page included with your Perl
distribution.

Algorithm-Cluster-1.41
http://search.cpan.org/~mdehoon/Algorithm-Cluster-1.41/
Perl interface to the C Clustering Library. 
----
Astro-SIMBAD-Client-0.014
http://search.cpan.org/~wyant/Astro-SIMBAD-Client-0.014/
Fetch astronomical data from SIMBAD 4. 
----
Benchmark-Apps-0.03
http://search.cpan.org/~ambs/Benchmark-Apps-0.03/
Simple interface to benchmark applications. 
----
Bundle-Compress-Zlib-2.013
http://search.cpan.org/~pmqs/Bundle-Compress-Zlib-2.013/
Install Compress::Zlib and dependencies 
----
CLI-Application-0.01
http://search.cpan.org/~jkramer/CLI-Application-0.01/
(not yet) extensible CLI application framework 
----
CatalystX-ListFramework-Builder-0.31
http://search.cpan.org/~oliver/CatalystX-ListFramework-Builder-0.31/
Instant AJAX web front-end for DBIx::Class, using Catalyst 
----
CatalystX-ListFramework-Builder-0.32
http://search.cpan.org/~oliver/CatalystX-ListFramework-Builder-0.32/
Instant AJAX web front-end for DBIx::Class, using Catalyst 
----
DBIx-SchemaChecksum-0.08
http://search.cpan.org/~domm/DBIx-SchemaChecksum-0.08/
Generate and compare checksums of database schematas 
----
Data-TreeDumper-Renderer-GTK-0.02
http://search.cpan.org/~nkh/Data-TreeDumper-Renderer-GTK-0.02/
Gtk2::TreeView renderer for Data::TreeDumper 
----
DateTime-TimeZone-0.7904
http://search.cpan.org/~drolsky/DateTime-TimeZone-0.7904/
Time zone object base class and factory 
----
Geo-Direction-Distance-0.0.2
http://search.cpan.org/~kokogiko/Geo-Direction-Distance-0.0.2/
Process between Lat-Lng coordinates and direction - distance 
----
Geo-Direction-Name-0.0.2
http://search.cpan.org/~kokogiko/Geo-Direction-Name-0.0.2/
Transform direction name and degree each other. 
----
Geo-Google-MyMap-KMLURL-0.0.1
http://search.cpan.org/~kokogiko/Geo-Google-MyMap-KMLURL-0.0.1/
Create URL for downloading Full-spec KML from Google MyMap msid 
----
Geo-LocaPoint-0.0.3
http://search.cpan.org/~kokogiko/Geo-LocaPoint-0.0.3/
Simple encoder/decoder of LocaPoint 
----
Glib-Ex-ConnectProperties-3
http://search.cpan.org/~kryde/Glib-Ex-ConnectProperties-3/
link properties between objects 
----
Graphics-Primitive-Driver-Cairo-0.20
http://search.cpan.org/~gphat/Graphics-Primitive-Driver-Cairo-0.20/
Cairo backend for Graphics::Primitive 
----
Image-Pngslimmer-0.29
http://search.cpan.org/~acmcmen/Image-Pngslimmer-0.29/
slims (dynamically created) PNGs 
----
JS-JSON-0.02
http://search.cpan.org/~ingy/JS-JSON-0.02/
JSON module for JS 
----
Linux-USBKeyboard-0.02
http://search.cpan.org/~ewilhelm/Linux-USBKeyboard-0.02/
access devices pretending to be qwerty keyboards 
----
Module-Release-Git-0.10_04
http://search.cpan.org/~bdfoy/Module-Release-Git-0.10_04/
Use Git with Module::Release 
----
Module-Release-Git-0.10_05
http://search.cpan.org/~bdfoy/Module-Release-Git-0.10_05/
Use Git with Module::Release 
----
Module-Release-Git-0.10_06
http://search.cpan.org/~bdfoy/Module-Release-Git-0.10_06/
Use Git with Module::Release 
----
Module-Release-Git-0.10_07
http://search.cpan.org/~bdfoy/Module-Release-Git-0.10_07/
Use Git with Module::Release 
----
Module-Release-Git-0.10_08
http://search.cpan.org/~bdfoy/Module-Release-Git-0.10_08/
Use Git with Module::Release 
----
SMS-Send-TW-PChome-0.03
http://search.cpan.org/~snowfly/SMS-Send-TW-PChome-0.03/
SMS::Send driver for sms.pchome.com.tw 
----
SVN-Notify-Snapshot-0.04
http://search.cpan.org/~jpeacock/SVN-Notify-Snapshot-0.04/
Take snapshots from Subversion activity 
----
Shipwright-1.14_01
http://search.cpan.org/~sunnavy/Shipwright-1.14_01/
Best Practical Builder 
----
Text-Template-Simple-0.54_01
http://search.cpan.org/~burak/Text-Template-Simple-0.54_01/
Simple text template engine 
----
Tie-Hash-KeysMask-0.03
http://search.cpan.org/~schoejo/Tie-Hash-KeysMask-0.03/
Control key aliasing by mask function, e.g. omit case of character distinction 
----
UML-Class-Simple-0.15
http://search.cpan.org/~agent/UML-Class-Simple-0.15/
Render simple UML class diagrams, by loading the code 
----
XML-Generator-XMPP-0.02
http://search.cpan.org/~martijn/XML-Generator-XMPP-0.02/
easily create XMPP packets 
----
bin-wxcat-v0.0.1
http://search.cpan.org/~ewilhelm/bin-wxcat-v0.0.1/
pipe output to an unfocussed window 
----
github_creator-0.11
http://search.cpan.org/~bdfoy/github_creator-0.11/
----
github_creator-0.12
http://search.cpan.org/~bdfoy/github_creator-0.12/
----
indirect-0.04
http://search.cpan.org/~vpit/indirect-0.04/
Lexically warn about using the indirect object syntax. 
----
mpp-6
http://search.cpan.org/~pfeiffer/mpp-6/


If you're an author of one of these modules, please submit a detailed
announcement to comp.lang.perl.announce, and we'll pass it along.

This message was generated by a Perl program described in my Linux
Magazine column, which can be found on-line (along with more than
200 other freely available past column articles) at
  http://www.stonehenge.com/merlyn/LinuxMag/col82.html

print "Just another Perl hacker," # the original

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion


------------------------------

Date: Sun, 31 Aug 2008 15:54:31 +0000 (UTC)
From:  Ilya Zakharevich <nospam-abuse@ilyaz.org>
Subject: Re: perl multithreading performance
Message-Id: <g9eern$tvc$1@agate.berkeley.edu>

[A complimentary Cc of this posting was sent to
<dniq00@gmail.com>], who wrote in article <ce5b0a68-14f6-4296-94da-8edfca898e77@j22g2000hsf.googlegroups.com>:
> I'm trying to implement multithreaded processing for the humongous
> amount of logs that I'm currently processing in 1 process on a 4-CPU
> server.

Keep in mind that AFAIK, all multithreading support is long removed
from Perl.  Instead, the code which was designed to simulate fork()ing
under Win* is used as a substitute for multithreading support...

=========

Sorry that I can't be more specific about your speed issues: when I
discovered that under the "new doctrine" starting a new thread is
about 100-300 times SLOWER than starting a new Perl process, I just
gave up and did not run any other tests...
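
If you don't want to take my word for it, a quick-and-dirty comparison
you can run on your own build (assumes an ithreads-enabled perl):

use strict;
use threads;
use POSIX qw(_exit);
use Benchmark qw(timethese);

timethese(200, {
    # spawn a do-nothing thread and reap it
    thread => sub { threads->create(sub {})->join },
    # spawn a do-nothing process and reap it; _exit skips the
    # child's cleanup so it doesn't flush shared handles
    fork   => sub {
        defined(my $pid = fork) or die "fork: $!";
        _exit(0) unless $pid;
        waitpid $pid, 0;
    },
});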

Hope this helps,
Ilya



------------------------------

Date: Sat, 30 Aug 2008 23:33:32 +0300
From: Eric Pozharski <whynot@pozharski.name>
Subject: Re: subprocesses lifecycle
Message-Id: <sdnoo5xu6n.ln2@carpet.zombinet>

Eric Pozharski <whynot@pozharski.name> wrote:
> Matthieu Imbert <breafk@remove.this.gmail.com> wrote:
> *SKIP*
>> Currently, when I detect the timeout, I call die "error message". the
>> message is displayed, but the script does not return until
>> subprocesses finish (this may take several minutes, depending on what
>> the subprocesses do).
*SKIP*
> However you say that you have a problem.  I suppose you have to
> investigate why your script attempts to collect zombies.  It should
> not unless told so.

I've thought (and read) a lot about this.  I believe now that my guess
was wrong.

There's no problem with zombies (and, respectively, with waiting for
children).  As C.DeRykus clearly showed, double fork doesn't help.

Now I think that B<perl> waits till the pipe closes.  That happens when
the writer (I intentionally say 'writer' rather than 'child', because it
can be a child of B<init>, given the double fork) intentionally closes the
pipe or just terminates.

I was wrong.  Again.  Sorry for the inconvenience.

And what surprises me most is that, as Hans Mulder discovered, lexical
filehandles are waited for while globals are not.  Would someone willing
to dig through the source explain why that is?  I've checked; both of
them are B<isa> B<FileHandle>.  And still they differ a lot.  Errmm,..
Can I guess again?

*CUT*

-- 
Torvalds' goal for Linux is very simple: World Domination


------------------------------

Date: 31 Aug 2008 02:58:08 GMT
From: xhoster@gmail.com
Subject: Re: subprocesses lifecycle
Message-Id: <20080830225812.268$H9@newsreader.com>

Eric Pozharski <whynot@pozharski.name> wrote:
>
> And what surprises me most is that, as Hans Mulder discovered, lexical
> filehandles are waited for while globals are not.  Would someone willing
> to dig through the source explain why that is?

I am guessing it is because lexicals are destroyed when they go out of
scope, while globals are only destroyed during "global destruction", during
which time the automatic waiting behavior may not be working.

If one uses circular refs to prevent a lexical filehandle from going out of
scope until global destruction, they don't wait.  For example,
the below exits immediately:

perl -le ' my @y; open $y[0], "sleep 5 |" or die; push @y,\@y'

Sometimes it waits anyway.  Global destruction is hard to predict.
This waits:

perl -le ' my @y; open $y[0], "sleep 5 |" or die; push @y,\@y; $z=bless {}'


Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.


------------------------------

Date: Sun, 31 Aug 2008 10:19:26 +0300
From: Eric Pozharski <whynot@pozharski.name>
Subject: Re: subprocesses lifecycle
Message-Id: <u8tpo5x2tu.ln2@carpet.zombinet>

Peter J. Holzer <hjp-usenet2@hjp.at> wrote:
*SKIP*
> terminate. But since you didn't call $pipe->autoflush the child won't
> actually try to write to the pipe until the buffer (4kB on Linux, 8kB on
> most other unixes) is full - that will be after about 75 or 150 lines,
> respectively.

That's what I'd messed up: line-oriented on the surface while buffered
underneath.  Thanks, now I feel much better.
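
For the archive, the fix is one line once IO::Handle is loaded.  A minimal
sketch, with the downstream program name invented; autoflush goes on
whichever handle does the writing:

use IO::Handle;                   # provides autoflush() on filehandles

open my $pipe, '|-', 'consumer' or die "can't start consumer: $!";
$pipe->autoflush(1);              # flush every print, not per 4-8kB block
print $pipe "line $_\n" for 1 .. 10;
close $pipe or warn "consumer exited nonzero: $?";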

-- 
Torvalds' goal for Linux is very simple: World Domination


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 1829
***************************************

