[32817] in Perl-Users-Digest
Perl-Users Digest, Issue: 4082 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sun Nov 24 18:09:41 2013
Date: Sun, 24 Nov 2013 15:09:05 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Sun, 24 Nov 2013 Volume: 11 Number: 4082
Today's topics:
Regular expression 'c' modifier <gamo@telecable.es>
Re: Regular expression 'c' modifier <hjp-usenet3@hjp.at>
Re: Regular expression 'c' modifier <bjoern@hoehrmann.de>
Re: Regular expression 'c' modifier <gamo@telecable.es>
Re: Regular expression 'c' modifier <gamo@telecable.es>
Re: Several Topics - Nov. 19, 2013 <gamo@telecable.es>
Re: Several Topics - Nov. 19, 2013 <jurgenex@hotmail.com>
Re: Several Topics - Nov. 19, 2013 <ben@morrow.me.uk>
Re: Several Topics - Nov. 19, 2013 <gamo@telecable.es>
Re: Several Topics - Nov. 19, 2013 <whynot@pozharski.name>
Re: Several Topics - Nov. 19, 2013 <rweikusat@mobileactivedefense.com>
Re: Several Topics - Nov. 19, 2013 <hjp-usenet3@hjp.at>
Re: Several Topics - Nov. 19, 2013 <gamo@telecable.es>
Re: Several Topics - Nov. 19, 2013 <hjp-usenet3@hjp.at>
Re: Several Topics - Nov. 19, 2013 <gamo@telecable.es>
Re: Several Topics - Nov. 19, 2013 <hjp-usenet3@hjp.at>
Re: Several Topics - Nov. 19, 2013 <cwilbur@chromatico.net>
Re: Several Topics - Nov. 19, 2013 <ben@morrow.me.uk>
Re: Several Topics - Nov. 19, 2013 <bill@todbe.com>
Re: Several Topics - Nov. 19, 2013 <rweikusat@mobileactivedefense.com>
Re: Writing a daemon to start/stop at boot and shutdown <justin.1303@purestblue.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Sun, 24 Nov 2013 13:04:27 +0100
From: gamo <gamo@telecable.es>
Subject: Regular expression 'c' modifier
Message-Id: <l6sq0d$hmu$1@speranza.aioe.org>
Recently I saw a script nuked in a comparison against its C version,
and the two main causes were using bigint, which I don't
really need ('use integer' could do the job), and not using
the /gc modifier in a regex instead of a plain /g. I think
that the documentation of the c modifier is not very clear
about its importance. Here is a comparison:
#!/usr/bin/perl -W

use strict;
use Benchmark qw(cmpthese);

my $string = "aabc" x 8192;
my $i;

cmpthese(-3, {
    g  => sub { while ($string =~ /(a+)/g)  { $i = $1; } },
    gc => sub { while ($string =~ /(a+)/gc) { $i = $1; } },
});
__END__
          Rate       g       gc
g        379/s      --    -100%
gc  12105006/s 3192107%      --
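As a side note on the bigint point above, this minimal sketch (mine,
not from the nuked script) contrasts 'use bigint', which routes every
arithmetic op through Math::BigInt method calls, with 'use integer',
which stays on native IVs:

use strict;
use warnings;
use Benchmark qw(cmpthese);

cmpthese(-1, {
    # each literal and op here becomes a Math::BigInt object/method call
    bigint  => sub { use bigint;  my $x = 0; $x += 3 for 1 .. 1000; },
    # plain native integer arithmetic
    integer => sub { use integer; my $x = 0; $x += 3 for 1 .. 1000; },
});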
------------------------------
Date: Sun, 24 Nov 2013 21:38:54 +0100
From: "Peter J. Holzer" <hjp-usenet3@hjp.at>
Subject: Re: Regular expression 'c' modifier
Message-Id: <slrnl94ouu.a3j.hjp-usenet3@hrunkner.hjp.at>
On 2013-11-24 12:04, gamo <gamo@telecable.es> wrote:
> Recently I saw a script nuked in a comparison against its C version,
> and the two main causes were using bigint, which I don't
> really need ('use integer' could do the job), and not using
> the /gc modifier in a regex instead of a plain /g. I think
> that the documentation of the c modifier is not very clear
> about its importance. Here is a comparison:
>
> #!/usr/bin/perl -W
>
> use strict;
> use Benchmark qw(cmpthese);
>
> my $string = "aabc" x 8192;
> my $i;
>
> cmpthese(-3, {
>     g  => sub { while ($string =~ /(a+)/g)  { $i = $1; } },
>     gc => sub { while ($string =~ /(a+)/gc) { $i = $1; } },
> });
>
> __END__
>
>           Rate       g       gc
> g        379/s      --    -100%
> gc  12105006/s 3192107%      --
Yes, matching 0 times in a string of length 2 is much faster than
matching 8192 times in a string of length 32768.
It is always suspicious if you get such a huge speedup and it is a good
idea to check that the new code is really equivalent to the old one.
hp
--
_ | Peter J. Holzer | Fluch der elektronischen Textverarbeitung:
|_|_) | | Man feilt solange an seinen Text um, bis
| | | hjp@hjp.at | die Satzbestandteile des Satzes nicht mehr
__/ | http://www.hjp.at/ | zusammenpaßt. -- Ralph Babel
------------------------------
Date: Sun, 24 Nov 2013 22:19:55 +0100
From: Bjoern Hoehrmann <bjoern@hoehrmann.de>
Subject: Re: Regular expression 'c' modifier
Message-Id: <q1r499pdcqk8t9sjmnfrd4ugip6i1f138d@hive.bjoern.hoehrmann.de>
* Peter J. Holzer wrote in comp.lang.perl.misc:
>On 2013-11-24 12:04, gamo <gamo@telecable.es> wrote:
>> my $string = "aabc" x 8192;
>> my $i;
>>
>> cmpthese(-3, {
>>     g  => sub { while ($string =~ /(a+)/g)  { $i = $1; } },
>>     gc => sub { while ($string =~ /(a+)/gc) { $i = $1; } },
>> });
>Yes, matching 0 times in a string of length 2 is much faster than
>matching 8192 times in a string of length 32768.
To elaborate on that, the pos() of a string is a property of the string,
and ordinarily the position would be reset on a match failure. With 'c'
the position is not reset, so after the first round through the loop the
`substr $string, pos $string` string would just be 'bc' which does not
match /(a+)/ so regardless of how many times `cmpthese` calls the `gc`
version, the loop body is executed only the first time if nothing resets
the string position.
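A minimal sketch (mine, not from the post above) that makes the stuck
position visible:

use strict;
use warnings;

my $string = "aabc" x 2;

# First pass with /gc: match until the attempt at the trailing "bc"
# fails; the /c keeps pos() where it was instead of resetting it.
1 while $string =~ /(a+)/gc;
printf "pos after first pass: %d of %d\n", pos($string), length($string);

# Second pass: pos() is still stuck, so the very first match attempt
# fails -- this is all the 'gc' sub in cmpthese ends up timing.
my $n = 0;
$n++ while $string =~ /(a+)/gc;
print "matches on second pass: $n\n";

# Plain /g resets pos() on failure, so every pass rescans the string.
pos($string) = undef;    # undo the stuck position first
$n = 0;
$n++ while $string =~ /(a+)/g;
print "matches with /g: $n\n";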
--
Björn Höhrmann mailto:bjoern@hoehrmann.de http://bjoern.hoehrmann.de
Am Badedeich 7 Telefon: +49(0)160/4415681 http://www.bjoernsworld.de
25899 Dagebüll PGP Pub. KeyID: 0xA4357E78 http://www.websitedev.de/
------------------------------
Date: Sun, 24 Nov 2013 22:46:08 +0100
From: gamo <gamo@telecable.es>
Subject: Re: Regular expression 'c' modifier
Message-Id: <l6ts31$788$1@speranza.aioe.org>
On 24/11/13 22:19, Bjoern Hoehrmann wrote:
> * Peter J. Holzer wrote in comp.lang.perl.misc:
>> Yes, matching 0 times in a string of length 2 is much faster than
>> matching 8192 times in a string of length 32768.
>
> To elaborate on that, the pos() of a string is a property of the string,
> and ordinarily the position would be reset on a match failure. With 'c'
> the position is not reset, so after the first round through the loop the
> `substr $string, pos $string` string would just be 'bc' which does not
> match /(a+)/ so regardless of how many times `cmpthese` calls the `gc`
> version, the loop body is executed only the first time if nothing resets
> the string position.
>
My fault. Taking a longer string and doing only one pass, Time::HiRes
says that /gc is only slightly better than /g. The number of matches
no longer changes, as it did with a sub inside cmpthese.
Fortunately I don't have to change anything in the original code, as
the results are as expected with or without the regex at all. Anyway, I
now want to know the difference between a regex like
while ($string =~ /(\d+)/gc) { $i = $1; #... }
and the index/substr equivalent when $string contains digits and
only a single \n character between numbers. I have to look at the m and
s modifiers, too.
Thanks
------------------------------
Date: Sun, 24 Nov 2013 23:27:26 +0100
From: gamo <gamo@telecable.es>
Subject: Re: Regular expression 'c' modifier
Message-Id: <l6tuge$ckp$1@speranza.aioe.org>
On 24/11/13 22:46, gamo wrote:
> now want to know the difference between a regex like
> while ($string =~ /(\d+)/gc) { $i = $1; #... }
> and the index/substr equivalent when $string contains digits and
> only a single \n character between numbers. I have to look at the m and
> s modifiers, too.
>
> Thanks
>
I just managed to get a better result with index/substr/length:
Time /gc = 3.466519 s.
Time ind = 3.106548 s.
Counters: 8388608, 8388608
with this code:
#!/usr/bin/perl -W

use strict;
use Time::HiRes qw(gettimeofday tv_interval);

my $string = "1123\n" x (8192 * 1024);
my $i;
my $n = chr(ord("\n"));    # i.e. just "\n"
my ($c1, $c2) = (0, 0);

my $t0 = [gettimeofday];

# regex version: grab each run of digits with /gc
while ($string =~ /(\d+)/gc) {
    $i = $1;
    $c1++ if $i == 1123;
}

my $t1 = [gettimeofday];

# index/substr version: find the next "\n" and cut the number out
my $j;
my $k = 0;
while ($k < length($string)) {
    $j = index($string, $n, $k + 1);
    $i = substr($string, $k, $j - $k);
    $k += length($i) + 1;
    $c2++ if $i == 1123;
}

my $t2 = [gettimeofday];

print "Time /gc = ", tv_interval($t0, $t1), " s.\n";
print "Time ind = ", tv_interval($t1, $t2), " s.\n";
print "Counters: $c1, $c2\n";
__END__
But it's rather strange to use; it's not simple.
Best regards
------------------------------
Date: Thu, 21 Nov 2013 23:15:57 +0100
From: gamo <gamo@telecable.es>
Subject: Re: Several Topics - Nov. 19, 2013
Message-Id: <l6m0mp$90b$1@speranza.aioe.org>
On 21/11/13 21:39, gamo wrote:
> On 21/11/13 21:19, Jürgen Exner wrote:
> ...
>> If I'm not mistaken then doing so involves large overhead, while reading
>> and writing larger blocks (several kB) is orders of magnitude faster in
>> Perl.
>>
>> jue
>
>
> What do you recommend to read a large file (e.g. 1 GB)?
> a) sysread() with a length of, say 8192
> b) use File::Slurp
>
> Thanks
>
>
>
I'm afraid there is a better option c): get the -s size
and pass it to sysread(). However, File::Slurp does more things,
like editing a file in place.
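A minimal sketch of that option c) (untested here; the filename is
hypothetical):

use strict;
use warnings;

my $filename = 'big.dat';             # hypothetical 1 GB file
my $size = -s $filename or die "cannot stat $filename\n";

open my $fh, '<:raw', $filename or die "cannot open $filename: $!";
my $buffer;
my $read = sysread($fh, $buffer, $size);
defined $read or die "sysread failed: $!";
print "read $read of $size bytes\n";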
Best regards
------------------------------
Date: Thu, 21 Nov 2013 14:23:38 -0800
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: Several Topics - Nov. 19, 2013
Message-Id: <fo1t89dj9a9up4d29h7as0lgptjtb1tqnc@4ax.com>
gamo <gamo@telecable.es> wrote:
>On 21/11/13 21:19, Jürgen Exner wrote:
>...
>> If I'm not mistaken then doing so involves large overhead, while reading
>> and writing larger blocks (several kB) is orders of magnitude faster in
>> Perl.
>
>What do you recommend to read a large file (e.g. 1 GB)?
>a) sysread() with a length of, say 8192
>b) use File::Slurp
No idea, never had that need.
But definitely not by calling getc() 1.000.000.000 times :-)
Even the man page warns: "This is not particularly efficient."
jue
------------------------------
Date: Thu, 21 Nov 2013 23:01:55 +0000
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Several Topics - Nov. 19, 2013
Message-Id: <30g2ma-0bg1.ln1@anubis.morrow.me.uk>
Quoth Jürgen Exner <jurgenex@hotmail.com>:
> gamo <gamo@telecable.es> wrote:
> >On 21/11/13 21:19, Jürgen Exner wrote:
> >...
> >> If I'm not mistaken then doing so involves large overhead, while reading
> >> and writing larger blocks (several kB) is orders of magnitude faster in
> >> Perl.
> >
> >What do you recommend to read a large file (e.g. 1 GB)?
> >a) sysread() with a length of, say 8192
> >b) use File::Slurp
For a file which will fit in memory, File::Slurp. For a file which might
not, sysread, but with a larger buffer than that; I might use an 8M or
16M buffer (as opposed to your 8k), or larger.
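A minimal sketch of that chunked read (mine, not from the thread; the
filename is hypothetical and the buffer size is the 16M mentioned
above):

use strict;
use warnings;

my $file = '/tmp/bigfile';            # hypothetical input
open my $fh, '<:raw', $file or die "cannot open $file: $!";

my $bufsize = 16 * 1024 * 1024;       # 16M buffer
my $total = 0;
while (1) {
    my $read = sysread($fh, my $buf, $bufsize);
    defined $read or die "sysread failed: $!";
    last unless $read;                # 0 bytes means EOF
    $total += $read;                  # ... process $buf here ...
}
print "read $total bytes\n";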
> No idea, never had that need.
>
> But definitely not by calling getc() 1.000.000.000 times :-)
> Even the man page warns: "This is not particularly efficient."
It shouldn't be *that* bad, given that Perl buffers IO... there will be
some overhead breaking the buffer into characters and returning them one
at a time, but if the input is to be processed character-by-character
that overhead has to happen somewhere. It certainly shouldn't be
anywhere near as slow as sysreading one character at a time, unless the
kernel's being clever about readahead and the time taken for IO
completely dominates the time spent performing the syscall.
Ben
------------------------------
Date: Fri, 22 Nov 2013 00:19:55 +0100
From: gamo <gamo@telecable.es>
Subject: Re: Several Topics - Nov. 19, 2013
Message-Id: <l6m4er$iod$1@speranza.aioe.org>
On 22/11/13 00:01, Ben Morrow wrote:
>
> For a file which will fit in memory, File::Slurp. For a file which might
> not, sysread, but with a larger buffer than that; I might use an 8M or
> 16M buffer (as opposed to your 8k), or larger.
>
I tried with a big file, making the buffer as large as the file, and
it's OK if you have the memory. It's faster.
size = 1042636916 bytes
read OK
File::Slurp takes 0.587991 s.
sysread() takes 0.249269 s.
All the best
------------------------------
Date: Fri, 22 Nov 2013 09:07:26 +0200
From: Eric Pozharski <whynot@pozharski.name>
Subject: Re: Several Topics - Nov. 19, 2013
Message-Id: <slrnl8u0le.c49.whynot@orphan.zombinet>
with <oeqs89502circ3h8gp6njp4juttv35u9id@4ax.com> Jürgen Exner wrote:
*SKIP*
>>I implemented three subroutines that do what you describe - produce
>>the reverse complement of a DNA sequence (alien DNA, that has ABCD
>>codons, for simplicity). [...]
> My feeling is that most of this reported slow performance may come
> from reading and writing individual characters: "it used the perl
> equivalent to C's getchar() and putchar().".
(that's me speculating here) May I remind everyone that B<reverse> was
still making duplicate lists in the not-so-distant past?
*CUT*
--
Torvalds' goal for Linux is very simple: World Domination
Stallman's goal for GNU is even simpler: Freedom
------------------------------
Date: Fri, 22 Nov 2013 14:31:54 +0000
From: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Subject: Re: Several Topics - Nov. 19, 2013
Message-Id: <87zjow33tx.fsf@sable.mobileactivedefense.com>
Ben Morrow <ben@morrow.me.uk> writes:
> Quoth Jürgen Exner <jurgenex@hotmail.com>:
[read a large file into memory]
>> But definitely not by calling getc() 1.000.000.000 times :-)
>> Even the man page warns: "This is not particularly efficient."
>
> It shouldn't be *that* bad, given that Perl buffers IO... there will be
> some overhead breaking the buffer into characters and returning them one
> at a time, but if the input is to be processed character-by-character
> that overhead has to happen somewhere.
The Perl getc routine isn't the best way to access the contents of a
file character by character:
----------
use Benchmark;

timethese(-3, {
    getc => sub {
        my ($c, $fh, $out);
        open($fh, '<', '/tmp/syslog') or die "open: $!";
        # one opcode dispatch (and a fresh SV) per character
        $out .= "$c " while defined($c = getc($fh));
    },
    substr => sub {
        my ($all, $fh, $out);
        my $pos = 0;
        local $/;    # slurp mode
        open($fh, '<', '/tmp/syslog') or die "open: $!";
        $all = <$fh>;
        # walk the in-memory string one character at a time
        $out .= substr($all, $pos++, 1) . ' ' while $pos < length($all);
    },
});
----------
and the 'this is not particularly efficient' is intended as a warning to
C programmers accustomed to stdio (before the advent of 'multithreading
for student morons', at least): there, getchar/getc are usually macros
that load 'bytes' from an internal buffer directly into a register for
processing, which means they're fast. But the perl getc is not at all
comparable to that because it is still a perl 'operator' which needs to be
invoked with arguments on a stack and which returns a value via the stack
(and this value comes in its own SV).
------------------------------
Date: Fri, 22 Nov 2013 20:16:16 +0100
From: "Peter J. Holzer" <hjp-usenet3@hjp.at>
Subject: Re: Several Topics - Nov. 19, 2013
Message-Id: <slrnl8vbc0.dec.hjp-usenet3@hrunkner.hjp.at>
On 2013-11-21 23:19, gamo <gamo@telecable.es> wrote:
> On 22/11/13 00:01, Ben Morrow wrote:
>> For a file which will fit in memory, File::Slurp. For a file which might
>> not, sysread, but with a larger buffer than that; I might use an 8M or
>> 16M buffer (as opposed to your 8k), or larger.
>>
>
> I tried with a big file, making the buffer as large as the file, and
> it's OK if you have the memory. It's faster.
>
> size = 1042636916 bytes
> read OK
> File::Slurp takes 0.587991 s.
> sysread() takes 0.249269 s.
Oh, that's fast. What kind of system is this?
I get about twice those times on a 2.4 GHz Xeon E5530 system.
But I guess you used the return value from read_file()?
If I use the buf_ref method (i.e. “read_file($ARGV[0], buf_ref => \$s)”
instead of “$s = read_file($ARGV[0])”) there is no significant speed
difference between sysread and read_file.
hp
--
_ | Peter J. Holzer | Fluch der elektronischen Textverarbeitung:
|_|_) | | Man feilt solange an seinen Text um, bis
| | | hjp@hjp.at | die Satzbestandteile des Satzes nicht mehr
__/ | http://www.hjp.at/ | zusammenpaßt. -- Ralph Babel
------------------------------
Date: Fri, 22 Nov 2013 22:08:36 +0100
From: gamo <gamo@telecable.es>
Subject: Re: Several Topics - Nov. 19, 2013
Message-Id: <l6oh4i$756$1@speranza.aioe.org>
On 22/11/13 20:16, Peter J. Holzer wrote:
> On 2013-11-21 23:19, gamo <gamo@telecable.es> wrote:
>> On 22/11/13 00:01, Ben Morrow wrote:
>>> For a file which will fit in memory, File::Slurp. For a file which might
>>> not, sysread, but with a larger buffer than that; I might use an 8M or
>>> 16M buffer (as opposed to your 8k), or larger.
>>>
>>
>> I tried with a big file, making the buffer as large as the file, and
>> it's OK if you have the memory. It's faster.
>>
>> size = 1042636916 bytes
>> read OK
>> File::Slurp takes 0.587991 s.
>> sysread() takes 0.249269 s.
>
> Oh, that's fast. What kind of system is this?
>
Linux, with an i7 CPU.
> I get about twice those times on a 2.4 GHz Xeon E5530 system.
>
> But I guess you used the return value from read_file()?
>
Yes, I used two variables $bufferA and $bufferB and compared them
(eq) before printing 'read OK'.
> If I use the buf_ref method (i.e. “read_file($ARGV[0], buf_ref => \$s)”
> instead of “$s = read_file($ARGV[0])”) there is no significant speed
> difference between sysread and read_file.
>
> hp
There must be a difference if you use sysopen(IN, ...) and
sysread(IN, $bufferB, $size)
where $size = -s $filename;
File::Slurp uses a heuristic to determine the $size.
Anyway, a variable of that size should be handled with extreme caution
to avoid overhead and to make a difference over regular parsed
input, i.e. line by line. The file I used is numeric, and I usually
read it line by line (no slurp) because something can be done
per number. I haven't used NYTProf yet to see what really happens.
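The line-by-line reading I mean is just this minimal sketch (the
filename and the per-number operation are hypothetical):

use strict;
use warnings;

my $filename = 'numbers.txt';         # hypothetical numeric file
open my $fh, '<', $filename or die "cannot open $filename: $!";

my $sum = 0;
while (my $line = <$fh>) {
    chomp $line;
    $sum += $line;                    # something done per number
}
print "sum = $sum\n";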
Best regards
------------------------------
Date: Sat, 23 Nov 2013 00:04:39 +0100
From: "Peter J. Holzer" <hjp-usenet3@hjp.at>
Subject: Re: Several Topics - Nov. 19, 2013
Message-Id: <slrnl8voo8.i9t.hjp-usenet3@hrunkner.hjp.at>
On 2013-11-22 21:08, gamo <gamo@telecable.es> wrote:
> On 22/11/13 20:16, Peter J. Holzer wrote:
>> On 2013-11-21 23:19, gamo <gamo@telecable.es> wrote:
>>> I tried with a big file, making the buffer as large as the file, and
>>> it's OK if you have the memory. It's faster.
>>>
>>> size = 1042636916 bytes
>>> read OK
>>> File::Slurp takes 0.587991 s.
>>> sysread() takes 0.249269 s.
[...]
>> But I guess you used the return value from read_file()?
>>
>
> Yes, I used two variables $bufferA and $bufferB and compared them
> (eq) before printing 'read OK'.
I don't understand that answer.
I was referring to the difference between
my $s = read_file($ARGV[0]);
and
my $s;
read_file($ARGV[0], buf_ref => \$s);
The latter is about twice as fast (and consumes half the memory) because
it reads the file directly into $s instead of reading it into a
temporary variable then copying it into $s.
>> If I use the buf_ref method (i.e. “read_file($ARGV[0], buf_ref => \$s)”
>> instead of “$s = read_file($ARGV[0])”) there is no significant speed
>> difference between sysread and read_file.
>
> There must be a difference if you use sysopen(IN, ...) and
> sysread(IN, $bufferB, $size)
>
> where $size = -s $filename;
>
> File::Slurp uses a heuristic to determine the $size.
Again, I don't understand what you are trying to say.
my $s;
read_file($ARGV[0], buf_ref => \$s);
is almost exactly the same speed as
use File::stat;    # needed for the stat($fh)->size method call

open (my $fh, '<', $ARGV[0]) or die "cannot open $ARGV[0]: $!";
my $size = stat($fh)->size;
my $s;
my $rc = sysread($fh, $s, $size);
Which is hardly surprising, since it does the same thing.
hp
--
_ | Peter J. Holzer | Fluch der elektronischen Textverarbeitung:
|_|_) | | Man feilt solange an seinen Text um, bis
| | | hjp@hjp.at | die Satzbestandteile des Satzes nicht mehr
__/ | http://www.hjp.at/ | zusammenpaßt. -- Ralph Babel
------------------------------
Date: Sat, 23 Nov 2013 02:11:24 +0100
From: gamo <gamo@telecable.es>
Subject: Re: Several Topics - Nov. 19, 2013
Message-Id: <l6ovbu$7ft$1@speranza.aioe.org>
On 23/11/13 00:04, Peter J. Holzer wrote:
>
> my $s;
> read_file($ARGV[0], buf_ref => \$s);
>
> The latter is about twice as fast (and consumes half the memory) because
> it reads the file directly into $s instead of reading it into a
> temporary variable then copying it into $s.
>
You are right: this method is as fast as reading with sysread.
>
> Which is hardly surprising, since it does the same thing.
>
> hp
>
>
It's a surprise that that method isn't recommended in the man page of
File::Slurp.
Thanks
------------------------------
Date: Sat, 23 Nov 2013 08:38:00 +0100
From: "Peter J. Holzer" <hjp-usenet3@hjp.at>
Subject: Re: Several Topics - Nov. 19, 2013
Message-Id: <slrnl90mqo.116.hjp-usenet3@hrunkner.hjp.at>
On 2013-11-23 01:11, gamo <gamo@telecable.es> wrote:
> On 23/11/13 00:04, Peter J. Holzer wrote:
>> my $s;
>> read_file($ARGV[0], buf_ref => \$s);
>>
>> The latter is about twice as fast (and consumes half the memory) because
>> it reads the file directly into $s instead of reading it into a
>> temporary variable then copying it into $s.
>>
>
> You are right: this method is as fast as reading with sysread.
>
>>
>> Which is hardly surprising, since it does the same thing.
>
> It's a surprise that that method isn't recommended in the man page of
> File::Slurp.
The man page mentions that this “is usually the fastest way to read a
file into a scalar”. I agree that this advantage should be pointed out
more prominently, and I suggested to Uri to put this variant in the
synopsis about a year ago (<slrnjmu067.909.hjp-usenet2@hrunkner.hjp.at>).
hp
--
_ | Peter J. Holzer | Fluch der elektronischen Textverarbeitung:
|_|_) | | Man feilt solange an seinen Text um, bis
| | | hjp@hjp.at | die Satzbestandteile des Satzes nicht mehr
__/ | http://www.hjp.at/ | zusammenpaßt. -- Ralph Babel
------------------------------
Date: Sat, 23 Nov 2013 11:37:16 -0500
From: Charlton Wilbur <cwilbur@chromatico.net>
Subject: Re: Several Topics - Nov. 19, 2013
Message-Id: <87ppprxef7.fsf@new.chromatico.net>
>>>>> "EP" == Eric Pozharski <whynot@pozharski.name> writes:
EP> (that's me speculating here) May I remind everyone that
EP> B<reverse> was still making duplicate lists in the not-so-distant
EP> past?
Indeed; one of the things that made my "naive" implementation so slow
was that at every stage of the process I created a new variable to hold
the intermediate result. This is a pattern I've seen a great deal in
inexperienced programmers' code; on small data sets it's almost always
useful enough in exploratory programming to justify the overhead.
Computer resources weren't so limited when I learned to program that I
needed to learn to reverse a short string in place, but I'm glad I was
required to; the mindset generalizes nicely.
Charlton
--
Charlton Wilbur
cwilbur@chromatico.net
------------------------------
Date: Sat, 23 Nov 2013 20:01:48 +0000
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Several Topics - Nov. 19, 2013
Message-Id: <c6e7ma-tfk2.ln1@anubis.morrow.me.uk>
Quoth Charlton Wilbur <cwilbur@chromatico.net>:
>
> Indeed; one of the things that made my "naive" implementation so slow
> was that at every stage of the process I created a new variable to hold
> the intermediate result.
I don't think that's the case. All the variables in that sub are created
at compile time; and after the first iteration both arrays remain
allocated to full length.
This is worth knowing: although perl sets the length of an array to 0
when it goes out of scope, and frees any strings contained in the array
elements, the memory for the elements themselves remains allocated. This
is also true of scalars containing strings: they are made undefined, but
the memory allocated for the string is not freed. If you ever have
reason to allocate large strings or large arrays that you won't use
again, it's worth doing an explicit 'undef $a' or 'undef @a' to free the
memory. (Note that '$a = undef' is not sufficient.) Hashes behave the
same as arrays, except that hashes are never entirely emptied: perl
always keeps room for 7 keys.
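A minimal sketch of the difference (mine; it assumes the CPAN module
Devel::Size is installed to measure the allocation):

use strict;
use warnings;
use Devel::Size qw(total_size);

my $s = 'x' x (10 * 1024 * 1024);     # ~10MB string

printf "allocated:         %d bytes\n", total_size(\$s);

$s = undef;                           # value gone, buffer retained
printf "after \$s = undef: %d bytes\n", total_size(\$s);

undef $s;                             # buffer actually freed
printf "after undef \$s:   %d bytes\n", total_size(\$s);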
Ben
------------------------------
Date: Sat, 23 Nov 2013 12:37:50 -0800
From: "$Bill" <bill@todbe.com>
Subject: Re: Several Topics - Nov. 19, 2013
Message-Id: <l6r3ms$256$1@dont-email.me>
On 11/23/2013 08:37, Charlton Wilbur wrote:
>
> Computer resources weren't so limited when I learned to program that I
> needed to learn to reverse a short string in place, but I'm glad I was
> required to; the mindset generalizes nicely.
Obviously you've never programmed a 64 kilo-byte or kilo-word processor.
I remember writing code to produce over 100 displays in a 10 KB space
on a 64 KW 16-bit mini-computer - had to roll them in from disk and re-use
the same overlay space. One of the problems with programmers (maybe just
people in general) today is no sense of efficiency/frugality. :)
------------------------------
Date: Sun, 24 Nov 2013 15:23:41 +0000
From: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Subject: Re: Several Topics - Nov. 19, 2013
Message-Id: <87li0dkema.fsf@sable.mobileactivedefense.com>
Ben Morrow <ben@morrow.me.uk> writes:
> Quoth Charlton Wilbur <cwilbur@chromatico.net>:
[...]
> This is worth knowing: although perl sets the length of an array to 0
> when it goes out of scope, and frees any strings contained in the array
> elements, the memory for the elements themselves remains allocated. This
> is also true of scalars containing strings: they are made undefined, but
> the memory allocated for the string is not freed. If you ever have
> reason to allocate large strings or large arrays that you won't use
> again, it's worth doing an explicit 'undef $a' or 'undef @a' to free the
> memory.
Why this? It is generally not a good idea to fight with infrastructure
software one happens to be using. In this case, this would mean *if*
there's a tangible reason to manage memory manually, perl is the wrong
tool, and if not, doing so nevertheless just adds more entropy to the
code[*].
[*] I'm presently spending a lot of time inside a somewhat large (65,224
LOC) Java program and I estimate that at least 1/4 of this is code which
could simply be deleted because it doesn't do anything technically
useful, it just exists to work around the problem that the people who
wrote it weren't comfortable with the way Hibernate (ORM) manages object
persistency and/or didn't understand it (and weren't comfortable with
RDBMS transactional semantics and/or didn't understand them). And this
problem tends to amplify itself because code blocks including
'locationally relevant voodoo' and comments tend to be copied-and-pasted
elsewhere over and over again and neither the comments nor the voodoo
code are changed; consequently, both end up in places where even the
hypothetical purpose isn't intelligible to anyone any more.
------------------------------
Date: Fri, 22 Nov 2013 13:01:00 +0000
From: Justin C <justin.1303@purestblue.com>
Subject: Re: Writing a daemon to start/stop at boot and shutdown.
Message-Id: <c514ma-c72.ln1@zem.masonsmusic.co.uk>
On 2013-11-20, Christian Winter <thepoet_nospam@arcor.de> wrote:
On 20.11.2013 16:44, Justin C wrote:
>> I've yet to write the file for /etc/init.d to start and stop my
>> program, but I don't think I should really go that far before I can
>> make my program detach itself and be able to stop when called with
>> '/usr/local/bin/progname stop'.
>>
>> I've been reading the docs of Daemon::Control, Daemon::Generic and
>> Proc::Daemon, but I can't get my program to detach - the problem
> appears to be that it never returns from one of the subs.
>
> It's documented in RPC::Serialized::Server::NetServer, section
> "Things you might want to configure", which points (albeit in a
> slightly roundabout way) to the "background" option of Net::Server.
> So something along the lines of this should work (untested):
>
> ...
> my $server = RPC::Serialized::Server::NetServer->new({
>     net_server => {
>         port       => $port,
>         background => 1,
>     },
> });
Thank you, Chris, that's spot on. I don't know how I missed that in
the docs, especially seeing as they're so short. Reading the
Net::Server docs, as that part of the docs suggested, I've also found
how to set the GID and UID for the process - which I thought I was
going to have to do with a separate program to start and stop my
daemon. Now it's all rolled into one; I've just got to set up the
start|stop|restart process.
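In case it helps anyone else, the net_server hash ends up looking
roughly like this sketch (untested; the user, group and pid_file
values are hypothetical, but they are standard Net::Server options
passed straight through):

my $server = RPC::Serialized::Server::NetServer->new({
    net_server => {
        port       => $port,
        background => 1,                    # detach, as Chris suggested
        user       => 'nobody',             # run as this unprivileged UID
        group      => 'nogroup',            # ... and this GID
        pid_file   => '/var/run/progname.pid',
    },
});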
Thanks again.
Justin.
--
Justin C, by the sea.
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 4082
***************************************