[32631] in Perl-Users-Digest
Perl-Users Digest, Issue: 3906 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Mar 21 09:09:50 2013
Date: Thu, 21 Mar 2013 06:09:08 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Thu, 21 Mar 2013 Volume: 11 Number: 3906
Today's topics:
Re: 'Needless flexibilities' and structured records [ve <rweikusat@mssgmbh.com>
a trival array/ hash benchmark <rweikusat@mssgmbh.com>
Re: a trival array/ hash benchmark <ben@morrow.me.uk>
Re: a trival array/ hash benchmark <*@eli.users.panix.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Wed, 20 Mar 2013 22:47:04 +0000
From: Rainer Weikusat <rweikusat@mssgmbh.com>
Subject: Re: 'Needless flexibilities' and structured records [very long]
Message-Id: <878v5hpu3b.fsf@sapphire.mobileactivedefense.com>
Ben Morrow <ben@morrow.me.uk> writes:
> Quoth Rainer Weikusat <rweikusat@mssgmbh.com>:
>> Ben Morrow <ben@morrow.me.uk> writes:
>> > Quoth Rainer Weikusat <rweikusat@mssgmbh.com>:
>> >> Ben Morrow <ben@morrow.me.uk> writes:
>>
>> >> > You can look up the inheritance here instead of having the user specify
>> >> > it (DRY and all that...). As long as they
>> >>
>> >> Re: DRY
>> >>
>> >> That's a cute acronym for 'Design pattern aficionados Reinventing
>> >> elementary database theorY' (in short: Databases should not contain
>> >> redundant copies of the same data to avoid so-called 'update
>> >> anomalies', the database becoming inconsistent because not all
>> >> redundant copies of 'some data' are changed during an update. People
>> >> usually ignore that in favor of 'nobody will ever update X without
>> >> going to gatekeeper code Z which always keeps all copies in sync' and
>> >> looking dedicatedly in another direction whenever somebody mentions
>> >> that 'updates without going through Z' are possible. This could almost
>> >> be called 'a basic design principle of modern Linux distributions'
>> >> :->).
>> >
>> > Um, what?
>>
>> You can get an explanation in more detail than I care to read at the
>> moment here:
>>
>> http://en.wikipedia.org/wiki/Database_normalization
>
> Yes, I know what database normalisation is. I don't see what relevance
> it has here, nor what relevance the design of Linux distros has.
>
>> If you strip out the technicalities relating to relational database
>> management systems, you arrive at 'the DRY principle'.
>
> So? I assume you're trying to make a point, rather than just talking to
> hear yourself speak, but I can't see what it might be.
And I 'assume' that the point of this 'snide remark' is - well - being
a snide remark.
>> >> > use parent "Parent";
>> >> > use slots qw/one two three/;
>> >> >
>> >> > in that order, @{ caller . "::ISA" } will contain the direct parents at
>> >> > this point. You can also croak if they're trying to use MI.
>> >>
>> >> I thought about that and decided against it: This would require @ISA
>> >> to be populated by the time the import method is executed and the code
>> >> would either need to operate in 'pushy nanny mode' ("I've told you you
>> >> must not inherit methods from more than one class and you must do as I say
>> >> because I say so!") or make a guess at which of the possibly many
>> >> direct and indirect(!) 'superclasses' is supposed to be the anchoring
>> >> point for the new slot name series. But in reality, @ISA is about
>> >> method lookups and not about 'allocating array slots in a
>> >> non-conflicting way to members of some "inheritance tree"'.
>> >>
>> >> One can assume that packages sharing an array reference for instance
>> >> data will usually also share methods via @ISA but this need not be
>> >> true.
>> >
>> > I suppose; certainly a facility to inherit slots independently of @ISA
>> > might be useful. Data and method inheritance staying in sync must be the
>> > overwhelmingly common case, though, so optimising the 'use' interface
>> > for that would seem sensible.
>>
>> The issue is really that the @ISA array has a defined meaning: It is
>> used to facilitate subroutine sharing outside of the normal export/
>> import mechanism.
>
> No, it's used to implement method dispatch. This is quite different from
> sub sharing; for one thing, it uses a different call syntax.
Call this anything you like but it enables more than one 'package'
(call that 'class' if you like) to execute the same subroutine in
response to a so-called 'method call'. It is not about 'data
inheritance'.
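
The two views above can be put side by side in a minimal sketch. The
package names Base and Derived and the method describe are made up for
illustration and do not come from the thread. The sketch only shows that
@ISA is consulted for the method-call syntax and not for a plain,
fully-qualified function call:

```perl
use strict;
use warnings;

package Base;    # hypothetical package providing the subroutine

sub describe {
    my ($invocant) = @_;
    return 'describe called for ' . (ref($invocant) || $invocant);
}

package Derived;    # hypothetical package with no describe() of its own

our @ISA = ('Base');

package main;

# Method call: Derived has no describe(), so perl searches @Derived::ISA
# and ends up executing Base::describe.
my $via_method = Derived->describe;

# Plain function call: @ISA is not consulted, so this dies at runtime.
my $via_function = eval { Derived::describe('Derived') };
my $error = $@;

print "method call:   $via_method\n";
print "function call: ", (defined $via_function ? $via_function : "died: $error");
```

Nothing in this sketch touches instance data, which is the part both
posters agree @ISA does not manage by itself.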
>> This can be envisioned as being "a feeble/ strange
>> way to declare 'class ancestor/ descendant' relationships" and it is
>> usually even referred to as such but this is just 'a notational
>> convenience' and it really isn't something like this.
>>
>> Eg, the most common case of 'multiple inheritance' I'm presently
>> dealing with is a package named 'Thing' and all it provides is an
>> overloaded ""-operation which prints the package some reference is
>> blessed into, followed by the refaddr, followed by a 'tag' of some
>> kind in []. A package which wants to inherit this "" needs to provide
>> a tag method returning a tag and the Thing package maintains a cache
>> of 'object tags' in a package-global hash. This is supposed to make
>> diagnostic messages more intelligible and the provided functionality
>> is independent of any properties of packages 'inheriting' from Thing.
>
> This technique is called 'mixins' (at least, it is in the Perl
> world),
[digression removed]
For the situation at hand, it was an example of a package listed in
@ISA which is not supposed to be part of the 'data inheritance
hierarchy'.
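
A minimal sketch of a package along the lines of the 'Thing' described
above, for readers who want something concrete. Only the overall
behaviour (package name, refaddr, tag in [], a package-global tag cache,
a tag method supplied by the inheriting package) is taken from the post.
The output format, the name %tags, and the Connection class are
assumptions made for illustration:

```perl
use strict;
use warnings;

package Thing;

use Scalar::Util qw(blessed refaddr);

# Package-global cache of tags, keyed by reference address.
# The name %tags is an assumption; the post only says such a hash exists.
our %tags;

use overload
    '""' => sub {
        my ($self) = @_;
        my $addr = refaddr($self);
        # Ask the inheriting package for its tag once, then reuse it.
        my $tag = $tags{$addr} //= $self->tag;
        return sprintf('%s(0x%x)[%s]', blessed($self), $addr, $tag);
    },
    fallback => 1;

package Connection;    # hypothetical class that only wants the "" overload

our @ISA = ('Thing');

sub new { my ($class, $peer) = @_; return bless [$peer], $class; }
sub tag { return $_[0][0]; }

package main;

my $conn = Connection->new('10.0.0.1:25');
print "closing $conn\n";
```

Thing contributes a single overloaded operation here and knows nothing
about how Connection lays out its array. A longer-lived version would
also have to drop cache entries when objects are destroyed, since
reference addresses can be reused; that part is left out of the sketch.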
>> >> Also just because some package is listed in @ISA doesn't
>> >> necessarily mean that it should also participate in the 'data
>> >> inheritance hierarchy'. These are orthogonal issues and one of the
>> >> serious drawbacks of many existing 'OO support modules' is that the
>> >> authors usually fell prey to their own "Jack of all trades" desires
>> >> and 'hard-coded' technically independent policy choices they usually
>> >> happen to make in their code.
>>
>> 'Optimising for the common case' is really the same as 'hardcoding
>> technically unrelated policy choices' because someone considered them
>> "overwhelmingly important" or even "the only Right choice" or so but
>> in practice, this still just means "How I would have done
>> that" and how justified the generalization actually is is usually
>> unknown.
>
> Optimising for the common case doesn't mean making functionality
> unavailable, it just means making the 'easy' interface work the way you
> usually want it to.
In the given case, the 'easy' interface does work in the way 'I
usually want it to' because I usually don't 'want' to force policy
decisions which seem sensible to me onto others. Not least
because I would be forcing them onto myself first and they wouldn't be
suitable for my use case.
[...]
>> But you were actually impolite in all directions: Both modules you
>> named are supposed to provide 'something completely different' and in
>> the case of Class:: it isn't even clear (to me, at least) whether
>> this involves using anonymous arrays as object representation at all.
>
> It does. Class::Accessor::Faster provides almost exactly the same as
> slots: it gives you a constructor, and accessors, for an arrayref-based
> object, and nothing else.
Compared to a 'pragmatic module' which provides a way to allocate
'array slot numbers' in a way suitable for using an anonymous array as
'object instance representation with data sharing' among members of
a single-inheritance class hierarchy, that's something 'almost
completely different' except for the (irrelevant) implementation
detail that 'anonymous arrays' might also be used by Class::.
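
The slot-allocation idea can be sketched as follows. This is not the
module discussed in the thread, whose code is not shown in this digest.
The function alloc_slots, the %slot_count bookkeeping, and the Animal and
Dog packages are all hypothetical. The sketch numbers the slots of a
derived package after those of the package it extends, with the parent
named explicitly rather than looked up in @ISA:

```perl
use strict;
use warnings;

my %slot_count;    # hypothetical bookkeeping: slots in use per package

# Allocate consecutive array indices for @names in package $pkg, starting
# after the last slot used by $parent (if any), and install each name as
# a constant-like sub in $pkg.
sub alloc_slots {
    my ($pkg, $parent, @names) = @_;
    my $next = defined $parent ? ($slot_count{$parent} // 0) : 0;
    for my $name (@names) {
        my $index = $next++;
        no strict 'refs';
        *{"${pkg}::$name"} = sub () { $index };
    }
    $slot_count{$pkg} = $next;
    return $next;
}

# BEGIN so that the names exist before the methods below are compiled.
BEGIN {
    alloc_slots('Animal', undef,    qw(NAME LEGS));
    alloc_slots('Dog',    'Animal', qw(OWNER));
}

package Animal;

sub new {
    my ($class, $name, $legs) = @_;
    my $self = bless [], $class;
    $self->[NAME] = $name;
    $self->[LEGS] = $legs;
    return $self;
}
sub name { return $_[0][NAME]; }

package Dog;

our @ISA = ('Animal');

sub new {
    my ($class, $name, $owner) = @_;
    my $self = Animal::new($class, $name, 4);
    $self->[OWNER] = $owner;    # slot 2, after Animal's slots 0 and 1
    return $self;
}
sub owner { return $_[0][OWNER]; }

package main;

my $dog = Dog->new('Rex', 'Ben');
printf "%s belongs to %s (slot %d)\n", $dog->name, $dog->owner, Dog::OWNER();
```

A constructor and accessors appear here only to exercise the slot
numbers; the allocation step itself is the part under discussion.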
>> I was trying to be polite by formulating this in this way instead
>> of just pointing out that I strongly disagree with 'certain
>> implementation choices' implicit in Class:: and Moose::anything and
>> really don't appreciate something sensible I wrote being dragged
>> into these pits.
>
> So you have said, though you have yet to explain your disagreement
> beyond
[snide remark deleted]
>> I presented a scheme for managing array slots in shared, anonymous
>> arrays and while this scheme is very simple, it is useful and it hasn't
>> (to the best of my knowledge) been published in some 'canned',
>> ready-to-use form so far (except maybe as 'minor internal part' of
>> another Wolpertinger-module).
>
> Had you cared to look, you would have found that CAF is pretty-much
> exactly the same thing.
It may be 'pretty much the same thing' in your opinion but this is an
opinion I don't share.
[more irrelevant stuff deleted]
>> > in fact, it's a rather obvious idea, with very few advantages over
>> > standard hashref-based objects, and some important disadvantages.
>> > In some specific situations, in particular where saving memory is
>> > important, it might be useful, but for general use it doesn't seem
>> > to me worth expending effort trying to work around the problems with
>> > arrayref objects unless you want to do a complete job and solve all
>> > the problems.
>>
>> Please feel free to argue against the points in favour of using
>> anonymous arrays for object representation and against using anonymous
>> hashes I made. I consider them sound but arguing against them should
>> certainly be possible.
>
> I have given some of the disadvantages already:
>
> - Inheriting attributes from more than one parent, by any means, is
> more difficult to get right;
As I already wrote a couple of times: This is for single-inheritance
hierarchies. Consequently, the fact that it really isn't suitable for
'class heterarchies' is no more a disadvantage of this scheme than it is
'a disadvantage' of a car that it can't fly or swim: It isn't
supposed to.
> - Changing the attributes of a class at runtime (including changing
> the inheritance) will cause existing compiled methods to refer to
> the wrong attributes;
I understand this as "code could be written with the explicit
intention to break this and then, it would break". Which isn't exactly
a surprise and holds for all code.
[yet more irrelevant stuff deleted]
BTW, if I had any talent as a graphical artist (which I unfortunately
don't), I would scan the cover of the camel book, would decorate
the head of the poor beast with a pair of gigantic antlers (maybe with
some dripping bits of water plants on them) and would publish this as
a 'Remove what doesn't fit into this picture' cartoon. I figure that
would leave you with a pair of antlers hanging in the air for no
particular reason and me with a camel. Just to return one of the snide
remarks.
------------------------------
Date: Wed, 20 Mar 2013 23:30:26 +0000
From: Rainer Weikusat <rweikusat@mssgmbh.com>
Subject: a trival array/ hash benchmark
Message-Id: <874ng5ps31.fsf@sapphire.mobileactivedefense.com>
When running the trivial microbenchmark
-----------
use Benchmark;

my $h = { Worschtsupp => 4 };
my $a = [4];

timethese(-5,
          {
           h => sub { return $h->{Worschtsupp}; },
           a => sub { return $a->[0]; }});
------------
on 'some computer', the result of three runs was that the hash lookup
ran at about 28.31% of the speed of the array access on average.
10,000,000 hash lookups are needed in order to spend 1s of processing
time solely on that, and about 33,333,333 array accesses could have
been done in the same time.
------------------------------
Date: Thu, 21 Mar 2013 00:21:39 +0000
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: a trival array/ hash benchmark
Message-Id: <jd0q1a-c4r1.ln1@anubis.morrow.me.uk>
Quoth Rainer Weikusat <rweikusat@mssgmbh.com>:
> When running the trivial microbenchmark
>
> -----------
> use Benchmark;
>
> my $h = { Worschtsupp => 4 };
> my $a = [4];
>
> timethese(-5,
> {
> h => sub { return $h->{Worschtsupp}; },
> a => sub { return $a->[0]; }});
> ------------
>
> on 'some computer', the result of three runs was that the hash lookup
> ran at about 28.31% of the speed of the array access on average.
> 10,000,000 hash lookups are needed in order to spend 1s of processing
> time solely on that and about 33,333,333 could have been done in the
> same time.
Like most microbenchmarks, this tells you very little about real code.
Try something a bit more realistic, like
    #!/opt/perl/bin/perl

    use Benchmark qw/cmpthese/;

    sub one_a { my ($self) = @_; $self->[0]; }
    sub one_h { my ($self) = @_; $self->{a}; }
    sub two_a { my ($self) = @_; $self->one_a; }
    sub two_h { my ($self) = @_; $self->one_h; }
    sub if_a { my ($self) = @_; if (rand > 0.2) { $self->two_a } }
    sub if_h { my ($self) = @_; if (rand > 0.2) { $self->two_h } }

    my $aref = bless [1];
    my $href = bless { a => 1 };

    cmpthese -5, {
        one_a => sub { $aref->one_a },
        one_h => sub { $href->one_h },
        two_a => sub { $aref->two_a },
        two_h => sub { $href->two_h },
        if_a  => sub { $aref->if_a },
        if_h  => sub { $href->if_h },
    };
for which I get
               Rate  if_h  if_a two_h two_a one_h one_a
    if_h   858310/s    --   -3%  -26%  -32%  -61%  -65%
    if_a   886965/s    3%    --  -23%  -30%  -60%  -64%
    two_h 1157184/s   35%   30%    --   -8%  -48%  -53%
    two_a 1259550/s   47%   42%    9%    --  -43%  -49%
    one_h 2227216/s  159%  151%   92%   77%    --   -9%
    one_a 2458615/s  186%  177%  112%   95%   10%    --
so, as I suspected, the method-call overhead completely dominates the
overhead of the hash lookup. If you can save one method call, you will
save more time than you would have saved by using an arrayref; and under
normal circumstances I would not hesitate to break a method into two if
it made the code clearer.
Ben
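
The comparison in the last paragraph can be made concrete by converting
the posted rates into time per call. The sketch below only redoes
arithmetic on four rates from the cmpthese output in this message. The
helper name ns_per_call is made up, and the resulting figures apply to
the machine and perl that produced those rates, not in general:

```perl
use strict;
use warnings;

# Rates (calls per second) copied from the cmpthese output in this message.
my %rate = (
    one_a => 2458615,
    one_h => 2227216,
    two_a => 1259550,
    two_h => 1157184,
);

# Nanoseconds spent per call for a given benchmark name.
sub ns_per_call {
    my ($name) = @_;
    return 1e9 / $rate{$name};
}

# Cost of reading a hash element instead of an array element.
my $hash_cost = ns_per_call('one_h') - ns_per_call('one_a');

# Cost of going through one additional method call (array case).
my $method_cost = ns_per_call('two_a') - ns_per_call('one_a');

printf "hash instead of array: %.0f ns per call\n", $hash_cost;
printf "one extra method call: %.0f ns per call\n", $method_cost;
```

On these numbers the hash lookup adds roughly 42 ns per call and the
extra method call roughly 387 ns.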
------------------------------
Date: Thu, 21 Mar 2013 00:44:23 +0000 (UTC)
From: Eli the Bearded <*@eli.users.panix.com>
Subject: Re: a trival array/ hash benchmark
Message-Id: <eli$1303202033@qz.little-neck.ny.us>
In comp.lang.perl.misc, Rainer Weikusat <rweikusat@mssgmbh.com> wrote:
> When running the trivial microbenchmark
I love these sorts of things, so I tried to duplicate your results.
> -----------
> use Benchmark;
>
> my $h = { Worschtsupp => 4 };
> my $a = [4];
>
> timethese(-5,
> {
> h => sub { return $h->{Worschtsupp}; },
> a => sub { return $a->[0]; }});
> ------------
>
> on 'some computer', the result of three runs was that the hash lookup
> ran at about 28.31% of the speed of the array access on average.
Why not post actual results? When I run it three times:
a: 5 wallclock secs ( 5.30 usr + 0.00 sys = 5.30 CPU) @ 18706996.60/s (n=99147082)
h: 4 wallclock secs ( 5.24 usr + 0.00 sys = 5.24 CPU) @ 15713758.59/s (n=82340095)
a: 6 wallclock secs ( 5.57 usr + 0.00 sys = 5.57 CPU) @ 28009483.30/s (n=156012822)
h: 6 wallclock secs ( 5.27 usr + 0.00 sys = 5.27 CPU) @ 24815075.90/s (n=130775450)
a: 5 wallclock secs ( 5.21 usr + 0.00 sys = 5.21 CPU) @ 19115772.55/s (n=99593175)
h: 4 wallclock secs ( 5.03 usr + 0.00 sys = 5.03 CPU) @ 22237998.61/s (n=111857133)
And adding two more test cases, cause Benchmarks Are Fun:
my %rh = ( Worschtsupp => 4 );
my @ra = ( 4 );
rh => sub { return $rh{Worschtsupp}; },
ra => sub { return $ra[0]; }
I get these:
ra: 5 wallclock secs ( 5.22 usr + 0.00 sys = 5.22 CPU) @ 25632694.64/s (n=133802666)
rh: 6 wallclock secs ( 5.25 usr + 0.00 sys = 5.25 CPU) @ 17644305.71/s (n=92632605)
ra: 5 wallclock secs ( 5.07 usr + 0.00 sys = 5.07 CPU) @ 26592107.10/s (n=134821983)
rh: 5 wallclock secs ( 5.37 usr + 0.00 sys = 5.37 CPU) @ 18121233.71/s (n=97311025)
ra: 5 wallclock secs ( 5.25 usr + 0.01 sys = 5.26 CPU) @ 27205398.10/s (n=143100394)
rh: 5 wallclock secs ( 5.09 usr + 0.00 sys = 5.09 CPU) @ 17936919.45/s (n=91298920)
Array lookups appear faster than hash lookups (not shocking) and
dereferences are a little slower still. But nothing like the dramatic
difference you saw. I'm running v5.14.2 from Ubuntu 12.04.1 LTS, on an
unloaded real (ie not VM) box.
Elijah
------
has learned that benchmark results can vary widely based on Perl version
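
For anyone who wants to repeat the comparison, here is a sketch that puts
all four cases from this message into a single script. The case names h,
a, rh and ra follow the posts; the variable names are changed (a lexical
$a would shadow the variable sort uses), the lexical array is read by its
own name, and the run time is cut to about one CPU second per case,
where the posts used -5. Results will differ between machines and perl
versions, so no expected numbers are given:

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

my $href  = { Worschtsupp => 4 };
my $aref  = [4];
my %hash  = ( Worschtsupp => 4 );
my @array = (4);

my $count = -1;    # at least 1 CPU second per case; the posts used -5

# Style 'none' suppresses the per-case timing lines; the comparison
# table is printed by cmpthese afterwards.
my $results = timethese($count, {
    h  => sub { return $href->{Worschtsupp} },
    a  => sub { return $aref->[0] },
    rh => sub { return $hash{Worschtsupp} },
    ra => sub { return $array[0] },
}, 'none');

cmpthese($results);
```

Running it several times, as was done above, gives some idea of how much
the rates vary from run to run.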
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 3906
***************************************