[12098] in Perl-Users-Digest
Perl-Users Digest, Issue: 5697 Volume: 8
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon May 17 11:15:21 1999
Date: Mon, 17 May 99 08:01:17 -0700
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Mon, 17 May 1999 Volume: 8 Number: 5697
Today's topics:
Monadic classes, eponymous metaobjects, and translucent <tchrist@mox.perl.com>
Special: Digest Administrivia (Last modified: 12 Dec 98 (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: 17 May 1999 08:47:48 -0700
From: Tom Christiansen <tchrist@mox.perl.com>
Subject: Monadic classes, eponymous metaobjects, and translucent data members
Message-Id: <37402c14@cs.colorado.edu>
I've reworked quite a bit of perltootc, so I'm going to send the whole
thing out again. Here's its table of contents:
Class Data as Package Variables
Putting All Your Eggs in One Basket
Inheritance Concerns
The Eponymous Meta-Object
Indirect References to Class Data
Monadic Classes
Translucent Attributes
Class Data as Lexical Variables
Privacy and Responsibility
File-Scoped Lexicals
More Inheritance Concerns
Locking the Door and Throwing Away the Key
Translucency Revisited
Again, you can access pre-formatted versions from the links off the
What's New page at http://language.perl.com/admin/whats_new.html if
you prefer.
Meanwhile, I've got a plane to catch. :-)
--tom
=head1 NAME
perltootc - Tom's OO Tutorial for Class Data in Perl
=head1 DESCRIPTION
When designing an object class, you are sometimes faced with the situation
of wanting common state data shared by all objects of that class.
Such I<class data> act somewhat like global variables for the entire
class, but unlike program-wide globals, class data have meaning only to
the class itself.
Here are a few examples where class data might come in handy:
=over
=item *
to keep a count of the objects you've created, or how many are
still extant.
=item *
to extract the name or file descriptor for a logfile used by a debugging
method.
=item *
to access collective data, like the total amount of cash dispensed by
all ATMs in a network in a given day.
=item *
to access the last object created by a class, or the most accessed object,
or to retrieve a list of all objects.
=back
Unlike a true global, class data should not be accessed directly.
Instead, their state is inspected, and perhaps altered, through the
mediated access of I<class methods>. These class data accessor methods
are similar in spirit and function to accessors used to manipulate the
state of instance data on an object. They provide a clear firewall
between interface and implementation.
You should allow access to class data through either the class name
or any object of that class. If we assume that $an_object
is of type Some_Class, and the &Some_Class::population_count method
accesses class data, then these two invocations should both be possible,
and almost certainly equivalent.
Some_Class->population_count()
$an_object->population_count()
The question is, where do you store the state which that method
accesses?
In theory, one could simply use instance data stored on every object to
implement class data. But you might not want to do that. For one thing,
it would have some small impact on storage efficiency. More importantly,
duplication could lead to error.
Class data provide a much cleaner solution than emulation through instance
data. Unlike more restrictive languages like C++, where these are called
static data members, Perl provides no formal, syntactic mechanism to
declare class data, any more than it provides a syntactic mechanism
to declare instance data. Instead, Perl provides the developer with
a broad set of powerful but flexible features that the creative mind
can use to produce a custom design uniquely crafted to the particular
demands of the situation.
A class in Perl is typically implemented in a module. A module consists
of two complementary feature sets: a package for interfacing with the
outside world, and a lexical file scope for privacy. Either of these two
mechanisms can be used to implement class data. That means you get to
decide whether to put your class data in package variables or to put them
in lexical variables.
And those aren't the only decisions to make. If you choose to use package
variables, you can make your class data accessor methods either ignorant
of inheritance or sensitive to it. If you choose lexical variables,
you can elect to permit access to them from anywhere in the entire file
scope, or you can limit direct data access exclusively to the methods
implementing those attributes.
=head1 Class Data as Package Variables
Because a class in Perl is really just a package, using package variables
to hold class data is the most natural choice. This makes it simple
for each class to have its own class data. Let's say you have a class
called Some_Class that needs a couple of different attributes that you'd
like to be global to the entire class. The simplest thing to do is to
use package variables like $Some_Class::CData1 and $Some_Class::CData2
to hold these attributes. But we certainly don't want to encourage
outsiders to diddle those bits directly, so we provide access methods
to mediate access.
In the accessor methods below, we'll for now just ignore the first
argument--that bit to the left of the arrow on method invocation, which
is either a class name or an object reference.
package Some_Class;
sub CData1 {
shift; # XXX: ignore calling class/object
$Some_Class::CData1 = shift if @_;
return $Some_Class::CData1;
}
sub CData2 {
shift; # XXX: ignore calling class/object
$Some_Class::CData2 = shift if @_;
return $Some_Class::CData2;
}
This technique is highly legible and should be completely straightforward
to even the novice Perl programmer. By fully qualifying the package
variables, they stand out clearly when reading the code. Unfortunately,
if you misspell one of these, you've introduced an error that's hard
to catch. It's also somewhat disconcerting to see the class name itself
hard-coded in so many places.
Both these problems can be easily fixed. Just add the C<use strict>
pragma, then pre-declare your package variables.
package Some_Class;
use strict;
our($CData1, $CData2); # our() is new to perl5.006
sub CData1 {
shift; # XXX: ignore calling class/object
$CData1 = shift if @_;
return $CData1;
}
sub CData2 {
shift; # XXX: ignore calling class/object
$CData2 = shift if @_;
return $CData2;
}
As with any other global variable, some programmers prefer to start their
package variables with capital letters. This helps clarity somewhat, but
by no longer fully qualifying the package variables, their significance
can be lost when reading the code. You can fix this easily enough by
choosing better names than used here.
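As a quick sanity check, here is a minimal sketch that calls one of these accessors both through the class name and through an object. The new() constructor and the sample value are assumptions added only for illustration; they are not part of the class as shown above.

```perl
use strict;
use warnings;

{
    package Some_Class;
    our $CData1;                    # package variable holding the class data

    # hypothetical minimal constructor, here only so we have an object
    sub new { return bless {}, shift }

    sub CData1 {
        shift;                      # ignore calling class/object
        $CData1 = shift if @_;
        return $CData1;
    }
}

Some_Class->CData1("shared value");     # set through the class name
my $an_object = Some_Class->new();
print $an_object->CData1(), "\n";       # read through an object
```

Both invocations reach the same $Some_Class::CData1, so the object sees whatever was set through the class name.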
=head2 Putting All Your Eggs in One Basket
Just as the mindless enumeration of accessor methods for instance data
grows tedious after the first few (see L<perltoot>), so too does the
repetition begin to grate when listing out accessor methods for class
data. Repetition runs counter to the primary virtue of a programmer:
Laziness, here manifesting as that innate urge every programmer feels
to factor out duplicate code whenever possible.
Here's what to do. First, make just one hash to hold all shared data
attributes of the class proper (rather than those particular to each
object).
package Some_Class;
use strict;
our %ClassData = ( # our() is new to perl5.006
CData1 => "",
CData2 => "",
);
Now clone off class data accessor methods for each key in the
%ClassData hash.
for my $datum (keys %ClassData) {
no strict "refs"; # to register new methods in package
*$datum = sub {
shift; # XXX: ignore calling class/object
$ClassData{$datum} = shift if @_;
return $ClassData{$datum};
}
}
It's true that you could work out a solution employing
an &AUTOLOAD method, but this is unlikely to prove satisfactory.
It would have to distinguish between class data and object data; it could
interfere with inheritance; and it would have to be careful about DESTROY.
Such complexity is uncalled for in most cases, and certainly in this one.
You may wonder why we're rescinding strict refs for the loop. We're
manipulating the package's symbol table to introduce new function names
using symbolic references (indirect naming), which the strict pragma
would otherwise forbid. Normally, symbolic references are a dodgy
notion at best. This isn't just because they can be used accidentally
when you aren't meaning to. It's also because for most kinds of uses
to which beginning Perl programmers attempt to put symbolic references,
we have much better approaches, like nested hashes or hashes of arrays.
There's nothing wrong with using symbolic references to manipulate
something that is meaningful only from the perspective of the package
symbol table, like method names or package variables. In other
words, when you want to refer to the symbol table, use symbolic references.
Clustering all the class variables in one place has several advantages.
They're easy to spot, initialize, and change. The aggregation also
makes them convenient to access externally, such as from a debugger
or a persistence package. The only possible problem is that we don't
automatically know the name of each class's class object, should it have
one. This issue is addressed below in L<"The Eponymous Meta-Object">.
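Putting the pieces of this section together, a self-contained sketch might look like the following. The values "red" and "blue" are made up for the demonstration, and the last line merely illustrates the point about external access to the aggregated hash.

```perl
use strict;
use warnings;

{
    package Some_Class;
    our %ClassData = (
        CData1 => "",
        CData2 => "",
    );

    # clone one accessor per key of %ClassData
    for my $datum (keys %ClassData) {
        no strict "refs";           # to register new methods in package
        *$datum = sub {
            shift;                  # ignore calling class/object
            $ClassData{$datum} = shift if @_;
            return $ClassData{$datum};
        };
    }
}

Some_Class->CData1("red");
Some_Class->CData2("blue");
print join(", ", map { "$_=" . Some_Class->$_() } qw(CData1 CData2)), "\n";

# the whole cluster is visible from outside, e.g. to a debugger
print join(", ", sort keys %Some_Class::ClassData), "\n";
```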
=head2 Inheritance Concerns
Suppose you have an instance of a derived class, and you access class
data using an inherited method call. Should that wind up referring
to the base class's data attributes, or to those in the derived class?
How would it work in the earlier examples? The derived class inherits
all the base class's methods, including those that access class data.
But what package are the class data attributes stored in?
The answer is that, as written, class data are stored in the package into
which those methods were compiled. When you invoke the &CData1 method
on the name of the derived class or on one of that class's objects, the
version shown above is still run, so you'll access $Some_Class::CData1--or
in the method cloning version, C<$Some_Class::ClassData{CData1}>.
Think of these class methods as executing in the context of their base
class, not in that of their derived class. Sometimes this is exactly
what you want. If Feline subclasses Carnivore, then the population of
Carnivores in the world should go up when a new Feline is born.
But what if you wanted to figure out how many Felines you have apart
from Carnivores? The current approach doesn't support that.
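To make the Carnivore and Feline case concrete, here is a small sketch. The class bodies, the $Population variable, and the method names are assumptions invented for this illustration; only the class names come from the text.

```perl
use strict;
use warnings;

{
    package Carnivore;
    our $Population = 0;

    sub new {
        my $obclass = shift;
        $Population++;              # always $Carnivore::Population
        return bless {}, ref($obclass) || $obclass;
    }

    sub population {
        shift;                      # ignore calling class/object
        return $Population;
    }
}

{
    package Feline;
    our @ISA = ("Carnivore");       # inherits new() and population()
}

my @zoo = (Carnivore->new(), Feline->new(), Feline->new());
print Carnivore->population(), "\n";    # 3
print Feline->population(), "\n";       # 3 as well
```

Both calls report the same number, because the inherited method only ever looks at the variable in the package it was compiled into.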
You'll have to decide on a case-by-case basis whether it makes any
sense for class data to be package-relative. If you want it to be so,
then stop ignoring the first argument to the function. It will either
be a package name if the method was invoked directly on a class name,
or else it will be an object reference if the method was invoked on an
object reference--in which case the ref() function provides the class
of that object. Then use the resulting class name as the package in
which to look up the variable.
package Some_Class;
sub CData1 {
my $obclass = shift;
my $class = ref($obclass) || $obclass;
my $varname = $class . "::CData1";
no strict "refs"; # to access package data symbolically
$$varname = shift if @_;
return $$varname;
}
And then do likewise for all other class data attributes (such as CData2,
etc.) that you wish to access as package variables in the invoking package
instead of the compiling package as we had previously.
Once again we temporarily disable the strict references ban, because
otherwise we couldn't use the fully-qualified symbolic name for
the package global. This is perfectly reasonable: since all package
variables by definition live in a package, there's nothing wrong with
accessing them via that package's symbol table. That's what it's there
for (well, somewhat).
What about the case of using a single hash for everything and then cloning
methods? What would that look like? The only difference would be the
closure used to produce new method entries for the class's symbol table.
no strict "refs";
*$datum = sub {
my $obclass = shift;
my $class = ref($obclass) || $obclass;
my $varname = $class . "::ClassData";
$varname->{$datum} = shift if @_;
return $varname->{$datum};
}
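Here is one way the package-relative closure could be exercised, as a sketch only. The Carnivore and Feline classes, the Population key, and the numbers are assumptions made up for the demonstration. Note that the derived class has to declare its own %ClassData for there to be anything to find.

```perl
use strict;
use warnings;

{
    package Carnivore;
    our %ClassData = (Population => 0);

    for my $datum (keys %ClassData) {
        no strict "refs";
        *$datum = sub {
            my $obclass = shift;
            my $class   = ref($obclass) || $obclass;
            my $varname = $class . "::ClassData";   # invoking package's hash
            $varname->{$datum} = shift if @_;
            return $varname->{$datum};
        };
    }
}

{
    package Feline;
    our @ISA       = ("Carnivore");
    our %ClassData = (Population => 0);             # Feline's own class data
}

Carnivore->Population(7);
Feline->Population(2);
print Carnivore->Population(), " ", Feline->Population(), "\n";   # 7 2
```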
=head2 The Eponymous Meta-Object
The %ClassData hash in the previous example is neither the most
imaginative nor the most intuitive of names. Is there something else that
might make more sense, be more useful, or both?
As it happens, yes, there is. For the (unblessed) "class meta-object",
we'll use a package variable of the same name as the package itself.
In our case, that's %Some_Class::Some_Class. This is reminiscent of
classes that name their constructors eponymously. That is, class
Some_Class would use &Some_Class::Some_Class as a constructor, probably
even exporting that name as well. The StrNum class in Recipe 13.14 in
I<The Perl Cookbook> does this, if you're looking for an example.
Within the scope of a package Some_Class declaration, let the eponymously
named hash %Some_Class be that package's meta-object. This predictable
approach has many benefits, including having a well-known identifier to
be used to aid in debugging, transparent persistence, or checkpointing.
It's also the obvious name for monadic classes and translucent attributes,
discussed later.
Here's an example of such a class. Notice how the name of the
hash storing the meta-object is the same as the name of the package
used to implement the class.
package Some_Class;
use strict;
# create class meta-object using that most perfect of names
our %Some_Class = ( # our() is new to perl5.006
CData1 => "",
CData2 => "",
);
# this accessor is calling-package-relative
sub CData1 {
my $obclass = shift;
my $class = ref($obclass) || $obclass;
no strict "refs"; # to access eponymous meta-object
$class->{CData1} = shift if @_;
return $class->{CData1};
}
# but this accessor is not
sub CData2 {
shift; # XXX: ignore calling class/object
no strict "refs"; # to access eponymous meta-object
__PACKAGE__ -> {CData2} = shift if @_;
return __PACKAGE__ -> {CData2};
}
In the second accessor method, the __PACKAGE__ notation was used for
two reasons. First, to avoid hardcoding the literal package name
in the code in case we later want to change that name. Second, to
clarify to the reader that what matters here is the package currently
being compiled into, not the package of the invoking object or class.
If the long string of non-alphabetic characters bothers you, you can
always put the __PACKAGE__ in a variable first.
sub CData2 {
shift; # XXX: ignore calling class/object
no strict "refs"; # to access eponymous meta-object
my $class = __PACKAGE__;
$class->{CData2} = shift if @_;
return $class->{CData2};
}
Even though we're using symbolic references for good not evil, some
folks tend to become unnerved when they see so many places with strict
ref checking disabled. Given a symbolic reference, you can always
produce a real reference (the reverse is not true, though). So we'll
create a subroutine that does this conversion for us. If invoked as a
function of no arguments, it returns a reference to the compiling class's
eponymous hash. Invoked as a class method, it returns a reference to
the eponymous hash of its caller. And when invoked as an object method,
this function returns a reference to the eponymous hash for whatever
class the object belongs to.
package Some_Class;
use strict;
our %Some_Class = ( # our() is new to perl5.006
CData1 => "",
CData2 => "",
);
# tri-natured: function, class method, or object method
sub _classobj {
my $obclass = shift || __PACKAGE__;
my $class = ref($obclass) || $obclass;
no strict "refs"; # to convert sym ref to real one
return \%$class;
}
for my $datum (keys %{ _classobj() } ) {
# turn off strict refs so that we can
# register a method in the symbol table
no strict "refs";
*$datum = sub {
use strict "refs";
my $self = shift->_classobj();
$self->{$datum} = shift if @_;
return $self->{$datum};
}
}
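A short sketch of the three ways of calling &_classobj follows. The bare bless used to conjure up an object is an assumption for the demo, since the class above defines no constructor.

```perl
use strict;
use warnings;

{
    package Some_Class;
    our %Some_Class = (
        CData1 => "",
        CData2 => "",
    );

    # tri-natured: function, class method, or object method
    sub _classobj {
        my $obclass = shift || __PACKAGE__;
        my $class   = ref($obclass) || $obclass;
        no strict "refs";           # to convert sym ref to real one
        return \%$class;
    }

    for my $datum (keys %{ _classobj() }) {
        no strict "refs";           # to register a method in the symbol table
        *$datum = sub {
            use strict "refs";
            my $self = shift->_classobj();
            $self->{$datum} = shift if @_;
            return $self->{$datum};
        };
    }
}

my $obj         = bless {}, "Some_Class";   # stand-in object for the demo
my $as_function = Some_Class::_classobj();
my $as_class    = Some_Class->_classobj();
my $as_object   = $obj->_classobj();

print "same hash\n" if $as_function == $as_class && $as_class == $as_object;

$obj->CData1("set via object");
print Some_Class->CData1(), "\n";           # set via object
```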
=head2 Indirect References to Class Data
A reasonably common strategy for handling class data is to store
a reference to each class attribute on the object itself. This is
a strategy you've probably seen before, such as in L<perltoot> and
L<perlbot>, but there may be variations in the example below that you
haven't thought of before.
package Some_Class;
our($CData1, $CData2); # our() is new to perl5.006
sub new {
my $obclass = shift;
return bless my $self = {
ObData1 => "",
ObData2 => "",
CData1 => \$CData1,
CData2 => \$CData2,
} => (ref $obclass || $obclass);
}
sub ObData1 {
my $self = shift;
$self->{ObData1} = shift if @_;
return $self->{ObData1};
}
sub ObData2 {
my $self = shift;
$self->{ObData2} = shift if @_;
return $self->{ObData2};
}
sub CData1 {
my $self = shift;
my $dataref = ref $self
? $self->{CData1}
: \$CData1;
$$dataref = shift if @_;
return $$dataref;
}
sub CData2 {
my $self = shift;
my $dataref = ref $self
? $self->{CData2}
: \$CData2;
$$dataref = shift if @_;
return $$dataref;
}
As written above, a derived class will inherit these methods, which will
consequently access class data in the base class's package. This is not
necessarily expected behavior in all circumstances. Here's an example
that uses a variable meta-object, taking care to access the proper
package's data.
package Some_Class;
use strict;
our %Some_Class = ( # our() is new to perl5.006
CData1 => "",
CData2 => "",
);
sub _classobj {
my $self = shift;
my $class = ref($self) || $self;
no strict "refs";
# get (hard) ref to eponymous meta-object
return \%$class;
}
sub new {
my $obclass = shift;
my $classobj = $obclass->_classobj();
bless my $self = {
ObData1 => "",
ObData2 => "",
CData1 => \$classobj->{CData1},
CData2 => \$classobj->{CData2},
} => (ref $obclass || $obclass);
return $self;
}
sub ObData1 {
my $self = shift;
$self->{ObData1} = shift if @_;
return $self->{ObData1};
}
sub ObData2 {
my $self = shift;
$self->{ObData2} = shift if @_;
return $self->{ObData2};
}
sub CData1 {
my $self = shift;
# on an object, use its stored reference; on a class name,
# take a reference straight into the class meta-object
my $dataref = ref $self
? $self->{CData1}
: \$self->_classobj()->{CData1};
$$dataref = shift if @_;
return $$dataref;
}
sub CData2 {
my $self = shift;
my $dataref = ref $self
? $self->{CData2}
: \$self->_classobj()->{CData2};
$$dataref = shift if @_;
return $$dataref;
}
Not only are we now strict refs clean, but using an eponymous meta-object
seems to make the code cleaner. Unlike the previous version, this one
does something reasonable in the face of inheritance: it accesses the
class meta-object in the invoking class instead of the one into which
the method was initially compiled.
Inquisitive folk can easily access data in the class meta-object, making
it easy to dump the complete class state using an external mechanism such
as when debugging or implementing a persistent class. This works because
the class meta-object is a package variable, has a well-known name, and
clusters all its data together. (To be honest, transparent persistence
is not always feasible, but it's certainly an appealing idea.)
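For instance, an outside observer could snapshot the class state with the standard Data::Dumper module. This is only a sketch: the "alpha" and "beta" values are placeholders, and a real persistence layer would need a good deal more than this.

```perl
use strict;
use warnings;
use Data::Dumper;

{
    package Some_Class;
    our %Some_Class = (
        CData1 => "alpha",          # placeholder values for the demo
        CData2 => "beta",
    );
}

# the meta-object is a package variable with a well-known name,
# so code outside the class can reach it directly
local $Data::Dumper::Sortkeys = 1;
my $snapshot = Dumper(\%Some_Class::Some_Class);
print $snapshot;
```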
There's still no check that the object accessor methods have not been
invoked on the classname. If strict ref checking is enabled, you'd
blow up. If not, then you get the eponymous meta-object. What you do
with--or about--this is up to you. The next two sections demonstrate
innovative uses for this powerful feature.
=head2 Monadic Classes
Several standard modules shipped with Perl provide class interfaces
without any attribute methods whatsoever. The most commonly used module
not numbered amongst the pragmata, the Exporter module, is a class with
neither constructors nor attributes. Its job is simply to provide a
standard interface for modules wishing to export part of their namespace
into that of their caller. Modules use the Exporter's &import method by
setting their inheritance list in their package's @ISA array to mention
"Exporter". But class Exporter provides no constructor, so you can't have
several instances of the class. In fact, you can't have any--it just
doesn't make any sense. All you get is its methods. Its interface
contains no statefulness, so state data is wholly superfluous.
Another sort of class that pops up from time to time is one that supports
a unique instance. Such classes are called I<monadic classes>, or less
formally, I<highlander classes>, since there can be but a solitary
instance of that class in creation at any one time.
If a class is monadic, where do you store its state, that is, its
data attributes? How do you make sure that there's never more than
one instance?
While you could merely use a slew of package variables, it's a lot
cleaner to use the eponymously named hash. Here's a complete example
of a monadic class:
package Cosmos;
%Cosmos = ();
# accessor method for "name" attribute
sub name {
my $self = shift;
$self->{name} = shift if @_;
return $self->{name};
}
# read-only accessor method for "birthday" attribute
sub birthday {
my $self = shift;
die "can't reset birthday" if @_; # XXX: croak() is better
return $self->{birthday};
}
# accessor method for "stars" attribute
sub stars {
my $self = shift;
$self->{stars} = shift if @_;
return $self->{stars};
}
# oh my - one of our stars just went out!
sub nova {
my $self = shift;
my $count = $self->stars();
$self->stars($count - 1) if $count > 0;
}
# constructor/initializer method - fix by reboot
sub bigbang {
my $self = shift;
%$self = (
name => "the world according to tchrist",
birthday => time(),
stars => 0,
);
return $self; # yes, it's probably a class. SURPRISE!
}
# After the class is compiled, but before any use or require
# returns, we start off the universe with a bang.
__PACKAGE__ -> bigbang();
Hold on, that doesn't look like anything special. Those attribute
accessors look no different than they would if this were a regular class
instead of a monadic one. The crux of the matter is there's nothing
that says that $self must hold a reference to a blessed object. It just
has to be something you can invoke methods on. Here the package name
itself, Cosmos, works as an object. Look at the &nova method. Is that
a class method or an object method? The answer is that static analysis
cannot reveal the answer. Perl doesn't care, and neither should you.
In the three attribute methods, C<%$self> is really accessing the %Cosmos
package variable.
If, like Stephen Hawking, you posit the existence of multiple, sequential,
and unrelated universes, then you can invoke the &bigbang method yourself
at any time to start everything all over again. You might think of
&bigbang as more of an initializer than a constructor, since the function
doesn't allocate new memory; it only initializes what's already there.
But like any other constructor, it does return a scalar value to use
for later method invocations.
Imagine that some day in the future, you decide that one universe just
isn't enough. You could write a new class from scratch, but you already
have an existing class that does what you want--except that it's monadic,
and you want more than just one cosmos.
That's what code reuse via subclassing is all about. Just look here
how brief your task turns out to be:
package Multiverse;
use Cosmos;
@ISA = qw(Cosmos);
sub new {
my $protoverse = shift;
my $class = ref($protoverse) || $protoverse;
my $self = {};
return bless($self, $class)->bigbang();
}
1;
Because we were careful to be good little creators when we designed our
Cosmos class, we can now reuse it without touching a single bit of code
when it comes time to write our Multiverse class. The same code that
worked when invoked as a class method continues to work perfectly well
when invoked against separate instances of a derived class.
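To watch the two classes side by side, here is a trimmed-down, single-file sketch. It keeps only the "stars" attribute, drops the C<use Cosmos> line because everything lives in one file, and the star counts are made up for the demonstration.

```perl
use strict;
use warnings;

{
    package Cosmos;
    no strict "refs";       # the class name itself serves as the "object"
    our %Cosmos = ();

    sub stars {
        my $self = shift;
        $self->{stars} = shift if @_;
        return $self->{stars};
    }

    sub nova {
        my $self  = shift;
        my $count = $self->stars();
        $self->stars($count - 1) if $count > 0;
    }

    sub bigbang {
        my $self = shift;
        %$self = (stars => 0);
        return $self;
    }

    __PACKAGE__->bigbang();
}

{
    package Multiverse;
    our @ISA = ("Cosmos");

    sub new {
        my $protoverse = shift;
        my $class      = ref($protoverse) || $protoverse;
        return bless({}, $class)->bigbang();
    }
}

Cosmos->stars(3);           # the monadic class, used as its own object
Cosmos->nova();

my $here  = Multiverse->new();
my $there = Multiverse->new();
$here->stars(10);
$there->stars(20);
$here->nova();

print join(" ", Cosmos->stars(), $here->stars(), $there->stars()), "\n";  # 2 9 20
```

The same &stars and &nova code runs in all three cases; only what sits to the left of the arrow changes.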
The astonishing thing about the Cosmos class above is that the value
returned by the &bigbang "constructor" is not a reference to a blessed
object at all. It's just the class's own name. Perl doesn't mind
a whit, and neither should you. A class name is, for virtually all
intents and purposes, a perfectly acceptable object. It has state and
behavior, and although its identity might be deemed a trifle unwavering,
it certainly manifests inheritance, polymorphism, and encapsulation.
And what more can you ask of an object?
Recognizing the unification of what other programming languages might
think of as class methods and object methods into just plain methods
is key to understanding object orientation in Perl. "Class methods" and
"object methods" are distinct only in the compartmentalizing mind of the
Perl programmer, not to the Perl language.
Along those same lines, a constructor is nothing special either, which
is part of why Perl has no pre-ordained name for them. "Constructor"
is just an informal term loosely used to describe a method that returns
a scalar value that you can make further method calls against. So long
as it's either a class name or an object reference, that's good enough.
It doesn't even have to be a reference to a brand new object, either.
You can have as many--or as few--constructors as you want, and you can
name them whatever you care to. Blindly and obediently using new()
for each and every constructor you ever write is to speak Perl with
such a severe C++ accent that you do a disservice to both languages.
There's no reason to insist that each class have but one constructor,
or that that constructor be named new(), or that that constructor be
used solely as a class method and not an object method.
In the next section, we show how useful it can be to further distance
ourselves from any formal distinction between class method calls and
object method calls, both in constructors and in accessor methods.
=head2 Translucent Attributes
A package's eponymous hash can be used for more than just containing
per-class, global state data. It can also serve as a sort of template
containing default settings for object attributes. These default settings
can then be used in constructors for initialization of a particular object.
The class's eponymous hash can also be used to implement I<translucent
attributes>. A translucent attribute is an object attribute which, when
the value is retrieved from an object that has not intentionally set
that particular attribute, will instead return the value for that field
in the eponymous meta-object. In other words, translucent attributes
"see through" uninitialized object attributes to get at the meta-object's
values instead.
We'll apply something of a copy-on-write approach to these
translucent attributes. If you're just fetching values from them,
you get translucency. But if you store a new value to them, that new
value lodges on the current object. On the other hand, if you use the
class as an object and store the attribute value directly on the class,
then the meta-object's value changes, and later fetch operations on
objects with uninitialized values for those attributes will retrieve the
meta-object's new values. Objects with their own initialized values,
however, won't see any change.
Let's look at some concrete examples of using these properties before we
show how to implement them. Suppose that a class named Vermin
had a translucent data member called "color". First you set the color
in the meta-object, then you create three objects using a constructor
that happens to be named &spawn.
use Vermin;
Vermin->color("vermilion");
$ob1 = Vermin->spawn(); # so that's where Jedi come from
$ob2 = Vermin->spawn();
$ob3 = Vermin->spawn();
print $ob3->color(); # prints "vermilion"
Each of these objects' colors is now "vermilion", because that's the value
that the meta-object has for that attribute, and the objects in question
don't have their own color values.
Changing the attribute on one object has no effect on other objects
previously created.
$ob3->color("chartreuse");
print $ob3->color(); # prints "chartreuse"
print $ob1->color(); # prints "vermilion", translucently
If you now use $ob3 to spawn off another object, the new object will
take the color its parent held, which now happens to be "chartreuse".
That's because the constructor uses the invoking object as its template
for initializing data attributes. When that invoking object is the
class name, the object used as a template is the eponymous meta-object.
When the invoking object is a reference to an instantiated object, the
&spawn constructor uses that existing object as a template.
$ob4 = $ob3->spawn(); # $ob3 now template, not %Vermin
print $ob4->color(); # prints "chartreuse"
Any actual values set on the template object will be copied to the
new object. But attributes undefined in the template object, being
translucent, will remain undefined and consequently translucent in the
new one as well.
Now let's change the color attribute on the entire class:
Vermin->color("azure");
print $ob1->color(); # prints "azure"
print $ob2->color(); # prints "azure"
print $ob3->color(); # prints "chartreuse"
print $ob4->color(); # prints "chartreuse"
That color change took effect only in the first pair of objects, which
were still translucently accessing the meta-object's values. The second
pair had per-object initialized colors, and so didn't change.
One important question remains. Changes to the meta-object are reflected
in translucent attributes in the entire class, but what about
changes to discrete objects? If you change the color of $ob3, does the
value of $ob4 see that change? Or vice versa: if you change the color
of $ob4, does the value of $ob3 then shift?
$ob3->color("amethyst");
print $ob3->color(); # prints "amethyst"
print $ob4->color(); # hmm: "chartreuse" or "amethyst"?
While one could argue that in certain rare cases it should, let's not
do that. Good taste aside, we want the answer to the question posed in
the comment above to be "chartreuse", not "amethyst". So we'll treat
these attributes similar to the way process attributes like environment
variables, user and group IDs, or the current working directory are
treated across a fork(). You can change only yourself, but you will see
those changes reflected in your unspawned children. Changes to one object
will not propagate up to the parent or down to any existing child objects.
Those made later, however, will see the changes.
If you have an object with a discrete (read: not translucent) attribute
value, but you want to make that object's attribute value translucent
again later, what do you do? Let's design the class so that when you invoke
an accessor method with C<undef> as its argument, that attribute
returns to translucency.
$ob4->color(undef); # back to "azure"
Here's a complete implementation of Vermin as described above.
package Vermin;
# here's the class meta-object, eponymously named.
# it holds all class data, and also all instance data
# so the latter can be used for both initialization
# and translucency.
our %Vermin = ( # our() is new to perl5.006
PopCount => 0, # capital for class data
color => "beige", # small for instance data
);
# constructor method
# invoked as class method or object method
sub spawn {
my $obclass = shift;
my $class = ref($obclass) || $obclass;
my $self = {};
bless($self, $class);
$class->{PopCount}++;
# init fields from invoking object, or omit if
# invoking object is the class to provide translucency
%$self = %$obclass if ref $obclass;
return $self;
}
# translucent accessor for "color" attribute
# invoked as class method or object method
sub color {
my $self = shift;
my $class = ref($self) || $self;
# handle class invocation
unless (ref $self) {
$class->{color} = shift if @_;
return $class->{color}
}
# handle object invocation
$self->{color} = shift if @_;
if (defined $self->{color}) { # not exists!
return $self->{color};
} else {
return $class->{color};
}
}
# class data accessor for "PopCount" attribute
# invoked as class method or object method
# but uses object solely to locate meta-object
sub population {
my $obclass = shift;
my $class = ref($obclass) || $obclass;
return $class->{PopCount};
}
# instance destructor
# invoked only as object method
sub DESTROY {
my $self = shift;
my $class = ref $self;
$class->{PopCount}--;
}
Here are a couple of helper methods that might be convenient. They aren't
accessor methods at all. They're used to detect accessibility of data
attributes. The &is_translucent method determines whether a particular
object attribute is coming from the meta-object. The &has_attribute
method detects whether a class implements a particular property at all.
It could also be used to distinguish undefined properties from non-existent
ones.
# detect whether an object attribute is translucent
# (typically?) invoked only as object method
sub is_translucent {
my($self, $attr) = @_;
return !defined $self->{$attr};
}
# test for presence of attribute in class
# invoked as class method or object method
sub has_attribute {
my($self, $attr) = @_;
my $class = ref($self) || $self; # works for class or object invocation
return exists $class->{$attr};
}
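To see how translucency and the two helpers behave together, here is a condensed, self-contained sketch. The class name Critter is made up for this example, and the class is trimmed to a single attribute. The accessor logic follows the Vermin code above, with C<no strict "refs"> added so that the symbolic lookups through the class name are allowed under C<use strict>.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Critter;                # hypothetical stand-in for Vermin

# eponymous meta-object: holds the class-wide default
our %Critter = ( color => "beige" );

sub spawn {
    my $obclass = shift;
    my $class   = ref($obclass) || $obclass;
    my $self    = bless {}, $class;
    # copy fields only when cloning from an existing object
    %$self = %$obclass if ref $obclass;
    return $self;
}

sub color {
    my $self  = shift;
    my $class = ref($self) || $self;
    no strict "refs";           # $class->{...} is a symbolic reference
    unless (ref $self) {        # class invocation
        $class->{color} = shift if @_;
        return $class->{color};
    }
    $self->{color} = shift if @_;       # object invocation
    return defined $self->{color} ? $self->{color} : $class->{color};
}

sub is_translucent {
    my ($self, $attr) = @_;
    return !defined $self->{$attr};
}

sub has_attribute {
    my ($self, $attr) = @_;
    my $class = ref($self) || $self;
    no strict "refs";
    return exists $class->{$attr};
}

package main;

my $ob = Critter->spawn;
my @seen;
push @seen, $ob->color;         # class default shows through
Critter->color("azure");
push @seen, $ob->color;         # still translucent, so it follows the class
$ob->color("mauve");
push @seen, $ob->color;         # now the object has its own value
push @seen, $ob->is_translucent("color") ? "translucent" : "opaque";
$ob->color(undef);              # hand the attribute back to the class
push @seen, $ob->color;

print join(", ", @seen), "\n";  # beige, azure, mauve, opaque, azure
print "has color: ", (Critter->has_attribute("color") ? "yes" : "no"), "\n";
print "has size: ",  (Critter->has_attribute("size")  ? "yes" : "no"), "\n";
```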
If you prefer to install your accessors more generically, you can make
use of the upper-case versus lower-case convention to register into the
package appropriate methods cloned from generic closures.
for my $datum (keys %{ +__PACKAGE__ }) {
*$datum = ($datum =~ /^[A-Z]/)
? sub { # install class accessor
my $obclass = shift;
my $class = ref($obclass) || $obclass;
return $class->{$datum};
}
: sub { # install translucent accessor
my $self = shift;
my $class = ref($self) || $self;
unless (ref $self) {
$class->{$datum} = shift if @_;
return $class->{$datum}
}
$self->{$datum} = shift if @_;
return defined $self->{$datum}
? $self -> {$datum}
: $class -> {$datum}
}
}
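If you run under C<use strict>, the loop above needs symbolic references switched on, both for the glob assignment and for the lookups through the class name. The sketch below shows one way that might be arranged. The class Gremlin and its attributes are invented for illustration, and it iterates over the named hash directly rather than going through __PACKAGE__.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Gremlin;                # hypothetical class for illustration

our %Gremlin = (
    PopCount => 0,              # capitalized: class data
    mood     => "sullen",       # lowercase: translucent instance data
);

sub spawn {
    my $obclass = shift;
    my $class   = ref($obclass) || $obclass;
    no strict "refs";
    $class->{PopCount}++;
    return bless {}, $class;
}

for my $datum (keys %Gremlin) {
    no strict "refs";           # covers the glob and the $class->{...} lookups
    *$datum = ($datum =~ /^[A-Z]/)
        ? sub {                 # read-only class accessor
            my $obclass = shift;
            my $class   = ref($obclass) || $obclass;
            return $class->{$datum};
        }
        : sub {                 # translucent accessor
            my $self  = shift;
            my $class = ref($self) || $self;
            unless (ref $self) {
                $class->{$datum} = shift if @_;
                return $class->{$datum};
            }
            $self->{$datum} = shift if @_;
            return defined $self->{$datum}
                ? $self->{$datum}
                : $class->{$datum};
        };
}

package main;

my $g = Gremlin->spawn;
print "population: ", Gremlin->PopCount, "\n";
print "mood: ", $g->mood, "\n";     # class default shows through
$g->mood("gleeful");
print "mood: ", $g->mood, "\n";     # the object's own value
```

Each pass through the loop gets a fresh $datum, so every generated accessor closes over its own attribute name.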
Translations of this closure-based approach into C++, Java, and Python
have been left as exercises for the reader. Be sure to send us mail as
soon as you're done.
=head1 Class Data as Lexical Variables
=head2 Privacy and Responsibility
Did you happen to notice how in the previous examples, we didn't prefix
the package variables used for class data with an underscore, nor did we
do so for the names of the hash keys used for instance data? You don't
need little markers on data names to suggest nominal privacy on attribute
variables or hash keys, because these are B<already> notionally private!
Outsiders have no business whatsoever playing with anything within a
class save through the mediated access of its documented interface; in
other words, through method invocations. And not even through just any
method, either. Methods that begin with an underscore are traditionally
considered off-limits outside the class. If outsiders skip the documented
method interface to poke around the internals of your class and end up
breaking something, that's not your fault--it's theirs.
Perl believes in individual responsibility rather than mandated control.
Perl respects you enough to let you choose your own preferred level of
pain, or of pleasure. Perl believes that you are creative, intelligent,
and capable of making your own decisions--and fully expects you to
take complete responsibility for your own actions. In a perfect world,
these admonitions alone would suffice, and everyone would be intelligent,
responsible, happy, and creative. And careful. One probably shouldn't
forget careful, and that's a good bit harder to expect. Even Einstein
would take wrong turns by accident and end up lost in the wrong part
of town.
Some folks get the heebie-jeebies when they see package variables
hanging out there for anyone to reach over and diddle them. Some folks
live in constant fear that someone somewhere might do something wicked.
The solution to that problem is simply to fire the wicked, of course.
But unfortunately, it's not as simple as all that. These cautious
types are also afraid that they or others will do something not so
much wicked as careless, whether by accident or out of desperation.
If we fire everyone who ever gets careless, pretty soon there won't be
anybody left to get any work done.
Whether it's needless paranoia or sensible caution, this uneasiness can
be a problem for some people. We can take the edge off their paranoia by
providing the option of storing class data as lexical variables instead
of as package variables. The my() operator is the source of all privacy
in Perl, and it is a powerful form of privacy indeed.
It is widely perceived, and indeed has often been written, that Perl
provides no data hiding, that it affords the class designer no privacy
nor isolation, merely a rag-tag assortment of weak and unenforcible
social conventions instead. This perception is demonstrably false and
easily disproven. In the next section, we show how to implement forms
of privacy that are far, far stronger than those provided in nearly any
other object-oriented language.
=head2 File-Scoped Lexicals
A lexical variable is visible only through the end of its static scope.
That means that the only code able to access that variable
is code residing textually below the my() operator through the end of its
block if it has one, or through the end of the current file if it doesn't.
Starting again with our simplest example given at the start of this
document, we replace our() variables with my() versions.
package Some_Class;
my($CData1, $CData2); # file scope, not in any package
sub CData1 {
shift; # XXX: ignore calling class/object
$CData1 = shift if @_;
return $CData1;
}
sub CData2 {
shift; # XXX: ignore calling class/object
$CData2 = shift if @_;
return $CData2;
}
So much for that old $Some_Class::CData1 package variable and its brethren!
Those are gone now, replaced with lexicals. No one outside the
scope can reach in and diddle the class state without resorting to the
documented interface. Not even subclasses or superclasses of
this one have unmediated access to $CData1. They have to invoke the &CData1
method against Some_Class or an instance thereof, just like anybody else.
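A small experiment may make this concrete. In the sketch below, the outer braces are an assumption of the example: they stand in for the separate file the class would normally live in. Assigning to the package-qualified name from outside creates an unrelated package variable and leaves the lexical alone.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{   # these braces play the part of Some_Class.pm
    package Some_Class;

    my $CData1 = "initial";     # lexical: never entered in the symbol table

    sub CData1 {
        shift;                  # ignore calling class/object
        $CData1 = shift if @_;
        return $CData1;
    }
}

package main;

# This touches a package variable that merely shares the name.
$Some_Class::CData1 = "tampered";

my $before = Some_Class->CData1;
Some_Class->CData1("updated");
my $after = Some_Class->CData1;

print "package variable: $Some_Class::CData1\n";
print "lexical, before:  $before\n";
print "lexical, after:   $after\n";
```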
To be rigorously honest, that last statement assumes you haven't gone
off and packed several classes together into the same file scope,
nor strewn your class implementation across several different files.
Accessibility of those variables is based uniquely on the static file
scope. It has nothing to do with the package. That means that any code
in a different file but the same package (class) could not access those
variables, yet code in the same file but a different package (class)
could. There are sound reasons why we usually suggest a one-to-one
mapping between files and packages and modules and classes. You don't
have to stick to this suggestion if you really know what you're doing,
but you're apt to confuse yourself otherwise, especially at first.
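Here is a minimal sketch of the second half of that caveat: two unrelated packages in one file sharing a single file-scoped lexical. The package names Alpha and Beta and the variable $shared are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $shared = 0;                 # file scope: visible to everything below

package Alpha;                  # hypothetical class
sub bump { return ++$shared }

package Beta;                   # a different class in the same file
sub peek { return $shared }

package main;

Alpha->bump for 1 .. 3;
my $seen = Beta->peek;
print "Beta sees $seen\n";
```

Beta never declared $shared and has no relationship to Alpha, yet it reads the value Alpha changed, because visibility follows the file and not the package.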
If you'd like to aggregate your class data into one lexically scoped,
composite structure, you're perfectly free to do so.
package Some_Class;
my %ClassData = (
CData1 => "",
CData2 => "",
);
sub CData1 {
shift; # XXX: ignore calling class/object
$ClassData{CData1} = shift if @_;
return $ClassData{CData1};
}
sub CData2 {
shift; # XXX: ignore calling class/object
$ClassData{CData2} = shift if @_;
return $ClassData{CData2};
}
To make this more scalable as other class data attributes are added,
we can again register closures into the package symbol table much as
before to create accessor methods for them.
package Some_Class;
my %ClassData = (
CData1 => "",
CData2 => "",
);
for my $datum (keys %ClassData) {
no strict "refs";
*$datum = sub {
shift; # XXX: ignore calling class/object
$ClassData{$datum} = shift if @_;
return $ClassData{$datum};
};
}
Requiring even your own class to use accessor methods like anyone else is
probably a good thing. But demanding and expecting that everyone else,
be they subclass or superclass, friend or foe, will all come to your
object through mediation is more than just a good idea. It's absolutely
critical to the model. Let there be in your mind no such thing as
"public" data, nor even "protected" data, which is a seductive but
ultimately destructive notion. Both will come back to bite at you.
That's because as soon as you take that first step out of the solid
model in which all state is considered completely private, save from the
perspective of its own accessor methods, you have violated the envelope.
And having pierced that encapsulating envelope, you shall doubtless
someday pay the price, when future changes in the implementation break
unrelated code. Considering that avoiding this infelicitous outcome was
precisely why you consented to suffer the slings and arrows of obsequious
abstraction by turning to object orientation in the first place, such
breakage seems unfortunate in the extreme.
=head2 More Inheritance Concerns
Suppose that Some_Class were used as a base class from which to derive
Another_Class. If you invoke a &CData method on the derived class or
on an object of that class, what do you get? Would the derived class
have its own state, or would it piggyback on its base class's versions
of the class attributes?
The answer is that under the scheme outlined above, the derived class
would B<not> have its own state data. As before, whether you consider
this a good thing or a bad one depends on the semantics of the classes
involved and on the class data itself.
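A quick sketch of that sharing, under the assumption that Another_Class inherits &CData1 without overriding it. The braces stand in for separate files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Some_Class;
    my $CData1 = "";
    sub CData1 {
        shift;                  # ignore calling class/object
        $CData1 = shift if @_;
        return $CData1;
    }
}
{
    package Another_Class;
    our @ISA = ("Some_Class");  # inherits CData1, defines nothing itself
}

package main;

Another_Class->CData1("set via subclass");
my $base_view = Some_Class->CData1;
print "Some_Class sees: $base_view\n";
```

The inherited method is the very same closure, so both classes read and write one lexical.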
The cleanest, sanest, simplest way to address per-class state in a
lexical world is for the derived class to override its base class's
version of the method that accesses the class data. Since the actual
method called is the one in the object's derived class if this exists,
you automatically get per-class state this way. Any urge to provide
an unadvertised method to sneak out a reference to the %ClassData hash
should be strenuously resisted.
As with any other overridden method, the implementation in the
derived class always has the option of invoking its base class's
version of the method in addition to its own. Here's an example:
package Another_Class;
@ISA = qw(Some_Class);
my %ClassData = (
CData1 => "",
);
sub CData1 {
my($self, $newvalue) = @_;
if (@_ > 1) {
# set locally first
$ClassData{CData1} = $newvalue;
# then pass the buck up to the first
# overridden version, if there is one
if ($self->can("SUPER::CData1")) {
$self->SUPER::CData1($newvalue);
}
}
return $ClassData{CData1};
}
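The following self-contained sketch puts base and derived class side by side so you can watch the two pieces of state. It is a simplification: it calls SUPER::CData1 directly instead of guarding the call with can(), on the assumption that the base class is known to define the method. The braces stand in for separate files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Some_Class;
    my %ClassData = ( CData1 => "" );
    sub CData1 {
        my ($self, $newvalue) = @_;
        $ClassData{CData1} = $newvalue if @_ > 1;
        return $ClassData{CData1};
    }
}
{
    package Another_Class;
    our @ISA = ("Some_Class");
    my %ClassData = ( CData1 => "" );   # a separate lexical, private to this class
    sub CData1 {
        my ($self, $newvalue) = @_;
        if (@_ > 1) {
            $ClassData{CData1} = $newvalue;     # set locally first
            $self->SUPER::CData1($newvalue);    # then pass the buck up
        }
        return $ClassData{CData1};
    }
}

package main;

Some_Class->CData1("base only");
my $derived_before = Another_Class->CData1;     # derived state is untouched

Another_Class->CData1("both");
my $base_after    = Some_Class->CData1;
my $derived_after = Another_Class->CData1;

print "derived before: '$derived_before'\n";
print "base after:     '$base_after'\n";
print "derived after:  '$derived_after'\n";
```

Setting the value through the base class leaves the derived class alone, while setting it through the derived class updates both, which is the behavior the override asks for.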
Those dabbling in multiple inheritance might be concerned
about there being more than one override.
for my $parent (@ISA) {
my $methname = $parent . "::CData1";
if ($self->can($methname)) {
$self->$methname($newvalue);
}
}
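A runnable variation on that loop, with invented parent classes Left and Right. Rather than building a qualified method name, this sketch asks each parent class for its code reference with can() and invokes that directly, which is a substitution made for the example and not the loop shown above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Left;               # hypothetical parent class
    my $value;
    sub CData1 { shift; $value = shift if @_; return $value }
}
{
    package Right;              # hypothetical parent class
    my $value;
    sub CData1 { shift; $value = shift if @_; return $value }
}
{
    package Both;
    our @ISA = ("Left", "Right");
    my $value;
    sub CData1 {
        my ($self, $newvalue) = @_;
        if (@_ > 1) {
            $value = $newvalue;                 # set locally first
            for my $parent (@ISA) {
                # ask each parent class whether it has the method
                if (my $code = $parent->can("CData1")) {
                    $self->$code($newvalue);
                }
            }
        }
        return $value;
    }
}

package main;

Both->CData1("everywhere");
my @views = map { $_->CData1() } qw(Left Right Both);
print "@views\n";
```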
=head2 Locking the Door and Throwing Away the Key
As currently implemented, any code within the same scope as the
file-scoped lexical %ClassData can diddle it should they be so inclined.
Is that ok? Is it acceptable or even desirable to allow other parts of
the implementation of this class to access class data directly?
That depends on how much of a protocol stickler you want
to be. Think back to the Cosmos class. If the &nova method had directly
diddled $Cosmos::Stars or C<$Cosmos::Cosmos{stars}>, then we wouldn't
have been able to reuse the class when it came to inventing a Multiverse.
So letting even the class itself access its own class data without the
mediating intervention of properly designed accessor methods is probably
not a good idea after all.
Restricting access to class data from the class itself is usually not
enforcible even in strongly object-oriented languages. Is this something
that in Perl B<can> be enforced? Why of course it is! This is Perl,
in which all things are possible, albeit perhaps not always expedient.
Here's one way:
package Some_Class;
{ # scope for hiding $CData1
my $CData1;
sub CData1 {
shift; # XXX: unused
$CData1 = shift if @_;
return $CData1;
}
}
{ # scope for hiding $CData2
my $CData2;
sub CData2 {
shift; # XXX: unused
$CData2 = shift if @_;
return $CData2;
}
}
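To convince yourself that the braces really do shut everyone else out, you might try something like the following sketch. It uses a string eval, an assumption of the example and not part of the class design, to trap the compile-time error that C<use strict> raises when code outside the block names the variable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Some_Class;

{   # only the accessor compiled inside these braces can see $CData1
    my $CData1 = "secret";
    sub CData1 {
        shift;                  # unused
        $CData1 = shift if @_;
        return $CData1;
    }
}

# Same file, same package, but outside the braces: the name is unknown.
# The string eval inherits "use strict", so the attempt fails to compile.
my $ok    = eval q{ my $peek = $CData1; 1 };
my $error = $@;

package main;

print "direct access ", ($ok ? "worked" : "failed"), "\n";
my $via_method = Some_Class->CData1;
print "accessor returns $via_method\n";
```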
No one--absolutely no one--is allowed to diddle the class data without
the mediation of the managing accessor method, since it and it alone
can see its respective lexical variable. This use of mediated access
to class data is far stronger privacy than most OO languages provide.
A black hole is more apt to bleed data than are lexical closures.
Perl's flexibility is such that you can make classes that are even more
private than you can in oft-touted systems with their tangled hierarchy
of private, partial, protected, public, persnickety, friend, foe, and
ferocious datum attributes.
The repetition of code used to create per-datum accessor methods chafes
at our Laziness, so we'll again use closures to create similar
methods.
package Some_Class;
{ # scope for ultra-private meta-object for class data
my %ClassData = (
CData1 => "",
CData2 => "",
);
for my $datum (keys %ClassData ) {
no strict "refs";
*$datum = sub {
use strict "refs";
my ($self, $newvalue) = @_;
$ClassData{$datum} = $newvalue if @_ > 1;
return $ClassData{$datum};
}
}
}
The closure above can be modified to take inheritance into account using
the &UNIVERSAL::can method and SUPER as shown previously.
=head2 Translucency Revisited
The Vermin class used to demonstrate translucency used an eponymously
named package variable, %Vermin, as its meta-object. If you prefer to
use absolutely no package variables beyond those necessary to appease
inheritance or possibly the Exporter, this strategy is closed to you.
That's too bad, because translucent attributes are an appealing
technique. It would be valuable to devise an implementation using
only lexicals.
There's a second reason why you might wish to avoid the eponymous
package hash. If you use class names with double-colons in them, you
would end up poking around somewhere you might not have meant to poke.
package Vermin;
$class = "Vermin";
$class->{PopCount}++;
# accesses $Vermin::Vermin{PopCount}
package Vermin::Noxious;
$class = "Vermin::Noxious";
$class->{PopCount}++;
# accesses $Vermin::Noxious{PopCount}
What's going on here is that in the first case, because the class name
had no double-colons, we got the hash in the current package.
But in the second case, we didn't get a hash in the current package--we got
one in an entirely different package. Perl doesn't support relative
packages in its naming conventions, so any double-colons trigger a
fully-qualified lookup instead of just looking in the current package.
In practice, it is unlikely that the Vermin class had an existing
package variable named %Noxious that you just blew away. If you're
still mistrustful, you could always stake out your own territory
where you know the rules, such as using Eponymous::Vermin::Noxious or
Hieronymus::Vermin::Boschious or Leave_Me_Alone::Vermin::Noxious as class
names instead. Sure, it's in theory possible that someone else has
a class named Eponymous::Vermin with its own %Noxious hash, but this
kind of thing is always true. There's no arbiter of package names.
It's always the case that globals like @Cwd::ISA would collide if more
than one class uses the same Cwd package.
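The lookup rules can be checked with a short script. In this sketch the &bump methods and the pre-existing %Noxious hash inside package Vermin are invented so there is something to clobber, and C<no strict "refs"> is enabled only around the symbolic lookups.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Vermin;
    our %Vermin  = ( PopCount => 0 );   # the eponymous meta-object
    our %Noxious = ( PopCount => 0 );   # unrelated hash that happens to live here
    sub bump {
        my $class = shift;
        no strict "refs";
        return ++$class->{PopCount};    # symbolic: the hash named by $class
    }
}
{
    package Vermin::Noxious;
    our %Noxious = ( PopCount => 0 );   # the meta-object we meant to use
    sub bump {
        my $class = shift;
        no strict "refs";
        return ++$class->{PopCount};
    }
}

package main;

"Vermin"->bump;
"Vermin::Noxious"->bump;

my %count = (
    'Vermin::Vermin'           => $Vermin::Vermin{PopCount},
    'Vermin::Noxious'          => $Vermin::Noxious{PopCount},
    'Vermin::Noxious::Noxious' => $Vermin::Noxious::Noxious{PopCount},
);
printf "%-26s %d\n", "%$_", $count{$_} for sort keys %count;
```

The second call bumps %Vermin::Noxious, the hash that belongs to package Vermin, and never touches the %Noxious hash declared inside Vermin::Noxious.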
If this still leaves you with an uncomfortable tinge of paranoia,
we have another solution for you. There's nothing that says that
you need a package variable just to use a classwise meta-object, either
for monadic classes or for
translucent attributes. You just code up the methods a bit differently
so that they access a lexical instead. It's not really even any harder
this way, either.
Here's another implementation of the Vermin class with semantics identical
to those given previously, but this time using no package variables.
package Vermin;
# here's the class meta-object, used
# to hold all class data, and also all instance data
# so the latter can be used for both initialization
# and translucency. it's a template.
my %ClassData = (
PopCount => 0, # capital for class data
color => "beige", # small for instance data
);
# constructor method
# invoked as class method or object method
sub spawn {
my $obclass = shift;
my $class = ref($obclass) || $obclass;
my $self = {};
bless($self, $class);
$ClassData{PopCount}++;
# init fields from invoking object, or omit if
# invoking object is the class to provide translucency
%$self = %$obclass if ref $obclass;
return $self;
}
# translucent accessor for "color" attribute
# invoked as class method or object method
sub color {
my $self = shift;
my $class = ref($self) || $self;
# handle class invocation
unless (ref $self) {
$ClassData{color} = shift if @_;
return $ClassData{color}
}
# handle object invocation
$self->{color} = shift if @_;
if (defined $self->{color}) { # not exists!
return $self->{color};
} else {
return $ClassData{color};
}
}
# class data accessor for "PopCount" attribute
# invoked as class method or object method
sub population {
return $ClassData{PopCount};
}
# instance destructor; invoked only as object method
sub DESTROY {
$ClassData{PopCount}--;
}
# detect whether an object attribute is translucent
# (typically?) invoked only as object method
sub is_translucent {
my($self, $attr) = @_;
$self = \%ClassData if !ref $self;
return !defined $self->{$attr};
}
# test for presence of attribute in class
# invoked as class method or object method
sub has_attribute {
my($self, $attr) = @_;
return exists $ClassData{$attr};
}
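To watch the lexical population count rise and fall, here is a cut-down sketch that keeps only the constructor, the population accessor, and the destructor. The variable names in the main program are invented for the example, and the braces stand in for the class's own file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Vermin;
    my %ClassData = ( PopCount => 0 );  # lexical meta-object

    sub spawn {
        my $obclass = shift;
        my $class   = ref($obclass) || $obclass;
        $ClassData{PopCount}++;
        return bless {}, $class;
    }
    sub population { return $ClassData{PopCount} }
    sub DESTROY    { $ClassData{PopCount}-- }
}

package main;

my @horde;
push @horde, Vermin->spawn for 1 .. 3;
my $peak = Vermin->population;

my $doomed = pop @horde;
undef $doomed;                  # last reference gone, so DESTROY runs now
my $after = Vermin->population;

print "peak $peak, now $after\n";
```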
=head1 NOTES
We use the hypothetical our() syntax for package variables. It works
like use vars, but looks like my(). It should be in this summer's
major release of perl (that would be 5.006) -- we hope.
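On a perl that does provide our() (5.6 or later), the two spellings can be compared side by side. This is only a sketch, and the variable names are borrowed from the earlier Some_Class examples.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Some_Class;

use vars qw($CData1);           # older spelling: predeclare a package variable
our $CData2;                    # our() spelling: looks like my(), same kind of variable

$CData1 = "one";                # both are usable unqualified under "use strict"
$CData2 = "two";

package main;

# Both are ordinary package variables, reachable by their qualified names.
my @values = ($Some_Class::CData1, $Some_Class::CData2);
print "@values\n";
```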
The usual mealy-mouthed package mungeing doubtless applies to setting
up names of instance data. For example, C<$self-E<gt>{ObData1}> should
probably be C<$self-E<gt>{ __PACKAGE__ . "_ObData1" }>, but that would
just confuse the examples.
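For the curious, here is roughly what that mungeing might look like. The class and method names are hypothetical. The point of the sketch is only that a base class and a derived class can each keep an ObData1 field in the same object without stepping on each other.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Base_Class;         # hypothetical classes for illustration
    sub new {
        my $class = shift;
        my $self  = bless {}, $class;
        $self->{ __PACKAGE__ . "_ObData1" } = "base value";
        return $self;
    }
    sub base_data { return $_[0]{ __PACKAGE__ . "_ObData1" } }
}
{
    package Derived_Class;
    our @ISA = ("Base_Class");
    sub new {
        my $class = shift;
        my $self  = $class->SUPER::new(@_);
        # same short name, but a different key thanks to the package prefix
        $self->{ __PACKAGE__ . "_ObData1" } = "derived value";
        return $self;
    }
    sub derived_data { return $_[0]{ __PACKAGE__ . "_ObData1" } }
}

package main;

my $ob   = Derived_Class->new;
my @keys = sort keys %$ob;
print "keys: @keys\n";
print "base:    ", $ob->base_data,    "\n";
print "derived: ", $ob->derived_data, "\n";
```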
=head1 SEE ALSO
L<perltoot>, L<perlobj>, L<perlmod>, and L<perlbot>.
The Tie::SecureHash module from CPAN is worth checking out.
=head1 AUTHOR AND COPYRIGHT
Copyright (c) 1999 Tom Christiansen
All rights reserved.
When included as part of the Standard Version of Perl, or as part of
its complete documentation whether printed or otherwise, this work
may be distributed only under the terms of Perl's Artistic License.
Any distribution of this file or derivatives thereof I<outside>
of that package requires that special arrangements be made with
the copyright holder.
Irrespective of its distribution, all code examples in this file
are hereby placed into the public domain. You are permitted and
encouraged to use this code in your own programs for fun
or for profit as you see fit. A simple comment in the code giving
credit would be courteous but is not required.
=head1 ACKNOWLEDGEMENTS
Russ Allbery, Jon Orwant, Larry Rosler, Nat Torkington, and Stephen
Warren all contributed suggestions and corrections to this piece.
Thanks especially to Damian Conway for his ideas and feedback, and
without whose indirect prodding I might never have taken the time
to show others how much Perl has to offer in the way of objects once
you start thinking outside the tiny little box that today's "popular"
object-oriented languages enforce.
--
"Writing is easy. All you do is stare at a blank sheet of paper until
drops of blood form on your forehead."
--Gene Fowler
------------------------------
Date: 12 Dec 98 21:33:47 GMT (Last modified)
From: Perl-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Special: Digest Administrivia (Last modified: 12 Dec 98)
Message-Id: <null>
Administrivia:
Well, after 6 months, here's the answer to the quiz: what do we do about
comp.lang.perl.moderated. Answer: nothing.
]From: Russ Allbery <rra@stanford.edu>
]Date: 21 Sep 1998 19:53:43 -0700
]Subject: comp.lang.perl.moderated available via e-mail
]
]It is possible to subscribe to comp.lang.perl.moderated as a mailing list.
]To do so, send mail to majordomo@eyrie.org with "subscribe clpm" in the
]body. Majordomo will then send you instructions on how to confirm your
]subscription. This is provided as a general service for those people who
]cannot receive the newsgroup for whatever reason or who just prefer to
]receive messages via e-mail.
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.misc (and this Digest), send your
article to perl-users@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
The Meta-FAQ, an article containing information about the FAQ, is
available by requesting "send perl-users meta-faq". The real FAQ, as it
appeared last in the newsgroup, can be retrieved with the request "send
perl-users FAQ". Due to their sizes, neither the Meta-FAQ nor the FAQ
are included in the digest.
The "mini-FAQ", which is an updated version of the Meta-FAQ, is
available by requesting "send perl-users mini-faq". It appears twice
weekly in the group, but is not distributed in the digest.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address; I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V8 Issue 5697
**************************************