[4602] in java-interest
Re: language level support for serialization
daemon@ATHENA.MIT.EDU (Jonathan Locke)
Tue Jan 9 00:11:24 1996
Date: Mon, 8 Jan 1996 19:05:04 -0800 (PST)
From: Jonathan Locke <jonl@sealevelsoftware.com>
To: Arthur van Hoff <Arthur.Vanhoff@Eng.Sun.COM>
Cc: java-interest@java.Eng.Sun.COM, jim.waldo@sun.com
In-Reply-To: <199601081826.KAA10568@jakarta.Eng.Sun.COM>
On Mon, 8 Jan 1996, Arthur van Hoff wrote:
> Hi Jonathan,
>
> > one thing i was really hoping java might do is provide built-in support for
> > serialization. C++ could never hope to do such a thing because of the whole
> > pointers-are-out-of-control issue... but JAVA could conceivably do it! a
> > native method written by Sun (not sure what class it belongs in) could
> > take an Object, a generic Stream interface and a direction (stream in or out)
> > and that method should be able to traverse the graph of objects that are
> > sub-objects of the Object you want to serialize (taking care to avoid
> > cycles) and it should be able to turn the pointers into a persistent
> > representation (zero-based file offsets perhaps) and back into pointers
> > again when streaming back in (the pointer *values* have to change,
> > obviously, but the *meaning* of the pointers can be reconstructed from
> > the information in the stream). one small issue: if someone messes with
> > the pointers in the file, the serialize method should be able to catch
> > on (or there might be some security issues) and say "File corrupted". i
> > realize this is a non-trivial problem, but the benefits could be tremendous
> > because all Java applications would be able to serialize *anything* by just
> > stuffing things into classes and saying "serialize yourself" to the object
> > of that class. imagine being able to take an arbitrarily complex object and
> > push it through a socket connection to another application! what power and
> > elegance! and it would only be possible in a safe language like Java. it
> > would be yet another point on the scoreboard of C++ versus Java. oh... and
> > let's not forget that Sun already has a good chunk of the code required to
> > implement the feature (the garbage collector's mark phase has to do exactly
> > the traversal mentioned above).
>
> This is certainly possible and we've had prototypes for a few years now. The
> problem is the security implications. What stops me from creating an array of
> bits and turning it into an arbitrary object? What about the versioning issues?
> We are working on solutions for all of these. If you are interested please
> contact Jim Waldo (jim.waldo@sun.com).
i don't feel the security implications are necessarily the end of this idea.
i haven't seriously studied java security issues, but here are a couple
reasons why i think serialization of objects can be made safe:
1) are there any java objects you can create which are insecure to begin with?
   if so, what's to prevent someone from creating such an insecure object
   with "new"? to admit that creating arbitrary objects is a security
   problem is to admit that java can only be secure if "new" is restricted in
   some way. so what if someone turns a bunch of bits into a bunch of
   objects? they shouldn't be able to create any truly dangerous objects to
   begin with. right?

2) if there were a dangerous object in Java called "MachineKiller" that you
   weren't allowed to create, then yes, someone could craft a bunch of bits
   that describes a MachineKiller object, but NO, the native java method that
   reads in (de-serializes) the stream data would NOT create such an object
   any more than the new operator would. unsafe objects (if there is such a
   thing) would have to be listed somewhere so that operator new couldn't make
   them. you would use that same list to ensure that de-serialization couldn't
   create MachineKiller objects. just think of de-serialization as a really
   fancy variant of operator new that rebuilds an object from raw data instead
   of running the usual construction code.
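
to make point 2 concrete, here is a minimal sketch of "check the list before
creating anything". it assumes the java.io object stream classes
(ObjectInputStream, ObjectOutputStream, Serializable) that shipped in JDK
releases after this message was written, not a new language feature. the
names AllowListDemo, AllowListInputStream, Point, toBytes and fromBytes are
made up for illustration, and the "list" here is an allow-list of class names
rather than a list of forbidden ones:

```java
import java.io.*;
import java.util.*;

public class AllowListDemo {
    // a harmless class we are willing to re-create from a stream
    static class Point implements Serializable {
        int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // stand-in for the hypothetical "dangerous" class from the text
    static class MachineKiller implements Serializable { }

    // an ObjectInputStream that refuses any class that is not on the list
    static class AllowListInputStream extends ObjectInputStream {
        private final Set<String> allowed;

        AllowListInputStream(InputStream in, Set<String> allowed) throws IOException {
            super(in);
            this.allowed = allowed;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            // called for each class descriptor in the stream,
            // before any instance of that class is created
            if (!allowed.contains(desc.getName())) {
                throw new InvalidClassException(desc.getName(), "class is not on the allow-list");
            }
            return super.resolveClass(desc);
        }
    }

    static byte[] toBytes(Object o) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(o);
        }
        return buf.toByteArray();
    }

    static Object fromBytes(byte[] data, Set<String> allowed)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new AllowListInputStream(new ByteArrayInputStream(data), allowed)) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Set<String> allowed = new HashSet<>(Arrays.asList(Point.class.getName()));

        Point p = (Point) fromBytes(toBytes(new Point(3, 4)), allowed);
        System.out.println("loaded point " + p.x + "," + p.y);

        try {
            fromBytes(toBytes(new MachineKiller()), allowed);
        } catch (InvalidClassException e) {
            System.out.println("rejected " + e.classname);
        }
    }
}
```

the check happens while the stream is being read, so a hand-crafted file
naming MachineKiller fails the same way a legitimately written one does.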
your last point, versioning, is much more important (and difficult).
what about versioning?
the same question could be asked (in part) about java as a whole.
or about object programming languages in general. how does one version a
class library? MFC just creates a whole new library each time...
MFC30.dll, MFC40.dll etc. in part this is due to limitations in C++, but
maintaining the differences between AWT1.0 and AWT2.0 may also require
this kind of solution... i don't know. if enough stuff has to change
for whatever reason... it becomes too hard to be backward compatible...
and something has to bend (or break). at what point do modifications make
X version 2 a wholly new class as opposed to a subclass of X version 1?
but i'm straying from the subject at hand...
it's key to recognize that the problems with versioning of serialized data
exist right now in java anyway (although nobody has made complicated
enough apps to really run into it yet...) and it would be *nice* to solve
the problem generically... but if no reasonable solution exists, it's *still*
VERY USEFUL to have language based serialization... as a programmer you
just have to think long and hard about the structure of the classes you
are going to serialize... if you have to change one of your serialized
data structures at some point... you are just going to have to change your
classes/methods to deal with the old data and the new data. the neat thing
is your code could actually be fairly pleasant (please try to read my
pseudocode as such and not real java code):
class MyData
{
    int x;       // version 1.0 data to be serialized
    String foo;
}

class MyData20 extends MyData
{
    int y;       // in version 2.0 we need this value too.
}

class Whatever
{
    MyData m;    // this may be either 1.0 or 2.0 data...

    void save(Stream s)
    {
        // serialize the MyData object into stream s
        s.exportObject(m);
    }

    void load(Stream s) throws ClassCastException, ImportObjectException
    {
        // deserialize an object and cast it to our data type.
        // if the file format is wrong, the cast or the importObject
        // method call will throw an exception
        m = (MyData)s.importObject();
        if (m instanceof MyData20)
        {
            // we loaded a 2.0 object (check the subclass first,
            // since a MyData20 is also a MyData)
        }
        else
        {
            // we loaded a 1.0 object
        }
    }
}
you get the gist of it...
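
for comparison, a runnable sketch of the same idea. it assumes the java.io
object streams from later JDK releases stand in for the hypothetical Stream
with its exportObject/importObject methods; VersionedLoadDemo, save, load and
describe are illustrative names only:

```java
import java.io.*;

public class VersionedLoadDemo {
    static class MyData implements Serializable {
        private static final long serialVersionUID = 1L;
        int x;          // version 1.0 data
        String foo;
    }

    static class MyData20 extends MyData {
        private static final long serialVersionUID = 1L;
        int y;          // added in version 2.0
    }

    // write either a 1.0 or a 2.0 object into a byte array
    static byte[] save(MyData m) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(m);
        }
        return buf.toByteArray();
    }

    // read it back; the cast fails with ClassCastException if the
    // stream holds something that is not a MyData at all
    static MyData load(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (MyData) in.readObject();
        }
    }

    // decide which version we got by looking at the runtime class
    static String describe(MyData m) {
        if (m instanceof MyData20) {
            return "2.0 data, x=" + m.x + " y=" + ((MyData20) m).y;
        }
        return "1.0 data, x=" + m.x;
    }

    public static void main(String[] args) throws Exception {
        MyData v1 = new MyData();
        v1.x = 1;
        v1.foo = "old";

        MyData20 v2 = new MyData20();
        v2.x = 2;
        v2.foo = "new";
        v2.y = 20;

        System.out.println(describe(load(save(v1))));
        System.out.println(describe(load(save(v2))));
    }
}
```

the reader never has to store or check a version stamp by hand here; the
class recorded in the stream carries that information.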
the very nature of java would make the whole thing really easy... in comparison
with C++, where you would be FORCED to write all the serialization and
version-stamping/version-checking/object-creation code manually... a real
class/object environment can have bigger payoffs than just eliminating
stray pointers and memory corruption (although that's enough to win most
people over!).
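
one more sketch, this time of the graph traversal itself: two objects that
point at each other go through a byte stream and come back with the cycle
intact. again this assumes the java.io object streams from later JDK
releases; CycleRoundTrip, Node and roundTrip are made-up names:

```java
import java.io.*;

public class CycleRoundTrip {
    static class Node implements Serializable {
        String name;
        Node next;      // may point back at an object already written
        Node(String name) { this.name = name; }
    }

    // serialize the graph reachable from root, then rebuild it from the bytes
    static Node roundTrip(Node root) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(root);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
            return (Node) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Node a = new Node("a");
        Node b = new Node("b");
        a.next = b;
        b.next = a;     // cycle: a -> b -> a

        Node copy = roundTrip(a);

        // the copy is a new object, but the cycle is preserved
        System.out.println(copy != a);                 // true
        System.out.println(copy.next.next == copy);    // true
        System.out.println(copy.next.name);            // b
    }
}
```

the references come back with different values but the same meaning, which
is the property the original proposal asked for.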
J
>
> Have fun,
>
> Arthur van Hoff
>
>