[32204] in Perl-Users-Digest
Perl-Users Digest, Issue: 3469 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Aug 9 06:14:29 2011
Date: Tue, 9 Aug 2011 03:14:08 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Tue, 9 Aug 2011 Volume: 11 Number: 3469
Today's topics:
Which data serialization format? <tn@movb.de>
Re: Which data serialization format? <bjoern@hoehrmann.de>
Re: Which data serialization format? <tn@movb.de>
Re: Which data serialization format? <bjoern@hoehrmann.de>
Re: Which data serialization format? <tzz@lifelogs.com>
Re: Which data serialization format? <peter@makholm.net>
Re: Which data serialization format? <tn@movb.de>
Re: Which data serialization format? <tn@movb.de>
Re: XML::Simple drives me mad <dvaldenaire@gmail.com>
Re: XML::Simple drives me mad <tadmc@seesig.invalid>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Mon, 8 Aug 2011 17:25:05 +0200
From: Tobias Nissen <tn@movb.de>
Subject: Which data serialization format?
Message-Id: <20110808172505.0dfc7e0b@hal.movb.de>
Hi,
I am unsure which data serialization format to use. I need a format that
also supports serialization of binary data.
A complete RPC-implementation, as in the case of Thrift, would be a
nice add-on. On the other hand I could do it myself and would have some
more control over what's happening under the hood.
Since it's pretty standard nowadays, I'd really like to use JSON, but
sending binary data over it could become quite awkward.
What I am looking for is a stable, simple, proven and tested system,
whose Perl(!) implementation is in widespread use.
That's basically my question for this post. If you'd like to read on,
here are my findings so far...
I have made some tests with the official Perl implementation of Thrift⁰
and have not yet discovered any major issues that would prevent me from
using it. However, it lacks proper documentation and I am not entirely
convinced that the user-mailing list would be especially helpful, once
I dig deeper into it.
Protocol Buffers OTOH has robust implementations in some languages, but
apparently Perl's not one of them. There's Google::ProtocolBuffers¹
which hasn't had a release since 2008 and is stuck at version 0.08. Then
there's protobuf-perl² -- which is dead. Last but not least there's
protobuf-perlxs³ which received two bug reports this year and had its
last commit in August last year.
BSON could be an alternative, but its Perl implementation is, well,
minimal⁴.
BTW, I want a format with a large Perl user base in particular.
Best regards and TIA,
Tobias
_________
⁰ http://thrift.apache.org/
¹ http://search.cpan.org/~gariev/Google-ProtocolBuffers-0.08/lib/Google/ProtocolBuffers.pm
² http://code.google.com/p/protobuf-perl/
³ http://code.google.com/p/protobuf-perlxs/
⁴ http://search.cpan.org/~minimal/BSON-0.03/lib/BSON.pm
------------------------------
Date: Mon, 08 Aug 2011 18:11:18 +0200
From: Bjoern Hoehrmann <bjoern@hoehrmann.de>
Subject: Re: Which data serialization format?
Message-Id: <vd1047h20nqibqln966j83upalsu0hbp9m@hive.bjoern.hoehrmann.de>
* Tobias Nissen wrote in comp.lang.perl.misc:
>I am unsure which data serialization format to use. I need a format that
>also supports serialization of binary data.
>
>A complete RPC-implementation, as in the case of Thrift, would be a
>nice add-on. On the other hand I could do it myself and would have some
>more control over what's happening under the hood.
>
>Since it's pretty standard nowadays, I'd really like to use JSON, but
>sending binary data over it could become quite awkward.
>
>What I am looking for is a stable, simple, proven and tested system,
>whose Perl(!) implementation is in widespread use.
That would seem to be Storable.pm, which is a Core module since v5.7.3;
it supports pretty much everything you can reasonably represent in Perl
including, say, circular references, which formats like JSON are unable
to support without adding awkward indirection. There are a couple of RPC
related modules on CPAN that use Storable for marshalling. Main problem
would be that there may be version incompatibilities.
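A minimal sketch of the circular-reference point, assuming only the core Storable module. The structure and its field names are made up for illustration:

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# A small structure that refers back to itself (hypothetical example data).
my $node = { name => 'root' };
$node->{self} = $node;

# freeze() serializes to a byte string; thaw() rebuilds the structure.
my $copy = thaw(freeze($node));

# The cycle survives the round trip: the copy points at itself,
# not at the original.
print "cycle preserved\n"  if $copy->{self} == $copy;
print "independent copy\n" if $copy != $node;
```

When the frozen data has to cross machines, Storable's nfreeze() (network byte order) is the variant usually suggested in place of freeze(). It does not remove the version-compatibility concern mentioned above.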
--
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
------------------------------
Date: Mon, 8 Aug 2011 18:52:19 +0200
From: Tobias Nissen <tn@movb.de>
Subject: Re: Which data serialization format?
Message-Id: <20110808185219.48ab1b14@hal.movb.de>
Bjoern Hoehrmann wrote:
> * Tobias Nissen wrote in comp.lang.perl.misc:
>> I am unsure which data serialization format to use. I need a format
>> that also supports serialization of binary data.
>>
>> A complete RPC-implementation, as in the case of Thrift, would be a
>> nice add-on. On the other hand I could do it myself and would have
>> some more control over what's happening under the hood.
>>
>> Since it's pretty standard nowadays, I'd really like to use JSON, but
>> sending binary data over it could become quite awkward.
>>
>> What I am looking for is a stable, simple, proven and tested system,
>> whose Perl(!) implementation is in widespread use.
>
> That would seem to be Storable.pm, which is a Core module since
> v5.7.3; it supports pretty much everything you can reasonably
> represent in Perl including, say, circular references, which formats
> like JSON are unable to support without adding awkward indirection.
Sorry, I forgot to mention it, but I like the idea of the format to be
programming language independent. Thrift e.g. seems to officially
support 14 different languages, which was one of the reasons why I
picked it for my experiments.
I also dislike the absence of basic types (like String, Int, Bool, ...)
when using Storable. Or the fact that I'd have to write a type
checker myself. Also there are no schemas and hence no automatic code
generation. It's all too dynamic for my use case.
(No, XML (in whatever form) is not an option :-) )
> There are a couple of RPC related modules on CPAN that use Storable
> for marshalling. Main problem would be that there may be version
> incompatibilities.
That, too, is something that both Thrift and Protocol Buffers try to
address.
------------------------------
Date: Mon, 08 Aug 2011 20:37:54 +0200
From: Bjoern Hoehrmann <bjoern@hoehrmann.de>
Subject: Re: Which data serialization format?
Message-Id: <t760479hp12spbdd7841okn0i7srt6vsr6@hive.bjoern.hoehrmann.de>
* Tobias Nissen wrote in comp.lang.perl.misc:
>Sorry, I forgot to mention it, but I like the idea of the format to be
>programming language independent. Thrift e.g. seems to officially
>support 14 different languages, which was one of the reasons why I
>picked it for my experiments.
>
>I also dislike the absence of basic types (like String, Int, Bool, ...)
>when using Storable. Or the fact that I'd have to write a type
>checker myself. Also there are no schemas and hence no automatic code
>generation. It's all too dynamic for my use case.
The types are there just as they exist in Perl, and you can use any tool
that is compatible with the type system for things like validation. It's
just bits on the disk, but with all serialization formats you get things
like "dictionary with key string example and value number 1" in memory;
JSON::Schema for instance does not require you to pass in actual JSON.
The popular formats with good tool support in Perl are YAML and JSON and
for both you'd have to use an encoding like Base64 to reliably serialize
binary data (YAML though allows you to tag values as Base64-encoded, so
in theory the support for binary data is better there, but tool support
for that is a bit lacking).
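The Base64 approach for JSON can be sketched as follows, assuming the core modules JSON::PP and MIME::Base64. The field names and the payload are hypothetical:

```perl
use strict;
use warnings;
use JSON::PP;
use MIME::Base64 qw(encode_base64 decode_base64);

# Arbitrary binary payload (hypothetical example data).
my $blob = pack('C*', 0, 255, 128, 10, 13);

# Encode: Base64 turns the bytes into plain ASCII that JSON can carry.
# The second argument '' suppresses the line breaks encode_base64 adds.
my $json = JSON::PP->new->canonical->encode({
    filename => 'example.bin',               # hypothetical field names
    data     => encode_base64($blob, ''),
});

# Decode: parse the JSON, then undo the Base64 step.
my $msg      = JSON::PP->new->decode($json);
my $restored = decode_base64($msg->{data});

print $restored eq $blob ? "round trip ok\n" : "mismatch\n";
```

Both ends have to agree out of band on which fields are Base64-encoded, since nothing in the JSON text itself marks them as binary.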
Note that Perl itself does not distinguish between text and binary, you
will have to include logic for that in the code regardless of the format
you pick, if you actually need to tell those cases apart (if you do not
care, note that U+0000 through U+00FF are perfectly valid characters and
can be represented easily in either format; in that sense both support
binary data quite well).
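A small sketch of that last remark, assuming the core JSON::PP module: a string holding every value from 0 to 255 is treated as 256 characters and survives a JSON round trip without any Base64 step.

```perl
use strict;
use warnings;
use JSON::PP;

# Every byte value 0..255, held in an ordinary Perl string.
my $bytes = join '', map { chr($_) } 0 .. 255;

# ->ascii escapes all code points above U+007F as \uXXXX (control
# characters are always escaped), so the JSON text itself is 7-bit ASCII.
my $codec = JSON::PP->new->ascii;
my $json  = $codec->encode([$bytes]);

# Decoding yields a string with the same 256 code points.
my $back = $codec->decode($json)->[0];

print length($back), " characters, ",
      ($back eq $bytes ? "identical" : "different"), "\n";
```

The receiving side gets characters, not bytes. A consumer in another language has to map the code points U+0000 through U+00FF back to bytes itself, which is the logic the paragraph above says you must supply.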
--
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
------------------------------
Date: Mon, 08 Aug 2011 13:50:41 -0500
From: Ted Zlatanov <tzz@lifelogs.com>
Subject: Re: Which data serialization format?
Message-Id: <87aabj20vy.fsf@lifelogs.com>
On Mon, 8 Aug 2011 17:25:05 +0200 Tobias Nissen <tn@movb.de> wrote:
TN> I have made some tests with the official Perl implementation of Thrift⁰
TN> and have not yet discovered any major issues that would prevent me from
TN> using it. However, it lacks proper documentation and I am not entirely
TN> convinced that the user-mailing list would be especially helpful, once
TN> I dig deeper into it.
I suffered quite a bit using Thrift from Perl for the
Net::Cassandra::Easy module. It was frustrating. I think Chip
Salzenberg had a similar experience and ended up routing around the
Thrift Perl modules, using the C or C++ Thrift interfaces, last time I
talked to him. Thrift in Perl is slow, slow, slow.
The Thrift developers asked me to rewrite their Perl support to make it
more efficient. I don't have the cycles to do it, but if anyone wants
to do it... feel free.
There's also Avro, which at one time was a contender to replace Thrift
for Cassandra. It didn't, but it's a pretty nice protocol, similar to
Thrift and probably better for Perl support.
Today, I would either use JSON-encoded binary data or I would find a
way to use Google Protocol Buffers through a C/C++ embedded library,
depending on the complexity of the work.
Ted
------------------------------
Date: Mon, 08 Aug 2011 21:55:45 +0200
From: Peter Makholm <peter@makholm.net>
Subject: Re: Which data serialization format?
Message-Id: <87bovzsmny.fsf@vps1.hacking.dk>
Bjoern Hoehrmann <bjoern@hoehrmann.de> writes:
> That would seem to be Storable.pm, which is a Core module since v5.7.3;
> it supports pretty much everything you can reasonably represent in Perl
> including, say, circular references, which formats like JSON are unable
> to support without adding awkward indirection.
On the other hand, supporting pretty much everything has a cost even
though you don't use cyclic structures. For one benchmark[0] I measured
JSON::XS to be about 4 times faster going from perl to serialized format
and 25% faster the other way too.
The JSON::XS output was even smaller than the Storable output...
0) https://github.com/pmakholm/benchmark-serialize-perl/blob/master/README
But in the end it all depends on your needs. If you need support for
non-treeish data JSON is a no-go. If you need direct support for blessed
references JSON needs to be wrapped (I have never benchmarked this). If
you need an open door for non-Perl languages, Storable is a no-go.
If you think it is worth discussing (that is, in your case it will not be
a microoptimization), then come up with some example structures and feed
them to Benchmark::Serialize. This gives you numbers - and then we can
discuss the relevance of these numbers afterwards.
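As a rough stand-in for that workflow, here is a sketch that uses only the core Benchmark, Storable and JSON::PP modules rather than Benchmark::Serialize. The sample structure is hypothetical, and JSON::PP is the pure-Perl implementation, so its timings say nothing about the JSON::XS figures quoted above:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use Storable qw(nfreeze thaw);
use JSON::PP;

# Hypothetical sample structure; replace with data shaped like your own.
my $data = {
    id    => 42,
    tags  => [qw(alpha beta gamma)],
    items => [ map { +{ n => $_, label => "item $_" } } 1 .. 50 ],
};

my $json = JSON::PP->new->canonical;

# Pre-serialize once so the decode timings measure decoding only.
my $frozen    = nfreeze($data);
my $json_text = $json->encode($data);

printf "Storable: %d bytes, JSON: %d bytes\n",
    length($frozen), length($json_text);

# A negative count means "run each sub for at least that many CPU seconds".
cmpthese(-1, {
    storable_encode => sub { nfreeze($data) },
    storable_decode => sub { thaw($frozen) },
    json_encode     => sub { $json->encode($data) },
    json_decode     => sub { $json->decode($json_text) },
});
```

The numbers are only meaningful for structures that resemble the real workload, which is why the sample data should be swapped for your own before drawing conclusions.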
> There are a couple of RPC related modules on CPAN that use Storable
> for marshalling. Main problem would be that there may be version
> incompatibilities.
Yeah, I have some bad experiences: version incompatibilities, 32-bit/64-bit
incompatibilities, and "three years later we want to interface the system
from this Lua-scriptable C++ project" incompatibilities.
But hey, used right it is at least endian agnostic. Not that I remember
the last time I deployed my in-house developed project on anything not
from the x86 family of endianness...
Even if I needed the extra features I would not easily go for Storable.
//Makholm
------------------------------
Date: Tue, 9 Aug 2011 09:24:21 +0200
From: Tobias Nissen <tn@movb.de>
Subject: Re: Which data serialization format?
Message-Id: <20110809092421.5d28178c@hal.movb.de>
Ted Zlatanov wrote:
[...]
> Today, I would either use JSON-encoded binary data or I would find a
> way to use Google Protocol Buffers through a C/C++ embedded library,
> depending on the complexity of the work.
Since the whole thing is going to do RPC I want something that does not
process messages not conforming to some kind of schema. JSON::Schema¹
does not seem to be in a "production grade" state.
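To make "does not process messages not conforming to a schema" concrete, here is a hand-rolled sketch. It is not JSON::Schema and not any RPC framework's API: the schema layout, the field names and the validate() helper are all invented for illustration, and only the core JSON::PP module is assumed.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical schema: field name => code ref that accepts or rejects a value.
my %schema = (
    method => sub { defined $_[0] && !ref $_[0] && length $_[0] },
    id     => sub { defined $_[0] && !ref $_[0] && $_[0] =~ /\A[0-9]+\z/ },
    params => sub { ref $_[0] eq 'ARRAY' },
);

# Returns a list of problems; an empty list means the message conforms.
sub validate {
    my ($msg) = @_;
    return ('message is not an object') unless ref $msg eq 'HASH';

    my @errors;
    for my $field (sort keys %schema) {
        if (!exists $msg->{$field}) {
            push @errors, "missing field '$field'";
        }
        elsif (!$schema{$field}->($msg->{$field})) {
            push @errors, "invalid value for '$field'";
        }
    }
    # Reject fields the schema does not know about.
    for my $field (sort keys %$msg) {
        push @errors, "unexpected field '$field'"
            unless exists $schema{$field};
    }
    return @errors;
}

my $good = decode_json('{"method":"ping","id":1,"params":[]}');
my $bad  = decode_json('{"method":"ping","id":"x","extra":true}');

print "good: ", (validate($good) ? "rejected" : "accepted"), "\n";
print "bad:  ", join('; ', validate($bad)), "\n";
```

A server would call validate() on every decoded request and refuse to dispatch when the list is non-empty. Generated code from an IDL gives the same guarantee without maintaining the checks by hand.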
I'm not quite sure whether protoxs² validates messages/requests/calls
(whatever you want to call it) upon receipt. But at least the generated
code forces programmers on the client side and on the server side to a
common set of fields.
I want to use Moose throughout the code. Is there some RPC framework
that is more suited for use with Moose (and protobuf) than others?
There are so many to pick from!
¹ http://search.cpan.org/dist/JSON-Schema/
² http://code.google.com/p/protobuf-perlxs/
------------------------------
Date: Tue, 9 Aug 2011 09:29:53 +0200
From: Tobias Nissen <tn@movb.de>
Subject: Re: Which data serialization format?
Message-Id: <20110809092953.24a9c3a7@hal.movb.de>
Tobias Nissen wrote:
[...]
> I'm not quite sure whether protoxs² validates messages/requests/calls
> (whatever you want to call it) upon receipt.
Ah please ignore that, protobuf doesn't have an RPC-implementation, I
confused it with Thrift.
------------------------------
Date: Mon, 8 Aug 2011 00:05:35 -0700 (PDT)
From: Denis Valdenaire <dvaldenaire@gmail.com>
Subject: Re: XML::Simple drives me mad
Message-Id: <02cd9bfb-b364-42de-9fee-484798baa375@a31g2000vbt.googlegroups.com>
Hi there,
Thanks for your help!
> > my $result = $config->{sync_method}->{sync_modules}->{ena};
> > for my $row (@$result) {
> > print $row->{name}, "\n";
> > }
That solution worked. I finally understand why a direct usage of
$config->{... etc. }, like:
    for my $row (@$config->{sync_method}->{sync_modules}->{ena}) {
        print $row->{name}, "\n";
    }
did not work (Not an ARRAY reference at ./test.pl line 35.), while
    for my $row (@{$config->{sync_method}->{sync_modules}->{ena}}) {
        print $row->{name}, "\n";
    }
also works. I need to read the documentation about the usage of {}...
The last solution with $_ was also elegant.
Thanks a lot again to all!
Denis
------------------------------
Date: Mon, 08 Aug 2011 08:01:41 -0500
From: Tad McClellan <tadmc@seesig.invalid>
Subject: Re: XML::Simple drives me mad
Message-Id: <slrnj3vn35.m8g.tadmc@tadbox.sbcglobal.net>
Denis Valdenaire <dvaldenaire@gmail.com> wrote:
> for my $row (@{$config->{sync_method}->{sync_modules}->{ena}}) {
> i need to read documentation about the usage of {}...
It is an application of "Use Rule 1" from:
perldoc perlreftut
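A short sketch of the rule in action. The $config structure below is a hypothetical stand-in for what XML::Simple returned in the original question:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the structure XML::Simple returned.
my $config = {
    sync_method => {
        sync_modules => {
            ena => [ { name => 'first' }, { name => 'second' } ],
        },
    },
};

# Use Rule 1: wrap the expression that yields the array reference in
# braces, then put @ in front, just as you would with an array name.
my @names = map { $_->{name} }
            @{ $config->{sync_method}->{sync_modules}->{ena} };
print "$_\n" for @names;

# Without braces, @$config binds first: Perl tries to treat $config
# itself as an array reference. String eval is used so that the error is
# caught whether this Perl reports it at compile time or at run time.
my $ok = eval q{
    my @rows = @$config->{sync_method}->{sync_modules}->{ena};
    1;
};
print "unbraced form failed\n" unless $ok;
```

The exact error text for the unbraced form depends on the Perl version, so the sketch only checks that it fails.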
--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
The above message is a Usenet post.
I don't recall having given anyone permission to use it on a Web site.
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 3469
***************************************