[32115] in Perl-Users-Digest
Perl-Users Digest, Issue: 3380 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed May 11 14:09:28 2011
Date: Wed, 11 May 2011 11:09:10 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Wed, 11 May 2011 Volume: 11 Number: 3380
Today's topics:
alternatives for external data storage and manipulation <cartercc@gmail.com>
Re: alternatives for external data storage and manipula sln@netherlands.com
Re: alternatives for external data storage and manipula <cartercc@gmail.com>
Re: alternatives for external data storage and manipula <john@castleamber.com>
Re: alternatives for external data storage and manipula <cartercc@gmail.com>
Re: alternatives for external data storage and manipula <john@castleamber.com>
Re: Bug in assigning a V-string literal to substr() in <jl_post@hotmail.com>
Re: Bug in assigning a V-string literal to substr() in <uri@StemSystems.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Wed, 11 May 2011 08:04:58 -0700 (PDT)
From: ccc31807 <cartercc@gmail.com>
Subject: alternatives for external data storage and manipulation
Message-Id: <c0df012a-8f3a-4e89-a72a-6026b7a4f629@r27g2000prr.googlegroups.com>
The vast majority of my work (95%+) involves writing data to and
reading data from various kinds of databases (including Excel as a
'database'). For this I use CSV files.
The other 5% more or less involves keeping data handy for my programs
to read and write, for which I've been using Storable. It's never
failed, and is quick and easy.
I'm beginning to see requirements for a form of non-DB storage that
others can hand-edit for particular reasons. In the past, I've
handrolled a kind of INI format which has worked perfectly, but this
isn't really satisfactory for more complicated data structures that
will contain much more data.
My requirements are (first) to write data in a format that will easily
convert to and from multi-level Perl data structures, and (second)
that will:
- be readable and editable by humans,
- that will contain up to hundreds of records of fair complexity
- that can communicate natively with Perl
- that uses formatting that users cannot easily corrupt
I've been looking at some alternatives, but haven't reached any
conclusions. They are
- XML
- JSON
- INI
- YAML
- CSV (just for giggles: actually CSV works perfectly with Excel,
except that people tend to save Excel files in different ways and I
can't rely on opening files with CSV format after a user plays with it
in Excel)
Before I spend any more time on this, I thought I'd ask for comments
here.
TIA, CC.
P.S. For those of you who like to see code, I've used a few scripts
that use XML, but XML is hard on the eyes and I'd really like to avoid
XML's verbosity if I could. Is JSON or YAML the answer?
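For the JSON option raised above, a round trip between a multi-level Perl structure and hand-editable text might look like the sketch below. It assumes the JSON::PP module (core since Perl 5.14, on CPAN for older perls); the record names and fields (backup01, ip, paths) are invented for illustration and do not come from the thread.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical config: a hash of records, each with a scalar field and a
# multi-value field.
my $config = {
    backup01 => { ip => '192.0.2.10', paths => [ '/var/log', '/etc' ] },
    backup02 => { ip => '192.0.2.11', paths => [ '/home' ] },
};

# pretty() indents the output for human readers; canonical() sorts hash keys
# so the file layout stays stable from one save to the next.
my $json = JSON::PP->new->pretty->canonical;

# Perl structure -> JSON text that a user could open in a plain text editor.
my $text = $json->encode($config);
print $text;

# JSON text -> Perl structure. decode() dies on malformed input, so wrap it
# in eval and report a corrupted file instead of crashing.
my $copy = eval { $json->decode($text) };
die "config is not valid JSON: $@" unless defined $copy;

print "backup01 ip: $copy->{backup01}{ip}\n";
print "backup02 first path: $copy->{backup02}{paths}[0]\n";
```

The canonical setting keeps key order fixed between saves, so a hand edit shows up as a small diff rather than a reshuffled file.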
------------------------------
Date: Wed, 11 May 2011 08:37:15 -0700
From: sln@netherlands.com
Subject: Re: alternatives for external data storage and manipulation
Message-Id: <0mals6h77gs66j17do5p9htloppjbghcvk@4ax.com>
On Wed, 11 May 2011 08:04:58 -0700 (PDT), ccc31807 <cartercc@gmail.com> wrote:
[...]
>I'm beginning to see requirements for a form of non-DB storage that
>others can hand-edit for particular reasons. In the past, I've
>handrolled a kind of INI format which has worked perfectly, but this
>isn't really satisfactory for more complicated data structures that
>will contain much more data.
>
>My requirements are (first) to write data in a format that will easily
>convert to and from multi-level Perl data structures, and (second)
>that will:
> - be readable and editable by humans,
> - that will contain up to hundreds of records of fair complexity
> - that can communicate natively with Perl
> - that uses formatting that users cannot easily corrupt
>
>I've been looking at some alternatives, but haven't reached any
>conclusions. They are
> - XML
> - JSON
> - INI
> - YAML
> - CSV (just for giggles: actually CSV works perfectly with Excel,
>except that people tend to save Excel files in different ways and I
>can't rely on opening files with CSV format after a user plays with it
>in Excel)
>
I think that people, no matter how careful they are, have never
been trusted to alter data records outside a strictly controlled
validation bubble. I don't think that will ever change.
YAML, no matter how readable, is actually quite complex to parse
given the range of nesting it can do. To me, it's meant as a transport
medium for station-to-station encoding/decoding processors, not
really intermediate hand editing.
-sln
------------------------------
Date: Wed, 11 May 2011 08:48:27 -0700 (PDT)
From: ccc31807 <cartercc@gmail.com>
Subject: Re: alternatives for external data storage and manipulation
Message-Id: <36c403ca-bb87-4a41-919d-d68f0f629ba3@x38g2000pri.googlegroups.com>
On May 11, 11:37 am, s...@netherlands.com wrote:
> I think that people, no matter how careful they are, have never
> been trusted to alter data records outside a strictly controlled
> validation bubble. I don't think that will ever change.
People cannot be trusted to alter data records INSIDE a strictly
controlled validation bubble. Bad data isn't a problem that any kind
of programming can solve. People can mistype an IP address and it will
still LOOK like a valid IP address.
> YAML, no matter how readable, is actually quite complex to parse
> given the range of nesting it can do. To me, it's meant as a transport
> medium for station-to-station encoding/decoding processors, not
> really intermediate hand editing.
What I need is for users to be able to alter simple things like the IP
address of a machine to connect to, passwords, file names, and
potentially a directory path. This is driven by their unwillingness to
contact me to change these parameters and my unwillingness to be
contacted to change these parameters. For simple scripts, the INI file
format works well. I'm looking for something that will handle a file
with several dozen records and maybe a dozen fields per record,
including multi-value and/or multi-level fields.
Thanks, CC.
------------------------------
Date: Wed, 11 May 2011 12:07:12 -0500
From: John Bokma <john@castleamber.com>
Subject: Re: alternatives for external data storage and manipulation
Message-Id: <87liyd5fpb.fsf@castleamber.com>
ccc31807 <cartercc@gmail.com> writes:
> The vast majority of my work (95%+) involves writing data to and
> reading data from various kinds of databases (including Excel as a
> 'database'). For this I use CSV files.
>
> The other 5% more or less involves keeping data handy for my programs
> to read and write, for which I've been using Storable. It's never
> failed, and is quick and easy.
>
> I'm beginning to see requirements for a form of non-DB storage that
> others can hand-edit for particular reasons. In the past, I've
> handrolled a kind of INI format which has worked perfectly, but this
> isn't really satisfactory for more complicated data structures that
> will contain much more data.
>
> My requirements are (first) to write data in a format that will easily
> convert to and from multi-level Perl data structures, and (second)
> that will:
> - be readable and editable by humans,
> - that will contain up to hundreds of records of fair complexity
> - that can communicate natively with Perl
> - that uses formatting that users cannot easily corrupt
I think 1) and 4) will bite each other very often. I think that 4 should
be (or maybe 5): if the data gets corrupted this must be easy to detect.
Also, who is going to edit those files, what are their skills?
XML might be a good choice since you can validate it, and there are
editors dedicated to editing XML.
[..]
> P.S. For those of you who like to see code, I've used a few scripts
> that use XML, but XML is hard on the eyes and I'd really like to avoid
> XML's verbosity if I could. Is JSON or YAML the answer?
Depends. If you are the main user and don't need the overhead of XML;
use JSON or YAML.
If XML is a problem, switching to a good XML editor or an editor that
can handle XML more sanely might be the improvement you're looking for.
--
John Bokma j3b
Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma
Freelance Perl & Python Development: http://castleamber.com/
------------------------------
Date: Wed, 11 May 2011 10:27:32 -0700 (PDT)
From: ccc31807 <cartercc@gmail.com>
Subject: Re: alternatives for external data storage and manipulation
Message-Id: <d9eb799c-b0b2-4a0a-a48f-37daa18f9b3c@w36g2000vbi.googlegroups.com>
On May 11, 1:07 pm, John Bokma <j...@castleamber.com> wrote:
> I think 1) and 4) will bite each other very often. I think that 4 should
> be (or maybe 5): if the data gets corrupted this must be easy to detect.
If the data gets corrupted, the app doesn't function. I can't control
the validity of the data; the only thing I can control is the
format of the file. IOW, if given a valid file name, the program
opens, reads, and closes the file, otherwise it reports an error and
the user fixes it, but if the file format is corrupt, I have to fix
it.
> Also, who is going to edit those files, what are their skills?
Computer literate non-programmers, think sys admins. They know what you
mean when you tell them to save a text file in Notepad as CSV or JSON
or XML using the All Files type.
> XML might be a good choice since you can validate it, and there are
> editors dedicated to editing XML.
If I have to validate the file format, I've already lost. I need to be
able to give the user a file that he/she can open, read, change, and
close in a simple text editor, like Notepad. Yeah, I could write a
database utility but I shouldn't need to for IT staff to make simple
changes to what will be (in essence) configuration files.
> Depends. If you are the main user and don't need the overhead of XML;
> use JSON or YAML.
>
> If XML is a problem, switching to a good XML editor or an editor that
> can handle XML more sanely might be the improvement you're looking for.
It's too easy to damage an XML file, the format is fragile and you
have to precisely match opening and closing tags with proper nesting
and all that. This is why the simple
INI 'key=value' and CSV 'key,col1,col2,col3,col4,col5' formats work so
well.
CC.
------------------------------
Date: Wed, 11 May 2011 12:52:36 -0500
From: John Bokma <john@castleamber.com>
Subject: Re: alternatives for external data storage and manipulation
Message-Id: <87mxit6s63.fsf@castleamber.com>
ccc31807 <cartercc@gmail.com> writes:
> On May 11, 1:07 pm, John Bokma <j...@castleamber.com> wrote:
>> I think 1) and 4) will bite each other very often. I think that 4 should
>> be (or maybe 5): if the data gets corrupted this must be easy to detect.
>
> If the data gets corrupted, the app doesn't function. I can't control
> the validity of the data; the only thing I can control is the
> format of the file. IOW, if given a valid file name, the program
> opens, reads, and closes the file, otherwise it reports an error and
> the user fixes it, but if the file format is corrupt, I have to fix
> it.
>
>> Also, who is going to edit those files, what are their skills?
>
> Computer literate non-programmers, think sys admins. They know what you
> mean when you tell them to save a text file in Notepad as CSV or JSON
> or XML using the All Files type.
>
>> XML might be a good choice since you can validate it, and there are
>> editors dedicated to editing XML.
>
> If I have to validate the file format, I've already lost.
What I meant to write is "since there are already plenty of tools out
there to make validation easier".
You have to validate the data that you read, always. You have to check
for entries missing and to check if entries have values that are legal.
The advantage of XML is that the tools for validation are already
written, and that there are tools that given a sample XML file infer a
schema that can be used to validate the sample XML (which you might have
to fine-tune, of course), e.g. RELAX NG / Trang / Jing, see
http://relaxng.org/
If you use YAML/JSON you still have to check the values you get
(missing? within range?) inside your program.
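The in-program checks described above (is the entry missing, is the value in range) could be sketched as follows. This is only an illustration: the rule table, the field names ip and port, and the helper check_record are hypothetical, and the records would normally come from the decoded YAML/JSON file rather than a literal hash.

```perl
use strict;
use warnings;

# True if the string is a dotted quad with every octet in 0..255.
sub valid_ip {
    my ($value) = @_;
    my @octets = $value =~ /\A([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\z/;
    return 0 unless @octets;
    my $too_big = grep { $_ > 255 } @octets;    # count of out-of-range octets
    return $too_big ? 0 : 1;
}

# True if the string is an integer in 1..65535.
sub valid_port {
    my ($value) = @_;
    return 0 unless $value =~ /\A[0-9]{1,5}\z/;
    my $in_range = $value >= 1 && $value <= 65535;
    return $in_range ? 1 : 0;
}

# Hypothetical rule table: required field name => checking routine.
my %rules = ( ip => \&valid_ip, port => \&valid_port );

# Return a list of problems found in one record (empty list if none).
sub check_record {
    my ($name, $record) = @_;
    my @errors;
    for my $field (sort keys %rules) {
        my $value = $record->{$field};
        if (!defined $value) {
            push @errors, sprintf("%s: missing %s", $name, $field);
        }
        elsif (!$rules{$field}->($value)) {
            push @errors, sprintf("%s: bad %s value '%s'", $name, $field, $value);
        }
    }
    return @errors;
}

# Stand-in for data read from a config file.
my %records = (
    good => { ip => '192.0.2.10', port => 22 },
    bad  => { ip => '192.0.2.999' },    # octet out of range, port missing
);

for my $name (sort keys %records) {
    print "$_\n" for check_record($name, $records{$name});
}
```

A check like this catches missing entries and values that are out of range. It cannot catch a mistyped address that is still a legal address, which is the limit raised earlier in the thread.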
> I need to be able to give the user a file that he/she can open, read,
> change, and close in a simple text editor, like Notepad. Yeah, I could
> write a database utility but I shouldn't need to for IT staff to make
> simple changes to what will be (in essence) configuration files.
OK, XML in this case is harder since it's more verbose and hence more
prone to accidental errors.
>> Depends. If you are the main user and don't need the overhead of XML;
>> use JSON or YAML.
>>
>> If XML is a problem, switching to a good XML editor or an editor that
>> can handle XML more sanely might be the improvement you're looking for.
>
> It's too easy to damage an XML file, the format is fragile and you
> have to precisely match opening and closing tags with proper nesting
> and all that. This is why the simple
> INI 'key=value' and CSV 'key,col1,col2,col3,col4,col5' formats work so
> well.
<config>
<key>value</key>
</config>
Sure, it's more verbose, and hence more error prone. But you are /totally/
in control of the validation.
What happens if your INI has key+value or kye=value?
What happens if your CSV is missing a , or has one too many (assuming
notepad)?
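One way to make those INI failure cases visible is a parser that rejects anything it does not recognise instead of silently skipping it. The sketch below is an assumption about how such a check might look, not code from the thread: parse_ini and the list of known keys (host, password, logfile) are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical set of keys the program understands.
my %known = map { $_ => 1 } qw(host password logfile);

# Parse "key=value" lines. Returns (\%config, \@errors) so the caller can
# show every problem, with its line number, in one pass.
sub parse_ini {
    my ($text) = @_;
    my (%config, @errors);
    my $line_no = 0;
    for my $line (split /\r?\n/, $text) {
        $line_no++;
        next if $line =~ /\A\s*(?:[#;].*)?\z/;    # blank line or comment
        if ($line =~ /\A\s*([A-Za-z_]\w*)\s*=\s*(.*?)\s*\z/) {
            my ($key, $value) = ($1, $2);
            if ($known{$key}) {
                $config{$key} = $value;
            }
            else {
                # A misspelled key such as "kye" ends up here.
                push @errors, sprintf("line %d: unknown key '%s'", $line_no, $key);
            }
        }
        else {
            # A line such as "key+value" has no "=" and ends up here.
            push @errors, sprintf("line %d: not in key=value form: %s", $line_no, $line);
        }
    }
    return (\%config, \@errors);
}

my $sample = "host=192.0.2.10\nkye=value\npassword+secret\n";
my ($config, $errors) = parse_ini($sample);
print "$_\n" for @$errors;
```

With the sample text, both the misspelled key and the line using + instead of = are reported with their line numbers, so the user can fix the file without the programmer's help.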
--
John Bokma j3b
Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma
Freelance Perl & Python Development: http://castleamber.com/
------------------------------
Date: Tue, 10 May 2011 15:58:16 -0700 (PDT)
From: "jl_post@hotmail.com" <jl_post@hotmail.com>
Subject: Re: Bug in assigning a V-string literal to substr() in a loop?
Message-Id: <d1055286-4d44-4223-ba0c-40fb140d1716@24g2000yqk.googlegroups.com>
On May 3, 5:37 pm, "Uri Guttman" <u...@StemSystems.com> wrote:
>
>         Note: Version Strings (v-strings) have been deprecated. They
>         will be removed in some future release after Perl 5.8.1. The
>         marginal benefits of v-strings were greatly outweighed by the
>         potential for Surprise and Confusion.
>
> so the answer is don't use v-strings. bug or no bug they were a bad idea
> and never worked well enough to keep around.
Thanks for that. I never saw that deprecation announcement.
However, while I can find that same warning in "perldoc perldata"
for Perl 5.8, I can't seem to find it in any perldocs for Perl 5.12,
or on perldoc.perl.org. In fact, in "perldoc perl5120delta" it
explicitly says, "v-strings are not deprecated." (So maybe the v-
string deprecation itself was deprecated?)
Bad idea or not, this still appears to be a bug, so I think I
should report it. For all we know, the bug may not lie with v-strings
themselves, but rather with substr() or loops in general. (It's hard
to know for sure, and investigating this bug might uncover something
else wrong that no one was aware of.)
Cheers,
-- Jean-Luc
------------------------------
Date: Wed, 11 May 2011 00:36:24 -0400
From: "Uri Guttman" <uri@StemSystems.com>
Subject: Re: Bug in assigning a V-string literal to substr() in a loop?
Message-Id: <87aaetkg53.fsf@quad.sysarch.com>
>>>>> "jpc" == jl post@hotmail com <jl_post@hotmail.com> writes:
jpc> On May 3, 5:37 pm, "Uri Guttman" <u...@StemSystems.com> wrote:
>>
>> Note: Version Strings (v-strings) have been deprecated. They
>> will be removed in some future release after Perl 5.8.1. The
>> marginal benefits of v-strings were greatly outweighed by the
>> potential for Surprise and Confusion.
>>
>> so the answer is don't use v-strings. bug or no bug they were a bad idea
>> and never worked well enough to keep around.
jpc> Thanks for that. I never saw that deprecation announcement.
jpc> However, while I can find that same warning in "perldoc perldata"
jpc> for Perl 5.8, I can't seem to find it in any perldocs for Perl 5.12,
jpc> or on perldoc.perl.org. In fact, in "perldoc perl5120delta" it
jpc> explicitly says, "v-strings are not deprecated." (So maybe the v-
jpc> string deprecation itself was deprecated?)
jpc> Bad idea or not, this still appears to be a bug, so I think I
jpc> should report it. For all we know, the bug may not lie with
jpc> v-strings themselves, but rather with substr() or loops in
jpc> general. (It's hard to know for sure, and investigating this bug
jpc> might uncover something else wrong that no one was aware of.)
regardless of the deprecation state, v-strings are a poor idea. you can
report the bug but then you should drop the idea of using them. i don't
know of anyone who uses them or likes them. it was just a poorly thought
out idea which never worked as expected.
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 3380
***************************************