Perl-Users Digest, Issue: 2319 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sun Apr 5 16:14:47 2009
Date: Sun, 5 Apr 2009 13:14:41 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Sun, 5 Apr 2009 Volume: 11 Number: 2319
Today's topics:
Re: Perl symantics (Randal L. Schwartz)
Re: Perl symantics (Tim McDaniel)
Re: Perl symantics <ben@morrow.me.uk>
Process header record and concatenate files <sas_l_739@yahoo.com.au>
Re: Process header record and concatenate files <someone@example.com>
Re: Process header record and concatenate files <tadmc@seesig.invalid>
Re: Process header record and concatenate files <whynot@pozharski.name>
resolve single line with multiple items into mutliple l <ela@yantai.org>
Re: resolve single line with multiple items into mutlip <tadmc@seesig.invalid>
Re: resolve single line with multiple items into mutlip <ela@yantai.org>
Re: resolve single line with multiple items into mutlip <jurgenex@hotmail.com>
Re: resolve single line with multiple items into mutlip <ela@yantai.org>
Re: resolve single line with multiple items into mutlip <tadmc@seesig.invalid>
Re: resolve single line with multiple items into mutlip <jurgenex@hotmail.com>
Re: resolve single line with multiple items into mutlip <willem@snail.stack.nl>
Re: sftp; Emits a warning if the operation fails. <jimsgibson@gmail.com>
Substitutions based on Posix ERE's in perl <peter@makholm.net>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Fri, 03 Apr 2009 13:14:55 -0700
From: merlyn@stonehenge.com (Randal L. Schwartz)
Subject: Re: Perl symantics
Message-Id: <86fxgph4u8.fsf@blue.stonehenge.com>
>>>>> "Tim" == Tim McDaniel <tmcd@panix.com> writes:
Tim> For me, I like the sanity checking that can happen with strong typing.
Tim> If I know that variable "count" is used for the count of the number of
Tim> items that it's supposed to process, for examle, then I know it should
Tim> contain only an integer and I'd really like to know if code tries to
Tim> assign 'frog' to it, ideally at compile time.
You mean, you won't be writing unit tests for that, or relying on
the + operator to properly blow up?
Compile-time checking is insufficient to perform all tests, hence you will
need to write tests. Since you're writing tests anyway, compile-time checking
isn't necessary, since you're already checking for proper behavior.
Hence, compile-time checking is irrelevant.
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion
------------------------------
Date: Fri, 3 Apr 2009 23:32:04 +0000 (UTC)
From: tmcd@panix.com (Tim McDaniel)
Subject: Re: Perl symantics
Message-Id: <gr669k$fgv$1@reader1.panix.com>
In article <86fxgph4u8.fsf@blue.stonehenge.com>,
Randal L. Schwartz <merlyn@stonehenge.com> wrote:
>>>>>> "Tim" == Tim McDaniel <tmcd@panix.com> writes:
>
>Tim> For me, I like the sanity checking that can happen with strong
>Tim> typing. If I know that variable "count" is used for the count
>Tim> of the number of items that it's supposed to process, for
>Tim> example, then I know it should contain only an integer and I'd
>Tim> really like to know if code tries to assign 'frog' to it,
>Tim> ideally at compile time.
>
>You mean, you won't be ... relying on the + operator to properly blow
>up?
At compile time? Only if it does flow analysis.
>Compile-time checking is insufficient to perform all tests, hence you
>will need to write tests. Since you're writing tests anyway,
>compile-time checking isn't necessary, since you're already checking
>for proper behavior.
There are varying amounts of effort involved, though. See how many
replies here say some version of "Add use warnings; use strict;".
Sure, those checks are not sufficient to catch ALL problems, but that
little effort gets a big payback. Similarly, declarations like "int"
give only coarse-grained control but can yield large (if not complete)
paybacks.
Also, where it is possible to make a declaration that sticks, that's an
assertion provided in one place and used everywhere, instead of adding
an assertion everywhere the value gets assigned or used.
I have a vague memory of Perl adding some sort of attributes facility
for variables, but the same vague memory suggests that they were
half-baked and have never been really supported or popular. Am I
mistaken? If they are solid now, where can I read up on them?
--
Tim McDaniel, tmcd@panix.com
------------------------------
Date: Sat, 4 Apr 2009 02:31:21 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Perl symantics
Message-Id: <9spia6-fcv2.ln1@osiris.mauzo.dyndns.org>
Quoth tmcd@panix.com:
>
> I have a vague memory of Perl adding some sort of attributes facility
> for variables, but the same vague memory suggests that they were
> half-baked and have never been really supported or popular. Am I
> mistaken? If they are solid now, where can I read up on them?
Err... perldoc attributes? :)
That interface is quite low-level. You might rather look at the
Attribute::Handlers module, though since it's written by Damian Conway
it does some *rather* funky stuff behind the scenes.
Basic use of sub attributes is entirely solid. See Catalyst for an
example of a framework that makes extensive use of them. Beyond that I
have no idea (I don't know of any problems, but I've never used it
either).
Ben
------------------------------
Date: Sat, 4 Apr 2009 19:22:48 -0700 (PDT)
From: Scott Bass <sas_l_739@yahoo.com.au>
Subject: Process header record and concatenate files
Message-Id: <98e7ff8c-c579-4291-ad55-7b1272a2e38e@r31g2000prh.googlegroups.com>
Hi,
I'm not looking for a full blown solution, just architectural advice
for the following design criteria...
Input File(s): (tilde delimited)
Line 1:
Header Record:
SourceSystem~EffectiveDate~ExtractDateAndTime~NumberRecords~FileFormatVersion
RemainingRecords:
72 columns of delimited data
Output File:
Concatenate the input files into a single output file. A subset of
the header fields are prepended to the data lines as follows:
SourceSystem~EffectiveDate~ExtractDateAndTime~72 columns of delimited
data
Design Criteria:
1) If number of records in the file does not match the number of
records reported in the header (incomplete FTP), abort the entire
file, print an error message, but continue processing the remaining
files.
(I'll use split and join to process the header and prepend to the
remainder).
2) Specify the list of input files on the command line. Specify the
output file on the command line. For example:
concat.pl -in foo.dat bar.dat blah.dat -out concat.dat
or possibly:
concat.pl -in src_*.dat -out concat.dat
(I'll use GetOptions to process the command line)
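(The split/join step mentioned above might look something like this; the
sample header values are made up:)

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Sample header; real ones come from line 1 of each input file
my $header = "SRC~20090405~20090405T120000~3~1.0";
my ( $source, $eff_date, $extract_dt, $nrecs, $version ) = split /~/, $header;

# Keep the first three fields for prepending to each data line
my $prefix = join '~', $source, $eff_date, $extract_dt;
print "$prefix\n";    # each data line then becomes "$prefix~$data_line"
```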
My thoughts:
1) Slurp the file into an array (minus first record). Count the
elements in the array. Abort if not equal to the number in the
header, else concat to the output file.
2) Process the file, reading records. At EOF, get record number from
$. . If correct, rewind to beginning of file handle and concat to
output file. (Not sure how to do the rewind bit).
3) Process the file, writing to a temp file. At EOF, get record
number from $. . If correct, concat the temp file to the output file.
Questions:
A) If I've globbed the files on the command line and am processing
the file handle <>, how do I know when the file name has changed?
B) When that happens, how do I reset $. to 1?
C) Of the three approaches above, which is the "best"? Performance
is important but not critical. I lean toward #3, since I need to
cater for files too large for #1. Or if you have a better idea please
let me know.
I hope this wasn't too cryptic...I was trying to keep it short.
Thanks,
Scott
------------------------------
Date: Sat, 04 Apr 2009 20:40:16 -0700
From: "John W. Krahn" <someone@example.com>
Subject: Re: Process header record and concatenate files
Message-Id: <BKVBl.1095$g%5.1068@newsfe23.iad>
Scott Bass wrote:
>
> I'm not looking for a full blown solution, just architectural advice
> for the following design criteria...
>
> Input File(s): (tilde delimited)
> Line 1:
> Header Record:
> SourceSystem~EffectiveDate~ExtractDateAndTime~NumberRecords~FileFormatVersion
>
> RemainingRecords:
> 72 columns of delimited data
>
> Output File:
> Concatenate the input files into a single output file. A subset of
> the header fields are prepended to the data lines as follows:
>
> SourceSystem~EffectiveDate~ExtractDateAndTime~72 columns of delimited
> data
>
> Design Criteria:
> 1) If number of records in the file does not match the number of
> records reported in the header (incomplete FTP), abort the entire
> file, print an error message, but continue processing the remaining
> files.
>
> (I'll use split and join to process the header and prepend to the
> remainder).
>
> 2) Specify the list of input files on the command line. Specify the
> output file on the command line. For example:
>
> concat.pl -in foo.dat bar.dat blah.dat -out concat.dat
>
> or possibly:
>
> concat.pl -in src_*.dat -out concat.dat
>
> (I'll use GetOptions to process the command line)
>
> My thoughts:
>
> 1) Slurp the file into an array (minus first record). Count the
> elements in the array. Abort if not equal to the number in the
> header, else concat to the output file.
>
> 2) Process the file, reading records. At EOF, get record number from
> $. . If correct, rewind to beginning of file handle and concat to
> output file. (Not sure how to do the rewind bit).
>
> 3) Process the file, writing to a temp file. At EOF, get record
> number from $. . If correct, concat the temp file to the output file.
>
> Questions:
>
> A) If I've globbed the files on the command line and am processing
> the file handle <>, how do I know when the file name has changed?
perldoc -f eof
[ snip ]
In a "while (<>)" loop, "eof" or "eof(ARGV)" can be used to
detect the end of each file, "eof()" will only detect the
end of the last file.
> B) When that happens, how do I reset $. to 1?
When you reach the end-of-file as determined by the eof function close
the ARGV filehandle and $. will be reset.
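A minimal, self-contained sketch of that pattern (the sample file names
and contents here are made up):

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Demo setup: two small sample files (in real use the names are already
# in @ARGV from the command line)
for my $pair ( [ 'a.dat', "1\n2\n" ], [ 'b.dat', "3\n4\n5\n" ] ) {
    open my $fh, '>', $pair->[0] or die "open: $!";
    print $fh $pair->[1];
    close $fh;
}
@ARGV = ( 'a.dat', 'b.dat' );

# Read all files through <>; eof (no parens) is true at the end of each
# file, and closing ARGV there resets $. for the next file.
my %lines_in;
while (<>) {
    if (eof) {
        $lines_in{$ARGV} = $.;    # $. is this file's own line count
        close ARGV;
    }
}
print "$_: $lines_in{$_} lines\n" for sort keys %lines_in;
unlink 'a.dat', 'b.dat';
```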
> C) Of the three approaches above, which is the "best"? Performance
> is important but not critical.
You'd probably have to test them with real data to determine the "best".
John
--
Those people who think they know everything are a great
annoyance to those of us who do. -- Isaac Asimov
------------------------------
Date: Sat, 4 Apr 2009 23:02:41 -0500
From: Tad J McClellan <tadmc@seesig.invalid>
Subject: Re: Process header record and concatenate files
Message-Id: <slrngtgbb1.6ap.tadmc@tadmc30.sbcglobal.net>
Scott Bass <sas_l_739@yahoo.com.au> wrote:
> Questions:
>
> A) If I've globbed the files on the command line and am processing
> the file handle <>, how do I know when the file name has changed?
>
> B) When that happens, how do I reset $. to 1?
perldoc -f eof
--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
------------------------------
Date: Sun, 05 Apr 2009 15:27:42 +0300
From: Eric Pozharski <whynot@pozharski.name>
Subject: Re: Process header record and concatenate files
Message-Id: <slrngth8ui.7s8.whynot@orphan.zombinet>
On 2009-04-05, Scott Bass <sas_l_739@yahoo.com.au> wrote:
*SKIP*
> A) If I've globbed the files on the command line and am processing
> the file handle <>, how do I know when the file name has changed?
>
> B) When that happens, how do I reset $. to 1?
If I got your problem right, you've missed I<@ARGV>; you could write
your own loop over the command-line files. Or, alternatively, monitor
I<$ARGV> and maintain your own line counter instead of using I<$.>.
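A sketch of the explicit loop (the sample file is made up; in real use
the names come from I<@ARGV>):

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Demo setup: one small sample file
open my $out, '>', 'sample.dat' or die "open: $!";
print $out "header\ndata1\ndata2\n";
close $out;

my %count;
for my $file ('sample.dat') {    # or: for my $file (@ARGV)
    open my $fh, '<', $file or do { warn "cannot open $file: $!\n"; next };
    while (<$fh>) {
        $count{$file} = $.;      # $. belongs to $fh, fresh for each file
    }
    close $fh;
}
print "$_: $count{$_} lines\n" for sort keys %count;
unlink 'sample.dat';
```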
> C) Of the three approaches above, which is the "best"? Performance
> is important but not critical. I lean toward #3, since I need to
> cater for files too large for #1. Or if you have a better idea please
> let me know.
use Your::Taste qw| full |;
or
use Your::Intuition qw| reverse |;
or
use Benchmark qw| timethese |;
> I hope this wasn't too cryptic...I was trying to keep it short.
You'd better show your code. Perl is powerfully expressive, or
expressively powerful (I doubt I would ever get that right).
--
Torvalds' goal for Linux is very simple: World Domination
Stallman's goal for GNU is even simpler: Freedom
------------------------------
Date: Sun, 5 Apr 2009 14:50:04 +0800
From: "ela" <ela@yantai.org>
Subject: resolve single line with multiple items into mutliple lines, single items
Message-Id: <gr9kaq$nr3$1@ijustice.itsc.cuhk.edu.hk>
Old line(columns tab-delimited):
Col1 Col2 Col3 ... Coln
A B1@B2 C ... N1@N2@N3
New lines
A B1 C .. N1
A B1 C .. N2
A B1 C .. N3
A B2 C .. N1
A B2 C .. N2
A B2 C .. N3
The problem is: pattern matching can recognize "@", but how do I write
the code generically so as to get all of N1, N2, and N3 when the number
of items isn't known beforehand?
------------------------------
Date: Sun, 5 Apr 2009 08:42:42 -0500
From: Tad J McClellan <tadmc@seesig.invalid>
Subject: Re: resolve single line with multiple items into mutliple lines, single items
Message-Id: <slrngthdai.akv.tadmc@tadmc30.sbcglobal.net>
ela <ela@yantai.org> wrote:
> Old line(columns tab-delimited):
>
> Col1 Col2 Col3 ... Coln
> A B1@B2 C ... N1@N2@N3
>
> New lines
> A B1 C .. N1
> A B1 C .. N2
> A B1 C .. N3
> A B2 C .. N1
> A B2 C .. N2
> A B2 C .. N3
>
> The problem is: although pattern matching can recognize "@", but how to
> write the code generically so to get all N1, N2 and N3, such that the number
> of items aren't known beforehand?
-------------------------
#!/usr/bin/perl
use warnings;
use strict;
$_ = "A\tB1\@B2\tC\tN1\@N2\@N3\n";
#1 while s/(.*?)([^\t]+)\@([^\t\n]+)(.*\n)/$1$2$4$1$3$4/;
1 while s/(.*?) # before pair to expand
([^\t]+) # left value
\@
([^\t\n]+) # right value
(.*\n) # after pair to expand
/$1$2$4$1$3$4/x;
print;
-------------------------
--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
------------------------------
Date: Sun, 5 Apr 2009 23:44:59 +0800
From: "ela" <ela@yantai.org>
Subject: Re: resolve single line with multiple items into mutliple lines, single items
Message-Id: <grajlp$5du$1@ijustice.itsc.cuhk.edu.hk>
I really thank Willem & McClellan, who proposed solutions. Yet there are
too many symbols that can't be Googled, and I fail to understand their
code....
------------------------------
Date: Sun, 05 Apr 2009 09:09:33 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: resolve single line with multiple items into mutliple lines, single items
Message-Id: <rllht45qejmd4s8g67rkfam9kr1ss58n22@4ax.com>
"ela" <ela@yantai.org> wrote:
>Old line(columns tab-delimited):
>
>Col1 Col2 Col3 ... Coln
>A B1@B2 C ... N1@N2@N3
>
>New lines
>A B1 C .. N1
>A B1 C .. N2
>A B1 C .. N3
>A B2 C .. N1
>A B2 C .. N2
>A B2 C .. N3
>
>The problem is: although pattern matching can recognize "@", but how to
>write the code generically so to get all N1, N2 and N3, such that the number
>of items aren't known beforehand?
split() the line at tabs (to get the individual columns), then foreach()
column split() at '@' to get the list of individual values.
This automatically leads to a nested loop, which you can use nicely to
print the lines in the desired order.
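A sketch of that approach, using the sample line from the original post;
since the number of columns isn't fixed, the nested loop is expressed as
a repeated expansion rather than hard-coded foreach()'s:

```perl
#!/usr/bin/perl
use warnings;
use strict;

my $line = "A\tB1\@B2\tC\tN1\@N2\@N3";

# split() at tab for the columns, then split() each column at '@'
my @columns = map { [ split /\@/ ] } split /\t/, $line;

# Expand into every combination: start with one empty row, then append
# each value of each column in turn.
my @rows = ( [] );
for my $col (@columns) {
    @rows = map { my @row = @$_; map { [ @row, $_ ] } @$col } @rows;
}
print join( "\t", @$_ ), "\n" for @rows;
```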
jue
------------------------------
Date: Mon, 6 Apr 2009 00:22:16 +0800
From: "ela" <ela@yantai.org>
Subject: Re: resolve single line with multiple items into mutliple lines, single items
Message-Id: <gralvh$6i3$1@ijustice.itsc.cuhk.edu.hk>
> split() line at tab (to get indivudual column), then foreach() column
> split() at '@' to get list of individual values.
>
> This automatically leads to a nested loop, which you can use nicely to
> print the lines in the desired order.
>
> jue
It seems that this is also a direction; can foreach() be used
recursively? Because I don't want to write "n" foreach()'s.
------------------------------
Date: Sun, 5 Apr 2009 11:11:34 -0500
From: Tad J McClellan <tadmc@seesig.invalid>
Subject: Re: resolve single line with multiple items into mutliple lines, single items
Message-Id: <slrngthm1m.c3h.tadmc@tadmc30.sbcglobal.net>
ela <ela@yantai.org> wrote:
> I really thank Willem & McClellan, who proposed solns. Yet, there are too
> many symbols that can't be Googled,
Good, because you don't want to find random crap on the interweb.
You want to find focused and accurate information on your own hard disk.
perldoc perlrequick
perldoc perlretut
perldoc perlre
> I fail to understand their codes....
If you ask specific questions about specific bits of code
(after first trying to find the answer in the standard docs), you will
likely get help here.
"I fail to understand their codes" is too general for us
to be able to help you.
--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
------------------------------
Date: Sun, 05 Apr 2009 10:00:48 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: resolve single line with multiple items into mutliple lines, single items
Message-Id: <0tmht4t9rdfcuirru8l6g936vh4fuctdj0@4ax.com>
"ela" <ela@yantai.org> wrote:
>> split() line at tab (to get indivudual column), then foreach() column
>> split() at '@' to get list of individual values.
>>
>> This automatically leads to a nested loop, which you can use nicely to
>> print the lines in the desired order.
>
>It seems that this is also a direction, can foreach() be recursively used?
???
Recursion and loops are two different ways to achieve the same result:
repeating the execution of some code with modified data. Yes, of course
you can mix them as you like, but why would you want to?
>Because I don't want to write "n" foreach()'s.
Having said that, I spoke too hastily. Nested foreach() loops are great
for getting the individual values and storing them, e.g. in an AoA.
But creating the output within the same loop is very awkward, and you
will be far better off storing the data first and using a second loop as
suggested by others, or using a recursive algorithm.
jue
------------------------------
Date: Sun, 5 Apr 2009 08:23:30 +0000 (UTC)
From: Willem <willem@snail.stack.nl>
Subject: Re: resolve single line with multiple items into mutliple lines, single items
Message-Id: <slrngtgqk2.148s.willem@snail.stack.nl>
ela wrote:
) Old line(columns tab-delimited):
)
) Col1 Col2 Col3 ... Coln
) A B1@B2 C ... N1@N2@N3
)
) New lines
) A B1 C .. N1
) A B1 C .. N2
) A B1 C .. N3
) A B2 C .. N1
) A B2 C .. N2
) A B2 C .. N3
)
) The problem is: although pattern matching can recognize "@", but how to
) write the code generically so to get all N1, N2 and N3, such that the number
) of items aren't known beforehand?
Well obviously first you create an array of arrays for the rows and columns.
And then how about something which looks a bit like:
for $i (1 .. $n) {
@columns = map {
my @row = @$_;
map {
(@row[0..($i-1)], $_, @row[($i+1).. $n])
} split('@', $row[$i]);
} @columns;
}
But of course this is overly complex and can probably be reduced to a
clever one-liner...
SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
------------------------------
Date: Fri, 03 Apr 2009 14:13:16 -0700
From: Jim Gibson <jimsgibson@gmail.com>
Subject: Re: sftp; Emits a warning if the operation fails.
Message-Id: <030420091413163339%jimsgibson@gmail.com>
In article
<93b253d4-d898-471f-8dfc-2a4fad996b2f@e2g2000vbe.googlegroups.com>,
okey <oldyork90@yahoo.com> wrote:
> Changing to net::sftp from ftp. What does this mean (in discussing
> how a particular method fails)
>
> Emits a warning if the operation fails.
>
> Are they talking about the return status?, or is there something else
> that must be captured or monitored?
>
> Thank you
>
> Example text: =============
>
> $sftp->do_remove($path)
>
> Sends a SSH_FXP_REMOVE command to remove the remote file $path.
>
> Emits a warning if the operation fails.
>
> Returns the status code for the operation. To turn the status code
> into a text message, take a look at the fx2txt function in
> Net::SFTP::Util.
A quick glance at the source code for the Net::SFTP module reveals that
it uses the croak routine of the Carp module for "emitting" error
messages. While the do_remove function is not part of the Net::SFTP
package itself, the do_read function is and uses croak, so do_remove is
probably similar. The croak routine prints a message to standard error,
like die, but gives the line number in the calling program (your
program) instead of in the module.
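The effect is easy to see in a self-contained sketch (the routine name
here is made up, standing in for something like do_remove):

```perl
#!/usr/bin/perl
use warnings;
use strict;
use Carp;

sub remove_file {    # hypothetical stand-in for a module routine
    my ($path) = @_;
    croak "cannot remove '$path'";    # reports the caller's line, not this one
}

# croak dies like die, so trap it with eval:
my $ok = eval { remove_file('/no/such/file'); 1 };
warn "operation failed: $@" unless $ok;
```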
You can view the source code for Net::SFTP by executing 'perldoc -m
Net::SFTP' or browsing to
<http://cpansearch.perl.org/src/DBROBINS/Net-SFTP-0.10/lib/Net/SFTP.pm>
--
Jim Gibson
------------------------------
Date: Sun, 05 Apr 2009 17:57:53 +0200
From: Peter Makholm <peter@makholm.net>
Subject: Substitutions based on Posix ERE's in perl
Message-Id: <87skknum7y.fsf@vps1.hacking.dk>
For a project I have to implement lookups using NAPTR records from
DNS. Basically they consist of a substitution using POSIX ERE
syntax (see RFC 3403).
Translating the regexp to Perl with minor corrections would probably
solve my problem in most of the well-behaved cases. But what about the
not-so-well-behaved cases?
Is there an easy way to sanitize a regular expression such that it
is safe to run? Or should I write a full translation from ERE to Perl?
Searching CPAN doesn't give me anything useful.
Any other ideas?
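The most conservative idea I have so far is a whitelist check; this is
only a sketch, and the accepted character set is my own guess rather
than anything from RFC 3403:

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Reject anything that could be a Perl-only construct, then require that
# only plain ERE metacharacters and ordinary characters remain.
sub safe_ere {
    my ($ere) = @_;
    return 0 if $ere =~ /\(\?/;             # Perl extended groups like (?i), (?{...})
    return 0 if $ere =~ /\\[a-zA-Z0-9]/;    # Perl escapes like \d, \s, \1
    return $ere =~ /\A[-\w\s.*+?()\[\]|^\${},]*\z/ ? 1 : 0;
}

print safe_ere('([0-9]+)')   ? "ok\n" : "rejected\n";
print safe_ere('(?{ die })') ? "ok\n" : "rejected\n";
```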
//Makholm
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 2319
***************************************