[29191] in Perl-Users-Digest
Perl-Users Digest, Issue: 435 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon May 14 18:09:35 2007
Date: Mon, 14 May 2007 15:09:04 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Mon, 14 May 2007 Volume: 11 Number: 435
Today's topics:
Re: Dynamic page break usenet@DavidFilmer.com
Re: Dynamic page break <edMbj@aes-intl.com>
Re: Dynamic page break usenet@DavidFilmer.com
Re: Dynamic page break <spamtrap@dot-app.org>
Re: Dynamic page break usenet@DavidFilmer.com
Re: Dynamic page break <edMbj@aes-intl.com>
Re: Dynamic page break <edMbj@aes-intl.com>
Re: Dynamic page break <bik.mido@tiscalinet.it>
execute current PL script in kdevelop on FC5/KDE <tlviewer@yahoo.com>
Re: Firefox -- browsing activity logger? <ignoramus20083@NOSPAM.20083.invalid>
Re: Firefox -- browsing activity logger? <glex_no-spam@qwest-spam-no.invalid>
Re: How can I search and replace a string while preserv <jgibson@mail.arc.nasa.gov>
Net::SSH2 - Authentication without a password? <CSB001@gmail.com>
Re: Parsing a text file line-by-line: skipping badly-fo denis.papathanasiou@gmail.com
Re: Parsing a text file line-by-line: skipping badly-fo <someone@example.com>
Re: Parsing a text file line-by-line: skipping badly-fo denis.papathanasiou@gmail.com
Re: Parsing a text file line-by-line: skipping badly-fo (Greg Bacon)
Re: Parsing a text file line-by-line: skipping badly-fo <m@rtij.nl.invlalid>
Re: Parsing a text file line-by-line: skipping badly-fo denis.papathanasiou@gmail.com
Re: Regular Expression Question <ced@blv-sam-01.ca.boeing.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: 14 May 2007 11:09:43 -0700
From: usenet@DavidFilmer.com
Subject: Re: Dynamic page break
Message-Id: <1179166183.372215.201790@o5g2000hsb.googlegroups.com>
On May 14, 10:55 am, Ed Jay <e...@aes-intl.com> wrote:
> a method for inserting page breaks at
If you're using FORMATs, set $- to zero.
If not using formats, print "\f";
--
The best way to get a good answer is to ask a good question.
David Filmer (http://DavidFilmer.com)
------------------------------
Date: Mon, 14 May 2007 11:40:00 -0700
From: Ed Jay <edMbj@aes-intl.com>
Subject: Re: Dynamic page break
Message-Id: <mvah43lvdf8k1toh7s6p9c8938c7gsrpf3@4ax.com>
usenet@DavidFilmer.com scribed:
>On May 14, 10:55 am, Ed Jay <e...@aes-intl.com> wrote:
>> a method for inserting page breaks at
>
>If you're using FORMATs, set $- to zero.
>
>If not using formats, print "\f";
Thanks. I know how to add page breaks, but I was asking how to place page
breaks dynamically.
The report I want to print has dynamically created sentences, paragraphs,
etc., so it can be half a page long up to several pages in length. It would
be preferable to print without widows, with page numbers, etc.
--
Ed Jay (remove 'M' to respond by email)
------------------------------
Date: 14 May 2007 11:55:56 -0700
From: usenet@DavidFilmer.com
Subject: Re: Dynamic page break
Message-Id: <1179168956.455164.263330@k79g2000hse.googlegroups.com>
On May 14, 11:40 am, Ed Jay <e...@aes-intl.com> wrote:
> Thanks. I know how to add page breaks, but I was asking how to place page
> breaks dynamically.
Judging by the context of your message, I think you're confusing the
term "dynamically" with "automatically". If you want Perl to
"automatically" manage your pagination for you (including optional
headers/footers, page numbering, etc) and "automatically" insert page
breaks when you exceed a certain linecount threshold then you may use
a format:
perldoc perlform
You can "dynamically" insert a page break in a format by setting $- to
zero in your program at the point that you want to force the page
break to occur.
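
As a rough illustration of the format approach, here is a minimal sketch that prints a small report to STDOUT and forces a page break before one record by zeroing $-. The report fields ($item, $qty), the 10-line page length, and the sample records are assumptions made up for this example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical report fields; formats can see package variables declared with "our".
our ( $item, $qty );

# Top-of-page format: printed at the start of every page. $% is the page number.
format STDOUT_TOP =
Item          Qty   Page @<
$%
--------------------------
.

# Body format: one line per record.
format STDOUT =
@<<<<<<<<<<<< @>>
$item,        $qty
.

$= = 10;    # assumed page length in lines (the default is 60)

for my $rec ( [ 'apples', 3 ], [ 'pears', 7 ], [ 'plums', 2 ] ) {
    ( $item, $qty ) = @$rec;

    # $- holds the number of lines left on the current page. Zeroing it makes
    # the next write() start a new page (form feed, then the top-of-page format).
    $- = 0 if $item eq 'plums';

    write;
}
```

Without the forced break all three records would fit on one page; with it, the last record lands on page 2 under a fresh header.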
--
The best way to get a good answer is to ask a good question.
David Filmer (http://DavidFilmer.com)
------------------------------
Date: Mon, 14 May 2007 14:59:07 -0400
From: Sherm Pendley <spamtrap@dot-app.org>
Subject: Re: Dynamic page break
Message-Id: <m2wszbgrk4.fsf@local.wv-www.com>
Ed Jay <edMbj@aes-intl.com> writes:
> The report I want to print has dynamically created sentences, paragraphs,
> etc., so it can be half a page long up to several pages in length. It would
> be preferable to print without widows, with page numbers, etc.
You could output to an intermediate format such as TeX or XSL, then call
an external tool to typeset that to a PDF.
LaTeX produces especially nice-looking results, IMHO.
Links:
<http://www.latex-project.org/>
<http://www.w3.org/TR/xsl/>
<http://xmlgraphics.apache.org/fop/>
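
A minimal sketch of the "intermediate format" idea, assuming the report is just a list of generated paragraphs. The sub names latex_report() and escape_tex() and the sample text are illustrative only; the resulting file would still have to be run through an external tool such as pdflatex. The article class numbers pages by default, and the two penalties ask TeX to avoid widowed and orphaned lines:

```perl
use strict;
use warnings;

# Escape the common characters LaTeX treats specially in running text.
# (Backslash, tilde and caret are not handled in this sketch.)
sub escape_tex {
    my ($text) = @_;
    $text =~ s/([&%\$#_{}])/\\$1/g;
    return $text;
}

# Build a minimal LaTeX document from a list of paragraphs.
sub latex_report {
    my @paragraphs = @_;
    my $body = join "\n\n", map { escape_tex($_) } @paragraphs;
    return <<"END_TEX";
\\documentclass{article}
\\widowpenalty=10000  % discourage a lone last line at the top of a page
\\clubpenalty=10000   % discourage a lone first line at the bottom of a page
\\begin{document}
$body
\\end{document}
END_TEX
}

print latex_report( 'First generated paragraph.', 'Costs rose 5% in Q2.' );
```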
sherm--
--
Web Hosting by West Virginians, for West Virginians: http://wv-www.net
Cocoa programming in Perl: http://camelbones.sourceforge.net
------------------------------
Date: 14 May 2007 12:01:15 -0700
From: usenet@DavidFilmer.com
Subject: Re: Dynamic page break
Message-Id: <1179169275.086454.285530@k79g2000hse.googlegroups.com>
On May 14, 11:55 am, use...@DavidFilmer.com wrote:
> perldoc perlform
Oh, waitaminute - I just noticed in your reply that you want to avoid
widows. format isn't smart enough to do that; you'll need Dr.
Damian's full blown Text::Autoformat:
http://search.cpan.org/~dconway/Text-Autoformat-1.13/lib/Text/Autoformat.pm
--
The best way to get a good answer is to ask a good question.
David Filmer (http://DavidFilmer.com)
------------------------------
Date: Mon, 14 May 2007 12:06:57 -0700
From: Ed Jay <edMbj@aes-intl.com>
Subject: Re: Dynamic page break
Message-Id: <5qch43586m7gsbqjnrjr6bkg17ulvmok9n@4ax.com>
Sherm Pendley scribed:
>Ed Jay <edMbj@aes-intl.com> writes:
>
>> The report I want to print has dynamically created sentences, paragraphs,
>> etc., so it can be half a page long up to several pages in length. It would
>> be preferable to print without widows, with page numbers, etc.
>
>You could output to an intermediate format such as TeX or XSL, then call
>an external tool to typeset that to a PDF.
>
>LaTeX produces especially nice-looking results, IMHO.
>
>Links:
>
> <http://www.latex-project.org/>
>
> <http://www.w3.org/TR/xsl/>
> <http://xmlgraphics.apache.org/fop/>
>
Thanks much.
--
Ed Jay (remove 'M' to respond by email)
------------------------------
Date: Mon, 14 May 2007 12:06:42 -0700
From: Ed Jay <edMbj@aes-intl.com>
Subject: Re: Dynamic page break
Message-Id: <qpch43dsm1akrfjogj66spo5snar6i68f1@4ax.com>
usenet@DavidFilmer.com scribed:
>On May 14, 11:55 am, use...@DavidFilmer.com wrote:
>> perldoc perlform
>
>Oh, waitaminute - I just noticed in your reply that you want to avoid
>widows. format isn't smart enough to do that; you'll need Dr.
>Damian's full blown Text::Autoformat:
> http://search.cpan.org/~dconway/Text-Autoformat-1.13/lib/Text/Autoformat.pm
Thanks much.
--
Ed Jay (remove 'M' to respond by email)
------------------------------
Date: Mon, 14 May 2007 21:18:15 +0200
From: Michele Dondi <bik.mido@tiscalinet.it>
Subject: Re: Dynamic page break
Message-Id: <17dh43dvqvdjlbmgvq1r39mhctlkqphdmp@4ax.com>
On Mon, 14 May 2007 14:59:07 -0400, Sherm Pendley
<spamtrap@dot-app.org> wrote:
>LaTeX produces especially nice-looking results, IMHO.
IYNSHO. Not only yours.
> <http://www.latex-project.org/>
Somewhat surprisingly, a better entry point would perhaps be
<http://www.tug.org/>
Michele
--
{$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
(($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
.'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,
------------------------------
Date: Mon, 14 May 2007 13:05:47 -0700
From: Mark Pryor <tlviewer@yahoo.com>
Subject: execute current PL script in kdevelop on FC5/KDE
Message-Id: <pan.2007.05.14.20.05.47.14758@yahoo.com>
hello,
I have FC5 installed on my laptop with kdevelop and KDE.
I go to MainMenu -> Programming -> kdevelop: Scripting
I open my target PL script, make a few changes, and save it. Now I
want to run it and capture the output into the window. I see the tools
menu, with "execute command". However there is no shortcut to execute the
current script (like %f).
Can anyone show the best way to do this in kdevelop?
tia,
Mark
------------------------------
Date: Mon, 14 May 2007 14:07:41 -0500
From: Ignoramus20083 <ignoramus20083@NOSPAM.20083.invalid>
Subject: Re: Firefox -- browsing activity logger?
Message-Id: <t9CdnYHHh6_gLtXbnZ2dnUVZ_h_inZ2d@giganews.com>
On Sat, 12 May 2007 06:55:24 GMT, Bart Lateur <bart.lateur@pandora.be> wrote:
> Ignoramus6365 wrote:
>
>>I have a hell of a time in doing so, due to Google's use of
>>javascript, which is not known to WWW::Mechanize. This is by far not
>>my first web scraping script, I did maybe a dozen or two prior, but I
>>am not getting headway with it. I tried and tried and thought that I
>>was doing a perfect job, but still I fail.
>
> Try one of the other scrapers instead, that use a browser's core to
> scrape. For your application (Firefox, Linux) I suspect
> Mozilla::Mechanize might work.
>
> http://search.cpan.org/perldoc?Mozilla::Mechanize
>
> (Oddly enough, the introduction to it is not found in the POD, but in
> the README:
>
> http://search.cpan.org/dist/Mozilla-Mechanize/README
> )
>
> That is, if you can get the module it depends upon, Gtk2::MozEmbed, to
> work.
>
> http://search.cpan.org/dist/Gtk2-MozEmbed/
>
>
OK, I tried and tried, but could not install Mozilla::DOM. That module
says that it requires mozilla-xpcom or firefox-xpcom. How to install
these (is there a tar file, RPM, or ANYTHING else) is a very closely
guarded mystery.
i
------------------------------
Date: Mon, 14 May 2007 16:13:28 -0500
From: "J. Gleixner" <glex_no-spam@qwest-spam-no.invalid>
Subject: Re: Firefox -- browsing activity logger?
Message-Id: <4648d0f8$0$502$815e3792@news.qwest.net>
Ignoramus20083 wrote:
>
> OK, I tried and tried, but could not install Mozilla::DOM. That module
> says that it requires mozilla-xpcom or firefox-xpcom. How to install
> these (is there a tar file, RPM, or ANYTHING else) is a very closely
> guarded mystery.
>
> i
http://www.mozilla.org/projects/xpcom/
------------------------------
Date: Mon, 14 May 2007 12:38:52 -0700
From: Jim Gibson <jgibson@mail.arc.nasa.gov>
Subject: Re: How can I search and replace a string while preserving (not removing) trailing spaces?
Message-Id: <140520071238522994%jgibson@mail.arc.nasa.gov>
In article <1179027791.676809.209840@q75g2000hsh.googlegroups.com>,
rsarpi <rsarpi@gmail.com> wrote:
> Sorry the dumb question. I'm a newbie.
>
> But how can I search and replace a string while preserving (not
> removing) trailing spaces?
>
> let me explain: the command 's/$old_string/$new_string/g' inserts the
> new string and *moves* spaces to the right or left (depending on the
> length of the 'new_string') making the whole sentence bigger or
> smaller.
>
> for example:
>
> my $sentence = "*Peter Parker is Spider Man *";
> print length($sentence); #prints 43 characters
>
> I'd like to know a trick in which, if I get rid of "Peter" I would
> _still_ get 43 characters in length for the whole sentence between
> both quotation marks. Here the whole sentence between quotation marks
> expands until it reaches 43 chars in length.
>
> my $sentence = "*Parker is Spider Man *";
> print length($sentence );#still 43
>
>
> And if I add to the sentence Benjamin (between Peter and Parker) I
> would like to get exactly 43 characters between both quotation marks.
> Here the whole sentence between quotation marks shrinks until it gets
> up 43 chars in length.
>
> my $sentence = "*Peter Benjamin Parker is Spider Man *";
> print length($sentence ); #still 43
>
Use substr to return an lvalue and assign with spaces:
substr($x,length $x,43) = " "x(43-length $x);
or, more simply:
while( length $x < 43 ) { $x .= ' '; };
or just
$x .= ' ' x (43-length $x);
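
The lines above only pad a string that has become too short. If the replacement can also make the string longer than 43 characters, it has to be cut back as well. A minimal sketch that does both with sprintf, assuming the field is plain text followed by trailing spaces (the sub name replace_fixed() is made up for this example):

```perl
use strict;
use warnings;

# Replace $old with $new in $text, then pad or truncate the result to
# exactly $width characters.
sub replace_fixed {
    my ( $text, $old, $new, $width ) = @_;
    $text =~ s/\Q$old\E/$new/g;

    # %-*.*s : left-justify, pad to $width, and cut off anything beyond it.
    return sprintf '%-*.*s', $width, $width, $text;
}

my $sentence = sprintf '%-43s', 'Peter Parker is Spider Man';
my $shorter  = replace_fixed( $sentence, 'Peter ', '', 43 );
my $longer   = replace_fixed( $sentence, 'Peter', 'Peter Benjamin', 43 );

print length($shorter), "\n";    # 43
print length($longer),  "\n";    # 43
```

Note that truncation is blind: if the new text itself is longer than the field, real characters are lost, not just trailing spaces.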
------------------------------
Date: 14 May 2007 12:31:11 -0700
From: CsB <CSB001@gmail.com>
Subject: Net::SSH2 - Authentication without a password?
Message-Id: <1179171071.522062.114690@o5g2000hsb.googlegroups.com>
I have Net::SSH2 installed and use it to connect to 90% of
the networking equipment I need to access. It works great, thank you
to zentara for the suggestion.
However, I'm trying to find a solution for one vendor that decided to
be different from the others.
For argument's sake, we'll call the vendor 'alpha'.
Alpha equipment apparently requires an SSH login name but no password.
When I use SecureCRT to connect to one of the alpha boxes, (using
Keyboard Interactive or Password as the authentication method) I am
prompted for a username. Once I enter the proper username, the SSH2
session is established, a shell is started, and I am then presented
with a new set of userid and password requests which are authenticated
against our RADIUS servers. It never prompts me for a password to
accompany the initial username request. I have confirmed (with the
administrator of this equipment) that it doesn't require a password to
establish the SSH2 session.
I cannot find many examples of Net::SSH2 so I have struggled to find
an answer. While reading through the docs, I have tried various
options including:
$ssh2->auth_password( 'username' ) or die "Unable to login $@ \n";
$ssh2->auth_password( 'username', '' ) or die "Unable to login $@ \n";
$ssh2->auth_keyboard( 'username' ) or die "Unable to login $@ \n";
$ssh2->auth_list( username );
(then checking the value of $ssh2->auth_ok)
All have failed. I cannot seem to figure out how to emulate the login
process executed by SecureCRT.
Any suggestions would be greatly appreciated.
------------------------------
Date: 14 May 2007 11:45:55 -0700
From: denis.papathanasiou@gmail.com
Subject: Re: Parsing a text file line-by-line: skipping badly-formed lines?
Message-Id: <1179168354.983185.78490@e51g2000hsg.googlegroups.com>
> You say your program exits with an error, but you didn't say what
> the error is.
My fault, I should have been more precise.
$? actually returns 0 but I know that is incorrect because the output
is not as expected.
The large text file contains data from "A" to "Z", so a successful run
would result in 26 smaller files.
But the output we get stops at "R", so either one of the "R" lines (or
possibly the start of the "S" data) is malformed.
> What's the error? What version of perl are you using? What's your
> operating system?
$ perl -v
This is perl, v5.8.4 built for i386-linux-thread-multi
$ uname -sro
Linux 2.4.27-2-386 GNU/Linux
> Your chances of receiving a helpful reply are even better if you can
> provide input that causes the problem. Yes, transmitting non-printable
> characters on Usenet is a pain, so uuencode the input or write a Perl
> program that can recreate it!
Getting to the exact line with the problem has been surprisingly
difficult: the input file is 14 GB in size, which is too big for the
hex editor we use (shed).
I've also tried split to break up the file into smaller chunks, so I
can load the "R" or "S" chunk into shed and look at the line, but
split suffers the same problem, i.e. it only gets so far through the
original file before it quits, leaving the "S" to "Z" range unsplit.
I'd also thought it might have to do with the $. variable (perhaps at
14 GB, it exceeds perl's ability to count that high?), but removing
that logic in my script didn't change the result.
------------------------------
Date: Mon, 14 May 2007 19:28:15 GMT
From: "John W. Krahn" <someone@example.com>
Subject: Re: Parsing a text file line-by-line: skipping badly-formed lines?
Message-Id: <jL22i.9646$V75.3767@edtnps89>
denis.papathanasiou@gmail.com wrote:
> I have a script which reads a plain text (dos) file line-by-line and
> splits it into several smaller files, based on a single attribute.
>
> The code (below) works, except when a line is malformed (i.e., the
> line contains binary or control characters), and the script just exits
> with an error:
>
> open(IN, "$IN_FILE") or die "\n\terror: Could not read $IN_FILE $!
perldoc -q quoting
Also, you should get into the habit of using the three argument form of open:
open IN, '<', $IN_FILE or die "\n\terror: Could not read $IN_FILE $!\n";
> \n"; ;
> binmode(IN);
You can also incorporate that into the open statement:
open IN, '<:raw', $IN_FILE or die "\n\terror: Could not read $IN_FILE $!\n";
> while( $ln=<IN> ) {
> if( $ln =~ m/\r\n$/ ) {
> $ln =~ s/\r\n$/\n/; # dos2unix: convert CR LF to LF
You don't need to match the same pattern twice:
if ( $ln =~ s/\r\n$/\n/ ) {
Or more portable and correct:
if ( $ln =~ s/\015\012\z/\n/ ) {
> if( $. > 0 ) { # skip the header line
$. starts out at 1 so it is *always* greater than 0 (unless you explicitly
change it.)
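
A small self-contained check of this, as a sketch only (the in-memory sample data is made up; a real script would open the input file instead):

```perl
use strict;
use warnings;

# Read from an in-memory "file" so the sketch needs no external input.
my $data = "HEADER\r\nrow one\r\nrow two\r\n";
open my $in, '<', \$data or die "Could not open in-memory file: $!\n";

my @kept;
while ( my $ln = <$in> ) {
    # $. is 1 while the first line is being processed, 2 for the second, ...
    next if $. == 1;               # this is what actually skips the header
    $ln =~ s/\015\012\z/\n/;       # CR LF -> LF
    push @kept, "$.: $ln";
}
close $in;

print @kept;    # prints "2: row one" and "3: row two"
```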
> $sym = substr($ln, 10, 16);
> $sym =~ s/ //g;
Use the three argument open() so you won't have to worry about whitespace in
the file name. However there are other characters that are not valid in a
file name that you should remove such as "\0" and '/'.
$sym =~ tr!\0/!!d
> if( $prior_sym ne $sym ) {
> if( $prior_sym ne '' ) { close(OUT); }
> $sym_file = $OUT_PATH . "/" . $sym . "." . $OUT_SUFFIX ;
> open(OUT, ">$sym_file") or die "\n\terror: Could not write to
open OUT, '>:raw', $sym_file or die "\n\terror: Could not write to
$sym_file $!\n";
> $sym_file $!\n";
> binmode(OUT);
> }
> print OUT $ln;
> $prior_sym = $sym ;
> }
> }
> }
> close(IN);
>
> What I'd like it to do, instead, is if it hits a bad line, write a
> warning and keep going to the end of the file.
>
> I've tried wrapping the block above in "eval { }; warn $@ if $@;" but
> that doesn't trap the error; even with eval/warn, a bad line will
> cause the script to exit.
>
> Is there a better way of doing this?
John
--
Perl isn't a toolbox, but a small machine shop where you can special-order
certain sorts of tools at low cost and in short order. -- Larry Wall
------------------------------
Date: 14 May 2007 12:42:00 -0700
From: denis.papathanasiou@gmail.com
Subject: Re: Parsing a text file line-by-line: skipping badly-formed lines?
Message-Id: <1179171720.099993.100230@k79g2000hse.googlegroups.com>
> perldoc -q quoting
>
> Also, you should get into the habit of using the three argument form of open:
>
> open IN, '<', $IN_FILE or die "\n\terror: Could not read $IN_FILE $!\n";
>
> > \n"; ;
> > binmode(IN);
>
> You can also incorporate that into the open statement:
>
> open IN, '<:raw', $IN_FILE or die "\n\terror: Could not read $IN_FILE $!\n";
Thanks for the suggestion; I've been working with an old template, and
since it was functional, I never bothered to make it more idiomatic.
> > while( $ln=<IN> ) {
> > if( $ln =~ m/\r\n$/ ) {
> > $ln =~ s/\r\n$/\n/; # dos2unix: convert CR LF to LF
>
> You don't need to match the same pattern twice:
>
> if ( $ln =~ s/\r\n$/\n/ ) {
>
> Or more portable and correct:
>
> if ( $ln =~ s/\015\012\z/\n/ ) {
I'm guilty of some spaghetti there: the dos2unix line was added later,
and I just stuck it in there w/o thinking about the statement before
it.
> > if( $. > 0 ) { # skip the header line
>
> $. starts out at 1 so it is *always* greater than 0 (unless you explicitly
> change it.)
Really? If I leave that statement out, it winds up processing the
first line, but when it's there, it skips the first line.
> > $sym = substr($ln, 10, 16);
> > $sym =~ s/ //g;
>
> Use the three argument open() so you won't have to worry about whitespace in
> the file name. However there are other characters that are not valid in a
> file name that you should remove such as "\0" and '/'.
>
> $sym =~ tr!\0/!!d
>
> > if( $prior_sym ne $sym ) {
> > if( $prior_sym ne '' ) { close(OUT); }
> > $sym_file = $OUT_PATH . "/" . $sym . "." . $OUT_SUFFIX ;
> > open(OUT, ">$sym_file") or die "\n\terror: Could not write to
>
> open OUT, '>:raw', $sym_file or die "\n\terror: Could not write to
> $sym_file $!\n";
These are all great comments, but they don't help with the original
problem: any thoughts on why the block terminates before processing
every line of the original input file?
------------------------------
Date: Mon, 14 May 2007 20:40:18 -0000
From: gbacon@hiwaay.net (Greg Bacon)
Subject: Re: Parsing a text file line-by-line: skipping badly-formed lines?
Message-Id: <134hi9i78k1rq7d@corp.supernews.com>
In article <1179168354.983185.78490@e51g2000hsg.googlegroups.com>,
<denis.papathanasiou@gmail.com> wrote:
: > You say your program exits with an error, but you didn't say what
: > the error is.
:
: My fault, I should have been more precise.
Yes, precision helps in diagnosing technical problems!
Is your program exiting silently, i.e., with no error message?
You wrote that you expected files named A-Z but R is the last
file created. Looking at your logic, your code skips input lines
that don't have CR NL. Is this your intent? Could the lines with
symbols in S-Z be "hidden" in the sense that they fail the test
in the following line?
if( $ln =~ m/\r\n$/ ) {
Debugging output will help you find the problem input. I'd add
at least two warnings:
    while( $ln=<IN> ) {
        if( $ln =~ s/\r\n\z/\n/ ) {
            if( $. > 1 ) { # skip the header line
                # the rest of your code...
            }
        }
        else {
            warn "$0: $IN_FILE:$.: skipping...\n";
        }
    }
    warn "$0: $IN_FILE:$.: exiting...\n";
Hope this helps,
Greg
--
(As far as I can see, it is always a man who makes the [Faustian] agreement.
A woman is more likely to be the contract's benefit than its negotiator.
The assumption is that Old Slewfoot fully controls her. Obviously, the
story is literature.) -- Gary North
------------------------------
Date: Mon, 14 May 2007 22:57:30 +0200
From: Martijn Lievaart <m@rtij.nl.invlalid>
Subject: Re: Parsing a text file line-by-line: skipping badly-formed lines?
Message-Id: <pan.2007.05.14.20.57.41@rtij.nl.invlalid>
On Mon, 14 May 2007 12:42:00 -0700, denis.papathanasiou wrote:
> These are all great comments, but they don't help with the original
> problem: any thoughts on why the block terminates before processing
> every line of the original input file?
Maybe go back to the good old ways of debugging, add print statements
that tell what the program is doing. Tee this so you save it to a file as
well for later reference, or print to a logfile in the first place.
This will not tell you what is wrong, but may pinpoint the location in
the 14GB file where your program goes wrong.
HTH,
M4
------------------------------
Date: 14 May 2007 14:00:17 -0700
From: denis.papathanasiou@gmail.com
Subject: Re: Parsing a text file line-by-line: skipping badly-formed lines?
Message-Id: <1179176417.053606.44790@w5g2000hsg.googlegroups.com>
> Is your program exiting silently, i.e., with no error message?
Yes, $? is 0
> You wrote that you expected files named A-Z but R is the last
> file created. Looking at your logic, your code skips input lines
> that don't have CR NL. Is this your intent? Could the lines with
> symbols in S-Z be "hidden" in the sense that they fail the test
> in the following line?
>
> if( $ln =~ m/\r\n$/ ) {
Yes, that's the intent, because if a line doesn't end in CR, it is
malformed and cannot be parsed further.
While it's likely that there is at least one line that fits that
description (and hence fails the $ln =~ m/\r\n$/ test), the bulk of
the S-Z data *does* end in CR (I verified this by doing a tail on the
input file).
So those lines, i.e. the S-Z lines which do end in CR should not be
skipped.
> Debugging output will help you find the problem input. I'd add
> at least two warnings:
>
> while( $ln=<IN> ) {
> if( $ln =~ s/\r\n\z/\n/ ) {
> if( $. > 1 ) { # skip the header line
> # the rest of your code...
> }
> else {
> warn "$0: $IN_FILE:$.: skipping...\n";
> }
> }
>
> warn "$0: $IN_FILE:$.: exiting...\n";
>
Thanks, I'll try that.
In the meantime, I also tried doing a head of the first 120761073
lines (split exits after processing 120761072 lines in total, which is
not the full size of the file), and it gave me an interesting error:
$ head -120761073 qte20070430 > xy.1
head: error reading `qte20070430': Input/output error
$ echo $?
1
$ tail -2 xy.1
134950345PRIG 000008192000000028000008197000000003R
PP000000001715724200 C
134950355TRIG 000008192000000052000008197000000014$
So the last line there has the problem (well-formed lines are 90 bytes
long), but my hex editor doesn't show anything unusual after the "4"
character:
offs asc hex dec oct bin
0135: 0 30 048 060 00110000
0136: 0 30 048 060 00110000
0137: 0 30 048 060 00110000
0138: 0 30 048 060 00110000
0139: 0 30 048 060 00110000
0140: 8 38 056 070 00111000
0141: 1 31 049 061 00110001
0142: 9 39 057 071 00111001
0143: 7 37 055 067 00110111
0144: 0 30 048 060 00110000
0145: 0 30 048 060 00110000
0146: 0 30 048 060 00110000
0147: 0 30 048 060 00110000
0148: 0 30 048 060 00110000
0149: 0 30 048 060 00110000
0150: 0 30 048 060 00110000
0151: 1 31 049 061 00110001
0152: 4 34 052 064 00110100
(end)
152/153 (dec)
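
The "Input/output error" from head points at the read itself failing rather than at the contents of the line. That may also be why the Perl loop stops quietly: the read operator returns undef on a read error just as it does at end of file. A minimal sketch of telling the two apart, following the idiom shown in "perldoc -f readline" (the sub name count_lines() and the in-memory sample are made up for illustration):

```perl
use strict;
use warnings;

# Count the lines on a filehandle, dying on a read error instead of
# silently treating it as end of file.
sub count_lines {
    my ($fh) = @_;
    my $count = 0;
    while ( !eof($fh) ) {
        defined( my $ln = readline $fh )
            or die "readline failed after line $count: $!\n";
        $count++;
    }
    return $count;
}

my $sample = "line 1\r\nline 2\r\n";
open my $in, '<', \$sample or die "Could not open in-memory file: $!\n";
print count_lines($in), "\n";    # 2
close $in;
```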
------------------------------
Date: 14 May 2007 12:48:33 -0700
From: "comp.llang.perl.moderated" <ced@blv-sam-01.ca.boeing.com>
Subject: Re: Regular Expression Question
Message-Id: <1179172113.106915.205830@u30g2000hsc.googlegroups.com>
On May 12, 6:43 am, "John W. Krahn" <some...@example.com> wrote:
> jonasforss...@yahoo.se wrote:
>
> > I have a file like this:
>
> > 2 45 3
> > 3 44 2 65
>
> > In other words, the first number states how many follwing numbers
> > there will be.
>
> > ^([0-9]) ([0-9]){$1}$
>
> > The above regexp does not seem to work for parsing this. I guess the
> > $1 parameter is not accepted.
> > Any ideas of an alternative?
>
> while ( <FILE> ) {
> my ( $count, @data ) = /\d+/g;
> if ( $count == @data ) {
> print "Yes! we have valid data.\n";
> }
> else {
> warn "Error: expected $count elements but received " . @data . "
> instead.\n";
> }
> }
>
Or, if backrefs weren't required:
my( $tot, $cnt );
print "yes..." if $tot = ($cnt,()) = /\d+/g and $tot == $cnt+1;
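
The one-liner leans on the fact that a list assignment evaluated in scalar context yields the number of values on its right-hand side. A more spelled-out sketch of the same check (the sub name is_valid() and the sample lines are made up for this example):

```perl
use strict;
use warnings;

# True if the first number on the line equals how many numbers follow it.
sub is_valid {
    my ($line) = @_;
    my $first;

    # List assignment in scalar context gives the count of matches,
    # while $first receives only the first match.
    my $total = ( ($first) = $line =~ /\d+/g );

    return $total > 0 && $total == $first + 1;
}

for my $line ( '2 45 3', '3 44 2 65', '3 44 2' ) {
    printf "%-10s %s\n", $line, is_valid($line) ? 'ok' : 'bad';
}
```

The first two sample lines report ok; the third reports bad because it promises three numbers but supplies two.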
--
Charles DeRykus
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 435
**************************************