[29192] in Perl-Users-Digest
Perl-Users Digest, Issue: 436 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue May 15 03:10:01 2007
Date: Tue, 15 May 2007 00:09:04 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Tue, 15 May 2007 Volume: 11 Number: 436
Today's topics:
Re: Creating .txt/.html file using perl script sl123@netherlands.area
Re: Creating .txt/.html file using perl script <xicheng@gmail.com>
Re: Creating Packages <whoami@whereami.net>
Re: Creating Packages <sisyphus1@nomail.afraid.org>
Re: looking for perl professionals sl123@netherlands.area
new CPAN modules on Tue May 15 2007 (Randal Schwartz)
Re: Parsing a text file line-by-line: skipping badly-fo denis.papathanasiou@gmail.com
Re: Parsing a text file line-by-line: skipping badly-fo (Greg Bacon)
Re: Parsing a text file line-by-line: skipping badly-fo denis.papathanasiou@gmail.com
Re: Parsing a text file line-by-line: skipping badly-fo sl123@netherlands.area
Re: Parsing a text file line-by-line: skipping badly-fo sl123@netherlands.area
Re: Parsing a text file line-by-line: skipping badly-fo sl123@netherlands.area
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Mon, 14 May 2007 21:59:37 -0700
From: sl123@netherlands.area
Subject: Re: Creating .txt/.html file using perl script
Message-Id: <n7fi43di49avo1jkt1488o3rjq9hpfrt71@4ax.com>
On 13 May 2007 16:50:33 -0700, Xicheng Jia <xicheng@gmail.com> wrote:
>On May 13, 6:47 pm, Xicheng Jia <xich...@gmail.com> wrote:
>> On May 13, 11:53 am, Xicheng Jia <xich...@gmail.com> wrote:
>>
>>
>>
>>
>>
>> > On May 12, 5:45 pm, jdblackf...@gmail.com wrote:
>>
>> > > Hello,
>>
>> > > I was wondering if it was possible to create simple program that will
>> > > create .txt (or .html) files based on information provided via
>> > > prompts. If so, would someone be willing to assist me with this? I
>> > > know absolutely nothing about Perl (or any other programming language
>> > > for that matter)
>>
>> > > For instance, if I wanted to create a file called 01.20.07.html based
>> > > off of a specific date (01 is MM, 20 is DD, 07 is YY) that looked like
>> > > this:
>>
>> > > <html>
>> > > <body>
>> > > <table>
>> > > <tr><td><img src="http://www.mywebsite.com/01.20.07/1.JPG">
>> > > <td><img src="http://www.mywebsite.com/01.20.07/2.JPG">
>> > > <td><img src="http://www.mywebsite.com/01.20.07/3.JPG">
>> > > </tr>
>> > > <tr><td><img src="http://www.mywebsite.com/01.20.07/4.JPG">
>> > > <td><img src="http://www.mywebsite.com/01.20.07/5.JPG">
>> > > <td><img src="http://www.mywebsite.com/01.20.07/6.JPG">
>> > > </tr>
>> > > </body>
>> > > </html>
>>
>> > > I would like to be able to enter the following information when
>> > > prompted by the program:
>>
>> > > MM
>> > > DD
>> > > YY
>> > > Number of images
>>
>> > > If possible, I would like to have the code know how many table rows to
>> > > create (in intervals of 3) based on the number of images.
>>
>> > > Is this something someone here can assist with?
>>
>> > You might want to check any Perl templating modules like TT(Template
>> > Toolkits),
>>
>> > http://search.cpan.org/dist/Template-Toolkit/
>>
>> > which could possibly make your stuff much easier.(in your case, the
>> > template file need just several inputs and a simple loop.)
>>
>> > BTW. if you can use <div> elements instead of table to organize your
>> > output, that might be much easier, my 2 cents. :-)
>>
>> > Regards,
>> > Xicheng
>>
>> Below is a TT test code for your needs, you actually only need to
>> figure a data structure and then TT can do the other stuff pretty
>> easily:
>>
>> you run it this way: ./ex1.pl 01.20.70 13 5
>>
>> If you can use <div> and set CSS styles to control their (float|
>> width|...)s, then a plain 1-D array will do all stuff...
>>
>> Regards,
>> Xicheng
>>
>> #### ex1.pl ####
>> #!/usr/bin/perl
>> use warnings;
>> use strict;
>> use Template;
>>
>> if (@ARGV < 2) {
>> print "./ex1.pl mm.dd.yy numPic [numCol]\n";
>> exit;}
>>
>> my ($date, $numPic) = @ARGV;
>> my $numCol = $ARGV[2] || 3;
>>
>> my $residuals = $numCol - ($numPic % $numCol);
>> my $items;
>> my $n = 0;
>>
>> for my $item (1..$numPic) {
>> push @{$items->[$n]}, $item;
>> $n++ if not $item % $numCol;}
>>
>
>XC> push @{$items->[$n]}, (0) x $residuals;
>
>A bug from the above line, should be :
>
> push @{$items->[$n]}, (0) x $residuals if not $residuals == $numCol;
>
>Regards,
>Xicheng
>
>>
>> my $tt = Template->new();
>> my $input = 'ex1.tt';
>> my $args = {
>> site => $date,
>> items => $items,};
>>
>> $tt->process($input, $args, "$date.html") or die $tt->error( );
>>
>> __END__
>>
>> #### ex1.tt ####
>> <html>
>> <head>
>> <title>TT test page</title>
>> </head>
>> <body>
>> <table>
>> [% FOREACH item IN items -%]
>> <tr>
>> [% FOREACH cell IN item -%]
>> [% IF !cell -%]
>> <td> </td>
>> [% ELSE -%]
>> <td><img src="http://www.mywebsite.com/[% site %]/[% cell
>> %].JPG"></td>
>> [% END -%]
>> [% END -%]
>> </tr>
>> [% END -%]
>> </table>
>> </body>
>> </html>
>
Hahahaha, you make me lafff.
Systematic/programmable/generalization code.......
"Well, if you could just get it to here, then if so, manually do this step".
Its just fuckin magic!!
Hahahahahahaaaaaaaaaa.
No time to do it right? Get off you fat fuckin ass and do it by hand losers!!!!
Or you could pay an expert to give you a million dolla, 1 time use gem <---
losers
------------------------------
Date: 15 May 2007 00:00:55 -0700
From: Xicheng Jia <xicheng@gmail.com>
Subject: Re: Creating .txt/.html file using perl script
Message-Id: <1179212455.867628.304810@p77g2000hsh.googlegroups.com>
On May 15, 12:59 am, s...@netherlands.area wrote:
> On 13 May 2007 16:50:33 -0700, Xicheng Jia <xich...@gmail.com> wrote:
>
>
>
>
>
> >On May 13, 6:47 pm, Xicheng Jia <xich...@gmail.com> wrote:
> >> On May 13, 11:53 am, Xicheng Jia <xich...@gmail.com> wrote:
>
> >> > On May 12, 5:45 pm, jdblackf...@gmail.com wrote:
>
> >> > > Hello,
>
> >> > > I was wondering if it was possible to create simple program that will
> >> > > create .txt (or .html) files based on information provided via
> >> > > prompts. If so, would someone be willing to assist me with this? I
> >> > > know absolutely nothing about Perl (or any other programming language
> >> > > for that matter)
>
> >> > > For instance, if I wanted to create a file called 01.20.07.html based
> >> > > off of a specific date (01 is MM, 20 is DD, 07 is YY) that looked like
> >> > > this:
>
> >> > > <html>
> >> > > <body>
> >> > > <table>
> >> > > <tr><td><img src="http://www.mywebsite.com/01.20.07/1.JPG">
> >> > > <td><img src="http://www.mywebsite.com/01.20.07/2.JPG">
> >> > > <td><img src="http://www.mywebsite.com/01.20.07/3.JPG">
> >> > > </tr>
> >> > > <tr><td><img src="http://www.mywebsite.com/01.20.07/4.JPG">
> >> > > <td><img src="http://www.mywebsite.com/01.20.07/5.JPG">
> >> > > <td><img src="http://www.mywebsite.com/01.20.07/6.JPG">
> >> > > </tr>
> >> > > </body>
> >> > > </html>
>
> >> > > I would like to be able to enter the following information when
> >> > > prompted by the program:
>
> >> > > MM
> >> > > DD
> >> > > YY
> >> > > Number of images
>
> >> > > If possible, I would like to have the code know how many table rows to
> >> > > create (in intervals of 3) based on the number of images.
>
> >> > > Is this something someone here can assist with?
>
> >> > You might want to check any Perl templating modules like TT(Template
> >> > Toolkits),
>
> >> > http://search.cpan.org/dist/Template-Toolkit/
>
> >> > which could possibly make your stuff much easier.(in your case, the
> >> > template file need just several inputs and a simple loop.)
>
> >> > BTW. if you can use <div> elements instead of table to organize your
> >> > output, that might be much easier, my 2 cents. :-)
>
> >> > Regards,
> >> > Xicheng
>
> >> Below is a TT test code for your needs, you actually only need to
> >> figure a data structure and then TT can do the other stuff pretty
> >> easily:
>
> >> you run it this way: ./ex1.pl 01.20.70 13 5
>
> >> If you can use <div> and set CSS styles to control their (float|
> >> width|...)s, then a plain 1-D array will do all stuff...
>
> >> Regards,
> >> Xicheng
>
> >> #### ex1.pl ####
> >> #!/usr/bin/perl
> >> use warnings;
> >> use strict;
> >> use Template;
>
> >> if (@ARGV < 2) {
> >> print "./ex1.pl mm.dd.yy numPic [numCol]\n";
> >> exit;}
>
> >> my ($date, $numPic) = @ARGV;
> >> my $numCol = $ARGV[2] || 3;
>
> >> my $residuals = $numCol - ($numPic % $numCol);
> >> my $items;
> >> my $n = 0;
>
> >> for my $item (1..$numPic) {
> >> push @{$items->[$n]}, $item;
> >> $n++ if not $item % $numCol;}
>
> >XC> push @{$items->[$n]}, (0) x $residuals;
>
> >A bug from the above line, should be :
>
> > push @{$items->[$n]}, (0) x $residuals if not $residuals == $numCol;
>
> >Regards,
> >Xicheng
>
> >> my $tt = Template->new();
> >> my $input = 'ex1.tt';
> >> my $args = {
> >> site => $date,
> >> items => $items,};
>
> >> $tt->process($input, $args, "$date.html") or die $tt->error( );
>
> >> __END__
>
> >> #### ex1.tt ####
> >> <html>
> >> <head>
> >> <title>TT test page</title>
> >> </head>
> >> <body>
> >> <table>
> >> [% FOREACH item IN items -%]
> >> <tr>
> >> [% FOREACH cell IN item -%]
> >> [% IF !cell -%]
> >> <td> </td>
> >> [% ELSE -%]
> >> <td><img src="http://www.mywebsite.com/[% site %]/[% cell
> >> %].JPG"></td>
> >> [% END -%]
> >> [% END -%]
> >> </tr>
> >> [% END -%]
> >> </table>
> >> </body>
> >> </html>
>
> Hahahaha, you make me lafff.
> Systematic/programmable/generalization code.......
>
> "Well, if you could just get it to here, then if so, manually do this step".
> Its just fuckin magic!!
>
> Hahahahahahaaaaaaaaaa.
>
> No time to do it right? Get off you fat fuckin ass and do it by hand losers!!!!
> Or you could pay an expert to give you a million dolla, 1 time use gem <---
>
> losers
Hi, thank you for your comments. You are half right: I have just
started learning TT and am still far from touching the gory details of
this tool. :-( ....... For me, clpm is pretty much a classroom
where I practice and listen to others (especially the
professionals). I really appreciate all your input and suggestions.
Best regards,
Xicheng
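
The row-building step in the quoted ex1.pl, including the padding fix, can be sketched on its own without Template Toolkit. This is only an illustration: the sub name build_rows is hypothetical and does not appear in the original post, and 0 is used as the empty-cell marker because the quoted template tests for a false cell.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): split picture
# numbers 1..$num_pic into rows of $num_col cells, padding the last
# row with 0 so that every row has the same width.
sub build_rows {
    my ($num_pic, $num_col) = @_;
    $num_col ||= 3;                      # default of 3 columns, as in ex1.pl

    my @rows;
    for my $item (1 .. $num_pic) {
        push @rows, [] if ($item - 1) % $num_col == 0;   # start a new row
        push @{ $rows[-1] }, $item;
    }

    # Pad only when the last row is short; a full last row needs nothing.
    if (@rows && @{ $rows[-1] } < $num_col) {
        push @{ $rows[-1] }, (0) x ($num_col - @{ $rows[-1] });
    }
    return \@rows;
}

# 13 pictures in 5 columns, matching the "./ex1.pl 01.20.70 13 5" example.
print join(' ', @$_), "\n" for @{ build_rows(13, 5) };
```

Starting a new row before pushing, rather than advancing a counter after each full row, means no empty trailing row is created when the picture count is an exact multiple of the column count.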
------------------------------
Date: Tue, 15 May 2007 00:13:32 +0100
From: "IanW" <whoami@whereami.net>
Subject: Re: Creating Packages
Message-Id: <f2aqf8$2ogh$1@energise.enta.net>
"Sisyphus" <sisyphus1@nomail.afraid.org> wrote in message
news:46487224$0$27448$afc38c87@news.optusnet.com.au...
[..]
> require Exporter;
> @GD::3DBarGrapher::ISA = qw(Exporter);
> @GD::3DBarGrapher::EXPORT_OK = qw(creategraph);
>
> The module could then be loaded as:
>
> use GD::3DBarGrapher qw(creategraph);
>
> and the function creategraph() could be used - there would be no need for
> the "fully qualified" name. (See 'perldoc Exporter' for full
> details/options.)
>
> You don't really need Exporter at all - it's just a convenience that's
> usually provided, so that users can call (for example) the creategraph
> function as 'creategraph()' instead of having to write
> 'GD::3DBarGrapher::creategraph()'
That's excellent thanks - I think I was getting bogged down thinking I
needed to call a "new" instance of the module and then set the config
options via object handles etc, but I can see I don't need all that.
I set up 3DBarGrapher.pm as you suggested, including the exporter and a
test.pl file, and it works nicely :-)
> And you'll want to designate a version number:
>
> $GD::3DBarGrapher::VERSION = '0.01'; # or whatever number you want
Is there any general guide on version numbers? That is, I don't at present
have a to-do list for enhancements to it, so would be inclined to call it
version 1.0. However, I notice a lot of modules seem to be 0.xx.
> To create a proper CPAN distro, you'll also need a Makefile.PL, a CHANGES
> file , a README file , and a test suite (or at least a test.pl) to ensure
> that things are working as expected.
I found this page: http://cpan.uwinnipeg.ca/htdocs/perl/perlnewmod.html and
so have requested a PAUSE account. It mentions using things like
module-starter or h2xs. I tried the latter but it puts a lot of complicated
stuff into the .pm template and creates some other similarly mysterious
files. I will try module-starter when I get my PAUSE account & cpan email. But
if I just created the files you mention manually, then presumably I would
just pack them into a tarball and upload?
Regards
Ian
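
For readers following along, the Exporter arrangement described above can be sketched in a single file. This is an illustration under assumptions: the package name My::BarGrapher and the placeholder body of creategraph are hypothetical stand-ins, not the actual GD::3DBarGrapher code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the module discussed above; in a real
# distribution this package would live in its own .pm file.
package My::BarGrapher;

require Exporter;
our @ISA       = qw(Exporter);
our @EXPORT_OK = qw(creategraph);   # exported only on request
our $VERSION   = '0.01';

# Placeholder body: the real function would draw the graph.
sub creategraph {
    my (%config) = @_;
    return "graph with " . scalar(keys %config) . " option(s)";
}

package main;

# Runtime equivalent of "use My::BarGrapher qw(creategraph);" for a
# package that is defined in the same file instead of loaded from disk.
My::BarGrapher->import('creategraph');

# Plain function call: no object, no "new", no fully qualified name.
print creategraph(width => 400, height => 300), "\n";
```

Listing the function in @EXPORT_OK rather than @EXPORT keeps the caller's namespace clean unless the name is requested explicitly.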
------------------------------
Date: Tue, 15 May 2007 10:24:16 +1000
From: "Sisyphus" <sisyphus1@nomail.afraid.org>
Subject: Re: Creating Packages
Message-Id: <4648fd9e$0$26514$afc38c87@news.optusnet.com.au>
"IanW" <whoami@whereami.net> wrote in message
news:f2aqf8$2ogh$1@energise.enta.net...
>
.
.
>
> Is there any general guide on version numbers? That is, I don't at present
> have a to-do list for enhancements to it, so would be inclined to call it
> version 1.0. However, I notice a lot of modules seem to be 0.xx.
I think there was once a convention that while you considered your module to
be "alpha" you would designate 0.xx versions. If that convention still
exists, I think it is fairly widely ignored.
I wouldn't worry about it too much - I've not yet seen an author criticised
for the choice of version number.
>
>> To create a proper CPAN distro, you'll also need a Makefile.PL, a CHANGES
>> file , a README file , and a test suite (or at least a test.pl) to ensure
>> that things are working as expected.
I forgot to mention the MANIFEST file.
>
> I found this page: http://cpan.uwinnipeg.ca/htdocs/perl/perlnewmod.html
> and so have requested a PAUSE account. It mentions using things like
> module-starter or h2xs. I tried the latter but it puts a lot of complicated
> stuff into the .pm template and creates some other similarly mysterious
> files. I will try module-starter when I get my PAUSE account & cpan email.
> But if I just created the files you mention manually, then presumably I
> would just pack them into a tarball and upload?
>
I've only ever created the files manually. There is quite possibly some work
to be saved if one goes to the trouble of becoming familiar with one of the
automated procedures. I've never bothered doing that.
I just manually create the files in the appropriate directory structure,
tar, gzip, and "Upload a file to CPAN" from the pause menu (
https://pause.perl.org/pause/authenquery ).
Best to first take a look at http://www.cpan.org/modules/04pause.html .
Cheers,
Rob
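
As a rough idea of what a hand-written Makefile.PL for such a distribution might contain, here is a minimal sketch using ExtUtils::MakeMaker. The file layout (lib/GD/3DBarGrapher.pm), the GD prerequisite, the abstract and the author string are all assumptions made for illustration; none of them come from the thread.

```perl
# Makefile.PL: a minimal sketch, not the author's actual file.
use strict;
use warnings;
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME         => 'GD::3DBarGrapher',
    VERSION_FROM => 'lib/GD/3DBarGrapher.pm',        # assumed layout; $VERSION is read from here
    ABSTRACT     => 'Create 3D bar graphs with GD',  # placeholder text
    AUTHOR       => 'Your Name <you@example.com>',   # placeholder
    PREREQ_PM    => { 'GD' => 0 },                   # assumed dependency, any version
);
```

With a MANIFEST in place, the usual sequence would be "perl Makefile.PL", "make", "make test" and then "make dist", which should produce the .tar.gz for upload.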
------------------------------
Date: Mon, 14 May 2007 21:42:04 -0700
From: sl123@netherlands.area
Subject: Re: looking for perl professionals
Message-Id: <geei435b9bqrdfge0ku5r3bqb45c3a4ld0@4ax.com>
On 14 May 2007 06:20:06 -0700, Doug <roaring.chicken@gmail.com> wrote:
>Hello ,
>
>My name is Doug Cohen and I am a Senior Recruiter for CIDC. CIDC is a
>profitable and growing Internet-based software company that has been
>developing and refining its online gaming technology and eBusiness
>operational expertise for over seven years. Through our technology, we
>have become the most trusted name in online gaming and entertainment
>worldwide. CIDC developed the gaming and entertainment software for
>sites such as Everest Poker and Everest Casino at
>www.everestpoker.com/www.everestcasino.com. For more information,
>please visit our corporate site at www.cidc.com. We have an excellent
>benefits package including Medical/Dental, 401K, vacation, informal,
>team and fun environment. (sorry, no telecommuting at this time)
>
>Due to our growth we are currently recruiting for a full time Software
>engineer-Perl for our Cambridge, MA location:
>
>Job Summary
>
>You would become a member of our Server Team build, in which you would
>be able to use all of your software skills to implement, maintain and
>extend dynamic web applications for on-line gaming and Internet
>marketing. In addition, you would develop and maintain a number of
>support and data analysis scripts.
>Requirements
>
>To be successful in this role your background will need 2 to 5 years
>of:
>
>· Perl in Unix/Linux environment
>· Experience with development of e-commerce applications involving
>mod_perl, Mason, Catalyst, Template Toolkit, Apache web servers, DBI,
>CGI, SQL, HTML, and relational databases.
>· Experience with financial applications, Informix, AJAX,
>Internationalization and Localization a plus
>· Computer Science degree or equivalent is a must
>· Good communication and organizational skills
>
>
>
>If you are interested in this position or finding out other
>opportunities at CIDC forward your resume to dcohen@cidc.com or call
>him at 617-547-6323 x315 (relocation is available)
So ah, what the fuck do you pay fat cat? With your dynamic gigantic company and all?
------------------------------
Date: Tue, 15 May 2007 04:42:11 GMT
From: merlyn@stonehenge.com (Randal Schwartz)
Subject: new CPAN modules on Tue May 15 2007
Message-Id: <JI2FqB.1MCD@zorch.sf-bay.org>
The following modules have recently been added to or updated in the
Comprehensive Perl Archive Network (CPAN). You can install them using the
instructions in the 'perlmodinstall' page included with your Perl
distribution.
AFS-Monitor-0.3.2
http://search.cpan.org/~alfw/AFS-Monitor-0.3.2/
Perl interface to AFS monitoring and debugging tools
----
Acme-Dahut-Call-0.02
http://search.cpan.org/~perigrin/Acme-Dahut-Call-0.02/
replicates the melodious sound of the wild Dahut ... in Text.
----
Acme-POE-Acronym-Generator-1.02
http://search.cpan.org/~bingos/Acme-POE-Acronym-Generator-1.02/
Generate random POE acronyms.
----
Algorithm-C3-0.07
http://search.cpan.org/~blblack/Algorithm-C3-0.07/
A module for merging hierarchies using the C3 algorithm
----
Algorithm-Scale2x-0.01
http://search.cpan.org/~bricas/Algorithm-Scale2x-0.01/
Generic implementation of the Scale2x algorithm
----
Apache-Session-1.82_05
http://search.cpan.org/~chorny/Apache-Session-1.82_05/
A persistence framework for session data
----
Apache2-AuthZSympa-0.5.0
http://search.cpan.org/~doumbzh/Apache2-AuthZSympa-0.5.0/
Authorization module based on Sympa mailing list server group definition
----
Bio-DOOP-DOOP-0.20
http://search.cpan.org/~tibi/Bio-DOOP-DOOP-0.20/
DOOP API main module
----
Bio-ECell-0.01
http://search.cpan.org/~gaou/Bio-ECell-0.01/
Perl interface for E-Cell Simulation Environment.
----
Bio-MAGE-20030502.3
http://search.cpan.org/~jasons/Bio-MAGE-20030502.3/
Container module for classes in the MAGE package: MAGE
----
Bio-MAGE-Utils-20030502.0
http://search.cpan.org/~jasons/Bio-MAGE-Utils-20030502.0/
----
CORBA-Python-0.37
http://search.cpan.org/~perrad/CORBA-Python-0.37/
----
CPAN-Mini-Extract-1.15
http://search.cpan.org/~adamk/CPAN-Mini-Extract-1.15/
Create CPAN::Mini mirrors with the archives extracted
----
CPAN-Mini-Extract-1.16
http://search.cpan.org/~adamk/CPAN-Mini-Extract-1.16/
Create CPAN::Mini mirrors with the archives extracted
----
Cache-Adaptive-0.03
http://search.cpan.org/~kazuho/Cache-Adaptive-0.03/
A Cache Engine with Adaptive Lifetime Control
----
Catalyst-Model-Net-Amazon-0.01001
http://search.cpan.org/~cfranks/Catalyst-Model-Net-Amazon-0.01001/
Catalyst model for Net::Amazon SOAP interface
----
Class-C3-XS-0.05
http://search.cpan.org/~blblack/Class-C3-XS-0.05/
XS speedups for Class::C3
----
Config-Model-0.609
http://search.cpan.org/~ddumont/Config-Model-0.609/
Model to create configuration validation tool
----
Config-Model-CursesUI-1.003
http://search.cpan.org/~ddumont/Config-Model-CursesUI-1.003/
Curses interface for configuration tree
----
Config-Model-Xorg-0.501
http://search.cpan.org/~ddumont/Config-Model-Xorg-0.501/
----
DBD-Unify-0.64
http://search.cpan.org/~hmbrand/DBD-Unify-0.64/
DBI driver for Unify database systems
----
Egg-Release-2.05
http://search.cpan.org/~lushe/Egg-Release-2.05/
Version of Egg WEB Application Framework.
----
GD-Image-Scale2x-0.04
http://search.cpan.org/~bricas/GD-Image-Scale2x-0.04/
Implementation of the Scale2x algorithm for the GD library
----
GD-Image-Scale2x-0.05
http://search.cpan.org/~bricas/GD-Image-Scale2x-0.05/
Implementation of the Scale2x algorithm for the GD library
----
Games-QuizTaker-2.01
http://search.cpan.org/~tstanley/Games-QuizTaker-2.01/
Take your own quizzes and tests
----
Games-RolePlay-MapGen-0.31.1
http://search.cpan.org/~jettero/Games-RolePlay-MapGen-0.31.1/
The base object for generating dungeons and maps
----
Gungho-0.08
http://search.cpan.org/~dmaki/Gungho-0.08/
Yet Another High Performance Web Crawler Framework
----
HTML-Tested-0.24
http://search.cpan.org/~bosu/HTML-Tested-0.24/
Provides HTML widgets with the built-in means of testing.
----
Image-Info-1.25
http://search.cpan.org/~tels/Image-Info-1.25/
Extract meta information from image files
----
InSilicoSpectro-1.0.19
http://search.cpan.org/~alexmass/InSilicoSpectro-1.0.19/
Open source Perl library for proteomics
----
LWP-Online-0.03
http://search.cpan.org/~adamk/LWP-Online-0.03/
Does your process have access to the web
----
Math-Random-MT-Auto-5.05
http://search.cpan.org/~jdhedden/Math-Random-MT-Auto-5.05/
Auto-seeded Mersenne Twister PRNGs
----
Math-Random-MT-Auto-5.06
http://search.cpan.org/~jdhedden/Math-Random-MT-Auto-5.06/
Auto-seeded Mersenne Twister PRNGs
----
MogileFS-Utils-2.10
http://search.cpan.org/~bradfitz/MogileFS-Utils-2.10/
----
NCGI-0.06
http://search.cpan.org/~mlawren/NCGI-0.06/
A Common Gateway Interface (CGI) Class
----
POE-Component-SSLify-0.08
http://search.cpan.org/~apocal/POE-Component-SSLify-0.08/
Makes using SSL in the world of POE easy!
----
Params-Util-0.25
http://search.cpan.org/~adamk/Params-Util-0.25/
Simple, compact and correct param-checking functions
----
RPC-XML-Parser-XS-0.01
http://search.cpan.org/~mikage/RPC-XML-Parser-XS-0.01/
Fast XML-RPC parser written in C
----
RTx-EmailCompletion-0.02
http://search.cpan.org/~nchuche/RTx-EmailCompletion-0.02/
Add auto completion on RT email fields
----
Safe-Caller-0.04
http://search.cpan.org/~schubiger/Safe-Caller-0.04/
A nicer interface to caller() with code execution restriction
----
Text-Filter-1.9
http://search.cpan.org/~jv/Text-Filter-1.9/
base class for objects that can read and write text lines
----
WWW-RobotRules-Parser-0.04
http://search.cpan.org/~dmaki/WWW-RobotRules-Parser-0.04/
Just Parse robots.txt
----
WebService-MusicBrainz-0.09
http://search.cpan.org/~bfaist/WebService-MusicBrainz-0.09/
----
WebService-OCTranspo-0.026
http://search.cpan.org/~doneill/WebService-OCTranspo-0.026/
Access Ottawa bus schedule information from www.octranspo.com
----
WebService-OCTranspo-0.027
http://search.cpan.org/~doneill/WebService-OCTranspo-0.027/
Access Ottawa bus schedule information from www.octranspo.com
----
Win32-IIS-Admin-1.017
http://search.cpan.org/~mthurn/Win32-IIS-Admin-1.017/
Administer Internet Information Service on Windows
----
YAML-Tiny-1.06
http://search.cpan.org/~adamk/YAML-Tiny-1.06/
Read/Write YAML files with as little code as possible
----
Yahoo-Marketing-2.01
http://search.cpan.org/~jlavallee/Yahoo-Marketing-2.01/
an interface for Yahoo! Search Marketing's Web Services.
----
okbiff-0.05
http://search.cpan.org/~mschwern/okbiff-0.05/
check if you have mail on OkCupid.com
----
okbiff-0.06
http://search.cpan.org/~mschwern/okbiff-0.06/
check if you have mail on OkCupid.com
----
threads-shared-1.11
http://search.cpan.org/~jdhedden/threads-shared-1.11/
Perl extension for sharing data structures between threads
If you're an author of one of these modules, please submit a detailed
announcement to comp.lang.perl.announce, and we'll pass it along.
This message was generated by a Perl program described in my Linux
Magazine column, which can be found on-line (along with more than
200 other freely available past column articles) at
http://www.stonehenge.com/merlyn/LinuxMag/col82.html
print "Just another Perl hacker," # the original
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
------------------------------
Date: 14 May 2007 15:32:48 -0700
From: denis.papathanasiou@gmail.com
Subject: Re: Parsing a text file line-by-line: skipping badly-formed lines?
Message-Id: <1179181968.400765.288540@q75g2000hsh.googlegroups.com>
Using the extra warnings gave me this:
$ ./split-file.pl qte20070330
./split-file.pl: qte20070330:120761073: skipping...
134950355TRIG 000008192000000052000008197000000014
$ echo $?
0
Looking at the tail end of the problem line gave me this:
offs   asc  hex  dec  oct  bin
0119:  0    30   048  060  00110000
0120:  0    30   048  060  00110000
0121:  1    31   049  061  00110001
0122:  4    34   052  064  00110100
0123:       0A   010  012  00001010
The difference is that the malformed line contains a single
linefeed character (hex 0a) at the 63rd byte, whereas a normal,
well-formed line is 90 bytes long, ending in carriage return (hex 0d)
plus linefeed (hex 0a).
So it seems that the single linefeed (0a character) fools perl into
thinking that it's come to EOF, terminating the "while( $ln=<IN> )
{ }" loop.
So if that's true, how can I guard against this condition?
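
One way to warn about a malformed line and carry on can be sketched as follows. With the default $/, a lone linefeed should only produce a short record, not end the loop, so the sketch simply tests each record for a CR LF ending. The helper name clean_line and the in-memory sample data are hypothetical and used only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): return the line
# with CR LF converted to LF, or undef when the line is malformed,
# i.e. does not end in CR LF.
sub clean_line {
    my ($ln) = @_;
    return unless $ln =~ s/\r\n\z/\n/;
    return $ln;
}

# Demonstration on an in-memory "file": the second record ends in a
# bare LF, so it is reported and skipped, and the loop keeps going.
my $data = "GOOD LINE 1\r\nBAD LINE\nGOOD LINE 2\r\n";
open(my $in, '<', \$data) or die "cannot open in-memory data: $!";
binmode($in);

while (my $ln = <$in>) {
    my $clean = clean_line($ln);
    if (!defined $clean) {
        warn "$0: line $.: skipping malformed line\n";
        next;
    }
    print $clean;
}
close($in);
```

If the loop still stops early on the real file, the cause is more likely to lie below Perl, for example a read error from the operating system, than in the lone linefeed itself.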
------------------------------
Date: Mon, 14 May 2007 23:05:56 -0000
From: gbacon@hiwaay.net (Greg Bacon)
Subject: Re: Parsing a text file line-by-line: skipping badly-formed lines?
Message-Id: <134hqqkaaqn7jf6@corp.supernews.com>
In article <1179176417.053606.44790@w5g2000hsg.googlegroups.com>,
<denis.papathanasiou@gmail.com> wrote:
: > You wrote that you expected files named A-Z but R is the last
: > file created. Looking at your logic, your code skips input lines
: > that don't have CR NL. Is this your intent? Could the lines with
: > symbols in S-Z be "hidden" in the sense that they fail the test
: > in the following line?
: >
: > if( $ln =~ m/\r\n$/ ) {
:
: Yes, that's the intent, because if a line doesn't end in CR, it is
: malformed and cannot be parsed further.
Assuming you haven't changed the value of $/ (documented in the
perlvar manpage), $ln contains newline-terminated records, so
control wouldn't reach the above conditional without a newline
at the end.
Note that your regular expression tests for a carriage return
followed by a newline at the end of $ln. Looking at the output
in a followup farther downthread, there's at least one record
that's being ignored because it doesn't have a carriage return.
You report that head(1) is failing with an I/O error. Can anyone
read the entire input? Does the following command succeed?
wc -l qte20070430
Greg
--
"Unsustainable," say economists.
"Bubble," say the sourpusses.
"Buy," say the lumpeninvestoriat.
-- Bill Bonner
------------------------------
Date: 14 May 2007 16:16:48 -0700
From: denis.papathanasiou@gmail.com
Subject: Re: Parsing a text file line-by-line: skipping badly-formed lines?
Message-Id: <1179184608.002623.319400@l77g2000hsb.googlegroups.com>
> Assuming you haven't changed the value of $/ (documented in the
> perlvar manpage), $ln contains newline-terminated records, so
> control wouldn't reach the above conditional without a newline
> at the end.
>
> Note that your regular expression tests for a carriage return
> followed by a newline at the end of $ln. Looking at the output
> in a followup farther downthread, there's at least one record
> that's being ignored because it doesn't have a carriage return.
Right, what should happen is: that line fails the regex test, so I
should see the warning.
But, and here's what I don't understand, the "while( $ln=<IN> )
{ }" loop should continue because the end of file has not been
reached.
So if the lone 0a character isn't triggering the end of that loop,
what is?
BTW, I haven't touched the value of $/ -- in fact the only code prior
to the block I pasted in the original post is just this:
#!/usr/bin/perl
#
#
# definition of necessary
# command-line arguments
#
die "\nUsage\n\tperl split-file.pl [Input file name ({file}YYYYMMDD)]
[Output file path] [Output suffix]\n" unless @ARGV ;
$IN_FILE = $ARGV[0];
$OUT_PATH = $ARGV[1];
$OUT_SUFFIX = $ARGV[2];
$prior_sym = '';
> You report that head(1) is failing with an I/O error. Can anyone
> read the entire input? Does the following command succeed?
>
> wc -l qte20070430
Yes, I'd tried that earlier, before using split, and here's what
happened:
$ wc -l qte20070430
wc: qte20070430: Input/output error
120781227 qte20070430
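
Since wc reports an Input/output error, it may help to check whether Perl's read loop stops at a genuine end of file or because of a read error. The following is a minimal sketch: the helper name count_lines is hypothetical, and a small temporary file stands in for the real input.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Handle;                 # provides the error() method on handles
use File::Temp qw(tempfile);

# Hypothetical helper (not from the original script): read $fh until
# readline gives up, then report how many lines were read and whether
# the handle recorded a read error.
sub count_lines {
    my ($fh) = @_;
    my $count = 0;
    while (defined(my $ln = readline $fh)) {
        $count++;
    }
    # readline returns undef both at end of file and on a read error;
    # the handle's error flag tells the two apart.
    return ($count, $fh->error ? 1 : 0);
}

# Demonstration on a small temporary file standing in for the real input.
my ($tmp, $name) = tempfile(UNLINK => 1);
binmode($tmp);
print {$tmp} "line 1\r\nline 2\r\nline 3\r\n";
close($tmp) or die "cannot close $name: $!";

open(my $in, '<', $name) or die "cannot read $name: $!";
binmode($in);
my ($count, $failed) = count_lines($in);
close($in);

print "$count lines read, ",
      ($failed ? "stopped by a read error" : "stopped at end of file"), "\n";
```

On a file that the operating system cannot read past a certain point, the second return value would be expected to be true, which would match the behaviour of wc shown above.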
------------------------------
Date: Mon, 14 May 2007 18:04:38 -0700
From: sl123@netherlands.area
Subject: Re: Parsing a text file line-by-line: skipping badly-formed lines?
Message-Id: <211i43p1f9qn8krlq2rhf2ecb1ol58q7pr@4ax.com>
On 14 May 2007 08:10:29 -0700, denis.papathanasiou@gmail.com wrote:
>I have a script which reads a plain text (dos) file line-by-line and
>splits it into several smaller files, based on a single attribute.
>
>The code (below) works, except when a line is malformed (i.e., the
>line contains binary or control characters), and the script just exits
>with an error:
>
>open(IN, "$IN_FILE") or die "\n\terror: Could not read $IN_FILE $!
>\n"; ;
>binmode(IN);
>while( $ln=<IN> ) {
> if( $ln =~ m/\r\n$/ ) {
> $ln =~ s/\r\n$/\n/; # dos2unix: convert CR LF to LF
> if( $. > 0 ) { # skip the header line
> $sym = substr($ln, 10, 16);
> $sym =~ s/ //g;
> if( $prior_sym ne $sym ) {
> if( $prior_sym ne '' ) { close(OUT); }
> $sym_file = $OUT_PATH . "/" . $sym . "." . $OUT_SUFFIX ;
> open(OUT, ">$sym_file") or die "\n\terror: Could not write to
>$sym_file $!\n";
> binmode(OUT);
> }
> print OUT $ln;
> $prior_sym = $sym ;
> }
> }
>}
>close(IN);
>
>What I'd like it to do, instead, is if it hits a bad line, write a
>warning and keep going to the end of the file.
>
>I've tried wrapping the block above in "eval { }; warn $@ if $@;" but
>that doesn't trap the error; even with eval/warn, a bad line will
>cause the script to exit.
>
>Is there a better way of doing this?
Why are you opening a file in "binmode" then relying on translated(or not)
crlf's when you read a line?
Was it an after thought? Did you first open it in text mode, then try binmode
when it didn't work?
Was this file produced via some capture involving terminal emulation (ie: VT100,vt52)?
This goes back a long way but you could have a corrupted file allocation table entry
for it.
Go back to the "old fashioned" (not old) way, binmode and buffer with fixed size reads,
level I or II. If it can't do that then the fat entry for it is bad and you need
a professional piece of software to pick it out.
I know you can't bring a 14 gig file into a hex/bin editor so just read it in
chunks, analyze each chunk for ctrl codes then write each clean chunk out to a
text file.
Read 2k, analyze it, write 2k.
Try that. There is only 2 ways it can go. Either its not corrupt or it is.
There are no other options. Take Perl out of the conversation, it has nothing to
do with it apparently.
------------------------------
Date: Mon, 14 May 2007 18:22:29 -0700
From: sl123@netherlands.area
Subject: Re: Parsing a text file line-by-line: skipping badly-formed lines?
Message-Id: <pk2i43phd318hmbtlok98nld69veebn3p0@4ax.com>
On Mon, 14 May 2007 18:04:38 -0700, sl123@netherlands.area wrote:
>On 14 May 2007 08:10:29 -0700, denis.papathanasiou@gmail.com wrote:
>
>>I have a script which reads a plain text (dos) file line-by-line and
>>splits it into several smaller files, based on a single attribute.
>>
>>The code (below) works, except when a line is malformed (i.e., the
>>line contains binary or control characters), and the script just exits
>>with an error:
>>
>>open(IN, "$IN_FILE") or die "\n\terror: Could not read $IN_FILE $!
>>\n"; ;
>>binmode(IN);
>>while( $ln=<IN> ) {
>> if( $ln =~ m/\r\n$/ ) {
>> $ln =~ s/\r\n$/\n/; # dos2unix: convert CR LF to LF
>> if( $. > 0 ) { # skip the header line
>> $sym = substr($ln, 10, 16);
>> $sym =~ s/ //g;
>> if( $prior_sym ne $sym ) {
>> if( $prior_sym ne '' ) { close(OUT); }
>> $sym_file = $OUT_PATH . "/" . $sym . "." . $OUT_SUFFIX ;
>> open(OUT, ">$sym_file") or die "\n\terror: Could not write to
>>$sym_file $!\n";
>> binmode(OUT);
>> }
>> print OUT $ln;
>> $prior_sym = $sym ;
>> }
>> }
>>}
>>close(IN);
>>
>>What I'd like it to do, instead, is if it hits a bad line, write a
>>warning and keep going to the end of the file.
>>
>>I've tried wrapping the block above in "eval { }; warn $@ if $@;" but
>>that doesn't trap the error; even with eval/warn, a bad line will
>>cause the script to exit.
>>
>>Is there a better way of doing this?
>
>
>Why are you opening a file in "binmode" then relying on translated(or not)
>crlf's when you read a line?
>
>Was it an after thought? Did you first open it in text mode, then try binmode
>when it didn't work?
>
>Was this file produced via some capture involving terminal emulation (ie: VT100,vt52)?
>
>This goes back a long way but you could have a corrupted file allocation table entry
>for it.
>
>Go back to the "old fashioned" (not old) way, binmode and buffer with fixed size reads,
>level I or II. If it can't do that then the fat entry for it is bad and you need
>a professional piece of software to pick it out.
>
>I know you can't bring a 14 gig file into a hex/bin editor so just read it in
>chunks, analyze each chunk for ctrl codes then write each clean chunk out to a
>text file.
>
>Read 2k, analyze it, write 2k.
>Try that. There is only 2 ways it can go. Either its not corrupt or it is.
>There are no other options. Take Perl out of the conversation, it has nothing to
>do with it apparently.
>
Use a level I, "unbuffered i/o" sysread/syswrite. You still supply your buffer.
Since its an old "dos" file, make sure you aren't doing utf8 reads.
Look for an "error" on a read, a sure giveaway its a corrupt fat entry.
------------------------------
Date: Mon, 14 May 2007 21:37:25 -0700
From: sl123@netherlands.area
Subject: Re: Parsing a text file line-by-line: skipping badly-formed lines?
Message-Id: <k1ei435b9bqrdfge0ku5r3bqb45c3a4l5e@4ax.com>
<snip>
Btw, you sound like a person with some experience with data.
Why haven't you thought of this? You really think Perl is
going to help you with this problem?
You couldn't solve this problem in a thousand years
in a thousand different languages.
Find another profession ..... trash collector
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 436
**************************************