[30642] in Perl-Users-Digest
Perl-Users Digest, Issue: 1887 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Sep 29 18:09:47 2008
Date: Mon, 29 Sep 2008 15:09:14 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Mon, 29 Sep 2008 Volume: 11 Number: 1887
Today's topics:
Re: File edits in a Perlish way <rkb@i.frys.com>
Re: File edits in a Perlish way <pgodfrin@gmail.com>
Re: File edits in a Perlish way <rkb@i.frys.com>
Re: File edits in a Perlish way <rkb@i.frys.com>
Re: File edits in a Perlish way <pgodfrin@gmail.com>
Re: File edits in a Perlish way <ben@morrow.me.uk>
Re: File edits in a Perlish way <rkb@i.frys.com>
Re: File edits in a Perlish way <rkb@i.frys.com>
file locks and a counter <richard@example.invalid>
Re: file locks and a counter <jurgenex@hotmail.com>
Re: file locks and a counter <ben@morrow.me.uk>
Re: file locks and a counter <richard@example.invalid>
Re: file locks and a counter xhoster@gmail.com
Re: Help: Duplicate and Unique Lines Problem <tim@burlyhost.com>
Re: Perl Strings vs FileHandle <mshadabh@gmail.com>
Re: Repeating characters <JustMe@somewhere.de>
Re: Replacing binary data containing & using perl in HP <RedGrittyBrick@spamweary.invalid>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Mon, 29 Sep 2008 11:12:14 -0700 (PDT)
From: Ron Bergin <rkb@i.frys.com>
Subject: Re: File edits in a Perlish way
Message-Id: <7c439831-5957-4cbf-9b2e-d9a28e61f821@r15g2000prd.googlegroups.com>
On Sep 29, 10:08 am, pgodfrin <pgodf...@gmail.com> wrote:
> I have a need to pass a series of "edits" I'd like to apply in a
> somewhat controlled manner.
>
> My pseudo code would be something like:
> 1. accept as input the file name, old text, new text
> 2. check existence in file for old text, if not found error out
> 3. if found, then show before and after text for the whole line
> 4. Write out file.
>
> I'm currently using @ARGV for #1 and I intend to slurp the whole file
> into an array and then use map for #4.
>
> I'm looking for ideas on a "perlish" way to do #2 and #3. The grep
> function lets me show before and after like so:
>
> print grep(/$oldtx/,@oldfile);
>
> But I haven't checked for the existence of $oldtx yet, so a typo
> would make this print statement print nothing. I could always use
> loops and other brute force methods - but it seems like there may be a
> cooler, perlish way to do this... Any ideas?
> phil
Do you want the output to go to a new file, or simply edit/update the
source file?
If outputting to a new file, how do you want to determine the
filename?
If you're simply updating 1 or more lines in the source file, you may
want to look at using Tie::File
Could the search string be found more than once in the file? If so,
do you want to change all occurrences or just the first?
For step #1 instead of using @ARGV, I'd either prompt the user for
input or use one of the Getopt modules.
http://search.cpan.org/~jv/Getopt-Long-2.37/lib/Getopt/Long.pm
http://search.cpan.org/~rgarcia/perl-5.10.0/lib/Getopt/Std.pm
http://search.cpan.org/~rsavage/Getopt-Simple-1.49/lib/Getopt/Simple.pm
Step #2 is handled with an "or die" statement when creating the
filehandle.
Steps #3 & #4 would be best handled while looping through the source
file line-by-line.
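[A sketch of those points combined into one line-by-line pass; variable names follow the thread, the sub name is made up, and the error checks are minimal:]

```perl
use strict;
use warnings;

# Apply one edit: show before/after for each matching line, die if no
# line matched, then write the file back out.  Returns lines changed.
sub apply_edit {
    my ($file, $oldtx, $newtx) = @_;

    open my $in, '<', $file or die "can't open '$file': $!";
    my @lines = <$in>;
    close $in;

    my $found = 0;
    for my $line (@lines) {                    # $line aliases the array element
        my $before = $line;
        if ($line =~ s/\Q$oldtx\E/$newtx/g) {  # \Q protects regex metacharacters
            $found++;
            print "before: $before", "after:  $line";
        }
    }
    die "'$oldtx' not found in '$file'\n" unless $found;

    open my $out, '>', $file or die "can't write '$file': $!";
    print $out @lines;
    close $out;
    return $found;
}

apply_edit(@ARGV) if @ARGV == 3;
```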
------------------------------
Date: Mon, 29 Sep 2008 11:37:46 -0700 (PDT)
From: pgodfrin <pgodfrin@gmail.com>
Subject: Re: File edits in a Perlish way
Message-Id: <cc5d7e5b-dfed-4e0a-af7b-7a4104f65829@x35g2000hsb.googlegroups.com>
On Sep 29, 1:12 pm, Ron Bergin <r...@i.frys.com> wrote:
> On Sep 29, 10:08 am, pgodfrin <pgodf...@gmail.com> wrote:
>
>
>
> > I have a need to pass a series of "edits" I'd like to apply in a
> > somewhat controlled manner.
>
> > My pseudo code would be something like:
> > 1. accept as input the file name, old text, new text
> > 2. check existence in file for old text, if not found error out
> > 3. if found, then show before and after text for the whole line
> > 4. Write out file.
>
> > I'm currently using @ARGV for #1 and I intend to slurp the whole file
> > into an array and then use map for #4.
>
> > I'm looking for ideas on a "perlish" way to do #2 and #3. The grep
> > function lets me show before and after like so:
>
> >     print grep(/$oldtx/,@oldfile);
>
> > But I haven't checked for the existence of $oldtx yet, so a typo
> > would make this print statement print nothing. I could always use
> > loops and other brute force methods - but it seems like there may be a
> > cooler, perlish way to do this... Any ideas?
> > phil
>
> Do you want the output to go to a new file, or simply edit/update the
> source file?
>
> If outputting to a new file, how do you want to determine the
> filename?
>
> If you're simply updating 1 or more lines in the source file, you may
> want to look at using Tie::File
>
> Could the search string be found more than once in the file?  If so,
> do you want to change all occurrences or just the first?
>
> For step #1 instead of using @ARGV, I'd either prompt the user for
> input or use one of the Getopt modules.
> http://search.cpan.org/~jv/Getopt-Long-2.37/lib/Getopt/Long.pm
> http://search.cpan.org/~rgarcia/perl-5.10.0/lib/Getopt/Std.pm
> http://search.cpan.org/~rsavage/Getopt-Simple-1.49/lib/Getopt/Simple.pm
>
> Step #2 is handled with an "or die" statement when creating the
> filehandle.
>
> Steps #3 & #4 would be best handled while looping through the source
> file line-by-line.
Since these files are relatively small - 30 to 60 lines, I'm ok with
manipulating them in memory. They won't get any larger because the
files are command files that are generated by other unrelated
processes. As such, I could always process them line by line and have
the ultimate control, but I was just looking for some neat features
similar to the map function.
What I want to do is check for the "old text" IN the file, not whether
the file exists.
The snippet showing the before and after is:
print "This will change from:\n";
print grep(/$oldtx/,@oldfile);
print "...to\n";
print grep(s/$newtx/t2ns1/g,@oldfile);
But if the first print grep statement does not find any of the $oldtx
in @oldfile, if the user made a typo for example, then it just prints
nothing. That's why I would like to test for the existence of $oldtx.
I suppose:
unless (grep(/$oldtx/,@oldfile)) { die "not found!\n";}
would work. And, since these are very small files, then it would be ok
to do grep multiple times...
pg
------------------------------
Date: Mon, 29 Sep 2008 12:01:13 -0700 (PDT)
From: Ron Bergin <rkb@i.frys.com>
Subject: Re: File edits in a Perlish way
Message-Id: <d76af1a1-e670-4e73-914c-552675cd9531@a2g2000prm.googlegroups.com>
On Sep 29, 11:37 am, pgodfrin <pgodf...@gmail.com> wrote:
> On Sep 29, 1:12 pm, Ron Bergin <r...@i.frys.com> wrote:
>
>
>
> > On Sep 29, 10:08 am, pgodfrin <pgodf...@gmail.com> wrote:
>
[snip]
>
> Since these files are relatively small - 30 to 60 lines, I'm ok with
> manipulating them in memory. They won't get any larger because the
> files are command files that are generated by other unrelated
> processes. As such, I could always process them line by line and have
> the ultimate control, but I was just looking for some neat features
> similar to the map function.
>
> What I want to do is check for the "old text" IN the file not if the
> file exists.
>
> Showing before and after snippet is:
>
> print "This will change from:\n";
> print grep(/$oldtx/,@oldfile);
> print "...to\n";
> print grep(s/$newtx/t2ns1/g,@oldfile);
>
> But if the first print grep statement does not find any of the $oldtx
> in @oldfile, if the user made a typo for example, then it just prints
> nothing. That's why I would like to test for the existence of $oldtx.
> I suppose:
>
> unless (grep(/$oldtx/,@oldfile)) { die "not found!\n";}
>
> would work. And, since these are very small files, then it would be ok
> to do grep multiple times...
>
> pg
What's wrong with something like this (untested):
my $found;
open my $filehandle, '<', $file or die "can't open '$file' $!";

while ( <$filehandle> ) {
    my $line = $_;
    if ( $line =~ s/$oldtx/$newtx/ ) {
        $found++;
        print "Original:$_";
        print "New:$line";
    }
}
die "Search pattern was not found\n" unless $found;
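[For step #4 of the original pseudo code, one common pattern is to write the changed lines to a scratch file and rename it over the original; a sketch, where the helper name and the ".new" suffix are made up:]

```perl
use strict;
use warnings;
use File::Copy qw(move);

# Rewrite $file in place: read line by line, apply the substitution,
# write every line to a scratch file, then rename it over the original.
sub rewrite_file {
    my ($file, $oldtx, $newtx) = @_;
    my $tmp = "$file.new";    # scratch-file name is an assumption

    open my $in,  '<', $file or die "can't open '$file' $!";
    open my $out, '>', $tmp  or die "can't write '$tmp' $!";

    my $found = 0;
    while ( my $line = <$in> ) {
        $found++ if $line =~ s/\Q$oldtx\E/$newtx/;
        print $out $line;
    }
    close $in;
    close $out;

    die "Search pattern was not found\n" unless $found;
    move($tmp, $file) or die "rename failed: $!";
    return $found;
}
```

The rename at the end means the original file is never left half-written if the script dies partway through.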
------------------------------
Date: Mon, 29 Sep 2008 12:37:08 -0700 (PDT)
From: Ron Bergin <rkb@i.frys.com>
Subject: Re: File edits in a Perlish way
Message-Id: <7ca82466-a5a6-4352-8cdc-3209f0fb3dc7@n33g2000pri.googlegroups.com>
On Sep 29, 11:37 am, pgodfrin <pgodf...@gmail.com> wrote:
[snip]
>
> Showing before and after snippet is:
>
> print "This will change from:\n";
> print grep(/$oldtx/,@oldfile);
> print "...to\n";
> print grep(s/$newtx/t2ns1/g,@oldfile);
>
> But if the first print grep statement does not find any of the $oldtx
> in @oldfile, if the user made a typo for example, then it just prints
> nothing. That's why I would like to test for the existence of $oldtx.
> I suppose:
>
> unless (grep(/$oldtx/,@oldfile)) { die "not found!\n";}
>
> would work. And, since these are very small files, then it would be ok
> to do grep multiple times...
>
> pg
Why would you want to intentionally add the inefficiency of
multiple greps? Granted, the files are small and the loss of
efficiency may not be noticed in this case, but to me intentionally
adding inefficiency and a possible loss of clarity doesn't make sense.
The loop I showed in my prior comment is clean, efficient and easy to
follow.
------------------------------
Date: Mon, 29 Sep 2008 12:54:11 -0700 (PDT)
From: pgodfrin <pgodfrin@gmail.com>
Subject: Re: File edits in a Perlish way
Message-Id: <64a459f6-0404-4e9d-b867-2923d9df38dd@s50g2000hsb.googlegroups.com>
On Sep 29, 2:37 pm, Ron Bergin <r...@i.frys.com> wrote:
> On Sep 29, 11:37 am, pgodfrin <pgodf...@gmail.com> wrote:
>
> [snip]
>
>
>
>
>
> > Showing before and after snippet is:
>
> > print "This will change from:\n";
> > print grep(/$oldtx/,@oldfile);
> > print "...to\n";
> > print grep(s/$newtx/t2ns1/g,@oldfile);
>
> > But if the first print grep statement does not find any of the $oldtx
> > in @oldfile, if the user made a typo for example, then it just prints
> > nothing. That's why I would like to test for the existence of $oldtx.
> > I suppose:
>
> > unless (grep(/$oldtx/,@oldfile)) { die "not found!\n";}
>
> > would work. And, since these are very small files, then it would be ok
> > to do grep multiple times...
>
> > pg
>
> Why would you want to intentionally add the inefficiency of
> multiple greps?  Granted, the files are small and the loss of
> efficiency may not be noticed in this case, but to me intentionally
> adding inefficiency and a possible loss of clarity doesn't make sense.
> The loop I showed in my prior comment is clean, efficient and easy to
> follow.
Hi Ron,
Yup your code is good, clean (and fun). I'm just looking for a way to
make the code very, very simple. So, in this case, multiple grep
statements is very, very simple. Albeit not as efficient, but very
simple. In fact I was hoping for something even simpler - where I
could combine the grep and an error check into one line, sort of like:
if(print grep(/$oldtx/,@oldfile)) {print "Text not found\n" and die;}
But that won't work because printing the result of grep is always
successful.
pg
------------------------------
Date: Mon, 29 Sep 2008 21:04:50 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: File edits in a Perlish way
Message-Id: <20p7r5-std.ln1@osiris.mauzo.dyndns.org>
Quoth pgodfrin <pgodfrin@gmail.com>:
> I have a need to pass a series of "edits" I'd like to apply in a
> somewhat controlled manner.
>
> My pseudo code would be something like:
> 1. accept as input the file name, old text, new text
> 2. check existence in file for old text, if not found error out
> 3. if found, then show before and after text for the whole line
> 4. Write out file.
>
> I'm currently using @ARGV for #1 and I intend to slurp the whole file
> into an array and then use map for #4.
>
> I'm looking for ideas on a "perlish" way to do #2 and #3. The grep
> function lets me show before and after like so:
>
> print grep(/$oldtx/,@oldfile);
my @lines = grep /\Q$oldtx/, @oldfile
or die "'$oldtx' not found\n";
print @lines;
--
'Deserve [death]? I daresay he did. Many live that deserve death. And some die
that deserve life. Can you give it to them? Then do not be too eager to deal
out death in judgement. For even the very wise cannot see all ends.'
ben@morrow.me.uk
------------------------------
Date: Mon, 29 Sep 2008 13:29:14 -0700 (PDT)
From: Ron Bergin <rkb@i.frys.com>
Subject: Re: File edits in a Perlish way
Message-Id: <e588a608-19f1-4666-90cd-a36511feff7e@h2g2000hsg.googlegroups.com>
On Sep 29, 12:54 pm, pgodfrin <pgodf...@gmail.com> wrote:
>
> Hi Ron,
> Yup your code is good, clean (and fun). I'm just looking for a way to
> make the code very, very simple. So, in this case, multiple grep
> statements is very, very simple. Albeit not as efficient, but very
> simple. In fact I was hoping for something even simpler - where I
> could combine the grep and an error check into one line. sort of like:
> if(print grep(/$oldtx/,@oldfile)) {print "Text not found\n" and die;}
>
> But that won't work because printing the result of grep is always
> successful.
>
IMO, you seem to have an odd idea of the meaning of simple.
It sounds to me like what you're really after is the most compact code,
not necessarily the simplest.
while ( <$filehandle> ) {
    my $line = $_;
    # this if statement is one line, but will probably wrap in the email
    if ( $line =~ s/$oldtx/$newtx/ ) { $found++; print "Original:$_", "New:$line"; }
}
------------------------------
Date: Mon, 29 Sep 2008 14:31:50 -0700 (PDT)
From: Ron Bergin <rkb@i.frys.com>
Subject: Re: File edits in a Perlish way
Message-Id: <528de6a7-9ccd-475c-a225-aac0990c6340@c65g2000hsa.googlegroups.com>
On Sep 29, 12:54 pm, pgodfrin <pgodf...@gmail.com> wrote:
[snip]
>
> Hi Ron,
> Yup your code is good, clean (and fun). I'm just looking for a way to
> make the code very, very simple. So, in this case, multiple grep
> statements is very, very simple. Albeit not as efficient, but very
> simple. In fact I was hoping for something even simpler - where I
> could combine the grep and an error check into one line. sort of like:
> if(print grep(/$oldtx/,@oldfile)) {print "Text not found\n" and die;}
>
> But that won't work because printing the result of grep is always
> successful.
>
> pg
Here's a clean, efficient, and very simple approach that doesn't require
multiple greps.
#!/usr/bin/perl
use strict;
use warnings;
use Tie::File;
# checks should be done here, but I kept it simple
die "invalid args" unless @ARGV == 3;
my ($file, $oldtxt, $newtxt) = @ARGV;
my $found;
tie my @array, 'Tie::File', $file or die "can't tie '$file' $!";
for ( @array ) {
    my $original = $_;
    if ( s/\Q$oldtxt/$newtxt/ ) {
        $found++;
        print "Updating:$original\n", "To:$_\n";
    }
}
untie @array;
die "Search pattern was not found\n" unless $found;
------------------------------
Date: 29 Sep 2008 19:56:44 GMT
From: Richard Nixon <richard@example.invalid>
Subject: file locks and a counter
Message-Id: <1nlme8541y6cy$.eher6f0pphk7.dlg@40tude.net>
Hello newsgroup,
Fortran is my syntax of choice for my avocation in numerical analysis, but
perl is the much better tool for the net. For one of the better fortran
sites, they have this counter:
http://www.fortranlib.com/flcounter.perl
If the open statement in this:
sub check_lock {
    $time = $_[0];

    for ($i = 1; $i <= $time; $i++) {
        if (-e "$data_dir$lock_file") {
            sleep 1;
        }
        else {
            open(LOCK, ">$data_dir$lock_file");
            print LOCK "0";
            close(LOCK);
            last;
        }
    }
}
is changed to this:
sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT) or die $!;
do they have a better program?
Thanks and cheers,
--
Richard Milhous Nixon
Denial ain't just a river in Egypt.
~~ Mark Twain
------------------------------
Date: Mon, 29 Sep 2008 13:07:38 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: file locks and a counter
Message-Id: <kbd2e4hhlot9qj6sg2g7ofc4ouh80mg7iq@4ax.com>
Richard Nixon <richard@example.invalid> wrote:
>perl is the much better tool for the net. For one of the better fortran
>sites, they have this counter:
perldoc -q increment
jue
------------------------------
Date: Mon, 29 Sep 2008 21:07:22 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: file locks and a counter
Message-Id: <q4p7r5-std.ln1@osiris.mauzo.dyndns.org>
Quoth dialup1@copper.net:
>
> If the open statement in this:
>
> sub check_lock {
>     $time = $_[0];
>
>     for ($i = 1; $i <= $time; $i++) {
>         if (-e "$data_dir$lock_file") {
>             sleep 1;
>         }
>         else {
>             open(LOCK, ">$data_dir$lock_file");
>             print LOCK "0";
>             close(LOCK);
>             last;
>         }
>     }
> }
>
> is changed to this:
>
> sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT) or die $!;
>
> do they have a better program?
Yes. The program above has a race condition between the -e and the open;
using O_EXCL avoids this.
Ben
--
#!/bin/sh
quine="echo 'eval \$quine' >> \$0; echo quined"
eval $quine
# [ben@morrow.me.uk]
------------------------------
Date: 29 Sep 2008 20:56:25 GMT
From: Richard Nixon <richard@example.invalid>
Subject: Re: file locks and a counter
Message-Id: <o83foim83vt0$.pb2imthe5rdf$.dlg@40tude.net>
On Mon, 29 Sep 2008 13:07:38 -0700, Jürgen Exner wrote:
> Richard Nixon <richard@example.invalid> wrote:
>>perl is the much better tool for the net. For one of the better fortran
>>sites, they have this counter:
>
> perldoc -q increment
>
> jue
I thought I might do just that, but I don't seem to have any Perl on
my machine right now:
F:\gfortran\source>perl anything
'perl' is not recognized as an internal or external command,
operable program or batch file.
F:\gfortran\source>
I'm downloading ActiveState's MSI for x86 Windows. I've got it stored on
disk somewhere, but at 17 megs, it's no biggie to get a fresh copy.
What I really want to do is have my perl stuff and my fortran stuff
together on F, which is a memory stick, which I can easily transport to my
ladyfriend's house, where I can then program on her laptop, which is what
insomniacs do.
Any suggestions about choices for the activestate install?
--
Richard Milhous Nixon
Always acknowledge a fault frankly. This will throw those in authority off
guard and allow you opportunity to commit more.
~~ Mark Twain
------------------------------
Date: 29 Sep 2008 21:46:19 GMT
From: xhoster@gmail.com
Subject: Re: file locks and a counter
Message-Id: <20080929174622.551$2M@newsreader.com>
dialup1@copper.net wrote:
> Hello newsgroup,
>
> Fortran is my syntax of choice for my avocation in numerical analysis,
> but perl is the much better tool for the net. For one of the better
> fortran sites, they have this counter:
>
> http://www.fortranlib.com/flcounter.perl
>
> If the open statement in this:
>
> sub check_lock {
>     $time = $_[0];
>
>     for ($i = 1; $i <= $time; $i++) {
>         if (-e "$data_dir$lock_file") {
>             sleep 1;
>         }
>         else {
>             open(LOCK, ">$data_dir$lock_file");
>             print LOCK "0";
>             close(LOCK);
>             last;
>         }
>     }
> }
>
> is changed to this:
>
> sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT) or die $!;
>
> do they have a better program?
That depends. I tend to think that all sub-values of "broken" are
equal, but sometimes some things can be more broken than others.
The current code has a race condition. The change detects the race
condition, but upon detecting it it aborts rather than recovering.
Maybe:
sysopen(FH, $path, O_WRONLY|O_EXCL|O_CREAT) or
($!{EEXIST} and redo) or die $!;
However, that still seems to have the problem that if the lock attempt
times out, it proceeds as if the lock was obtained even though it hasn't
been.
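[A sketch that keeps the atomic O_EXCL open but fails instead of proceeding when the wait runs out; the sub name and the per-second retry are illustrative, not from the original counter script:]

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_EXCL O_CREAT);
use Errno qw(EEXIST);

# Try to create the lock file exclusively.  O_EXCL|O_CREAT makes the
# existence test and the creation a single atomic step, closing the
# race between -e and open.  Returns true once the lock is held;
# dies if the lock cannot be acquired within $tries seconds.
sub acquire_lock {
    my ($path, $tries) = @_;
    for my $attempt (1 .. $tries) {
        if (sysopen(my $fh, $path, O_WRONLY | O_EXCL | O_CREAT)) {
            print {$fh} "0";
            close $fh;
            return 1;
        }
        die "can't create lock '$path': $!" unless $!{EEXIST};
        sleep 1;    # someone else holds the lock; wait and retry
    }
    die "timed out waiting for lock '$path'\n";    # fail, don't proceed unlocked
}
```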
Xho
--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
------------------------------
Date: Mon, 29 Sep 2008 11:17:42 -0700
From: Tim Greer <tim@burlyhost.com>
Subject: Re: Help: Duplicate and Unique Lines Problem
Message-Id: <bZ8Ek.103$hX5.38@newsfe06.iad>
Amy Lee wrote:
> On Mon, 29 Sep 2008 15:28:51 +0200, Peter Makholm wrote:
>
>> Amy Lee <openlinuxsource@gmail.com> writes:
>>
>>> Dose perl has functions like the UNIX command sort and uniq can
>>> output duplicate lines and unique lines?
>>
>> There is a uniq function in the List::MoreUtils module; otherwise the
>> standard way is to use the printed strings as keys in a hash to mark
>> which lines are already printed.
>>
>> //Makholm
> Hello,
>
> I use this module List::MoreUtils to have a process but still failed
> and output just the last line, here's my codes.
>
> use List::MoreUtils qw(any all none notall true false firstidx
> first_index
> lastidx last_index insert_after
> insert_after_string apply after after_incl
> before before_incl indexes firstval
> first_value lastval last_value each_array
> each_arrayref pairwise natatime mesh zip
> uniq minmax);
>
> $file = $ARGV[0];
> open FILE, '<', "$file";
> while (<FILE>)
> {
>     @raw_list = split /\n/, $_;
> }
> @list = uniq @raw_list;
> foreach $single (@list)
> {
>     print "$single\n";
> }
>
> Thank you very much.
>
> Regards,
>
> Amy
Why read it into an array, just to break it down again? Per line, use
hashes and check to see if it's been 'seen' yet.
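[The per-line 'seen' hash can be sketched like this; the sub name is made up. It also sidesteps the bug in the posted loop, where @raw_list is overwritten on every iteration, which is why only the last line came out:]

```perl
use strict;
use warnings;

# Return each distinct line once, in first-seen order, using a hash
# of lines already seen -- no need to slurp the file into an array.
sub uniq_lines {
    my ($fh) = @_;
    my (%seen, @uniq);
    while ( my $line = <$fh> ) {
        chomp $line;
        push @uniq, $line unless $seen{$line}++;
    }
    return @uniq;
}

# e.g.:  open my $fh, '<', $ARGV[0] or die $!;
#        print "$_\n" for uniq_lines($fh);
```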
--
Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
and Custom Hosting. 24/7 support, 30 day guarantee, secure servers.
Industry's most experienced staff! -- Web Hosting With Muscle!
------------------------------
Date: Mon, 29 Sep 2008 15:05:14 -0700 (PDT)
From: shadabh <mshadabh@gmail.com>
Subject: Re: Perl Strings vs FileHandle
Message-Id: <f12c180d-ea2a-4195-b6fc-04543e3f1e86@d45g2000hsc.googlegroups.com>
On Sep 17, 12:39 pm, "John W. Krahn" <some...@example.com> wrote:
> shadabh wrote:
> > On Sep 6, 6:08 pm, "John W. Krahn" <some...@example.com> wrote:
> >> shadabh wrote:
> >>> Hi all,
> >>> Just wanted to run this through Perl gurus to see if fit is correct?.
> >>> I have a file that could possibly be 1GB in variable length EBCDIC
> >>> data. I will read the file as EBCDIC data and based on some criteria
> >>> split it into 100 different files which will add up to the 1GB. This
> >>> way a particular copy book can be applied to easy of the split files.
> >>> The approach I am using is a filehandle ( IO::FileHandle and
> >>> $Strings), substr and write out to 100 different files after applying
> >>> the 'logic'. I will use two routine, one to read and one to write, I
> >>> have tested this out with 100MB file and it works fine. The question
> >>> though is there a memory limit to this, as we are using strings to
> >>> break the files. Or is there an alternative way to do this?
> >>> Comments, suggestions, improvements and alternatives will really help
> >>> to design the code. thanks
> >> You have to show us what "this" is first.  It is kind of hard to make
> >> comments, suggestions, improvements and alternatives to something that
> >> we cannot see.
>
> >> John
> >> --
> >> Perl isn't a toolbox, but a small machine shop where you
> >> can special-order certain sorts of tools at low cost and
> >> in short order.                              -- Larry Wall
>
> > Sure. Here is what I meant by 'this way'. Please comment. Thanks
>
> > while ($raw_data) {
>
> You don't modify $raw_data inside the loop so there is no point in
> testing it every time through the loop.
>
> >     $var_len=$INIT_LEN+$AC_NUM_LENGTH;
>
>     my $var_len = $INIT_LEN + $AC_NUM_LENGTH;
>
> >     $val = substr $raw_data,$var_len,4;
> >     $asc_string = $val;
>
>     my $asc_string = substr $raw_data, $var_len, 4;
>
> >     eval '$asc_string =~ tr/\000-\377/' . $cp_037 . '/';
> > #   open(OUTF, '>>ebcdic_ID.txt');
> > #   #print OUTF $var_len, $asc_string, "\n";
> > #   close OUTF;
> >     open(PARM, '<TSY2_PARM.par') || die("Could not open the
> > parammeter file $PARM_FILE ! File read failed ..check iF file exits");
>
> You are opening 'TSY2_PARM.par' but your error message says you are
> opening $PARM_FILE.  You should include the $! variable in the error
> message so you know *why* it failed to open.
>
>     open my $PARM, '<', 'TSY2_PARM.par'
>         or die "Could not open 'TSY2_PARM.par' $!";
>
> >     $parm_data = <PARM>;
>
> You assign the same data to $parm_data every time so...
>
> >     if (($parm_data =~ m!($asc_string)!g) eq 1) {
>
> your pattern will always match at the same place every time through the
> loop so if the pattern is present you have an infinite loop.  You are
> not using the contents of $1 so the parentheses are superfluous.  The
> comparison test to the string '1' is superfluous.
>
>     if ( $parm_data =~ /$asc_string/g ) {
>
> >         $COPYBOOK_LEN = substr $parm_data,length($`)+4,4;
>
> The use of $` is discouraged as it slows down *all* regular expressions
> in your program.
>
>         my $COPYBOOK_LEN = substr $parm_data, $-[ 0 ] + 4, 4;
>
> >         print $COPYBOOK_LEN;
> >         close (PARM);
> >         $OUT_DATAFILE = 'EBCDIC_'.$asc_string.'.txt';
> >         $RECORD_LEN= $COPYBOOK_LEN+$HEADER_DATA;
> >         open(OUTF, ">>$OUT_DATAFILE")|| die("Could not open file. File
> > read failed ..check id file exits");
>
>         open my $OUTF, '>>', $OUT_DATAFILE
>             or die "Could not open '$OUT_DATAFILE' $!";
>
> >         print OUTF substr $raw_data, $INIT_LEN, $RECORD_LEN;
> >         close OUTF;
> >         $INIT_LEN = $INIT_LEN + $RECORD_LEN;
> >         print $INIT_LEN;
> >         print $var_len;
> >     }
> >     else {
> >         print 'End of file reached or copy book is not a part of the
> > loading process', "\n";
> >         exit 0;
> >     }
> > }
>
> So, to summarise:
>
> open my $PARM, '<', 'TSY2_PARM.par'
>     or die "Could not open the parammeter file 'TSY2_PARM.par' $!";
> my $parm_data = <$PARM>;
> close $PARM;
>
> while ( 1 ) {
>
>     my $var_len = $INIT_LEN + $AC_NUM_LENGTH;
>     my $asc_string = substr $raw_data, $var_len, 4;
>     eval '$asc_string =~ tr/\000-\377/' . $cp_037 . '/';
>
>     if ( $parm_data =~ /\Q$asc_string\E/g ) {
>
>         my $COPYBOOK_LEN = substr $parm_data, $-[ 0 ] + 4, 4;
>         my $OUT_DATAFILE = "EBCDIC_$asc_string.txt";
>         my $RECORD_LEN   = $COPYBOOK_LEN + $HEADER_DATA;
>
>         open my $OUTF, '>>', $OUT_DATAFILE
>             or die "Could not open '$OUT_DATAFILE' $!";
>         print $OUTF substr $raw_data, $INIT_LEN, $RECORD_LEN;
>         close $OUTF;
>
>         $INIT_LEN += $RECORD_LEN;
>         print $COPYBOOK_LEN, $INIT_LEN, $var_len;
>     }
>     else {
>         print "End of file reached or copy book is not a part of the loading process\n";
>         last;
>     }
> }
>
> exit 0;
>
> __END__
>
> John
> --
> Perl isn't a toolbox, but a small machine shop where you
> can special-order certain sorts of tools at low cost and
> in short order.                              -- Larry Wall
Thanks for all your suggestions/comments. So, what is your opinion of
this method? Is there an alternative at all? Because $parm_data could
potentially contain GBs worth of data. Suggestions to improve the
method are welcome. BTW, all your comments are good and have been
incorporated into my code. TY
Shadab
------------------------------
Date: Mon, 29 Sep 2008 23:42:20 +0200
From: Hartmut Camphausen <JustMe@somewhere.de>
Subject: Re: Repeating characters
Message-Id: <MPG.234b68ef73c230dc989696@news.t-online.de>
Jason Carlton schrieb:
> [...] but this is the only section that does any
> replacements on this field:
>
> $contents{'description'} =~ s/\r\n|\r|\n/ /g;
> $contents{'description'} =~ s/\&/ and /g; <--
> $contents{'description'} =~ s/ / /g;
> $contents{'description'} =~ s/</&lt;/g; <--
> $contents{'description'} =~ s/>/&gt;/g; <--
> $contents{'description'} =~ s/\*//g; <--
> $contents{'description'} =~ s/ +$//;
> $contents{'description'} =~ s/^\s+|\s+$//g;
> $contents{'description'} =~ s/(.)\1{4,}/$1$1$1$1/g;
>
>
> This is in a classifieds program. The other matches work, but
> yesterday someone submitted this in the description field:
>
> LARGE BIRD CAGE ASKING $200
> FIRM ....>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>
> It limited the .... to 4, but not the > or <
Well ... how could it ;-)
Have a look at what gets replaced before (lines marked with '<--').
'<' and '>' are HTML-escaped with their respective entities and are thus
not showing up literally in the string when the program reaches the last
s///g statement.
Hint: When checking results, have a look at the resulting *raw* data
- use a text editor, not a web browser.
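[A quick way to see this effect; the sample string is made up. Once '>' has been rewritten as the four-character entity '&gt;', the run-collapsing s/(.)\1{4,}/$1$1$1$1/g no longer sees five identical characters in a row:]

```perl
use strict;
use warnings;

my $desc = "FIRM ......>>>>>>>>";

# Same order as the posted code: escape first, collapse repeats last.
(my $escaped = $desc) =~ s/>/&gt;/g;
$escaped =~ s/(.)\1{4,}/$1$1$1$1/g;
# The dots collapse to 4, but each '>' is now the string '&gt;' --
# no run of 5+ equal characters -- so the '>' repeats survive.
print "$escaped\n";

# Collapsing *before* escaping limits the '>' run as intended:
(my $fixed = $desc) =~ s/(.)\1{4,}/$1$1$1$1/g;
$fixed =~ s/>/&gt;/g;
print "$fixed\n";
```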
hth + mfg, Hartmut
--
------------------------------------------------
Hartmut Camphausen h.camp[bei]textix[punkt]de
------------------------------
Date: Mon, 29 Sep 2008 19:50:34 +0100
From: RedGrittyBrick <RedGrittyBrick@spamweary.invalid>
Subject: Re: Replacing binary data containing & using perl in HPUX 11.11
Message-Id: <gbr81t$o17$3@registered.motzarella.org>
badkmail@gmail.com wrote:
> Thanks Sherm and Peter. I have another query. I have seen in some
> places the constants ', & etc being used. I googled and found
> that these are HTML constants. Is my understanding correct? and how is
> it possible to use these with perl?
use HTML::Entities;
--
RGB
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 1887
***************************************