[28552] in Perl-Users-Digest


Perl-Users Digest, Issue: 9916 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Nov 1 14:05:43 2006

Date: Wed, 1 Nov 2006 11:05:06 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Wed, 1 Nov 2006     Volume: 10 Number: 9916

Today's topics:
        Extract content from a HTML or text file <contents.ra@gmail.com>
    Re: Extract content from a HTML or text file cgrady357@gmail.com
    Re: Extract content from a HTML or text file <john@castleamber.com>
    Re: File Permissions using NIS and NFS and copying via  cgrady357@gmail.com
    Re: killing processes using perl (artsd) <andreas./@\.poipoi.de>
    Re: longest common substring xhoster@gmail.com
    Re: longest common substring <tzz@lifelogs.com>
    Re: Mailbox-style directory hashing <hjp-usenet2@hjp.at>
    Re: Need to retain the order of array of hash reference <jgibson@mail.arc.nasa.gov>
    Re: Perl equivalent to unix script <hjp-usenet2@hjp.at>
    Re: Putting a line in a specific place in a file <veatchla@yahoo.com>
    Re: Putting a line in a specific place in a file <bryan@worldspice.net>
    Re: Putting a line in a specific place in a file <veatchla@yahoo.com>
        regular expression consecutive numbers or letters mchesak@gmail.com
    Re: regular expression consecutive numbers or letters <David.Squire@no.spam.from.here.au>
    Re: regular expression consecutive numbers or letters mchesak@gmail.com
    Re: regular expression consecutive numbers or letters <wahab@chemie.uni-halle.de>
    Re: regular expression consecutive numbers or letters <David.Squire@no.spam.from.here.au>
    Re: regular expression consecutive numbers or letters <sbryce@scottbryce.com>
    Re: regular expression consecutive numbers or letters mchesak@gmail.com
    Re: regular expression consecutive numbers or letters <David.Squire@no.spam.from.here.au>
    Re: regular expression consecutive numbers or letters <wahab@chemie.uni-halle.de>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 1 Nov 2006 08:59:45 -0800
From: "frozensnow" <contents.ra@gmail.com>
Subject: Extract content from a HTML or text file
Message-Id: <1162400384.964943.165460@m73g2000cwd.googlegroups.com>

Hello everyone,

I need to extract the content between two form tags.
I tried using regular expressions, but that is not effective because the
HTML is not consistently formatted.
I saw many articles suggesting HTML::Parser, but I could not
find it in the ActivePerl repository.
I was also not able to figure out how HTML::Parser works.
Can anyone suggest a solution for this problem, or explain
how HTML::Parser works?

Thank you in advance



------------------------------

Date: 1 Nov 2006 09:41:17 -0800
From: cgrady357@gmail.com
Subject: Re: Extract content from a HTML or text file
Message-Id: <1162402877.387327.114710@h48g2000cwc.googlegroups.com>

Add http://www.bribes.org/perl/ppm to your ppm repositories.  It has
the module compiled for ActiveState users.

Another option is to install the Strawberry Perl distribution.  This
includes the MinGW compiler, so you can compile your own modules as
needed.  Then you could run the following from your command line:
perl -MCPAN -e "install HTML::Parser"
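
Once the module is installed, extracting what lies between the form tags
could look roughly like the sketch below. It is only an illustration, not
a tested solution for the original poster's pages: the helper name
extract_forms and the sample HTML are made up for this example, and it
assumes forms are not nested. It uses the version 3 handler interface of
HTML::Parser, where "tagname" is the lower-cased element name and "text"
is the original source text of each event.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper: return the raw source found between each
# <form ...> start tag and its </form> end tag.
sub extract_forms {
    my ($html) = @_;
    my @forms;          # one string per form found
    my $inside = 0;     # true while we are between the form tags

    my $parser = HTML::Parser->new(
        api_version => 3,
        start_h => [ sub {
            my ($tag, $text) = @_;
            if ($tag eq 'form') { $inside = 1; push @forms, ''; return; }
            $forms[-1] .= $text if $inside;
        }, 'tagname, text' ],
        end_h => [ sub {
            my ($tag, $text) = @_;
            if ($tag eq 'form') { $inside = 0; return; }
            $forms[-1] .= $text if $inside;
        }, 'tagname, text' ],
        # Plain text, comments and everything else end up here.
        default_h => [ sub { $forms[-1] .= $_[0] if $inside }, 'text' ],
    );
    $parser->parse($html);
    $parser->eof;
    return @forms;
}

my $html = '<html><body><FORM action="/go"><input name="q"> Search'
         . '</FORM><p>after</p></body></html>';
print "$_\n" for extract_forms($html);
```

Because the parser reports tag names in lower case, the upper-case FORM
tags in the sample are matched without any extra work, which is the sort
of irregularity that makes a plain regular expression fragile.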

frozensnow wrote:
> hello everyone,
>
> I need to extract content between two form tags.
> I tried doing regular expression but its not effective as the HTML is
> not formatted.
> I saw many articles suggesting about the HTML::Parser but could not
> find it in the active perl repository.
> I was not able to figure out how the HTML::Parser works.
> Can any one let me know if they know any solution for this problem or
> how HTML::Parser works?
> 
> Thank you in advance



------------------------------

Date: 1 Nov 2006 17:58:15 GMT
From: John Bokma <john@castleamber.com>
Subject: Re: Extract content from a HTML or text file
Message-Id: <Xns986E79C6B4164castleamber@130.133.1.4>

"frozensnow" <contents.ra@gmail.com> wrote:

> hello everyone,
> 
> I need to extract content between two form tags.
> I tried doing regular expression but its not effective as the HTML is
> not formatted.
> I saw many articles suggesting about the HTML::Parser but could not
> find it in the active perl repository.
> I was not able to figure out how the HTML::Parser works.
> Can any one let me know if they know any solution for this problem or
> how HTML::Parser works?

I recommend using HTML::TreeBuilder; see http://johnbokma.com/perl/
for several examples.

-- 
John                Experienced Perl programmer: http://castleamber.com/

          Perl help, tutorials, and examples: http://johnbokma.com/perl/


------------------------------

Date: 1 Nov 2006 08:30:18 -0800
From: cgrady357@gmail.com
Subject: Re: File Permissions using NIS and NFS and copying via perl
Message-Id: <1162398616.988940.298520@e3g2000cwe.googlegroups.com>

If the file owner is changing when you FTP the files to the new server,
then you should put them in an archive first before transferring them.
Then when you expand the archive on the new server, the permissions
will remain the same.

If the issue is that the files' owner is different to begin with, then
this code should help:
use strict;
use warnings;
use File::Copy;

# Re-create each file so that the new copy is owned by the user running
# this script (run it as www-data for the message below to be accurate).
sub chg_owner {
	my $dir = shift;
	my $sep = shift;
	my @files = @_;
	for my $name (@files) {
		my $file = "$dir$sep$name";
		if (! -e $file) { die("Missing file: $file"); }
		if (-l $file) { $file = readlink $file; }
		# Test with defined(): uid 0 is a valid owner but a false value.
		my $uid = (stat($file))[4];
		defined $uid or die("File stat failed $file: $!");
		my $owner = getpwuid($uid);
		defined $owner or die("Getpwuid failed on $uid");
		next if ($owner eq "www-data");
		my $newfile = $file . ".bk";
		copy($file, $newfile) or die("Copy failed: $file to $newfile $!");
		unlink $file or die("Unlink failed: $file $!");
		move($newfile, $file) or die("Move failed: $newfile to $file $!");
		print "Changed owner of $file from $owner to www-data\n";
	}
}

alerman@gmail.com wrote:
> I have written a script that copies files from a test server to a
> production server. The userID's and groupID's are the same on both
> servers and we are using NIS.
>
> I set the permissions on the test server to alerman:www-data (I own it,
> it is in the web users group) so that the web server can read and
> execute it.
>
> The files copy over to the new server, but are always alerman:alerman .
> This means that my web server cant read them because of permissions
> issues.
>
> Any suggestions?
> 
> I have tried both File::Copy and using `cp -p from to`



------------------------------

Date: Wed, 01 Nov 2006 18:43:47 +0100
From: "andreas" <andreas./@\.poipoi.de>
Subject: Re: killing processes using perl (artsd)
Message-Id: <ceb8$4548dcd3$5499b0a1$22708@news2.eu.disputo.net>

On Wed, 01 Nov 2006 08:44:04 +0100, Dr.Ruud wrote:

>  print "kill '$_'\t?\n" for @pids ;

Hello again,

this use of 'for' is not only Perl, it is a pearl!
It could be the death of 'foreach' ;-)

thanks!
-andreas


------------------------------

Date: 01 Nov 2006 16:46:47 GMT
From: xhoster@gmail.com
Subject: Re: longest common substring
Message-Id: <20061101114654.182$gT@newsreader.com>

Henry Townsend <henry.townsend@not.here> wrote:
> Is there a standard algorithm or module which finds the N longest common
> substrings in a set of text files?

I would probably just go implement it.  It would probably be faster than
looking for a pre-existing module and learning how to use it.


> Here's the use case: I'm trying to clean up a very old, very large, and
> very ugly build system which has thousands of unparameterized
> compile/link commands in hundreds of Makefiles. I want to search them
> for frequently-occurring long substrings. Hopefully this will turn up
> phrases like "-lrpcsvc -ltermlib -lcurses -ldl -lnsl -lsocket" or
> "-DUNIX -DANSI -DUSE_SOCKETS". I would then evaluate these for semantic
> meaning, make up reasonable names like $(SYS_LIBS) and $(UNIX_DEFINES),
> and do a global replace. Then repeat until satisfied.

I would probably do this with a series of Perl one-liners and GNU text
utils, rather than all in Perl, but it would follow about the same pattern
as below. For memory/performance reasons, I would pick a "reasonable" upper
bound on the length of the common substrings.  Once you have a list of
candidates which are over 50 (say), then it should be easy enough to go
back by hand and see which one is the absolute longest, but for present
purposes that probably isn't even necessary.


use strict;
use warnings;
my $max=50; # the longest string which is "long enough"
my $hash;
while (my $line=<>) { chomp $line;
  foreach (0..(length $line)-1) {
    $hash->{substr $line, $_, $max}++;
  }
};


my $count=0;
while(1) {
  my @common= grep {$hash->{$_}>1 and length == $max} keys %$hash;
  $count+=@common;
  print length $_, "\t$hash->{$_}\t$_\n" foreach @common;
  last if $count>10 or $max==0;
  my %hash2;
  $max--;
  $hash2{substr $_, 0, $max}++ foreach keys %$hash;
  $hash=\%hash2;
}


Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service                        $9.95/Month 30GB


------------------------------

Date: Wed, 01 Nov 2006 17:33:22 +0000
From: Ted Zlatanov <tzz@lifelogs.com>
Subject: Re: longest common substring
Message-Id: <g69zmbbax7h.fsf@lifelogs.com>

On  1 Nov 2006, henry.townsend@not.here wrote:

> Is there a standard algorithm or module which finds the N longest
> common substrings in a set of text files?
>
> Here's the use case: I'm trying to clean up a very old, very large,
> and very ugly build system which has thousands of unparameterized
> compile/link commands in hundreds of Makefiles. I want to search them
> for frequently-occurring long substrings. Hopefully this will turn up
> phrases like "-lrpcsvc -ltermlib -lcurses -ldl -lnsl -lsocket" or
> "-DUNIX -DANSI -DUSE_SOCKETS". I would then evaluate these for
> semantic meaning, make up reasonable names like $(SYS_LIBS) and
> $(UNIX_DEFINES), and do a global replace. Then repeat until satisfied.

Seems like what you really want is to find what words commonly follow
each other.  In other words, it would be good to know that -lcurses
usually follows -ltermlib.  That will let you build sets of associated
words (so, for instance, "-lcurses -ldl -lnsl" and "-lnsl -lcurses
-ldl" will be noticed).

To do that is fairly easy.  Here I will show you with a script that
reads from standard input or a set of files, and prints the contents
of %h at the end.  So you should be able to filter this by number of
words that follow each other.  You can also make sets of words that
are strongly associated, meaning that they are likely to appear
together.  This will be much more useful, I think, than common
substrings, which would break on space and order of options.

I figured that order doesn't matter so I always sort the list of the
two words.  That way, "a b" and "b a" will count the same.

Ted

#!/usr/bin/perl

use warnings;
use strict;
use Data::Dumper;

my @words = split ' ', join('', <>);

my %h; # this will hold the frequencies
while (exists $words[1])
{
 my ($w1, $w2) = sort @words[0,1];

 $h{$w1}->{$w2}++;

 shift @words;
}

print Dumper \%h;


------------------------------

Date: Wed, 1 Nov 2006 18:22:00 +0100
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: Mailbox-style directory hashing
Message-Id: <slrnekhlto.j2n.hjp-usenet2@yoyo.hjp.at>

On 2006-10-31 23:40, s1037989@gmail.com <s1037989@gmail.com> wrote:
> I whipped up this quick and ugly script and I wanted to post it for
> code review and others' benefit.
>
> With an array such as:
> qw(aaaa aaab aaac bbbb bccc bcdd bcee bcff cccc dddd)
>
> The program returns:
> # perl list2fs.pl 2
> /a/aa/aaa/aaaa/aaaa
> /a/aa/aaa/aaab/aaab
> /a/aa/aaa/aaac/aaac
> /b/bb/bbbb
> /b/bc/bcc/bccc
> /b/bc/bcd/bcdd
> /b/bc/bce/bcee
> /b/bc/bcf/bcff
> /c/cccc
> /d/dddd
>
> Now as you can see, what this program does is take a list of filenames
> and "hashifies" it like mailbox storing allowing no more than 2 (or
> whatever $ARGV[0] is) filenames to be in a single directory.  The
> point, obviously, is if you have 100000 filenames and ext3 won't store
> 100000 files in a single directory, you can use this technique to break
> them down.

Ext3 will happily store 100000 filenames in a directory - it just won't
be very quick in retrieving them (Even that isn't true for Linux 2.6 any
more - ext3 directories now use a structure called an "htree" to quickly
access files in huge directories). But assuming you need to use ext3
with older kernel versions or other filesystems with linear directories:


Are all the filenames guaranteed to be the same length? Otherwise, what
happens if you have these file names?

aaab aaac aaad aaae aaa

You would need a directory /a/aa/aaa and a file /a/aa/aaa. But you can't
have both.


Do you have all the filenames in advance or is it possible to create new
files after the structure has been created? If it is the latter, you
will probably need a way to split an existing directory if the number
of files in it becomes too large - what happens when somebody accesses
the directory in the middle of the split operation? How do you determine
when you have to split? Count the files each time you create a new file?


Finally, I assume you used the value of 2 for demonstration purposes
only. Such a small value is not practical: it makes little sense to
restrict directory sizes to much less than a disk block. A structure like
the one you showed, with lots of directories containing a single file
each, would be slower than one which is one level less deep.

> Now, that said, this is NOT intended for "hashifying" mail storage
> dirs.  It IS intended to "hashify" a HUGE list of filenames.
> Unfortunately this code is VERY inefficient.

How did you determine that it is very inefficient?

> So, I post it here so people can see my idea if it helps, and so that
> people can maybe direct me to an existing CPAN module that would
> accomplish the same thing?

When I needed to do similar things I used a hash function (e.g. MD5 or
SHA-1) on the key (the filename in your case) to get nice, uniformly
distributed constant length filenames and then computed the number of
levels from the maximum number of files I had to store. With 256 files
per directory, 2 levels would be enough for 16 million files and 3
levels would be enough for 4 billion files. I never needed more :-).
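
A minimal sketch of that scheme, assuming MD5 via the core Digest::MD5
module and two hex digits (256 possible subdirectories) per level. The
helper name hashed_path and the root directory are invented for this
example, and the original filename would have to be recorded elsewhere
if it is still needed, since only the digest appears in the path.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: map a key (here a filename) to a path with
# $levels directory levels, each named after two hex digits of the
# MD5 digest, followed by the full digest as the file name.
sub hashed_path {
    my ($root, $name, $levels) = @_;
    my $digest = md5_hex($name);    # always 32 hex digits
    my @dirs = map { substr $digest, 2 * $_, 2 } 0 .. $levels - 1;
    return join '/', $root, @dirs, $digest;
}

# Names of different lengths no longer collide with directory names.
print hashed_path('/var/spool/files', $_, 2), "\n" for qw(aaab aaac aaa);
```

With two levels and up to 256 files in each leaf directory this gives
256 * 256 * 256 slots, which is where the figure of roughly 16 million
comes from.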

> Or, perhaps someone likes what I've started and wants to help improve
> the code?

If you avoid copying around huge lists it might be faster. But I'm not
sure the code even does what you want - see my questions above.

	hp

-- 
   _  | Peter J. Holzer    | > Why should one invent something that
|_|_) | Sysadmin WSR       | > does not exist?
| |   | hjp@hjp.at         | What else would be the point of inventing?
__/   | http://www.hjp.at/ |	-- P. Einstein and V. Gringmuth in desd


------------------------------

Date: Wed, 01 Nov 2006 09:57:15 -0800
From: Jim Gibson <jgibson@mail.arc.nasa.gov>
Subject: Re: Need to retain the order of array of hash references.
Message-Id: <011120060957159046%jgibson@mail.arc.nasa.gov>

In article <1162383994.216244.317950@k70g2000cwa.googlegroups.com>,
alwaysonnet <kalyanrajsista@gmail.com> wrote:

> hi all -- Help needed

I am afraid I do not understand exactly what it is you are trying to
accomplish. Perhaps you can clarify some of your confusing and
contradictory statements.

> 
> I'm trying to sort an array of hash-references based on the "A/C No" by
> preserving the "Type" order,'ZCurrency' and 'Status' , where "A/C No"
> is unique for every record.

If 'A/C No' values are unique, then you can do a simple sort on 'A/C
No'. There is no need to retain the order, because the order of the
sorted array is completely determined by the values of 'A/C No'.
Unfortunately, your sample output below is NOT sorted by 'A/C No'.

> 
> Important Note is that - if Records having same "Type","Status" and
> "ZCurrency" have different "A/C No" so sorting them with "A/C No"
> doesn't effect the actual order of display.

This sentence doesn't make any sense, and in fact may not be
grammatically correct. Did you mean "if ... then ..." instead of "if
 ... so ..."? 

If two or more records have the same 'Type', 'Status', and 'ZCurrency'
values and different 'A/C No' values, then sorting them by 'A/C No' can
definitely affect their order in the final result. 

> 
> 1) I'm looping through every element, checking current record if
> "Type","Status" and "ZCurrency" is equal to previous record. if not,
> pushing it to final array.

You should be using Perl's sort function to sort your records.

> 
> 2) if current record matches with previous record, i'm pushing to
> another array and sorting them with the "A/C No". Because, both the
> records will have same "Type","Status" and "ZCurrency".

Are records with the same 'Type', 'Status', and 'ZCurrency' values
always contiguous in your set of records?

> 
> 3) By this, I got two arrays @final and @curr ... need to merge them
> into one by sorting with "A/C No" and preserving the order with
> original but sort with "A/C No" if the records have same
> "Type","Status" and "ZCurrency".

One general technique for preserving order in a sort is to add an
extra, temporary field that contains a sequence number, defined however
you wish, with a unique value for each record. You can then use this field
as your final sort key, to be used when the other sort keys are
identical. However, current versions of Perl (5.8 and later) use a merge
sort that preserves the order of equal elements, and "use sort 'stable'"
makes that a guarantee, so if you do your sorting in one step this
shouldn't be necessary.
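
The sequence-number technique can be sketched as follows. The records
and the field names (type, id, _seq) are invented for illustration and
are not the poster's data.

```perl
use strict;
use warnings;

my @records = (
    { type => 'b', id => 1 },
    { type => 'a', id => 2 },
    { type => 'b', id => 3 },
    { type => 'a', id => 4 },
);

# Tag each record with its original position.
my $seq = 0;
$_->{_seq} = $seq++ for @records;

# Sort on the real key and fall back to the original position on ties,
# so the result does not depend on the sort algorithm being stable.
my @sorted = sort {
    $a->{type} cmp $b->{type} || $a->{_seq} <=> $b->{_seq}
} @records;

# Drop the temporary field again (the hashes are shared by both arrays).
delete $_->{_seq} for @records;

print join(' ', map { "$_->{type}$_->{id}" } @sorted), "\n";
```

Records with the same type keep their input order, so this prints
"a2 a4 b1 b3".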

> 
> Partially working code ---
> 
> #!/usr/bin/perl
> 
> my %type_order = qw/Prefund 1 Receipt 2 Payment 3/;
> 
> my @records = (
>         {'Type'=>'Prefund' , 'A/C No'=>12345 , 'Status'=>'Y',
> 'ZCurrency' =>'GBP'},
>         {'Type'=>'Prefund' , 'A/C No'=>45678 , 'Status'=>'Y',
> 'ZCurrency' =>'SEK'},
>         {'Type'=>'Prefund' , 'A/C No'=>33333 , 'Status'=>'Y',
> 'ZCurrency' =>'SEK'},
>         {'Type'=>'Prefund' , 'A/C No'=>32222 , 'Status'=>'N',
> 'ZCurrency' =>'SEK'},
>         {'Type'=>'Receipt' , 'A/C No'=>32365 , 'Status'=>'Y',
> 'ZCurrency' =>'EUR'},
>         {'Type'=>'Receipt' , 'A/C No'=>78878 , 'Status'=>'N',
> 'ZCurrency' =>'AIR'},
>         {'Type'=>'Receipt' , 'A/C No'=>32435 , 'Status'=>'N',
> 'ZCurrency' =>'AIR'},
>         {'Type'=>'Receipt' , 'A/C No'=>64237 , 'Status'=>'N',
> 'ZCurrency' =>'GBP'},
>         {'Type'=>'Payment' , 'A/C No'=>22476 , 'Status'=>'Y',
> 'ZCurrency' =>'AUS'},
>         {'Type'=>'Payment' , 'A/C No'=>22447 , 'Status'=>'Y',
> 'ZCurrency' =>'BEL'},
>         {'Type'=>'Payment' , 'A/C No'=>56546 , 'Status'=>'N',
> 'ZCurrency' =>'EUR'},
>         {'Type'=>'Payment' , 'A/C No'=>44444 , 'Status'=>'N',
> 'ZCurrency' =>'EUR'},
>         {'Type'=>'Payment' , 'A/C No'=>43434 , 'Status'=>'N',
> 'ZCurrency' =>'EUR'},
> );
> 
> foreach $current (@records) {
>  if ($previous->{'Type'} eq $current->{'Type'} && $previous->{'Status'}
> eq $current->{'Status'} &&                 $previous->{'ZCurrency'} eq
> $current->{'ZCurrency'}) {
>  push(@curr,$previous);
>                 push(@curr,$current);
>         } else {
>                 push(@final,$current);
>         }
>                 $previous = $current;
> }
> 
> %seen = ();
> foreach $item (@curr) {
>     push(@uniq, $item) unless $seen{$item}++;
> }
> 
> @uniq = sort {  $type_order{$a->{'Type'}} <=> $type_order{$b->{'Type'}}
>                                            ||
>                 $a->{'A/C No'} <=> $b->{'A/C No'}
> 
>                 } @uniq;
> 
> foreach $x(@uniq) {
>         print " $x->{'Type'} $x->{'A/C No'} $x->{'Status'}
> $x->{'ZCurrency'}\n";
> }
> 
> I want my final output as
> 
> {'Type'=>'Prefund' , 'A/C No'=>12345 , 'Status'=>'Y', 'ZCurrency'
> =>'GBP'},
> {'Type'=>'Prefund' , 'A/C No'=>33333 , 'Status'=>'Y', 'ZCurrency'
> =>'SEK'},
> {'Type'=>'Prefund' , 'A/C No'=>45678 , 'Status'=>'Y', 'ZCurrency'
> =>'SEK'},
> {'Type'=>'Prefund' , 'A/C No'=>32222 , 'Status'=>'N', 'ZCurrency'
> =>'SEK'},
> {'Type'=>'Receipt' , 'A/C No'=>32365 , 'Status'=>'Y', 'ZCurrency'
> =>'EUR'},
> {'Type'=>'Receipt' , 'A/C No'=>32435 , 'Status'=>'N', 'ZCurrency'
> =>'AIR'},
> {'Type'=>'Receipt' , 'A/C No'=>78878 , 'Status'=>'N', 'ZCurrency'
> =>'AIR'},
> {'Type'=>'Receipt' , 'A/C No'=>64237 , 'Status'=>'N', 'ZCurrency'
> =>'GBP'},
> {'Type'=>'Payment' , 'A/C No'=>22476 , 'Status'=>'Y', 'ZCurrency'
> =>'AUS'},
> {'Type'=>'Payment' , 'A/C No'=>22447 , 'Status'=>'Y', 'ZCurrency'
> =>'BEL'},
> {'Type'=>'Payment' , 'A/C No'=>43434 , 'Status'=>'N', 'ZCurrency'
> =>'EUR'},
> {'Type'=>'Payment' , 'A/C No'=>44444 , 'Status'=>'N', 'ZCurrency'
> =>'EUR'},
> {'Type'=>'Payment' , 'A/C No'=>56546 , 'Status'=>'N', 'ZCurrency'
> =>'EUR'},

It looks like you want to sort by: 1) 'Type', 2) 'Status', 3)
'ZCurrency', and 4) 'A/C No'. Is that correct? Try this (untested):

@sorted = sort { $type_order{$a->{'Type'}} <=> $type_order{$b->{'Type'}} ||
                 $b->{'Status'} cmp $a->{'Status'} ||   # 'Y' before 'N'
                 $a->{'ZCurrency'} cmp $b->{'ZCurrency'} ||
                 $a->{'A/C No'} <=> $b->{'A/C No'} } @records;

> 
> So records to be effected are
> 
> {'Type'=>'Prefund' , 'A/C No'=>45678 , 'Status'=>'Y', 'ZCurrency'
> =>'SEK'},
> {'Type'=>'Prefund' , 'A/C No'=>33333 , 'Status'=>'Y', 'ZCurrency'
> =>'SEK'},
> 
> {'Type'=>'Receipt' , 'A/C No'=>78878 , 'Status'=>'N', 'ZCurrency'
> =>'AIR'},
> {'Type'=>'Receipt' , 'A/C No'=>32435 , 'Status'=>'N', 'ZCurrency'
> =>'AIR'},
> 
> {'Type'=>'Payment' , 'A/C No'=>56546 , 'Status'=>'N', 'ZCurrency'
> =>'EUR'},
> {'Type'=>'Payment' , 'A/C No'=>44444 , 'Status'=>'N', 'ZCurrency'
> =>'EUR'},
> {'Type'=>'Payment' , 'A/C No'=>43434 , 'Status'=>'N', 'ZCurrency'
> =>'EUR'},

You can help us help you by writing a complete, short-as-possible,
working program that demonstrates the problem you are having. In your
case, I would recommend testing with fewer and shorter data records
until you get the sort part working. Something like:

@records = (
  { a => 1, b => 1, c => 1, d => 1 },
  { a => 1, b => 1, c => 1, d => 2 },
  { a => 1, b => 1, c => 1, d => 3 },
);

(you get the picture, I hope).

Good luck.


------------------------------

Date: Wed, 1 Nov 2006 18:28:43 +0100
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: Perl equivalent to unix script
Message-Id: <slrnekhmab.j2n.hjp-usenet2@yoyo.hjp.at>

On 2006-11-01 04:45, Ignoramus18920 <ignoramus18920@NOSPAM.18920.invalid> wrote:
> On Wed, 01 Nov 2006 03:58:33 GMT, Mumia W. (reading news) <paduille.4060.mumia.w@earthlink.net> wrote:
>>> Quoth "Mike" <mikedawg@gmail.com>:
>>>> cat tempfile1 | sort > newfile2; rm tempfile1
>>
>> It looks like a one-liner in Perl.
>>
>> perldoc File::Slurp
>> perldoc -f sort
>>
>>
>
> Now try sorting a 4 GB file with that... Something that GNU sort can
> easily do...

Use the right tool for the job.

	hp


-- 
   _  | Peter J. Holzer    | > Why should one invent something that
|_|_) | Sysadmin WSR       | > does not exist?
| |   | hjp@hjp.at         | What else would be the point of inventing?
__/   | http://www.hjp.at/ |	-- P. Einstein and V. Gringmuth in desd


------------------------------

Date: Wed, 01 Nov 2006 10:29:02 -0600
From: l v <veatchla@yahoo.com>
Subject: Re: Putting a line in a specific place in a file
Message-Id: <12khiq6ekn22h83@news.supernews.com>

samasama wrote:
>   Hi... I need to place a line in a specific part of the file. I don't
> really know where to begin. Aside from feeding the file contents into
> an array?
> 
> 
> [base]
> name=CentOS-$releasever - Base
> mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os
> #baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
> gpgcheck=1
> gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-centos4
> exclude=httpd
> 
>  I need to find the [base] entry and then put the exclude= below the
> gpgkey=  line.
> 
> Any help is vastly appreciated. Any docs or turtorials about writing a
> parser would help too. 
> 
> Thanks
> 
> --
> samasama
> 

Looks like you are trying to edit an INI file.  If so, have you looked
at CPAN for INI-related modules, such as Config::INI::Simple?

-- 

Len


------------------------------

Date: 1 Nov 2006 09:24:46 -0800
From: "samasama" <bryan@worldspice.net>
Subject: Re: Putting a line in a specific place in a file
Message-Id: <1162401886.549087.281710@m73g2000cwd.googlegroups.com>


>
> Looks like you are trying to edit an INI file.  If so, have you looked
> at CPAN for INI related modules?  such as Config::INI::Simple.
>
> --

 Config file, yeah... I need to learn how to do this by hand though.
 The only part I'm stuck on is determining which line number my first
regex matches on and how many lines down my next regex matches.
 Something like:
use strict;
use Fcntl;

my $file = "/etc/yum.repos.d/CentOS-Base.repo";
sysopen( REPO_FILE, $file, O_RDONLY )
  || die "Can't open web03 passwd file!: $!\n";

my @file = <REPO_FILE>;

my ( $number1, $number2 );
foreach my $line (@file) {
    if ( $line =~ /\[base\]/i ) {
        print "Found $line on line $..\n";
    }
}

Except $. gives me the total number of lines, not the line number $line
is on. 

I think I'm going in the right direction? : ) 

--
samasama



------------------------------

Date: Wed, 01 Nov 2006 12:19:40 -0600
From: l v <veatchla@yahoo.com>
Subject: Re: Putting a line in a specific place in a file
Message-Id: <12khp9kfu59cu9d@news.supernews.com>

samasama wrote:
>> Looks like you are trying to edit an INI file.  If so, have you looked
>> at CPAN for INI related modules?  such as Config::INI::Simple.
>>
>> --
> 
>  Config file yeah... I need to learn how to do this by hand though.
>  The only part I'm stuck at is determing what line number my first
> regex is on and how many lines
> down to my next regex.
>  Something like:
> use strict;
> use Fcntl;
> 
> my $file = "/etc/yum.repos.d/CentOS-Base.repo";
> sysopen( REPO_FILE, $file, O_RDONLY )
>   || die "Can't open web03 passwd file!: $!\n";
> 
> my @file = <REPO_FILE>;
> 
> my ( $number1, $number2 );
> foreach my $line (@file) {
>     if ( $line =~ /\[base\]/i ) {
>         print "Found $line on line $..\n";
>     }
> }
> 
> Except $. gives me the total number of lines, not the line number $line
> is on. 

That would be because you are not looping through the file when you 
print $. .  You've slurped the whole file into @file and are looping 
through @file.  It's no surprise that $. is equal to the last line 
number of the input file.

You may want to loop over the file handle instead of over @file.  Or
loop through @file by index:
for my $i ( 0 .. $#file ) {
     # $i + 1 is the line number of $file[$i]
     if ( $file[$i] =~ ...
}
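
A minimal sketch of the file handle variant, which also does the
insertion asked about at the start of the thread. The helper name
insert_after is made up, the sample data is a shortened copy of the
posted file with a hypothetical [update] section added, and an
in-memory file handle stands in for the real repo file so that the
example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: copy lines from $fh, adding $new_line after the
# first line that matches $after_re inside the [$section] block.
sub insert_after {
    my ($fh, $section, $after_re, $new_line) = @_;
    my ($in_section, $done, @out) = (0, 0);
    while (my $line = <$fh>) {
        $line .= "\n" unless $line =~ /\n\z/;    # last line may lack a newline
        if ($line =~ /^\[(.+?)\]/) {             # any section header
            $in_section = lc($1) eq lc($section);
        }
        push @out, $line;
        if ($in_section && !$done && $line =~ $after_re) {
            # Reading from the handle keeps $. equal to the current line.
            print "Match on line $., inserting after it\n";
            push @out, "$new_line\n";
            $done = 1;
        }
    }
    return @out;
}

my $repo = <<'END';
[base]
name=CentOS-$releasever - Base
gpgcheck=1
gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-centos4

[update]
name=CentOS-$releasever - Updates
gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-centos4
END

open my $fh, '<', \$repo or die "Can't open in-memory file: $!";
my @out = insert_after($fh, 'base', qr/^gpgkey=/, 'exclude=httpd');
close $fh;
print @out;
```

Only the gpgkey= line inside [base] triggers the insertion; the
identical line under the other section is left alone because
$in_section is reset at every section header.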
-- 

Len


------------------------------

Date: 1 Nov 2006 08:10:23 -0800
From: mchesak@gmail.com
Subject: regular expression consecutive numbers or letters
Message-Id: <1162397423.311542.273000@b28g2000cwb.googlegroups.com>

I need a password validation routine.  The password must not contain 4
consecutive numbers or letters anywhere in it; for example, '1234' or
'abcd' would make it invalid.

       5678 is invalid
       mnop is invalid

Thanks



------------------------------

Date: Wed, 01 Nov 2006 16:35:14 +0000
From: David Squire <David.Squire@no.spam.from.here.au>
Subject: Re: regular expression consecutive numbers or letters
Message-Id: <eiaic3$64r$1@gemini.csx.cam.ac.uk>

mchesak@gmail.com wrote:
> I need password validation routine.  

So write one.

This is not a "free code" group. People here will be happy to help you
to improve your code, but not to write it from scratch.


DS


------------------------------

Date: 1 Nov 2006 09:35:10 -0800
From: mchesak@gmail.com
Subject: Re: regular expression consecutive numbers or letters
Message-Id: <1162402510.302613.159130@k70g2000cwa.googlegroups.com>


David Squire wrote:
> mchesak@gmail.com wrote:
> > I need password validation routine.
>
> So write one.
>
> This is not a "free code" group. People here will be happy to help you
> to improve your code, but not to write it from scratch.
>
>
> DS

If I knew how, I would.  I am not even sure where to start.  This seems
to be a common password validation issue and maybe someone has already
done it.  I could hack a barbaric routine to do this, but I was hoping
for a more elegant solution.  If someone would point me in the right
direction that would be helpful, something your comments are not.



------------------------------

Date: Wed, 01 Nov 2006 18:55:43 +0100
From: Mirco Wahab <wahab@chemie.uni-halle.de>
Subject: Re: regular expression consecutive numbers or letters
Message-Id: <eianb6$qop$1@mlucom4.urz.uni-halle.de>

Thus spoke mchesak@gmail.com (on 2006-11-01 18:35):
> David Squire wrote:
>> mchesak@gmail.com wrote:
>> > I need password validation routine.
>> So write one.
> If I knew how I would.  I am not even sure where to start.  

Start with a simple description of your
specification and refine it while you
try to understand it ...


   use strict;
   use warnings;

   my $no1 = '1234';
   my $no2 = 'abcd';
   my $ok1 = '1a2b';
   my $ok2 = 'c2d4';

   my $rgx = qr/[0-9]{4}|[A-Za-z]{4}/;  # [A-z] would also match [ \ ] ^ _ `

   print "ok: $no1\n" if $no1 !~ /$rgx/;
   print "ok: $no2\n" if $no2 !~ /$rgx/;
   print "ok: $ok1\n" if $ok1 !~ /$rgx/;
   print "ok: $ok2\n" if $ok2 !~ /$rgx/;
   ...


Now add a check to ensure the password is *more*
than 4 characters ...

Regards

Mirco



------------------------------

Date: Wed, 01 Nov 2006 18:03:30 +0000
From: David Squire <David.Squire@no.spam.from.here.au>
Subject: Re: regular expression consecutive numbers or letters
Message-Id: <eianhi$gvr$1@gemini.csx.cam.ac.uk>

mchesak@gmail.com wrote:
> David Squire wrote:
>> mchesak@gmail.com wrote:
>>> I need password validation routine.
>> So write one.
>>
>> This is not a "free code" group. People here will be happy to help you
>> to improve your code, but not to write it from scratch.
> 
> If I knew how I would.  I am not even sure where to start.  This seems
> to be a common password validation issue and maybe some one has already
> done it.   I could hack a barbaric routine to do this but I was hoping
> was something more elegant soultion.  If some one would point me in the
> right direction that would be helpfull,

Have you gone to CPAN and searched for "password"? There is a module
there that does exactly what you want. Please search the standard Perl
resources before asking here.

> something your comments are not.

Not true. If you want to get help here, you need to learn how the group
works. See the posting guidelines that are regularly posted here, and
also available at
http://www.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html


DS



------------------------------

Date: Wed, 01 Nov 2006 11:11:23 -0700
From: Scott Bryce <sbryce@scottbryce.com>
Subject: Re: regular expression consecutive numbers or letters
Message-Id: <WMWdnZRNZ_jIftXYnZ2dnUVZ_uidnZ2d@comcast.com>

Mirco Wahab wrote:

>    my $no1 = '1234';
>    my $no2 = 'abcd';
>    my $ok1 = '1a2b';
>    my $ok2 = 'c2d4';

If I understood the spec correctly, '1234' is not OK, '1235' is OK. But 
is 'qwert' OK? Or 'asdf'?


------------------------------

Date: 1 Nov 2006 10:17:16 -0800
From: mchesak@gmail.com
Subject: Re: regular expression consecutive numbers or letters
Message-Id: <1162405036.345203.70910@f16g2000cwb.googlegroups.com>


Mirco Wahab wrote:
> Thus spoke mchesak@gmail.com (on 2006-11-01 18:35):
> > David Squire wrote:
> >> mchesak@gmail.com wrote:
> >> > I need password validation routine.
> >> So write one.
> > If I knew how I would.  I am not even sure where to start.
>
> Start with a simple description of your
> specification and refine it while you
> try to understand it ...
>
>
>    use strict;
>    use warnings;
>
>    my $no1 = '1234';
>    my $no2 = 'abcd';
>    my $ok1 = '1a2b';
>    my $ok2 = 'c2d4';
>
>    my $rgx = qr/[0-9]{4}|[A-z]{4}/;
>
>    print "ok: $no1\n" if $no1 !~ /$rgx/;
>    print "ok: $no2\n" if $no2 !~ /$rgx/;
>    print "ok: $ok1\n" if $ok1 !~ /$rgx/;
>    print "ok: $ok2\n" if $ok2 !~ /$rgx/;
>    ...
>
>
> Now add a check to ensure the password is *more*
> than 4 characters ...
>
> Regards
> 
> Mirco
How simple and elegant, thanks for the lesson in Perl.



------------------------------

Date: Wed, 01 Nov 2006 18:30:47 +0000
From: David Squire <David.Squire@no.spam.from.here.au>
Subject: Re: regular expression consecutive numbers or letters
Message-Id: <eiap4n$k4b$1@gemini.csx.cam.ac.uk>

mchesak@gmail.com wrote:

> Mirco Wahab wrote:

>> Start with a simple description of your
>> specification and refine it while you
>> try to understand it ...
>>
>>
>>    use strict;
>>    use warnings;
>>
>>    my $no1 = '1234';
>>    my $no2 = 'abcd';
>>    my $ok1 = '1a2b';
>>    my $ok2 = 'c2d4';
>>
>>    my $rgx = qr/[0-9]{4}|[A-z]{4}/;
>>
>>    print "ok: $no1\n" if $no1 !~ /$rgx/;
>>    print "ok: $no2\n" if $no2 !~ /$rgx/;
>>    print "ok: $ok1\n" if $ok1 !~ /$rgx/;
>>    print "ok: $ok2\n" if $ok2 !~ /$rgx/;
>>    ...
>>
>>
>> Now add a check to ensure the password is *more*
>> than 4 characters ...

> How simple and elegant, thanks for the lesson in Perl.

 ... except that it doesn't meet your spec., as made clear by this version:

----

#!/usr/bin/perl

use strict;
use warnings;


my $rgx = qr/[0-9]{4}|[A-z]{4}/;

while (<DATA>) {
	chomp;
	print "$_: ";
	if (/$rgx/) {
		print "bad\n";
	}
	else {
		print "good\n";
	}
}

__DATA__
1234
abcd
1235
1a2b
c2d4
abqk

----

Output:

1234: bad
abcd: bad
1235: bad
1a2b: good
c2d4: good
abqk: bad

----

Both '1235' and 'abqk' should be 'good' according to your spec.
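
For illustration only, here is a minimal sketch of a check that does treat
'1235' and 'abqk' as good. It assumes the spec means "no 4 or more
characters in a row whose character codes each increase by exactly 1". The
helper name has_run and the default run length of 4 are hypothetical, not
taken from the thread. Note that the check works on raw ord values, so it
only catches ascending runs and would also flag oddities such as 'YZ[\'.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helper: returns 1 if $str contains $len or more characters
# whose ord values each increase by exactly 1 (e.g. '1234', 'abcd').
# Assumption: only ascending runs count; descending runs are not checked.
sub has_run {
    my ($str, $len) = @_;
    $len = 4 unless defined $len;

    my @codes = map { ord($_) } split //, $str;
    my $run   = 1;    # length of the current ascending run

    for my $i (1 .. $#codes) {
        # Extend the run if this character follows the previous one by
        # value, otherwise start a new run at this character.
        $run = ($codes[$i] == $codes[$i - 1] + 1) ? $run + 1 : 1;
        return 1 if $run >= $len;
    }
    return 0;
}

for my $pw (qw(1234 abcd 1235 1a2b c2d4 abqk)) {
    printf "%s: %s\n", $pw, has_run($pw) ? 'bad' : 'good';
}
```

With the same test data as above, this reports only '1234' and 'abcd' as
bad. Separately, the character class [A-z] in the original regex also
matches the punctuation characters that sit between 'Z' and 'a' in ASCII,
so [A-Za-z] is probably what was intended there.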

Go to CPAN and check out Data::Password. It does all you want and more.


DS


------------------------------

Date: Wed, 01 Nov 2006 19:55:50 +0100
From: Mirco Wahab <wahab@chemie.uni-halle.de>
Subject: Re: regular expression consecutive numbers or letters
Message-Id: <eiaqrt$rq3$1@mlucom4.urz.uni-halle.de>

Thus spoke David Squire (on 2006-11-01 19:30):

>> How simple and elegant, thanks for the lesson in Perl.
> 
> ... except that it doesn't meet your spec., as made clear by this version:

You are right. I read it as 'successive characters of the same type',
but what is clearly meant is 'consecutive by value (ord)'.

This is an interesting question. It should be solvable
by regex, e.g. something like:

 ...
 my (@k, $i);
 ...
 my $rgx = qr/ ( [\w\d]{4} )
               (??{  $i  = 0;
                     @k  = split '', $1;
                     $i += (ord $k[-2]) - (ord $_) for @k;
                    ($i-2) ? '?' : '?!'
               })/x;
 ..
but my Perl (5.8.8/Win32) crashes (on match) here ...


WTF
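
As a possible alternative that stays with a plain regex and avoids
(??{ ... }) entirely, one could generate every forbidden 4-character run
up front and join them into one alternation. This is only a sketch under
the assumption that "consecutive" means ascending runs within 0-9, a-z
or A-Z; the variable names @runs and $run_rgx are made up for this example.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Build the list of all ascending 4-character runs within each set,
# e.g. '0123' .. '6789', 'abcd' .. 'wxyz', 'ABCD' .. 'WXYZ'.
my @runs;
for my $set ([ '0' .. '9' ], [ 'a' .. 'z' ], [ 'A' .. 'Z' ]) {
    for my $i (0 .. $#$set - 3) {
        push @runs, join '', @{$set}[ $i .. $i + 3 ];
    }
}

# The runs contain only letters and digits, so no quoting is needed.
my $alternation = join '|', @runs;
my $run_rgx     = qr/$alternation/;

for my $pw (qw(1234 abcd 1235 1a2b c2d4 abqk)) {
    printf "%s: %s\n", $pw, ($pw =~ $run_rgx) ? 'bad' : 'good';
}
```

The alternation is long but static, so it is compiled once and needs no
code blocks inside the pattern. Descending runs ('4321') or keyboard
sequences ('asdf') could be covered by pushing more strings onto the list.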

Regards

Mirco




------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 9916
***************************************
