[30641] in Perl-Users-Digest
Perl-Users Digest, Issue: 1886 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Sep 29 14:09:56 2008
Date: Mon, 29 Sep 2008 11:09:15 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Mon, 29 Sep 2008 Volume: 11 Number: 1886
Today's topics:
Re: Any thoughts on catalyst? <walterbyrd@iname.com>
File edits in a Perlish way <pgodfrin@gmail.com>
Re: File edits in a Perlish way <cartercc@gmail.com>
Re: File edits in a Perlish way <jurgenex@hotmail.com>
Help: Duplicate and Unique Lines Problem <openlinuxsource@gmail.com>
Re: Help: Duplicate and Unique Lines Problem <bugbear@trim_papermule.co.uk_trim>
Re: Help: Duplicate and Unique Lines Problem <openlinuxsource@gmail.com>
Re: Help: Duplicate and Unique Lines Problem <peter@makholm.net>
Re: Help: Duplicate and Unique Lines Problem <openlinuxsource@gmail.com>
Re: Help: Duplicate and Unique Lines Problem <bart.lateur@pandora.be>
Re: Help: Duplicate and Unique Lines Problem <openlinuxsource@gmail.com>
Re: Help: Duplicate and Unique Lines Problem <openlinuxsource@gmail.com>
Re: Help: Duplicate and Unique Lines Problem <ben@morrow.me.uk>
Re: Help: Duplicate and Unique Lines Problem <openlinuxsource@gmail.com>
Re: Help: Duplicate and Unique Lines Problem <RedGrittyBrick@spamweary.invalid>
Re: Help: Duplicate and Unique Lines Problem <RedGrittyBrick@spamweary.invalid>
Re: Help: Duplicate and Unique Lines Problem <RedGrittyBrick@spamweary.invalid>
Re: Help: Duplicate and Unique Lines Problem <openlinuxsource@gmail.com>
Re: Help: Duplicate and Unique Lines Problem <jurgenex@hotmail.com>
Re: Help: Duplicate and Unique Lines Problem <bart.lateur@pandora.be>
Hot Linux Admin Requirement for ADP at NJ for 6+ months <indiana.123@gmail.com>
Re: How to unable the use of tainted mode in a CGI scri <azol@non-non-non>
Re: IPC:Shareable <clauskick@hotmail.com>
Repeating characters <jwcarlton@gmail.com>
Re: Repeating characters <jwcarlton@gmail.com>
Re: Repeating characters <hjp-usenet2@hjp.at>
Re: Repeating characters <JustMe@somewhere.de>
Re: Repeating characters <jwcarlton@gmail.com>
Re: Replacing binary data containing & using perl in HP <jurgenex@hotmail.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Mon, 29 Sep 2008 07:51:22 -0700 (PDT)
From: walterbyrd <walterbyrd@iname.com>
Subject: Re: Any thoughts on catalyst?
Message-Id: <8055becb-41ac-4da6-89eb-c4da009f19ff@25g2000prz.googlegroups.com>
> What are you searching for? When I google "perl catalyst", the top hit
> is
>
>    http://www.catalystframework.org/
>
> Which I think answers all of your questions.
I am interested in a review of Catalyst by somebody who has used
Catalyst as well as other frameworks.
------------------------------
Date: Mon, 29 Sep 2008 10:08:43 -0700 (PDT)
From: pgodfrin <pgodfrin@gmail.com>
Subject: File edits in a Perlish way
Message-Id: <4bd6b33f-9bd9-42d2-a606-510d628dc986@f36g2000hsa.googlegroups.com>
I need to apply a series of "edits" to a file in a somewhat
controlled manner.
My pseudo code would be something like:
1. accept as input the file name, old text, new text
2. check existence in file for old text, if not found error out
3. if found, then show before and after text for the whole line
4. Write out file.
I'm currently using @ARGV for #1 and I intend to slurp the whole file
into an array and then use map for #4.
I'm looking for ideas on a "perlish" way to do #2 and #3. The grep
function lets me show before and after like so:
print grep(/$oldtx/,@oldfile);
But I haven't checked for the existence of $oldtx yet, so a typo
would make this print statement print nothing. I could always use
loops and other brute-force methods - but it seems like there may be a
cooler, Perlish way to do this... Any ideas?
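For concreteness, here's a rough, untested sketch of the shape I have
in mind (the names are just placeholders):

    # apply one edit, showing the affected lines before and after
    my ($file, $oldtx, $newtx) = @ARGV;

    open my $in, '<', $file or die "can't read '$file': $!";
    my @lines = <$in>;
    close $in;

    grep { /\Q$oldtx\E/ } @lines
        or die "'$oldtx' not found in $file\n";

    print "before: $_" for grep { /\Q$oldtx\E/ } @lines;
    s/\Q$oldtx\E/$newtx/g for @lines;    # edit the array in place
    print "after:  $_" for grep { /\Q$newtx\E/ } @lines;

    open my $out, '>', $file or die "can't write '$file': $!";
    print {$out} @lines;
    close $out;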
phil
------------------------------
Date: Mon, 29 Sep 2008 10:32:42 -0700 (PDT)
From: cartercc <cartercc@gmail.com>
Subject: Re: File edits in a Perlish way
Message-Id: <29713740-b67d-4682-9ab2-26a81830d93d@y38g2000hsy.googlegroups.com>
On Sep 29, 1:08 pm, pgodfrin <pgodf...@gmail.com> wrote:
> 1. accept as input the file name, old text, new text
print "Enter file name:";
my $filename = <STDIN>;
chomp $filename;
print "Enter old text:";
my $oldtext = <STDIN>;
chomp $oldtext;
print "Enter new text:";
my $newtext = <STDIN>;
chomp $newtext;
> 2. check existence in file for old text, if not found error out
open INFILE, "<$filename" or die "INFILE ERROR, $!";
open OUTFILE, ">outfile.txt" or die "OUTFILE ERROR, $!";
> 3. if found, then show before and after text for the whole line
> 4. Write out file.
while(<INFILE>)
{
print OUTFILE if /$oldtext/;
# maybe this as well ...
# next unless /$oldtext/;
# $_ =~ s/$oldtext/$newtext/;
# print;
}
close INFILE;
close OUTFILE;
CC
------------------------------
Date: Mon, 29 Sep 2008 10:44:52 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: File edits in a Perlish way
Message-Id: <ub42e4d90vdp53skps20g4t7ns0b0k7uu8@4ax.com>
pgodfrin <pgodfrin@gmail.com> wrote:
>I need to apply a series of "edits" to a file in a somewhat
>controlled manner.
Are those edits limited to one line at a time (a), or can a single edit
spread across multiple lines (b)?
>My pseudo code would be something like:
>1. accept as input the file name, old text, new text
>2. check existence in file for old text, if not found error out
>3. if found, then show before and after text for the whole line
>4. Write out file.
>
>I'm currently using @ARGV for #1
Ok
>and I intend to slurp the whole file into an array
If (a), then it is usually better to process the file line by line
instead of slurping it in all at once.
If (b), then manipulating the file in memory is probably easier.
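For (b), the slurp could look roughly like this (untested, using your
variable names):

    # read the whole file into one string and edit it in memory
    open my $fh, '<', $file or die "can't read '$file': $!";
    my $text = do { local $/; <$fh> };   # undef $/ = slurp mode
    close $fh;

    $text =~ s/\Q$oldtext\E/$newtext/g;  # \Q..\E: treat old text literally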
>and then use map for #4.
Most people would use print() to write something.
>I'm looking for ideas on a "perlish" way to do #2
perldoc -f index
>and #3.
if (index($line, $oldtext) > -1) {
    print $line;
    apply_edit($line, $oldtext, $newtext);
    print $line;
}
>The grep
>function lets me show before and after like so:
>
> print grep(/$oldtx/,@oldfile);
>
>But I haven't checked for the existence of $oldtx yet, so a typo
>would make this print statement print nothing.
I have no idea what you are trying to do with that code snippet. It
doesn't look like anything related to "show before and after text".
jue
------------------------------
Date: Mon, 29 Sep 2008 21:15:40 +0800
From: Amy Lee <openlinuxsource@gmail.com>
Subject: Help: Duplicate and Unique Lines Problem
Message-Id: <pan.2008.09.29.13.15.21.483@gmail.com>
Hello,
Does Perl have functions like the UNIX commands sort and uniq that can
output duplicate lines and unique lines?
Here's my code; when I run it, it outputs many lines, but I want each
duplicated line printed just once, along with the unique lines.
while (<>)
{
if (/^\>.*/)
{
s/\>//g;
if (/\w+\s\w+\s(.*)\smiR.*\s\w+/g)
{
print "$1\n";
}
}
}
The output is like this:
......
Homo sapiens
Homo sapiens
Homo sapiens
Homo sapiens
Homo sapiens
Homo sapiens
Homo sapiens
Caenorhabditis elegans
Caenorhabditis elegans
Caenorhabditis elegans
Caenorhabditis elegans
Mus musculus
Mus musculus
Mus musculus
Mus musculus
Mus musculus
Mus musculus
Mus musculus
Arabidopsis thaliana
........
And my purpose is that the output should be like this:
.......
Homo sapiens
Caenorhabditis elegans
Mus musculus
Arabidopsis thaliana
.......
Thank you very much~
Best Regards,
Amy Lee
------------------------------
Date: Mon, 29 Sep 2008 14:17:16 +0100
From: bugbear <bugbear@trim_papermule.co.uk_trim>
Subject: Re: Help: Duplicate and Unique Lines Problem
Message-Id: <ebednf0PxNHBSH3VnZ2dnUVZ8umdnZ2d@posted.plusnet>
Amy Lee wrote:
> Hello,
>
> Does Perl have functions like the UNIX commands sort and uniq that can
> output duplicate lines and unique lines?
>
> Here's my code; when I run it, it outputs many lines, but I want each
> duplicated line printed just once, along with the unique lines.
>
> while (<>)
> {
> if (/^\>.*/)
> {
> s/\>//g;
> if (/\w+\s\w+\s(.*)\smiR.*\s\w+/g)
> {
> print "$1\n";
> }
> }
> }
If you're running on *NIX, just pipe your script to sort/uniq and you're done.
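For example (whatever your script and input file are actually called):

    perl yourscript.pl input.txt | sort | uniq

or use "sort -u" instead of "sort | uniq".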
BugBear
------------------------------
Date: Mon, 29 Sep 2008 21:29:21 +0800
From: Amy Lee <openlinuxsource@gmail.com>
Subject: Re: Help: Duplicate and Unique Lines Problem
Message-Id: <pan.2008.09.29.13.29.20.683090@gmail.com>
On Mon, 29 Sep 2008 14:17:16 +0100, bugbear wrote:
> Amy Lee wrote:
>> Hello,
>>
>> Does Perl have functions like the UNIX commands sort and uniq that can
>> output duplicate lines and unique lines?
>>
>> Here's my code; when I run it, it outputs many lines, but I want each
>> duplicated line printed just once, along with the unique lines.
>>
>> while (<>)
>> {
>> if (/^\>.*/)
>> {
>> s/\>//g;
>> if (/\w+\s\w+\s(.*)\smiR.*\s\w+/g)
>> {
>> print "$1\n";
>> }
>> }
>> }
>
> If you're running on *NIX, just pipe your script to sort/uniq and you're done.
>
> BugBear
Thank you, but I hope to make it more convenient, so that I can put the
code into another Perl script.
Regards,
Amy Lee
------------------------------
Date: Mon, 29 Sep 2008 15:28:51 +0200
From: Peter Makholm <peter@makholm.net>
Subject: Re: Help: Duplicate and Unique Lines Problem
Message-Id: <87od2713os.fsf@hacking.dk>
Amy Lee <openlinuxsource@gmail.com> writes:
> Does Perl have functions like the UNIX commands sort and uniq that can
> output duplicate lines and unique lines?
There is a uniq function in the List::MoreUtils module; otherwise the
standard way is to use the printed strings as keys in a hash to mark
which lines have already been printed.
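Roughly, assuming the chomped lines are in @lines (untested), either:

    use List::MoreUtils qw(uniq);
    print "$_\n" for uniq @lines;

or, with the hash:

    my %seen;
    for my $line (@lines) {
        print "$line\n" unless $seen{$line}++;
    }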
//Makholm
------------------------------
Date: Mon, 29 Sep 2008 22:31:47 +0800
From: Amy Lee <openlinuxsource@gmail.com>
Subject: Re: Help: Duplicate and Unique Lines Problem
Message-Id: <pan.2008.09.29.14.31.46.3357@gmail.com>
On Mon, 29 Sep 2008 15:28:51 +0200, Peter Makholm wrote:
> Amy Lee <openlinuxsource@gmail.com> writes:
>
>> Does Perl have functions like the UNIX commands sort and uniq that can
>> output duplicate lines and unique lines?
>
> There is a uniq function in the List::MoreUtils module; otherwise the
> standard way is to use the printed strings as keys in a hash to mark
> which lines have already been printed.
>
> //Makholm
Hello,
I used the List::MoreUtils module to process the file, but it still
failed and output just the last line. Here's my code.
use List::MoreUtils qw(any all none notall true false firstidx first_index
lastidx last_index insert_after insert_after_string
apply after after_incl before before_incl indexes
firstval first_value lastval last_value each_array
each_arrayref pairwise natatime mesh zip uniq minmax);
$file = $ARGV[0];
open FILE, '<', "$file";
while (<FILE>)
{
@raw_list = split /\n/, $_;
}
@list = uniq @raw_list;
foreach $single (@list)
{
print "$single\n";
}
Thank you very much.
Regards,
Amy
------------------------------
Date: Mon, 29 Sep 2008 16:54:15 +0200
From: Bart Lateur <bart.lateur@pandora.be>
Subject: Re: Help: Duplicate and Unique Lines Problem
Message-Id: <p2p1e4ta3udlru6l95d3up6ccltstvktng@4ax.com>
Amy Lee wrote:
>Does Perl have functions like the UNIX commands sort and uniq that can
>output duplicate lines and unique lines?
Perl has a built-in sort, and unique can be implemented with a few lines
of code. They're even in the official FAQ:
perlfaq4: How can I remove duplicate elements from a list or
array?
http://perldoc.perl.org/perlfaq4.html#How-can-I-remove-duplicate-elements-from-a-list-or-array%3f
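The heart of that FAQ answer is the %seen idiom, roughly:

    my %seen;
    my @unique = grep { !$seen{$_}++ } @list;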
--
Bart.
------------------------------
Date: Mon, 29 Sep 2008 23:16:14 +0800
From: Amy Lee <openlinuxsource@gmail.com>
Subject: Re: Help: Duplicate and Unique Lines Problem
Message-Id: <pan.2008.09.29.15.16.13.522397@gmail.com>
On Mon, 29 Sep 2008 16:54:15 +0200, Bart Lateur wrote:
> Amy Lee wrote:
>
>>Does Perl have functions like the UNIX commands sort and uniq that can
>>output duplicate lines and unique lines?
>
> Perl has a built-in sort, and unique can be implemented with a few lines
> of code. They're even in the official FAQ:
>
> perlfaq4: How can I remove duplicate elements from a list or
> array?
>
> http://perldoc.perl.org/perlfaq4.html#How-can-I-remove-duplicate-elements-from-a-list-or-array%3f
Thanks, but my problem seems a little strange, because I don't know
whether the uniq function can process a list such as @list. When I use
uniq on it, I just see the last line of the file.
Amy
------------------------------
Date: Mon, 29 Sep 2008 23:18:31 +0800
From: Amy Lee <openlinuxsource@gmail.com>
Subject: Re: Help: Duplicate and Unique Lines Problem
Message-Id: <pan.2008.09.29.15.18.30.751390@gmail.com>
On Mon, 29 Sep 2008 16:54:15 +0200, Bart Lateur wrote:
> Amy Lee wrote:
>
>>Does Perl have functions like the UNIX commands sort and uniq that can
>>output duplicate lines and unique lines?
>
> Perl has a built-in sort, and unique can be implemented with a few lines
> of code. They're even in the official FAQ:
>
> perlfaq4: How can I remove duplicate elements from a list or
> array?
>
> http://perldoc.perl.org/perlfaq4.html#How-can-I-remove-duplicate-elements-from-a-list-or-array%3f
Here's the code:
open FILE, '<', "$file";
while (<FILE>)
{
@raw_list = split /\n/, $_;
@list = uniq (@raw_list);
print "@list\n";
}
It seems that uniq does nothing! I don't know why.
Amy
------------------------------
Date: Mon, 29 Sep 2008 16:29:26 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Help: Duplicate and Unique Lines Problem
Message-Id: <mr87r5-vg2.ln1@osiris.mauzo.dyndns.org>
Quoth Amy Lee <openlinuxsource@gmail.com>:
> On Mon, 29 Sep 2008 15:28:51 +0200, Peter Makholm wrote:
>
> > Amy Lee <openlinuxsource@gmail.com> writes:
> >
> >> Does Perl have functions like the UNIX commands sort and uniq that
> >> can output duplicate lines and unique lines?
> >
> > There is a uniq function in the List::MoreUtils module; otherwise the
> > standard way is to use the printed strings as keys in a hash to mark
> > which lines have already been printed.
>
> I used the List::MoreUtils module to process the file, but it still
> failed and output just the last line. Here's my code.
>
> use List::MoreUtils qw(any all none notall true false firstidx first_index
> lastidx last_index insert_after insert_after_string
> apply after after_incl before before_incl indexes
> firstval first_value lastval last_value each_array
> each_arrayref pairwise natatime mesh zip uniq
> minmax);
Don't import more than you need.
use List::MoreUtils qw(uniq);
> $file = $ARGV[0];
Your script should start with
use warnings;
use strict;
which will mean you need 'my' on all your variables
my $file = $ARGV[0];
> open FILE, '<', "$file";
Use lexical filehandles.
Always check the return value of open.
Don't quote things when you don't need to.
open my $FILE, '<', $file
or die "can't read '$file': $!";
> while (<FILE>)
> {
> @raw_list = split /\n/, $_;
while (<FILE>) reads the file one line at a time. You then split that
line on /\n/ (which won't do anything except remove the trailing
newline, since it's just a single line) and replace the contents of
@raw_list with the result. This means @raw_list never has more than one
element (the last line read).
Since you want to keep all the lines, either push them onto the array:
while (<$FILE>) {
chomp; # remove the newline
push @raw_list, $_;
}
or, better, use <> in list context, which returns all the lines:
my @raw_list = <$FILE>;
chomp @raw_list; # remove all the newlines at once
> }
> @list = uniq @raw_list;
> foreach $single (@list)
> {
> print "$single\n";
Ben
--
Outside of a dog, a book is a man's best friend.
Inside of a dog, it's too dark to read.
ben@morrow.me.uk Groucho Marx
------------------------------
Date: Mon, 29 Sep 2008 23:49:30 +0800
From: Amy Lee <openlinuxsource@gmail.com>
Subject: Re: Help: Duplicate and Unique Lines Problem
Message-Id: <pan.2008.09.29.15.49.27.690264@gmail.com>
On Mon, 29 Sep 2008 16:29:26 +0100, Ben Morrow wrote:
>
> Quoth Amy Lee <openlinuxsource@gmail.com>:
>> On Mon, 29 Sep 2008 15:28:51 +0200, Peter Makholm wrote:
>>
>> > Amy Lee <openlinuxsource@gmail.com> writes:
>> >
>> >> Does Perl have functions like the UNIX commands sort and uniq that
>> >> can output duplicate lines and unique lines?
>> >
>> > There is a uniq function in the List::MoreUtils module; otherwise the
>> > standard way is to use the printed strings as keys in a hash to mark
>> > which lines have already been printed.
>>
>> I used the List::MoreUtils module to process the file, but it still
>> failed and output just the last line. Here's my code.
>>
>> use List::MoreUtils qw(any all none notall true false firstidx first_index
>> lastidx last_index insert_after insert_after_string
>> apply after after_incl before before_incl indexes
>> firstval first_value lastval last_value each_array
>> each_arrayref pairwise natatime mesh zip uniq
>> minmax);
>
> Don't import more than you need.
>
> use List::MoreUtils qw(uniq);
>
>> $file = $ARGV[0];
>
> Your script should start with
>
> use warnings;
> use strict;
>
> which will mean you need 'my' on all your variables
>
> my $file = $ARGV[0];
>
>> open FILE, '<', "$file";
>
> Use lexical filehandles.
> Always check the return value of open.
> Don't quote things when you don't need to.
>
> open my $FILE, '<', $file
> or die "can't read '$file': $!";
>
>> while (<FILE>)
>> {
>> @raw_list = split /\n/, $_;
>
> while (<FILE>) reads the file one line at a time. You then split that
> line on /\n/ (which won't do anything except remove the trailing
> newline, since it's just a single line) and replace the contents of
> @raw_list with the result. This means @raw_list never has more than one
> element (the last line read).
>
> Since you want to keep all the lines, either push them onto the array:
>
> while (<$FILE>) {
> chomp; # remove the newline
> push @raw_list, $_;
> }
>
> or, better, use <> in list context, which returns all the lines:
>
> my @raw_list = <$FILE>;
> chomp @raw_list; # remove all the newlines at once
>
>> }
>> @list = uniq @raw_list;
>> foreach $single (@list)
>> {
>> print "$single\n";
>
> Ben
Thank you very much. I have solved it using your method.
Best Regards,
Amy
------------------------------
Date: Mon, 29 Sep 2008 16:53:57 +0100
From: RedGrittyBrick <RedGrittyBrick@spamweary.invalid>
Subject: Re: Help: Duplicate and Unique Lines Problem
Message-Id: <48e0fa18$0$2500$da0feed9@news.zen.co.uk>
Amy Lee wrote:
> Hello,
>
> Does Perl have functions like the UNIX commands sort and uniq that can
> output duplicate lines and unique lines?
>
> Here's my code; when I run it, it outputs many lines, but I want each
> duplicated line printed just once, along with the unique lines.
>
#!/usr/bin/perl
use strict;
use warnings;
my %seen;
for(sort <DATA>) {
chomp;
if (/(\w+\s+\w+\s+)/) {
print "$1\n" unless $seen{$1}++;
}
}
__END__
Homo sapiens E
Homo sapiens D
Arabidopsis thaliana S
Homo sapiens G
Mus musculus P
Mus musculus Q
Mus musculus R
Homo sapiens F
Caenorhabditis elegans H
Caenorhabditis elegans I
Homo sapiens A
Homo sapiens B
Homo sapiens C
Caenorhabditis elegans J
Mus musculus L
Mus musculus O
Mus musculus M
Mus musculus N
Caenorhabditis elegans K
--
RGB
------------------------------
Date: Mon, 29 Sep 2008 16:59:28 +0100
From: RedGrittyBrick <RedGrittyBrick@spamweary.invalid>
Subject: Re: Help: Duplicate and Unique Lines Problem
Message-Id: <48e0fb63$0$26077$db0fefd9@news.zen.co.uk>
RedGrittyBrick wrote:
>
> Amy Lee wrote:
>> Hello,
>>
>> Does Perl have functions like the UNIX commands sort and uniq that can
>> output duplicate lines and unique lines?
>>
>> Here's my code; when I run it, it outputs many lines, but I want each
>> duplicated line printed just once, along with the unique lines.
>>
>
> #!/usr/bin/perl
> use strict;
> use warnings;
>
> my %seen;
> for(sort <DATA>) {
> chomp;
> if (/(\w+\s+\w+\s+)/) {
> print "$1\n" unless $seen{$1}++;
> }
> }
>
P.S. For large amounts of data I'd prefer
#!/usr/bin/perl
use strict;
use warnings;
my %seen;
my @uniq;
for(<DATA>) {
chomp;
if (/(\w+\s+\w+\s+)/) {
push @uniq, "$1\n" unless $seen{$1}++;
}
}
print sort @uniq;
__END__
Homo sapiens E
Homo sapiens D
Arabidopsis thaliana S
Homo sapiens G
Mus musculus P
Mus musculus Q
Mus musculus R
Homo sapiens F
Caenorhabditis elegans H
Caenorhabditis elegans I
Homo sapiens A
Homo sapiens B
Homo sapiens C
Caenorhabditis elegans J
Mus musculus L
Mus musculus O
Mus musculus M
Mus musculus N
Caenorhabditis elegans K
--
RGB
------------------------------
Date: Mon, 29 Sep 2008 17:03:50 +0100
From: RedGrittyBrick <RedGrittyBrick@spamweary.invalid>
Subject: Re: Help: Duplicate and Unique Lines Problem
Message-Id: <48e0fc69$0$2918$fa0fcedb@news.zen.co.uk>
RedGrittyBrick wrote:
>
> RedGrittyBrick wrote:
>>
>> Amy Lee wrote:
>>> Hello,
>>>
>>> Does Perl have functions like the UNIX commands sort and uniq that can
>>> output duplicate lines and unique lines?
>>>
>>> Here's my code; when I run it, it outputs many lines, but I want each
>>> duplicated line printed just once, along with the unique lines.
>>>
>>
>> #!/usr/bin/perl
>> use strict;
>> use warnings;
>>
>> my %seen;
>> for(sort <DATA>) {
>> chomp;
>> if (/(\w+\s+\w+\s+)/) {
>> print "$1\n" unless $seen{$1}++;
>> }
>> }
>>
>
> P.S. For large amounts of data I'd prefer
>
> #!/usr/bin/perl
> use strict;
> use warnings;
> my %seen;
> my @uniq;
> for(<DATA>) {
> chomp;
> if (/(\w+\s+\w+\s+)/) {
> push @uniq, "$1\n" unless $seen{$1}++;
> }
> }
> print sort @uniq;
#!/usr/bin/perl
use strict;
use warnings;
my %seen;
for(<DATA>) {
chomp;
if (/(\w+\s+\w+\s+)/) {
$seen{"$1\n"}++;
}
}
print sort keys %seen;
This is a deep hole I've dug myself into :-)
--
RGB
------------------------------
Date: Tue, 30 Sep 2008 00:04:45 +0800
From: Amy Lee <openlinuxsource@gmail.com>
Subject: Re: Help: Duplicate and Unique Lines Problem
Message-Id: <pan.2008.09.29.16.04.44.242004@gmail.com>
On Mon, 29 Sep 2008 16:59:28 +0100, RedGrittyBrick wrote:
>
> RedGrittyBrick wrote:
>>
>> Amy Lee wrote:
>>> Hello,
>>>
>>> Does Perl have functions like the UNIX commands sort and uniq that can
>>> output duplicate lines and unique lines?
>>>
>>> Here's my code; when I run it, it outputs many lines, but I want each
>>> duplicated line printed just once, along with the unique lines.
>>>
>>
>> #!/usr/bin/perl
>> use strict;
>> use warnings;
>>
>> my %seen;
>> for(sort <DATA>) {
>> chomp;
>> if (/(\w+\s+\w+\s+)/) {
>> print "$1\n" unless $seen{$1}++;
>> }
>> }
>>
>
> P.S. For large amounts of data I'd prefer
>
> #!/usr/bin/perl
> use strict;
> use warnings;
> my %seen;
> my @uniq;
> for(<DATA>) {
> chomp;
> if (/(\w+\s+\w+\s+)/) {
> push @uniq, "$1\n" unless $seen{$1}++;
> }
> }
> print sort @uniq;
>
>
> __END__
> Homo sapiens E
> Homo sapiens D
> Arabidopsis thaliana S
> Homo sapiens G
> Mus musculus P
> Mus musculus Q
> Mus musculus R
> Homo sapiens F
> Caenorhabditis elegans H
> Caenorhabditis elegans I
> Homo sapiens A
> Homo sapiens B
> Homo sapiens C
> Caenorhabditis elegans J
> Mus musculus L
> Mus musculus O
> Mus musculus M
> Mus musculus N
> Caenorhabditis elegans K
Thank you very much!
Regards,
Amy
------------------------------
Date: Mon, 29 Sep 2008 09:34:59 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: Help: Duplicate and Unique Lines Problem
Message-Id: <rp02e41jkjugqpbtgt31gjr19javorfa3v@4ax.com>
Amy Lee <openlinuxsource@gmail.com> wrote:
>Does Perl have functions like the UNIX commands sort
What does 'perldoc -f sort' tell you?
>and uniq that can
>output duplicate lines and unique lines?
Did you check the FAQ? Please see 'perldoc -q duplicate'.
jue
------------------------------
Date: Mon, 29 Sep 2008 19:49:26 +0200
From: Bart Lateur <bart.lateur@pandora.be>
Subject: Re: Help: Duplicate and Unique Lines Problem
Message-Id: <h252e4940o0rsil7d13epi9b4f0a0p5iov@4ax.com>
Amy Lee wrote:
>Here's the code:
>
>open FILE, '<', "$file";
>while (<FILE>)
>{
> @raw_list = split /\n/, $_;
> @list = uniq (@raw_list);
> print "@list\n";
>}
>It seems that uniq does nothing! I don't know why.
You need to read the whole file before working on the data. As it
stands, you're checking, for every line, whether it duplicates a line
in a list of one, which is impossible.
Using one of the tricks from the FAQ, one can do this:
open FILE, '<', "$file";
my %seen;
while (<FILE>)
{
print unless $seen{$_}++;
}
The hash %seen remembers the lines seen in the past, too. That's why
it works across lines.
--
Bart.
------------------------------
Date: Mon, 29 Sep 2008 08:35:49 -0700 (PDT)
From: Sunny <indiana.123@gmail.com>
Subject: Hot Linux Admin Requirement for ADP at NJ for 6+ months
Message-Id: <ce00bf6b-a728-4060-b8cc-889f11ce5257@a29g2000pra.googlegroups.com>
Hi,
My client ADP is seeking consultants for the requirement below. Kindly
let me know if you have any consultants, and forward me your
consultant's profile with the necessary details.
Need to have experience with Sun/Red hat, Veritas VxVM, EMC storage.
Will be working in an Internet DMZ networked/firewalled environment.
Position requires Sun Solaris 9, 10 with virtualization experience,
Red hat enterprise 4, knowledge of EMC SAN attached systems and its
client side systems, powerpath, Navigant. Requires Veritas Volume
manager and Clustered File System Experience. Experience with Internet
product delivery systems, web, application servers. MySQL/Oracle
experience a plus.
Summary:
Provides network support to operational computer networks. Essential
Duties and Responsibilities: Assembles and configures network
components and associated services. Sets up and maintains basic
network operations, including assembly of network hardware. Performs
network troubleshooting to isolate and diagnose common network
problems. Upgrades network hardware and software components as
required. Installs, upgrades and configures network printing,
directory structures, rights, security, software and files services.
Provides users with network technical support. Responds to needs and
questions of users concerning their access of network resources.
Establishes network users, user environments, directories, and
security for networks being installed. Installs and tests necessary
software and hardware.
Do forward me your consultant's profile to
sandeep.sulakhe@logistic-solutions.com
------------------------------
Date: Mon, 29 Sep 2008 14:10:45 +0200
From: Azol <azol@non-non-non>
Subject: Re: How to unable the use of tainted mode in a CGI script ?
Message-Id: <MPG.234ae3cc666539079897f3@news.free.fr>
In article <dYRDk.3630$uS7.2552@newsfe02.iad>, tim@burlyhost.com says...
> Azol wrote:
>
> > In article <ebvDk.28$Ra2.19@newsfe06.iad>, tim@burlyhost.com says...
> >> That is unfortunate, and wrong. I suggest you start looking for a
> >> hosting provider that has a better idea of how this works.
> >>
> >
> > I've asked them for more info about the reason for their strange
> > decision... I'll see their reply.
> >
> > Also, while awaiting this, and knowing I don't see any separate
> > error.log anywhere, I've tried to catch the error using:
> >
> > use CGI::Carp qw/fatalsToBrowser/;
> >
> > And it doesn't work: nothing special on the browser's screen. Does it
> > mean there is again some special config on their side?
> >
> > :((
>
> If -T is causing it to fail, it's pretty much like having invalid syntax
> that would cause the script to error rather than execute. It can't
> report an error in that way, if the script can't run. So, you'd have
> it fail and error in a way that wouldn't relate to showing errors via
> CGI::Carp, I'm sorry to say. That's not going to allow you to see
> why/the error. Ask them where the error logs are located. Do you have
> shell/ssh access? Do you have any control panel or interface where you
> can view logs, or download them via FTP, or anything? Ultimately, you
> should just get a better web host that understands the advantages to
> allowing Taint (I honestly can't conceive of a reason why a host would
> make an effort to NOT allow something that only helps their clients
> create more secure scripts. I'd worry about what else they've done (or
> have not done) that affects stability, security and efficiency).
>
You're right: this host is really bad.
Here is their last reply when I asked them for more details about why
tainted mode is forbidden and where the error.log is located.
In French :
Nous vous informons que ce fichier n'est accessible qu'à l'utilisateur
root sur le serveur.
Concernant le mode tainted, il s'agit de raisons techniques que nous ne
pouvons pas détailler ici.
So, in English:
We inform you that this file (error.log) is only accessible to the root
user on the server.
Regarding tainted mode, it is for technical reasons that we cannot
detail here (i.e. it's confidential and you're just a customer).
:(
------------------------------
Date: Mon, 29 Sep 2008 05:22:49 -0700 (PDT)
From: Snorik <clauskick@hotmail.com>
Subject: Re: IPC:Shareable
Message-Id: <f3f4928e-04af-4c69-a25f-f0f6df13c88d@u65g2000hsc.googlegroups.com>
On 26 Sep., 22:36, Ted Zlatanov <t...@lifelogs.com> wrote:
> Rather than explaining how to fix your example, here's a working program
> that shows how to use Tie::ShareLite. Each child will lock, write
> 'hello' to the key of its PID, then unlock. You should lock and unlock
> around every access to the tied hash. The parent waits for the children
> to finish and then prints the summary (also locking and unlocking, to
> protect from other processes that might be accessing that shared memory).
OK.
> The get_ipc() function is just for convenience. Key 1971 is just an
> example from the Tie::ShareLite docs, you can use any value.
I know, I know.
> Note I clear %shared every time I start.
Ok, I was not aware I could just do that. I thought I needed to
actually remove the contents of the shared memory segment somehow.
> It's not destroyed at the
> program's end. Setting destroy to yes in the parent doesn't work for
> me, and I didn't debug it (no time :)
I have the nagging feeling that this does not work.
But I like your solution to that; it is very simple yet powerful.
Anyhow, stuff works like a charm now, thank you so much for your time,
I owe you a beer :-)
------------------------------
Date: Mon, 29 Sep 2008 03:09:07 -0700 (PDT)
From: Jason Carlton <jwcarlton@gmail.com>
Subject: Repeating characters
Message-Id: <13e8f9d9-9179-4042-b4b0-87430b41d002@r66g2000hsg.googlegroups.com>
This should be simple enough, but I can't quite remember the logic and
I can't find the answer.
I'm trying to prevent 4 or more occurrences of any repeating
character. This is what I've been using:
$x =~ s/(.)\1{4,}/$1$1$1$1/g;
How do I make it catch ANY character, including symbols like <, >, *,
$, etc? My goal would be to convert something like this:
<<<<<<<<<<< AAAAAAAAAAAAAA >>>>>>>>>>>>> BBBBBBBBBBB
To this:
<<<< AAAA >>>> BBBB
TIA,
Jason
------------------------------
Date: Mon, 29 Sep 2008 03:51:51 -0700 (PDT)
From: Jason Carlton <jwcarlton@gmail.com>
Subject: Re: Repeating characters
Message-Id: <eed164b7-d529-4e96-a725-6971c9f5c30a@m73g2000hsh.googlegroups.com>
On Sep 29, 6:09 am, Jason Carlton <jwcarl...@gmail.com> wrote:
> This should be simple enough, but I can't quite remember the logic and
> I can't find the answer.
>
> I'm trying to prevent 4 or more occurrences of any repeating
> character. This is what I've been using:
>
> $x =~ s/(.)\1{4,}/$1$1$1$1/g;
>
> How do I make it catch ANY character, including symbols like <, >, *,
> $, etc? My goal would be to convert something like this:
>
> <<<<<<<<<<< AAAAAAAAAAAAAA >>>>>>>>>>>>> BBBBBBBBBBB
>
> To this:
>
> <<<< AAAA >>>> BBBB
>
> TIA,
>
> Jason
I should have added that I thought the (.) would catch any character,
but I'm getting submissions with more than 4 repeats, so there's a flaw
somewhere.
------------------------------
Date: Mon, 29 Sep 2008 13:21:51 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: Repeating characters
Message-Id: <slrnge1eih.dsm.hjp-usenet2@hrunkner.hjp.at>
On 2008-09-29 10:51, Jason Carlton <jwcarlton@gmail.com> wrote:
> On Sep 29, 6:09 am, Jason Carlton <jwcarl...@gmail.com> wrote:
>> I'm trying to prevent 4 or more occurrences of any repeating
>> character. This is what I've been using:
>>
>> $x =~ s/(.)\1{4,}/$1$1$1$1/g;
>>
>> How do I make it catch ANY character, including symbols like <, >, *,
>> $, etc? My goal would be to convert something like this:
>>
>> <<<<<<<<<<< AAAAAAAAAAAAAA >>>>>>>>>>>>> BBBBBBBBBBB
>>
>> To this:
>>
>> <<<< AAAA >>>> BBBB
>
>
> I should have added that I thought the (.) would catch any character,
> but I'm getting submissions with more than 4 repeats, so there's a flaw
> somewhere.
The dot does match any character (except newline, unless you use /s).
Wherever the flaw is, it isn't in any information you posted.
#!/usr/bin/perl
use warnings;
use strict;
my $x = "<<<<<<<<<<< AAAAAAAAAAAAAA >>>>>>>>>>>>> BBBBBBBBBBB";
$x =~ s/(.)\1{4,}/$1$1$1$1/g;
print "$x\n";
__END__
prints
<<<< AAAA >>>> BBBB
just as you want. Can you come up with an example which shows the
problem?
hp
------------------------------
Date: Mon, 29 Sep 2008 16:21:33 +0200
From: Hartmut Camphausen <JustMe@somewhere.de>
Subject: Re: Repeating characters
Message-Id: <MPG.234b02dae2fff415989695@news.t-online.de>
Jason Carlton schrieb:
> > I'm trying to prevent 4 or more occurrences of any repeating
> > character. This is what I've been using:
> >
> > $x =~ s/(.)\1{4,}/$1$1$1$1/g;
This one should do it:
my $x = '"""""""""""" AAAAAAAAAAAAAA >>>>>>>>>>>>> BBBBBB BBBBB';
$x =~ s/((.)\2{3})\2+/$1/g; # match only if more than 4 occurrences
print $x; # """" AAAA >>>> BBBB BBBB
> > How do I make it catch ANY character, including symbols like <, >, *,
> > $, etc?
As Peter said, the '.' does this. Provide a working example of code
yielding unwanted/unexpected results.
hth + mfg, Hartmut
--
------------------------------------------------
Hartmut Camphausen h.camp[bei]textix[punkt]de
------------------------------
Date: Mon, 29 Sep 2008 09:52:03 -0700 (PDT)
From: Jason Carlton <jwcarlton@gmail.com>
Subject: Re: Repeating characters
Message-Id: <d072c305-622c-4b41-baec-546ae61e48a1@m36g2000hse.googlegroups.com>
On Sep 29, 10:21 am, Hartmut Camphausen <Jus...@somewhere.de> wrote:
> Jason Carlton schrieb:
>
> > > I'm trying to prevent 4 or more occurrences of any repeating
> > > character. This is what I've been using:
>
> > > $x =~ s/(.)\1{4,}/$1$1$1$1/g;
>
> This one should do it:
>
>  my $x  =  '"""""""""""" AAAAAAAAAAAAAA >>>>>>>>>>>>> BBBBBB BBBBB';
>
>  $x     =~ s/((.)\2{3})\2+/$1/g;  # match only if more than 4 occurrences
>
>  print     $x;                    # """" AAAA >>>> BBBB BBBB
>
> > > How do I make it catch ANY character, including symbols like <, >, *,
> > > $, etc?
>
> As Peter said, the '.' does this. Provide a working example of code
> yielding unwanted/unexpected results.
>
> hth + mfg, Hartmut
>
> --
>   ------------------------------------------------
> Hartmut Camphausen      h.camp[bei]textix[punkt]de
Shoot, guys, that was pretty much a copy and paste! The script itself
is rather long, but this is the only section that does any
replacements on this field:
$contents{'description'} =~ s/\r\n|\r|\n/ /g;
$contents{'description'} =~ s/\&/ and /g;
$contents{'description'} =~ s/ / /g;
$contents{'description'} =~ s/</\</g;
$contents{'description'} =~ s/>/\>/g;
$contents{'description'} =~ s/\*//g;
$contents{'description'} =~ s/ +$//;
$contents{'description'} =~ s/^\s+|\s+$//g;
$contents{'description'} =~ s/(.)\1{4,}/$1$1$1$1/g;
This is in a classifieds program. The other matches work, but
yesterday someone submitted this in the description field:
LARGE BIRD CAGE ASKING $200
FIRM ....>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
It limited the .... to 4, but not the > or <
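(Writing it out here makes me wonder: on the live site, are the < and >
substitutions really producing HTML entities, something like
s/</&lt;/g and s/>/&gt;/g, that got flattened in this post? If so, a
run like <<<<< has already become &lt;&lt;&lt;&lt;&lt; before the
repeat check runs, and no single character repeats there, so
(.)\1{4,} can never match it. Moving the repeat check ahead of the
escaping, roughly:

$contents{'description'} =~ s/(.)\1{4,}/$1$1$1$1/g;  # limit runs first
$contents{'description'} =~ s/</&lt;/g;              # then escape
$contents{'description'} =~ s/>/&gt;/g;

would avoid that. Just a guess, though.)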
------------------------------
Date: Mon, 29 Sep 2008 04:35:01 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: Replacing binary data containing & using perl in HPUX 11.11
Message-Id: <g3f1e4dafg8kjj7u6d9v532easo8t7bd3f@4ax.com>
badkmail@gmail.com wrote:
> I have seen in some
>places the constants ', & etc being used. I googled and found
>that these are HTML constants. Is my understanding correct?
They are called "character entity references"; please see
http://www.w3.org/TR/html401/charset.html#entities
>and how is it possible to use these with perl?
You don't, because Perl neither knows nor cares about them.
To Perl those are just pieces of ordinary text.
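If you actually want to convert them, the HTML::Entities module on
CPAN does, e.g.:

    use HTML::Entities qw(decode_entities encode_entities);

    my $plain = decode_entities('fish &amp; chips');  # 'fish & chips'
    my $html  = encode_entities('fish & chips');      # 'fish &amp; chips'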
jue
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 1886
***************************************