[24485] in Perl-Users-Digest


Perl-Users Digest, Issue: 6667 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Jun 8 18:06:13 2004

Date: Tue, 8 Jun 2004 15:05:05 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Tue, 8 Jun 2004     Volume: 10 Number: 6667

Today's topics:
    Re: Converting a string to multiple search patterns (Anno Siegel)
    Re: Converting a string to multiple search patterns <pinyaj@rpi.edu>
    Re: Converting a string to multiple search patterns <dwall@fastmail.fm>
    Re: Converting a string to multiple search patterns <dwall@fastmail.fm>
    Re: Converting a string to multiple search patterns <bmb@ginger.libs.uga.edu>
    Re: Converting a string to multiple search patterns <pinyaj@rpi.edu>
    Re: Converting a string to multiple search patterns (Randal L. Schwartz)
    Re: Converting a string to multiple search patterns <dwall@fastmail.fm>
    Re: Converting a string to multiple search patterns <tore@aursand.no>
    Re: Converting a string to multiple search patterns <tore@aursand.no>
    Re: Do you ever use awk? <michal@gortel.phys.ualberta.ca>
        Net::SMTP fails connection in CGI <jfp24@cornell.edu>
    Re: New to Perl: Need help with a script <pinyaj@rpi.edu>
    Re: No-install Perl Interpretor <bik.mido@tiscalinet.it>
    Re: No-install Perl Interpretor <bik.mido@tiscalinet.it>
        Perl Large Scalar Question? <r.mariotti@financialdatacorp.com>
    Re: Perl Large Scalar Question? <thundergnat@hotmail.com>
    Re: signal not being caught when re-exec()d <odyniec-usenet@odyniec.net>
    Re: sub returning nothing <MyFirstnameHere.News1@gustra.org>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 8 Jun 2004 18:10:53 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: Converting a string to multiple search patterns
Message-Id: <ca4vfd$5qe$1@mamenchi.zrz.TU-Berlin.DE>

Tore Aursand  <tore@aursand.no> wrote in comp.lang.perl.misc:
> On Tue, 08 Jun 2004 14:52:41 +0000, Anno Siegel wrote:
> > sub selections {
> >     my @sel = [];
> >     for my $elem ( @_ ) {
> >         unshift @sel, map [ $elem, @$_], @sel;
> >     }
> >     @sel;
> > }
> 
> This one almost does the job, but it doesn't output what I want; The
> elements are mostly reversed, so reverse()'ing @_ makes it a bit better.
> 
> Still, 'A B D' comes before 'B C D'.  Take a look at my previous reply to
> you:
> 
>     1. A B C D
>     2. A B C
>     3.   B C D
>     4. A B   D
>     5. A   C D
>     6. A B
>     7.   B C
>     8.     C D
>     9. A   C
>    10.   B   D
>    11. A     D
>    12. A
>    13.   B
>    14.     C
>    15.       D
> 
> Try this simple script using your subroutine (rewritten, 'cause
> "sometimes" Pan won't let me paste things I mark in xterm):
> 
>   my @array = selections( qw(A B C D) );
>   for ( 0..$#array ) {
>       print "$_. " . join(' ', @{$array[$_]}) . "\n";
>   }
> 
> But you're close, Anno! :)

I don't think I am.  Unless I'm missing something obvious, there is
no simple logic that generates exactly the sequence you want.  Quoting
the exchange with Brad Baxter

> o More terms = higher score
> o More adjacent terms = higher score
> o Leftward terms = higher score than those to the right

the resulting sequence looks rather arbitrary, best realized as a three-
level sort that follows the description.

Even generating all n-element selections before the (n+1)-element ones
isn't quite trivial, let alone generating those with many adjacent
elements before those with fewer, etc.

So, unless you can reveal a simple logic behind your sequence, my
recipe is, generate them in any sequence, then sort them into the
required order.
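
A minimal sketch of that recipe, reading the three quoted rules literally
(more terms first, then the longest run of adjacent terms, then leftmost
first).  The names ranked_selections, longest_run and compare_left are made
up for illustration, and this is only one possible reading of the rules,
not necessarily the order Tore has in mind:

```perl
use strict;
use warnings;

# Length of the longest run of consecutive indices in a sorted index list.
sub longest_run {
    my @idx = @_;
    my ( $best, $run ) = ( 1, 1 );
    for my $i ( 1 .. $#idx ) {
        $run  = $idx[$i] == $idx[ $i - 1 ] + 1 ? $run + 1 : 1;
        $best = $run if $run > $best;
    }
    return $best;
}

# Element-wise numeric comparison of two equally long index lists.
sub compare_left {
    my ( $x, $y ) = @_;
    for my $i ( 0 .. $#$x ) {
        my $c = $x->[$i] <=> $y->[$i];
        return $c if $c;
    }
    return 0;
}

sub ranked_selections {
    my @words = @_;
    my $n     = @words;

    # Step 1: generate every non-empty selection, in any order.
    my @sets;
    for my $mask ( 1 .. 2**$n - 1 ) {
        push @sets, [ grep { $mask & ( 1 << $_ ) } 0 .. $n - 1 ];
    }

    # Step 2: sort on the three keys.
    my @sorted = sort {
             @$b <=> @$a
          || longest_run(@$b) <=> longest_run(@$a)
          || compare_left( $a, $b )
    } @sets;

    return map { [ @words[@$_] ] } @sorted;
}

print "@$_\n" for ranked_selections(qw(A B C D));
```

For 'A' .. 'D' this agrees with the quoted list except that 'A D' lands
before 'B D', so the third rule as written does not fully pin down the
wanted sequence.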

The sorting step could be realized as a hash slice through pre-sorted
sequences of integers.  Each sequence is a permutation of 0 .. 2**n - 1,
where n is the total number of terms, four in the example.  I think
Japhy was doing something similar at one stage of his Perl artistry
in the other subthread.

Anno


------------------------------

Date: Tue, 8 Jun 2004 14:33:30 -0400
From: Jeff 'japhy' Pinyan <pinyaj@rpi.edu>
Subject: Re: Converting a string to multiple search patterns
Message-Id: <Pine.SGI.3.96.1040608143151.26664A-100000@vcmr-64.server.rpi.edu>

On 8 Jun 2004, Anno Siegel wrote:

>Tore Aursand  <tore@aursand.no> wrote in comp.lang.perl.misc:
> 
>>     1. A B C D
>>     2. A B C
>>     3.   B C D
>>     4. A B   D
>>     5. A   C D
>>     6. A B
>>     7.   B C
>>     8.     C D
>>     9. A   C
>>    10.   B   D
>>    11. A     D
>>    12. A
>>    13.   B
>>    14.     C
>>    15.       D
>
>The sorting step could be realized as a hash slice through pre-sorted
>sequences of integers.  Each sequence is a permutation of 0 .. 2**n - 1,
>where n is the total number of terms, four in the example.  I think
>Japhy was doing something similar at one stage of his Perl artistry
>in the other subthread.

The concept of my solution is to match them arbitrarily, but assign each
match a score, and sort based on the scores.  I think that's the most
efficient way (even if my regex is not efficient because of its code
evaluation assertions).

-- 
Jeff Pinyan         RPI Acacia Brother #734        RPI Acacia Corp Secretary
"And I vos head of Gestapo for ten     | Michael Palin (as Heinrich Bimmler)
 years.  Ah!  Five years!  Nein!  No!  | in: The North Minehead Bye-Election
 Oh.  Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)



------------------------------

Date: Tue, 08 Jun 2004 19:14:27 -0000
From: "David K. Wall" <dwall@fastmail.fm>
Subject: Re: Converting a string to multiple search patterns
Message-Id: <Xns95029B0961B65dkwwashere@216.168.3.30>

Anno Siegel <anno4000@lublin.zrz.tu-berlin.de> wrote:

> Tore Aursand  <tore@aursand.no> wrote in comp.lang.perl.misc:
[snip]

>>     1. A B C D
>>     2. A B C
>>     3.   B C D
>>     4. A B   D
>>     5. A   C D
>>     6. A B
>>     7.   B C
>>     8.     C D
>>     9. A   C
>>    10.   B   D
>>    11. A     D
>>    12. A
>>    13.   B
>>    14.     C
>>    15.       D

[snip]

>> But you're close, Anno! :)
> 
> I don't think I am.  Unless I'm missing something obvious, there
> is no simple logic that generates exactly the sequence you want. 


# building on previously posted code
use strict;
use warnings;

my @array = 'A' .. 'D';

my @subsets = subsets( 0 .. $#array );

my @sequences;
for my $subset ( @subsets ) {
    print join(' ',  @array[ @$subset ]), "\n" ;
}

sub subsets {
    my @sel = [];
    for my $elem ( @_ ) {
        push @sel, map [ @$_, $elem], @sel unless @sel == 0;
    }
    return sort { 
        @$b <=> @$a 
        ||  
        ($a->[-1] - $a->[0]) <=> ($b->[-1] - $b->[0])
        ||
        join('', @$a) cmp join('', @$b) 
    } @sel;
}


------------------------------

Date: Tue, 08 Jun 2004 19:50:26 -0000
From: "David K. Wall" <dwall@fastmail.fm>
Subject: Re: Converting a string to multiple search patterns
Message-Id: <Xns9502A1236346Bdkwwashere@216.168.3.30>

I wrote:

> # building on previously posted code
> use strict;
> use warnings;
> 
> my @array = 'A' .. 'D';
> 
> my @subsets = subsets( 0 .. $#array );
> 
> my @sequences;
> for my $subset ( @subsets ) {
>     print join(' ',  @array[ @$subset ]), "\n" ;
>}
> 
> sub subsets {
>     my @sel = [];
>     for my $elem ( @_ ) {
>         push @sel, map [ @$_, $elem], @sel unless @sel == 0;
>     }
>     return sort { 
>         @$b <=> @$a 
>         ||  
>         ($a->[-1] - $a->[0]) <=> ($b->[-1] - $b->[0])
>         ||
>         join('', @$a) cmp join('', @$b) 

I should point out that this will break if $#array > 9.

>     } @sel;
>}
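
One way around that limit, as a sketch only: keep the string comparison
but pad every index to a fixed width, so that string order and numeric
order agree.  padded_key is a hypothetical helper, and the width of four
digits assumes fewer than 10000 terms:

```perl
use strict;
use warnings;

# Hypothetical helper: fixed-width key, so string order matches numeric order.
sub padded_key {
    my ($indices) = @_;
    return join '', map { sprintf '%04d', $_ } @$indices;
}

my @lists  = ( [ 1, 10 ], [ 1, 2 ], [ 1, 9 ] );
my @naive  = sort { join( '', @$a ) cmp join( '', @$b ) } @lists;
my @padded = sort { padded_key($a) cmp padded_key($b) } @lists;

print "naive:  ", join( ' | ', map { "@$_" } @naive ),  "\n";
print "padded: ", join( ' | ', map { "@$_" } @padded ), "\n";
```

With the plain join, [1, 10] becomes "110" and sorts ahead of "12"; with
padded keys the lists come out as 1 2, 1 9, 1 10.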



------------------------------

Date: Tue, 8 Jun 2004 16:32:41 -0400
From: Brad Baxter <bmb@ginger.libs.uga.edu>
Subject: Re: Converting a string to multiple search patterns
Message-Id: <Pine.A41.4.58.0406081546300.15312@ginger.libs.uga.edu>

On Tue, 8 Jun 2004, David K. Wall wrote:

> Anno Siegel <anno4000@lublin.zrz.tu-berlin.de> wrote:
>
> > Tore Aursand  <tore@aursand.no> wrote in comp.lang.perl.misc:
> [snip]
>
> >>     1. A B C D
> >>     2. A B C
> >>     3.   B C D
> >>     4. A B   D
> >>     5. A   C D
> >>     6. A B
> >>     7.   B C
> >>     8.     C D
> >>     9. A   C
> >>    10.   B   D
> >>    11. A     D
> >>    12. A
> >>    13.   B
> >>    14.     C
> >>    15.       D
>
> [snip]
>
> >> But you're close, Anno! :)
> >
> > I don't think I am.  Unless I'm missing something obvious, there
> > is no simple logic that generates exactly the sequence you want.
>
>
> # building on previously posted code
> use strict;
> use warnings;
>
> my @array = 'A' .. 'D';
>
> my @subsets = subsets( 0 .. $#array );
>
> my @sequences;
> for my $subset ( @subsets ) {
>     print join(' ',  @array[ @$subset ]), "\n" ;
> }
>
> sub subsets {
>     my @sel = [];
>     for my $elem ( @_ ) {
>         push @sel, map [ @$_, $elem], @sel unless @sel == 0;
>     }
>     return sort {
>         @$b <=> @$a
>         ||
>         ($a->[-1] - $a->[0]) <=> ($b->[-1] - $b->[0])
>         ||
>         join('', @$a) cmp join('', @$b)
>     } @sel;
> }
>

I think I disagree with this.  While it agrees with the OP's stated specs,
I'm not sure it agrees with the spirit of the specs.  :-)  Of course, my
interpretation may simply be wrong.

Where I disagree first is with the stated specs:

> >>     1. A B C D
> >>     2. A B C
> >>     3.   B C D
> >>     4. A B   D
> >>     5. A   C D
> >>     6. A B
> >>     7.   B C
> >>     8.     C D
> >>     9. A   C
> >>    10.   B   D
> >>    11. A     D
> >>    12. A
> >>    13.   B
> >>    14.     C
> >>    15.       D

I think 11. A D should come before 10. B D, because all else being equal,
A comes before B.  In addition, when your code expands the terms 'A'..'E',
you get:

A B C D E
A B C D
B C D E
A B C E
A B D E
A C D E
A B C
B C D
C D E
 ...

While 'A B D E' has more terms, I think 'A B C', 'B C D', and 'C D E'
should outrank it, because they have more adjacent terms in a row.

So, while I think your code is MUCH prettier, below is my take on this
problem.  The scoring is bizarre--I just want to weight the right things
while eliminating dupes.  Seems to work, but I can't give you a
mathematical proof, so it's probably flawed.  :-)


use strict;
use warnings;

my @sets = subsets( 'A' .. 'D' );

print "@$_\n" for @sets;

sub subsets {
    my @words = @_;
    my $n = @words;
    my @sets;
    my @scored;
    my %seen;

    # create "binary" sets, '1's represent words present
    push @sets, sprintf "%0${n}b", $_ for 0 .. 2**$n-1;

    # $x is for unique sort keys ("scores")
    my $x = $n - 1;

    for my $si ( 0 .. $#sets ) { # need $si in score
        my $set = $sets[ $si ];

        # split into groups of adjacent terms
        my @groups = split( /0/, $set);

        my $score = 0;
        for my $gi ( 0 .. $#groups ) { # need $gi in score
            my $group = $groups[ $gi ];
            next unless $group;

            # sets are scored by length, number,
            # and lefthandedness of their groups
            my $len = length $group;
            $score += $x**(2*$len) + ($x-$gi)*$len + $si;
        }

        # convert "binary" sets to sets of words
        my @wordset;
        my $i = 0;
        for( split //, $set) {
            my $word = $words[$i++];
            $_ && push @wordset, $word;
        }
        push @scored, [$score, [@wordset]];

        $seen{$score}++; # to prove our scores are unique
    }

    for ( sort keys %seen ) { die "Dupe: $_" if $seen{$_}>1 }

    map  { $_->[1] }
    sort { $b->[0] <=> $a->[0] }
    @scored;

}

__END__


Regards,

Brad


------------------------------

Date: Tue, 8 Jun 2004 17:06:57 -0400
From: Jeff 'japhy' Pinyan <pinyaj@rpi.edu>
Subject: Re: Converting a string to multiple search patterns
Message-Id: <Pine.SGI.3.96.1040608170302.27727A-100000@vcmr-64.server.rpi.edu>

On Tue, 8 Jun 2004, Brad Baxter wrote:

>I think 11. A D should come before 10. B D, because all else being equal,
>A comes before B.  In addition, when your code expands the terms 'A'..'E',
>you get:
>
>A B C D E
>A B C D
>B C D E
>A B C E
>A B D E
>A C D E
>A B C
>B C D
>C D E
>
>While 'A B D E' has more terms, I think 'A B C', 'B C D', and 'C D E'
>should outrank it, because they have more adjacent terms in a row.

Well, my regex solution agrees with you, and while the insides of the
regex are a bit ugly to look at, the algorithm is far simpler than it
seems.

  #!/usr/bin/perl -l

  my $rx;

  {
    use re 'eval';
    my @kw = qw( A B C D E );
    $rx = qr{
      (?{ local ($s, $f) = (0, 1) })
      ^ \s*
      @{[map qq{ (?:
          \Q$_\E (?{ \$s += \$f <<= 1 }) |
          (?{ \$f = 1 })
        ) \\s*
      }, @kw ]}
      $
      (?{ $s })
    }x;
  }

  while (<{A,}{B,}{C,}{D,}{E,}>) {
    chomp;
    print "$^R\t$_" if /$rx/ and $^R;
  }

Run that code through '| sort -n', and you'll get:

2       A
2       B
2       C
2       D
2       E
4       AC
4       AD
4       AE
4       BD
4       BE
4       CE
6       AB
6       ACE
6       BC
6       CD
6       DE
8       ABD
8       ABE
8       ACD
8       ADE
8       BCE
8       BDE
12      ABDE
14      ABC
14      BCD
14      CDE
16      ABCE
16      ACDE
30      ABCD
30      BCDE
62      ABCDE

which I think is consistent with the rules.
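
For anyone who would rather not read the code blocks inside the regex,
the scoring alone can be sketched in plain Perl.  run_score is a made-up
name; the logic is meant to mirror the regex: the weight doubles for each
keyword present in a row and drops back to 1 at a gap.

```perl
use strict;
use warnings;

# Hypothetical helper: score a selection of keywords.
# $keywords is the full ordered list, $selection the subset being scored.
sub run_score {
    my ( $keywords, $selection ) = @_;
    my %present = map { $_ => 1 } @$selection;
    my ( $score, $weight ) = ( 0, 1 );
    for my $kw (@$keywords) {
        if ( $present{$kw} ) {
            $weight <<= 1;        # 2, 4, 8, ... within one run
            $score += $weight;
        }
        else {
            $weight = 1;          # a gap resets the run
        }
    }
    return $score;
}

my @kw = qw( A B C D E );
printf "%-5s %d\n", join( '', @$_ ), run_score( \@kw, $_ )
    for [qw(A C E)], [qw(A B D E)], [qw(A B C)], [qw(A B C D E)];
```

That prints 6, 12, 14 and 62 for ACE, ABDE, ABC and ABCDE, the same
scores as in the table above.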

-- 
Jeff Pinyan         RPI Acacia Brother #734        RPI Acacia Corp Secretary
"And I vos head of Gestapo for ten     | Michael Palin (as Heinrich Bimmler)
 years.  Ah!  Five years!  Nein!  No!  | in: The North Minehead Bye-Election
 Oh.  Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)




------------------------------

Date: 08 Jun 2004 14:18:22 -0700
From: merlyn@stonehenge.com (Randal L. Schwartz)
Subject: Re: Converting a string to multiple search patterns
Message-Id: <8665a1wxy9.fsf@blue.stonehenge.com>

>>>>> "Brad" == Brad Baxter <bmb@ginger.libs.uga.edu> writes:

Brad> I think 11. A D should come before 10. B D, because all else being equal,
Brad> A comes before B.  In addition, when your code expands the terms 'A'..'E',
Brad> you get:

Brad> A B C D E
Brad> A B C D
Brad> B C D E
Brad> A B C E
Brad> A B D E
Brad> A C D E
Brad> A B C
Brad> B C D
Brad> C D E
Brad> ...

Brad> While 'A B D E' has more terms, I think 'A B C', 'B C D', and 'C D E'
Brad> should outrank it, because they have more adjacent terms in a row.

It appears as though you are sorting on "edit distance", which is a
well-defined term, and even has a module, String::Approx, to compute
it.

    use strict;
    use String::Approx 'adist';
    my @strings = glob "{,A}{,B}{,C}{,D}{,E}"; # lazy. :)
    shift @strings; # leave out empty string
    my @dists = map { abs adist("ABCDE", $_) } @strings;
    my @sorted = sort {
      $dists[$a] <=> $dists[$b] or $strings[$a] cmp $strings[$b]
    } 0..$#strings;
    printf "%5s %d\n", $strings[$_], $dists[$_] for @sorted;

==>

    ABCDE 0
     ABCD 1
     ABCE 1
     ABDE 1
     ACDE 1
     BCDE 1
      ABC 2
      ABD 2
      ABE 2
      ACD 2
      ACE 2
      ADE 2
      BCD 2
      BCE 2
      BDE 2
      CDE 2
       AB 3
       AC 3
       AD 3
       AE 3
       BC 3
       BD 3
       BE 3
       CD 3
       CE 3
       DE 3
        A 4
        B 4
        C 4
        D 4
        E 4


-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!




------------------------------

Date: Tue, 08 Jun 2004 21:27:17 -0000
From: "David K. Wall" <dwall@fastmail.fm>
Subject: Re: Converting a string to multiple search patterns
Message-Id: <Xns9502B18EBBC43dkwwashere@216.168.3.30>

Brad Baxter <bmb@ginger.libs.uga.edu> wrote:

> On Tue, 8 Jun 2004, David K. Wall wrote:
> 
[snip code]

(BTW, I don't like the way my posted code (mis)handles the empty set, 
but never mind)

> I think I disagree with this.  While it agrees with the OP's
> stated specs, I'm not sure it agrees with the spirit of the specs.
>  :-)  

Heh. 

>        Of course, my interpretation may simply be wrong.

Or mine. It's Tore's problem, let him worry about it. :-)


> Where I disagree first is with the stated specs:
> 
>> >>     1. A B C D
>> >>     2. A B C
>> >>     3.   B C D
>> >>     4. A B   D
>> >>     5. A   C D
>> >>     6. A B
>> >>     7.   B C
>> >>     8.     C D
>> >>     9. A   C
>> >>    10.   B   D
>> >>    11. A     D
>> >>    12. A
>> >>    13.   B
>> >>    14.     C
>> >>    15.       D
> 
> I think 11. A D should come before 10. B D, because all else being
> equal, A comes before B.  

But all else isn't equal. The "distance" between A and D is greater 
than the "distance" between B and D.  I'm not sure how to express 
this clearly other than in code, but the way I understood Tore was 
this: the combinations are grouped 

    	first by the number of terms/elements in a combination, 
    	then by the "range" of the combination,
    	then by the order of the original set.

-- but maybe I was reading too much into the choice of 'A'..'D' for 
the example?

>                            In addition, when your code expands the
> terms 'A'..'E', you get:
> 
> A B C D E
> A B C D
> B C D E
> A B C E
> A B D E
> A C D E
> A B C
> B C D
> C D E
> ...
> 
> While 'A B D E' has more terms, I think 'A B C', 'B C D', and 'C D
> E' should outrank it, because they have more adjacent terms in a
> row. 

That's a good point -- I certainly won't argue against it. What Would 
Google Do?  :-)



------------------------------

Date: Tue, 08 Jun 2004 23:51:44 +0200
From: Tore Aursand <tore@aursand.no>
Subject: Re: Converting a string to multiple search patterns
Message-Id: <pan.2004.06.08.21.50.18.424416@aursand.no>

On Tue, 08 Jun 2004 14:18:22 -0700, Randal L. Schwartz wrote:
> It appears as though you are sorting on "edit distance", which is a
> well-defined term, and even has a module, String::Approx, to compute
> it.
> [...]

Hmm.  I've already had a look at String::Approx, but I don't think it will
help me solve this.

Remember that 'A B C D' really are four _words_;

  my $query = 'A B C D'; # What the user wants to search for
  my @words = split( /\s+/, $query );

Is there really no module which lets you do something like the following?

  my @words = ( whatever );
  foreach ( @Documents ) {
      my $score = search( $_->text(), \@words );
  }

Or something?  I want it! :)


-- 
Tore Aursand <tore@aursand.no>
"Those people who think they know everything are a great annoyance to
 those of us who do." (Isaac Asimov)


------------------------------

Date: Tue, 08 Jun 2004 23:51:44 +0200
From: Tore Aursand <tore@aursand.no>
Subject: Re: Converting a string to multiple search patterns
Message-Id: <pan.2004.06.08.21.44.06.297128@aursand.no>

On Tue, 08 Jun 2004 16:32:41 -0400, Brad Baxter wrote:
>>  1. A B C D
>>  2. A B C
>>  3.   B C D
>>  4. A B   D
>>  5. A   C D
>>  6. A B
>>  7.   B C
>>  8.     C D
>>  9. A   C
>> 10.   B   D
>> 11. A     D
>> 12. A
>> 13.   B
>> 14.     C
>> 15.       D

> I think 11. A D should come before 10. B D, because all else being
> equal, A comes before B.

Hmm.  This is meant for a search engine, and personally I think it makes
more sense to score based on how close the words are to each other.  In
addition, words "to the left" should score higher than words on the "right
side".

Maybe I'm totally wrong, but I think most search engines out there work
like this; they always try to match "early" words before "later" words.

Or am I wrong?


-- 
Tore Aursand <tore@aursand.no>
"I didn't have time to write a short letter, so I wrote a long one
 instead." (Mark Twain)


------------------------------

Date: Tue, 8 Jun 2004 21:37:39 +0000 (UTC)
From: Michal Jaegermann <michal@gortel.phys.ualberta.ca>
Subject: Re: Do you ever use awk?
Message-Id: <ca5bj3$mgu$1@tabloid.srv.ualberta.ca>

Jeff Schwab <jeffrey.schwab@comcast.net> wrote:
> Michal Jaegermann wrote:
>> 
>> while read a b c d ; do echo $c ; done < my_file
> 
> The awk approach isn't "complicated,"

Indeed. :-)  Awk can also do various other things.  The point
was that awk is heavy artillery for that minor task if you are
not using it for anything else.

> but what you wrote is getting there.

Ahem! If you think so. But you may use perl equally well.  See
'perl -ane ...'.  There is also "read -r" if you are afraid that
something may get wrongly interpreted.
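
Spelled out as a script, the 'perl -ane' idea looks roughly like the
sketch below: -a autosplits each input line on whitespace into @F, so the
third column is $F[2].  third_field is a made-up name for illustration.

```perl
use strict;
use warnings;

# Return the third whitespace-separated field of a line ('' if there is none).
sub third_field {
    my ($line) = @_;
    my @F = split ' ', $line;
    return defined $F[2] ? $F[2] : '';
}

print third_field("alpha beta gamma delta\n"), "\n";
```

On the command line the same thing is
perl -ane 'print "$F[2]\n"' my_file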

   Michal


------------------------------

Date: Tue, 8 Jun 2004 17:08:53 -0400
From: CMCLab <jfp24@cornell.edu>
Subject: Net::SMTP fails connection in CGI
Message-Id: <MPG.1b2ff4f3ac81a087989681@newsstand.cit.cornell.edu>

Hello,
	I'm running an IIS server on Windows 2000 with ActiveState Perl 
5.8.  I need an automated e-mailer that attaches a PDF file.  To do so, 
I'm using MIME::Lite; I need to use Net::SMTP in turn to interface with 
the SMTP server.  Whenever I try to instantiate the Net::SMTP object 
through the CGI interface, the following error occurs:

"Failed to connect to mail server: Unknown error"

When I run the script from the command line interface, it makes the 
connection fine, and sends the e-mail without problems.  However in the 
CGI interface, it fails.  This is suggestive of a permissions problem, 
but I'm unclear about which permissions I would have to change if this 
was indeed the case.  

The other thought I had was that going through the CGI interface changes 
my identity to the mail server somehow, and it would no longer know what 
domain I was from - but there is nothing in Net/SMTP.pm that I can see 
that would suggest that.

I'm running out of ideas.  Any thoughts?

Thanks.

-- 
T. Barrett



------------------------------

Date: Tue, 8 Jun 2004 15:14:23 -0400
From: Jeff 'japhy' Pinyan <pinyaj@rpi.edu>
To: Jim Moser <jamesmmoser@yahoo.com>
Subject: Re: New to Perl: Need help with a script
Message-Id: <Pine.SGI.3.96.1040608145249.26793A-100000@vcmr-64.server.rpi.edu>

[posted & mailed]

On 8 Jun 2004, Jim Moser wrote:

>So far the script is functional; however, at each interval I'm calling
>system(clear) to clear the screen and update the list. This looks very
>sloppy to the end user and I would like the script to keep the IP and
>hostname on screen and only update the UP|DOWN field one at a time. If
>this is simple to do, could someone provide some sample code? If it's
>not quite so simple could someone point me in the right direction to
>get started. I am using O'Reilly Programming Perl 3rd Ed and O'Reilly
>Perl Cookbook as references.

If you want to be able to change the text at specific locations on your
terminal, I'd suggest the Curses module.

>#!/usr/bin/perl
>
>use Net::Ping;

You should always write your code with 'warnings' and 'strict' enabled.
It might take some getting used to (you'll need to declare your variables,
for one thing), but it will help you write cleaner code, and catch any 
typos you make, etc.

># Process arguements
>while (@ARGV and $ARGV[0] =~ /^-/) {
>	$_ = shift;
>	last if /^--$/;
>	if (/^-f(.*)/) { $hostfile = $1 }
>	if (/^-i(.*)/) { $int = $1 }
>	if (/^-\?|^-h/) { usage(); }
>}

This is fine, but you should also know that there are standard modules out
there to take care of command-line options for you.  See Getopt::Std and
Getopt::Long for starters.
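
A rough sketch of the same options with Getopt::Long.  The long option
names, the %opt hash and the parse_args wrapper are my own inventions,
and it assumes each value is given as a separate word (-f /etc/hosts
-i 30):

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical wrapper: parse a list of arguments, return a hash of options.
sub parse_args {
    my @args = @_;
    my %opt = ( hostfile => '/etc/hosts', interval => 30 );   # defaults
    GetOptionsFromArray(
        \@args,
        'f|hostfile=s' => \$opt{hostfile},
        'i|interval=i' => \$opt{interval},
        'h|help'       => \$opt{help},
    ) or die "bad options\n";
    return \%opt;
}

my $opt = parse_args(qw( -f hosts.txt -i 10 ));
print "$opt->{hostfile} every $opt->{interval}s\n";
```

In the real script you would pass @ARGV (or simply call GetOptions) and
call usage() when $opt->{help} is set.  The defaults survive whenever an
option is not given, so the separate "unless (defined ...)" lines are no
longer needed.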

Also, and some others might think me hyper-sensitive for saying this, but
you should really local()ize $_ whenever you assign to it explicitly or
use it when reading from a filehandle, because you can end up clobbering
its value from somewhere else.  Consider this example: 

  my @data = ('a' .. 'e');
  for (@data) {
    # ...
    foo($_)
    # ...
  }
  print "@data";  # no letters, only spaces: every element is now undef

  sub foo {
    my $file = shift;
    open TXT, "< file/$file.txt" or die "can't read file/$file.txt: $!";
    while (<TXT>) {
      # ...
    }
    close TXT;
  }

@data ends up having 5 undef values in it, because the naked <TXT> syntax
inside a while loop assigns to $_, and $_ is aliased to each of the
elements of your for loop list in turn.
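
Here is a self-contained sketch of the effect and of the fix.  It reads
from an in-memory filehandle instead of a real file so that it runs
anywhere; count_clobbering and count_localized are made-up names.

```perl
use strict;
use warnings;

my $text = "line one\nline two\n";

# Reads with the bare <$fh> form, which assigns each line to $_.
sub count_clobbering {
    open my $fh, '<', \$text or die "can't open in-memory file: $!";
    my $n = 0;
    while (<$fh>) { $n++ }
    close $fh;
    return $n;
}

# Same loop, but $_ is localized first, so the caller's $_ is left alone.
sub count_localized {
    local $_;
    open my $fh, '<', \$text or die "can't open in-memory file: $!";
    my $n = 0;
    while (<$fh>) { $n++ }
    close $fh;
    return $n;
}

my @safe = ( 'a' .. 'e' );
count_localized() for @safe;

my @clobbered = ( 'a' .. 'e' );
count_clobbering() for @clobbered;

print "safe:      @safe\n";
print "clobbered: ", scalar( grep { defined } @clobbered ),
    " defined elements left\n";
```

Reading into a lexical, as in while (my $line = <$fh>), sidesteps the
problem as well.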
  
>sub usage() {

There is no need for the () on your function definition here.

>print  "Usage: 	ping.pl options
>...
>	ping.pl -f/etc/hosts -i30\n";
>	exit;
>}

You might want to use a here-doc (perldoc perldata) instead, but it's
really just a preference.  More drastically, though, you might consider
documenting your program using POD (perldoc perlpod), and then making your
usage() function:

  sub usage {
    exec perldoc => __FILE__
  }

That runs 'perldoc' on your file, and presents the documentation you've
written in your program.

># Set default values if no arguments were supplied
>unless (defined($hostfile)) { $hostfile = "/etc/hosts" }
>unless (defined($int)) { $int = "30" }
>
>open(HOSTFILE, $hostfile);

Always, *always*, ALWAYS check the return value of a system call, like an
open():

  open HOSTFILE, "< $hostfile" or die "can't read $hostfile: $!";

Also, if you want to be SURE that your file will only be opened for
reading, be explicit about it like I've shown above.  If you just do

  open FILE, $file;

and you get $file's value from the user, what if they enter

  mail me@mywebsite.com < /etc/passwd |

as the filename?  Thus, be explicit.  (Also look into tainting incoming
data; read perldoc perlsec.)
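
Another way to be explicit, as a sketch: the three-argument form of
open() keeps the mode and the filename in separate arguments, so the
filename is never interpreted as a pipe or a mode.  read_lines is a
made-up helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: slurp a file's lines, opening it strictly for reading.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "can't read $path: $!\n";
    my @lines = <$fh>;
    close $fh;
    return @lines;
}

# A hostile "filename" is now just a name that (presumably) does not exist.
my $hostile = 'echo pwned |';
my @lines = eval { read_lines($hostile) };
print "refused: $@" if $@;
```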

># Create a hash of the hostfile; omitting comments, localhost, and
>blank lines
>LINE: while (<HOSTFILE>) {
>       next LINE if /^#/; 
>	next LINE if /^127/;
>	next LINE if /^\s/;

That just skips lines that start with any whitespace, not necessarily a
blank line.

>	($ip, $hostname) = split /\s/;
>	@fields = split ' ', $hostname;
>	$list{$ip} = [ @fields ];
>}

I would probably rewrite this like so:

  while (<HOSTFILE>) {
    s/#.*//;           # remove any comments
    next if /^\s*$/;   # skip if it's empty or only whitespace
    next if /^127\./;  # skip if it's the localhost

    my ($ip, @fields) = split;
    $list{$ip} = \@fields;
  }

You'll notice a couple things different here.  First, I'm declaring my
variables with my().  This is because I'd be doing 'use strict', which
requires that I declare my variables' scope.  Since I don't need $ip or
@fields to exist outside that while loop, I declare them with my() here.

Also see that instead of doing [ @fields ], I did \@fields.  You can't do
this with your code, because your @fields is a global variable, not a
lexical one.  If you did \@fields, you'd end up with each hash value
pointing to the same array reference.  With my code, because @fields is
lexical and I've taken a reference to it, it doesn't die at the end of the
block, and it lives on (as a reference).
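
The difference is easy to see in a small sketch (the host lines and the
three hash names are invented for the demonstration):

```perl
use strict;
use warnings;

our @fields;    # package (global) array, as in the original script
my ( %shared, %copied, %lexical );

for my $line ( "10.0.0.1 alpha a", "10.0.0.2 beta b" ) {
    ( my $ip, @fields ) = split ' ', $line;

    $shared{$ip} = \@fields;       # every value refers to the one global array
    $copied{$ip} = [ @fields ];    # fresh anonymous copy on each iteration

    my @own = @fields;             # a new lexical array each time round
    $lexical{$ip} = \@own;
}

print "shared:  @{ $shared{'10.0.0.1'} }\n";
print "copied:  @{ $copied{'10.0.0.1'} }\n";
print "lexical: @{ $lexical{'10.0.0.1'} }\n";
```

The "shared" line shows "beta b", the data of the last host read, while
the other two still show "alpha a".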

Finally, I've used split() with no arguments.  This means the same as
split(' ', $_).
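
A short sketch of why that matters compared with split /\s/ (the sample
line is invented): the ' ' form skips leading whitespace and treats a run
of whitespace as one separator, while /\s/ splits on every single
whitespace character.

```perl
use strict;
use warnings;

my $line = " 10.0.0.1\t\tgateway";

my @by_regex = split /\s/, $line;    # one field per single whitespace char
my @by_space = split ' ',  $line;    # awk-style splitting

print scalar(@by_regex), " fields with /\\s/\n";
print scalar(@by_space), " fields with ' '\n";
```

Here /\s/ yields four fields, two of them empty, and ' ' yields just the
IP address and the hostname.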

># Ping each host once and label UP or DOWN
>sub getstatus {
>	$p = Net::Ping->new("icmp");

Is it *really* necessary to make a new Net::Ping object *every* time you
want to ping the hosts?  I don't think it is (but I haven't tested it, so
I could be wrong).

>	foreach $server ( keys %list ) {
>		if ($p->ping($server, 1)) {
>			$status = "UP";
>		}else{
>			$status = "DOWN";
>		}
>	write STDOUT;
>	}
>}
>
># Make it pretty
>format STDOUT = 
>@<<<<<<<<<<<<<<<< 	@<<<<<<<<<<<<<<<< 	@<<<<
>@{ $list{$server} }, 	$server,	  	$status
>}

Formats are kinda passe (in my opinion).  I'd use printf(), probably.
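
Something along these lines, as a sketch; the column widths are my guess
at what the format picture intends, and status_line is a made-up name:

```perl
use strict;
use warnings;

# Hypothetical helper: one left-justified line per host.
sub status_line {
    my ( $names, $ip, $status ) = @_;
    return sprintf "%-17s %-17s %-5s\n", $names, $ip, $status;
}

print status_line( 'gateway gw', '10.0.0.1', 'UP' );
print status_line( 'printer',    '10.0.0.7', 'DOWN' );
```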

>while (1) {
>	system(clear);

You should quote that word.

  system "clear";

>	getstatus();
>	sleep($int);
>}

That's all I have to say at first glance.

-- 
Jeff Pinyan         RPI Acacia Brother #734        RPI Acacia Corp Secretary
"And I vos head of Gestapo for ten     | Michael Palin (as Heinrich Bimmler)
 years.  Ah!  Five years!  Nein!  No!  | in: The North Minehead Bye-Election
 Oh.  Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)





------------------------------

Date: Tue, 08 Jun 2004 23:06:11 +0200
From: Michele Dondi <bik.mido@tiscalinet.it>
Subject: Re: No-install Perl Interpretor
Message-Id: <lj8cc05ch2oo57hq69bke78nc6aobvgaes@4ax.com>

On Mon, 07 Jun 2004 13:45:42 +0200, Micla <mick.lan@laposte.net>
wrote:

>Would anybody know a Perl version usable without any prior installation?
>
>One of the goals foreseen for it would be to have some "personal" 
>scripts on a USB memory key, immediately usable on any computer that I 
>have the opportunity to go on.
>
>But there are also other reasons to try to use Perl scripts without 
>installing Perl previously.

IMHO it's unreasonable to think that you won't need at least some of
the core modules. One possibility I have been thinking of for other
reasons, and which I would like to experiment with anyway, is to
exploit the ability[1] to put code in @INC in order to move *most*[2]
of the perl lib into, say, a zip archive. Of course this would be yet
another instance of "trading time for space"...
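
To give an idea of the mechanism, here is a toy sketch in which a code
reference in @INC serves module source from a hash rather than from disk.
The hash stands in for the archive, and Hypothetical::Greeter is an
invented module; reading from a real zip file is left out entirely.

```perl
use strict;
use warnings;

# Stand-in for the archive: file name => module source.
my %archive = (
    'Hypothetical/Greeter.pm' =>
        "package Hypothetical::Greeter;\n"
      . "sub hello { 'hello from memory' }\n"
      . "1;\n",
);

# A code reference in @INC is called with itself and the wanted file name.
# Returning a filehandle supplies the source; returning nothing declines.
unshift @INC, sub {
    my ( undef, $file ) = @_;
    return unless exists $archive{$file};
    open my $fh, '<', \$archive{$file}
        or die "can't open in-memory source for $file: $!";
    return $fh;
};

require Hypothetical::Greeter;
print Hypothetical::Greeter::hello(), "\n";
```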



[1] That I recently mentioned in another thread, BTW!
[2] It's obvious that a bunch of modules should remain available in a
physical directory to make the thing work...


Michele
-- 
you'll see that it shouldn't be so. AND, the writting as usuall is
fantastic incompetent. To illustrate, i quote:
- Xah Lee trolling on clpmisc,
  "perl bug File::Basename and Perl's nature"


------------------------------

Date: Tue, 08 Jun 2004 23:06:14 +0200
From: Michele Dondi <bik.mido@tiscalinet.it>
Subject: Re: No-install Perl Interpretor
Message-Id: <rs9cc0d70sori025cu092oo72jq6a4ojsm@4ax.com>

On 7 Jun 2004 17:46:41 -0800, yf110@vtn1.victoria.tc.ca (Malcolm
Dew-Jones) wrote:

>The key files required by perl are very few.
>
>On windows, for example, you need perl.exe and perhaps one .DLL file
>(perhaps another as well).  If you have those files in your path then you
>can write and run perl scripts without any kind of install, and without
>taking up much space.  Of course you can't use modules, unless you also
>include them somewhere, but so what, perl by itself is still extremely
>useful for things like mass batch editing of files and flexible filtering
>etc.

Yes, but even if I often think of myself as a
reinvent-the-wheel/roll-it-yourself kind of guy, on reflection I
realize that a considerable share of the scripts I write for daily use
rely on some module or another...

Not only that, but in some cases perl will silently use some modules
even when it doesn't appear to. I can't think of other examples ATM,
but an obvious one is:

  perl -MO=Deparse -le 'print for <*>'

that yields

  BEGIN { $/ = "\n"; $\ = "\n"; }
  use File::Glob ();
  foreach $_ (glob('*')) {
      print $_;
  }


Michele
-- 
you'll see that it shouldn't be so. AND, the writting as usuall is
fantastic incompetent. To illustrate, i quote:
- Xah Lee trolling on clpmisc,
  "perl bug File::Basename and Perl's nature"


------------------------------

Date: Tue, 08 Jun 2004 19:30:42 GMT
From: Bob Mariotti <r.mariotti@financialdatacorp.com>
Subject: Perl Large Scalar Question?
Message-Id: <3a486b3bbafd788ccfdc001dfff4a36d@news.teranews.com>

Here's one for the perl internals gurus:

We have a module that is used by many programs in our suite.
Currently it works and has been working just fine for several years.

It takes a string that is passed to it and sends it out a tty port to
a remote host, then enters a read loop, concatenating all returned
records until it receives one beginning with "0999".  Here is a sample
snippet (NOT ACTUAL CODE):

>$DATA="";
>open(XXX,"+>/dev/ttyxxxx") or die $!;
># Send Request to host
>print XXX "$Message\015" or die $!;
># Get Response from host
>while (<XXX>) {
>    # Exit if Last Record Signal
>    if (m/^0999/)  { last; }
>    # Construct Host Response String
>    $DATA.=$_;
>}
>return $DATA;

In the vast majority of cases this works just fine, as the records
coming back number one, a few, or at most a dozen or two.

However, we've added a request that will return upwards of 1000+
records of 115 bytes each.   The time to execute is becoming
unacceptable and my thought is that the reallocation of the scalar
variable $DATA is costing too much in time and resource.

Is there any good way to preallocate that scalar to approx 1000 x 115
bytes before entering the loop, so that the reallocation, data content
move, and destruction of the old scalar don't have to happen?

Also, can anyone recommend a more efficient method for obtaining and
passing this data back to the calling routine?

Thanks all.

Bob Mariotti
"perl is great!"


------------------------------

Date: Tue, 08 Jun 2004 16:04:55 -0400
From: thundergnat <thundergnat@hotmail.com>
Subject: Re: Perl Large Scalar Question?
Message-Id: <40c61be5$0$2939$61fed72c@news.rcn.com>

Bob Mariotti wrote:

> 
> However, we've added a request that will return upwards of 1000+
> records of 115 bytes each.   The time to execute is becoming
> unacceptable and my thought is that the reallocation of the scalar
> variable $DATA is costing too much in time and resource.

Unfortunately, I would suspect that the transfer of 115000 bytes of data 
over a serial link is going to be three or more orders of magnitude 
slower than allocating 115000 bytes of space in memory. It sounds like 
you are trying to optimize the wrong thing.

You would likely get more bang for your buck trying to transfer the data 
in larger blocks, compressing the data before it is sent, or using a 
faster transport method. Not knowing the particulars of your 
application, it is difficult to make any really useful observations.
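
One rough way to check this is to time the appends on their own and
compare that with the time the same bytes need on the wire. The sketch
below is not the original module: the record content is made up, and
the 9600 baud, 8N1 link is purely an assumption, since the post does
not give the line speed.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

my $record = ('x' x 114) . "\n";    # one made-up 115-byte record
my $count  = 1000;

# Time the in-memory appends only; no I/O is involved here.
my $start = time;
my $data  = '';
$data .= $record for 1 .. $count;
my $append_seconds = time - $start;

# Assumed link: 9600 baud, 8N1 framing, so 10 bits on the wire per byte.
my $baud         = 9600;
my $wire_seconds = length($data) * 10 / $baud;

printf "appended %d bytes in %.6f s; wire time at %d baud is about %.1f s\n",
    length($data), $append_seconds, $baud, $wire_seconds;
```

If the append time turns out to be a tiny fraction of the wire time,
preallocating $DATA is unlikely to make a noticeable difference.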


------------------------------

Date: 08 Jun 2004 20:56:35 +0200
From: Michal Wojciechowski <odyniec-usenet@odyniec.net>
Subject: Re: signal not being caught when re-exec()d
Message-Id: <873c55vpy4.fsf@odyniec.odyniec.net>

Jeff 'japhy' Pinyan <pinyaj@rpi.edu> writes:

[...]

>     $SIG{USR1} = sub {
>       warn "$$: re-exec()ing with [@args]\n";
>       exec @args;
>     };

[...]

> Then I send it a USR1 signal (numerical value is 10):
> 
>   % kill -10 1234
> 
> and I get
> 
>   1234: re-exec()ing with [/usr/bin/perl exec.pl arg]
>   1234: [/usr/bin/perl exec.pl arg]
> 
> Then I try to send it the signal again... and nothing.  No messages
> at all.  But the program is running, with the same PID... I'm
> baffled.

When the signal handler is triggered, the signal that caused it to run
gets blocked. The new program called with exec inherits the signal
mask, and therefore blocks the USR1 signal. To be able to catch it
again, you need to reset the signal mask. Use the POSIX sigprocmask
function:

  use POSIX qw(:signal_h);

  $sigset = POSIX::SigSet->new;
  sigprocmask(SIG_SETMASK, $sigset);
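
To see what the mask looks like before and after such a call, the
current mask can be queried with the same function. The following is
only a sketch for a POSIX system; is_blocked is a hypothetical helper
name, and it uses SIG_UNBLOCK on USR1 alone rather than clearing the
whole mask as above.

```perl
use strict;
use warnings;
use POSIX qw(:signal_h);

# Hypothetical helper: report whether a signal is in the current mask.
# Blocking an empty set changes nothing, but fills $old with the mask.
sub is_blocked {
    my ($signo) = @_;
    my $old = POSIX::SigSet->new;
    sigprocmask(SIG_BLOCK, POSIX::SigSet->new, $old)
        or die "sigprocmask failed: $!";
    return $old->ismember($signo) == 1 ? 1 : 0;
}

my $usr1 = POSIX::SigSet->new(SIGUSR1);

# Simulate the state inherited across exec(): USR1 blocked.
sigprocmask(SIG_BLOCK, $usr1) or die "sigprocmask failed: $!";
my $blocked_before = is_blocked(SIGUSR1);

# Remove only USR1 from the mask, leaving other signals untouched.
sigprocmask(SIG_UNBLOCK, $usr1) or die "sigprocmask failed: $!";
my $blocked_after = is_blocked(SIGUSR1);

print "USR1 blocked before: $blocked_before, after: $blocked_after\n";
```

Running the unblock step near the top of the re-exec()ed script, before
installing the handler, should let the next USR1 through.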

-- 
Michal Wojciechowski : for(<>){/\s/,$l{$m=$`}=$'}$_ : 10 PRINT "Yet another"
odyniec()odyniec;net : =$l{$c},/O\s/?$c=$'-1:y/"//d : 20 PRINT "Perl hacker"
 http://odyniec.net  : ,/T\s/?print$':0while$c++<$m : 30 GOTO 10


------------------------------

Date: Tue, 08 Jun 2004 20:36:54 +0200
From: Gunnar Strand <MyFirstnameHere.News1@gustra.org>
Subject: Re: sub returning nothing
Message-Id: <ca5106$711$1@hudsucker.umdac.umu.se>

Brian McCauley wrote:
> Gunnar Strand <MyFirstnameHere.News1@gustra.org> writes:
[...]
>>sub myvoid {
>>}
> 
> 
> That is not a subroutine returning nothing.
> 
> That is an empty function.
> 
> In a list context an empty function returns its arguments.

It returns an empty list in a list context according to perlmod,
and perl -e 'sub a{};print a(1, 2, 3)' prints nothing.

> In a scalar context it returns undef.
> 
> However I wouldn't consider this to be defined behaviour.  I would be
> inclined to consider the behaviour of an empty function wrt what it
> returns to be undefined.
> So let us instead consider a function that returns nothing.
> 
> sub myvoid { 
>   return;
> }
> 
>>isok( myvoid(), 1 );
>>isok( ! myvoid(), 1 );
>>
>>This script will print
>>
>>Ok
>>Ok
>>
>>and not any 'Nok'. I assume that's because 'nothing' is magically
>>removed from @_, leaving only the '1', right?
> 
> 
> No magic.  If you call a function in a list context and it returns
> 'nothing' then nothing is an empty list.  If you append a 1 to an
> empty list you get a single element list containing the value 1.

Thanks for the clarification. I didn't think about the list
context for the sub arguments.
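
The effect can be seen by counting the arguments that actually arrive.
This is a small sketch; count_args is a made-up helper standing in for
isok, and scalar() is one way to force scalar context on the call.

```perl
use strict;
use warnings;

sub myvoid { return; }    # bare return: empty list in list context

# Hypothetical helper that only reports how many arguments it received.
sub count_args { return scalar @_; }

# List context: myvoid() contributes nothing, so only the 1 arrives.
my $n_list = count_args(myvoid(), 1);

# Forced scalar context: myvoid() yields undef, so two arguments arrive.
my $n_scalar = count_args(scalar(myvoid()), 1);

# The ! operator also imposes scalar context, so its result is one value.
my $n_negated = count_args(!myvoid(), 1);

print "list: $n_list, scalar: $n_scalar, negated: $n_negated\n";
```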

>>I am trying out
>>the Test::Unit suite and had an empty sub in the code under test
>>while using assert in my test case, and all tests passed:
>>
>>$self -> assert( $target -> the_test(), "Test OK" );
>>$self -> assert( ! $target -> the_test(), "Test NOK" );
>>
>>IMHO, it would have been nice if perl could have issued some kind
>>of warning, since relying on 'nothing' as a return value seems...
>>well, not-so-common.
> 
> 
> I'm afraid that calling functions in a list context is common in
> perl.

Agreed. But it is going to make it more difficult to write test
cases for methods erroneously using return; (or empty) instead of
a valid value, as some tests will have to create the needed
context (or at least when using Test::Unit::TestCase.)

Thanks for the link to the discussion, that was interesting reading.

Cheers,

/Gunnar


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 6667
***************************************
