[18638] in Perl-Users-Digest


Perl-Users Digest, Issue: 806 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Apr 30 21:06:31 2001

Date: Mon, 30 Apr 2001 18:05:08 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <988679108-v10-i806@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Mon, 30 Apr 2001     Volume: 10 Number: 806

Today's topics:
    Re: FAQ 3.3:   Is there a Perl shell? (F. Xavier Noria)
    Re: Good editor for perl (Steve Lamb)
    Re: newbie question <webdaddy@delete.operamail.com>
    Re: newbie question <bart.lateur@skynet.be>
    Re: one-line stderr, stdout redirection <bart.lateur@skynet.be>
    Re: one-line stderr, stdout redirection (Anno Siegel)
    Re: one-line stderr, stdout redirection (Randal L. Schwartz)
    Re: one-line stderr, stdout redirection <Jonathan.L.Ericson@jpl.nasa.gov>
    Re: Point of using perlcc (Abigail)
        port problem <kienyeny@uci.edu>
        RegEx Question <nospam@newsranger.com>
    Re: RegEx Question (Abigail)
    Re: RegEx Question <nospam@newsranger.com>
    Re: Remove Adult Files with Perl <dodger@necrosoft.net>
    Re: Remove Adult Files with Perl (Randal L. Schwartz)
        requiring something I only need once (Peter Seebach)
    Re: requiring something I only need once <bart.lateur@skynet.be>
    Re: requiring something I only need once <sharding@ccbill.com>
    Re: requiring something I only need once <dodger@necrosoft.net>
    Re: Retrieve source <sharding@ccbill.com>
    Re: Retrieve source (Chris Fedde)
    Re: Should Perl be first? <miltonroad@btinternet.com>
    Re: Should Perl be first? <bchambless@nrlssc.navy.mil>
    Re: Testing whether a socket is still connected (Chris Fedde)
    Re: XML::RSS and mod_perl <m2@csu.edu.au>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Mon, 30 Apr 2001 22:13:58 GMT
From: fxn@isoco.com (F. Xavier Noria)
Subject: Re: FAQ 3.3:   Is there a Perl shell?
Message-Id: <3aede25d.30721271@news.iddeo.es>

On Mon, 30 Apr 2001 18:17:02 GMT, PerlFAQ Server <faq@denver.pm.org> wrote:

:   Is there a Perl shell?
: 
:     In general, no. The Shell.pm module (distributed with Perl) makes Perl
:     try commands which aren't part of the Perl language as shell commands.
:     perlsh from the source distribution is simplistic and uninteresting, but
:     may still be what you want.

Perhaps the Perl Shell could be mentioned there somehow? The Perl Shell
homepage is at

   http://www.focusresearch.com/gregor/psh/

-- fxn


------------------------------

Date: Mon, 30 Apr 2001 22:38:03 -0000
From: grey@despair.rpglink.com (Steve Lamb)
Subject: Re: Good editor for perl
Message-Id: <slrn9erqaa.t54.grey@teleute.dmiyu.org>

On Sat, 28 Apr 2001 16:53:16 +0200, Michael Ströck <michael@stroeck.com>
wrote:
>Well, sometimes that isn't true.
>This editor is indeed very good.

    Define "very good".  Is it emacs very good?  (el|n)vi(is|m) good?  Joe
good?  Ultra edit good?

    Now understand that out of that grouping there is only one I consider good
and one I consider very good.  Since editor preferences are personal and a
large part can be told through screenshots when it comes to an editor, I do
believe it is true.  ;)

-- 
         Steve C. Lamb         | I'm your priest, I'm your shrink, I'm your
         ICQ: 5107343          | main connection to the switchboard of souls.
-------------------------------+---------------------------------------------


------------------------------

Date: Mon, 30 Apr 2001 23:05:45 +0100
From: "cjam" <webdaddy@delete.operamail.com>
Subject: Re: newbie question
Message-Id: <mjlH6.10436$Kt2.1028458@news6-win.server.ntlworld.com>

thanks for the xpert advice,
I'll try and remember my subject line manners next time too....




------------------------------

Date: Mon, 30 Apr 2001 22:16:11 GMT
From: Bart Lateur <bart.lateur@skynet.be>
Subject: Re: newbie question
Message-Id: <23pret4sh2tunhnh3jrq9s8d2pbhap6102@4ax.com>

cjam wrote:

>and I've set the permissions on it to be ReadWriteExecute for both the
>folder and in the IIS properties....

AFAIK on IIS you need execute but NO read permissions!

-- 
	Bart.


------------------------------

Date: Mon, 30 Apr 2001 22:11:52 GMT
From: Bart Lateur <bart.lateur@skynet.be>
Subject: Re: one-line stderr, stdout redirection
Message-Id: <igoretgfgrqqr3hd864kifg776vaeupc7n@4ax.com>

Logan Shaw wrote:

>>open STDERR, ">stderr.txt" or die "Cannot reopen STDERR: $!";
>
>Just out of curiosity, does this reuse file numbers?  I.e. does it
>cause STDERR to be associated with file #2 and file #2 to be opened to
>stderr.txt?  Or does it potentially use (say) file #7, open that to
>stderr.txt and re-associate STDERR with file #7 instead of #2?

Why don't you test it? There's a neat little function built into Perl
for this: fileno().

For me, it still returns 2, even after running the above line. Tested on
Windows (IndigoPerl 5.6.0) and FreeBSD (5.005_03).
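
A minimal sketch of that test, for anyone who wants to repeat it. The file name stderr.txt comes from the quoted line; the variable names are only illustrative, and the result may differ on other platforms or Perl builds.

```perl
#!/usr/bin/perl -w
use strict;

# Descriptor number of STDERR before the reopen.
my $before = fileno(STDERR);

open STDERR, ">stderr.txt" or die "Cannot reopen STDERR: $!";

# Descriptor number after the reopen; compare the two.
my $after = fileno(STDERR);

print "STDERR was descriptor $before, is now descriptor $after\n";
```

The message goes to STDOUT on purpose, since STDERR now points at the file.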

-- 
	Bart.


------------------------------

Date: 30 Apr 2001 22:38:40 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: one-line stderr, stdout redirection
Message-Id: <9ckphg$999$1@mamenchi.zrz.TU-Berlin.DE>

According to Bart Lateur  <bart.lateur@skynet.be>:
> Logan Shaw wrote:
> 
> >>open STDERR, ">stderr.txt" or die "Cannot reopen STDERR: $!";
> >
> >Just out of curiosity, does this reuse file numbers?  I.e. does it
> >cause STDERR to be associated with file #2 and file #2 to be opened to
> >stderr.txt?  Or does it potentially use (say) file #7, open that to
> >stderr.txt and re-associate STDERR with file #7 instead of #2?
> 
> Why don't you test it? There's a neat little function built into Perl
> for this: fileno().
> 
> For me, it still returns 2, even after running the above line. Tested on
> Windows (IndigoPerl 5.6.0) and FreeBSD (5.005_03).

I'd say this would have to be documented to be useful.  Even if tests
confirm the behavior, you couldn't rely on it.

I seem to remember (and books have a way to be where you aren't), that
APP discusses this at one point.  File descriptors below some kernel
constant (something about the maximum system file descriptor) are
more stable (in probably just this sense).  And no, I couldn't possibly
be more vague.

Anno


------------------------------

Date: 30 Apr 2001 16:22:09 -0700
From: merlyn@stonehenge.com (Randal L. Schwartz)
Subject: Re: one-line stderr, stdout redirection
Message-Id: <m1u2367z32.fsf@halfdome.holdit.com>

>>>>> "Anno" == Anno Siegel <anno4000@lublin.zrz.tu-berlin.de> writes:

Anno> I seem to remember (and books have a way to be where you aren't), that
Anno> APP discusses this at one point.  File descriptors below some kernel
Anno> constant (something about the maximum system file descriptor) are
Anno> more stable (in probably just this sense).  And no, I couldn't possibly
Anno> be more vague.

Sure you could.

"File Descriptors are used in some way, maybe for this, I'm not sure."

{grin}

But while I'm not sure what piece of documentation ensures it,
I'm very sure that I know somehow that STDIN is always fileno 0,
STDOUT always 1, and STDERR 2.  Even when reopened as such.
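
The documentation being half-remembered in this thread may be the `$^F` entry in perlvar (the maximum system file descriptor, ordinarily 2): descriptors at or below it count as system descriptors and are preserved across an open(). That reading is an assumption, not something stated in the thread. A small sketch that prints the value and where the standard handles sit, with illustrative variable names:

```perl
use strict;
use warnings;

# $^F is the highest descriptor Perl treats as a "system" descriptor.
my $max_system_fd = $^F;
print "maximum system file descriptor: $max_system_fd\n";

# Report where the three standard handles currently sit.
my %std = (STDIN => \*STDIN, STDOUT => \*STDOUT, STDERR => \*STDERR);
for my $name (sort keys %std) {
    my $fd = fileno($std{$name});
    print "$name is descriptor ", (defined $fd ? $fd : 'closed'), "\n";
}
```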

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!


------------------------------

Date: 30 Apr 2001 23:11:27 +0000
From: Jon Ericson <Jonathan.L.Ericson@jpl.nasa.gov>
Subject: Re: one-line stderr, stdout redirection
Message-Id: <86bspenfts.fsf@jon_ericson.jpl.nasa.gov>

"Dennis Kowalski" <dennis.kowalsk@daytonoh.ncr.com> writes:

> Try this
> 
> open(LOG,">$filename");
> *STDERR = *LOG;
> 
> Gerald Shuman <nospam@newsranger.com> wrote in message
> news:BrgH6.4149$SZ5.335582@www.newsranger.com...
> > What's the perl equivalent of "exec 2> stderr.txt" in a shell script?
> >
> >

Yikes!  Top-posting, "Try this", and not checking the return value of
a system call are three signs of bad advice.  I suppose that the
general idea (opening a filehandle and copying the entire typeglob to
the STDERR typeglob) isn't terrible, but there are better solutions.
(perlopentut covers this topic quite well.)
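
One of those better solutions, sketched here on the assumption of a Perl recent enough for three-argument open and lexical filehandles (neither is in the post above). It redirects STDERR with a checked open and keeps a duplicate so the original can be restored; `$saved_err` is an illustrative name.

```perl
use strict;
use warnings;

# Keep a duplicate of the original STDERR so it can be restored later.
open(my $saved_err, '>&', \*STDERR) or die "Cannot dup STDERR: $!";

# Rough equivalent of the shell's "exec 2> stderr.txt".
open(STDERR, '>', 'stderr.txt') or die "Cannot redirect STDERR: $!";
print STDERR "this line goes to stderr.txt\n";

# Point STDERR back at the saved duplicate.
open(STDERR, '>&', $saved_err) or die "Cannot restore STDERR: $!";
print STDERR "this line goes to the original STDERR\n";
```

If the redirection should last for the rest of the program, the first and last open calls can simply be dropped.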

Jon


------------------------------

Date: Mon, 30 Apr 2001 23:35:46 +0000 (UTC)
From: abigail@foad.org (Abigail)
Subject: Re: Point of using perlcc
Message-Id: <slrn9ertmi.59s.abigail@tsathoggua.rlyeh.net>

David Coppit (newspost@coppit.org) wrote on MMDCCXCIX September MCMXCIII
in <URL:news:Pine.SUN.4.33.0104300903140.7515-100000@mamba.cs.Virginia.EDU>:
??  
??  Also, you may require a version of Perl that isn't installed on the
??  system. (e.g. I recently found a bug in Perl 5.7.0 that kills my
??  program. :( )

Well, 5.7.x is a *development* release. You should not expect a 5.7.x
to come with your system, or to work flawlessly. 



Abigail
-- 
($;,$_,$|,$\)=("\@\x7Fy~*kde~box*Zoxf*Bkiaox","X"x25,1,"\r");
s/./ /;{vec($_=>1+$"=>8)=ord($/^substr$;=>$"=int rand 24=>1);
print&&select$,,$,,$,,$|/($|+tr/X//c);redo if y/X//};sleep 1;


------------------------------

Date: Mon, 30 Apr 2001 16:16:54 -0700
From: "Kien Y. Yee" <kienyeny@uci.edu>
Subject: port problem
Message-Id: <9ckrrc$de$1@news.service.uci.edu>

Hi all,

I'm a real newbie in perl programming, and I'm real stuck with this
assignment that I'm doing now. Wonder if anybody could help.

Problem 1:
I have to create a fake web server on a xxx.com:8080/~userID/file.pl to make
a connection to the 8080 port. I don't seem to get the connection done.

Problem 2:
After the above connection is made, I have to test out using another port
number to test my html page e.g. http://server.123.com:12345/test.html

Can anyone help?

Thanks!

KY





------------------------------

Date: Tue, 01 May 2001 00:05:22 GMT
From: Dan <nospam@newsranger.com>
Subject: RegEx Question
Message-Id: <65nH6.4725$SZ5.384799@www.newsranger.com>

Here is a portion of the data I am using:

SEVERE WEATHER STATEMENT
NATIONAL WEATHER SERVICE ST LOUIS MO
617 PM CDT MON APR 30 2001

How can I use a regular expression to get all the text after SERVICE?

Thanks,
Dan




------------------------------

Date: Tue, 1 May 2001 00:24:52 +0000 (UTC)
From: abigail@foad.org (Abigail)
Subject: Re: RegEx Question
Message-Id: <slrn9es0ik.59s.abigail@tsathoggua.rlyeh.net>

Dan (nospam@newsranger.com) wrote on MMDCCC September MCMXCIII in
<URL:news:65nH6.4725$SZ5.384799@www.newsranger.com>:
``  Here is a portion of the data I am using:
``  
``  SEVERE WEATHER STATEMENT
``  NATIONAL WEATHER SERVICE ST LOUIS MO
``  617 PM CDT MON APR 30 2001
``  
``  How can I use a regular expression to get all the text after SERVICE?

my ($text) = $data =~ /SERVICE(.*)/s;
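
A short sketch of how that line behaves on the sample data, assuming the whole statement is held in `$data` as one string. It also shows what the /s modifier changes; `$rest_of_line` is an illustrative name.

```perl
use strict;
use warnings;

my $data = <<'END';
SEVERE WEATHER STATEMENT
NATIONAL WEATHER SERVICE ST LOUIS MO
617 PM CDT MON APR 30 2001
END

# With /s, "." also matches newlines, so the capture runs to the end of $data.
my ($text) = $data =~ /SERVICE(.*)/s;

# Without /s, the capture stops at the end of the SERVICE line.
my ($rest_of_line) = $data =~ /SERVICE(.*)/;

print "with /s:    [$text]\n";
print "without /s: [$rest_of_line]\n";
```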


Abigail
-- 
perl -MLWP::UserAgent -MHTML::TreeBuilder -MHTML::FormatText -wle'print +(
HTML::FormatText -> new -> format (HTML::TreeBuilder -> new -> parse (
LWP::UserAgent -> new -> request (HTTP::Request -> new ("GET",
"http://work.ucsd.edu:5141/cgi-bin/http_webster?isindex=perl")) -> content))
=~ /(.*\))[-\s]+Addition/s) [0]'


------------------------------

Date: Tue, 01 May 2001 01:02:05 GMT
From: Dan <nospam@newsranger.com>
Subject: Re: RegEx Question
Message-Id: <hWnH6.4781$SZ5.388106@www.newsranger.com>

Thank you...  But, how can I get all the data before the $$?

---
WIDELY SCATTERED STRONG THUNDERSTORMS WILL CONTINUE ACROSS THE BISTATE
REGION THROUGH 800 PM CDT. 

$$
RP 
---

Thanks,
Dan
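
One possible approach to this follow-up, sketched on the assumption that the $$ marker always starts its own line and that the whole product is held in `$data` (both are guesses about the real input):

```perl
use strict;
use warnings;

my $data = <<'END';
WIDELY SCATTERED STRONG THUNDERSTORMS WILL CONTINUE ACROSS THE BISTATE
REGION THROUGH 800 PM CDT.

$$
RP
END

# /s lets "." cross newlines, /m lets "^" match at the start of any line.
# The non-greedy .*? stops at the first line that begins with a literal $$.
my ($before) = $data =~ /\A(.*?)^\$\$/ms;

print $before if defined $before;
```

The dollar signs are escaped because an unescaped $$ in a pattern would interpolate the process ID.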

In article <slrn9es0ik.59s.abigail@tsathoggua.rlyeh.net>, Abigail says...
>
>Dan (nospam@newsranger.com) wrote on MMDCCC September MCMXCIII in
><URL:news:65nH6.4725$SZ5.384799@www.newsranger.com>:
>``  Here is a portion of the data I am using:
>``  
>``  SEVERE WEATHER STATEMENT
>``  NATIONAL WEATHER SERVICE ST LOUIS MO
>``  617 PM CDT MON APR 30 2001
>``  
>``  How can I use a regular expression to get all the text after SERVICE?
>
>my ($text) = $data =~ /SERVICE(.*)/s;
>
>
>Abigail
>-- 
>perl -MLWP::UserAgent -MHTML::TreeBuilder -MHTML::FormatText -wle'print +(
>HTML::FormatText -> new -> format (HTML::TreeBuilder -> new -> parse (
>LWP::UserAgent -> new -> request (HTTP::Request -> new ("GET",
>"http://work.ucsd.edu:5141/cgi-bin/http_webster?isindex=perl")) -> content))
>=~ /(.*\))[-\s]+Addition/s) [0]'




------------------------------

Date: Mon, 30 Apr 2001 23:26:40 GMT
From: "Dodger" <dodger@necrosoft.net>
Subject: Re: Remove Adult Files with Perl
Message-Id: <QwmH6.40555$B22.9868197@news1.rdc2.pa.home.com>

"BUCK NAKED1" <dennis100@webtv.net> wrote in message
news:11901-3AEDAF15-116@storefull-242.iap.bryant.webtv.net...
> I want to remove all files in a subdirectory of "wkdir" IF they include
> one of many defined "bad words." I know this is not a complete solution,
> as people can name adult files anything they wish; but at least it's a
> start. My webhost doesn't allow adult material, and is peculiar. They
> search words and if they find certain words on your site, they just
> delete your site. Thus, I have to pad the "dirty words" as I've done
> below.
>
> Is this a good solution, or is there a "dirty word" filter script
> already out there?
>
> $f = "f0u0c0k"; $f =~ s/0//;
> $s = "s0e0x"; $s =~ s/0//;
> $wkdir = "wkdir/";
> # Remove files with bad words
> use File::Find;
> find sub {-f;
> if((my $new = $_) =~ $f | $s )
> { unlink $_; }  }, $wkdir ;
>
> I wrote the above for filtering "dirty" words in filenames, but I'd also
> like a script for filtering out "certain" words in all files in a
> directory too, if anyone has one.

Are these for inside CGIs or for words in HTML files or text files?
if HTML files, you can always do this to them all:

s[($badword)][join '', map {'&#'.ord($_).';'} split //, $1]ige

It should get rid of any recognisable words, replacing them with chains of
&#---; constructs.

Um, that may need another e on the end. I always need to look that up, and I
didn't.
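
A runnable sketch of that substitution, with a capture group so that $1 is set and a comma after the split pattern. The word and the HTML snippet are hypothetical stand-ins; \Q...\E is added on the assumption that the word should be matched literally.

```perl
use strict;
use warnings;

my $badword = 'dog';                           # hypothetical stand-in word
my $html    = '<p>My Dog is a good dog.</p>';  # hypothetical input

# Capture the match so $1 is set, then rebuild it one character at a time
# as numeric character references. /e evaluates the replacement as code.
(my $masked = $html) =~
    s[(\Q$badword\E)][join '', map { '&#' . ord($_) . ';' } split //, $1]ige;

print "$masked\n";
```

A single /e is enough here, since the replacement only needs to be evaluated once.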
--
Dodger
www.dodger.org
www.necrosoft.net
www.gothic-classifieds.com





------------------------------

Date: 30 Apr 2001 16:29:34 -0700
From: merlyn@stonehenge.com (Randal L. Schwartz)
Subject: Re: Remove Adult Files with Perl
Message-Id: <m1n18y7yqp.fsf@halfdome.holdit.com>

>>>>> "BUCK" == BUCK NAKED1 <dennis100@webtv.net> writes:

BUCK> I want to remove all files in a subdirectory of "wkdir" IF they include
BUCK> one of many defined "bad words." I know this is not a complete solution,
BUCK> as people can name adult files anything they wish; but at least it's a
BUCK> start. My webhost doesn't allow adult material, and is peculiar. They
BUCK> search words and if they find certain words on your site, they just
BUCK> delete your site. Thus, I have to pad the "dirty words" as I've done
BUCK> below.

My favorite is to remember that

        '[d][o][g]'

is a regex that doesn't match itself, and yet it's obvious what
it matches.
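
A sketch of how that trick could be combined with File::Find, under some assumptions: the two patterns are harmless stand-ins, `flagged_files` is a hypothetical helper, and it only lists matching files rather than deleting them.

```perl
use strict;
use warnings;
use File::Find;

# Stand-in patterns, each written as one-character classes so the script
# never contains the plain word.
my @patterns    = ('[d][o][g]', '[c][a][t]');
my $alternation = join '|', @patterns;
my $flagged_re  = qr/$alternation/i;

# The pattern text itself is not matched by the pattern.
print "pattern matches its own text\n" if $patterns[0] =~ $flagged_re;

# Return the plain files under $dir whose names match any pattern.
sub flagged_files {
    my ($dir) = @_;
    my @hits;
    find(sub { push @hits, $File::Find::name if -f $_ && $_ =~ $flagged_re }, $dir);
    return @hits;
}
```

Once the list from `flagged_files('wkdir')` looks right, it can be handed to unlink.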

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!


------------------------------

Date: 30 Apr 2001 22:10:05 GMT
From: seebs@plethora.net (Peter Seebach)
Subject: requiring something I only need once
Message-Id: <3aede2bd$0$39593$3c090ad1@news.plethora.net>

Trivial test program:
	foo.pl:
	my @foo = (1);
	1;

	bar.pl:
	#!/usr/bin/perl -w
	require "foo.pl";
	foreach $i (@foo) {
		print "$i\n";
	}

Running "bar.pl" gets me:
Name "main::foo" used only once: possible typo at ./bar.pl line 3.

Is there a way to tell perl that anything I refer to in one module, that
I defined in another, was *actually* used twice, not just once?  It seems to
me that, since the code works "as expected", I'm obviously actually
successfully using the name twice - once to define it, once to refer back to
it.

I want to use this kind of construct for shared data, which any given
program may only need once, but which a number of programs may use.  How do
I suppress this warning?

-s
-- 
Copyright 2001, all wrongs reversed.  Peter Seebach / seebs@plethora.net
C/Unix wizard, Pro-commerce radical, Spam fighter.  Boycott Spamazon!
Consulting & Computers: http://www.plethora.net/


------------------------------

Date: Mon, 30 Apr 2001 22:21:30 GMT
From: Bart Lateur <bart.lateur@skynet.be>
Subject: Re: requiring something I only need once
Message-Id: <4apretguh01qj0q0qfqcj3jk09sfr1814a@4ax.com>

Peter Seebach wrote:

>Trivial test program:
>	foo.pl:
>	my @foo = (1);
>	1;
>
>	bar.pl:
>	#!/usr/bin/perl -w
>	require "foo.pl";
>	foreach $i (@foo) {
>		print "$i\n";
>	}
>
>Running "bar.pl" gets me:
>Name "main::foo" used only once: possible typo at ./bar.pl line 3.

If you want the same @foo, and it looks like you do, then DON'T make it
a lexical! You NEED a global variable there!

Bloody 'strict'. Get rid of the "my". Use "use vars"  or "our",  it may
help. It may most certainly help in getting rid of that warning that is
bothering you.

-- 
	Bart.


------------------------------

Date: Mon, 30 Apr 2001 15:47:21 -0700
From: "Shay Harding" <sharding@ccbill.com>
Subject: Re: requiring something I only need once
Message-Id: <9ckph2$1mni$1@node17.cwnet.frontiernet.net>

"Peter Seebach" <seebs@plethora.net> wrote in message
news:3aede2bd$0$39593$3c090ad1@news.plethora.net...
> Trivial test program:
> foo.pl:
> my @foo = (1);
> 1;
>
> bar.pl:
> #!/usr/bin/perl -w
> require "foo.pl";
> foreach $i (@foo) {
> print "$i\n";
> }
>
> Running "bar.pl" gets me:
> Name "main::foo" used only once: possible typo at ./bar.pl line 3.
>
> Is there a way to tell perl that anything I refer to in one module, that
> I defined in another, was *actually* used twice, not just once?  It seems to
> me that, since the code works "as expected", I'm obviously actually
> successfully using the name twice - once to define it, once to refer back
> to it.

If you put 'use diagnostics;' inside bar.pl, you would have received
additional information and a hint as to how to fix it:

The last line of the diagnostic output is:

The *our* declaration is provided for this purpose (I added the '*'s). But
this is only part of a solution. Change foo.pl to:

our @foo = (1);
1;

and change bar.pl to:

#!/usr/bin/perl -w

use diagnostics;

BEGIN{ require 'foo.pl'; }

print "$_\n" for @foo;


You wrap the require in a BEGIN{} block. Of course once you add 'use
strict;' this will fail because @foo was not declared inside bar.pl. To fix
that, inside bar.pl add:

our @foo;

OR

Change foo.pl to:

use vars qw(@foo);
@foo = (1);
1;


Change bar.pl to:

#!/usr/bin/perl -w

use diagnostics;
use strict;

BEGIN{ require 'foo.pl'; }

print "$_\n" for @foo;


Really depends on what Perl version you are using since 'our' is relatively
new. You can do:

perldoc -f our

to find out more on this.


Shay






------------------------------

Date: Mon, 30 Apr 2001 23:14:44 GMT
From: "Dodger" <dodger@necrosoft.net>
Subject: Re: requiring something I only need once
Message-Id: <ElmH6.40492$B22.9859842@news1.rdc2.pa.home.com>

"Peter Seebach" <seebs@plethora.net> wrote in message
news:3aede2bd$0$39593$3c090ad1@news.plethora.net...
> Trivial test program:
> foo.pl:
> my @foo = (1);
> 1;
>
> bar.pl:
> #!/usr/bin/perl -w
> require "foo.pl";
> foreach $i (@foo) {
> print "$i\n";
> }
>
> Running "bar.pl" gets me:
> Name "main::foo" used only once: possible typo at ./bar.pl line 3.
>
> Is there a way to tell perl that anything I refer to in one module, that
> I defined in another, was *actually* used twice, not just once?  It seems to
> me that, since the code works "as expected", I'm obviously actually
> successfully using the name twice - once to define it, once to refer back
> to it.
>
> I want to use this kind of construct for shared data, which any given
> program may only need once, but which a number of programs may use.  How
> do I suppress this warning?

Remove the my in the required lib.
You ARE using it only once, two times. (err, it really does  make sense).
The required lib counts as a block, so it's like declaring a my variable in
a bare block and then trying to access it outside of it.

try:
our @foo;
or use vars ('@foo');
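
A self-contained sketch of the difference, assuming a Perl that has `our`. It writes a throwaway foo.pl into a temporary directory (the File::Temp usage and the variable names are illustrative, not from the thread) and then requires it:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Package variables in main that the required file may or may not fill in.
our (@lexical_try, @shared);

my $dir = tempdir(CLEANUP => 1);
my $lib = "$dir/foo.pl";
open(my $out, '>', $lib) or die "Cannot write $lib: $!";
print {$out} <<'END';
my  @lexical_try = (1);   # file-scoped lexical: invisible to the caller
our @shared      = (1);   # package variable @main::shared
1;
END
close($out) or die "Cannot close $lib: $!";

require $lib;

print "lexical_try has ", scalar(@lexical_try), " element(s)\n";
print "shared has ", scalar(@shared), " element(s)\n";
```

The `my` array lives only inside the required file, so the caller's array of the same name stays empty, while the `our` array is shared.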

--
Dodger
www.dodger.org
www.necrosoft.net
www.gothic-classifieds.com





------------------------------

Date: Mon, 30 Apr 2001 15:27:23 -0700
From: "Shay Harding" <sharding@ccbill.com>
Subject: Re: Retrieve source
Message-Id: <9ckobk$1aum$1@node17.cwnet.frontiernet.net>

"Jimmy" <klammy@hotmail.com> wrote in message
news:9ckmv9$hue$1@news.net.uni-c.dk...
> Hi
>
>  I need to make a program that reads the source of a given URL as a
>  command line argument and outputs the source of the given URL and the
>  source of each URL that appears in the source of the given URL.
>
>  Text within HTML tags should not appear in the output.

man HTML::LinkExtor

The example there should give you a good start.


Shay






------------------------------

Date: Mon, 30 Apr 2001 22:35:33 GMT
From: cfedde@fedde.littleton.co.us (Chris Fedde)
Subject: Re: Retrieve source
Message-Id: <VMlH6.215$T3.204082176@news.frii.net>

In article <9ckmv9$hue$1@news.net.uni-c.dk>, Jimmy <klammy@hotmail.com> wrote:
> Hi
>
> I need to make a program that reads the source of a given URL as a command
> line argument and outputs the source of the given URL and the source of
> each URL that appears in the source of the given URL.
>
> Text within HTML tags should not appear in the output.
>
>
> Can anyone help me??
>

Here is something to start with:

    #
    # getbigurl -- stream a url to the standard out
    #              no internal buffering
    #

    use strict;
    use LWP::UserAgent;
    use vars qw( $request $response $ua );

    $ua = new LWP::UserAgent;
    $ua->env_proxy();

    for (@ARGV) {
	$request = new HTTP::Request('GET', $_);
	$response = $ua->request($request, \&callback, 4096);
    } 

    sub callback {
	print STDOUT $_[0];
    } 

And another one that does a similar thing for links in the text of a web
page:

    #!/usr/bin/perl -w
    #
    # getlinks
    # return links from listed urls
    #

    use LWP::UserAgent;
    use HTML::LinkExtor;
    use URI::URL;
    use strict;

    my @anchors = ();

    while (my $url = shift @ARGV){
	@anchors = ();	# start with an empty list for each URL
	my $ua = new LWP::UserAgent;
	$ua->env_proxy();

	# Make the parser.  Unfortunately, we don't know the url base yet
	# (it might be different from $url)
	my $p = HTML::LinkExtor->new(\&callback);

	# Request document and parse it as it arrives
	my $res = $ua->request(HTTP::Request->new(GET => $url),
	    sub {$p->parse($_[ 0])});

	# Expand all A URLs to absolute ones
	my $base = $res->base;
	@anchors = map { $_ = url($_, $base)->abs; } @anchors;

	# Print them out
	print join("\n", @anchors), "\n";
    } 

    # Set up a callback that collects <a> links
    sub callback {
	my($tag, %attr) = @_;
	return if $tag ne 'a';  # we only look closer at <a ...>
	push(@anchors, values %attr);
    }

I'll leave problems of stripping markup and implementing infinite
recursion to those who have a closed form solution to the 
traveling salesman problem.

BTW these were ripped from the documentation of some of the modules that
are used.

chris
-- 
    This space intentionally left blank


------------------------------

Date: Mon, 30 Apr 2001 23:07:47 +0100
From: Milton Road <miltonroad@btinternet.com>
Subject: Re: Should Perl be first?
Message-Id: <MmlH6.150071$HR6.16936896@nnrp4.clara.net>

> I'd have been pushed to have Perl as my first language as it was yet to
> be created for ten years when I started programming ...
> 

Congratulations Jonathan.  You are old!  Celebrate with a nice cup of cocoa 
and a warm pair of slippers on your tootsies!

The group is basking in your self-promotion!

Milton




------------------------------

Date: 30 Apr 2001 23:39:11 GMT
From: Billy Chambless <bchambless@nrlssc.navy.mil>
Subject: Re: Should Perl be first?
Message-Id: <9ckt2v$n77$1@news.datasync.com>

In article <slrn9eprjq.91u.tjla@thislove.dyndns.org>,
Gwyn Judd <tjla@guvfybir.qlaqaf.bet> wrote:
>"mein Luftkissenfahrzeug ist voll von den Aalen"
>said Billy Chambless (bchambless@nrlssc.navy.mil) in 
><9ci16v$qt2$1@news.datasync.com>:

>>Could you point out exactly what it is I'm wrong about (just in the context
>>of this thread, we don't have time to discuss ALL the ways I'm wrong. :) )

>I misunderstood the meaning of what you were saying.


Clear not always is my meaning, hmmmmm? :)


------------------------------

Date: Mon, 30 Apr 2001 22:12:27 GMT
From: cfedde@fedde.littleton.co.us (Chris Fedde)
Subject: Re: Testing whether a socket is still connected
Message-Id: <frlH6.214$T3.212386816@news.frii.net>

In article <srj4rv6c095.fsf@w3proj1.ze.tu-muenchen.de>,
Walter Hafner  <hafner-usenet@ze.tu-muenchen.de> wrote:
>Hi there,
>
>me again with another question. :-)
>

Welcome!

>
>I need some kind of timeout for the test routine. All I can think of is
>the following:
>

From the perlfaq:
$ perldoc -q timeout 
Found in /usr/local/lib/perl5/5.6.1/pod/perlfaq8.pod
       How do I timeout a slow event?

       Use the alarm() function, probably in conjunction with a
       signal handler, as documented in the Signals entry in the
       perlipc manpage and the section on ``Signals'' in the
       Camel.  You may instead use the more flexible
       Sys::AlarmCall module available from CPAN.
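
A minimal sketch of the alarm() approach from that FAQ entry, assuming a Unix-like system where alarm and SIGALRM behave conventionally. `with_timeout` is a hypothetical helper name, not part of Perl or of any module mentioned here.

```perl
use strict;
use warnings;

# Run $code, giving up after $seconds. Returns the code's value, or an
# empty/undefined result on timeout. Other exceptions are rethrown.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $value;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        $value = $code->();
        alarm 0;
        1;
    };
    alarm 0;    # make sure no alarm is left pending
    return $value if $ok;
    return if $@ eq "timeout\n";
    die $@;
}
```

A socket test would go inside the code reference, for example `with_timeout(5, sub { ... })`, with the caller treating an undefined result as "no answer in time".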

Good luck!
-- 
    This space intentionally left blank


------------------------------

Date: Tue, 1 May 2001 08:14:05 +1000
From: "Matt Morton-Allen" <m2@csu.edu.au>
Subject: Re: XML::RSS and mod_perl
Message-Id: <fqlH6.3$ax2.933@news0.optus.net.au>

It worked! Thank you for your help. You've almost single-handedly rekindled
my faith in newsgroups!

Matt.

"Matt Sergeant" <matt@sergeant.org> wrote in message
news:3AED2473.D43308B6@sergeant.org...
> Matt Morton-Allen wrote:
> >
> > Hi again,
> > the thing is I can't get this to work. I compile apache (1.3.19) and all
> > is good (the proposed strings test on the AxKit FAQ shows no plain text XML
> > references in the binary). But after compiling mod_perl (1.25) the XML
> > references are back in the binary! I use EVERYTHING=1 to get my other
> > requirements so I tried that off to no effect. I also tried the apache
> > directive to mod_perl with no effect. What am I missing?
>
> Try the following recipe for compiling Apache and mod_perl:
>
> (run in the mod_perl directory)
>
>  $ perl Makefile.PL \
>  > EVERYTHING=1 \
>  > USE_APACI=1 \
>  > DYNAMIC=1 \
>  > APACHE_PREFIX=/opt/apache \
>  > APACHE_SRC=../apache_1.3.12/src \
>  > DO_HTTPD=1 \
>  > APACI_ARGS="--enable-module=so --enable-shared=info
>  > --enable-shared=proxy --enable-shared=rewrite
>  > --enable-shared=log_agent"
>  $ make
>  $ su
>  $ make install
>
>
> --
> <Matt/>
>
>     /||    ** Founder and CTO  **  **   http://axkit.com/     **
>    //||    **  AxKit.com Ltd   **  ** XML Application Serving **
>   // ||    ** http://axkit.org **  ** XSLT, XPathScript, XSP  **
>  // \\| // ** mod_perl news and resources: http://take23.org  **
>      \\//
>      //\\
>     //  \\




------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 806
**************************************
