

Perl-Users Digest, Issue: 6344 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Mar 31 09:05:50 2004

Date: Wed, 31 Mar 2004 06:05:11 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Wed, 31 Mar 2004     Volume: 10 Number: 6344

Today's topics:
    Re: how to pattern match \d in a variable? <Joe.Smith@inwap.com>
        Is there any standard newsgroup messages parser? <d.adamkiewicz@i7.com.pl>
    Re: Lost data on socket - Can we start over politely? <ThomasKratz@REMOVEwebCAPS.de>
    Re: Lost data on socket - Can we start over politely? (Vorxion)
    Re: multiple lines / success or failure?! <tassilo.parseval@rwth-aachen.de>
    Re: multiple lines / success or failure?! <Joe.Smith@inwap.com>
    Re: multiple lines / success or failure?! <kuujinbo@hotmail.com>
    Re: multiple lines / success or failure?! <geoffacox@dontspamblueyonder.co.uk>
    Re: multiple lines / success or failure?! <geoffacox@dontspamblueyonder.co.uk>
    Re: multiple lines / success or failure?! <geoffacox@dontspamblueyonder.co.uk>
        system command (Andrea Spitaleri)
    Re: system command <matrix_calling@yahoo.dot.com>
    Re: system command <spamtrap@dot-app.org>
    Re: system command <krahnj@acm.org>
    Re: using alarm with File::Tail (Anno Siegel)
    Re: validate links?? <kuujinbo@hotmail.com>
    Re: validate links?? <Joe.Smith@inwap.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Wed, 31 Mar 2004 10:58:14 GMT
From: Joe Smith <Joe.Smith@inwap.com>
Subject: Re: how to pattern match \d in a variable?
Message-Id: <avxac.43620$w54.282246@attbi_s01>

nickelstat wrote:

> I'm at a complete loss:
> $value = "ITEM_001_in_001.ldg";
> $pattern = '_in_\d\d\d.ldg'; # a substring of $value

I'd recommend using qr() when dealing with regular expressions.
    $pattern = qr/_in_\d\d\d\.ldg/;	# Note \.

> print "pattern($pattern)
> Value($value)\n";
> 
> #if ($value !~ /_in_\d\d\d.ldg/) -> this works, but not line below
> if ($value !~ /\Q$pattern/) {
>         print "patern not found\n";
> } else { print "OK OK OK\n"; }

Get rid of that silly \Q and it will work.
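For instance, a minimal version of the snippet with qr() and without \Q
(a sketch reusing the poster's variable names):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $value   = "ITEM_001_in_001.ldg";
my $pattern = qr/_in_\d\d\d\.ldg/;  # \d stays a metacharacter; note the escaped dot

# Without \Q the compiled pattern is used as a regex, not quoted literally.
print $value =~ $pattern ? "OK OK OK\n" : "pattern not found\n";
```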
	-Joe


------------------------------

Date: Wed, 31 Mar 2004 16:02:25 +0200
From: Darek Adamkiewicz <d.adamkiewicz@i7.com.pl>
Subject: Is there any standard newsgroup messages parser?
Message-Id: <c4ej4k$s7q$1@atlantis.news.tpi.pl>

Hello Folks

Until now I have used a self-made parser script for a self-made message file 
format. I was wondering: is there a standard Perl parser for standard NNTP 
message files (ideally with the ability to build threads of messages)?

Regards
Darek

---
http://itik.sf.net


------------------------------

Date: Wed, 31 Mar 2004 11:26:33 +0200
From: Thomas Kratz <ThomasKratz@REMOVEwebCAPS.de>
Subject: Re: Lost data on socket - Can we start over politely?
Message-Id: <406a9005.0@juno.wiesbaden.netsurf.de>

Vorxion wrote:

> In article <4069a132.0@juno.wiesbaden.netsurf.de>, Thomas Kratz wrote:
> 
>>Instead please try this server ( some quickly reduced code from a bigger 
>>server ) and mini client. If this works for you, you could extend it.
>>I don't lose any data with this setup. And I don't even bother to set 
>>unbuffered mode (tested on Win32).
> 
> 
> It does indeed work.  This is the best sign I've seen in a week, since I've
> been afraid it's simply not possible to do this, although I figured it was
> a goof on my end.
> 
> I note that if I decrease the input chunks from 1024 down to 1, and that's
> the -only- change I make to yours, it will eventually exhibit exactly what
> I'm experiencing in my own server, which is a premature end of data--and
> in your case, a disconnect of the client due to how you coded the error
> checking--after about 30k.
> 
> Given that, I strongly suspect that my problem is taking expensive
> time to sniff the first 15-20 bytes of each packet individually to
> find the application level packet lengths and a first internal packet
> separator, even though once I have the length I'm trying to read the
> rest of the packet at the full listed size of the packet.  I'm talking
> application-level packets here, not TCP packets.
> 
> In short, I think my code is simply lagging behind, and when it lags far
> enough, the rest of the data vanishes.  I'm working on fixing that bit, and
> I'll also have to prioritize it so that when data is present, it stops
> working on processing its data internally and goes immediately back to
> reading from the socket.  And I'm simply going to have it scarf up as much
> data as is present and basically parse and process when there is no
> actual communication going on.  That should (hopefully) eliminate the
> problem.

That was my guess too, but I couldn't confirm it :-). The problem could be 
that the TCP buffers of the machine the server runs on are filled because 
you are draining them too slowly, but the client is sending data anyway. I'd 
use IO::Select on the client with a can_write(), which should take care 
of that (provided you have control over the client code :-)

> 
>>If you look at the server code, you'll get an idea what is meant by "a 
>>small but complete sample to reproduce the problem". The base 
>>functionality you need is in there. And if I were able to understand, what 
>>your code tries to do, I would be able to shorten it to that level too.
> 
> 
> Well, mine is multiplexing--it's meant to take in more than one connection
> at a time.  Yours was written with a single connection in mind, although it
> demonstrates perfectly that buffering shouldn't be an issue, even at 1k
> block sizes.  The multiplexing is what's complicated mine so much.  :)

No. Have you tried it? It handles exactly SOMAXCONN connections.
The only reason for using the $max_buf variable is to prevent one client 
from flooding the server with data and neglecting the other clients.
If the client data trickles in slowly, one could also use a $max_time 
value (with Time::HiRes) for the maximum time the server processes one 
socket at a time.

The foreach loop looks for all readable sockets and handles them. The only 
thing left out is handling a specific timeout for connected client sockets 
without incoming data (could be easily done via a lookup hash with the 
stringified socket values as keys and the time value of the last action on 
the socket as values, shouldn't be more than a few lines).
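A rough sketch of that lookup hash (the sub name and the 60-second
timeout here are made up for illustration):

```perl
use strict;
use warnings;
use IO::Select;

my $idle_timeout = 60;   # hypothetical: drop clients silent this many seconds
my %last_action;         # stringified socket handle => time of last activity

# Call once per pass through the select loop; $select holds the
# connected client sockets.  Sockets idle longer than $idle_timeout
# are closed and forgotten.
sub reap_idle_clients {
    my ($select) = @_;
    for my $sock ($select->handles) {
        my $key = "$sock";
        $last_action{$key} = time unless exists $last_action{$key};
        if (time - $last_action{$key} > $idle_timeout) {
            $select->remove($sock);
            close $sock;
            delete $last_action{$key};
        }
    }
}

# Whenever a socket is read successfully, refresh its timestamp:
# $last_action{"$sock"} = time;
```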

Thomas

-- 
open STDIN,"<&DATA";$=+=14;$%=50;while($_=(seek( #J~.> a>n~>>e~.......>r.
STDIN,$:*$=+$,+$%,0),getc)){/\./&&last;/\w| /&&( #.u.t.^..oP..r.>h>a~.e..
print,$_=$~);/~/&&++$:;/\^/&&--$:;/>/&&++$,;/</  #.>s^~h<t< ..~. ...c.^..
&&--$,;$:%=4;$,%=23;$~=$_;++$i==1?++$,:_;}__END__#....>>e>r^..>l^...>k^..


------------------------------

Date: 31 Mar 2004 05:05:24 -0500
From: vorxion@knockingshopofthemind.com (Vorxion)
Subject: Re: Lost data on socket - Can we start over politely?
Message-Id: <406a97e4$1_1@news.iglou.com>

In article <406a9005.0@juno.wiesbaden.netsurf.de>, Thomas Kratz wrote:
>> In short, I think my code is simply lagging behind, and when it lags far
>> enough, the rest of the data vanishes.  I'm working on fixing that bit, and
>> I'll also have to prioritize it so that when data is present, it stops
>> working on processing its data internally and goes immediately back to
>> reading from the socket.  And I'm simply going to have it scarf up as much
>> data as is present and basically parse and process when there is no
>> actual communication going on.  That should (hopefully) eliminate the
>> problem.
>
>That was my guess too, but I couldn't confirm it :-). The problem could be 
>that the TCP buffers of the machine the server runs on are filled because 
>you are draining them too slowly, but the client is sending data anyway. I'd 
>use IO::Select on the client with a can_write(), which should take care 
>of that (provided you have control over the client code :-)

The question then becomes whether a non-forking model is viable for
multiple connections if it needs to do processing.  I should think so.  I
think it's mostly the expense of doing a sysread() of -one- character
about 18 times, and -then- getting bigger blocks, but still not big enough
to make it keep pace--they were only as big as the packets, which were
generally 110 chars max.

At which point I probably had far more overhead than it could tolerate when
the client was spewing things forth without delay, even though it was
writing a packet/line at a time.  :)

Boy, I never considered the client-side.  You're basically saying that the
can_write will actually get the socket equivalent of flow control from the
remote end and only send when it won't get lost?  I had always thought for
some reason that the "writable" nature of the select() flags indicated more
the presence of the socket to write to.  I didn't know it had -that-
functionality inherent, if I'm reading you correctly.  And I do in fact
have control over it.  Of course, one writes the server so that it can
handle any client...but I'm ignoring anything that doesn't match my
protocol.  :)  I just don't want someone to be able to flood it with too
much data and prevent legitimate input from making it.

>> Well, mine is multiplexing--it's meant to take in more than one connection
>> at a time.  Yours was written with a single connection in mind, although it
>> demonstrates perfectly that buffering shouldn't be an issue, even at 1k
>> block sizes.  The multiplexing is what's complicated mine so much.  :)
>
>No. Have you tried it? It handles exactly SOMAXCONN connections.
>The only reason for using the $max_buf variable is to prevent one client 
>from flooding the server with data and neglecting the other clients.
>If the client data trickles in slowly, one could also use a $max_time 
>value (with Time::HiRes) for the maximum time the server processes one 
>socket at a time.

Ah, that explains the limitation.  Good idea, actually, and I think I'll
keep that then.  Actually, do you know if the buffer on the socket is one
single pool, or if each connection to the port has its own buffer?  That
goes back to my other statement about not wanting one attacker to DoS the
whole socket by flooding one fd.

>The foreach loop looks for all readable sockets and handles them. The only 
>thing left out is handling a specific timeout for connected client sockets 
>without incoming data (could be easily done via a lookup hash with the 
>stringified socket values as keys and the time value of the last action on 
>the socket as values, shouldn't be more than a few lines).

Yes, I caught the fact that it would take multiple connections.  I even
tested it.  However, the logging was not multiplexed, where all my protocol
states must be.  You weren't differentiating in the log between fd's, but
it was really the proof that the buffering was fine that mattered to me.
The rest is no problem.  That was what had me really worried.

I've got it halfway rolled into my code.  I separated out the flow into two
loops--one will go as fast and hard as it can to read data (I'm going to
implement the max_buf now that I know why you had it), and the other is for
processing the input, and at every possible pausing point it checks to see
if there is more data to be read and will go back to reading as immediately
as possible if there is.  I "just" need to roll in the code that breaks up
the packets from this large internal buffer where I'm storing huge hunks of
code that aren't even analysed.  Once I have that, it should hopefully be
fine.

Chances are, nothing in the intended use would stress it anywhere near what
I have been putting it through.  Then again, I don't like to take chances.

If I think back to how lousy NFS performance is if you have it set to less
than 8192 byte packets, I should have realised exactly what was going on,
probably.  I just didn't think those 15 single-byte reads per packet were
that expensive.  And then the small packets on top of it.  Ugh.  That
pretty much explains it all.

I thank you SO much for your assistance, Thomas!  You have no idea how much
relief I feel at this point.  I did a brief test of your read methodology
rolled partly into the small sample I had here and it was working 100%
and consistently.  I'm rolling it into the real thing now, which is a wee bit
more complicated.

Thank you ever so much!

-- 
Vorxion - Member of The Vortexa Elite


------------------------------

Date: 31 Mar 2004 08:32:06 GMT
From: "Tassilo v. Parseval" <tassilo.parseval@rwth-aachen.de>
Subject: Re: multiple lines / success or failure?!
Message-Id: <c4dvm6$qqq$1@nets3.rz.RWTH-Aachen.DE>

Also sprach Geoff Cox:

> Re my previous posting on how to capture successive blocks of text
> between <p> and </p> where the </p> is not on the first line of the
> block of text in an html file ....
> 
> The only way that I have been able to extract the text and place it in
> the correct place in the newly created file has been to edit all the
> original html files so that the <p> and the </p> are on the same line,
> ie instead of
> 
><p>sjdhjhsjdhjhs
> sjdjksjdkksj
> jskdjkjs</p>
> 
> I have 
> 
><p>sjdhjhsjdhjhs sjdjksjdkksj jskdjkjs</p>
> 
> This is a failure on my part! 

It's mostly a failure of the tools you try to employ here. HTML is
treacherous in that it looks as though it could be handled with just a
few regular expressions. Even when you slurp the whole file and work on
large strings, sooner or later regular expressions won't be enough.

> I have not tried HTML::Parser yet as I cannot find any help info to
> get me started. All seen so far assume more than I know or understand
> on OOP. I am back to where I was once on the use of ODBC when
> eventually I found a little help and wrote my own steps 1 to 10 to
> implement a simple example of this data base / CGI connectivity...

When it comes to HTML::Parser, this approach is worth knowing as it
is a general technique (used by other Perl modules as well). The idea
behind it requires only a bit of understanding of OOP concepts.

HTML::Parser is a class that provides a few methods that you will be
using verbatim, such as parse(), parse_file() or parse_chunk(). What
they do is walk through the HTML, and once they have identified a
certain HTML construct (a start or end tag, plain text, etc.) they
trigger methods (they are a bit like callbacks) and pass them the stuff
they have identified. Those callback methods are the ones you have to
provide.

In order to make this whole thing work, you create a subclass of
HTML::Parser. This subclass will inherit all the methods from
HTML::Parser (most notably the various parse() functions). Some methods
however you will have to override (that is: replace them so that they
suit your needs). Quite naturally, it makes sense to override the
callbacks because those are the parts you want to customize.

So take this subclass:

    package MyParser;
    use base qw(HTML::Parser);

That's a fully functional subclass of HTML::Parser. Now you create an
object of this class and see what happens when it parses a file:

    package main;
    my $parser = MyParser->new;
    $parser->parse_file("file.html");

When you run that, you'll notice that nothing appears to be happening.
But you'll also notice that you don't get any errors about calling
non-existent methods. That's because you call two methods on $parser
that were inherited from HTML::Parser, namely new() and parse_file().

Further above I said that parse_file() would trigger those callbacks,
but seemingly it doesn't do that (because nothing is happening). But
actually, MyParser::parse_file() does call them. As you did not override
them, it calls the default methods HTML::Parser::start/end/text/etc
(after all, those methods were inherited by 'MyParser').  Those methods
are empty (which can be confirmed when you have a look at the source
code of HTML/Parser.pm).

In order to make your parser do something useful, you provide those
methods yourself:

    package MyParser;
    use base qw(HTML::Parser);
    
    # we use these three variables to count something
    our ($text_elements, $start_tags, $end_tags);
    
    # here HTML::text/start/end are overridden 
    sub text	{ $text_elements++  }
    sub start	{ $start_tags++	    }
    sub end	{ $end_tags++	    }

    package main;

    # Test the parser
    
    my $html = <<EOHTML;
    <html>
	<head>
	    <title>Bla</title>
	</head>
	<body>
	Here's the body.
	</body>
    </html>
    EOHTML
    
    my $parser = MyParser->new;
    $parser->parse( $html );	# parse() is also inherited from HTML::Parser
    
    print <<EOREPORT;
    text elements:  $MyParser::text_elements
    start tags   :  $MyParser::start_tags
    end tags     :  $MyParser::end_tags
    EOREPORT
    
    __END__
    text elements: 7
    start tags   : 4
    end tags     : 4

So apparently, MyParser::text() has been called 7 times (7 because
HTML::Parser also reports white-space between tags as text), and start()
and end() four times each (which makes sense: you have <html>, <head>,
<title> and <body> plus their corresponding closing tags). 

The above parser only does counting. But the callbacks are called with
arguments. The first argument is always the 'MyParser' object (as
always with methods). The additional arguments are the ones you are
really interested in: they are the broken-down elements of the HTML. 

Next Parser:

    package MyParser;
    use base qw(HTML::Parser);
    
    # This parser only looks at opening tags
    sub start { 
	my ($self, $tagname, $attr, $attrseq, $origtext) = @_;
	if ($tagname eq 'a') {
	    print "URL found: ", $attr->{ href }, "\n";
	}
    }

    package main;

    my $html = <<EOHTML;
    <html>
	<body>
	    <a href="http://www.first.com" target="bla">One link</a>
	    <a href="http://www.second.com">Second link</a>
	</body>
    </html>
    EOHTML
    
    my $parser = MyParser->new;
    $parser->parse( $html );
    __END__
    URL found: http://www.first.com
    URL found: http://www.second.com

The above is essentially a cheap link extractor. The interesting part is
the start-callback:

    sub start {
	my ($self, $tagname, $attr, $attrseq, $origtext) = @_;
	if ($tagname eq 'a') {
	    print "URL found: ", $attr->{ href }, "\n";
	    print "  all attributes: @$attrseq\n";
	}
    }
    
It is called with five arguments. $self is the object itself, $tagname
is the name of the start tag, $attr is a hash-reference containing the
attributes as key/value pairs, $attrseq is an array-reference which
lists the attribute keys in the order in which they appeared in the tag,
and $origtext, finally, is the original text as it appeared in the
HTML snippet.

The start-callback will be called four times for the given HTML string.
It will only do something when it encounters an <a>-tag:

    if ($tagname eq 'a') {

In this case it looks up the value of the 'href' attribute:

    print "URL found: ", $attr->{ href }, "\n";

Additionally, it prints all the attributes in the order in which they
appeared:

    print "  all attributes: @$attrseq\n";

For the first a-tag, this is "href target". For the second one, only
"href". 

Notice that you simply ignore all the stuff you are not interested in.
The above parser doesn't care about end-tags or plain text. It only
looks at the start-tags to find links in the HTML document.

It's quite easy to integrate more complicated logic into a parser. You
said you needed to parse other documents when they were referenced in an
attribute. Likewise, this parser can be made to work recursively:
Whenever it encounters a link to another document, it retrieves this
document, parses it for more links and follows them as well (until it
has walked through the whole www):

    package MyParser;
    use base qw(HTML::Parser);
    use LWP::Simple ();
    
    sub start {
	my ($self, $tagname, $attr) = @_;
	if ($tagname eq 'a') {
	    my $url = $attr->{ href };
	    print "URL found: $url\n";
	    
	    # make a new parser to parse
	    # the document referenced by $url
	    
	    my $p = MyParser->new;
	    $p->parse( LWP::Simple::get($url) );
	}
    }

This parser will probably never stop because it doesn't keep track of
the websites it has already parsed. It's not very hard to prevent
infinite recursion:

    ...
    my %already_parsed;
    sub start {
	my ($self, $tagname, $attr) = @_;
	if ($tagname eq 'a') {
	    my $url = $attr->{ href };
	    print "URL found: $url\n";
	    return if $already_parsed{ $url };
	    
	    # not yet parsed
	    $already_parsed{ $url }++;
	    MyParser->new->parse( LWP::Simple::get($url) );
	}
    }

> Well, I could go on and on but I would like to use the HTML::Parser so
> look forward to hearing about some intro type info with sample scripts
> on its use! Or any Perl books which do this? Perhaps the latest Perl
> O'Reilly books are OK?

I don't know whether HTML::Parser is covered in any books. But maybe the
above is already all you need to write your program. It takes a little
time to get used to these event-based approaches so you might want to
experiment a bit with it. Once you have grokked it, you'll realize how
convenient and powerful HTML::Parser is.

Tassilo
-- 
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval


------------------------------

Date: Wed, 31 Mar 2004 10:43:51 GMT
From: Joe Smith <Joe.Smith@inwap.com>
Subject: Re: multiple lines / success or failure?!
Message-Id: <Hhxac.144463$1p.1846147@attbi_s54>

Geoff Cox wrote:
> I have not tried HTML::Parser yet as I cannot find any help info to
> get me started.

#!/usr/bin/perl -w
# Name: nohtml                  Author: Joe.Smith@inwap.com 07-Nov-2001
# Purpose: Extracts just the text portions of a document.

   use strict;
   use HTML::Parser ();

   sub text_handler {            # Ordinary text
     print @_;
   }

   my $p = HTML::Parser->new(api_version => 3);
   $p->handler( text => \&text_handler, "dtext");
   $p->parse_file(shift || "-") || die $!;

1;


------------------------------

Date: Wed, 31 Mar 2004 20:09:26 +0900
From: ko <kuujinbo@hotmail.com>
Subject: Re: multiple lines / success or failure?!
Message-Id: <c4e914$2h6456$1@ID-227975.news.uni-berlin.de>

Geoff Cox wrote:

[snip]

> Well, I could go on and on but I would like to use the HTML::Parser so
> look forward to hearing about some intro type info with sample scripts
> on its use! Or any Perl books which do this? Perhaps the latest Perl
> O'Reilly books are OK?
> 
> Cheers
> 
> Geoff

The latest ActiveState builds include HTML::Parser, but not the 
example scripts that come with the CPAN distribution, 
which are available here:

http://search.cpan.org/src/GAAS/HTML-Parser-3.35/eg/

HTH - keith


------------------------------

Date: Wed, 31 Mar 2004 13:41:38 GMT
From: Geoff Cox <geoffacox@dontspamblueyonder.co.uk>
Subject: Re: multiple lines / success or failure?!
Message-Id: <qjil6011niofhjucvjdiqd05nliohd2id2@4ax.com>

On Wed, 31 Mar 2004 20:09:26 +0900, ko <kuujinbo@hotmail.com> wrote:

>The latest ActiveState builds include HTML::Parser, but not the 
>example scripts that come with the CPAN distribution, 
>which are available here:
>
>http://search.cpan.org/src/GAAS/HTML-Parser-3.35/eg/

Keith, 


Many thanks - they are a great help.

Cheers

Geoff







------------------------------

Date: Wed, 31 Mar 2004 13:46:14 GMT
From: Geoff Cox <geoffacox@dontspamblueyonder.co.uk>
Subject: Re: multiple lines / success or failure?!
Message-Id: <uqil60lhv4mpvusdi7hc2c920gqdcjddfn@4ax.com>

On 31 Mar 2004 08:32:06 GMT, "Tassilo v. Parseval"
<tassilo.parseval@rwth-aachen.de> wrote:

>When it comes to HTML::Parser, this approach is worth knowing as it
>is a general technique (used by other Perl modules as well). The idea
>behind it requires only a bit of understanding of OOP concepts.

Tassilo

A full tutorial ! Many thanks, have printed it off and will work
through it.

Cheers

Geoff


------------------------------

Date: Wed, 31 Mar 2004 13:46:58 GMT
From: Geoff Cox <geoffacox@dontspamblueyonder.co.uk>
Subject: Re: multiple lines / success or failure?!
Message-Id: <jtil60lfj0cfepfvdodhnfht5fi6hae74e@4ax.com>

On Wed, 31 Mar 2004 02:55:33 -0500, Sherm Pendley
<spamtrap@dot-app.org> wrote:

>Geoff Cox wrote:
>
>> I have not tried HTML::Parser yet as I cannot find any help info to
>> get me started. All seen so far assume more than I know or understand
>> on OOP.
>
>These will help get you started with objects:
>
>perldoc perlboot
>perldoc perltoot
>perldoc perltooc
>perldoc perlobj
>perldoc perlbot
>
>The web site <http://learn.perl.org> has a list of recommended books.

Sherm, 

Many thanks - will follow up these suggestions.

Cheers

Geoff







------------------------------

Date: 31 Mar 2004 00:16:54 -0800
From: spiritelllo@interfree.it (Andrea Spitaleri)
Subject: system command
Message-Id: <4de1519a.0403310016.2ab1754a@posting.google.com>

Hi
if a dir contains files like 1.pov 2.pov 3.pov 4.pov 5.pov 6.pov ..
n.pov
I usually do:

foreach my $index (1 .. $n) {
    system("povray $index.pov");
}

but if I have got something like a.pov gf.pov fr.pov .....zz.pov
(random names with the same extension), how can I do this with system()?


and


------------------------------

Date: Wed, 31 Mar 2004 14:04:28 +0530
From: Abhinav <matrix_calling@yahoo.dot.com>
Subject: Re: system command
Message-Id: <1vvac.2$rc5.118@news.oracle.com>



Andrea Spitaleri wrote:
> Hi
> if a dir contains file like 1.pov 2.pov 3.pov 4.pov 5.pov 6.pov ..
> n.pov
> I use to do:
> foreach in $index (1 .. n){
> system ("povray $index.pov")}
> but if I have got something like a.pov gf.pov fr.pov .....zz.pov
> (randoms name with the same extension), how may I do for using system?
> 
try
@files = glob("*.pov");

Then you can iterate over @files
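For example (a sketch; the print stands in for the real system() call):

```perl
use strict;
use warnings;

my @files = glob "*.pov";   # every .pov file in the current directory
foreach my $file (@files) {
    # in the real script: system("povray", $file) == 0 or warn "povray failed: $?";
    print "would run: povray $file\n";
}
```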

HTH

Regards
Abhinav
--



------------------------------

Date: Wed, 31 Mar 2004 03:39:18 -0500
From: Sherm Pendley <spamtrap@dot-app.org>
Subject: Re: system command
Message-Id: <ividnRiXYqg4HvfdRVn-sw@adelphia.com>

Andrea Spitaleri wrote:

> I use to do:
> foreach in $index (1 .. n){
> system ("povray $index.pov")}
> but if I have got something like a.pov gf.pov fr.pov .....zz.pov
> (randoms name with the same extension), how may I do for using system?

There are plenty of ways to do that. The hard part is choosing one. ;-)

If the files are all in one directory, you could use opendir(), readdir(),
and closedir() to build a list of filenames. If there are sub-directories,
you might want to look at the standard File::Find module.
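A sketch of the opendir()/readdir() route (assuming everything sits in
the current directory):

```perl
use strict;
use warnings;

# Collect the .pov files, skipping anything that isn't a plain file.
opendir my $dh, '.' or die "Can't open directory: $!";
my @povs = sort grep { /\.pov\z/ && -f $_ } readdir $dh;
closedir $dh;

foreach my $file (@povs) {
    system("povray", $file);   # list form, so the shell never sees the name
}
```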

Or, if the files all have alphabetical names with one or two characters, you
could use Perl's string incrementing magic:

foreach $index ('a' .. 'zz') {
        next unless(-f "$index.pov");
        system("povray $index.pov");
}

sherm--

-- 
Cocoa programming in Perl: http://camelbones.sourceforge.net
Hire me! My resume: http://www.dot-app.org


------------------------------

Date: Wed, 31 Mar 2004 09:13:01 GMT
From: "John W. Krahn" <krahnj@acm.org>
Subject: Re: system command
Message-Id: <406A8B29.379867C0@acm.org>

Andrea Spitaleri wrote:
> 
> if a dir contains file like 1.pov 2.pov 3.pov 4.pov 5.pov 6.pov ..
> n.pov
> I use to do:
> foreach in $index (1 .. n){
> system ("povray $index.pov")}
> but if I have got something like a.pov gf.pov fr.pov .....zz.pov
> (randoms name with the same extension), how may I do for using system?

Well, system() runs a shell if you have shell meta-characters in the
string so you could do:

system 'povray *.pov';


John
-- 
use Perl;
program
fulfillment


------------------------------

Date: 31 Mar 2004 12:27:49 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: using alarm with File::Tail
Message-Id: <c4edg5$bsb$1@mamenchi.zrz.TU-Berlin.DE>

Keith Michaels <krm@sdc.cs.boeing.com> wrote in comp.lang.perl.misc:
> File::Tail doesn't seem to work with alarm, probably because it
> uses sleep internally.  Is there a way around this??  I need to
> wake up the File::Tail->read periodically to flush some buffers
> and then resume the Tail where it left off.  Do I need another
> thread for that??  A sample of this would be appreciated!!

I don't see that problem, at least with a single alarm.

Here is a script that watches /tmp/xxx:

    #!/usr/bin/perl
    use strict; use warnings; $| = 1; # @^~`

    use File::Tail;

    $SIG{ ALRM} = sub { print "Alaaaaarm\n" };
    alarm( 30);

    my $file=File::Tail->new("/tmp/xxx");
    while ( defined( my $line=$file->read)) {
        print "$line";
    }

And here is one that feeds /tmp/xxx:

    #!/usr/bin/perl
    use strict; use warnings; $| = 1; # @^~`
    use Vi::QuickFix;

    open my $t, '>>/tmp/xxx' or die $!;
    select $t;
    $| = 1;

    my $n = 0;
    while ( 1 ) {
        print $t "line $n\n";
        sleep 1;
        $n ++;
    }

With that combo, the watcher lists lines up to "line 30", then
shouts "Alaaaaarm", and continues with "line 31", just like it
should.

Anno


------------------------------

Date: Wed, 31 Mar 2004 18:14:36 +0900
From: ko <kuujinbo@hotmail.com>
Subject: Re: validate links??
Message-Id: <c4e29p$2h7pah$1@ID-227975.news.uni-berlin.de>

Dan Pelton wrote:
> What's the best way to check for broken links on a web page? These links 
> go to CGIs and some of the links are redirected. I tried the code below 
> from the Perl Cookbook. It works fine for links to html pages only.
> 
> "churl.pl http://www.ams.org" reports that http://www.ams.org/eims is 
> BAD, but it is a valid URL.
> 
> Any suggestions?
> 
> thanks,
> Dan
> 
> --------------------------------------------------------
> #!/usr/bin/perl -w
> # churl - check urls
> use HTML::LinkExtor;
> use LWP::Simple;
> $base_url = shift
>    or die "usage: $0 <start_url>\n";
> $parser = HTML::LinkExtor->new(undef, $base_url);
> $html = get($base_url);
> die "Can't fetch $base_url" unless defined($html);
> $parser->parse($html);
> @links = $parser->links;
> print "$base_url: \n";
> foreach $linkarray (@links) {
>    my @element  = @$linkarray;
>    my $elt_type = shift @element;
>    while (@element) {
>        my ($attr_name , $attr_value) = splice(@element, 0, 2);
>        if ($attr_value->scheme =~ /\b(ftp|https?|file)\b/) {
>            print "  $attr_value: ", head($attr_value) ? "OK" : "BAD", "\n";
>        }
>    }
> }
> 

The URL http://www.ams.org/eims results in a redirection response from 
the server to https://www.ams.org/eims/. In scalar context the head() 
function returns a false value when the attempt to access the URL fails, 
therefore you get 'BAD'. Make sure you have Crypt::SSLeay installed:

perl -MCrypt::SSLeay -e 1

If you don't, you won't be able to access URLs with the https scheme and 
need to install the module. Do a 'perldoc lwpcook' from your shell for 
an explanation - look under the 'HTTPS' heading.

If you're going to be doing more complicated things in the future, have 
a look at LWP::UserAgent - it gives you a lot more options. For example, 
in this case you could have checked the server response code with the 
is_error() method on the HTTP::Response object that LWP::UserAgent returns.
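A sketch of that approach (verdict() is a made-up helper mirroring
churl.pl's OK/BAD output; the live request is commented out so the
snippet works without network access):

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Response;

# Turn an HTTP::Response into the OK/BAD verdict that churl.pl prints.
sub verdict {
    my ($res) = @_;
    return $res->is_error ? "BAD" : "OK";
}

my $ua = LWP::UserAgent->new;   # follows redirects for GET and HEAD by default
# print verdict($ua->head('http://www.ams.org/eims')), "\n";

print verdict(HTTP::Response->new(200)), "\n";   # OK
print verdict(HTTP::Response->new(404)), "\n";   # BAD
```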

HTH - keith


------------------------------

Date: Wed, 31 Mar 2004 10:49:38 GMT
From: Joe Smith <Joe.Smith@inwap.com>
Subject: Re: validate links??
Message-Id: <6nxac.149840$po.892076@attbi_s52>

Dan Pelton wrote:

> What's the best way to check for broken links on a web page?

Another way is http://www.stonehenge.com/merlyn/LinuxMag/col16.html
	-Joe


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 6344
***************************************

