[24575] in Perl-Users-Digest
Perl-Users Digest, Issue: 6751 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Jun 30 14:05:42 2004
Date: Wed, 30 Jun 2004 11:05:06 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Wed, 30 Jun 2004 Volume: 10 Number: 6751
Today's topics:
Authenticating to a web form that posts to a jsp. (Jeremy)
Re: catching ctrl chars <jgibson@mail.arc.nasa.gov>
Re: Cyrpt::GPG example (Anno Siegel)
distinguish $! vaules <dog@dog.dog>
Getting to variables contained in a typeglob referenced <this.is@invalid>
Re: Getting to variables contained in a typeglob refere <this.is@invalid>
Re: Getting to variables contained in a typeglob refere <j.g.karssenberg@student.utwente.nl>
Re: HTTP::Request, trailing slash <sebastian.baua@t-online.de>
Re: HTTP::Request, trailing slash <noreply@gunnar.cc>
Re: HTTP::Request, trailing slash <sebastian.baua@t-online.de>
Re: HTTP::Request, trailing slash <noreply@gunnar.cc>
Re: Indented text converted to arrays of arrays <bmb@ginger.libs.uga.edu>
Re: Nonblocking Pipe Open (J. Romano)
Perl vs. DCOM <jochen.friedmann3@de.bosch.com>
Re: Perl vs. DCOM <ceo@nospan.on.net>
Re: tern an hebrew string into unicode (dana livni)
Re: Why can't I get WWW::Mechanize->find_all_links to w <kuujinbo@hotmail.com>
Re: Why can't I get WWW::Mechanize->find_all_links to w (Peter M. Jagielski)
Re: Why can't I get WWW::Mechanize->find_all_links to w (Peter M. Jagielski)
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: 30 Jun 2004 08:28:56 -0700
From: dazeconf@yahoo.com (Jeremy)
Subject: Authenticating to a web form that posts to a jsp.
Message-Id: <75a2871f.0406300728.5c681af2@posting.google.com>
Hi,
I am trying to write a Perl script that will automatically
authenticate to a web site so that I can download files from the site.
However, the site posts username/password information to a jsp, not a
cgi. I have read the LWP::UserAgent and LWP::Simple documentation,
and had no problem writing a script to authenticate to another site,
but I haven't had any success here. Any suggestions? If you're
interested, the authentication page in question is:
https://www.affymetrix.com/site/login/login.affx
Any help would be greatly appreciated.
Thanks,
Jeremy
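For what it is worth, an HTTP client cannot tell a JSP from a CGI script: both simply receive the form fields in a POST request. The sketch below builds such a request with HTTP::Request::Common and shows, in a helper that is defined but not called, how it might be sent with a cookie-aware LWP::UserAgent. The field names (username, password), the credentials, and the www.example.com action URL are placeholders, not taken from the site in question; the real ones have to be read from the login form's HTML.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request::Common qw(POST);

# Build the POST request a browser would send when the form is submitted.
# 'username' and 'password' are assumed field names; use the name="..."
# attributes of the real form's <input> tags instead.
sub build_login_request {
    my ($action_url, $user, $pass) = @_;
    return POST($action_url, [ username => $user, password => $pass ]);
}

# Send the login request, then fetch a file with the same agent so that
# the session cookie set at login is sent along. Not called below,
# because it needs a real site and real credentials.
sub login_and_fetch {
    my ($login_request, $file_url, $save_as) = @_;
    my $ua = LWP::UserAgent->new( cookie_jar => {} );    # in-memory cookie jar
    push @{ $ua->requests_redirectable }, 'POST';        # follow redirects after login
    my $login = $ua->request($login_request);
    die 'login failed: ', $login->status_line, "\n" unless $login->is_success;
    my $file = $ua->get($file_url, ':content_file' => $save_as);
    die 'download failed: ', $file->status_line, "\n" unless $file->is_success;
    return $save_as;
}

my $request = build_login_request(
    'https://www.example.com/login.jsp',    # hypothetical form action URL
    'jeremy', 'secret',                     # placeholder credentials
);
print $request->method, ' ', $request->uri, "\n";
print $request->content, "\n";
```

Fetching https URLs additionally needs LWP's SSL support to be installed. If the login still fails, hidden form fields are a common cause, so it may help to compare the request body printed above with what a browser actually submits.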
------------------------------
Date: Wed, 30 Jun 2004 08:37:13 -0700
From: Jim Gibson <jgibson@mail.arc.nasa.gov>
Subject: Re: catching ctrl chars
Message-Id: <300620040837136570%jgibson@mail.arc.nasa.gov>
In article <c0837966.0406292341.76b1d013@posting.google.com>, justme
<eight02645999@yahoo.com> wrote:
> hi
>
> how can i catch Ctrl-x (or any other letters except 'c' ) in perl ??
> thanks..
See the following:
perldoc -q single
perldoc Term::ReadKey
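A minimal sketch of where those two pointers lead, assuming Term::ReadKey is installed and the script is run from a terminal. Control characters arrive as single bytes with codes 1 to 26, so Ctrl-X is "\x18". The helper name ctrl_name is made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Map a control character to a readable name, e.g. "\x18" -> "Ctrl-X".
# Returns undef for anything outside the range Ctrl-A .. Ctrl-Z.
sub ctrl_name {
    my ($char) = @_;
    my $ord = ord $char;
    return $ord >= 1 && $ord <= 26 ? 'Ctrl-' . chr( $ord + 64 ) : undef;
}

# Read keys only when attached to a terminal and Term::ReadKey is available.
if ( -t STDIN && eval { require Term::ReadKey; 1 } ) {
    Term::ReadKey::ReadMode('cbreak');    # deliver keys without waiting for Enter
    print "Press keys (Ctrl-X to quit)\n";
    while ( defined( my $key = Term::ReadKey::ReadKey(0) ) ) {
        my $name = ctrl_name($key);
        print defined $name ? "$name\n" : "key: $key\n";
        last if $key eq "\x18";           # Ctrl-X ends the loop
    }
    Term::ReadKey::ReadMode('restore');   # put the terminal back as it was
}
```

In 'cbreak' mode Ctrl-C still interrupts the program, which matches the question; ReadMode('raw') would deliver Ctrl-C as an ordinary character as well.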
------------------------------
Date: 30 Jun 2004 11:25:44 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: Cyrpt::GPG example
Message-Id: <cbu7vo$ci2$1@mamenchi.zrz.TU-Berlin.DE>
Robert Walkup <rdww@ti.com> wrote in comp.lang.perl.misc:
> I am trying to decrypt a file using the Cyrpt:GPG module. I haven't had much
> look at this.
  ^^^^
That is my impression too...
> Has anyone used this module and does anyone have a simple example
> on how to use this module to decrypt and then re-encrypt a file?
What module are you trying to use? There's
Crypt::GPG, Crypt::PGP2, Crypt::PGP5 and Crypt::PGPsimple.
What have you tried, and how are the results different from what you
expect?
Anno
------------------------------
Date: Wed, 30 Jun 2004 17:44:53 +0200
From: "Peter Michael" <dog@dog.dog>
Subject: distinguish $! vaules
Message-Id: <cbulgi$ba11@news-1.bank.dresdner.net>
Hi,
what is currently the preferred way to distinguish between different
values of $! ? I suppose that %! was created to this end (sample code?)
but should I use Switch(3) instead today?
use Errno qw(:POSIX);
use Switch;
open my $fh, "file" or do
  { switch($!)
      { case ENOENT { warn "you should first create the file\n"; }
        case EACCES { warn "you are not allowed to see this\n"; }
        else { warn "some other error...\n"; }
      }
  };
Any hints welcome (Anno? ;-).
Best regards,
Peter
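A sketch of the two approaches that need no Switch at all, assuming a Unix-like system: %! is keyed by errno name and is true only for the current value of $!, while the constants exported by Errno can be compared numerically with $!. The function name explain_open_error and the path are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Errno qw(ENOENT EACCES);

# Try to open a file for reading and classify the failure, if any.
sub explain_open_error {
    my ($path) = @_;
    return 'opened' if open my $fh, '<', $path;
    return 'missing' if $!{ENOENT};        # hash lookup by errno name
    return 'denied'  if $! == EACCES;      # numeric comparison with the constant
    return "other: $!";                    # $! as a string gives the message
}

print explain_open_error('/no/such/dir/no-such-file'), "\n";
```

Both tests must run before anything else can reset $!. Switch is a source filter and is no longer bundled with recent Perl releases, so a plain if/elsif chain like this one is the more portable choice.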
------------------------------
Date: Wed, 30 Jun 2004 19:31:10 +0200
From: ddtl <this.is@invalid>
Subject: Getting to variables contained in a typeglob referenced by a scalar.
Message-Id: <hlt5e0htp9m98t9husg44pk87hkhuh0lmd@4ax.com>
Hello,
After 'open' is called in the following way:
open my $fh, $file;
'$fh' contains a reference to a filehandle, i.e., somewhere in the 'open'
function there is probably a following assignment:
$fh = \*SOME_TYPEGLOB;
(and somehow that typeglob is anonymous (?), so the filehandle is
also anonymous).
If I understand correctly, it actually means that '$fh' contains
a reference to a typeglob, because wherever there is a filehandle,
it can be substituted for a typeglob.
Then, if we want to get to the variables inside that typeglob, we
have to do it in the following way:
*$fh - access the typeglob
*$fh->{SCALAR} - access a reference to a scalar (a typeglob is also a special hash)
${*$fh->{SCALAR}} - access the scalar itself.
But "Programming Perl" (14.4.1) mentions another, easier way -
just use $$$fh (or @$$fh or %$$fh, etc.).
I don't understand, though, how it works - what do we access when
$$fh is used - there is no scalar reference inside '$fh'?
The book doesn't mention it as some new way of using typeglobs/references,
which probably means that it should be clear from the previous chapters
why and how it works, and it seems that I missed the point.
The question is, then: how and why $$$fh etc. work, and where is an
explanation for it in the book (or just in the documentation)
ddtl.
------------------------------
Date: Wed, 30 Jun 2004 21:04:08 +0200
From: ddtl <this.is@invalid>
Subject: Re: Getting to variables contained in a typeglob referenced by a scalar.
Message-Id: <c436e0lpr43s3gu4254l4ark6ts2coq06f@4ax.com>
On Wed, 30 Jun 2004 19:28:59 +0200, Jaap Karssenberg
<j.g.karssenberg@student.utwente.nl> wrote:
>The point is the nature of a typeglob, this entity is a namespace node
>which refers to _any_ type with that name (hence "typeglob"). So you can
>regard a typeglob reference as a reference to any type at the same time.
>
>Using ->{SCALAR} is just a special syntax to force dereferencing as a
>scalar type. Using $$fh does this implicitly.
But if a reference to a typeglob is equivalent to a reference to whatever type
with that name I want, I should be able to say:
$$fh = 10;
print "$$fh";
and it would print out "10", because '$fh' contains a reference to a
typeglob == a reference to a scalar (among other things), so
dereferencing '$fh' should get to the value. But the above prints:
*main::10
so does it mean that another level of indirection is needed, i.e. that '$fh'
does not contain a reference to a scalar (or its equivalent)?
ddtl.
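The levels of indirection can be laid out side by side. In the sketch below, which uses an in-memory file only so that it runs anywhere, one '$' applied to the glob reference yields the glob itself, and applying '$' or '@' to a glob selects that slot of it. That reading is consistent with the *main::10 output above: the assignment went to the glob, not to a scalar. The documented glob-subscript form is written *{$fh}{SCALAR}, without an arrow. The variable names are made up for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $data = "some text\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

${*$fh} = 'hello';         # store into the scalar slot of the glob
@{*$fh} = (1, 2, 3);       # store into the array slot of the same glob

my $glob_text  = '' . $$fh;               # one '$': the glob itself, as text
my $via_glob   = ${*$fh};                 # dereference to the glob, then its scalar
my $via_slot   = ${ *{$fh}{SCALAR} };     # reference to the scalar slot, dereferenced
my $via_dollar = $$$fh;                   # shorthand for ${ ${$fh} }

my @via_glob   = @{*$fh};                 # array slot, explicit form
my @via_dollar = @$$fh;                   # array slot, shorthand

print "glob:   $glob_text\n";
print "scalar: $via_glob $via_slot $via_dollar\n";
print "array:  @via_glob / @via_dollar\n";
```

The first line of output starts with '*', which shows that $$fh is a glob and not yet the scalar stored in it.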
------------------------------
Date: Wed, 30 Jun 2004 19:28:59 +0200
From: Jaap Karssenberg <j.g.karssenberg@student.utwente.nl>
Subject: Re: Getting to variables contained in a typeglob referenced by a scalar.
Message-Id: <20040630192859.7ed4c4e3@Captain>
The point is the nature of a typeglob, this entity is a namespace node
which refers to _any_ type with that name (hence "typeglob"). So you can
regard a typeglob reference as a reference to any type at the same time.
Using ->{SCALAR} is just a special syntax to force dereferencing as a
scalar type. Using $$fh does this implicitly.
--
) ( Jaap Karssenberg || Pardus [Larus] | |0| |
: : http://pardus-larus.student.utwente.nl/~pardus | | |0|
) \ / ( |0|0|0|
",.*'*.," Proud owner of "Perl6 Essentials" 1st edition :) wannabe
------------------------------
Date: Wed, 30 Jun 2004 14:08:22 +0200
From: Sebastian Bauer <sebastian.baua@t-online.de>
Subject: Re: HTTP::Request, trailing slash
Message-Id: <cbuafd$9bi$04$1@news.t-online.com>
Thx a lot for answering, here comes the program that fails.
It extracts the url of an image out of a webpage and then should download
this file. If you try to load the $img_url in mozilla it works. If you
append a slash or open the url in konqueror it fails the same way the
script does...
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
my $taz_url = "http://www.taz.de";
my $filename = "tom.gif";
my $uagent = LWP::UserAgent->new();
my $request = HTTP::Request->new(GET => $taz_url
    ."/pt/2004/06/30.nf/tomnf");
my $result = $uagent->request($request);
my $img_url;
if($result->content() =~
    /<img src="(.*)" alt="TOM">\s+<br \/><b>Tom Touché vom/)
{
    $img_url = $1;
} else {
    die "url of todays image cannot be determined\n";
}
print "${taz_url}${img_url}\n";
$request = HTTP::Request->new(GET => "${taz_url}${img_url}");
$result = $uagent->simple_request($request,$filename);
if($result->is_success) {
    print "todays image stored in $filename\n";
} else {
    die "could not store todays image\n";
}
Thx for your help
------------------------------
Date: Wed, 30 Jun 2004 15:12:02 +0200
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: HTTP::Request, trailing slash
Message-Id: <2kfsq8F1sjg9U1@uni-berlin.de>
Sebastian Bauer wrote:
> Thx a lot for answering, here comes the program that fails. It
> extracts the url of an image out of a webpage and then should
> download this file. If you try to load the $img_url in mozilla it
> works.
$img_url is assigned the relative URL
'/pt/.nf/gif.t,tom.d,1088589600', and that on its own is not enough for
any browser to find the image, so I don't understand what you mean by that.
But your script concatenates $taz_url and $img_url to
'http://www.taz.de/pt/.nf/gif.t,tom.d,1088589600'
which seems to be a valid URL to a (copyright protected) image.
> If you append a slash or open the url in konqueror it fails the
> same way the script does...
The script you posted does not fail for me. It prints "todays image
stored in tom.gif", and no slash is appended.
Sorry, but I still don't understand what the problem is.
<program snipped>
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
------------------------------
Date: Wed, 30 Jun 2004 15:39:02 +0200
From: Sebastian Bauer <sebastian.baua@t-online.de>
Subject: Re: HTTP::Request, trailing slash
Message-Id: <cbufpd$oc5$05$1@news.t-online.com>
The script downloads something, but if you run
less tom.gif
you'll see that it did not download an image but plain text. If you follow
the link http://www.taz.de/pt/.nf/gif.t,tom.d,1088589600 in mozilla you'll
get an image. If you follow
http://www.taz.de/pt/.nf/gif.t,tom.d,1088589600/
you'll get the same text as the tom.gif file contains. That's the reason why
I thought that there might be an additional slash.
>which seems to be a valid URL to a (copyright protected) image.
This is just for personal use (I collect those images and it's hard work to
do it manually).
thx sebastian
------------------------------
Date: Wed, 30 Jun 2004 17:58:43 +0200
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: HTTP::Request, trailing slash
Message-Id: <2kg6jrF1vameU1@uni-berlin.de>
Sebastian Bauer wrote:
> the script downloads something, but if you make a
>
> less tom.gif
>
> you'll see that it did not download an image but plain text.
Aha, I see that now. Actually it downloads an HTML error page.
> If you follow the link
> http://www.taz.de/pt/.nf/gif.t,tom.d,1088589600 in mozilla you'll
> get an image. if you follow
> http://www.taz.de/pt/.nf/gif.t,tom.d,1088589600/ you'll get the
> same text as the tom.gif file contains. That's the reason why i
> thought that there might an additional slash
I see. Well, that error page is returned whichever incorrect URL you
are using, so why would it be caused by an appended slash?
The URL is not exactly the standard kind of URL. Maybe its special
nature makes LWP misinterpret it in some way. Maybe the site owner has
taken action to prevent people from doing what you are trying to do (you
can't view the image directly any longer, with or without the slash,
so I'd guess that the latter is the case).
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
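Since is_success is true for an HTML error page served with status 200, a script of this kind can look at the Content-Type header before trusting the saved file. The sketch below shows the check on two hand-built HTTP::Response objects that stand in for real server replies; the helper name looks_like_image and the header values are assumptions for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Response;

# True when the request succeeded and the server labelled the body as an image.
sub looks_like_image {
    my ($response) = @_;
    return 0 unless $response->is_success;
    return $response->content_type =~ m{^image/} ? 1 : 0;
}

# Stand-ins for what the server might send back.
my $image = HTTP::Response->new(
    200, 'OK', [ 'Content-Type' => 'image/gif' ], 'GIF89a' );
my $error_page = HTTP::Response->new(
    200, 'OK', [ 'Content-Type' => 'text/html; charset=iso-8859-1' ],
    '<html>error</html>' );

for my $response ( $image, $error_page ) {
    printf "%-10s %s\n", $response->content_type,
        looks_like_image($response) ? 'looks like an image' : 'not an image';
}
```

The headers remain available on the response object even when the body is written to a file, so the same check could be applied to $result in the posted script. Note also that simple_request, unlike request, does not follow redirects, which may be worth ruling out.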
------------------------------
Date: Wed, 30 Jun 2004 12:38:48 -0400
From: Brad Baxter <bmb@ginger.libs.uga.edu>
Subject: Re: Indented text converted to arrays of arrays
Message-Id: <Pine.A41.4.58.0406301112360.38288@ginger.libs.uga.edu>
On Wed, 30 Jun 2004, Tore Aursand wrote:
> On Tue, 29 Jun 2004 17:37:50 -0400, Brad Baxter wrote:
> > I would like to take a table of indented text like the following:
> > [...]
>
> I needed help with something quite related to this, but I really don't
> know if the answers I got will help you.
>
> You can read the whole thread here:
> <http://tinyurl.com/2mb39>
Those answers weren't what I was after, but you inspired me to look
further and I did find an answer here:
http://groups.google.com/groups?selm=3DAC4D05.61AC210C%40earthlink.net
By mangling Ben's solution, I got just what I wanted.
I'm a happy camper. :-)
Thanks!
Brad
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Indent = 1;
print Dumper make_array( <<_end_, " ", 4 );
AAA
    BBB
    CCC
DDD
    EEE
        FFF
GGG
_end_
# make_array parameters:
# $raw - string of indented text with newlines
# $char - indent character (typically space or tab)
# $num - number of $char's per indent level
sub make_array {
    my( $raw, $char, $num ) = @_;
    my @cooked;
    my @a = split "\n", $raw;
    for my $i ( 0 .. $#a ) {
        my( $indent, $string ) = $a[ $i ] =~ /^($char*)(.*)/;
        my $len = length( $indent );
        my( $lookahead ) = $i == $#a ? '': $a[ $i+1 ] =~ /^($char*)/;
        $lookahead = length( $lookahead ) > $len;
        my $level = $len/$num;
        my $dref = \@cooked;
        $dref = $dref->[-1][-1] for 1 .. $level;
        push @$dref, $lookahead ? [$string,[]] : $string;
    }
    return \@cooked;
} # end sub make_array
__END__
$VAR1 = [
  [
    'AAA',
    [
      'BBB',
      'CCC'
    ]
  ],
  [
    'DDD',
    [
      [
        'EEE',
        [
          'FFF'
        ]
      ]
    ]
  ],
  'GGG'
];
------------------------------
Date: 30 Jun 2004 05:59:24 -0700
From: jl_post@hotmail.com (J. Romano)
Subject: Re: Nonblocking Pipe Open
Message-Id: <b893f5d4.0406300459.45756709@posting.google.com>
> J. Romano wrote:
>
> > # print out output from process, if any exists:
> > while ($selector->can_read(0))
> > {
> > my $char;
> > sysread($r, $char, 1);
> > print $char;
> >
> > unless ($selector->can_read(0))
> > {
> > sleep 1; # allow some time for request to process
> > # or else while loop will finish if there
> > # there is a pause in the program
> > }
> > }
Dear Gregory,
Just after I posted my sample Perl script, I realized that I could
write the same loop without sleep calls if I just use an infinite loop
to continually check to see if there is output waiting to be read. In
other words, here's a script that does the same thing as the one I
gave you yesterday, but without sleeping:
#!/usr/bin/perl -w
use strict;
use IPC::Open2;
use IO::Select;
$| = 1; # autoflush STDOUT
# Declare filehandles and command to use:
my ($r, $w);
my $cmd = 'ping 127.0.0.1';
# Open the process and set the selector:
my $pid = open2($r, $w, $cmd);
my $selector = IO::Select->new($r);
while (1) # infinite loop (use "last" to break out)
{
    if ($selector->can_read(0))
    {
        my $char;
        sysread($r, $char, 1);
        print $char;
    }
    # Do anything you want in between reads here...
}
__END__
The advantage to this script is that, if your commands (like
"whois", "dig", and "ping") happen to pause, the loop won't
automatically break out. The disadvantage to this script is that it
might be difficult figuring out when a command has finished, or just
has delayed output (in which case you might have to put in a few
sleep() calls). Either way, I think that this script here does a
better job of helping you visualize what is going on -- you just need
to be mindful of the fact that some programs don't flush their output
right away, and that it's not a simple matter to tell if the program
has stopped running altogether.
So you might want to give the above script a try, if my first was
too confusing. But if both are too overwhelming for you, you might
want to check out Rocco Caputo's solution.
Hopefully one of our solutions will help.
-- Jean-Luc
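One way to tell a finished command from a merely quiet one, sketched under the assumption of a Unix-like system: once the child closes its output, the handle becomes readable and sysread returns 0, which can serve as the signal to leave the loop. Giving can_read a small timeout also keeps the loop from spinning at full speed. The function name read_until_eof, the callback parameter, and the child command (a Perl one-liner instead of ping) are choices made for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Open2;
use IO::Select;

# Run a command (array ref: program plus arguments) and return its output.
# $between is an optional callback run whenever no output is pending.
sub read_until_eof {
    my ($cmd, $between) = @_;
    my $pid = open2( my $r, my $w, @$cmd );
    close $w;                                 # nothing is sent to the child
    my $selector = IO::Select->new($r);
    my $output = '';
    while (1) {
        if ( $selector->can_read(0.1) ) {     # wait up to 0.1 s for data
            my $n = sysread( $r, my $chunk, 4096 );
            die "sysread failed: $!" unless defined $n;
            last if $n == 0;                  # readable but empty: end of file
            $output .= $chunk;
        }
        elsif ($between) {
            $between->();                     # do other work between reads
        }
    }
    waitpid( $pid, 0 );                       # reap the child process
    return $output;
}

my $out = read_until_eof( [ $^X, '-e', 'print "line $_\n" for 1 .. 3' ] );
print $out;
```

A child that buffers its output will still deliver it late, as noted above; the end-of-file test only settles whether the command has finished.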
------------------------------
Date: Wed, 30 Jun 2004 14:13:53 +0200
From: "Jochen Friedmann" <jochen.friedmann3@de.bosch.com>
Subject: Perl vs. DCOM
Message-Id: <cbuaq2$2ko$1@ns1.fe.internet.bosch.com>
Hello,
how can I use a DCOM object in a Perl script ?
Jochen
------------------------------
Date: Wed, 30 Jun 2004 14:01:03 GMT
From: Chris <ceo@nospan.on.net>
Subject: Re: Perl vs. DCOM
Message-Id: <zIzEc.5903$Bm4.613@newssvr16.news.prodigy.com>
Jochen Friedmann wrote:
> Hello,
>
> how can I use a DCOM object in a Perl script ?
>
Strange that when I searched Google using "perl DCOM" I was referred to
this page and you were not:
http://www.codeproject.com/books/1578702151.asp
<Ctrl+F> and search for "DCOM" on the page itself. Interesting.
-ceo
------------------------------
Date: 30 Jun 2004 04:11:26 -0700
From: dana_livni@hotmail.com (dana livni)
Subject: Re: tern an hebrew string into unicode
Message-Id: <1596f85c.0406300311.5af14f5b@posting.google.com>
I guess you're right. I need to convert the text in order to send it in a
GET request, in the format that the www.vivvisimo.com site uses.
I think that every %d7 means that this is Hebrew, and that the second
pair identifies the specific letter.
I'm not sure which encoding it is.
I meant to send the real string (my name, dana), but the Google site
encoded it.
Is there any function that takes a string and the encoding to use, and
returns a string made of two pairs per letter:
1. one that identifies the language
2. one that identifies the specific letter
like in my example? Then I will find the encoding I'm looking for.
thanks
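The pattern described here is what UTF-8 produces: every Hebrew letter (U+05D0 to U+05EA) becomes two bytes, the first of which is always 0xD7, and each byte is then written as a %XX pair. A minimal sketch with the core Encode module follows; the function name percent_encode_utf8 and the spelling of the name as dalet, nun, he are assumptions made for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Encode a character string as UTF-8 bytes, then percent-encode every byte
# that is not an unreserved URL character.
sub percent_encode_utf8 {
    my ($text) = @_;
    my $bytes = encode( 'UTF-8', $text );
    $bytes =~ s/([^A-Za-z0-9\-_.~])/sprintf '%%%02X', ord $1/ge;
    return $bytes;
}

my $name = "\x{05D3}\x{05E0}\x{05D4}";      # assumed spelling: dalet, nun, he
print percent_encode_utf8($name), "\n";     # prints %D7%93%D7%A0%D7%94
```

If the target site expects a different encoding, such as ISO-8859-8, only the first argument to encode would change, and each letter would then come out as a single pair.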
------------------------------
Date: Wed, 30 Jun 2004 19:40:16 +0900
From: ko <kuujinbo@hotmail.com>
Subject: Re: Why can't I get WWW::Mechanize->find_all_links to work?
Message-Id: <2kfjkiF1m4a4U1@uni-berlin.de>
Peter M. Jagielski wrote:
> Fellow Perl programmers,
>
> I'm doing a project for an attorney that involves searching the local
> court house records via the court's web site. I'm doing a search and
> trying to get a list of all the links to the court dockets. My code
> only returns one (the 1st) link, although if you load the HTML (note
> that there's a space between "smith," and "john") into your browser
> and execute it, you can clearly see that there's 16 links/dockets.
> What am I doing wrong? Here's the code:
>
> #!/usr/bin/perl
>
> use WWW::Mechanize;
>
> my $Mech = WWW::Mechanize->new();
> my $URL = "http://www.loraincountycpcourt.org/nxquick.exe?pname=smith,
> john";
>
> $Mech->get($URL);
>
> my @Links = $Mech->find_all_links(url_regex => qr/casen=/i);
>
> foreach my $Link (@Links)
> { print $Link->url_abs . "\n"; }
>
> Thanks in advance to anyone who responds.
As has been pointed out, neither the returned HTML nor the URL you're
passing is valid.
Besides LWP, WWW::Mechanize 'use's HTML::TokeParser and URI::URL, so you
can do something like this:
#!/usr/local/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
use HTML::TokeParser; # loaded explicitly instead of relying on WWW::Mechanize
use URI::URL;
my $Mech = WWW::Mechanize->new();
my $URL = "http://www.loraincountycpcourt.org/nxquick.exe?pname=smith";
$Mech->get($URL);
get_urls( $Mech->content, $Mech->base );
sub get_urls {
    my ($content, $base) = @_;
    my $p = HTML::TokeParser->new( \$content );
    while ( my $href = $p->get_tag('a') ) {
        print URI::URL->new( $href->[1]{href}, $base )->abs, "\n";
    }
}
Have a look at the HTML::TokeParser documentation for an explanation of
the constructor and what the get_tag() method is doing.
HTH - keith
------------------------------
Date: 30 Jun 2004 06:12:05 -0700
From: peterj@insight.rr.com (Peter M. Jagielski)
Subject: Re: Why can't I get WWW::Mechanize->find_all_links to work?
Message-Id: <f5f1d08b.0406300512.3d6386f@posting.google.com>
Guys,
Thanks for all the help. I'm a Perl newbie, so I just assumed it was
me that was doing something wrong. But obviously, Mechanize is
choking on the invalid HTML. Too bad it's not more (completely?)
tolerant of invalid HTML, but oh well.
>> Even though it has nothing to do with your problem:
>> use strict;
>> use warnings;
I left those out to shorten the program listing.
>> One solution might be to run the HTML you receive through HTML
>> Tidy. I remember seeing some CPAN modules to do that. Also,
>> HTML::Parser may be able to deal with this but I haven't tried.
I'll look into HTML Tidy. Regarding HTML::Parser, I have a (longer)
version of the program that uses it, and it works fine on the invalid
HTML. I want to use Mechanize, though, because it's simpler for what
I need to accomplish.
>> The space may be giving you a problem as the other poster pointed
>> out. Try converting any spaces in your URLs to "%20". Your browser
>> will do that for you automatically, but Mechanize may not.
>> ...HTML::Entities
I'll try both approaches.
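The "%20" suggestion amounts to a one-line substitution, sketched below on the URL from the original post. The helper name escape_spaces is made up for the example, and whether the court's server accepts the result is not something this sketch can confirm.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Replace literal spaces in a URL with %20; everything else is left untouched.
sub escape_spaces {
    my ($url) = @_;
    $url =~ s/ /%20/g;
    return $url;
}

my $url = escape_spaces(
    'http://www.loraincountycpcourt.org/nxquick.exe?pname=smith, john');
print "$url\n";
```

For query values that may contain other reserved characters, the uri_escape function from the URI::Escape module does the fuller job.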
------------------------------
Date: 30 Jun 2004 07:52:31 -0700
From: peterj@insight.rr.com (Peter M. Jagielski)
Subject: Re: Why can't I get WWW::Mechanize->find_all_links to work?
Message-Id: <f5f1d08b.0406300652.2b5306d2@posting.google.com>
Keith,
Thanks for responding.
>> Neither...the URL you're passing are valid.
I know, but if I get rid of the space between "smith," and "john", the
URL (i.e., my "query") returns an empty dataset. Granted, the HTML is
broken, but it works, so I have to work within those constraints. I
contacted the court's web master, and he couldn't care less - he said
it works just fine if you use the web site, and that my "back-door"
approach via Perl to getting to the data is not his problem, which is
true.
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 6751
***************************************