
Perl-Users Digest, Issue: 6748 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Jun 30 00:10:35 2004

Date: Tue, 29 Jun 2004 21:10:12 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Tue, 29 Jun 2004     Volume: 10 Number: 6748

Today's topics:
    Re: Problem with Gtk2 and POE <troc@pobox.com>
    Re: Setting environment variables from a Perl script (J. Romano)
        Why can't I get WWW::Mechanize->find_all_links to work? (Peter M. Jagielski)
    Re: Why can't I get WWW::Mechanize->find_all_links to w (Randal L. Schwartz)
    Re: Why can't I get WWW::Mechanize->find_all_links to w <1usa@llenroc.ude>
    Re: Why can't I get WWW::Mechanize->find_all_links to w (Walter Roberson)
    Re: Why can't I get WWW::Mechanize->find_all_links to w <wcitoan@NOSPAM-yahoo.com>
    Re: Why can't I get WWW::Mechanize->find_all_links to w <1usa@llenroc.ude>
    Re: Why can't I get WWW::Mechanize->find_all_links to w <wcitoan@NOSPAM-yahoo.com>
    Re: Why can't I get WWW::Mechanize->find_all_links to w <wcitoan@NOSPAM-yahoo.com>
    Re: win32 - access shared folder <MrReallyVeryNice.NOVIRUS@NoSpam.yahoo.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Wed, 30 Jun 2004 02:54:13 GMT
From: Rocco Caputo <troc@pobox.com>
Subject: Re: Problem with Gtk2 and POE
Message-Id: <slrnce4bbj.hdf.troc@eyrie.homenet>

On Tue, 29 Jun 2004 22:15:17 +0200, Krisztian VASAS wrote:
>
> Well, now it works quite well, but I have another problem and I can't
> figure out what it is...

[...]

> I've tried to reduce the program to the smallest amount of code that still 
> triggers the problem, and it looks like this is Gtk2's (?) fault.
>
> Here's the code: http://www.nomorepasting.com/paste.php?pasteID=15366

When I run that code, I get:

  2) poerbook:~/projects/support% perl gtk2-client.perl 
  Bareword "menuitem_quit" not allowed while "strict subs" in use \
    at gtk2-client.perl line 73.
  BEGIN not safe after errors--compilation aborted \
    at gtk2-client.perl line 243.

After I changed line 73 to be

        $window->signal_connect("destroy", \&menuitem_quit );

I started getting these errors:

  Gtk-WARNING **: gtk_item_factory_create_item(): Can't specify a \
    callback on a branch: "/_File" at gtk2-client.perl line 85.
  Gtk-WARNING **: gtk_item_factory_create_item(): Can't specify a \
    callback on a branch: "/_Settings" at gtk2-client.perl line 85.
  Gtk-WARNING **: gtk_item_factory_create_item(): Can't specify a \
    callback on a branch: "/_Help" at gtk2-client.perl line 85.
  Fontconfig error: Cannot load default config file
  2 -> login (from gtk2-client.perl at 132)
  2 -> _stop (from \
    /Users/troc/projects/poe/poe/lib/POE/Resource/Sessions.pm at 492)

The Gtk-WARNING lines say you're not setting up @menu_items correctly.
Sure enough, they go away when I replace the 0 callbacks with undef.

The \&loginablak callback for "<control>N" should probably be
$session->postback("login") instead.  The program doesn't immediately
exit when I change it here, but it also doesn't connect to a server when
I try that.

Line 203 reads

  $session->postback("ar_conn")

You probably should add this to gui_start()

  $kernel->alias_set("gui");

and then replace line 203 with

  $poe_kernel->post("gui", "ar_conn");

 ... and that makes the [Ok] button connect to the server.

I have placed an updated version of your script at
http://www.nomorepasting.com/paste.php?pasteID=15396

Have fun!

-- 
Rocco Caputo - http://poe.perl.org/


------------------------------

Date: 29 Jun 2004 20:51:45 -0700
From: jl_post@hotmail.com (J. Romano)
Subject: Re: Setting environment variables from a Perl script
Message-Id: <b893f5d4.0406291951.e4e9ae1@posting.google.com>

> On 2004-06-29, J. Romano <jl_post@hotmail.com> wrote:
> 
> >    Of course, the changes made by my solution aren't permanent (in
> > that, once you exit your shell and restart it, the changes made by the
> > Perl script are no longer there), but there are times when a user
> > needs to set up the ideal environment in which to begin work on a
> > specific task (I've done it several times myself), but doesn't want
> > those environment changes to be permanent.

Keith Keller <kkeller-usenet@wombat.san-francisco.ca.us> responded in
message news:<sp0sbc.bgc.ln@goaway.wombat.san-francisco.ca.us>...
> 
> Don't you think it's a bit overkill to use a Perl script to set
> shell environment variables?

   That's a good question.  I'll begin by saying that I used to write
shell scripts to do just that before I took the time to learn Perl.  I
was able to write fairly elaborate scripts, but it was very
frustrating at times:  between bash, csh, and ksh, I couldn't always
keep the dialectal differences straight, to the point that I had
trouble remembering how to do a simple loop or even a simple
assignment.  In some cases, the shell language I started with lacked a
feature that another shell had, so I had to re-write my script from
scratch using the other shell language.

   That's one of the reasons I liked Perl so much when I started
learning it.  It had a superset of the shells' features in a syntax
that made sense to a C programmer.

> Why not (for example) write a bash script that accomplishes
> the same task?

   Because the Perl script is much easier to write and understand. 
I'll give you an example:

   Back when I was a student in college, as part of our homework
assignments, we had to run specific programs on the Unix machines.
Many times these programs required that certain environment variables
be set and, as a result, the instructors very often provided us with a
file to "source", like with the following command:

      source cs400

That would set our environment variables (and maybe even change our
working directories).  One of the environment variables that always
got modified was $PATH.  As a result, my $PATH became huge, with many
duplicate entries, which made troubleshooting Unix problems difficult.

   I did not know Perl at the time, so I used csh to write myself a
script to simplify my $PATH.  It worked, but it was not easy.  Once I
had written the script, I could run a line like:

      source simplifyPath

and duplicates would automatically be removed from my $PATH.

   I don't remember how I wrote it, but I do remember that it was more
than five lines of code.  And here's where Perl comes in:  With this
Perl script that is effectively four lines long, I achieve the same
effect:


#!/usr/bin/perl -w
# File:  simplifyPath.pl
use strict;
my %seen;  # store paths already seen

my @pathList = split /:/, $ENV{PATH};
@pathList = grep { ! $seen{$_}++ } @pathList;
$ENV{PATH} = join ':', @pathList;

exec $ENV{SHELL};
__END__


   Now, instead of typing "source simplifyPath", I type:

      exec simplifyPath.pl

and I get a new $PATH that has all the duplicate paths removed.
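The duplicate-removal idiom in that script (the %seen grep) works on
its own, too.  Here is a minimal, self-contained sketch of the same
technique; the sample path entries are made up for illustration:

```perl
use strict;
use warnings;

# Same idiom as simplifyPath.pl: keep the first occurrence of each
# path component, drop later duplicates, preserve order.
sub simplify_path {
    my ($path) = @_;
    my %seen;
    return join ':', grep { !$seen{$_}++ } split /:/, $path;
}

print simplify_path('/usr/bin:/bin:/usr/bin:/usr/local/bin:/bin'), "\n";
# prints /usr/bin:/bin:/usr/local/bin
```

The trick is that $seen{$_}++ is false the first time a component is
seen and true every time after, so grep keeps only first occurrences.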

   Could I have done it with a bash script?  Definitely.  Like I said,
I've done it before with csh, but setting my environment variables and
my current directory with Perl gives me all the power of the Perl
programming language.

   And this "Perl power" is why, periodically, we get users asking how
to use Perl to set environment variables that outlast the Perl script.
 Unfortunately, most of them receive the answer that "it can't be
done."  And that's the reason behind my posts -- to show that it CAN
be done.

   I hope this answers your questions, Keith.

   -- Jean-Luc


------------------------------

Date: 29 Jun 2004 18:59:51 -0700
From: peterj@insight.rr.com (Peter M. Jagielski)
Subject: Why can't I get WWW::Mechanize->find_all_links to work?
Message-Id: <f5f1d08b.0406291759.3217fa41@posting.google.com>

Fellow Perl programmers,

I'm doing a project for an attorney that involves searching the local
courthouse records via the court's web site.  I'm doing a search and
trying to get a list of all the links to the court dockets.  My code
only returns one (the 1st) link, although if you load the HTML (note
that there's a space between "smith," and "john") into your browser
and execute it, you can clearly see that there's 16 links/dockets. 
What am I doing wrong?  Here's the code:

#!/usr/bin/perl

use WWW::Mechanize;

my $Mech = WWW::Mechanize->new();
my $URL  = "http://www.loraincountycpcourt.org/nxquick.exe?pname=smith,
john";

$Mech->get($URL);

my @Links = $Mech->find_all_links(url_regex => qr/casen=/i);

foreach my $Link (@Links)
  { print $Link->url_abs . "\n"; }

Thanks in advance to anyone who responds.


------------------------------

Date: Wed, 30 Jun 2004 02:26:03 GMT
From: merlyn@stonehenge.com (Randal L. Schwartz)
To: peterj@insight.rr.com (Peter M. Jagielski)
Subject: Re: Why can't I get WWW::Mechanize->find_all_links to work?
Message-Id: <6a479d302e62aba5d787933ae7a7d02c@news.teranews.com>

>>>>> "Peter" == Peter M Jagielski <peterj@insight.rr.com> writes:

Peter> (note
Peter> that there's a space between "smith," and "john")

Peter> my $URL  = "http://www.loraincountycpcourt.org/nxquick.exe?pname=smith,
Peter> john";

A URL with a space in it is ILLEGAL.  If mechanize doesn't handle it,
that's not the fault of Mechanize.

print "Just another Perl hacker,"; # the original

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!


------------------------------

Date: 30 Jun 2004 02:47:52 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude>
Subject: Re: Why can't I get WWW::Mechanize->find_all_links to work?
Message-Id: <Xns9517E7EABE28Casu1cornelledu@132.236.56.8>

peterj@insight.rr.com (Peter M. Jagielski) wrote in 
news:f5f1d08b.0406291759.3217fa41@posting.google.com:

> #!/usr/bin/perl

Even though it has nothing to do with your problem:

use strict;
use warnings;

> use WWW::Mechanize;
> 
> my $Mech = WWW::Mechanize->new();
> my $URL  = "http://www.loraincountycpcourt.org/nxquick.exe?pname=smith,
> john";
> 
> $Mech->get($URL);
> 
> my @Links = $Mech->find_all_links(url_regex => qr/casen=/i);

use Data::Dumper;
print Dumper \@Links;

 
> foreach my $Link (@Links)
>   { print $Link->url_abs . "\n"; }

The server returns invalid HTML. Looking at the output from Dumper:

$VAR1 = [
          bless( [
                   'nxhist.exe?casen=93CV110536                ',
                   'SMITH #138-753, OTTO M P COLLINS WARDEN, TERRY J 
05/27/93 93CV110536 SMITH 3RD,
DBA, EURIE W D OHIO DEPARTMENT OF TAXATION 05/16/88 CJ 110/13 SMITH A 
MINOR, BRADY P JOHNSON, AL 05/

 ...

RON A D HOUSEHOLD REALTY CORPORATION 02/01/91 91CV105717',
                   undef,
                   'a',
                   bless( do{\(my $o = 
'http://www.loraincountycpcourt.org/nxquick.exe?pname=smith')
}, 'URI::http' )
                 ], 'WWW::Mechanize::Link' )
        ];

This indicates the presence of invalid HTML. Looking at the page source 
confirms the problem:

<TR><td width="34%"><small><strong>               <A HREF="nxhist.exe?
casen=93CV110536                ">SMITH  #138-753, OTTO M       </TD>
<td width="7%" align="center"><small><strong>     P</TD>
<td width="29%"><small><strong>                   COLLINS  WARDEN, TERRY 
J      </TD>
<td width="10%" align="center"><small><strong>    05/27/93</TD>
<td width="20%"><small><strong>                   93CV110536                    
</TD></TR>
<TR><td width="34%"><small><strong>               <A HREF="nxhist.exe?
casen=CJ+110/13                 ">SMITH  3RD, DBA, EURIE W      </TD>
<td width="7%" align="center"><small><strong>     D</TD>
<td width="29%"><small><strong>                   OHIO DEPARTMENT OF 

etc etc so on and so forth. People get paid to write this crap I guess.

HTML Tidy has a lot to say:

ttt.html:26:52: Warning: <a> escaping malformed URI reference
ttt.html:26:135: Warning: missing </a> before </td>
ttt.html:26:135: Warning: missing </strong> before </td>
ttt.html:26:135: Warning: missing </small> before </td>
ttt.html:27:1: Warning: plain text isn't allowed in <tr> elements
ttt.html:27:52: Warning: missing </strong> before </td>
ttt.html:27:52: Warning: missing </small> before </td>
ttt.html:28:81: Warning: missing </strong> before </td>

One solution might be to run the HTML you receive through HTML Tidy. I 
remember seeing some CPAN modules to do that.

Also, HTML::Parser may be able to deal with this but I haven't tried.
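For a quick one-off scrape of markup this broken, another (fragile)
fallback is to pull the HREF attributes out directly with a regex.
This is not a substitute for a real parser, and the sample input below
is abbreviated from the page source above:

```perl
use strict;
use warnings;

# Crude HREF extractor for a one-off scrape of known-broken HTML.
# A real parser (HTML::Parser, HTML::TreeBuilder) is preferable
# whenever the markup is sane.
sub extract_hrefs {
    my ($html) = @_;
    my @hrefs;
    while ($html =~ /<a\s[^>]*href\s*=\s*"([^"]*)"/gis) {
        my $href = $1;
        $href =~ s/\s+\z//;   # this server pads its HREFs with spaces
        push @hrefs, $href;
    }
    return @hrefs;
}

my $html = '<TR><td><A HREF="nxhist.exe?casen=93CV110536   ">SMITH</TD>'
         . '<td><A HREF="nxhist.exe?casen=CJ+110/13   ">SMITH 3RD</TD>';
print "$_\n" for extract_hrefs($html);
# prints:
#   nxhist.exe?casen=93CV110536
#   nxhist.exe?casen=CJ+110/13
```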

-- 
A. Sinan Unur
1usa@llenroc.ude (reverse each component for email address)


------------------------------

Date: 30 Jun 2004 02:56:06 GMT
From: roberson@ibd.nrc-cnrc.gc.ca (Walter Roberson)
Subject: Re: Why can't I get WWW::Mechanize->find_all_links to work?
Message-Id: <cbta46$fih$1@canopus.cc.umanitoba.ca>

In article <6a479d302e62aba5d787933ae7a7d02c@news.teranews.com>,
Randal L. Schwartz <merlyn@stonehenge.com> wrote:
:A URL with a space in it is ILLEGAL.

Amazing. Is that one of the clauses that got snuck into the
Patriot Act? Can the offenders be hauled before the International
Criminal Court for Crimes Against Humanity?
-- 
   Preposterous!! Where would all the calculators go?!


------------------------------

Date: Wed, 30 Jun 2004 02:57:45 -0000
From: "W. Citoan" <wcitoan@NOSPAM-yahoo.com>
Subject: Re: Why can't I get WWW::Mechanize->find_all_links to work?
Message-Id: <slrnce4b12.2vg.wcitoan@wcitoan-via.supernews.com>

On 29 Jun 2004 18:59:51 -0700, Peter M. Jagielski wrote:
>  
>  although if you load the HTML (note that there's a space between
>  "smith," and "john") into your browser and execute it, you can
>  clearly see that there's 16 links/dockets. 

The space may be giving you a problem as the other poster pointed out.
Try converting any spaces in your URLs to "%20".  Your browser will do
that for you automatically, but Mechanize may not.

>  my $URL  =
>  "http://www.loraincountycpcourt.org/nxquick.exe?pname=smith, john";

   $URL =~ s/ /%20/g;
   
- W. Citoan
-- 
What signature?


------------------------------

Date: 30 Jun 2004 03:01:18 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude>
Subject: Re: Why can't I get WWW::Mechanize->find_all_links to work?
Message-Id: <Xns9517EA32034C3asu1cornelledu@132.236.56.8>

"W. Citoan" <wcitoan@NOSPAM-yahoo.com> wrote in 
news:slrnce4b12.2vg.wcitoan@wcitoan-via.supernews.com:

> On 29 Jun 2004 18:59:51 -0700, Peter M. Jagielski wrote:
>>  
>>  although if you load the HTML (note that there's a space between
>>  "smith," and "john") into your browser and execute it, you can
>>  clearly see that there's 16 links/dockets. 
> 
> The space may be giving you a problem as the other poster pointed out.
> Try converting any spaces in your URLs to "%20".  Your browser will do
> that for you automatically, but Mechanize may not.
> 
>>  my $URL  =
>>  "http://www.loraincountycpcourt.org/nxquick.exe?pname=smith, john";
> 
>    $URL =~ s/ /%20/g
>    
> - W. Citoan

A better idea would be to use URI::Escape's uri_escape(). However, that 
is not the OP's problem. See my response where I used the following URL:

http://www.loraincountycpcourt.org/nxquick.exe?pname=smith

The HTML returned is not legal and WWW::Mechanize is choking on it.
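On the escaping side issue, for what it's worth: URI::Escape (from the
URI distribution) is the usual tool, but the percent-encoding itself
is simple enough to sketch in pure Perl, encoding everything outside
RFC 3986's unreserved set:

```perl
use strict;
use warnings;

# Percent-encode everything outside the RFC 3986 "unreserved" set
# (letters, digits, '-', '_', '.', '~').  URI::Escape's uri_escape()
# does the same job, with more options.
sub percent_encode {
    my ($s) = @_;
    $s =~ s/([^A-Za-z0-9\-_.~])/sprintf('%%%02X', ord $1)/ge;
    return $s;
}

print 'pname=', percent_encode('smith, john'), "\n";
# prints pname=smith%2C%20john
```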

-- 
A. Sinan Unur
1usa@llenroc.ude (reverse each component for email address)


------------------------------

Date: Wed, 30 Jun 2004 03:11:16 -0000
From: "W. Citoan" <wcitoan@NOSPAM-yahoo.com>
Subject: Re: Why can't I get WWW::Mechanize->find_all_links to work?
Message-Id: <slrnce4bqd.2vg.wcitoan@wcitoan-via.supernews.com>

On Wed, 30 Jun 2004 02:57:45 -0000, W. Citoan wrote:
>  
>  The space may be giving you a problem as the other poster pointed
>  out.  Try converting any spaces in your URLs to "%20".  Your browser
>  will do that for you automatically, but Mechanize may not.

Ignore me.  Mechanize appears to be doing the right thing with spaces.
Should have actually tried testing it before posting...

- W. Citoan
-- 
What signature?


------------------------------

Date: Wed, 30 Jun 2004 03:17:32 -0000
From: "W. Citoan" <wcitoan@NOSPAM-yahoo.com>
Subject: Re: Why can't I get WWW::Mechanize->find_all_links to work?
Message-Id: <slrnce4c64.2vg.wcitoan@wcitoan-via.supernews.com>

On 30 Jun 2004 03:01:18 GMT, A. Sinan Unur wrote:
>  
>  A better idea would be to use URI::Escape's uri_escape(). However,
>  that is not the OP's problem.

Yes, I already realized that.  It comes from trying to take the easy
way out: not actually testing a theory before posting, then thinking
better of it and testing after I post.  Someday I'll learn...

- W. Citoan
-- 
What signature?


------------------------------

Date: Tue, 29 Jun 2004 19:28:56 -0700
From: "MrReallyVeryNice" <MrReallyVeryNice.NOVIRUS@NoSpam.yahoo.com>
Subject: Re: win32 - access shared folder
Message-Id: <or-dnSDSXrxnuH_dRVn-sw@comcast.com>

First of all, let me offer you a little _untested_ sample script that will
hopefully help you:

--- Beginning ---
use strict;
use warnings;

# Define these four values to match your environment:
my $IP        = '192.168.1.192';
my $ShareName = 'wwwroot';
my $UserName  = 'Administrator';
my $Password  = 'SensitivePassword';

# The following net use should connect to the share using the
# credentials defined for $UserName and $Password.
system("net use \\\\$IP\\$ShareName /u:$IP\\$UserName $Password");

# You should now have access to the share and should be able to
# copy your file over.
system("copy client.pl \\\\$IP\\$ShareName\\*.*");

--- End -----


Not being a native English speaker, I must admit that I'm shooting
slightly in the dark, because the following statement is ambiguous (to
me):

"The shared folder needs a login with login information that the user is
not supposed to have, so the login data has to be hard coded. What do I
have to do so that the script does an automatic login?"

Let me try to address what I believe to be your concern.  If you hard
code the login data (username/password) in the script, as you describe,
it is really not secure. It is pretty much equivalent to handing the
information over to the end user running the script. You might be in an
environment where users are not computer literate, but it is only a
matter of time before one user reads your source code and figures out
the username/password.

You will find many threads concerning 'hiding' your source code.  Run
the command 'perldoc -q hide' on your system, or search Google. The
bottom line is that there is no way to hide the information once the
script is running on the user's machine. You can try to create an
executable or to obfuscate the info; however, another person with a bit
of motivation and knowledge will eventually reverse engineer your code.
My purpose is not to scare you but to make you aware. Writing a script
containing a sensitive username and password is just not secure.

To 'minimize' the risk, you might want to create a generic user that is
limited to writing to the share that you want the user to access. Don't use
any sensitive user belonging to your administrators group. :-)  Also, you
have to realize that upon exit of the sample script provided above, the
connection to the share will still be there. You should consider removing
the connection to the share:

system ("net use \\\\$IP\\$ShareName /d");  # untested code
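One more hedged suggestion: system() only returns a status, and
nothing above checks it.  A small pure-Perl helper (the commands are
whatever you pass in) makes failures visible:

```perl
use strict;
use warnings;

# Run a command via system() and report failure.  system() returns -1
# if the command could not be started; otherwise it returns the wait
# status, whose high byte is the command's exit code.
sub run_checked {
    my @cmd = @_;
    my $status = system(@cmd);
    if ($status == -1) {
        warn "failed to start '@cmd': $!\n";
    } elsif ($status != 0) {
        warn "'@cmd' exited with status ", $status >> 8, "\n";
    }
    return $status == 0;
}

# e.g. run_checked('net', 'use', "\\\\$IP\\$ShareName", '/d')
#        or die "could not disconnect the share\n";
```

Passing system() a list rather than one string also avoids shell
quoting surprises when a password contains special characters.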


Don't hesitate to ask further questions or to report problems/success.

MrReallyVeryNice




------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 6748
***************************************

