[18204] in Perl-Users-Digest
Perl-Users Digest, Issue: 372 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Feb 28 00:07:05 2001
Date: Tue, 27 Feb 2001 21:05:09 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <983336709-v10-i372@ruby.oce.orst.edu>
Content-Type: text
Perl-Users Digest Tue, 27 Feb 2001 Volume: 10 Number: 372
Today's topics:
Re: Absent Config.pm <VincentMurphy@mediaone.net>
Re: Aliasing refs while using strict <micah@cowanbox.com>
camel book p102 middle <captain@intnet.net>
Re: camel book p102 middle <uri@sysarch.com>
certificate question (LDHOLMES4)
command line arguments <sks@sierra.net>
Re: command line arguments <uri@sysarch.com>
Re: command line arguments <bwalton@rochester.rr.com>
Re: Easy REGEX question <VincentMurphy@mediaone.net>
Re: Etymology of hash <bart.lateur@skynet.be>
Re: Extracting string between XML tag pair from file newsone@cdns.caNOSPAM
Re: HELP needed on a simple Parse::RecDescent program ( (Gwyn Judd)
Re: How are SOL_SOCKET and SO_REUSEADDR defined in vari <andrew@erlenstar.demon.co.uk>
Re: How are SOL_SOCKET and SO_REUSEADDR defined in vari (Mikko Tyolajarvi)
Re: Perl CGI.pm RESET problem <brondsema@my-deja.com>
Re: Perl CGI.pm RESET problem <whataman@home.com>
Re:GDBM <s1sims@home.com>
regex help needed (cRYOFAN)
regex help please (The Mosquito ScriptKiddiot)
Re: regex help please <bwalton@rochester.rr.com>
Re: Regexp to match Web urls? newsone@cdns.caNOSPAM
Scalars and Arrays <DavidTaylor@yes.co.th>
Re: Scalars and Arrays <bwalton@rochester.rr.com>
Re: Search results by filename (Tad McClellan)
Digest Administrivia (Last modified: 16 Sep 99) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Wed, 28 Feb 2001 02:21:39 GMT
From: Vinny Murphy <VincentMurphy@mediaone.net>
Subject: Re: Absent Config.pm
Message-Id: <m366hvo8d6.fsf@vmurphy-hnt1.athome.net>
>>>>> "Brett" == Brett Young <youngb@uclink.berkeley.edu> writes:
Brett> Hi All: I was trying to install a Perl module on a Unix machine
Brett> and the script told me that it could not find the Config module
Brett> used in the @INC statement. I went looking for Config.pm with
Brett> all of the other modules (or a Config directory) and it wasn't
Brett> there.
Brett> Since Config.pm seems to be a core module, how can I explain the
Brett> fact that it is absent? Can the Config.pm module even be
Brett> excluded during normal installation?
The following should show you where it is; if it prints nothing, there is a
problem with your perl distribution.
% perl -le 'for( @INC ) { print "$_/Config.pm" if -f "$_/Config.pm"}'
on DOS machines change the quoting.
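If the shell quoting gets in the way, the same check can live in a small script instead. This is only a sketch; the sub name find_in_inc is made up for illustration and is not part of any module.

```perl
use strict;
use warnings;

# Return every "<dir>/<file>" that exists as a plain file.
# find_in_inc is a hypothetical helper name.
sub find_in_inc {
    my ($file, @dirs) = @_;
    return grep { -f $_ }           # keep only paths that exist
           map  { "$_/$file" }      # build the candidate path
           grep { !ref $_ } @dirs;  # @INC may also hold code refs; skip them
}

print "$_\n" for find_in_inc('Config.pm', @INC);
```

No output at all would point to a broken or incomplete installation.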
HTH.
--Vinny
------------------------------
Date: 27 Feb 2001 20:11:53 -0800
From: Micah Cowan <micah@cowanbox.com>
Subject: Re: Aliasing refs while using strict
Message-Id: <yu8n1b7e99y.fsf@mcowan-linux.transmeta.com>
Bart Lateur <bart.lateur@skynet.be> writes:
> No. There even isn't a possibility for lexical aliases (except for
> scalars, using "for"). But you can do:
>
> use vars '*foo';
>
> That's the @foo inside the sub, not the one you make the call with, which
> also happens to be named "foo".
>
> --
> Bart.
Are you sure? The following code:
----------
#!/usr/bin/perl -w
use strict;
my ($foo, $bar);
sub foo ($) {
use vars '*bar';
local *bar = shift;
$bar = 15;
return $bar;
}
$foo = 20;
$bar = 10;
&foo(\$foo);
print "$foo\n$bar\n";
----------
Produces the following output:
20
15
Doesn't that mean it's overwriting the higher-scope one rather than
the local one?
However, when I remove the higher-scope bar from existence, it does what
it's supposed to.
The manpage specifically says use vars declares /globals/ - the
preceding seems to agree with this.... :(
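One possible reading of that output, offered as a sketch rather than a definitive answer: the file-scoped "my $bar" is still visible inside the sub, so a bare $bar there resolves to the lexical, not to the package variable that "local *bar" has just aliased. Declaring "our $bar" inside the sub (Perl 5.6 and later) makes the name refer to the package variable again. The sub name set_via_alias is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my ($foo, $bar) = (20, 10);

sub set_via_alias {
    our $bar;             # inside this sub, $bar now means $main::bar
    local *bar = shift;   # alias $main::bar to the scalar passed by reference
    $bar = 15;            # writes through the alias, i.e. into the caller's scalar
    return $bar;
}

set_via_alias(\$foo);
print "$foo\n$bar\n";     # prints 15, then 10
```

Here the lexical $bar is left alone and the referenced scalar is the one that changes.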
-Micah
------------------------------
Date: Tue, 27 Feb 2001 21:52:11 -0500
From: "Jim Marvel" <captain@intnet.net>
Subject: camel book p102 middle
Message-Id: <8IZm6.136$K4.59277@news.intnet.net>
Can someone please explain to me what is going on in
if ( "fred" & "\1\2\3\4" ) {... }
...that is, how does this attempt to
"see whether any byte came out to nonzero"?
------------------------------
Date: Wed, 28 Feb 2001 04:51:20 GMT
From: Uri Guttman <uri@sysarch.com>
Subject: Re: camel book p102 middle
Message-Id: <x7bsrnfm0o.fsf@home.sysarch.com>
>>>>> "JM" == Jim Marvel <captain@intnet.net> writes:
JM> Can someone please explain to me what is going on in
JM> if ( "fred" & "\1\2\3\4" ) {... }
JM> ...that is, how does this attempt to
JM> "see whether any byte came out to nonzero"?
well, you are missing the next part which shows you how to correct the
problem and clarifies it. the above expression masks 'fred' with the 4
bytes "\1\2\3\4" yielding 0x00020104. you can get this from:
perl -le '$a = "fred" & "\1\2\3\4" ; print unpack "H*", $a'
now, that result is not all zero bytes but supposedly you wanted to find
out if the bits in 'fred' that are masked by "\1\2\3\4" are all
zero? you can't directly test that result since a string is true
(unless it is '' or '0') even if all the bytes are zero. so you need the
extra regex /[^\0]/ which matches any non-zero byte.
the issue is that you can't directly test for all zero bytes with the
boolean tests in perl because masking 2 non-empty strings with & will
almost always return a true value.
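a small sketch of the whole test, assuming both operands are plain strings (so & works byte by byte). the sub name any_bit_set is made up for this example:

```perl
use strict;
use warnings;

# True (1) if any byte of ($string & $mask) is non-zero, else 0.
# any_bit_set is a hypothetical name, not something from the camel book.
sub any_bit_set {
    my ($string, $mask) = @_;
    my $masked = $string & $mask;        # string bitwise AND, one byte at a time
    return $masked =~ /[^\0]/ ? 1 : 0;   # look for any byte other than "\0"
}

print unpack("H*", "fred" & "\1\2\3\4"), "\n";   # 00020104
print any_bit_set("fred", "\1\2\3\4"), "\n";     # 1
print any_bit_set("dddd", "\1\2\3\0"), "\n";     # 0, though the masked string itself is "true"
```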
HTH,
uri
--
Uri Guttman --------- uri@sysarch.com ---------- http://www.sysarch.com
SYStems ARCHitecture, Software Engineering, Perl, Internet, UNIX Consulting
The Perl Books Page ----------- http://www.sysarch.com/cgi-bin/perl_books
The Best Search Engine on the Net ---------- http://www.northernlight.com
------------------------------
Date: 28 Feb 2001 02:37:43 GMT
From: ldholmes4@aol.com (LDHOLMES4)
Subject: certificate question
Message-Id: <20010227213743.19605.00000240@ng-fa1.aol.com>
I am trying to post to a website that is secured by SSL and certificates. I
am using LWP::UserAgent and I have installed
the Crypt::SSLeay module. However every secured web site I try to post to I get
the following returned.
HTTP/1.1 200 OK
Connection: close
Date: Wed, 28 Feb 2001 02:00:00 GMT
Server: Stronghold/3.0 Apache/1.3.12 C2NetEU/3011 (Unix) mod_ssl/2.6.4
OpenSSL/0.9.5a mod_perl/1.22
Content-Length: 7011
Content-Type: text/html
Client-Date: Tue, 27 Feb 2001 23:59:48 GMT
Client-Peer: 198.200.171.101:443
Client-SSL-Cert-Issuer: /C=US/O=RSA Data Security, Inc./OU=Secure Server
Certification Authority
Client-SSL-Cert-Subject: /C=US/ST=Nebraska/L=Omaha/O=Ameritrade/OU=EI/OU=Terms
of use at www.verisign.com/rpa (c)00/CN=wwws.ameritrade.com
Client-SSL-Cipher: EDH-RSA-DES-CBC3-SHA
Client-SSL-Warning: Peer certificate not verified
Set-Cookie: DEJA=2786802240075904719b; secure
Set-Cookie: Apache=172.134.193.155.15307983325600814; path=/
Title: Login to Ameritrade
Does anyone know why I get the SSL error: peer certificate not verified?
Thanks in advance.
------------------------------
Date: Wed, 28 Feb 2001 02:30:11 -0000
From: adam <sks@sierra.net>
Subject: command line arguments
Message-Id: <t9oolj3ami8o8a@corp.supernews.com>
Hi, new to PERL, writing my first program. It's a link checker for web
pages. I'm using the book "PERL for DUMMIES", and quizzing myself on the
next line of code as I go. I really like perl and I can understand most
perl code I read, I just can't use any of them. When trying to run my
link checker, I don't know how to pass the address argument on the
command line. I take that back. I don't know how to run it at all. Sure
I can double click the file (I'm in Windows by the way), but that doesn't
allow any arguments. I tried running the program from a RUN command using:
1st.pl http://www.help.com #No good
It runs, but there is "die()" code if the URL fails. I've tried
running "perl.exe" and typing the same thing in there, but no good.
How do I run a perl program with or w/o arguments? Is the problem in my
code?
Thanks a million
Adam T.
sks@sierra.net
--
Posted via CNET Help.com
http://www.help.com/
------------------------------
Date: Wed, 28 Feb 2001 04:41:06 GMT
From: Uri Guttman <uri@sysarch.com>
Subject: Re: command line arguments
Message-Id: <x7g0gzfmhp.fsf@home.sysarch.com>
>>>>> "a" == adam <sks@sierra.net> writes:
a> Hi, new to PERL, writing my first program. It's a link checker for
a> web pages. I'm using the book "PERL for DUMMIES", and quizzing
BZZZZZTT! you lose. next player.
i highly recommend you burn that for fuel. with the recent rise in home
heating costs, you can probably save money that way.
that is the worst perl book ever written (and there are so many bad
ones).
uri
--
Uri Guttman --------- uri@sysarch.com ---------- http://www.sysarch.com
SYStems ARCHitecture, Software Engineering, Perl, Internet, UNIX Consulting
The Perl Books Page ----------- http://www.sysarch.com/cgi-bin/perl_books
The Best Search Engine on the Net ---------- http://www.northernlight.com
------------------------------
Date: Wed, 28 Feb 2001 04:40:21 GMT
From: Bob Walton <bwalton@rochester.rr.com>
Subject: Re: command line arguments
Message-Id: <3A9C8198.71386192@rochester.rr.com>
adam wrote:
>
> Hi, new to PERL, writing my first program. It's a link checker for web
> pages. I'm using the book "PERL for DUMMIES", and quizzing myself on the
> next line of code as I go. I really like perl and I can understand most
> perl code I read, I just can't use any of them. When trying to run my
> link checker, I don't know how to pass the address argument on the
> command line. I take that back. I don't know how to run it at all. Sure
> I can double click the file (I'm in Windows by the way), but that doesn't
> allow any arguments. I tried running the program from a RUN command using:
> 1st.pl http://www.help.com #No good
> It runs, but there is "die()" code if the URL fails. I've tried
> running "perl.exe" and typing the same thing in there, but no good
> How do I run a perl program with or w/o argument. Is the problem in my
> code?
> Thanks a million
> Adam T.
> sks@sierra.net
...
Try at a command prompt:
perl foo.pl arg1 arg2 arg3
where foo.pl is the name of your Perl program, arg1 is your first
argument, etc. In your program, pick up the arguments as the elements of
array @ARGV, like:
print "The first argument is $ARGV[0]\n";
See:
perldoc perlrun
perldoc perlvar
for more info.
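A slightly fuller sketch of the same idea, which prints a usage line when no argument is given. The sub name first_arg is invented for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the first argument, or undef when none was given.
# first_arg is a hypothetical helper name.
sub first_arg {
    my @args = @_;
    return @args ? $args[0] : undef;
}

my $url = first_arg(@ARGV);
if (defined $url) {
    print "The first argument is $url\n";
}
else {
    print "Usage: perl $0 URL\n";
}
```

Saved as foo.pl, it would be run from a command prompt as "perl foo.pl http://www.help.com".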
--
Bob Walton
------------------------------
Date: Wed, 28 Feb 2001 02:14:08 GMT
From: Vinny Murphy <VincentMurphy@mediaone.net>
Subject: Re: Easy REGEX question
Message-Id: <m3ae77o8po.fsf@vmurphy-hnt1.athome.net>
>>>>> "Todd" == Todd Bair <todd@ti.com> writes:
Todd> I just can't seem to get this to work. How do I split a line
Todd> based on two or more spaces, but leave single spaces alone?
Todd> ie...
Todd> $line = 'column one  column two  column three'; @row =
Todd> split/ /,$line;
Todd> so that
Todd> $row[0] eq 'column one'; $row[1] eq 'column two';
Todd> Thanks,
$line = q/column one  column two  column three/;
@r = split /\s{2,}/, $line;
print map "$_\n" => @r;
This gives:
column one
column two
column three
HTH.
--Vinny
------------------------------
Date: Tue, 27 Feb 2001 23:55:32 GMT
From: Bart Lateur <bart.lateur@skynet.be>
Subject: Re: Etymology of hash
Message-Id: <hdfo9tckoafp9kj3f862q8gmcqmvs7074j@4ax.com>
Eric Bohlman wrote:
>AIUI, "hash" here really means "scramble" in the sense that a good hash
>function scrambles up the bits of the key so that similar keys lead to
>dissimilar function values.
Funny. "hash" can actually point to two different things, both of which are
essential for the workings:
* chopping up the string, reduce it to a pile of dehydrated essence.
There's no way to get the original back, but every time you do the
action with the same parameters, you'll get the same result. Hopefully,
a different parameter has a high likelihood of returning a different
value.
* chopping up the data to tackle, much like Caesar said: "divide et
impera". In this respect, the hashing lookup method is similar to binary
search, where, too, you can disregard a lot of possible matches right
from the start.
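Both senses fit in a few lines. This is a toy sketch only: toy_hash and bucket_for are invented names, and the function is not the one perl itself uses.

```perl
use strict;
use warnings;

# Sense 1: chop the key up into a number. Same key, same number, every time.
sub toy_hash {
    my ($key) = @_;
    my $h = 0;
    $h = ($h * 33 + ord $_) % 65536 for split //, $key;
    return $h;
}

# Sense 2: divide and conquer. The number picks one bucket out of $nbuckets,
# so a lookup only has to search that one bucket.
sub bucket_for {
    my ($key, $nbuckets) = @_;
    return toy_hash($key) % $nbuckets;
}

my @buckets;
push @{ $buckets[ bucket_for($_, 8) ] }, $_ for qw(apple apply angle zebra);

for my $i (0 .. $#buckets) {
    print "bucket $i: @{ $buckets[$i] || [] }\n";
}
```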
--
Bart.
------------------------------
Date: 28 Feb 2001 02:33:03 GMT
From: newsone@cdns.caNOSPAM
Subject: Re: Extracting string between XML tag pair from file
Message-Id: <97ho0v$m59$1@news.netmar.com>
In article <39910E44.28BDAFF1@eurodyn.com>, Sasa Danicic
<Sasa.Danicic@eurodyn.com> writes:
>Hello,
>
>I've got a huge file, so I need to extract the string that matches
>between the first occurrence of an XML pair of tags.
>
>So, for example, I need to extract everything between the first appearance of
><EMPLOYEE> and the first appearance of </EMPLOYEE> in a file, including
><EMPLOYEE> and </EMPLOYEE> as part of the string.
>
>The file could contain many more <EMPLOYEE>...</EMPLOYEE> tags, but I
>need only the first pair.
>
>Regards,
>Sasa
SASA:
#!/usr/local/bin/perl
print "Content-type: text/plain\n\n";
########################### MAIN PROGRAM ###################
&get_first_employee_data;
print "\n\n\nSecond Example\n\n";
&get_employee_data_one_line_format;
print "\n\nAll done\n\n";
######################### END MAIN PROGRAM ###############
##################### SUB ROUTINES #########################
sub get_first_employee_data {
# Assuming you've got a file $file.
# You don't say whether the employee data is on multiple lines, but I assume
# it is.
# You don't say whether or not there are tags within the <employee></employee>
# tags, but I'll assume there are. This will work regardless of the format
# between the <employee></employee> tags for multiple lines.
# SAMPLE FILE
my $sample_file = <<ENDFILE;
<employee>
<name>Bob Smith</name>
<dob>June 15, 1980</dob>
<department>Sales</department>
<salary>1000</salary>
</employee>
<employee>
<name>Jane Smith</name>
<dob>April 5, 1975</dob>
<department>Support</department>
<salary>1100</salary>
</employee>
<employee>
<name>John Bell</name>
<dob>February 7, 1964</dob>
<department>Sales</department>
<salary>1200</salary>
</employee>
ENDFILE
# file placed in same directory as script
my $file = "./employee.txt";
my $count = 0;
open (FILE,"<$file") or die "Cannot open $file: $!";
while ($line = <FILE>) {
chomp $line;
if ( $line =~ /<employee>/ ) {
$count++;
if ($count < 2) {
$first_employee_record .= "$line\n";
}
else {
last;
}
}
if ( $count < 2 && $line !~ /<employee>/) {
$first_employee_record .= "$line\n";
}
}
close FILE;
print $first_employee_record;
}
sub get_employee_data_one_line_format {
# SAMPLE FILE
my $count = 0;
my $file2 = "./employee2.txt";
my $sample_file2 = <<ENDFILE;
<retired_employee><name>Eric Robb</name><retiredate>June 1, 2000</retiredate><pensionamt>400</pensionamt></retired_employee>
<employee><name>Bob Smith</name><dob>June 15, 1980</dob><department>Sales</department><salary>1000</salary></employee>
<employee><name>Jane Smith</name><dob>April 5, 1975</dob><department>Support</department><salary>1100</salary></employee>
<employee><name>John Bell</name><dob>February 7, 1964</dob><department>Sales</department><salary>1200</salary></employee>
ENDFILE
# One line record type files are easy
open (FILE,"<$file2") or die "Cannot open $file2: $!";
while ($line = <FILE>) {
chomp $line;
if ($line =~ /<employee>/ && !$count) {
$count++;
$first_employee_record = $line;
}
else {
last if $count == 1;
}
}
close FILE;
print $first_employee_record;
}
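For a huge file there is also a shorter route, sketched here under the assumption that the tags carry no attributes and are spelled exactly as given. Setting the input record separator to the closing tag makes a single read stop right after the first record, so the rest of the file is never loaded. The sub name first_element_from_fh is invented for this example, and the in-memory filehandle only stands in for the real file.

```perl
use strict;
use warnings;

# Read from $fh only up to the first closing tag and return the first
# <$tag>...</$tag> span, tags included, or undef if there is none.
sub first_element_from_fh {
    my ($fh, $tag) = @_;
    local $/ = "</$tag>";                  # one "line" now ends at the closing tag
    my $chunk = <$fh>;
    return undef unless defined $chunk;
    return $chunk =~ m{(<\Q$tag\E>.*</\Q$tag\E>)\z}s ? $1 : undef;
}

my $xml = <<'ENDFILE';
<STAFF>
<EMPLOYEE>
<NAME>Bob Smith</NAME>
</EMPLOYEE>
<EMPLOYEE>
<NAME>Jane Smith</NAME>
</EMPLOYEE>
</STAFF>
ENDFILE

open my $fh, '<', \$xml or die "Cannot open in-memory file: $!";
print first_element_from_fh($fh, 'EMPLOYEE'), "\n";
close $fh;
```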
Now, if someone could just help me with my Blowfish problem posted a couple of
days ago, I'd be a happy camper :-(
Let me know if this helped.
Eric
----- Posted via NewsOne.Net: Free (anonymous) Usenet News via the Web -----
http://newsone.net/ -- Free reading and anonymous posting to 60,000+ groups
NewsOne.Net prohibits users from posting spam. If this or other posts
made through NewsOne.Net violate posting guidelines, email abuse@newsone.net
------------------------------
Date: Wed, 28 Feb 2001 04:48:12 GMT
From: tjla@guvfybir.qlaqaf.bet (Gwyn Judd)
Subject: Re: HELP needed on a simple Parse::RecDescent program (problem: some rules are matched twice)
Message-Id: <slrn99p0oa.kf8.tjla@thislove.dyndns.org>
I was shocked! How could Eric Liao <ekliao@pacbell.net>
say such a terrible thing:
>and I naively thought that
>
>execution of action = successful match of subrule
>
>which is incorrect.
Well one of the advantages of using an LL(1) grammar is that this is in fact
correct. It's not always easy to construct an LL(1) (or even LL(k))
grammar for a given language though. Way off topic for this group though
:)
--
Gwyn Judd (print `echo 'tjla@guvfybir.qlaqaf.bet' | rot13`)
Education is the process of driving a set of prejudices down your throat.
-- Martin H. Fischer
------------------------------
Date: 28 Feb 2001 03:32:22 +0000
From: Andrew Gierth <andrew@erlenstar.demon.co.uk>
Subject: Re: How are SOL_SOCKET and SO_REUSEADDR defined in various flavors of Unix?
Message-Id: <87bsrnmqih.fsf@erlenstar.demon.co.uk>
>>>>> "Kenny" == Kenny McCormack <gazelle@yin.interaccess.com> writes:
Kenny> So, I think the score so far is:
Kenny>                     Solaris   Linux, HP/UX, Win9X (probably others)
Kenny>                     =======   =====================================
Kenny> SOCK_STREAM value:  2         1
I suspect that SOCK_STREAM == 2 is a SVR4ism, because the BSD stack
has always had it as 1, but the Solaris headers are trying to make it
the same as NC_TPI_COTS. I wouldn't be surprised if other SVR4ish
systems were like Solaris in this respect.
But in all honesty, if you try and code stuff like this in your
program, you deserve all the resulting maintenance headaches.
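The portable route is to let the core Socket module supply the values for whatever platform perl was built on. A minimal sketch; the sub name reusable_tcp_socket is hypothetical:

```perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM SOL_SOCKET SO_REUSEADDR IPPROTO_TCP);

# Create a TCP socket with SO_REUSEADDR set, using symbolic constants only.
sub reusable_tcp_socket {
    socket(my $sock, PF_INET, SOCK_STREAM, IPPROTO_TCP)
        or die "socket: $!";
    setsockopt($sock, SOL_SOCKET, SO_REUSEADDR, pack('i', 1))
        or die "setsockopt: $!";
    return $sock;
}

my $sock = reusable_tcp_socket();
printf "SOCK_STREAM=%d SOL_SOCKET=%d SO_REUSEADDR=%d on %s\n",
    SOCK_STREAM, SOL_SOCKET, SO_REUSEADDR, $^O;
close $sock;
```

The printed numbers differ between systems, which is exactly why they should not be typed into a program by hand.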
--
Andrew.
comp.unix.programmer FAQ: see <URL: http://www.erlenstar.demon.co.uk/unix/>
or <URL: http://www.whitefang.com/unix/>
------------------------------
Date: 28 Feb 2001 03:38:22 GMT
From: mikko@dynas.se (Mikko Tyolajarvi)
Subject: Re: How are SOL_SOCKET and SO_REUSEADDR defined in various flavors of Unix?
Message-Id: <97hrre$2cgu$1@xlerb.dynas.se>
gazelle@yin.interaccess.com (Kenny McCormack) writes:
>In article <97h57q$2926$1@xlerb.dynas.se>,
>Mikko Tyolajarvi <mikko@dynas.se> wrote:
>...
>>>Now, where this is all coming from is that in the past, I've found that
>>>Solaris seems to do things differently than most other Unixes. So, I've
>>
>>s/Solaris/Linux/
>>
>>>done this:
>>
>>> $SOCK_STREAM = $ENV{'OSTYPE'} =~ /solaris/ ? 2 : 1;
>>
>>Apart from this being a bad idea from a maintenance perspective,
>>Solaris, AIX, HP-SUX, FreeBSD and even M$ winsock use the same
>>definitions - most likely inheritance from the BSD TCP/IP stack.
>This is, in fact, not true. I just installed ActiveState Perl on my Win9X
>machine, and took a server program from Solaris (that set SOCK_STREAM to 2)
>and tried to run it on the Win9X machine.
>You guessed it. Had to change SOCK_STREAM to 1 - after which it worked fine.
In that case you may be even worse off than you first thought: I just
ran the grep you posted (egrep 'SOL_SOCKET|REUSEA' /usr/include/*/*.h)
and that gives the results I posted (modulo a different path for
include files on windows). But, if SOCK_STREAM is all you care about,
then why ask about SOL_SOCKET and SO_REUSEADDR? ;-)
$.02,
/Mikko
--
Mikko Työläjärvi_______________________________________mikko@rsasecurity.com
RSA Security
------------------------------
Date: Wed, 28 Feb 2001 03:43:12 GMT
From: "Dave Brondsema" <brondsema@my-deja.com>
Subject: Re: Perl CGI.pm RESET problem
Message-Id: <kt_m6.11842$W05.2282260@news1.rdc1.mi.home.com>
"What A Man !" <whataman@home.com> wrote in message
news:3A9C4FF5.55DEE58A@home.com...
> How do I get my RESET button in the script below to clear
> the HTTP REFERRER out of the input field when someone hits
> RESET? I don't want to use Javascript or have to create
> another file to do this. I've been studying CGI.pm all day
> and can't figure it out. Is there a way to do this with
> raw Perl if CGI.pm won't do it?
I'm pretty sure you can't. AFAIK, when you press the reset button, all form
fields are returned to their state when the page was first loaded. This is
an HTML problem you need to work around, not Perl. I'd suggest rethinking
what you are trying to do and find a different solution. I'm sure there is
one.
Dave Brondsema
>
> #!/usr/local/bin/perl -wd
> use CGI qw(fatalsToBrowser);
> use CGI qw(:standard);
>
> print "Content-type: text/html\n\n";
>
> print "<HTML><BODY bgcolor=99CCFF topmargin=0>
> <CENTER><!--#echo banner=''-->";
>
> $query = new CGI;
>
> print $query->start_form(-method=>$method,
> -action=>$action,
> -enctype=>$encoding);
>
> print $query->textfield(-name=>'url',
> -default=>"$ENV{HTTP_REFERER}",
> -size=>60,
> -maxlength=>300);
> print "<BR>";
>
> print $query->reset(-name=>'RESET',
> -value=>'param()');
> print " ";
> print $query->submit(-name=>'submit',
> -value=>'SUBMIT');
> print "</CENTER></BODY></HTML>";
> print $query->endform;
>
>
> Thanks,
> Dennis
------------------------------
Date: Wed, 28 Feb 2001 04:27:38 GMT
From: "What A Man !" <whataman@home.com>
Subject: Re: Perl CGI.pm RESET problem
Message-Id: <3A9C7EAE.C6974EA8@home.com>
Dave Brondsema wrote:
>
> "What A Man !" <whataman@home.com> wrote in message
> news:3A9C4FF5.55DEE58A@home.com...
> > How do I get my RESET button in the script below to clear
> > the HTTP REFERRER out of the input field when someone hits
> > RESET? I don't want to use Javascript or have to create
> > another file to do this. I've been studying CGI.pm all day
> > and can't figure it out. Is there a way to do this with
> > raw Perl if CGI.pm won't do it?
> >
> > #!/usr/local/bin/perl -wd
> > use CGI qw(fatalsToBrowser);
> > use CGI qw(:standard);
> >
> > print "Content-type: text/html\n\n";
> >
> > print "<HTML><BODY bgcolor=99CCFF topmargin=0>
> > <CENTER><!--#echo banner=''-->";
> >
> > $query = new CGI;
> >
> > print $query->start_form(-method=>$method,
> > -action=>$action,
> > -enctype=>$encoding);
> >
> > print $query->textfield(-name=>'url',
> > -default=>"$ENV{HTTP_REFERER}",
> > -size=>60,
> > -maxlength=>300);
> > print "<BR>";
> >
> > print $query->reset(-name=>'RESET',
> > -value=>'param()');
> > print " ";
> > print $query->submit(-name=>'submit',
> > -value=>'SUBMIT');
> > print "</CENTER></BODY></HTML>";
> > print $query->endform;
> >
> >
> > Thanks,
> > Dennis
>
> I'm pretty sure you can't. AFAIK, when you press the reset button, all form
> fields are returned to their state when the page was first loaded. This is
> an HTML problem you need to work around, not Perl. I'd suggest rethinking
> what you are trying to do and find a different solution. I'm sure there is
> one.
>
> Dave Brondsema
>
Thanks, Dave... but I'm not one that takes NO for an
answer easily. I think I saw a way to do this with CGI.pm,
but can't figure it out.
I've been trying $query->delete('foo');
$query->param('foo', "newvalue"); and CORE::reset() trying
to get something to work.
I even changed my hand-formed HTML to CGI.pm, and that
didn't help. I may go back to it. CGI.pm doesn't seem to
be as flexible as my own HTML coding. I really don't see
this as an HTML problem, though. I see it as mostly a CGI
and Perl problem. :)
--Dennis
------------------------------
Date: Wed, 28 Feb 2001 03:31:10 GMT
From: Tuxman <s1sims@home.com>
Subject: Re:GDBM
Message-Id: <3A9C718A.900C24CE@home.com>
Hello,
Does anyone know of a good website that
demonstrates the use of the GDBM module in Perl
syntax? All documentation I've found is given in C
syntax. Maybe I'm stupid and should be able to
translate from $C->{Perl}. Or am I supposed to
embed C in Perl? Someone enlighten me, por favor.
Tuxman
------------------------------
Date: Wed, 28 Feb 2001 02:59:11 GMT
From: cryofan@mylinuxisp.com (cRYOFAN)
Subject: regex help needed
Message-Id: <3a9c61d1.957368671@news3.mylinuxisp.com>
Well, I think it's a regex problem anyway.
The code below is for my senior project, a data mining tool that runs
off a free server and is actually a CGI program. Once initiated, it
downloads some webpages which are news stories, and counts the number
of occurrences of certain words of interest (rates, confidence,
recession, etc); it then stores the word counts along with the current
time in a file.
The problem right now is that I open the first web page which contains
URLs that I want to parse out and store in another array so that I can
open an HTTP connection to these parsed URLs (no, the server I am on
does not have the HTTP package--it's free). I load that entire web
page into a string, split it by using
@array1= split(/href=/,$content);
So therefore every string in the array (after the first) should start with
whatever followed "href=". And it seems that all of them do. I am looking
for URLs of course.
The only URLs I am interested in are links to financial news stories,
and they all have the same form, so I selected them out thusly from
the array:
if($array1[$num_words-1] =~/^http\:\/\/biz\.yahoo\.com\/rb\/.*/)
Now I have a URL that links to a news story. But it has some unwanted
text and characters at the end of the string (after the ".html") that
prevent the string from being a usable URL, so I get rid of that by
using substitution thusly:
$array1[$num_words-1] =~ s/>.*$/$blank/;
Once I have successfully parsed the news story URLs, then I will store
them in another array.
This seems to work fine for all the URL strings except the first one,
which would be the zeroth string in the array into which I place
all the URLs of interest (I realize that it is not the zeroth
element/string in the first array).
That one URL string for some reason still has the unwanted characters
on the end.
All of the potential URL strings have this form:
http://biz.yahoo.com/rb/somestuff//....html><b>somestuff
So why does my code not work for that first string that I am
interested in, the first string in the page that matches my template for
URLs of interest? It works fine for all the other URL strings that come
after it. All the other URL strings are being successfully parsed and
the "><b>somestuff" is gone, but not that first one.
So currently I am just not using that first element, and the program
works fine without it. But I want that first URL because I am trying
to track the effect of positive-connotation words versus
negative-connotation words on the stock market indicators, and that
first URL to be parsed is the newest news story.
Sorry, but the editor is not kind to the code below.
Thanks.
#!/usr/local/bin/perl
use LWP::Simple;
print "Content-type:text/html\n\n";
####################
if(open(KILLFILE1, "killfile1.txt"))
{
  $line= <KILLFILE1>;
  if ($line == 1)
  {
    ##########
    if(open(DATAFILEA, ">>datafileA.txt"))
    {
      print DATAFILEA "\n 1:", time, " \n";
      $time1=time;
      do{#start of do while < time span
        $webpage = "http://finance.yahoo.com/?u";
        $content = get($webpage);
        #####################
        $blank="";
        @array1= split(/href=/,$content);
        $num_links=0;
        $num_words = @array1;
        while($num_words > 0)
        {
          if($array1[$num_words-1] =~/^http\:\/\/biz\.yahoo\.com\/rb\/.*/)
          {
            $array1[$num_words-1] =~ s/>.*$/$blank/;
            $somestring= $array1[$num_words-1];
            $array2[$num_links]=$somestring;
            $num_links++;
          }
          $num_words--;
        }#end while $num_words > 0
        ##########################
        @words= ("recession", "rates", "profits", "losses",
          "down", "up", "bull", "bear",
          "trading", "soar","drop", "bush", "greenspan", "slowing", "slowdown",
          "bears", "bulls", "bearish", "bullish","confidence",
          "confident","shaky", "fear", "capital", "economy", "positive",
          "negative", "failing", "fail", "rising", "falling", "prices",
          "consumer confidence", "rally", "rallied" );
        while($num_links > 1){#cycle thru each link
          # print "\n";
          # print $array2[$num_links-1];
          $page = $array2[$num_links-1];
          $content2 = get($page);
          $count =0;$pos=0;
          $wordcounter= @words;
          print DATAFILEA ("For link: ");
          print DATAFILEA ($array2[$num_links-1], "\n");
          ($secs, $mins, $hrs,$days, $mons, $yr)=(localtime)[0,1,2,3,4,5];
          print DATAFILEA ("\nDate and Time: ",$mons+1, " ",$days," ",$yr+1900," ",
            $hrs," ",$mins," ",$secs, "\n");
          #for each each link, count each word of interest
          while($wordcounter >0){
            while (($pos=index($content2,$words[$wordcounter-1] ,$pos)) != -1){
              $count++; $pos++;
            }#end while loop to count a word
            print DATAFILEA ($words[$wordcounter-1]);
            print DATAFILEA (" ");
            print DATAFILEA ($count, " ");
            $count=0; $pos=0;
            $wordcounter--;
          }#end while $wordcounter > 0
          $num_links--;
        }#end while $num_links >1
        sleep(60);
        $time2=time;
      }while($time2 < ($time1+600));#end do while loop
      print "\n", "it's over!";
      close(DATAFILEA);
    }#end if datafileA successfully opened
    else{
      print"\n", "datafileA not opened!","\n";
    }
    close(DATAFILEA);
  }#end if killfile == 1
  else{
    print "\n", "killed by killfile", "\n";
  }
  close(KILLFILE1);
}#end if KILLFILE1 successfully opened
else
{
  print "\n", "Killfile1 did not open!","\n";
}
########END of FILE################
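One possible cause, offered as a guess since the page itself is not shown: "." does not match a newline, so s/>.*$// does nothing when the text after the ">" in a chunk runs over more than one line. A sketch that sidesteps the split altogether captures only the URL characters. The sub name story_urls and the sample HTML are made up, and it assumes the href values are either unquoted or quoted.

```perl
use strict;
use warnings;

# Return every biz.yahoo.com/rb/ story URL found in $html, in page order.
# story_urls is a hypothetical helper name.
sub story_urls {
    my ($html) = @_;
    # Optional quote after href=, then capture up to the first
    # whitespace, quote or ">" so nothing trailing can sneak in.
    return $html =~ m{href=["']?(http://biz\.yahoo\.com/rb/[^\s"'>]+)}gi;
}

my $page = <<'HTML';
<a href=http://biz.yahoo.com/rb/010227/a.html><b>First
story</b></a>
<a href="http://biz.yahoo.com/rb/010227/b.html"><b>Second</b></a>
<a href=http://other.example.com/x.html>skip</a>
HTML

print "$_\n" for story_urls($page);
```

Because the URLs come back in page order, the newest story is simply the first element of the returned list.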
------------------------------
Date: 28 Feb 2001 03:21:15 GMT
From: anotherway83@aol.com (The Mosquito ScriptKiddiot)
Subject: regex help please
Message-Id: <20010227222115.05068.00000607@ng-ch1.aol.com>
hey
this problem has been bugging me for a long time and i haven't been able to
solve it hence this post
the problem is that i need to simplify sumthing like
( ( ( ( ( x ) ) ) ) )
into just ( x )
this sounded easy to me, but the complications just kept adding up
now, the things in between an open bracket and a close bracket could include a
combination of the following 10 characters
+ (the plus sign)
- (the minus sign)
^ (the raised-to-the-power-of sign)
/ (the divide sign)
* (the multiplication sign)
. (a decimal point)
( (an open bracket)
) (a close bracket)
x (any letter of the alphabet, but for now lets assume it's only x)
n (any number)
so for instance i need to simplify sumthing like
( ( ( ( ( ( x - ( 2 + ( 2x + 3 ( 4x + 5 ) - 8.345 ) ^ - 4 * ( 5 - x / 2 ) + 4 )
- 6 ) ) ) ) ) )
into just ( x - ( 2 + ( 2x + 3 ( 4x + 5 ) - 8.345 ) ^ - 4 * ( 5 - x / 2 ) + 4 )
- 6 )
those extra brackets are the ones i need to get rid of
lets say the whole thing is stored in a variable called $poly
so i tried
$poly=~ s/ \ ( ( \ ( . * \ ) ) \ ) / $1 /x;
of course that did NOT work if i had sumthing like
( ( ( x ) ) ) ( ( ( x ) ) )
because it would match the last pair of close brackets
u'll mite say thats because of "greediness", but i think i tried limiting
it....i doubt that will work, because new problems seem to crop up when u do
that....but if any of u can suggest a regexp of any kind that u'll think might
work, please do
it mite sound like im asking my work to be dun for me....but trust me, as a
newbie i've tried everything!!! (within my knowledge, that is)
thanks for any help
The Mosquito ScriptKiddiot
Championing the Cause of Mosquitoes in Technology
------------------------------
Date: Wed, 28 Feb 2001 04:31:57 GMT
From: Bob Walton <bwalton@rochester.rr.com>
Subject: Re: regex help please
Message-Id: <3A9C7FA0.22EFEE57@rochester.rr.com>
The Mosquito ScriptKiddiot wrote:
>
> hey
>
> this problem has been bugging me for a long time and i haven't been able to
> solve it hence this post
>
> the problem is that i need to simplify sumthing like
>
> ( ( ( ( ( x ) ) ) ) )
>
> into just ( x )
>
> this sounded easy to me, but the complications just kept adding up
>
> now, the things in between an open bracket and a close bracket could include a
> combination of the following 10 characters
>
> + (the plus sign)
> - (the minus sign)
> ^ (the raised-to-the-power-of sign)
> / (the divide sign)
> * (the multiplication sign)
> . (a decimal point)
> ( (an open bracket)
> ) (a close bracket)
> x (any letter of the alphabet, but for now lets assume it's only x)
> n (any number)
>
> so for instance i need to simplify sumthing like
>
> ( ( ( ( ( ( x - ( 2 + ( 2x + 3 ( 4x + 5 ) - 8.345 ) ^ - 4 * ( 5 - x / 2 ) + 4 )
> - 6 ) ) ) ) ) )
>
> into just ( x - ( 2 + ( 2x + 3 ( 4x + 5 ) - 8.345 ) ^ - 4 * ( 5 - x / 2 ) + 4 )
> - 6 )
>
> those extra brackets are the ones i need to get rid of
>
> lets say the whole thing is stored in a variable called $poly
>
> so i tried
>
> $poly=~ s/ \ ( ( \ ( . * \ ) ) \ ) / $1 /x;
>
> of course that did NOT work if i had sumthing like
>
> ( ( ( x ) ) ) ( ( ( x ) ) )
>
> because it would match the last pair of close brackets
>
> u'll mite say thats because of "greediness", but i think i tried limiting
> it....i doubt that will work, because new problems seem to crop up when u do
> that....but if any of u can suggest a regexp of any kind that u'll think might
> work, please do
>
> it mite sound like im asking my work to be dun for me....but trust me, as a
> newbie i've tried everything!!! (within my knowledge, that is)
>
> thanks for any help
>
> The Mosquito ScriptKiddiot
> Championing the Cause of Mosquitoes in Technology
You might find the following sub useful. It creates a regex that
matches a string that is balanced with respect to parentheses, provided
the parens are not nested deeper than a depth specified by the
argument. This should aid you in deciphering your strings. Call it
with something like:
$bal=make_parenmatching_regex(10);
and use the resulting string in regexes like:
$string=~/($bal)/;
for example. The pattern will seek first the leftmost, then the longest
string which is balanced with respect to parentheses.
##
## Given DEPTH, return a regex which will match a string with up
## to DEPTH levels of nested parens.
##
sub make_parenmatching_regex {
    my($depth) = @_;
    my($nonparen) = '[^()]';
    "($nonparen|\\(" x $depth . "$nonparen*" . '\))*' x ($depth-1) .
    '\))+';
}
(Thanks to Jeffrey Friedl for the original code)
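To see what the generated pattern accepts, here is a small sketch. The sub is repeated (with lexical variables) so the example runs on its own; the sample strings and the \A...\z anchoring are my additions for illustration, not part of the original code.

```perl
use strict;
use warnings;

# Same construction as the sub above, using lexical variables.
sub make_parenmatching_regex {
    my ($depth) = @_;
    my $nonparen = '[^()]';
    return "($nonparen|\\(" x $depth . "$nonparen*"
         . '\))*' x ($depth - 1) . '\))+';
}

my $bal = make_parenmatching_regex(3);

# Anchored at both ends, the pattern accepts only strings whose parens
# balance and nest no deeper than the requested depth (3 here).
for my $s ('(x)', '((x)+(y))', '(((x)))', '((x)', '((((x))))') {
    printf "%-12s %s\n", $s,
        ($s =~ /\A$bal\z/ ? 'balanced' : 'not matched');
}
```

The first three strings should be reported as balanced. The fourth is unbalanced and the fifth nests four deep, so neither matches with a depth of 3.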
You might also consider one of the parser modules, like maybe
Parse::Yapp.
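If you want to just restrict the problem to removing pairs of leading (
and trailing ), maybe something like:
while($string=~s/^\((.*)\)$/$1/){};
would work? Note that this removes every outer pair, and it assumes
the first ( and the last ) match each other; given (x)(x) it would
leave x)(x. In your examples, you'd have to crunch out the spaces
first, maybe with something like:
$string=~s/\s+//g;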
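A sketch that checks the outermost pair really is one pair before stripping it, and that stops while one pair is still left, might look like the following. The helper names outer_pair_matches and strip_redundant_parens are made up for this example. It only deals with the outermost layers; redundant parens deeper inside the expression would still need a parser or a pattern like $bal.

```perl
use strict;
use warnings;

# True if $s begins with a "(" whose matching ")" is the final character.
sub outer_pair_matches {
    my ($s) = @_;
    return 0 unless $s =~ /\A\(.*\)\z/s;
    my $depth = 0;
    my $last  = length($s) - 1;
    for my $i (0 .. $last) {
        my $c = substr($s, $i, 1);
        $depth++ if $c eq '(';
        $depth-- if $c eq ')';
        # Depth reaching zero before the end means the first "(" was
        # closed early, so the first and last parens are not a pair.
        return 0 if $depth <= 0 && $i < $last;
    }
    return $depth == 0 ? 1 : 0;
}

# Hypothetical helper: drop doubled outer parens, keeping one pair.
sub strip_redundant_parens {
    my ($s) = @_;
    $s =~ s/\s+//g;                  # crunch out the spaces first
    while (outer_pair_matches($s)) {
        my $inner = substr($s, 1, -1);
        last unless outer_pair_matches($inner);
        $s = $inner;
    }
    return $s;
}

print strip_redundant_parens('( ( ( ( ( x ) ) ) ) )'), "\n";
print strip_redundant_parens('( ( ( x ) ) ) ( ( ( x ) ) )'), "\n";
```

The first call should print (x). The second string is left alone apart from the spaces, because its first ( closes long before the last ).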
--
Bob Walton
------------------------------
Date: 28 Feb 2001 02:39:39 GMT
From: newsone@cdns.caNOSPAM
Subject: Re: Regexp to match Web urls?
Message-Id: <97hodb$miv$1@news.netmar.com>
In article <971bpq$l33$1@panix3.panix.com>, Clay Shirky <clays@panix.com>
writes:
>I need the canonical regexp to match urls beginning with http:// (I
>don't need to worry about ftp:, telnet: or mailto:, in other words)
>and though I don't want to roll my own, Google searches of the form
>
> regexp url http
>
>are useless because url and http appear everywhere.
>
>Any pointers appreciated.
>
>-clay
$line = "http://www.help.com";
if ($line =~ /^http:\/\//) {
print "Yes, it starts with http://\n";
}
Is that what you mean?
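If the goal is to pull http:// URLs out of running text rather than to test the start of a line, a deliberately loose sketch along these lines might do. The pattern is my own rough approximation, not a canonical or complete URL grammar, and the sample text is made up. Using qr{} avoids having to escape the slashes.

```perl
use strict;
use warnings;

# Loose pattern: "http://", a host made of letters, digits, dots and
# hyphens, an optional port, and an optional path/query that stops at
# whitespace, angle brackets or a double quote.
my $http_url = qr{http://[A-Za-z0-9.-]+(?::\d+)?(?:/[^\s<>"]*)?};

my $text = 'See http://www.help.com and http://example.org:8080/a/b?c=d for details.';

# In list context with /g, the match returns every captured URL.
my @urls = $text =~ /($http_url)/g;
print "$_\n" for @urls;
```

One known weakness: a URL that ends a sentence will pick up the trailing full stop, so real text may need some post-processing.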
Eric
------------------------------
Date: Wed, 28 Feb 2001 08:55:14 +0700
From: "David Taylor" <DavidTaylor@yes.co.th>
Subject: Scalars and Arrays
Message-Id: <97hmem$sg3$1@news.loxinfo.co.th>
Could someone help me please with getting a scalar value back into an array?
Extract from my program:
### Read in the data file - works ok.
open (FILE,"<data.db") or die "Unable to open data.db. \nReason: $!";
while (<FILE>) { $db .= $_; }
close (FILE);
### Then Sort it, etc. here, subroutine returns $sorted, - works ok.
### Cannot seem to get $sorted back into an array, have tried umpteen
### ways, such as.... push @{lines},$sorted;
### So have resorted to writing the file to disk and reading it back
### in again, which works fine, but can't be the right way to do it!
#######################################################
open (DATA,">temp.txt") or die "Unable to open temp.txt \nReason: $!";
print DATA $sorted; close DATA;
open (DATA,"<temp.txt") or die "Unable to open temp.txt \nReason: $!";
(@lines) = <DATA>; close (DATA);
#######################################################
### Write the file to disk - works ok.
open (DATA,">data.csv") or die "Unable to open data.csv \nReason: $!";
foreach $line(@lines) {
@fields = split(/\|/,$line);
print DATA "\"$fields[22]\",\"$fields[23]\",\"$fields[24]\"\n";
$mcount++;}
close DATA;
### END
------------------------------
Date: Wed, 28 Feb 2001 04:44:27 GMT
From: Bob Walton <bwalton@rochester.rr.com>
Subject: Re: Scalars and Arrays
Message-Id: <3A9C8285.E9208315@rochester.rr.com>
David Taylor wrote:
>
> Could someone help me please with getting a scalar value back into an array?
> Extract from my program:
> ### Read in the data file - works ok.
> open (FILE,"<data.db") or die "Unable to open data.db. \nReason: $!";
> while (<FILE>) { $db .= $_; }
> close (FILE);
>
> ### Then Sort it, etc. here, subroutine returns $sorted, - works ok.
>
> ### Cannot seem to get $sorted back into an array, have tried umpteen
> ### ways, such as.... push @{lines},$sorted;
> ### So have resorted to writing the file to disk and reading it back
> ### in again, which works fine, but can't be the right way to do it!
> #######################################################
> open (DATA,">temp.txt") or die "Unable to open temp.txt \nReason: $!";
> print DATA $sorted; close DATA;
>
> open (DATA,"<temp.txt") or die "Unable to open temp.txt \nReason: $!";
> (@lines) = <DATA>; close (DATA);
> #######################################################
>
> ### Write the file to disk - works ok.
> open (DATA,">data.csv") or die "Unable to open data.csv \nReason: $!";
> foreach $line(@lines) {
> @fields = split(/\|/,$line);
> print DATA "\"$fields[22]\",\"$fields[23]\",\"$fields[24]\"\n";
> $mcount++;}
> close DATA;
> ### END
What you are trying to do is very unclear from the above. However, I'll
bet the sub you are calling is returning a reference to an array. This
reference could be dereferenced with:
@array=@$sorted;
Then the elements of the array can be picked off with $array[0],
$array[1], etc. Or you could get them directly via $$sorted[0],
$$sorted[1], etc.
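If, on the other hand, $sorted is one long string with embedded newlines (which the print-to-temp.txt round trip suggests, though that is a guess on my part), split can rebuild the array without touching the disk. A minimal sketch with made-up sample data:

```perl
use strict;
use warnings;

# Made-up stand-in for the string the sort subroutine returns.
my $sorted = "a|1|x\nb|2|y\nc|3|z\n";

# split /^/ breaks the string at the start of each line and keeps the
# trailing newlines, just as reading the temp file back would.
my @lines = split /^/, $sorted;

print scalar(@lines), " lines\n";
print "first line: $lines[0]";
```

Each element of @lines ends in its newline, so the later split(/\|/,$line) loop would see the same data it gets from the temp file.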
--
Bob Walton
------------------------------
Date: Tue, 27 Feb 2001 22:02:58 -0500
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: Search results by filename
Message-Id: <slrn99oqj2.h92.tadmc@tadmc26.august.net>
Chris <chris002@nc.rr.com> wrote:
>
>I am trying to locate a perl script (that runs on NT - I have one for UNIX)
If it was written carefully, then it should work on both platforms.
What happens when you try the program that you do have on NT?
Have you tried it on NT?
>that will allow a web based search of a directory (plus all of its
>subdirectories) and all its files based solely on the filenames
>If anyone can point me in the direction of a script that can perform this
>function I would be most appreciative.
That is so simple to do in Perl that I doubt that it would be
worth saving. You just write the few lines it takes whenever
you need them.
You need a module to do the dir walking for you:
perldoc File::Find
You need a regular expression to match whatever it is you want to match:
perldoc perlre
You need an "i" option on the pattern match operator:
perldoc perlop
-------------------------
#!/usr/bin/perl -w
use strict;
use File::Find;
my $term = 'picture';
find( \&find_match, '.' );
sub find_match { print "$_\n" if /$term/i };
-------------------------
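Inside the wanted sub, $_ is only the bare file name in the directory being visited; $File::Find::name holds the path including the directory, which is probably what a web search page wants to show. A sketch of that variant follows. The helper name find_by_name is made up, and unlike the script above it uses \Q...\E so the search term is matched literally rather than as a pattern, which is an assumption about what is wanted.

```perl
#!/usr/bin/perl -w
use strict;
use File::Find;

# Hypothetical helper: return the paths of plain files under $dir whose
# file name (not the directory part) contains $term, ignoring case.
sub find_by_name {
    my ($dir, $term) = @_;
    my @hits;
    find( sub {
        # $_ is the bare file name; $File::Find::name is the full path.
        push @hits, $File::Find::name if -f $_ && /\Q$term\E/i;
    }, $dir );
    my @sorted = sort @hits;
    return @sorted;
}

print "$_\n" for find_by_name('.', 'picture');
```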
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: 16 Sep 99 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 16 Sep 99)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
| NOTE: The mail to news gateway, and thus the ability to submit articles
| through this service to the newsgroup, has been removed. I do not have
| time to individually vet each article to make sure that someone isn't
| abusing the service, and I no longer have any desire to waste my time
| dealing with the campus admins when some fool complains to them about an
| article that has come through the gateway instead of complaining
| to the source.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address; I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 372
**************************************