[18174] in Perl-Users-Digest
Perl-Users Digest, Issue: 342 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Feb 23 09:10:31 2001
Date: Fri, 23 Feb 2001 06:10:18 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <982937417-v10-i342@ruby.oce.orst.edu>
Content-Type: text
Perl-Users Digest Fri, 23 Feb 2001 Volume: 10 Number: 342
Today's topics:
Not all HREF's vals getting replaced with my val's in s <angel@reflex-point.com>
Re: Not all HREF's vals getting replaced with my val's <c_clarkson@hotmail.com>
Re: Not all HREF's vals getting replaced with my val's <nouser@emailunwelcome.com>
Re: perl-mode in emacs: nested sub's (Kai =?iso-8859-1?q?Gro=DFjohann?=)
Re: perl-mode in emacs: nested sub's (Kai =?iso-8859-1?q?Gro=DFjohann?=)
Re: Persistent pages? <tore@extend.no>
Re: Problem with Sockets (Anno Siegel)
Re: question about arrays (Abigail)
Re: Reading from __DATA__ <mitiaNOSPAM@northwestern.edu.invalid>
Re: Regexp to match Web urls? (Damian James)
Re: Regexp to match Web urls? (Abigail)
Still Outlook Express (was Re: Help!) egwong@netcom.com
submit formbox to perlscript <florian.handle@chello.at>
Digest Administrivia (Last modified: 16 Sep 99) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Thu, 22 Feb 2001 23:18:03 -0600
From: "}ANGEL{" <angel@reflex-point.com>
Subject: Not all HREF's vals getting replaced with my val's in s/[HREF val]/[My val]/g...?
Message-Id: <4pml6.63$eS2.183580@nnrp3.sbc.net>
Calling all Saints:
In the following code, I read all HREF's in an HTML file and attempt to
replace the value of the HREF. In this code, I spit the HREF's out to
my browser so I can see that the script is picking up all of the
HREF's...and it does this flawlessly. But when I s/[HREF value]/[my
value]/g it works on some, but not all HREF's. The code is a bit
choppy because I am a little green. Perhaps my explanation of the
problem is as well. However, any moderately helpful conjecture is
sincerely appreciated. The following is the complete script:
#!/usr/bin/perl5.004
use CGI::Carp qw(fatalsToBrowser);
require "cgi-lib.pl";
&ReadParse;
print &PrintHeader;
open(NewDbFile,"</home/xpedite-houston/www/upload/Message_Body.htm") ||
&ShitBiskits;
flock(NewDbFile, 2);
my @links = <NewDbFile>;
flock(NewDbFile, 8);
close(NewDbFile);
my $doMe;
foreach (@links) {
$doMe .= $_;
while ($_ =~ m/(href="\S+")/g) { #<-----Get the HREF's
push(@listOlinks,$1);
}
}
foreach (@listOlinks) {
print $_."<BR>\n";
$doMe =~ s/$_/href\="\[pull name\=$n.html\]"/; #<---Replace the HREF's
open(clickTo,">/home/xpedite-houston/www/upload/$n.html") ||
&ShitBiskits;
flock(clickTo, 2);
print clickTo "<html><head><title>Loading...</title><meta
http-equiv=\"REFRESH\" content=\"0; URL=$_\"></head></html>" ;
flock(clickTo, 8);
close(clickTo);
$n++;
}
open(NewFile,">/home/xpedite-houston/www/upload/newMessage.htm") ||
&ShitBiskits;
flock(NewFile, 2);
print NewFile $doMe;
flock(NewFile, 8);
close(NewFile);
print "All Done!";
exit();
Sincere & Drowning,
ANGEL
------------------------------
Date: Fri, 23 Feb 2001 02:07:59 -0600
From: "Charles K. Clarkson" <c_clarkson@hotmail.com>
Subject: Re: Not all HREF's vals getting replaced with my val's in s/[HREF val]/[My val]/g...?
Message-Id: <C87EB5F64E5F7C56.7181C8B4F864533A.187C11D0C3719A87@lp.airnews.net>
"}ANGEL{" <angel@reflex-point.com> wrote in message
news:4pml6.63$eS2.183580@nnrp3.sbc.net...
: Calling all Saints:
:
: In the following code, I read all HREF's in an HTML file and attempt to
: replace the value of the HREF. In this code, I spit the HREF's out to
: my browser so I can see that the script is picking up all of the
: HREF's...and it does this flawlessly. But when I s/[HREF value]/[my
: value]/g it works on some, but not all HREF's. The code is a bit
: choppy because I am a little green. Perhaps my explanation of the
: problem is as well. However, any moderately helpful conjecture is
: sincerely appreciated. The following is the complete script:
:
: #!/usr/bin/perl5.004
You should really, really, really, really, use strict
and warnings. It will really jump start your learning curve.
#!/usr/bin/perl5.004 -w
use strict;
use diagnostics;
:
: use CGI::Carp qw(fatalsToBrowser);
: require "cgi-lib.pl";
: &ReadParse;
: print &PrintHeader;
:
: open(NewDbFile,
: "</home/xpedite-houston/www/upload/Message_Body.htm") ||
: &ShitBiskits;
: flock(NewDbFile, 2);
: my @links = <NewDbFile>;
: flock(NewDbFile, 8);
: close(NewDbFile);
:
: my $doMe;
: foreach (@links) {
: $doMe .= $_;
: while ($_ =~ m/(href="\S+")/g) { #<-----Get the HREF's
Are you sure all the hrefs use double quotes? HTML allows
single quotes as well.
HTH,
Charles K. Clarkson
: push(@listOlinks,$1);
: }
: }
:
: foreach (@listOlinks) {
: print $_."<BR>\n";
: $doMe =~ s/$_/href\="\[pull name\=$n.html\]"/; #<---Replace the HREF's
: open(clickTo,">/home/xpedite-houston/www/upload/$n.html") ||
: &ShitBiskits;
: flock(clickTo, 2);
: print clickTo "<html><head><title>Loading...</title><meta
: http-equiv=\"REFRESH\" content=\"0; URL=$_\"></head></html>" ;
: flock(clickTo, 8);
: close(clickTo);
: $n++;
: }
:
: open(NewFile,">/home/xpedite-houston/www/upload/newMessage.htm") ||
: &ShitBiskits;
: flock(NewFile, 2);
: print NewFile $doMe;
: flock(NewFile, 8);
: close(NewFile);
:
: print "All Done!";
:
: exit();
:
: Sincere & Drowning,
: ANGEL
:
:
:
------------------------------
Date: Fri, 23 Feb 2001 06:52:57 -0500
From: Jay Tilton <nouser@emailunwelcome.com>
Subject: Re: Not all HREF's vals getting replaced with my val's in s/[HREF val]/[My val]/g...?
Message-Id: <oeic9toko6b0u4b6sqqvi7k8rmiiq590hv@4ax.com>
"}ANGEL{" <angel@reflex-point.com> wrote:
>But when I s/[HREF value]/[my value]/g it works on some, but not all HREF's.
>$doMe =~ s/$_/href\="\[pull name\=$n.html\]"/; #<---Replace the HREF's
You lost your /g.
Something else to be aware of. Any regex meta-characters in $_ will
be interpreted as part of the regex instead of as literal characters.
A better technique would be
$doMe =~ s/\Q$_\E/href\="\[pull name\=$n.html\]"/g;
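A minimal sketch of why both the /g and the \Q...\E matter, assuming a
made-up link and a hypothetical helper named replace_literal (neither
comes from the original script):

```perl
use strict;
use warnings;

# Replace every literal occurrence of $old in $text with $new.
# \Q...\E quotes regex meta-characters in $old; /g replaces all matches.
sub replace_literal {
    my ($text, $old, $new) = @_;
    $text =~ s/\Q$old\E/$new/g;
    return $text;
}

# Hypothetical sample data: the same link appears twice.
my $html = '<a href="page.cgi?id=1">one</a> <a href="page.cgi?id=1">again</a>';
my $old  = 'href="page.cgi?id=1"';
my $new  = 'href="[pull name=0.html]"';

# Without \Q the "?" acts as a quantifier on the preceding "i", so this
# pattern does not match the literal text and nothing is replaced.
(my $unquoted = $html) =~ s/$old/$new/g;

print "without \\Q: $unquoted\n";
print "with \\Q:    ", replace_literal($html, $old, $new), "\n";
```

Links containing characters such as ? . + or ( are the ones most likely
to be skipped when the pattern is interpolated without quoting.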
------------------------------
Date: 23 Feb 2001 13:16:38 +0100
From: Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai =?iso-8859-1?q?Gro=DFjohann?=)
Subject: Re: perl-mode in emacs: nested sub's
Message-Id: <vaf8zmxzja1.fsf@lucy.cs.uni-dortmund.de>
On Thu, 22 Feb 2001, ivo welch wrote:
>
> looks like I need cperl. I found version 4.24. It loads fine (and
> works indeed much better), but when I try to byte-compile it, I get
>
> Compiling file /tmp/cperl-mode.el at Thu Feb 22 21:17:03 2001
> ** reference to free variable cperl-nonoverridable-face
> While compiling toplevel forms:
> !! End of file during parsing
Looks like an incomplete line.
The pre-Emacs-20.3 autoload statement should work for Emacs 20.7, too.
kai
--
Be indiscrete. Do it continuously.
------------------------------
Date: 23 Feb 2001 13:17:28 +0100
From: Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai =?iso-8859-1?q?Gro=DFjohann?=)
Subject: Re: perl-mode in emacs: nested sub's
Message-Id: <vaf4rxlzj8n.fsf@lucy.cs.uni-dortmund.de>
On Thu, 22 Feb 2001, ivo welch wrote:
>
> the code says that prior to 20.3, one should put ;; (autoload
> 'perl-mode "cperl-mode" "alternate mode for editing Perl programs"
> t) but what do I say in 20.7.1? Just putting it into my
> /usr/lib/emacs/site-lisp/ directory and hoping for autoinvocation
> does not do it.
Sorry. I think this helps:
(add-to-list 'auto-mode-alist '("\\.[pP][lLmM]\\'" . cperl-mode))
kai
--
Be indiscrete. Do it continuously.
------------------------------
Date: Fri, 23 Feb 2001 12:36:22 +0100
From: Tore Aursand <tore@extend.no>
Subject: Re: Persistent pages?
Message-Id: <MPG.1500699e605f38e69898c1@news.online.no>
In article <971m1i$27c$1@news.service.uci.edu>, rwm@techie.com says...
> I'm interested in creating a dynamic page that can be appended to,
> preserving the previous content
I would write the _data_ (not the page itself) from the previous
page to a file, and later read it back from that file.
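A rough sketch of that approach, assuming one entry per line and two
hypothetical helpers, append_entry and read_entries (the temporary file
only stands in for wherever the data would really live):

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Append one line of data to the store, holding an exclusive lock.
sub append_entry {
    my ($file, $entry) = @_;
    open my $fh, '>>', $file or die "open $file: $!";
    flock($fh, LOCK_EX)      or die "flock $file: $!";
    print {$fh} "$entry\n";
    close $fh                or die "close $file: $!";
    return;
}

# Read all stored lines back; an unreadable or missing file yields no entries.
sub read_entries {
    my ($file) = @_;
    open my $fh, '<', $file or return;
    flock($fh, LOCK_SH)     or die "flock $file: $!";
    chomp(my @entries = <$fh>);
    close $fh;
    return @entries;
}

# Demo: each "request" appends its data, then the page is rebuilt from the file.
my (undef, $store) = tempfile(UNLINK => 1);
append_entry($store, 'first comment');
append_entry($store, 'second comment');
print "<p>$_</p>\n" for read_entries($store);
```

The page itself is regenerated on every request, so only the data has
to persist between requests.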
--
Tore Aursand - tore@extend.no - http://www.extend.no/~tore/
------------------------------
Date: 23 Feb 2001 13:35:17 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: Problem with Sockets
Message-Id: <975oul$82h$1@mamenchi.zrz.TU-Berlin.DE>
According to Cornelius Siller <corni99@nibbles.de>:
> Hi!
>
> I've tried to set up a bidirectional socket connection between a server
> and a client, both programs running on the same computer.
>
> I want to send some data from the server to the client at first, like a
> question, and then have the client respond to that question. My problem
> is, the client doesn't stop waiting for data from the server, even when
> the server has sent a newline character.
You have that backwards. It's the server that sits there and
waits for a client to connect. The difference in the code for
the server vs. that for the client reflects that. You can't
reverse the roles at a whim.
> How do I get the client to realize that the server is done sending?
>
> On the server side I've used the following code:
>
> $proto = getprotobyname('tcp');
> socket(Server, PF_INET, SOCK_STREAM, $proto) or die "socket: $!";
> bind(Server, sockaddr_in($port, INADDR_ANY)) or die "bind: $!";
> listen(Server, SOMAXCONN) or die "listen: $!";
> $SIG{CHLD} = \&REAPER;
> $paddr = accept(Client, Server);
> $|=1;
That won't help any. $| is a per-filehandle variable and pertains
to the currently selected filehandle. So you want something like
this:
{ my $sel_sav = select( Client); $| = 1; select( $sel_sav) }
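The select idiom can be sketched on its own like this. The helper names
enable_autoflush and autoflush_enabled are made up for illustration, and
an in-memory filehandle stands in for the socket:

```perl
use strict;
use warnings;

# Turn on autoflush for one handle without changing the selected handle.
sub enable_autoflush {
    my ($fh) = @_;
    my $previous = select($fh);   # make $fh the selected handle
    $| = 1;                       # applies to $fh only
    select($previous);            # restore whatever was selected before
    return;
}

# Report whether autoflush is set on a given handle.
sub autoflush_enabled {
    my ($fh) = @_;
    my $previous = select($fh);
    my $state    = $|;
    select($previous);
    return $state ? 1 : 0;
}

open my $handle, '>', \my $buffer or die "open: $!";
print "before: ", autoflush_enabled($handle), "\n";
enable_autoflush($handle);
print "after:  ", autoflush_enabled($handle), "\n";
```

With IO::Handle loaded, $handle->autoflush(1) does the same thing in one
call.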
> print Client "aaaaaaaaaaaaa\n";
> $num = <Client>;
This won't work. The client must take the initiative.
> Client side:
>
> $proto = getprotobyname('tcp');
> socket(SOCK, PF_INET, SOCK_STREAM, $proto) or die "socket: $!";
> connect(SOCK, $paddr) or die "connect: $!";
> $line = <SOCK>;
> print $line;
> $|=1;
See above.
> print SOCK "blabla\n";
Anno
------------------------------
Date: 23 Feb 2001 09:21:45 GMT
From: abigail@foad.org (Abigail)
Subject: Re: question about arrays
Message-Id: <slrn99cat9.j9m.abigail@tsathoggua.rlyeh.net>
John Hamm (johnhamm@wpi.edu) wrote on MMDCCXXXIII September MCMXCIII in
<URL:news:Pine.OSF.4.30.0102222015030.26728-100000@grover.WPI.EDU>:
// Hi there,
// I'm using GD for creating images and for some weird reason it won't accept
// an array that I create like:
//
// for ($i=0;$i<10;$i++)
// {
// $myarray[$i] = $i;
// }
Define "won't accept". Do you get an error?
// it has to be created like:
//
// @myarray = [
// 0,
// 1,
// 2,
// 3,
// 4,
// 5,
// 6,
// 7,
// 8,
// 9,
// ];
That's a different array. The for loop creates an array with 10 elements,
the latter an array with one element: a reference to an array
containing 10 elements.
The for loop can be written as: @myarray = ( 0 .. 9 );
The second one can be written as: @myarray = ([0 .. 9]);
Big difference.
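A small sketch that makes the difference visible, using a hypothetical
helper element_count that just reports how many elements it was given:

```perl
use strict;
use warnings;

# Number of elements in the list passed in.
sub element_count { return scalar @_ }

my @flat   = (0 .. 9);      # ten plain values
my @nested = ([0 .. 9]);    # one value: a reference to a ten-element array

print "flat:   ", element_count(@flat), " elements\n";
print "nested: ", element_count(@nested), " element, of type ",
      ref($nested[0]), "\n";
print "inside: ", element_count(@{ $nested[0] }), " elements\n";
```

Dereferencing with @{ $nested[0] } gets back to the ten values.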
Abigail
--
echo "==== ======= ==== ======"|perl -pes/=/J/|perl -pes/==/us/|perl -pes/=/t/\
|perl -pes/=/A/|perl -pes/=/n/|perl -pes/=/o/|perl -pes/==/th/|perl -pes/=/e/\
|perl -pes/=/r/|perl -pes/=/P/|perl -pes/=/e/|perl -pes/==/rl/|perl -pes/=/H/\
|perl -pes/=/a/|perl -pes/=/c/|perl -pes/=/k/|perl -pes/==/er/|perl -pes/=/./;
------------------------------
Date: Fri, 23 Feb 2001 01:53:40 -0600
From: Dmitry Epstein <mitiaNOSPAM@northwestern.edu.invalid>
Subject: Re: Reading from __DATA__
Message-Id: <3A961704.5D09982@northwestern.edu.invalid>
Thanks everyone for replies. It works now.
> After the seek, do:
>
> 1 until <> eq "__DATA__\n";
>
> Abigail
OK, I am not a Perl expert, so let me ask a stupid question: doesn't <>
read from STDIN or a file from the parameter list? You obviously meant
it to read from the source file.
--
Dmitry Epstein
Northwestern University, Evanston, IL. USA
mitia@northwestern.edu
------------------------------
Date: 23 Feb 2001 06:59:46 GMT
From: damian@qimr.edu.au (Damian James)
Subject: Re: Regexp to match Web urls?
Message-Id: <slrn99c2ht.r21.damian@puma.qimr.edu.au>
Thus spake Abigail on 21 Feb 2001 22:55:12 GMT:
>...
>
>Sorry, that allows too much.
>
>Here's a better one (it cheats on ldap:// though) (remove the newlines):
>
>(?:http://(?:(?:(?:(?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?)\.
>)*(?:[a-zA-Z](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:(?:\d+)(?:\.(?:\d+)
>){3}))(?::(?:\d+))?)(?:/(?:(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F
>...
yikes! <kersplash>
<sounds in the fog>
<distant_voice value=1>
arr cap'n, a mighty noise afor the starboard bow
</distant_voice>
<distant_voice value=2>
hard a port, matey - thar be monsters
</distant_voice>
--
#!/usr/bin/perl -w
use strict;$|=1;$:=79;for $; (split//,<DATA>){print" "x($:-$_),
$;,"\x"x600,"\b"x($:-$_+1)for 0..--$:;print$;}; __END__
and that was how they captured the monster from the sea
------------------------------
Date: 23 Feb 2001 08:24:29 GMT
From: abigail@foad.org (Abigail)
Subject: Re: Regexp to match Web urls?
Message-Id: <slrn99c7ht.i3g.abigail@tsathoggua.rlyeh.net>
Eli the Bearded (elijah@workspot.net) wrote on MMDCCXXXIII September
MCMXCIII in <URL:news:eli$0102221917@qz.little-neck.ny.us>:
__ In comp.lang.perl.misc, Abigail <abigail@foad.org> wrote:
__ > Eli the Bearded (elijah@workspot.net) wrote on MMDCCXXXI September
__ > MCMXCIII in <URL:news:eli$0102211629@qz.little-neck.ny.us>:
__ > "" In comp.lang.perl.misc, Clay Shirky <clays@panix.com> wrote:
__ > "" > I need the canonical regexp to match urls beginning with http:// (I
__ > "" > don't need to worry about ftp:, telnet: or mailto:, in other words)
__ > "" > and though I don't want to roll my own, Google searches of the form
__ > "" Maybe not canonical, but
__ > "" @parts =
__ > "" m,\b
__ > "" (http) # scheme
__ > "" ://(?:
__ > "" (?:
__ > "" ([^:@/\s]+) # username -- if password in URL
__ > "" :)?
__ > "" ([^:@/\s]+) # username if no password -- otherwise password
__ > "" @
__ > "" )?
__ > "" ([^:@/\s]+) # hostname
__ > "" (?: :
__ > "" (\d+) # port number
__ > "" )?
__ > "" ( # URI start
__ > "" (/ [^\s"'>?]*) # file part
__ > "" (?: \?
__ > "" ( [^\s"'>]* ) # CGI args
__ > "" )?
__ > "" )? # URI end
__ > "" ,x;
__ > Sorry, that allows too much.
__
__ What does it allow? A sample URL please.
http://foo.bar.baz/%not!
is not a valid URL, yet it is entirely consumed by your regex.
Other false positives:
http://foo.bar.baz/<<<
http://127.0.0.1/{nope}
or anything containing control characters, or characters outside of ASCII.
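A rough sketch of the difference, not the full RFC 1738 grammar: the
loose pattern keeps only the host and path parts of the regex quoted
above, the strict one allows only RFC 1738 URL characters or %XX
escapes in the path and query. The host rule in the strict pattern is a
simplification, and the names loose_accepts and strict_accepts are made
up for this example.

```perl
use strict;
use warnings;

# One URL character: an allowed literal, or "%" followed by two hex digits.
my $uchar = qr{ [a-zA-Z\d\$\-_.+!*'(),;:\@&=] | % [a-fA-F\d]{2} }x;

# Loose: anything that is not whitespace, a quote, ">" or "?" may be in the path.
my $loose = qr{ \A http:// [^:\@/\s]+ (?: / [^\s"'>?]* )? \z }x;

# Strict: path and query are built only from $uchar (host rule simplified).
my $strict = qr{
    \A http:// [a-zA-Z\d.-]+ (?: : \d+ )?
    (?: / (?: $uchar | / )* (?: \? $uchar* )? )?
    \z
}x;

sub loose_accepts  { return $_[0] =~ $loose  ? 1 : 0 }
sub strict_accepts { return $_[0] =~ $strict ? 1 : 0 }

for my $url ('http://foo.bar.baz/%not!',
             'http://foo.bar.baz/<<<',
             'http://127.0.0.1/{nope}',
             'http://foo.bar.baz/%20ok') {
    printf "%-28s loose=%d strict=%d\n",
        $url, loose_accepts($url), strict_accepts($url);
}
```

The loose pattern accepts all four strings; the strict one accepts only
the last, because "%no" is not a valid escape and "<", "{" and "}" are
not allowed URL characters.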
__ > Here's a better one (it cheats on ldap:// though) (remove the newlines):
__
__ Use /x.
Eh, no. /x removes whitespace, but it gets confused when there's whitespace
inside tokens.
__ > (?:http://(?:(?:(?:(?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?)\.
__ > )*(?:[a-zA-Z](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:(?:\d+)(?:\.(?:\d+)
__ > ){3}))(?::(?:\d+))?)(?:/(?:(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F
__ > \d]{2}))|[;:@&=])*)(?:/(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{
__ > 2}))|[;:@&=])*))*)(?:\?(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{
__ > 2}))|[;:@&=])*))?)?)
__
__ That's the only part needed for http URLs, the rest is different schemes.
__
__ I notice also that this is somewhat strictly following the RFCs in
__ some regards, rather than following what works in practice. But it
__ still doesn't catch things like bad IP addresses.
RFC 1738 doesn't define "good" and "bad" IP addresses. As for ignoring
standards and reverse engineering the utter shit of Netscape and Microsoft,
please say no. Let's not help them rape the standards for their own
profits.
__ my $re = qr^$httppart^x;
__
__ while(<>) {
__ if(/($re)/) {
__ print "Matched <$1>\n";
__ } else {
__ print "No match\n";
__ }
__ }
__
__ http://eli:S3CrET@some.machine:8080/cgi/readmail?mess=17&f=1
__ Matched <http://eli>
RFC 1738 clearly states HTTP URLs do not have usernames and passwords.
__ http://2130706433/2130706433-is-127.0.0.1-as-one-number/
__ No match
2130706433 is not a valid IP address. If it works for you, it's just
because your system libraries are nice. It's not valid by RFC 1738.
__ http://18918181.1818181.23932939.83883:888888888888888/
__ Matched <http://18918181.1818181.23932939.83883:888888888888888/>
Valid according to RFC 1738.
__ So I fail to see the advantage of this for http scheme URLs over my
__ RE, which does not claim to be strictly RFC compliant and would
__ match all three of these.
The advantage is that my regex uses a standard as a basis. Yours just
matches some set of strings, which may or may not coincide with what "works"
on some browser/server/platform combinations. Your idea of what's a
URL may or may not match someone else's idea. What's the point of having
standards when everyone makes their own?
/.*/s
is nice and simple and will match any URL that works on at least one
browser. I doubt it's useful.
Abigail
--
package Just_another_Perl_Hacker; sub print {($_=$_[0])=~ s/_/ /g;
print } sub __PACKAGE__ { &
print ( __PACKAGE__)} &
__PACKAGE__
( )
------------------------------
Date: 23 Feb 2001 06:01:02 GMT
From: egwong@netcom.com
Subject: Still Outlook Express (was Re: Help!)
Message-Id: <974uau$7ca4$1@newssvr05-en0.news.prodigy.com>
Orville Bullitt III <o.bullittiii@worldnet.att.net> wrote:
> ERic <egwong@netcom.com> wrote:
>> Orville Bullitt III <o.bullittiii@worldnet.att.net> wrote:
>>>
>>> I hope that this is the correct NG to ask this question:
>>>
>>> I'm using Microsoft's Outlook Express 5.50.
>>>
>> No, it is not. I can't even, in my wildest dreams,
>> begin to imagine why you think it might be. You might try
>> news://microsoft.public.outlook.usage or asking tech support at
>> www.microsoft.com
>
> I wrote in this NG because a friend of mine told me that he **THOUGHT**
> perl could do it but he had no idea how to do it. He's over 85 years old and
> I'm over 90.
As I said, perl *can* do it, but what does that have to do with Outlook
Express? *Traditionally*, one lurks on a newsgroup for a couple of days
before one posts, just to make sure it's the right place. In real life,
you wouldn't walk into a random store downtown and ask for a hammer,
would you? No, you'd look around and make sure you're in a hardware
store first.
> I guess that this NG is NOT the place where one can get help.
Well, at least not the kind of help you're looking for. I certainly
could have just written "No, piss off" rather than actually looking for
and finding a better newsgroup for you, but, well, perhaps I'm just
a sucker for punishment.
My previous "Good luck" was meant to be neither ironic nor facetious,
and neither is this one:
Good luck.
ERic
------------------------------
Date: Fri, 23 Feb 2001 12:45:18 GMT
From: "Florian Handle" <florian.handle@chello.at>
Subject: submit formbox to perlscript
Message-Id: <yXsl6.68215$GC5.2399552@news.chello.at>
How can I submit a form (by clicking the submitbutton) to a Perlscript and
write the value of a formbox into a textfile. The site shouldn't change if
this is possible.
Thanks
------------------------------
Date: 16 Sep 99 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 16 Sep 99)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
| NOTE: The mail to news gateway, and thus the ability to submit articles
| through this service to the newsgroup, has been removed. I do not have
| time to individually vet each article to make sure that someone isn't
| abusing the service, and I no longer have any desire to waste my time
| dealing with the campus admins when some fool complains to them about an
| article that has come through the gateway instead of complaining
| to the source.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 342
**************************************