[7134] in Perl-Users-Digest
Perl-Users Digest, Issue: 759 Volume: 8
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Jul 23 22:07:55 1997
Date: Wed, 23 Jul 97 19:00:42 -0700
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Wed, 23 Jul 1997 Volume: 8 Number: 759
Today's topics:
1 <webmaster@restaurantdigest.com>
Cant seem to reset $1, $2 $3... (Kevin Swope)
Re: CGI.pm file uploads with Internet Explorer <asparks@harris.com>
Check THIS out <Alphabet@letters.edu>
Re: Checking for valid Email... <maelstrom@deathsdoor.com>
Re: Checking for valid Email... (Gabor)
Re: Checking for valid Email... (Brian - DKOnline)
Re: Checking for valid Email... <maelstrom@deathsdoor.com>
Re: Frame Maker to html conversion <kluff@enterprise.net>
Re: how to do system(@array) with backticks (Ilya Zakharevich)
How to get rid of junkmail <klausf@mucsun.sps.mot.com>
Newbie: Backtracking with regexp (Stefan Berglund)
Re: Parsing a list of strings <mark@tstonramp.com>
Re: Perl Scripts and Windows NT <rootbeer@teleport.com>
Re: Please Help: Pattern Matching (Tad McClellan)
Q: efficient way to use subarrays <klausf@mucsun.sps.mot.com>
Re: Regex: Email address format <merlyn@stonehenge.com>
Re: Regex: Email address format <usenet-tag@qz.little-neck.ny.us>
Re: Regex: Email format <sfairey@adc.metrica.co.uk>
Re: Regex: Email format <cherold@pathfinder.com>
SOLVED: Writing to a file descriptor (Dragomir R. Radev)
Re: Sorting (Sanjay Varma)
stdout problem <ksato@mda.ca>
Re: sybase extenstions to perl (Sybase: :DBlib) (Dave Cross)
text processing <demirkol@sgs-server.Stanford.EDU>
Re: The regex that doesn't want to work!@#%_(_)^ <rootbeer@teleport.com>
Digest Administrivia (Last modified: 8 Mar 97) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Tue, 22 Jul 1997 20:32:04 +0000
From: Mark McDonald <webmaster@restaurantdigest.com>
Subject: 1
Message-Id: <33D518C2.18A7@restaurantdigest.com>
1
------------------------------
Date: 23 Jul 1997 17:27:44 GMT
From: obsidian@shore.net (Kevin Swope)
Subject: Cant seem to reset $1, $2 $3...
Message-Id: <5r5eug$4dq@fridge-nf0.shore.net>
What am I doing wrong?
I need to reset the backreference variables, but nothing is happening.
Is reset the wrong function to use? I know I can't just say $1="";
Here's a test script I'm using, but I took out the reset.
----------------------------------------------------------------
$TheString="abc";
$TheString=~ s/((a)(b)(c))/$1/x;
print "\n\n$TheString\n\n";
print "\n\n\$1=$1\n\n";
print "\n\n\$2=$2\n\n";
print "\n\n\$3=$3\n\n";
print "\n\n\$4=$4\n\n";
print "############################################";
# I want to reset here!
$TheString="a---b---c";
$TheString=~ s/((a)(b)(c))/$1/x;
print "\n\n$TheString\n\n";
print "\n\$1=$1\n\n";
print "\n\$2=$2\n\n";
print "\n\$3=$3\n\n";
print "\n\$4=$4\n\n";
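A note on what is probably going on: $1, $2 and friends are only set by a successful match, and a failed match leaves the values from the last success in place. The second substitution cannot match "a---b---c", so the old values get printed again. Rather than resetting them, the usual approach is to check whether the match succeeded before looking at them. A minimal sketch (the helper name captures_for is made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: return the four captures for a string, or an empty
# list when the pattern does not match at all.
sub captures_for {
    my ($string) = @_;
    if ($string =~ /((a)(b)(c))/) {
        return ($1, $2, $3, $4);   # safe: the match just succeeded
    }
    return ();                     # no match, so do not trust $1..$4
}

for my $string ("abc", "a---b---c") {
    my @captures = captures_for($string);
    if (@captures) {
        print "$string: @captures\n";
    } else {
        print "$string: (no match)\n";
    }
}
```

Copying the captures into ordinary variables right after a successful match avoids the stale-value problem entirely.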
------------------------------
Date: Wed, 23 Jul 1997 13:55:35 -0700
From: Alan Sparks <asparks@harris.com>
To: Duncan Halstead <duncan.halstead@symbios.com>
Subject: Re: CGI.pm file uploads with Internet Explorer
Message-Id: <33D66FC7.67EC@harris.com>
CGI.pm will work with both, *if* you install the File Upload Add-on for
IE 3.0.
Find it at URL http://www.microsoft.com/ie/download/?/ie/download/ie.htm
For the Win95/NT platform, look for the Internet Explorer 3.02 File
Upload Add-on.
-Alan
Duncan Halstead wrote:
>
> Does anyone know if there is an easy way to do file uploads with
> Internet Explorer? CGI.pm makes file uploads really easy assuming that
> you are using Netscape 2.0 or better, but I have seen nothing similar
> for Internet Explorer.
>
> Thanks
>
> --
> ******************************************************************
> Duncan Halstead
> Integration Test Engineer
> Tools & Libraries
> Symbios Logic
> 2001 Danfield Court Fax : (970) 225-4829
> Fort Collins, CO 80525 Ph # : (970) 223-5100 x 9032
> ******************************************************************
--
Alan Sparks, IS Engineering Support asparks@harris.com
Harris Network Support Systems, Camarillo CA 93012 (805) 389-2430
------------------------------
Date: Wed, 23 Jul 1997 04:47:23 -0400
From: Letter-Man <Alphabet@letters.edu>
Subject: Check THIS out
Message-Id: <33D5C51B.4CAA@letters.edu>
Look at what this kind person wrote for Perl/IRC users.
It runs beautifully, with only one problem: it doesn't log into any
IRC server. Could someone help me figure this one out?
#!/usr/bin/perl
$| = 1;
warn("Dbot2---nicknamed scary");
while(<>) {
warn("$_");
if (/End of \/MOTD command\./) {
last;
}
if (/Henson/) {
exit(-1);
}
}
warn("Motd finished.\n");
print( "/join #test\n" );
warn("Join issued.\n");
while(<>) {
if ( /has joined/ ) {
last;
}
}
warn("Join complete. Now monitoring traffic...");
$regs="false";
$master="";
$pubs="true";
$noop="false";
while(<>) {
chop;
if ( /<(.+)> (.*)/ ) {
$nick=$1;
$text=$2;
$type="public";
&pubhandl($nick,$text);
}
if ( /\*(.+)\* (.*)/ ) {
$nick=$1;
$text=$2;
$type="private";
&privhandl($nick,$text);
}
if ( /\*\*\* (.+) (.+) has joined channel (.+)/ ) {
$nick=$1;
$login=$2;
$channel=$3;
&joinhandl($nick,$login,$channel);
}
if ( /\*\*\* Mode change "(.+) (.+)" on channel (.+) by (.+)/ )
{
$typ=$1;
$nick=$2;
$channel=$3;
&modehandl($typ,$nick,$channel);
}
}
sub privhandl {
local($nick,$text)=@_;
if ($nick eq $master) {
if ($regs eq "true") {
if ($text =~ /join (.+)/ ) {
print("/join $1\n");
}
if ($text =~ /gone/ ) {
print("/quit\n");
}
if ($text =~ /akick (.+) (.+)/) {
$anik=$1;
$akik=$2;
print("/notice $master Autokick on $anik!$akik...\n");
}
if ($text =~ /pubon/) {
$pubs="true";
print("/notice $master Public messages on.\n");
}
if ($text =~ /ops/) {
$noop="false";
print("/notice $master Ops Allowed.\n");
}
if ($text =~ /noop/) {
$noop="true";
print("/notice $master No ops allowed.\n");
}
if($text =~ /leave (.+)/) {
print("/leave $1\n");
}
if ($text =~ /dereg/) {
print("/notice $master De registering...\n");
$master="";
$regs="false";
}
}
} else {
if ($text =~ /pass (.+)/) {
if ($1 eq "k7uyx") {
print("/notice $master Access gone:$nick\n");
$master=$nick;
$regs="true";
print("/notice $nick You have full access.\n");
} else {
print("/notice $master Access try:$nick\n");
print("/notice $nick Access denied.\n");
}
}
if ($text =~ /nopubs/) {
$pubs="false";
print("/notice $nick Now not talking to channel.\n");
print("/notice $master Nopubs turned on by $nick.\n");
}
if ($text =~ /help/) {
print("/notice $nick --- Welcome to DbotII (Dbot2).\n");
print("/notice $nick --- There is no current help facility.\n");
}
if ($text =~ /poem (.+)/ ) {
print("/notice $nick Sending poem to $1...\n");
print("/notice $1 $nick sends you the following poem:\n");
&poem($1);
}
}
}
sub joinhandl {
local($nick,$login,$channel)=@_;
if ($nick eq "davo" ) {
if ($login eq "(noord\@UCS.ORST.EDU)" ) {
print("/mode $channel +o $nick\n");
} else {
print("/notice $channel The person nicknamed ``davo'' who is\n");
print("/notice $channel on this channel is not the real davo\n");
print("/notice $channel Please beware that any actions the davo\n");
print("/notice $channel on this channel takes, the real davo\n");
print("/notice $channel is not responsible for.\n");
}
}
if ($login eq "(sukhiac\@UCS.ORST.EDU)") {
print("/mode $channel +o $nick\n");
}
if (($login eq $akik) || ($nick eq $anik)) {
print("/kick $channel $nick\n");
}
}
sub modehandl {
local($typ,$nick,$channel)=@_;
if (($typ eq "+o") && ($noop eq "true")) {
print("/mode $channel -o $nick\n");
}
}
sub pubhandl {
local($nick,$text)=@_;
if ($pubs eq "true") {
if ($text =~ /scary/) {
if ($text =~ /hello/) {
print("Hi there, $nick!\n");
}
if (($text =~ /master/) || ($text =~ /owns/) ||
($text =~/owner/)) {
print("Davo owns me!\n");
}
if (($text =~ /shut up/) || ($text =~ /die/)) {
print("My public responses may be stopped by typing /msg scary nopubs\n");
}
if (($text =~ /who/) || ($text =~ /what/)) {
print("/notice $channel This is Dbot II, currently nicknamed scary.\n");
print("/notice $channel I am owned by David Noor, noord\@ucs.orst.edu,\n");
print("/notice $channel IRC nickname ``davo''.\n");
}
}
}
}
sub poem {
local($nick)=@_;
$a=&rst(19,"adjects");
$n=&rst(19,"nouns");
$v=&rst(19,"verbs");
print("/notice $nick $a $n $v\n");
$a=&rst(19,"adjects");
$n=&rst(19,"nouns");
$v=&rst(19,"verbs");
print("/notice $nick as $a $n $v.\n");
}
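sub rst {
# Return one randomly chosen line (1..$times) from the word file $fname.
local($times,$fname)=@_;
open(fhandl,$fname) || warn("Cannot open $fname: $!\n");
$tot=int(rand($times))+1;
$ctr=0;
# Read $tot lines so that $stng always holds a line, even when $tot is 1.
while($ctr!=$tot) {
$ctr++;
$stng=<fhandl>;
}
chomp $stng;
close(fhandl);
return $stng;
}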
------------------------------
Date: Wed, 23 Jul 1997 20:40:42 +0800
From: Maelstrom <maelstrom@deathsdoor.com>
Subject: Re: Checking for valid Email...
Message-Id: <33D5FBCA.5F5D@deathsdoor.com>
Simon Fairey wrote:
>
> Firstly I did the same as you, that is a search for 'valid email
> address comp.lang.perl' at DejaNews and the 6th entry had the required
> information!
Well, I'm impressed, then, because I looked at a lot more than six and
didn't find anything. Incidentally, the next day I tried again in
comp.infosystems.www.authoring.cgi and found the command on about the
fourth try. All the same, I think it took a *lot* longer than it
should have to find a fairly basic command.
>
> Secondly the question of a valid email address is one that IS
> addressed( no pun intended ) in the very Perl FAQ which you refer to.
> As the FAQ says, "see the more specific questions", at that point
> simply searching the FAQ for 'email' might not have been a too
> unreasonable idea.
As I mentioned in another post, I don't have access to any sort of search
function in Windows. If the page maintainer ever has the time to waste,
I think a simple CGI search would be a useful tool. Also, I was finding it
next to impossible to navigate anyway, with it jumping between two different
servers and sometimes being nothing more than an FTP index listing. On
top of that, it's often down, or at least I can't get anything out of it.
I don't know if the problem's at my end or theirs, but it's often not
workable at all.
>
> Thirdly I have found that provided you do some research beforehand, i.e.
> look at DejaNews, ALL the relevant FAQs and shock horror read the
> manual pages, that when you ask your question you will be much more
> likely to get a decent response. To be honest I have seen very few
> posts here that I would term as flames.
I'm the first to advocate doing prior research. My main complaint here
was that I couldn't *do* the prior research due to bad design (IMO) of
the FAQ and a surplus of articles at Deja that contained the keywords
but no real information. And as for the flames thing, what would you
call it when a question like this...
"Hi. I'm learning Perl and am confused by the structure of Arrays. I've
checked the FAQ and still can't find the answer."
gets a response like this...
"Well.. the first glaring problem I see is that you posted to machines
all over the world without even doing an *inkling* of work for yourself.
<minor snip>
You've told the world, "hey, I haven't read perlsyn(1) or perldata(1).
I didn't even bother to try executing my own code. Why don't you
readers of comp.lang.perl.misc just give me an answer for free? I don't
feel like doing it myself."
This sort of thing, at the very least, should have been confined to
email. It wastes my time, it wastes the original poster's time, and it
wastes the time of the next person who's researching that topic in
DejaNews.
> Finally to your question, ( I am assuming you are v.v.v.new to Perl
> and are not familiar with scripting ).
You assume right sir :) Thanks for the rest of your answer.
--Maelstrom
------------------------------
Date: 23 Jul 1997 15:20:18 GMT
From: gabor@vinyl.quickweb.com (Gabor)
Subject: Re: Checking for valid Email...
Message-Id: <5r57fi$n1n$1@flint.sentex.net>
Simon Fairey (sfairey@adc.metrica.co.uk) wrote:
[snip]
: PS: $string =~ tr/\s//; #One alternative to the answer you will find at
: DejaNews for removing spaces.
Unfortunately your answer is wrong. It will return a count of matched
backslashes and s's.
$string =~ tr/ \n\r\f\t//d;
will do the trick. And if you found that answer somewhere then it's
wrong!
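To illustrate the tr///d form above: tr works on a literal list of characters and does not understand regex classes such as \s, which is why the whitespace characters have to be spelled out. A small sketch (the sub name strip_whitespace is only for illustration):

```perl
use strict;
use warnings;

# Delete spaces, newlines, carriage returns, form feeds and tabs.
# The /d modifier deletes every listed character that has no replacement.
sub strip_whitespace {
    my ($string) = @_;
    $string =~ tr/ \n\r\f\t//d;
    return $string;
}

print strip_whitespace(" a b\tc \n"), "\n";
```

The substitution s/\s+//g does the same job with a regex, if you prefer the \s shorthand.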
--
Gabor Egressy : gabor@quickweb.com
No, I am not going to explain it. If you can't figure it out, you
didn't want to know anyway... -- Larry Wall, 1991
------------------------------
Date: Wed, 23 Jul 1997 07:08:48 GMT
From: Brian@dkonline.com (Brian - DKOnline)
Subject: Re: Checking for valid Email...
Message-Id: <33daacd7.68617596@pub.news.uk.psi.net>
I check as follows:
unless ($input{'Email'} =~ /\@/) # email needs @
{
&CgiDie ("Sorry","Your Internet e-mail address seems to be wrong<br>e.g.<br>name<b><u>\@</u></b>isp.com");
}
unless ($input{'Email'} =~ /\./) # email needs at least one .
{
&CgiDie ("Sorry","Your Internet e-mail address seems to be wrong<br>e.g.<br>name\@isp<u><b>.</b></u>com") ;
}
if ($input{'Email'} =~ /,/) # a real problem with some CompuServe people
{
# email needs no commas
&CgiDie ("Sorry","Your Internet e-mail address seems to be wrong<br>e.g.<br>name\@isp.com<br> (no commas)") ;
}
Finally, when I've got my mailing list, I make sure I capture any
delivery failure messages when I've posted a message to it and then
build up an exclusion list of bad (unreal) email addresses so I won't
post more than once:
while (<MAILFILE>)
{
s/\b[\w\.]*\@[\w\.]*\b/$Mail{$&}=1/gie;
}
Hope this helps someone.
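For what it's worth, the exclusion-list loop above rewrites each line as a side effect of the s///e (every address is replaced by 1). A sketch of the same idea that only reads the lines, assuming the bounce messages are already in a list; the sub name collect_addresses and the sample lines are invented for the example:

```perl
use strict;
use warnings;

# Record every address-like token found in the given lines in the hash
# referenced by $seen. Uses a capture group instead of $& and leaves the
# input untouched. The pattern is as loose as the one in the post.
sub collect_addresses {
    my ($seen, @lines) = @_;
    for my $line (@lines) {
        while ($line =~ /([\w.]+\@[\w.]+)/g) {
            $seen->{lc $1} = 1;    # lower-case so duplicates collapse
        }
    }
    return $seen;
}

my %Mail;
collect_addresses(\%Mail,
    "Delivery to Bob\@Example.com failed",
    "unknown user nobody\@nowhere.org here");
print "$_\n" for sort keys %Mail;
```

This is only a rough filter for building a bad-address list, not a validity check.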
------------------------------
Date: Wed, 23 Jul 1997 20:40:09 +0800
From: Maelstrom <maelstrom@deathsdoor.com>
Subject: Re: Checking for valid Email...
Message-Id: <33D5FBA9.56C6@deathsdoor.com>
I-hate-cyber-promo@man.ac.uk wrote:
> Is there some sort of lobotomy that you need to have done to make you
> incapable of reading the FAQ with due attention?
erm does Windows count?
> $grep 'email address' perl/PerlFAQ.pod
> Hmmm... "How do I check a valid email address?" Looks like it's right
> there.
I was reading from a Win 95 environment and was reading the HTML
section. Perhaps a simple search script on the site would be a good
idea? If there's a grep for Win95 I haven't found it.
>>>>>>>>>>>>> NB: comp.lang.perl.misc is NOT a CGI group <<<<<<<<<<<<<<
So I've heard but it sounds kind of anal to me. How does putting a Perl
script on a server suddenly make it no longer a Perl script? How about
if I rephrased it to sound like this...
"Can someone tell me how to check a variable for the @ sign? This
script will be used ONLY on my computer and NOWHERE else! I realise
Herr Kommandant that it sounds suspiciously like I'm trying to smuggle a
CGI script past the censors but it's hardly my fault if some jerk has
put a '@' in his email is it? No, my program is STRICTLY for stopping
those damn kids from putting smutty words into my text-2-speech
generator!"
Would this have been more palatable for the good and gentle readers of
comp.lang.perl.misc?
--Maelstrom
------------------------------
Date: Wed, 23 Jul 1997 20:38:07 +0100
From: Kevin Luff <kluff@enterprise.net>
Subject: Re: Frame Maker to html conversion
Message-Id: <33D65D9E.929B044A@enterprise.net>
According to a Que HTML book I have, there's a program called frame2ht
which comes from a company called Telenor Research.
I don't know if it's freeware, shareware, or commercial, but it might be of interest.
Best wishes from the lovely Isle of Man
Kevin
Angel Rodgers wrote:
> Does anyone know of a script that converts Frame Maker Documents to
> html? I have a program called Web Works which does this, but I am
> looking for a way to convert a file as it is uploaded. Also, does
> anyone know of a script which converts MS Word documents to Frame
> Maker?
------------------------------
Date: 22 Jul 1997 21:05:27 GMT
From: ilya@math.ohio-state.edu (Ilya Zakharevich)
Subject: Re: how to do system(@array) with backticks
Message-Id: <5r37an$75n@agate.berkeley.edu>
In article <slrn5t9kn3.5k5.karrer@kuru.ee.ethz.ch>,
Andreas Karrer <karrer@ife.ee.ethz.ch> wrote:
> > How about opening a pipe as follows:
> >
> >open DATE, "date|";
> >while(<DATE>){
> > chomp;
> > print "\nCmd returned:<$_>";
> >}
> >close DATE;
>
> Nice idea, buggy implementation.
>
> open CMD, "@cmd|"
>
> interpolates @cmd before open sees it, and runs a subshell. You want
>
> open CMD, "-|" or exec @cmd;
>
> as described under "open" in the perlfunc man page or one of the more
> elaborate constructs described under "Safe Pipe Opens" in the perlipc
> man page.
Wrong. `cmd` and
open PIPE, "cmd|";
will not use shell unless needed. To open a pipe without shell *no
matter what*, you may use
$fh = IO::Pipe->new;
$fh->reader('cmd','arg1',...);
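As a sketch of the fork-and-exec idiom quoted above (the "Safe Pipe Opens" approach), here is one way to capture a command's output from an argument list without a shell ever seeing it. The sub name run_without_shell is made up, and the sketch assumes a Unix-like system where fork works:

```perl
use strict;
use warnings;

# Run @cmd with its arguments passed as a list and return its output lines.
# open with "-|" forks; the child's STDOUT is connected to $pipe in the
# parent, and the child then execs the command directly.
sub run_without_shell {
    my (@cmd) = @_;
    my $pid = open(my $pipe, "-|");
    die "cannot fork: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: the indirect-object form of exec never invokes a shell.
        exec { $cmd[0] } @cmd or die "cannot exec $cmd[0]: $!";
    }
    my @output = <$pipe>;    # parent: read everything the child prints
    close $pipe;
    return @output;
}

# $^X is the running perl binary, used here so the example needs no other
# program; the argument containing spaces and quotes arrives intact.
print run_without_shell($^X, "-e", 'print "a b\n"');
```

Because the arguments are passed as a list, nothing in them is ever interpreted as shell syntax.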
Ilya
------------------------------
Date: Wed, 23 Jul 1997 12:06:43 +0200
From: Klaus Foerster <klausf@mucsun.sps.mot.com>
Subject: How to get rid of junkmail
Message-Id: <33D5D7B3.1BBD@mucsun.sps.mot.com>
Just a sort of question.
Since I started posting to newsgroups I'm getting loads of
junkmail.
Is there any way to identify this junkmail automatically
(faked mail headers) and to report it to the right postmaster
of the offending site?
The amount of junkmail is just increasing.
A Perl solution would be preferred. If you can point me to other
solutions I'm quite happy as well.
bye and
thanx in advance
bye
Klaus
------------------------------
Date: 23 Jul 1997 12:34:05 GMT
From: emwsbl@emw.ericsson.se (Stefan Berglund)
Subject: Newbie: Backtracking with regexp
Message-Id: <5r4tnt$dnt@infomania.emw.ericsson.se>
So I'm a perl newbie (but a pretty experienced C and UNIX programmer,
he proudly pointed out ;-)
Maybe this is of academic interest only, as I solved my actual problem
with a workaround, but I want to know what I should have done...
I have a file that looks like:
<entry>
<topic>Computers</topic>
<url>http://www.nisse.se</url>
<title>Nisse</title>
<comment>akjfhafaskfj</comment>
<date>868432637</date>
</entry>
<entry>
<topic>Entertainment</topic>
<url>http://www.ti.com</url>
<title>Texas</title>
<comment>Texas lilla sida</comment>
<date>868432637</date>
</entry>
<entry>
<topic>Education</topic>
<url>http://www.ti.org</url>
<title>TI suger</title>
<comment></comment>
<date>868435940</date>
</entry>
I want to match one entry (one block between <entry> and </entry>)
using the expression between <url> and </url> tags.
So a match for http://www.ti.org should return:
<entry>
<topic>Education</topic>
<url>http://www.ti.org</url>
<title>TI suger</title>
<comment></comment>
<date>868435940</date>
</entry>
How can I do that using _one_ regular expression?
I have tried these (first converting the file to one chomped string):
m#(<entry>.*?<url>$url</url>.*?</entry>)#
m#(<entry>.*?(?!</entry>)<url>$url</url>.*?</entry>)#
Neither of these work.
I understand why the first one doesn't work (but can't really accept it...),
but the second one???
Both regexps return (in $1) the same thing, starting with the very first <entry>
and running up to the </entry> in the correct block.
Can anybody explain why it behaves like it does and how it can be done?
I have read the documentation, the FAQ and the FMTYEWTK Irregular Expressions by
Tom Christiansen several times, but I still don't get it.
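A possible explanation, with a sketch of one fix. The engine always starts at the leftmost <entry>; a non-greedy .*? will still stretch across entry boundaries when that is the only way to reach the wanted <url>. In the second pattern, (?!</entry>) is tested at a single position only (just before <url>), where it trivially succeeds. One way around this is to apply the lookahead at every character, so the gap between <entry> and <url> can never cross an entry boundary. The sub name find_entry and the shortened sample data are invented for the example:

```perl
use strict;
use warnings;

# Return the whole <entry>...</entry> block whose <url> equals $url,
# or nothing when no entry has that URL.
sub find_entry {
    my ($text, $url) = @_;
    if ($text =~ m{
            ( <entry>
              (?: (?! </?entry> ) . )*?    # any char that does not start an entry tag
              <url> \Q$url\E </url>        # \Q..\E: treat the dots in the URL literally
              .*? </entry>
            )
        }sx) {
        return $1;
    }
    return;
}

my $data = '<entry><url>http://www.ti.com</url><title>Texas</title></entry>'
         . '<entry><url>http://www.ti.org</url><title>TI suger</title></entry>';
print find_entry($data, 'http://www.ti.org'), "\n";
```

The /s modifier lets . match newlines too, so the same pattern should also work on the file without joining it into a single line first.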
--
/Stefan
Stefan.Berglund@emw.ericsson.se
Life - the ultimate practical joke
------------------------------
Date: Wed, 23 Jul 1997 03:08:42 -0700
From: "Mark J. Schaal" <mark@tstonramp.com>
Subject: Re: Parsing a list of strings
Message-Id: <33D5D82A.45B0@tstonramp.com>
Bernard Cosell wrote:
>
> I need to parse ['split', actually!] a list of strings out of a
> database dump. So I have something that looks like:
> "field1", "field2", .., "fieldn"
> and I can't find a non-ugly way to handle it. I can't just split on
> ','s because the field strings can include commas. The field strings
> can also include ditto marks [which are doubled], and so
> ..., """", ...
> is a field consisting of a single ditto-mark. Any tricks or tips for
> parsing/handling this mess? It feels like it ought to be easy [since
> that is such a simple/standard format], but everything I try is hard
> and/or ugly... [this with 5.03 on Linux] Thanks!!
>
> /Bernie\
> --
> Bernie Cosell mailto:bernie@rev.net
> Roanoke Electronic Village
Yes, this is a standard format, and therefore has its own entry
in the Perl FAQ under:
How can I split a [character] delimited string except when
inside [character]? (Comma-separated files)
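For readers without the FAQ to hand, here is a rough sketch of one way to split such a line. It is not the FAQ's own code. It assumes every field is wrapped in double quotes and that an embedded quote is written as two quotes, as in the question; the sub name split_quoted is made up:

```perl
use strict;
use warnings;

# Split a line of "quoted", "comma-separated" fields into a list.
# Inside a field, a doubled quote ("") stands for one literal quote.
sub split_quoted {
    my ($line) = @_;
    my @fields;
    # \G continues where the previous match ended, one field per iteration.
    while ($line =~ /\G\s*"((?:[^"]|"")*)"\s*(?:,|$)/g) {
        my $field = $1;
        $field =~ s/""/"/g;    # undo the quote doubling
        push @fields, $field;
    }
    return @fields;
}

print "[$_]\n" for split_quoted('"one, with comma", """", "three"');
```

Unquoted fields or quoted fields containing newlines would need a more careful parser.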
Hope this helps,
mark
--
Mark J. Schaal TST On Ramp Sysadmin mark@tstonramp.com
------------------------------
Date: Sun, 20 Jul 1997 12:15:23 -0700
From: Tom Phoenix <rootbeer@teleport.com>
To: Bobbi Arlett <bja109@mail.usask.ca>
Subject: Re: Perl Scripts and Windows NT
Message-Id: <Pine.GSO.3.96.970720121309.21345E-100000@kelly.teleport.com>
On Sun, 20 Jul 1997, Bobbi Arlett wrote:
> Can anyone tell me how to specify the path in a perl script?
I think you want the open command, documented in perlfunc(1). If you still
have questions after you've read the manpages, ask again. Hope this helps!
--
Tom Phoenix http://www.teleport.com/~rootbeer/
rootbeer@teleport.com PGP Skribu al mi per Esperanto!
Randal Schwartz Case: http://www.rahul.net/jeffrey/ovs/
------------------------------
Date: Tue, 22 Jul 1997 18:42:16 -0500
From: tadmc@flash.net (Tad McClellan)
Subject: Re: Please Help: Pattern Matching
Message-Id: <ogg3r5.mna.ln@localhost>
daniel abrams (dabrams@mathcs.emory.edu) wrote:
: Hello,
Hi.
: I am getting funny results trying to substitute " for <xDQx>
: For instance:
: $holder =~ s/<xDQx>/BEGIN"END/g;
Looks OK to me.
: results in a string where some of the <xDQx> are replaced
: with BEGIN"END as you would expect, and some appear
: as BEGINEND missing the ".
I can't get it to do that.
Could you post a complete program that exhibits that behavior?
: $holder =~ s/<xDQx>/"/g;
: also results in some <xDQx> being replaced with ", while some
: are simply removed. The same <xDQx> are replaced as above.
: i.e. in each case the 1st and 4th instances of <xDQx> vanish,
: while the 2nd and 3rd are replaced with ".
: However the following:
: $holder =~ s/<xDQx>/elephant/g;
: works properly for all instances of <xDQx> in the string.
: This is really frustrating. Does anyone have an explanation?
Something else is happening. Can't tell what it might be with
only one line of code to look at...
You probably should mention what perl version you are using
(output from 'perl -v') and what platform you are on...
: $holder is simply a string, I cannot find anything different
: between when the " is placed properly and when it's omitted.
This works fine for me w/ 5.003 and 5.004_01:
------------------------
#! /usr/bin/perl -w
$holder = "first one is <xDQx>, then <xDQx>. After some rambling, we
have <xDQx>. And yet two <xDQx> <xDQx> more\n";
$holder =~ s/<xDQx>/BEGIN"END/g;
print "$holder\n";
------------------------
outputs:
first one is BEGIN"END, then BEGIN"END. After some rambling, we
have BEGIN"END. And yet two BEGIN"END BEGIN"END more
--
Tad McClellan SGML Consulting
Tag And Document Consulting Perl programming
tadmc@flash.net
------------------------------
Date: Wed, 23 Jul 1997 12:03:59 +0200
From: Klaus Foerster <klausf@mucsun.sps.mot.com>
Subject: Q: efficient way to use subarrays
Message-Id: <33D5D70F.5297@mucsun.sps.mot.com>
Hi folks,
The camel book (perl 4) says it's more efficient to use
foreach (@array){
PROCESS $_;
}
than doing
for($i=0;$i<@array;$i++){
PROCESS($array[$i]);
}
I have an array containing about 1 million text lines (The machine has
enough RAM for that).
I have to process certain subsections of this array.
Is there a faster way to process sections of an array
(example: elements 200,000 - 800,000),
or do I have to do this with a for loop?
The other thing I have to do is
search for a set of about 20 regular expressions within a
subarray. Again, I'd like to know the fastest way.
The set of regular expressions is stored in an array @expr and can
change during runtime.
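One possible shape for this, offered as a sketch rather than a benchmarked answer: loop over the index range directly so no sub-array is copied, and compile the pattern strings once with qr// before the loop (qr// needs a Perl newer than 5.004). The sub name count_matching_lines and the sample data are invented for the example:

```perl
use strict;
use warnings;

# Count the lines in $lines->[$first .. $last] that match at least one of
# the given pattern strings.
sub count_matching_lines {
    my ($lines, $first, $last, @patterns) = @_;
    my @compiled = map { qr/$_/ } @patterns;    # compile each pattern once
    my $count = 0;
    for my $i ($first .. $last) {               # only the wanted section
        for my $re (@compiled) {
            if ($lines->[$i] =~ $re) {
                $count++;
                last;                           # one hit per line is enough
            }
        }
    }
    return $count;
}

my @array = qw(apple banana cherry date elderberry);
my @expr  = ('an', 'rr');
print count_matching_lines(\@array, 1, 3, @expr), "\n";
```

If @expr changes at run time, the patterns are simply recompiled on the next call; timing it against your real data (for example with the Benchmark module) is the only reliable way to compare alternatives.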
Any help is appreciated.
ThanX in advance
bye
Klaus
------------------------------
Date: 23 Jul 1997 02:10:44 -0700
From: Randal Schwartz <merlyn@stonehenge.com>
To: Mark Schwartz <mcs@in.net>
Subject: Re: Regex: Email address format
Message-Id: <8c67u2w2uj.fsf@gadget.cscaper.com>
>>>>> "Mark" == Mark Schwartz <mcs@in.net> writes:
Mark> I have written a regex to verify data input to the 'Email Address' field
Mark> of a form I have
Mark> written.
Mark> The regex is:
Mark> if (($regdata{"Email Address"}) !~
Mark> /^(?:[\w]+)\.?(?:[\w]*)\@(?:[\w]+)\.(?:[\w]+)(?:[\.][\w]+)*/) {
Mark> <--PRINT ERROR CODE, RE-SUBMIT FORM-->
Mark> }
Mark> This regex is partially working.
And partially broken. :-)
It incorrectly rejects fred&barney@stonehenge.com, which is not only a
valid address, it is a *working* address (try sending email to it to
see... it's got an autoreply).
As has been said Oh So Many Times here, just about any character on
the planet is permitted on the left side of the @. See tchrist's
chkaddr script for something MUCH much closer than what you've
written. Or just Dejanews in comp.lang.perl.misc for "validate email
address" and see how many times we've had to say the same thing Over
and Over again. :-)
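To make the point concrete, the quoted regex can be tried against a few addresses. This is only a demonstration of that one pattern (the sub name marks_regex_accepts is made up), not a recommended check:

```perl
use strict;
use warnings;

# The pattern exactly as quoted above.
my $marks_regex = qr/^(?:[\w]+)\.?(?:[\w]*)\@(?:[\w]+)\.(?:[\w]+)(?:[\.][\w]+)*/;

sub marks_regex_accepts {
    my ($address) = @_;
    return $address =~ $marks_regex ? 1 : 0;
}

for my $address ('mcs@in.net', 'fred&barney@stonehenge.com', 'a@b.c!!!') {
    printf "%-28s %s\n", $address,
        marks_regex_accepts($address) ? "accepted" : "rejected";
}
```

The valid fred&barney address is rejected because & is not a \w character, while the trailing junk in the last one is accepted because the pattern has no anchor at the end.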
print "Just another Perl hacker," # but not what the media calls "hacker!" :-)
## legal fund: $20,495.69 collected, $182,159.85 spent; just 405 more days
## before I go to *prison* for 90 days; email fund@stonehenge.com for details
--
Name: Randal L. Schwartz / Stonehenge Consulting Services (503)777-0095
Keywords: Perl training, UNIX[tm] consulting, video production, skiing, flying
Email: <merlyn@stonehenge.com> Snail: (Call) PGP-Key: (finger merlyn@ora.com)
Web: <A HREF="http://www.stonehenge.com/merlyn/">My Home Page!</A>
Quote: "I'm telling you, if I could have five lines in my .sig, I would!" -- me
------------------------------
Date: 23 Jul 1997 20:18:46 GMT
From: Eli the Bearded <usenet-tag@qz.little-neck.ny.us>
Subject: Re: Regex: Email address format
Message-Id: <eli$9707231546@qz.little-neck.ny.us>
Matthew D. Healy <Matthew.Healy@yale.edu> wrote:
> The nearly-definitive answer to this is found in Jeff Friedl's book
> {Mastering Regular Expressions}, published by O'Reilly. It is a real
> MONSTER of a regular expression -- over 5000 characters long -- and
> even then it will NOT cover all possible cases because regexes are
> inherently incapable of doing so. The problem comes with nested
> parenthetical comments -- regexes CANNOT balance parens to any
> arbitrary number of levels -- so Friedl's example works for no more
> than two levels of nesting.
He just tries to do too much in one regexp. (Mostly this is just
him showing off.)
> In practice, you will find rather few RFC-compliant addresses that
> Friedl's expression won't match in the real world. Contrived examples
> that break his regex will usually break lots of mailers out there
> anyway!
Maybe. Most mailers I have seen deal with deeply nested parens well,
and AFAIK, that is all his breaks on. Lots of ones his won't break
on will break mailers. I haven't gotten sendmail to accept stuff
like "eli@qz"@qz.little-neck.ny.us yet, and that's valid. Most Unix
MTAs are quite content with quoted spaces or commas in the local-part,
but few mail user agents will go near them. Friedl's code likes them
fine, though.
One thing Friedl could have done was check not just for RFC822
valid addresses, but also check for RFC821 size limits and RFC921
hostnames. Just because
$address= "<" . "j"x999 . "@" . "_"x999 . ">";
is an address valid by RFC 822 does not mean it has any chance of
working in the real world.
Anyway, here is Friedl's script with a fix for deeply nested
parentheses.
------ begin script ------
#!/usr/local/bin/perl5.004
#
# Program to build a regex to match an internet email address,
# from Chapter 7 of _Mastering Regular Expressions_ (Friedl / O'Reilly)
# (http://www.ora.com/catalog/regexp/)
#
# Optimized version.
#
# Copyright 1997 O'Reilly & Associates, Inc.
#
# Enable warnings (added by Eli)
$^W=1;
# Some things for avoiding backslashitis later on.
$esc = '\\\\'; $Period = '\.';
$space = '\040'; $tab = '\t';
$OpenBR = '\['; $CloseBR = '\]';
$OpenParen = '\('; $CloseParen = '\)';
$NonASCII = '\x80-\xff'; $ctrl = '\000-\037';
$CRlist = '\n\015'; # note: this should really be only \015.
# Items 19, 20, 21
$qtext = qq/[^$esc$NonASCII$CRlist\"]/; # for within "..."
$dtext = qq/[^$esc$NonASCII$CRlist$OpenBR$CloseBR]/; # for within [...]
$quoted_pair = qq< $esc [^$NonASCII] >; # an escaped character
##############################################################################
# Items 22 and 23, comment.
# Impossible to do properly with a regex, I make do by allowing at most one
# level of nesting.
$ctext = qq< [^$esc$NonASCII$CRlist()] >;
# $Cnested matches one non-nested comment.
# It is unrolled, with normal of $ctext, special of $quoted_pair.
$Cnested = qq<
$OpenParen # (
$ctext* # normal*
(?: $quoted_pair $ctext* )* # (special normal*)*
$CloseParen # )
>;
# $comment allows one level of nested parentheses
# It is unrolled, with normal of $ctext, special of ($quoted_pair|$Cnested)
$comment = qq<
$OpenParen # (
$ctext* # normal*
(?: # (
(?: $quoted_pair | $Cnested ) # special
$ctext* # normal*
)* # )*
$CloseParen # )
>;
##############################################################################
# $X is optional whitespace/comments.
$X = qq<
[$space$tab]* # Nab whitespace.
(?: $comment [$space$tab]* )* # If comment found, allow more spaces.
>;
# Item 10: atom
$atom_char = qq/[^($space)<>\@,;:\".$esc$OpenBR$CloseBR$ctrl$NonASCII]/;
$atom = qq<
$atom_char+ # some number of atom characters...
(?!$atom_char) # ..not followed by something that could be part of an atom
>;
# Item 11: doublequoted string, unrolled.
$quoted_str = qq<
\" # "
$qtext * # normal
(?: $quoted_pair $qtext * )* # ( special normal* )*
\" # "
>;
# Item 7: word is an atom or quoted string
$word = qq<
(?:
$atom # Atom
| # or
$quoted_str # Quoted string
)
>;
# Item 12: domain-ref is just an atom
$domain_ref = $atom;
# Item 13: domain-literal is like a quoted string, but [...] instead of "..."
$domain_lit = qq<
$OpenBR # [
(?: $dtext | $quoted_pair )* # stuff
$CloseBR # ]
>;
# Item 9: sub-domain is a domain-ref or domain-literal
$sub_domain = qq<
(?:
$domain_ref
|
$domain_lit
)
$X # optional trailing comments
>;
# Item 6: domain is a list of subdomains separated by dots.
$domain = qq<
$sub_domain
(?:
$Period $X $sub_domain
)*
>;
# Item 8: a route. A bunch of "@ $domain" separated by commas, followed by a
# colon.
$route = qq<
\@ $X $domain
(?: , $X \@ $X $domain )* # additional domains
:
$X # optional trailing comments
>;
# Item 6: local-part is a bunch of $word separated by periods
$local_part = qq<
$word $X
(?:
$Period $X $word $X # additional words
)*
>;
# Item 2: addr-spec is local@domain
$addr_spec = qq<
$local_part \@ $X $domain
>;
# Item 4: route-addr is <route? addr-spec>
$route_addr = qq[
< $X # <
(?: $route )? # optional route
$addr_spec # address spec
> # >
];
# Item 3: phrase........
$phrase_ctrl = '\000-\010\012-\037'; # like ctrl, but without tab
# Like atom-char, but without listing space, and uses phrase_ctrl.
# Since the class is negated, this matches the same as atom-char plus space
# and tab
$phrase_char =
qq/[^()<>\@,;:\".$esc$OpenBR$CloseBR$NonASCII$phrase_ctrl]/;
# We've worked it so that $word, $comment, and $quoted_str do not consume
# trailing $X because we take care of it manually.
$phrase = qq<
$word # leading word
$phrase_char * # "normal" atoms and/or spaces
(?:
(?: $comment | $quoted_str ) # "special" comment or quoted string
$phrase_char * # more "normal"
)*
>;
## Item #1: mailbox is an addr_spec or a phrase/route_addr
$mailbox = qq<
$X # optional leading comment
(?:
$addr_spec # address
| # or
$phrase $route_addr # name and address
)
>;
###########################################################################
# Here's a little snippet to test it.
# One address is read from stdin (no shell quoting to deal with) and
# processed.
my $error = 0;
my $valid;
$verbose=0;
# Set to undef to slurp to EOF
$/=undef;
$address=<STDIN>;
# (added by Eli)
if (defined($ARGV[0]) && ($ARGV[0] =~ /^-v/)) {
$verbose=1;
}
# This copy bit was added by Eli-the-Bearded to make sure Friedl's
# code never fails for valid addresses. His code has problems with
# deeply nested comments, so I have several easy steps which transform
# the address into another one without making a legal address illegal
# or vice versa.
$copy = $address;
$copy =~ s:\\.:a:g ; # replace \quoted stuff
$copy =~ s:\\\n::gs ; # unfold lines
$copy =~ s:"[^"]*":b:g ; # replace "quoted" stuff
while ( $copy =~ s:\([^()]*\)::g ) {;} # (remove ((all) comments))
# End of Eli's transform
$valid = $copy =~ m/^$mailbox$/xo;
chomp $address;
$verbose && printf "`%s' is syntactically %s.\n",
$address, $valid ? "valid" : "invalid";
$error = 1 if not $valid;
exit $error;
------ end script ------
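The comment-stripping transform above can be tried on its own. What follows is a minimal sketch, not part of the original script: the sub name simplify_address and the sample address are made up for illustration, and the four substitutions are copied from the script in the same order.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the transform from the script above.
sub simplify_address {
    my ($copy) = @_;
    $copy =~ s:\\.:a:g;                  # replace \quoted characters
    $copy =~ s:\\\n::g;                  # unfold lines
    $copy =~ s:"[^"]*":b:g;              # replace "quoted" strings
    1 while $copy =~ s:\([^()]*\)::g;    # strip comments, innermost first
    return $copy;
}

# Nested comments vanish one level per pass of the while loop.
print simplify_address('Eli (the (bearded) one) <eli@example.com>'), "\n";
```

Each pass of the loop removes only comments that contain no parentheses, so the nesting depth of the input bounds the number of passes.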
Elijah
------
<URL:http://www.netusa.net/~eli/faqs/addressing.html#verify>
------------------------------
Date: Wed, 23 Jul 1997 09:09:26 +0100
From: Simon Fairey <sfairey@adc.metrica.co.uk>
To: Tom Phoenix <rootbeer@teleport.com>
Subject: Re: Regex: Email format
Message-Id: <33D5BC36.2596B5C2@adc.metrica.co.uk>
Tom Phoenix wrote:
> On Tue, 22 Jul 1997, Simon Fairey wrote:
>
> > I don't believe you need to escape the '@'.
>
> Actually, regular expressions are double-quote interpolated,
Ahh I see.
> so it's normal to need to escape an @-sign. But trying to validate an
> e-mail address is an exercise in futility. There are more details in
> the FAQ. Hope this helps!
I assume, in that case, that the only reason /@/ works for matching the
'@' character is that nothing follows it, so it can't possibly be an
array.
Hmmmm, shouldn't the unescaped '@' in the above generate an error
message, or a warning at the very least? I know that it does when you
put it in double quotes, and if // is double-quote interpolated I would
have thought the same would apply. Having said that, after a few
experiments it appears that Perl (as always) is too clever for me, in
that it recognises when '@' is not followed by anything.
Guess I just answered my own question then! :-)
Thanks for the info.
Simon
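A small sketch of the behaviour discussed above, assuming a current Perl 5. The array @in and the address are illustrative only; the array is declared on purpose so that the unescaped pattern compiles under strict.

```perl
use strict;
use warnings;

my @in   = ('x', 'y');      # illustrative array, declared so @in can interpolate
my $addr = 'mcs@in.net';    # single quotes: nothing is interpolated here

# A lone @ with no identifier after it is taken literally.
my $lone_at   = $addr =~ /@/           ? 1 : 0;
# Escaped, so the pattern contains a literal "@in".
my $escaped   = $addr =~ /\@in\.net/   ? 1 : 0;
# Unescaped: @in interpolates as "x y", so the pattern is /mcsx y\.net/.
my $unescaped = $addr =~ /mcs@in\.net/ ? 1 : 0;

print "lone=$lone_at escaped=$escaped unescaped=$unescaped\n";
```

The third match fails quietly instead of raising an error, which is why escaping the @ is the safer habit.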
------------------------------
Date: Wed, 23 Jul 1997 14:51:41 +0100
From: Charles Herold <cherold@pathfinder.com>
Subject: Re: Regex: Email format
Message-Id: <33D60C66.160F@pathfinder.com>
Everyone's told you where to find a good email address checker, but no
one's told you why your regex didn't work, so I will. Your pattern ends
with one or more word characters followed by zero or more occurrences
of a group, and nothing forces it to reach the end of the string. So in
mcs@in.n=t your expression matches up to the last 'n', which is all
it's required to match. If you put a $ at the end of your regular
expression then it will have to match all the way to the end of the
string, and will work. Also, you don't need to put every special
character in brackets, and there's no need for non-capturing
parentheses anywhere you don't need grouping. So your expression should
look more like this (keeping your leading ^):
/^\w+\.?\w*\@\w+\.\w+(?:\.\w+)*$/
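The effect of the trailing $ can be checked against the test addresses from the original question. This is only a sketch: the variable names are made up, the first pattern is the one from the question, and the second is a simplified form anchored at both ends.

```perl
use strict;
use warnings;

# Pattern from the question: anchored at the start only.
my $unanchored = qr/^(?:[\w]+)\.?(?:[\w]*)\@(?:[\w]+)\.(?:[\w]+)(?:[\.][\w]+)*/;
# Simplified pattern, anchored at both ends.
my $anchored   = qr/^\w+\.?\w*\@\w+\.\w+(?:\.\w+)*$/;

for my $addr ('mcs@in.net', 'm=cs@in.net', 'm.cs@in.net', 'mcs@i=n.net',
              'mcs@in.n=t', 'mcs@in.net.k12.ed=u') {
    printf "%-22s start-only=%-8s both-ends=%s\n", $addr,
        ($addr =~ $unanchored ? 'match' : 'no match'),
        ($addr =~ $anchored   ? 'match' : 'no match');
}
```

Only the last two addresses differ between the columns: the start-only pattern is satisfied by a valid-looking prefix, while the fully anchored one rejects them.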
Mark Schwartz wrote:
>
> I have written a regex to verify data input to the 'Email Address' field
> of a form I have written.
>
> The regex is:
>
> if (($regdata{"Email Address"}) !~
> /^(?:[\w]+)\.?(?:[\w]*)\@(?:[\w]+)\.(?:[\w]+)(?:[\.][\w]+)*/) {
> <CODE that displays invalid format message>
> This regex is partially working :-(
> examples:
>
> Test address          Did regex work?
> mcs@in.net            Y   Valid
> m=cs@in.net           Y   Invalid
> m.cs@in.net           Y   Valid
> mcs@i=n.net           Y   Invalid
> mcs@in.n=t            N   Invalid, but not caught by regex
> mcs@in.net.k12.ed=u   N   Invalid, but not caught by regex
--
Best regards,
Charles Herold
Pathfinder
Production Assistant
cherold@pathfinder.com
(212) 522-5190
------------------------------
Date: 23 Jul 1997 13:45:44 -0400
From: radev@news.cs.columbia.edu (Dragomir R. Radev)
Subject: SOLVED: Writing to a file descriptor
Message-Id: <5r5g08$25t@bluewhale.cs.columbia.edu>
After reading the faq and pod more carefully, I did the following which
worked:
----------------------------------------------------------------------------
open (OUTPUT, ">&=$client_fd");
...
printf OUTPUT ...
...
my $oldfh = select (OUTPUT); $| = 1; select ($oldfh);
----------------------------------------------------------------------------
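A self-contained sketch of the same idea, assuming a reasonably modern Perl 5 (three-argument open, lexical filehandle). STDOUT's descriptor stands in here for the $client_fd inherited from the C code, and nothing in it is specific to perlembed.

```perl
use strict;
use warnings;
use IO::Handle;    # for the autoflush method

# Stand-in for a descriptor handed over by the embedding C code.
my $client_fd = fileno(STDOUT);

# ">&=" attaches a new Perl filehandle to the existing descriptor
# (like fdopen) instead of duplicating it.
open(my $out, ">&=", $client_fd) or die "cannot attach to fd $client_fd: $!";

# Turn on autoflush before writing so output is not held in Perl's buffer.
$out->autoflush(1);

printf {$out} "written through descriptor %d\n", fileno($out);
```

Because the handle shares the descriptor rather than owning a copy, closing the handle also closes the descriptor; ">&" without the "=" makes a duplicate instead.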
Drago
In article <5r5f5e$24n@bluewhale.cs.columbia.edu>,
Dragomir R. Radev <radev@news.cs.columbia.edu> wrote:
>1. Is there a function that does the opposite to fileno: namely, return a
>filehandle given a file descriptor?
>
>2. Is there a way for a Perl process to write to a specified (inherited) file
>descriptor without opening a file for it?
>
>3. Is there any other solution to the problem described below? Perhaps
>using dup?
>
>The context is the following: I am using perlembed and I want the perl code
>to write to a file descriptor that has already been opened by the C code. I
>don't want to open a file explicitly from Perl, as that file and the
>associated file descriptor will be closed when the perl interpreter exits
>and I will need to reuse the file descriptor from the C code.
>
>Thanks,
>
>
>Drago
>
>--
>Dragomir R. Radev Graduate Research Assistant
>Natural Language Processing Group Columbia University CS Department
>H: 212-749-9770 O: 212-939-7121 http://www.cs.columbia.edu/~radev
--
Dragomir R. Radev Graduate Research Assistant
Natural Language Processing Group Columbia University CS Department
H: 212-749-9770 O: 212-939-7121 http://www.cs.columbia.edu/~radev
------------------------------
Date: Wed, 23 Jul 1997 03:34:53 -0600
From: sanjay@rocketship.com (Sanjay Varma)
Subject: Re: Sorting
Message-Id: <869646307.31194@dejanews.com>
> Walter> I've been major disappointed w/ the sort in perl until I found this:
> Walter> http://www.usenix.org/publications/perl/perl01.html
> Randal> At about that level of writing, I also have my Unix Review columns
> Randal> on-line at http://www.stonehenge.com/merlyn/UnixReview/ ... so you
> Randal> could check that out as well.
I had read Randal's articles earlier, and read the one on usenix.org
today. Suppose I have a huge text file containing data I wish to sort.
Is there any way to do it with Perl without reading the entire file
into an associative array, say because of memory constraints?
-------------------==== Posted via Deja News ====-----------------------
http://www.dejanews.com/ Search, Read, Post to Usenet
------------------------------
Date: Wed, 23 Jul 1997 15:03:50 -0700
From: "Ken Sato" <ksato@mda.ca>
Subject: stdout problem
Message-Id: <5r5vca$d30$1@beauty.mda.ca>
I am invoking a Perl script from a Java application using the exec() call
but I can't seem to get the stdout back in Java. Simple print statements get
routed back fine through the DataInputStream, but for example:
$output = `lpq -S prnServerName -P prnName`;
print $output;
does not print back in Java. Running the script on its own works fine of
course. Has anyone experienced this? Any help would be appreciated.
--
Ken Sato
ksato@mda.ca
------------------------------
Date: Wed, 23 Jul 1997 17:02:24 GMT
From: Dave.Cross@gb.swissbank.com (Dave Cross)
Subject: Re: sybase extenstions to perl (Sybase: :DBlib)
Message-Id: <DAVE.CROSS.97Jul23180224@ln4d110swk.gb.swissbank.com>
In article <33D4DDD1.5ED4@mobility.com> Mike Krueger <mkrueger@mobility.com> writes:
> I am writing a Perl program that uses Sybase::DBlib. Does this API
> support the use of "cursors". Can anyone recommend where I might find a
> good example of a Perl program that uses "cursors."
Mike,
Sybase::DBlib doesn't support cursors directly - that's largely
because DBlib itself doesn't support cursors. You might be able
to run the cursor commands thru standard dbcmd calls but it
would be a tad tricky.
If you've got Sybase::DBlib, it's very likely that you've also
got Sybase::CTlib. This does have cursor support built in. Try
using that instead.
Better yet, don't use cursors and watch your application speeds
increase tenfold!
Dave... .. .
p.s. I bet Michael Peppler answers this in more depth than me...
and why shouldn't he? He wrote the stuff after all.
--
"...but Man created all gods equal."
Dave.Cross@gb.swissbank.com
------------------------------
Date: Wed, 23 Jul 1997 11:22:41 -0700
From: Mehmet Demirkol <demirkol@sgs-server.Stanford.EDU>
Subject: text processing
Message-Id: <Pine.SOL.3.95.970723111341.12391A-100000@sgs-server.stanford.edu>
Hello,
How can I remove all "^M"s from a piece of text?
I guess I can use
$text =~ s/\x??//g;
but what should "??" be? Is there a web page that has a list of those
substitute values?
Mehmet Fatih Demirkol
email: demirkol@sgs-server.stanford.edu
------------------------------
Date: Wed, 23 Jul 1997 08:07:25 -0700
From: Tom Phoenix <rootbeer@teleport.com>
To: Shaun O'Shea <lmisosa@eei.ericsson.se>
Subject: Re: The regex that doesn't want to work!@#%_(_)^
Message-Id: <Pine.GSO.3.96.970723075724.14902D-100000@kelly.teleport.com>
On Wed, 23 Jul 1997, Shaun O'Shea wrote:
> The following is a section of code I wrote.
> When I ran it I used the debugger and from **1 and **2 I could see that
> there should be a match for the regexp!! and hence the condition should
> be satisfied!
> Doesn't match--> if($full[$wew] =~ /${address[$rtr]}/){
So, what's in $address[$rtr], and what's in $full[$wew]? I suspect that
one is a pattern which won't match the other. :-)
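One common way for this to happen, offered only as a guess since the post doesn't show the data, is that the interpolated string contains regex metacharacters. A sketch with made-up values ($pattern and $text are not from the original code):

```perl
use strict;
use warnings;

# Made-up data: the "+" is a quantifier when used as a pattern.
my $pattern = 'a+b@example.com';
my $text    = 'To: a+b@example.com';

# Interpolated as a regex: "a+" means one or more "a", so the literal
# "+" in the text is never matched.
my $raw    = $text =~ /$pattern/     ? 1 : 0;
# \Q...\E quotes the metacharacters, so the string is matched literally.
my $quoted = $text =~ /\Q$pattern\E/ ? 1 : 0;

print "raw=$raw quoted=$quoted\n";
```

Printing both variables in the debugger, as suggested above, would show whether something like this applies.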
> (Using perl version 5.000 on a Sun ULTRA1 )
You may have found one of the many bugs in that old version of Perl. I
recommend installing 5.004. Then, if you can replicate this problem with a
short script, post the script here. Good luck!
--
Tom Phoenix http://www.teleport.com/~rootbeer/
rootbeer@teleport.com PGP Skribu al mi per Esperanto!
Randal Schwartz Case: http://www.rahul.net/jeffrey/ovs/
------------------------------
Date: 8 Mar 97 21:33:47 GMT (Last modified)
From: Perl-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 8 Mar 97)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.misc (and this Digest), send your
article to perl-users@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
The Meta-FAQ, an article containing information about the FAQ, is
available by requesting "send perl-users meta-faq". The real FAQ, as it
appeared last in the newsgroup, can be retrieved with the request "send
perl-users FAQ". Due to their sizes, neither the Meta-FAQ nor the FAQ
are included in the digest.
The "mini-FAQ", which is an updated version of the Meta-FAQ, is
available by requesting "send perl-users mini-faq". It appears twice
weekly in the group, but is not distributed in the digest.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address; I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V8 Issue 759
*************************************