[25434] in Perl-Users-Digest


Perl-Users Digest, Issue: 7679 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Jan 21 14:10:36 2005

Date: Fri, 21 Jan 2005 11:10:28 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Fri, 21 Jan 2005     Volume: 10 Number: 7679

Today's topics:
        does foreach (@list) alter @list?? <hendrik_maryns@despammed.com>
    Re: does foreach (@list) alter @list?? <1usa@llenroc.ude.invalid>
    Re: does foreach (@list) alter @list?? <wyzelli@yahoo.com>
    Re: does foreach (@list) alter @list?? <jgibson@mail.arc.nasa.gov>
    Re: does foreach (@list) alter @list?? <dwall@fastmail.fm>
    Re: does foreach (@list) alter @list?? <tadmc@augustmail.com>
    Re: does foreach (@list) alter @list?? <matthew.garrish@sympatico.ca>
        Format - stopping at certain page <sammie-nospam@greatergreen.com>
    Re: how to write a tutorial <fb@frank-buss.de>
    Re: how to write a tutorial <cbfalconer@yahoo.com>
    Re: how to write a tutorial <mfinder@digipen.edu>
        Innermost containing tag: match/replace it w_laks@yahoo.com
    Re: Innermost containing tag: match/replace it <jgibson@mail.arc.nasa.gov>
    Re: installing module to my own directory with MCPAN <chicks_hate_me@hotmail.com>
    Re: Is zero even or odd? <george_coxanti@spambtinternet.com.invalid>
    Re: locale problem <no@mail.org>
    Re: Low level data manipulation in Perl <perl@lennychallis.co.uk>
    Re: Negative lookahead regex clarification needed <shifty_MyU@yahoo.com>
    Re: Negative lookahead regex clarification needed <shifty_MyU@yahoo.com>
    Re: Negative lookahead regex clarification needed <shifty_MyU@yahoo.com>
    Re: Negative lookahead regex clarification needed <flavell@ph.gla.ac.uk>
    Re: Negative lookahead regex clarification needed (Anno Siegel)
        net::SFTP to capture 'get' reference? <rhxk@yahoo.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Thu, 20 Jan 2005 22:54:42 +0100
From: Hendrik Maryns <hendrik_maryns@despammed.com>
Subject: does foreach (@list) alter @list??
Message-Id: <maWdneQiKfU-tW3cRVnyjQ@scarlet.biz>

Hi,
I've got this little program that reads all the files ending in .pos in 
a directory into a list @dir, then opens them one by one, does 
something, closes them again.  When finished with all files, I start 
this same thing for a second time, doing something different with the 
files, based on what I did with them in the first loop.  Now strangely 
enough I get an error message which says there are uninitialised 
variables in @dir, "Can't open etc."

I get this (with diagnostics):
Use of uninitialized value in open at test.pl line 21 (#1)
     (W uninitialized) An undefined value was used as if it were already
     defined.  It was interpreted as a "" or a 0, but maybe it was a mistake.
     To suppress this warning assign a defined value to your variables.

     To help you figure out what was undefined, perl tells you what operation
     you used the undefined value in.  Note, however, that perl optimizes your
     program and the operation displayed in the warning may not necessarily
     appear literally in your program.  For example, "that $foo" is
     usually optimized into "that " . $foo, and the warning will refer to
     the concatenation (.) operator, even though there is no . in your
     program.

Uncaught exception from user code:
         No such file or directory at test.pl line 21.
  at test.pl line 21

If I redefine the @dir variable, it works, but this shouldn't be 
necessary if you ask me...

Though of course you can't test this because you don't have those files 
in your active directory, here's a reduced version:

use strict;
use warnings;
use diagnostics;

my @dir= <*.pos>;	# feel free to change this for testing purposes

for (@dir){
	open(my $in, '<', $_)||die($!);
		# do something
	close $in;
} #endfor

# @dir= <*.pos>;

for (@dir){
	open(my $in, '<', $_ )||die($!);
		#do something else
	close $in;
} #endfor

So my question is: how come?  Where does @dir get altered, or how come 
$_ is undefined there.

For "do something" you could for example use

my $counter=0;
	while (<$in>){
		$counter++;
		print $counter, ' ';
	}

Curious greetings, Hendrik


------------------------------

Date: 20 Jan 2005 22:25:54 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude.invalid>
Subject: Re: does foreach (@list) alter @list??
Message-Id: <Xns95E4B15A17E12asu1cornelledu@132.236.56.8>

Hendrik Maryns <hendrik_maryns@despammed.com> wrote in
news:maWdneQiKfU-tW3cRVnyjQ@scarlet.biz: 

> use strict;
> use warnings;
> use diagnostics;
> 
> my @dir= <*.pos>;     # feel free to change this for testing purposes
> 
> for (@dir){
>      open(my $in, '<', $_)||die($!);
>           # do something
>      close $in;
> } #endfor
> 
> # @dir= <*.pos>;
> 
> for (@dir){
>      open(my $in, '<', $_ )||die($!);
>           #do something else
>      close $in;
> } #endfor
> 
> So my question is: how come?  Where does @dir get altered, or how come
> $_ is undefined there.
> 
> For "do something" you could for example use
> 
> my $counter=0;
>      while (<$in>){$counter++;
>           print $counter, ' ';

You might want to read perldoc perlsyn. Look for the section "Foreach 
Loops".

When you have while(<$in>), $_ is set to the line that was read from $in 
each time. The loop terminates when no more lines can be read, and 
hence, $_ is set to undef. Since $_ was an alias to an element of @dir, 
that element is set to undef in turn. 

The simple solution to this is to use an explicit lexically scoped loop 
variable, as in 

for my $file (@dir) {
    # open $file and read it here; $_ is no longer an alias into @dir
}

Sinan.
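
The effect described above can be reproduced without any files on disk. This is a minimal sketch, assuming in-memory filehandles as a stand-in for the .pos files; the file names and contents are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical names; nothing is opened on disk.
my @dir  = ('a.pos', 'b.pos');
my $text = "line 1\nline 2\n";    # stand-in file contents

for (@dir) {
    # Read from an in-memory filehandle instead of a real file.
    open(my $in, '<', \$text) or die "Cannot open in-memory file: $!";
    while (<$in>) { }             # assigns each line to $_, then undef at EOF
    close $in;
}

# Count the elements that the inner loop overwrote with undef.
my $undefined = grep { !defined $_ } @dir;
print "undefined elements in \@dir: $undefined\n";
```

A second pass over @dir would then hand undef to open, which matches the warning in the original post.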


------------------------------

Date: Thu, 20 Jan 2005 22:24:20 GMT
From: "Peter Wyzl" <wyzelli@yahoo.com>
Subject: Re: does foreach (@list) alter @list??
Message-Id: <ocWHd.126397$K7.121118@news-server.bigpond.net.au>

"Hendrik Maryns" <hendrik_maryns@despammed.com> wrote in message 
news:maWdneQiKfU-tW3cRVnyjQ@scarlet.biz...
: Hi,
: I've got this little program that reads all the files ending on .pos in
: a directory into a list @dir, then opens them one by one, does
: something, closes them again.  When finished with all files, I start
: this same thing for a second time, doing something different with the
: files, based on what I did with them in the first loop.  Now strangely
: enough I get an error message which says there are uninitialised
: variables in @dir, "Can't open etc."

<snip>:

: use strict;
: use warnings;
: use diagnostics;
:
: my @dir= <*.pos>; # feel free to change this for testing purposes
:
: for (@dir){
: open(my $in, '<', $_)||die($!);
: # do something
: close $in;
: } #endfor
:
: # @dir= <*.pos>;
:
: for (@dir){
: open(my $in, '<', $_ )||die($!);
: #do something else
: close $in;
: } #endfor
:
: So my question is: how come?  Where does @dir get altered, or how come
: $_ is undefined there.
:
: For "do something" you could for example use

When in a foreach loop, the $_ variable is aliased to each element of the 
array.  Changing $_ in any way will change the corresponding variable in the 
array.

What you are doing above is using the implicit $_ variable.

Since you don't include the actual 'do something' code I can't be _certain_ 
but from the sample you gave you are also using the $_ variable in the while 
loop.

my $counter=0;
#### here $_ is repetitively set to the content of each line
#### of the file, thus changing @dir
while (<$in>){
    $counter++;
    print $counter, ' ';
}

It is probable that you have one or more blank lines in one or more of the 
files....

You either need to localise the $_ variable in the while loop or explicitly 
declare and use a named variable in the foreach loop.

foreach my $file (@dir){
    open (my $in, '<', $file) or die($!);
    # do stuff
    close $in;
}

etc etc etc
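
The other option mentioned above, localising $_ around the while loop, might look like the following sketch. The file names and the in-memory contents are assumptions made so that it runs without real files.

```perl
use strict;
use warnings;

my @dir   = ('a.pos', 'b.pos');       # hypothetical names
my $text  = "line 1\n\nline 3\n";     # stand-in contents with a blank line
my $lines = 0;

for (@dir) {
    open(my $in, '<', \$text) or die "Cannot open in-memory file: $!";
    {
        local $_;                     # the while loop gets its own $_
        while (<$in>) { $lines++ }
    }                                 # the alias into @dir is restored here
    close $in;
}

my $intact = join(',', @dir) eq 'a.pos,b.pos';
print "lines read: $lines, \@dir intact: ", ($intact ? 'yes' : 'no'), "\n";
```

The blank line in the sample text is read as "\n", which is a defined value, so blank lines alone do not produce the warning. The undef comes from the read at end of file.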

HTH

P
-- 
print "Just another Perl Hacker";




------------------------------

Date: Thu, 20 Jan 2005 14:29:25 -0800
From: Jim Gibson <jgibson@mail.arc.nasa.gov>
Subject: Re: does foreach (@list) alter @list??
Message-Id: <200120051429251683%jgibson@mail.arc.nasa.gov>

In article <maWdneQiKfU-tW3cRVnyjQ@scarlet.biz>, Hendrik Maryns
<hendrik_maryns@despammed.com> wrote:

[ problem description, diagnostic message about uninitialized value
snipped]

> use strict;
> use warnings;
> use diagnostics;
> 
> my @dir= <*.pos>;  # feel free to change this for testing purposes
> 
> for (@dir){
>  open(my $in, '<', $_)||die($!);
>     # do something
>  close $in;
> } #endfor
> 
> # @dir= <*.pos>;
> 
> for (@dir){
>  open(my $in, '<', $_ )||die($!);
>     #do something else
>  close $in;
> } #endfor
> 
> So my question is: how come?  Where does @dir get altered, or how come 
> $_ is undefined there.

You are running afoul of the implicit loop-aliasing feature. In a for
loop, the loop variable is set to an alias of each member of the list
in turn. You are using the default loop variable $_. If you modify $_,
you are modifying the list element itself. You probably have something
inside your loop that modifies $_. A while ( <$in> ) would do that,
since it also uses $_ as a default loop variable.

Use an explicit loop variable in each case and do not modify it:

   for my $file ( @dir ) {
      open( my $in, ... );
      while ( my $line = <$in> ) {
         ...
      }
   }

See perldoc perlsyn "Foreach Loops"
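
Put together, the two-pass program from the original post could be sketched as below with named variables in both loops. The temporary directory, file names, and file contents are assumptions added so the sketch is self-contained; it also assumes the temporary path contains no whitespace, since glob splits its argument on whitespace.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Create two throwaway .pos files so the sketch runs anywhere.
my $tmp = tempdir(CLEANUP => 1);
for my $name ('a.pos', 'b.pos') {
    open(my $out, '>', "$tmp/$name") or die "Cannot write $tmp/$name: $!";
    print {$out} "first\nsecond\n";
    close $out or die "Cannot close $tmp/$name: $!";
}

my @dir = glob("$tmp/*.pos");
my %count;

# First pass: count lines, using named variables in both loops.
for my $file (@dir) {
    open(my $in, '<', $file) or die "Cannot open $file: $!";
    while (my $line = <$in>) {
        $count{$file}++;
    }
    close $in;
}

# Second pass: @dir is unchanged, so every file opens again.
my $reopened = 0;
for my $file (@dir) {
    open(my $in, '<', $file) or die "Cannot open $file: $!";
    $reopened++;
    print "$file: $count{$file} lines\n";
    close $in;
}
```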




------------------------------

Date: Thu, 20 Jan 2005 22:32:50 -0000
From: "David K. Wall" <dwall@fastmail.fm>
Subject: Re: does foreach (@list) alter @list??
Message-Id: <Xns95E4B27FA29D5dkwwashere@216.168.3.30>

Hendrik Maryns <hendrik_maryns@despammed.com> wrote:

> use strict;
> use warnings;
> use diagnostics;
> 
> my @dir= <*.pos>;     # feel free to change this for testing purposes
> 
> for (@dir){
>      open(my $in, '<', $_)||die($!);
>           # do something
>      close $in;
> } #endfor

If you do something to $_ inside this loop, then yes, that will alter 
the current element of @dir.  See the section on foreach loops in 
perlsyn.

It's better to do something like this:

    for my $d (@dir) {
        # do stuff with $d
    }

Then you don't have to worry about what you do to $_.


------------------------------

Date: Thu, 20 Jan 2005 16:54:33 -0600
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: does foreach (@list) alter @list??
Message-Id: <slrncv0dl9.8vn.tadmc@magna.augustmail.com>

Hendrik Maryns <hendrik_maryns@despammed.com> wrote:

> Subject: does foreach (@list) alter @list??


Not by itself. But it will if you ask it to.


> Now strangely 
> enough I get an error message which says there are uninitialised 
> variables in @dir, 


> my @dir= <*.pos>;	# feel free to change this for testing purposes
> 
> for (@dir){
> 	open(my $in, '<', $_)||die($!);
> 		# do something


> So my question is: how come?  


Can't tell, because you have not shown us what goes in
the "do something" there...


> Where does @dir get altered, 


Somewhere in the "do something" part.


> For "do something" you could for example use
> 
> my $counter=0;
> 	while (<$in>){$counter++;


You are using $_ in an outer foreach AND in an inner while.

The while will end with $_ = undef.

Since $_ is also the loop control variable of a foreach(), it
serves as an alias back into the corresponding list element,
just as the "Foreach Loops" section in perlsyn.pod says it will.

Overusing $_ is abusing $_.

Choose some other name for one or the other of the loops.


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: Fri, 21 Jan 2005 09:44:06 -0500
From: "Matt Garrish" <matthew.garrish@sympatico.ca>
Subject: Re: does foreach (@list) alter @list??
Message-Id: <Ty8Id.44638$K03.1151857@news20.bellglobal.com>


<xhoster@gmail.com> wrote in message 
news:20050121091244.238$GB@newsreader.com...
> "Matt Garrish" <matthew.garrish@sympatico.ca> wrote:
>> "Hendrik Maryns" <hendrik_maryns@despammed.com> wrote in message
>> news:maWdneQiKfU-tW3cRVnyjQ@scarlet.biz...
>> >
>> > my $counter=0;
>> > while (<$in>){$counter++;
>> > print $counter, ' ';
>> >
>>
>> I never understand the fascination people have with running their own
>> unnecessary counters:
>>
>> print "$. " while <$in>;
>
> What if I want to change "while (<$in>)" to "foreach (@in)"?
> Or if I need to add something in the loop which deals with
> some other file?  Now for which file is $. accurate?
>

And what if you were running your own counter and failed to account for the 
obvious? Sorry if I don't see your point.

Matt 
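
The question about which file $. refers to can be illustrated with two handles. This is a small sketch using in-memory filehandles with made-up contents; it shows that $. follows the most recently read handle, while a private counter stays with the loop that owns it.

```perl
use strict;
use warnings;

my $log_text   = "a\nb\nc\n";     # stand-in for the main input
my $other_text = "x\ny\n";        # stand-in for "some other file"
open(my $log,   '<', \$log_text)   or die "Cannot open: $!";
open(my $other, '<', \$other_text) or die "Cannot open: $!";

my $counter = 0;
while (my $line = <$log>) {
    $counter++;                   # private counter for $log only
}

my $after_log   = $.;             # $. refers to the last handle read: $log
my $first       = <$other>;       # read one line from a different handle
my $after_other = $.;             # $. now reports on $other

print "own counter: $counter, \$. after log: $after_log, ",
      "\$. after other: $after_other\n";
```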




------------------------------

Date: Thu, 20 Jan 2005 13:17:25 -0800
From: "Brad Walton" <sammie-nospam@greatergreen.com>
Subject: Format - stopping at certain page
Message-Id: <HIednQmfIfskg23cRVn-3Q@comcast.com>

I would like the output of data to stop at a certain page number. This is
the test script I have been working on:

----
#!/usr/bin/perl
use strict;
use warnings;

my ($key, $value);
$= = 10; # Number of lines before new page

open (FILE, 'greatergreen.plr') or die "Cannot open: $!";
while (<FILE>) {
    chomp;
    if ($_ =~ /(.*)\=\"(.*)\".*/) {
        $key = $1;
        $value = $2;
        write if $% < 2;
    }
}
close (FILE);

format STDOUT =
@<<<<<<<<<<<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<<
$key, $value
.

format STDOUT_TOP =

Page @<<
$%

Variable                  Setting
========================= =================
.
----

And this is the output:

----
Page 1

Variable                  Setting
========================= =================
Track File                LOCATION
Track Description         LongBeach
AI Database File          LOCATION
Profile Vehicle File      TEAMS\GT
Vehicle File              TEAMS\GT
?
Page 2

Variable                  Setting
========================= =================
Sim Vehicle File          TEAMS\GT
----

I would like it to stop at the end of Page 1, but it's leaking on to page 2.
I assume this is happening because the page counter isn't updated until
after it's called for write.

Thanks for any help,
Brad
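
One possible way to stop at the end of page 1 is to look at $- (lines left on the current page) before calling write, because the page counter $% only changes when write starts a new page. The sketch below is an assumption about what would work here, not an answer taken from the thread. The records, the format names REC and REC_TOP, and the in-memory output handle are made up for illustration.

```perl
use strict;
use warnings;

my ($key, $value);
my $report = '';
open(my $out, '>', \$report) or die "Cannot open in-memory file: $!";

my $old = select($out);     # the format variables below apply to this handle
$~ = 'REC';                 # body format name
$^ = 'REC_TOP';             # top-of-page format name
$= = 10;                    # lines per page, as in the original script

for my $n (1 .. 12) {       # made-up records, more than one page holds
    ($key, $value) = ("Key $n", "Value $n");
    # Once page 1 has been started and has no lines left,
    # the next write would begin page 2, so stop here.
    last if $% >= 1 && $- < 1;
    write;
}

select($old);
close $out;
print $report;

format REC =
@<<<<<<<<<<<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<<
$key, $value
.

format REC_TOP =
Page @<<
$%
Variable                  Setting
.
```

The check assumes the body format is one line tall. A taller body format would need the comparison adjusted to its height.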




------------------------------

Date: Fri, 21 Jan 2005 15:03:45 +0000 (UTC)
From: Frank Buss <fb@frank-buss.de>
Subject: Re: how to write a tutorial
Message-Id: <csr5kh$216$3@newsreader2.netcologne.de>

drewc <drewc@rift.com> wrote:

> What does this have to do with Lisp? (i'm in c.l.l).

he is a troll, but one who confesses this fact:

http://www.xahlee.org/Netiquette_dir/troll.html

-- 
Frank Buß, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de


------------------------------

Date: Fri, 21 Jan 2005 16:30:55 GMT
From: CBFalconer <cbfalconer@yahoo.com>
Subject: Re: how to write a tutorial
Message-Id: <41F12912.3748FFB7@yahoo.com>

Xah Lee wrote:
> 
> i've started to read python tutorial recently.
> http://python.org/doc/2.3.4/tut/tut.html
> 
> Here are some quick critique:

This has absolutely nothing to do with c.l.c, nor most of the
cross-posted groups.  F'ups set.  Why did you do such a foul
cross-posting in the first place.

-- 
"If you want to post a followup via groups.google.com, don't use
 the broken "Reply" link at the bottom of the article.  Click on 
 "show options" at the top of the article, then click on the 
 "Reply" at the bottom of the article headers." - Keith Thompson




------------------------------

Date: Fri, 21 Jan 2005 08:44:37 -0800
From: M Jared Finder <mfinder@digipen.edu>
Subject: Re: how to write a tutorial
Message-Id: <35cprlF4hhav6U1@individual.net>

Xah Lee wrote:
> i've started to read python tutorial recently.
> http://python.org/doc/2.3.4/tut/tut.html

What does this have to do with Perl, Lisp, Scheme, or C?

   -- MJF


------------------------------

Date: 20 Jan 2005 12:54:06 -0800
From: w_laks@yahoo.com
Subject: Innermost containing tag: match/replace it
Message-Id: <1106254446.627069.172760@f14g2000cwb.googlegroups.com>

I need to extract a couple of image links inside a table containing
spacer gifs and replace it with a div.
==============================================================
 ... some HTML content ...
<table><tr><td>stuff ...</td></tr></table>
 ... more stuff
<table> <!-- this needs to be replaced with a <div>
containing only the yes and no image links -->
<!-- spacer gif -->
<a href="#"><img src="yes.gif"/><a/>
<!-- spacer gif -->
<a href="#"><img src="no.gif"/></a>
</table>
==============================================================
If possible I would like to extract the image links and replace its
*containing* table with the div. I tried the minimal (non-greedy)
qualifier but I guess it always goes for the left-most match starting
at the *first* table tag. Can I write a regexp to match and replace the
innermost containing tag?

Thanks,
Lakshmi.



------------------------------

Date: Thu, 20 Jan 2005 14:38:31 -0800
From: Jim Gibson <jgibson@mail.arc.nasa.gov>
Subject: Re: Innermost containing tag: match/replace it
Message-Id: <200120051438314428%jgibson@mail.arc.nasa.gov>

In article <1106254446.627069.172760@f14g2000cwb.googlegroups.com>,
<w_laks@yahoo.com> wrote:

> I need to extract a couple of image links inside a table containing
> spacer gifs and replace it with a div.
> ==============================================================
> ... some HTML content ...
> <table><tr><td>stuff ...</td></tr></table>
> ... more stuff
> <table> <!-- this needs to be replaced with a <div>
> containing only the yes and no image links -->
> <!-- spacer gif -->
> <a href="#"><img src="yes.gif"/><a/>
> <!-- spacer gif -->
> <a href="#"><img src="no.gif"/></a>
> </table>
> ==============================================================
> If possible I would like to extract the image links and replace its
> *containing* table with the div. I tried the minimal (non-greedy)
> qualifier but I guess it always goes for the left-most match starting
> at the *first* table tag. Can I write a regexp to match and replace the
> innermost containing tag?

Parsing HTML with regular expressions is considered difficult or next
to impossible. You are seeing why. You should use a real HTML parsing
method, such as the HTML::Parser module from CPAN.
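
If a parser is not an option and the markup is as regular as the sample, a pattern that refuses to cross another table tag can pick out the innermost table. The sketch below is only that, a sketch: the caveat above still applies, the sample HTML is a tidied version of the one in the question, and div_for_table is a hypothetical helper name.

```perl
use strict;
use warnings;

my $html = <<'HTML';
<p>some HTML content</p>
<table><tr><td>stuff</td></tr></table>
<p>more stuff</p>
<table>
<!-- spacer gif -->
<a href="#"><img src="yes.gif"/></a>
<!-- spacer gif -->
<a href="#"><img src="no.gif"/></a>
</table>
HTML

# Hypothetical helper: turn one table into a div if it holds image links.
sub div_for_table {
    my ($whole, $body) = @_;    # copy first; the match below resets the captures
    my @links = $body =~ m{(<a\b[^>]*>\s*<img\b[^>]*>\s*</a>)}gi;
    return @links ? '<div>' . join('', @links) . '</div>' : $whole;
}

$html =~ s{
    (                                   # group 1: the whole table
        <table\b[^>]*>
        ( (?: (?! </?table\b ) . )* )   # group 2: body with no table tag in it
        </table>
    )
}{ div_for_table($1, $2) }gsexi;

print $html;
```

Because the body may not contain an opening or closing table tag, an outer table that wraps another table cannot match, and only tables with no table inside them are considered.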




------------------------------

Date: 20 Jan 2005 13:40:57 -0800
From: "ChicksHateMe" <chicks_hate_me@hotmail.com>
Subject: Re: installing module to my own directory with MCPAN
Message-Id: <1106257257.439228.50750@f14g2000cwb.googlegroups.com>

Thanks for the link here, what you and Paul showed will surely help.

P.S.

Don't you just love , when you are asking for help, You get some JERK
like Sinan, who think he knows it all,  and bust yer ,,,, instead of
trying to be NICE and helpful.

Geesh, what a weenie..

Hey, Sin. Instead of being a snotty nosed jerk, Why don't ya chose to
be nice and helpful or put your time into doing something CONSTRUCTIVE
rather than DESTRUCTIVE...

*mumbles "weenie" again*

Gawd, sometimes dontcha just wish you could reach out and TOUCH
someone.....



------------------------------

Date: Thu, 20 Jan 2005 22:13:53 +0000 (UTC)
From: George Cox <george_coxanti@spambtinternet.com.invalid>
Subject: Re: Is zero even or odd?
Message-Id: <41F02D28.E30939CF@spambtinternet.com.invalid>

Fred Bloggs wrote:
> 
> ... Halmos in his General
> Topology...

What's that?  Guess (on the basis that they're both GTM): Kelley's
General Topology or Halmos's Measure Theory.


------------------------------

Date: Thu, 20 Jan 2005 22:09:35 GMT
From: osmo <no@mail.org>
Subject: Re: locale problem
Message-Id: <z_VHd.622$%w5.611@read3.inet.fi>

Arndt Jonasson wrote:
> osmo wrote :
> 
>>I have problems with using my locale in perl. I have all my locale 
>>environment variables set to "fi_FI.UTF-8" (finnish).
>>[...]
> 
> 
> Exactly which are "all my locale environment variables"? Maybe you and
> those who do get the wanted results do not have the same ones set?

By all i mean all the ones listed when typing command "locale":

LANG=fi_FI.UTF-8
LC_CTYPE="fi_FI.UTF-8"
LC_NUMERIC="fi_FI.UTF-8"
LC_TIME="fi_FI.UTF-8"
LC_COLLATE="fi_FI.UTF-8"
LC_MONETARY="fi_FI.UTF-8"
LC_MESSAGES="fi_FI.UTF-8"
LC_PAPER="fi_FI.UTF-8"
LC_NAME="fi_FI.UTF-8"
LC_ADDRESS="fi_FI.UTF-8"
LC_TELEPHONE="fi_FI.UTF-8"
LC_MEASUREMENT="fi_FI.UTF-8"
LC_IDENTIFICATION="fi_FI.UTF-8"
LC_ALL=fi_FI.UTF-8

I think those should be more than enough.

Osmo



------------------------------

Date: Thu, 20 Jan 2005 16:06:09 -0000
From: "Leonard Challis" <perl@lennychallis.co.uk>
Subject: Re: Low level data manipulation in Perl
Message-Id: <csokth$40s$1@newsg4.svr.pol.co.uk>

Tad McClellan wrote
>
> They are posted here twice each week.
>
>   http://mail.augustmail.com/~tadmc/clpmisc.shtml
>

Excellent - thanks for that Tad 




------------------------------

Date: 21 Jan 2005 08:31:37 -0800
From: "shifty" <shifty_MyU@yahoo.com>
Subject: Re: Negative lookahead regex clarification needed
Message-Id: <1106325097.883251.263140@c13g2000cwb.googlegroups.com>


> I didn't know what "Regex Coach" is (I do now, courtesy of Google),
> but I find "pcretest" (part of the PCRE package from Phil Hazel) to be
> a valuable aid.

I'll hafta check that out.


> OTOH, if you're in a context where only a regex is acceptable (you're
> not by any chance writing recipes for spamassassin?) then I might have
> to take that back.

I am writing recipes for spam rejection, you're sharp ;)

I'm writing something specific to PCRE.  I couldn't find any current
regex-specific groups.



------------------------------

Date: 21 Jan 2005 08:34:48 -0800
From: "shifty" <shifty_MyU@yahoo.com>
Subject: Re: Negative lookahead regex clarification needed
Message-Id: <1106325288.912212.5670@f14g2000cwb.googlegroups.com>


> If the syntax weren't correct it wouldn't compile.  What you are asking is
> whether it does what you want it to do, which is about semantics.

For the purpose it's being used, it is not necessary to compile the
regex.  It's being accessed from an outside resource (spam filter).


> Is there any reason why you want to use lookahead to exclude unaltered
> strings like "microsoft"?  Just skip those strings using an extra regex,
> and concentrate on matching the altered variants.

Yes.  I don't want to bounce legitimate emails.  Spam emails offering
their software almost always misspell it at some point; I want to
bounce anything I can be 99% certain is spam.



------------------------------

Date: 21 Jan 2005 08:41:00 -0800
From: "shifty" <shifty_MyU@yahoo.com>
Subject: Re: Negative lookahead regex clarification needed
Message-Id: <1106325660.361803.31580@f14g2000cwb.googlegroups.com>


Jim Gibson wrote:
> In article <1106161759.879622.23020@f14g2000cwb.googlegroups.com>,
> shifty <shifty_MyU@yahoo.com> wrote:
>
> Yes, it does work, but it could be simplified:

I'm still not sure how, though :)  Seriously, though, I've noticed it
works for everything but microsof+ (non-word character @ end of
expression! You actually noted this :) )

> 1. It is useless to have .* at the beginning and end of the regex.

For the purpose it's being used (spam filter rule), it is necessary.

> 2. It is useless to group with (?: ... ) in this case

You're right ... I was doing this because I didn't want to capture the
match.

> 3. You don't need all of the plus signs unless you expect repeated
> characters.

I do.  Spam emails with "hacked" words often use repeat characters to
fool keyword filtering.

> 9. Don't forget $ as a replacement for s, $ needs escaping in
> double-quote context of a regular expression.

Thanks, missed that one.  I hadn't even thought about it.  I was
running through an ASCII character map to look at similar
characters...dunno how I missed the $ sign.

>
> With all of the above points in mind, I would suggest the following:
>
> my $regex = qr(
>   (?:\b|\s)
>   (?!microsoft)
>   m
>   [i1l\\\|!¡îíìï]
>   [Cç]
>   r
>   [o0öøõôóòð]
>   [s§\$]
>   [o0öøõôóòð]
>   f
>   [t+]
>   (?:\b|\s)
> )ix;
>

Thanks!  I'm going to play with your suggestion for a bit, I think this
should work.  I need to make some versions for pharmaceutical spam as
well.  Should work perfect!


> Are you looking for other approximations such as 'microsloth' and
> 'microsquash'?

Nah, because spammers don't usually do things like that.

Thanks again for your insight.  Couldn't have asked for a more perfect
answer!
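
To see how the quoted pattern behaves, here is an ASCII-only reduction of it run against a few made-up strings. The accented look-alike characters are left out as a simplification, so this is a sketch of the idea rather than the pattern as posted.

```perl
use strict;
use warnings;

my $regex = qr{
    (?:\b|\s)
    (?!microsoft)
    m [i1l\\|!] c r [o0] [s\$] [o0] f [t+]
    (?:\b|\s)
}ix;

my %result;
for my $text ('get m1cr0s0ft cheap', 'Microsoft Office', 'micro$oft', 'microsof+') {
    $result{$text} = $text =~ $regex ? 1 : 0;
    print "$text => $result{$text}\n";
}
```

The last string does not match: after a trailing + at the end of the string there is neither a word boundary nor whitespace, which is the edge case mentioned earlier in this post.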



------------------------------

Date: Fri, 21 Jan 2005 17:11:16 +0000
From: "Alan J. Flavell" <flavell@ph.gla.ac.uk>
Subject: Re: Negative lookahead regex clarification needed
Message-Id: <Pine.LNX.4.61.0501211706040.5962@ppepc56.ph.gla.ac.uk>

On Fri, 21 Jan 2005, shifty wrote:

> Jim Gibson wrote:

> > 2. It is useless to group with (?: ... ) in this case
> 
> You're right ... I was doing this because I didn't want to capture the
> match.

I think Jim means that the negative-lookahead syntax is itself 
non-capturing, despite the parentheses - so you didn't need to nullify 
the capturing anyway.

If you already realised that - apologies in advance.

No, I don't know where to raise questions specifically about regexes, 
either.  But the Perl regulars seem quite a bit more tolerant of 
off-topically regex-related questions here, than they are about 
off-topically CGI questions here :-}
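
The point about lookahead parentheses not capturing can be checked directly. In this small sketch (the test string is made up), the pattern has two sets of parentheses but returns only one capture.

```perl
use strict;
use warnings;

# The (?!...) group asserts but captures nothing; only (m\w+) captures.
my @captures = 'micr0soft' =~ /(?!microsoft)(m\w+)/;
print scalar(@captures), " capture: $captures[0]\n";
```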


------------------------------

Date: 21 Jan 2005 17:42:45 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: Negative lookahead regex clarification needed
Message-Id: <csreul$m3o$1@mamenchi.zrz.TU-Berlin.DE>

shifty <shifty_MyU@yahoo.com> wrote in comp.lang.perl.misc:
> 
> > If the syntax weren't correct it wouldn't compile.  What you are asking is
> > whether it does what you want it to do, which is about semantics.
> 
> For the purpose it's being used, it is not necessary to compile the
> regex.  It's being accessed from an outside resource (spam filter).

Something is going to compile it.  Every regex engine in existence
does that.

My point was the misuse of "syntax" for "correct code".  It's becoming a
sore spot.

> > Is there any reason why you want to use lookahead to exclude unaltered
> > strings like "microsoft"?  Just skip those strings using an extra regex,
> > and concentrate on matching the altered variants.
> 
> Yes.  I don't want to bounce legitimate emails.  Spam emails offering
> their software almost always misspell it at some point; I want to
> bounce anything I can be 99% certain is spam.

That's inconclusive, but since you didn't say what your spam filter
actually does with the regex, there's no way of telling.

Anno
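
Whether a pattern compiles can be checked in Perl before it is handed to anything else. The helper name compiles is hypothetical, and this only tests Perl's own regex engine; a PCRE-based filter may accept or reject slightly different syntax.

```perl
use strict;
use warnings;

# Hypothetical helper: true if Perl's regex engine can compile the pattern.
sub compiles {
    my ($pattern) = @_;
    my $re = eval { qr/$pattern/ };   # a syntax error dies; eval traps it
    return defined $re ? 1 : 0;
}

my $good = compiles('(?!microsoft)m[i1l]crosoft');
my $bad  = compiles('(?!microsoft');  # unbalanced parenthesis

print "good pattern compiles: $good, bad pattern compiles: $bad\n";
```

A pattern that compiles is syntactically valid, but that says nothing about whether it matches what was intended, which is the distinction drawn above.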


------------------------------

Date: Fri, 21 Jan 2005 18:53:50 GMT
From: Rob <rhxk@yahoo.com>
Subject: net::SFTP to capture 'get' reference?
Message-Id: <2dcId.13870$5R.5048@newssvr21.news.prodigy.com>

Hi,

I'm sftp'ing files from one machine to this one, and wanted to 
capture the error messages for the $sftp->get reference.

I tried the eval method, but that didn't work...any ideas?
here's my script.

#!/usr/bin/perl
 
select(STDERR); $| = 1;         # flush output buffer (STDERR)
select(STDOUT); $| = 1;         # flush output buffer (STDOUT)
 
use Net::SSH::Perl;
use Net::SFTP;
use Net::SFTP::Util;

$user = 'username';
$passwd = 'password';
$site = 'machine1';
$data = 'result';
 
eval {
    $sftp = Net::SFTP->new($site, user => $user, password => $passwd, debug => 0);
};
if ($@) {
    print "SFTP connection to $site failed: $@\n";
}
 
if ($sftp) {
    print "Connected to $site\n";
} else {
    print "cannot connect\n";
    exit;
}
 
$dir = "/home/username/";
$odir = "/tmp";
eval {
    @array=$sftp->ls($dir);
};
if ($@) {
    print "cannot do ls ($dir): $@\n";
     
}
 
foreach (@array) {
   $file = $_;
   ($size,$filename) = (split(/\s+/,$file->{longname}))[4,8];
   $file{$filename} = $size;
}
 
foreach $f (sort keys(%file)) {
    print "remote found $f\n";
    next if ($f !~ /^$data/);
    print "$f ($file{$f})\n";


    ####HERE IS THE AREA OF INTEREST######
    #if i put a 'a' in front of $f to cause $sftp->get
    #to fail, I want to capture it & continue with the
    #script rather than have it kill the script.
    eval {
        $sftp->get("$dir/$f","$odir/$f", \&callback);
    };
    if ($@) {
        print "Could not get $dir/$f\n";
        print "error is: $@\n";
    }
#    $a = $sftp->do_remove("$dir/$f");
#    $b = fx2txt($a);
#print "$a, $b,\n";
#last;
}
 
sub callback {
    my($sftp, $data, $offset, $size) = @_;
    print "Retrieved $f $offset of $size,\n";
}
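
One possible reason the eval above catches nothing is that eval only traps die. If the failing call reports its problem with warn and a false return value, $@ stays empty. Whether Net::SFTP's get behaves that way is an assumption here, not something confirmed in this thread; fake_get below is a hypothetical stand-in that only illustrates the difference.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a call that reports failure by warning and
# returning nothing, instead of dying.
sub fake_get {
    my ($remote) = @_;
    warn "Couldn't fetch $remote\n";
    return;
}

my @warnings;
my ($ok, $error);
{
    local $SIG{__WARN__} = sub { push @warnings, @_ };   # collect warnings
    $ok    = eval { fake_get('/home/username/aresult'); };
    $error = $@;
}

print "eval saw an exception\n"            if $error;
print "call failed without an exception\n" if !$ok && !$error;
print "captured warning: $_"               for @warnings;
```

In a case like this the return value and the collected warnings carry the failure, so the script would test those rather than $@.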



------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 7679
***************************************
