[23227] in Perl-Users-Digest


Perl-Users Digest, Issue: 5448 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Sep 5 11:05:52 2003

Date: Fri, 5 Sep 2003 08:05:10 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Fri, 5 Sep 2003     Volume: 10 Number: 5448

Today's topics:
    Re: [newbie] how to get a element from a hash of array? (Tad McClellan)
    Re: [newbie] how to get a element from a hash of array? <postmaster@castleamber.com>
    Re: [newbie] list elements <postmaster@castleamber.com>
    Re: Arbitrarily Complex Data Structure <minceme@start.no>
    Re: Arbitrarily Complex Data Structure <minceme@start.no>
    Re: expression specific search and replace (Tad McClellan)
    Re: expression specific search and replace <postmaster@castleamber.com>
        My perl script is "Killed" - Ran out of memory (Marcus Brody)
    Re: My perl script is "Killed" - Ran out of memory ctcgag@hotmail.com
    Re: My perl script is "Killed" - Ran out of memory <peter@semantico.com>
    Re: Net::Telnet and waitfor context problem <jim.mozley@exponential-e.com>
    Re: Net::Telnet and waitfor context problem (Anno Siegel)
    Re: Net::Telnet and waitfor context problem <jim.mozley@exponential-e.com>
    Re: Net::Telnet and waitfor context problem <jim.mozley@exponential-e.com>
        Perl, javascript and CGI (Saya)
        regex help <not@home.com>
    Re: regex help <bernard.el-hagin@DODGE_THISlido-tech.net>
    Re: regex help (Anno Siegel)
    Re: regex help <not@home.com>
    Re: regex help <minceme@start.no>
    Re: regex help (Sam Holden)
    Re: regex help <abigail@abigail.nl>
    Re: regex help (Sam Holden)
    Re: regex help <not@home.com>
        regex weirdness <hacker@amazon.de>
    Re: regex weirdness <abigail@abigail.nl>
    Re: regex weirdness <hacker@amazon.de>
        Telling perl to expect input data to be in UTF8 <Graham.T.Wood@oracle.com>
    Re: View NG with Net::NNTP (Tom)
    Re:  <bwalton@rochester.rr.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Fri, 5 Sep 2003 08:34:14 -0500
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: [newbie] how to get a element from a hash of array?
Message-Id: <slrnblh46m.2p8.tadmc@magna.augustmail.com>

fabre <pedro.fabre@gen.gu.se> wrote:
> 
> being my script:
> 
> #! /usr/bin/perl -w
> 
> use strict;      
> 
> my %hash = (
>     'one' => "A B C D",
>     'two' => "I J K L",
> );
> 
> foreach my $key (keys %hash){
>     print "$hash{$key}\n";   
> }   
>  
> 
> How I can get "K", the third value for key two? 


You can split on whitespace and use a "list slice" to grab the 3rd element:

    my $third_value_for_key_two =  (split /\s+/, $hash{two})[2];
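
A minimal runnable sketch of the list slice above. The variable names are illustrative, and the hash-of-array-references variant at the end is only an assumption about what the subject line is really after:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %hash = (
    one => "A B C D",
    two => "I J K L",
);

# List slice: split the string, keep only index 2 (the third field).
my $third = (split /\s+/, $hash{two})[2];
print "$third\n";

# Storing array references avoids re-splitting on every lookup.
my %hoa = map { $_ => [ split /\s+/, $hash{$_} ] } keys %hash;
print "$hoa{two}[2]\n";
```

If the data is looked up more than once, the second form is probably the one to keep.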


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: Fri, 05 Sep 2003 16:18:00 +0200
From: John Bokma <postmaster@castleamber.com>
Subject: Re: [newbie] how to get a element from a hash of array?
Message-Id: <3f589b7e$0$1745$58c7af7e@news.kabelfoon.nl>

Gunnar Hjalmarsson wrote:

> Of course, the hash doesn't _need_ to be anonymous:
> 
>     #hash of array references
>     my %hash = ( one => [qw/A B C D/], two => [qw/I J K L/] );
> 
>     #output
>     foreach my $key (keys %hash) {

	  my $rarr = $hash{$key};

[remove]
>         foreach my $elem (@{$hash{$key}}) {
>             print "$elem ";
>         }
>         print "\n";
[end remove]

	print join(" ", @$rarr), "\n";

>     }

I use $rarr to make it more readable; the two added lines could be 
replaced by: print join(" ", @{$hash{$key}}), "\n";
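
Put together as a self-contained sketch (the sorted key order and the @lines array are additions for the sake of a stable, checkable result, not part of the posts above):

```perl
use strict;
use warnings;

my %hash = ( one => [qw/A B C D/], two => [qw/I J K L/] );

# Build one output line per key; sorting the keys makes the order stable.
my @lines = map { join(" ", @{ $hash{$_} }) } sort keys %hash;
print "$_\n" for @lines;
```

Unlike the nested foreach, join() leaves no trailing space at the end of each line.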


-- 
Kind regards,       feel free to mail: mail(at)johnbokma.com (or reply)
                     virtual home: http://johnbokma.com/  ICQ: 218175426
John                web site hints: http://johnbokma.com/websitedesign/



------------------------------

Date: Fri, 05 Sep 2003 16:42:11 +0200
From: John Bokma <postmaster@castleamber.com>
Subject: Re: [newbie] list elements
Message-Id: <3f58a129$0$1760$58c7af7e@news.kabelfoon.nl>

Vlad Tepes wrote:

@hash{@a} = ( "" ) x @a;

vs.

@hash{@a} = ( undef ) x @a;

> So setting a hash value to undef at least appears to save about
> 56 bytes memory compared to storing it as an empty string.

56 bytes/element. That is quite some saving. Thanks for sharing.
I guess that undef just creates a null pointer and an empty string 
creates a list element with an empty string and some info. (Wild guess)
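
One consequence of the undef variant, sketched under the assumption that the hash is used purely as a set (the sample keys are made up): membership has to be tested with exists(), because the values are neither true nor defined.

```perl
use strict;
use warnings;

my @a = qw(apple pear plum);

# Hash used purely as a set: the values are never read, so undef will do.
my %seen;
@seen{@a} = (undef) x @a;

# Test membership with exists(), not with the value's truth or definedness.
my $has_pear  = exists $seen{pear}  ? 1 : 0;
my $has_kiwi  = exists $seen{kiwi}  ? 1 : 0;
my $pear_defd = defined $seen{pear} ? 1 : 0;   # 0: key is there, value is undef

print "pear: $has_pear, kiwi: $has_kiwi, defined(pear): $pear_defd\n";
```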

-- 
Kind regards,       feel free to mail: mail(at)johnbokma.com (or reply)
                     virtual home: http://johnbokma.com/  ICQ: 218175426
John                web site hints: http://johnbokma.com/websitedesign/



------------------------------

Date: Fri, 5 Sep 2003 11:48:58 +0000 (UTC)
From: Vlad Tepes <minceme@start.no>
Subject: Re: Arbitrarily Complex Data Structure
Message-Id: <bj9t7a$e05$2@troll.powertech.no>

JR <jrolandumuc@yahoo.com> wrote:

> With your technique, I was able to significantly factor-down my code
> (the one key component I was forgetting is that a reference to a
> reference is always dereferenced as a scalar---that's why I had so
> many unnecessary evals in my first script).  I also noticed that the
> ref has to be tested for in the for loop--using regex can lead to some
> mistakes.  Anyway, below is the much more succinct code for my class
> and calling script.  Thanks, Vlad.

Great JR! I haven't tested it, but this looks far better than the code
you first posted. But, as Bryan pointed out, it doesn't handle circular
references.

And regexes can be nice, you could use them to recursively print
objects:

    $t = bless {}, "Some::Package";
    print $t;	                    # Some::Package=HASH(0x838cd50)
    if (ref $t and "$t" =~ /=(\w+)/) {
        print "\$t is a reference to a $1\n";
    }

Unfortunately, the snippet I posted didn't check for a reference before
performing the pattern matches, so it would wrongly assume that a string
matching /HASH|REF|ARRAY|CODE/ would be a reference... But that's how it
is with fresh code, it takes time to discover and fix the bugs.

That's one important reason to use modules. Unless you're coding to
learn something, search.cpan.org before you start.
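
As one example of the module route, here is a small sketch using Scalar::Util (bundled with perl 5.8) instead of matching on the stringified reference. The describe() helper is hypothetical, written only for this illustration:

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

# Describe a value without pattern-matching on its stringified form.
sub describe {
    my ($thing) = @_;
    return "plain scalar" unless ref $thing;
    my $type  = reftype $thing;             # HASH, ARRAY, CODE, ...
    my $class = blessed $thing;             # undef unless it is an object
    return defined $class ? "$class object ($type)" : "$type reference";
}

my $t = bless {}, "Some::Package";
print describe($t), "\n";
print describe([1, 2]), "\n";
print describe("HASH(0x838cd50)"), "\n";    # just a string that looks like a ref
```

The last call is the case the regex-only approach gets wrong: the string merely looks like a reference.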

Regards,
-- 
Vlad



------------------------------

Date: Fri, 5 Sep 2003 12:40:30 +0000 (UTC)
From: Vlad Tepes <minceme@start.no>
Subject: Re: Arbitrarily Complex Data Structure
Message-Id: <bja07u$ev6$1@troll.powertech.no>

Bryan Castillo <rook_5150@yahoo.com> wrote:
> Vlad Tepes <minceme@start.no> wrote
>> JR <jrolandumuc@yahoo.com> wrote:
>> >jrolandumuc@yahoo.com (JR) wrote
>> >>Brian McCauley <nobull@mail.com> wrote
>> >>> jrolandumuc@yahoo.com (JR) writes:
>> >>> 
>> >>>> Hi.  Is it possible to create a subroutine to handle an arbitrarily
>> >>>> complex data structure (for my purposes, complex only refers to hashes
>> >>>> and arrays)?
>> 
>> I just played around a little with traversing. Maybe you'd like to have a look:
>> 
>>     sub processitem($) {
>>         my $indent = shift;
>>         my $item   = shift || $_;
>>         print "$indent ", "    " x $indent, $item, "\n";
>>     }
>>     
>>     sub trav(@); # must declare sub to prototype recurs. func.
>> 
>>     sub trav {
>>         my $i = ref $_[0] ? 0 : 1 + shift; # indentation
>>         foreach ( @_ ) {
>>             /HASH/  &&  do{ trav $i, %$_;    next }; 
>>             /ARRAY/ &&  do{ trav $i, @$_;    next }; 
>>             /CODE/  &&  do{ trav $i, $_->(); next };
>>             /REF/   &&  do{ trav $i, $$_ ;   next };   
>>             processitem $i;
>>         }
>>     }
>> 
>>     trav  \%hoh, \\\\\%hah, \%hah, \%heh;
>> 
>     # DON'T TRY DOING THIS THOUGH!
>     # fyi Data::Dumper handles it
>
>     my $a = { name => 'bryan', age => 26 };
>     $a->{evil_death} = $a;
>     trav($a);

Yes, this snippet loops infinitely on circular references. (There are
also other errors in it.)

But I can't see how you could use Data::Dumper to print only the values
of datastructures.
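
For the circular-reference problem, one possible approach (a sketch only, not the code from either poster) is to remember every reference already visited. The leaves() helper and the choice to skip code references are assumptions made for this example:

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype refaddr);

# Collect the leaf values of a nested structure, visiting each
# reference only once so that circular structures terminate.
sub leaves {
    my ($item, $seen) = @_;
    $seen ||= {};
    return $item unless ref $item;
    return if $seen->{ refaddr($item) }++;      # already visited
    my $type = reftype $item;
    return map { leaves($_, $seen) } @$item                    if $type eq 'ARRAY';
    return map { leaves($item->{$_}, $seen) } sort keys %$item if $type eq 'HASH';
    return leaves($$item, $seen) if $type eq 'REF' or $type eq 'SCALAR';
    return;                                     # CODE etc. are skipped here
}

my $data = { name => 'bryan', age => 26 };
$data->{evil_death} = $data;                    # circular reference
my @values = leaves($data);
print "@values\n";
```

A side effect of the %$seen bookkeeping is that a substructure referenced from two places is reported only once, which may or may not be what is wanted.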

Regards,
-- 
Vlad



------------------------------

Date: Fri, 5 Sep 2003 08:27:25 -0500
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: expression specific search and replace
Message-Id: <slrnblh3ps.2p8.tadmc@magna.augustmail.com>

qanda <fumail@freeuk.com> wrote:
> Thanks Tad, as always you make me look at things in a different way.
> 
> If we extend the data ...
> __DATA__
> field1,ABC/ab12cd ef34,field3,field4,EFC/ab12cd ef56,field6
> field1,XBC/ab12cd ef34,field3,field4,EFC/ab13cd ef56,field6
> field1,YBC/ab12cd ef34,field3,field4,EFC/ab13ce ef56,field6
> field1,YBC/ab13cd ef34,field3,field4,EFC/ab13ce ef56,field6
> field1,YBC/ab14cd ef34,field3,field4,EFC/ab13ce ef56,field6
> field1,YBC/ab14cd ef34,field3,field4,EFC/ab13ce ef56,field6
> field1,YBC/ab14cd ef34,field3,field4,EFC/ab13ce ef56,field6
> field1,YBC/ab14cd ef34,field3,field4,EFC/ab13ce ef56,field6


> However the unique parts and their replacements should be ...
> 
> all C/ab12cd replaced by string_1
> all C/ab13cd replaced by string_2
> all C/ab13ce replaced by string_3
> all C/ab14cd replaced by string_4


   my %seen;
   my $cnt;
   while ( <DATA> ) {
      s#C/(\S+)# $seen{$1} = ++$cnt unless $seen{$1}; "C/string_$seen{$1}" #ge;
      print;
   }
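
The same substitution can be wrapped in a sub and tried without a DATA section. This is a sketch: renumber() is a made-up name, and the ||= shortcut is a variation on the code above rather than a correction of it:

```perl
use strict;
use warnings;

my (%seen, $cnt);

# Number each distinct token after "C/" in order of first appearance.
sub renumber {
    my ($line) = @_;
    $line =~ s{C/(\S+)}{ "C/string_" . ($seen{$1} ||= ++$cnt) }ge;
    return $line;
}

print renumber("field1,ABC/ab12cd ef34,field3,field4,EFC/ab13cd ef56,field6"), "\n";
print renumber("field1,YBC/ab12cd ef34,field3,field4,EFC/ab13ce ef56,field6"), "\n";
```

Because %seen and $cnt live outside the sub, the numbering carries over from one line to the next.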



-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: Fri, 05 Sep 2003 16:30:16 +0200
From: John Bokma <postmaster@castleamber.com>
Subject: Re: expression specific search and replace
Message-Id: <3f589e5e$0$1755$58c7af7e@news.kabelfoon.nl>

qanda wrote:

> However the unique parts and their replacements should be ...
> 
> all C/ab12cd replaced by string_1
> all C/ab13cd replaced by string_2
> all C/ab13ce replaced by string_3
> all C/ab14cd replaced by string_4

#!/usr/bin/perl -w

use strict;

my %hash;
my $cnt = 1;

while (my $line = <DATA>) {

     $line =~ s{(C/\S+)}{
         defined $hash{$1} ? $hash{$1} :
                             ($hash{$1} = "string_" . $cnt++);
     }ge;
     print $line;

}


__DATA__
field1,ABC/ab12cd ef34,field3,field4,EFC/ab12cd ef56,field6
field1,XBC/ab12cd ef34,field3,field4,EFC/ab13cd ef56,field6
field1,YBC/ab12cd ef34,field3,field4,EFC/ab13ce ef56,field6
field1,YBC/ab13cd ef34,field3,field4,EFC/ab13ce ef56,field6
field1,YBC/ab14cd ef34,field3,field4,EFC/ab13ce ef56,field6
field1,YBC/ab14cd ef34,field3,field4,EFC/ab13ce ef56,field6
field1,YBC/ab14cd ef34,field3,field4,EFC/ab13ce ef56,field6
field1,YBC/ab14cd ef34,field3,field4,EFC/ab13ce ef56,field6


Gives:

field1,ABstring_1 ef34,field3,field4,EFstring_1 ef56,field6
field1,XBstring_1 ef34,field3,field4,EFstring_2 ef56,field6
field1,YBstring_1 ef34,field3,field4,EFstring_3 ef56,field6
field1,YBstring_2 ef34,field3,field4,EFstring_3 ef56,field6
field1,YBstring_4 ef34,field3,field4,EFstring_3 ef56,field6
field1,YBstring_4 ef34,field3,field4,EFstring_3 ef56,field6
field1,YBstring_4 ef34,field3,field4,EFstring_3 ef56,field6
field1,YBstring_4 ef34,field3,field4,EFstring_3 ef56,field6



-- 
Kind regards,       feel free to mail: mail(at)johnbokma.com (or reply)
                     virtual home: http://johnbokma.com/  ICQ: 218175426
John                web site hints: http://johnbokma.com/websitedesign/



------------------------------

Date: 5 Sep 2003 06:21:30 -0700
From: dna@888.nu (Marcus Brody)
Subject: My perl script is "Killed" - Ran out of memory
Message-Id: <1ab76986.0309050521.669cd00@posting.google.com>

I am doing some data extraction on a massive scale with perl (e.g. 10
gigs of *zipped* files).  For this I am using the PerlIO::gzip
module, as I don't even have enough disk space to unzip them all
(which would be around 35 Gb).

I parse the "xls.gz" files one by one (actually tab delimited files),
extracting the data I need, which would then be printed to STDOUT
before the program completes.

However, the program is automatically killed half way through.  By
checking with "free" I can see I am running out of RAM/virtual disk
space, which is not surprising ;-)

I guess my data structure in perl (fairly simple, a hash containing an
array and some associated variables etc.) is just swelling till the
process gets killed by the OS.  However, I need a way round this.

It is difficult to print out the information as I go along, as I am
aggregating info from all the files - I do not know what to print out
until the last file is read.

Therefore is there a cleaner way of dealing with huge data structures
in perl?  I guesstimate that my data structure contains 250 million
variables (all very simple - either decimal points (to about 6 places)
or short names).
Any ideas how I could stop running out of memory?  Declaring data
types ala C?

Don't suggest I go buy some more RAM... I'm skint ;-)


Thanks in advance

MB


------------------------------

Date: 05 Sep 2003 13:55:03 GMT
From: ctcgag@hotmail.com
Subject: Re: My perl script is "Killed" - Ran out of memory
Message-Id: <20030905095503.828$fm@newsreader.com>

dna@888.nu (Marcus Brody) wrote:
> I am doing some data extraction on a massive scale with perl (e.g. 10
> gigs of *zipped* files).  For this I am using the PerlIO::gzip
> module, as I don't even have enough disk space to unzip them all
> (which would be around 35 Gb).

You could also use gzcat to stream the data into your perl script
without unzipping them all at once:

    gzcat file.gz | script.pl

>
> I parse the "xls.gz" files one by one (actually tab delimited files),
> extracting the data I need, which would then be printed to STDOUT
> before the program completes.
>
> However, the program is automatically killed half way through.  By
> checking with "free" I can see I am running out of RAM/virtual disk
> space, which is not surprising ;-)
>
> I guess my data structure in perl (fairly simple, a hash containing an
> array and some associated variables etc.) is just swelling till the
> process gets killed by the OS.  However, I need a way round this.

Don't store as much!  (If you told us what you were doing, I might be
able to tell you how to not store as much.)

> It is difficult to print out the information as I go along, as I am
> aggregating info from all the files

Aggregating how?  sum, count, min, max by group?  by how many groups?
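
If the answer is "sum, count, min, max by group", the per-group state stays tiny no matter how many rows go past. A sketch under assumptions: the two-column "group<TAB>value" layout and the aggregate() helper are invented here, since the real file layout was never posted.

```perl
use strict;
use warnings;

# Keep one small record per group instead of every value seen.
# Assumed input: tab-delimited lines of "group<TAB>value".
sub aggregate {
    my ($fh, $stats) = @_;
    while (my $line = <$fh>) {
        chomp $line;
        my ($group, $value) = split /\t/, $line;
        next unless defined $value;
        my $s = $stats->{$group} ||= { n => 0, sum => 0, min => $value, max => $value };
        $s->{n}++;
        $s->{sum} += $value;
        $s->{min} = $value if $value < $s->{min};
        $s->{max} = $value if $value > $s->{max};
    }
    return $stats;
}

# Demo on an in-memory file; real input could come from a pipe such as
#   open my $fh, '-|', 'gzip', '-dc', $file
my $input = "a\t1.5\nb\t2\na\t0.5\n";
open my $fh, '<', \$input or die "cannot open in-memory file: $!";
my $stats = aggregate($fh, {});
printf "%s: n=%d mean=%.2f\n", $_, $stats->{$_}{n}, $stats->{$_}{sum} / $stats->{$_}{n}
    for sort keys %$stats;
```

Passing the same $stats hash to aggregate() for each file in turn accumulates across files.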

> - I do not know what to print out
> until the last file is read.

Then read the last file first :)

>
> Therefore is there a cleaner way of dealing with huge data structures
> in perl?

We really don't know how you are dealing with it now.  "a hash containing
an array and some associated variables" is not much of a description.

> I guesstimate that my data structure contains 250 million
> variables (all very simple - either decimal points (to about 6 places)
> or short names).

This is what the input data structure is, or this is what the working
memory structure is?  If stored in standard Perl variables, that's going to
be at least 5 gig, even if they are all floats and never used in a
stringish way.

> Any ideas how I could stop running out of memory?  Declaring data
> types ala C?

You could use "pack" to pack the data into C-type variables that are
held (in bulk) in strings.  But that would still take at least 1 gig,
and I assume you actually want to do something with these numbers, which
would be difficult if they are all packed.
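
For illustration, a minimal sketch of the pack idea using single-precision floats. The get_float() helper is hypothetical, and whether roughly seven significant digits is enough for the OP's data is an assumption:

```perl
use strict;
use warnings;

my $size = length pack('f', 0);       # bytes per native float, typically 4

# Append numbers to one long string instead of one scalar per number.
my $buf = '';
$buf .= pack('f', $_) for (0.25, 1.5, 3.75);

# Random access: slice out record $i and unpack just that one.
sub get_float {
    my ($bufref, $i) = @_;
    return unpack 'f', substr($$bufref, $i * $size, $size);
}

printf "%d values in %d bytes; item 1 = %s\n",
    length($buf) / $size, length($buf), get_float(\$buf, 1);
```

Reading single items back is easy enough; it is sorting, searching or keyed lookup on the packed string that gets awkward.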

Does your output consist of all 250 million items?  If not, then
perhaps you don't need to store them all after all.



Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service              New Rate! $9.95/Month 50GB


------------------------------

Date: Fri, 05 Sep 2003 15:20:18 +0100
From: Peter Hickman <peter@semantico.com>
Subject: Re: My perl script is "Killed" - Ran out of memory
Message-Id: <3f589ba3$0$12643$afc38c87@news.easynet.co.uk>

I take it that this is on Linux or some such OS.

I have a problem like that: when parsing various XML files, the process 
memory grows to the size required by the largest file, and the process will 
not give it back until it finishes.

This occasionally gets killed due to lack of swap. What I do is process each 
file from smallest to largest to reduce the amount of time that the process is 
holding onto a lot of swap.
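
Ordering the files by size takes only a few lines. In this sketch the by_size() helper and the temporary demo files are made up for the example:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Order files by size on disk, smallest first.
sub by_size {
    my @files = @_;
    return map  { $_->[1] }
           sort { $a->[0] <=> $b->[0] }
           map  { [ (-s $_) || 0, $_ ] } @files;
}

# Demo: three throwaway files of different sizes.
my $dir   = tempdir(CLEANUP => 1);
my %bytes = (big => 300, small => 10, medium => 50);
for my $name (keys %bytes) {
    open my $out, '>', "$dir/$name" or die "cannot write $dir/$name: $!";
    print {$out} 'x' x $bytes{$name};
    close $out or die "cannot close $dir/$name: $!";
}
my @ordered = by_size(map("$dir/$_", keys %bytes));
print "$_\n" for @ordered;
```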



------------------------------

Date: Fri, 5 Sep 2003 11:57:57 +0100
From: "Jim Mozley" <jim.mozley@exponential-e.com>
Subject: Re: Net::Telnet and waitfor context problem
Message-Id: <bj9q7s$gkppc$1@ID-201189.news.uni-berlin.de>



> > $session->waitfor(Match => '/mymatch/',
> >                   Errmode => 'return')
> >    or die "cannot match it", $session->lastline;
> >
> > This is as shown in an example in the module documentation.
> >
> > The only way I can get this to actually die is to use the default errmode
> > (which is die). If I use error mode return or my own error handling
> > subroutine the test is always true even when it should not (I have used
> > input_log and dump_log to check the input).
>
> I cannot reproduce this.  This
>
>     use Net::Telnet;
>     my $t = Net::Telnet->new( 'localhost') or die;
>     $t->waitfor( Match => '/mymatch/', Errmode => 'return') or
>         die "timeout";
>
> dies with "timeout" every time.  Perl 5.8.0, Net::Telnet 3.03
>
> Anno

Thanks for trying to replicate this.

I'm using Perl 5.8 with Net::Telnet 3.03 as well on Solaris 8. I can
replicate the problem reliably :-( i.e. if I have /willnevermatch/ instead
of /somematch/ the script still won't die.

Jim




------------------------------

Date: 5 Sep 2003 12:15:05 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: Net::Telnet and waitfor context problem
Message-Id: <bj9uo9$ec1$1@mamenchi.zrz.TU-Berlin.DE>

Jim Mozley <jim.mozley@exponential-e.com> wrote in comp.lang.perl.misc:
> 
> 
> > > $session->waitfor(Match => '/mymatch/',
> > >                   Errmode => 'return')
> > >    or die "cannot match it", $session->lastline;
> > >
> > > This is as shown in an example in the module documentation.
> > >
> > > The only way I can get this to actually die is to use the default errmode
> > > (which is die). If I use error mode return or my own error handling
> > > subroutine the test is always true even when it should not (I have used
> > > input_log and dump_log to check the input).
> >
> > I cannot reproduce this.  This
> >
> >     use Net::Telnet;
> >     my $t = Net::Telnet->new( 'localhost') or die;
> >     $t->waitfor( Match => '/mymatch/', Errmode => 'return') or
> >         die "timeout";
> >
> > dies with "timeout" every time.  Perl 5.8.0, Net::Telnet 3.03
> >
> > Anno
> 
> Thanks for trying to replicate this.
> 
> I'm using Perl 5.8 with Net::Telnet 3.03 as well on Solaris 8. I can
> replicate the problem reliably :-( i.e. if I have /willnevermatch/ instead
> of /somematch/ the script still won't die.

I ran it on Solaris 8 (with Perl 5.6.1 Net::Telnet 3.03) with the
same results (i.e., it behaves).  Something weird is going on on your
side.

Anno


------------------------------

Date: Fri, 5 Sep 2003 14:13:43 +0100
From: "Jim Mozley" <jim.mozley@exponential-e.com>
Subject: Re: Net::Telnet and waitfor context problem
Message-Id: <bja26h$h1ram$1@ID-201189.news.uni-berlin.de>

> I ran it on Solaris 8 (with Perl 5.6.1 Net::Telnet 3.03) with the
> same results (i.e., it behaves).  Something weird is going on on your
> side.
>
> Anno

OK I'll try to look some more. I am using the latest version of
Net::Telnet::Cisco too, I don't know if this would cause me problems.

Jim




------------------------------

Date: Fri, 5 Sep 2003 14:40:23 +0100
From: "Jim Mozley" <jim.mozley@exponential-e.com>
Subject: Re: Net::Telnet and waitfor context problem
Message-Id: <bja3o8$gvblv$1@ID-201189.news.uni-berlin.de>



> > I ran it on Solaris 8 (with Perl 5.6.1 Net::Telnet 3.03) with the
> > same results (i.e., it behaves).  Something weird is going on on your
> > side.

mmmmmmmmmm

I also see weirdness now in matching /matchA|matchB/. When neither of these
is present the match still succeeds, as does /matchC/ in one particular
case but not in others.

Jim




------------------------------

Date: 5 Sep 2003 06:03:52 -0700
From: vahu@novonordisk.com (Saya)
Subject: Perl, javascript and CGI
Message-Id: <9e9517bf.0309050503.ce1dd3@posting.google.com>

Hi, 

I have the following scenario in a system that we run. From a webpage
using javascript I am able to invoke a *.ipl script on the server and
pass parameters as well:

javascript param passing: 
parameters = new Object();
parameters.iw_arrArtikelIDs	= arrArtikelsToExtract;
callServer("test.ipl", parameters, true);

*.ipl param extraction:
my $cgi = new CGI();
my $arrArtikelIDs = ${cgi}->param('iw_arrArtikelIDs');

The issue here is that arrArtikelsToExtract is a JavaScript array containing
ID's that I need to process in the *.ipl script. I can't seem to get
it to work.
I thought I was so lucky that I could get away with saying something
like the below in the *.ipl script:
my @arrArtikelIDs = ${cgi}->param('iw_arrArtikelIDs');

Does anyone have any hints/clues/way(s) of achieving this, i.e. passing an
array from javascript to an ipl script?
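
One possible angle, sketched under a loud assumption: if callServer() (whose behaviour is not shown in the post) sends the array in its default string form, i.e. comma-joined, the parameter arrives as a single value that has to be split. The parse_id_list() helper is hypothetical. If the array is instead sent as a repeated parameter, calling param() in list context, as in the last snippet above, should already return all values.

```perl
use strict;
use warnings;

# Turn "12, 7,,105" (a comma-joined list such as a JavaScript array's
# default string form) into a clean Perl list of IDs.
sub parse_id_list {
    my ($raw) = @_;
    return () unless defined $raw && length $raw;
    return grep { length } map { s/^\s+|\s+$//g; $_ } split /,/, $raw;
}

my @ids = parse_id_list("12, 7,,105");
print scalar(@ids), " ids: @ids\n";
```

Printing the raw parameter value on the server side is the quickest way to find out which of the two cases applies.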


------------------------------

Date: Sat, 06 Sep 2003 00:06:50 +1200
From: mdew <not@home.com>
Subject: regex help
Message-Id: <pan.2003.09.05.12.06.47.20627@home.com>

This isn't specifically a perl question, I'm running squid, and running it
through a regex. The question, I'm trying to filter out some spam/ad sites
using regex, I started with the "Penis Enlarging" websites.

I want to match "penis" and "enlarge" in any order, So far I've got

[(penis)(large)(.*)]
I've also tried [(penis)(large)(.*)]{1,} with no luck. Anyone a regex king
that could help me out? :) 


-- 
"Prayer has no place in Public School, just like facts have no place in
organized religion." --The Simpsons




------------------------------

Date: Fri, 5 Sep 2003 12:08:55 +0000 (UTC)
From: "Bernard El-Hagin" <bernard.el-hagin@DODGE_THISlido-tech.net>
Subject: Re: regex help
Message-Id: <Xns93ED8FDB6DFC5elhber1lidotechnet@62.89.127.66>

mdew wrote:

> This isnt specifically a perl question [...]


So why the fook are you asking it here!?


-- 
Cheers,
Bernard
--
echo 42|perl -pe '$#="Just another Perl hacker,"'



------------------------------

Date: 5 Sep 2003 12:17:38 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: regex help
Message-Id: <bj9ut2$ec1$2@mamenchi.zrz.TU-Berlin.DE>

mdew  <not@home.com> wrote in comp.lang.perl.misc:
> This isnt specifically a perl question, I'm running squid, and running it
> through a regex. The question, I'm trying to filter out some spam/ad sites
> using regex, I started with the "Penis Enlarging" websites.
> 
> I want to match "penis" and "enlarge" in any order, So far I've got
> 
> [(penis)(large)(.*)]
> I've also tried [(penis)(large)(.*)]{1,} with no luck. Anyone a regex king
> that could help me out? :) 

Matching "this" or "that" in any order *can* be done in a single
(Perl-) regex, but it's a nuisance and doesn't scale.  Use an extra
regex for each word.
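
Both options, sketched in Perl with the placeholder words "this" and "that". The has_all() helper is made up for the example, and the lookahead form is Perl syntax that squid's own regex engine may well not accept:

```perl
use strict;
use warnings;

# True if every word occurs somewhere in $text, in any order.
sub has_all {
    my ($text, @words) = @_;
    for my $word (@words) {
        return 0 unless $text =~ /\Q$word\E/i;
    }
    return 1;
}

# Single-regex equivalent for two words, using lookaheads.
my $both = qr/^(?=.*this)(?=.*that)/is;

for my $s ("this and that", "that, then this", "only this") {
    printf "%-16s loop=%d single=%d\n", $s, has_all($s, qw(this that)), ($s =~ $both ? 1 : 0);
}
```

The loop version grows by one list element per extra word; the single regex grows by one lookahead per word and gets harder to read each time.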

Anno


------------------------------

Date: Sat, 06 Sep 2003 00:48:45 +1200
From: mdew <not@home.com>
Subject: Re: regex help
Message-Id: <pan.2003.09.05.12.48.44.223148@home.com>

On Fri, 05 Sep 2003 12:17:38 +0000, Anno Siegel wrote:

> mdew  <not@home.com> wrote in comp.lang.perl.misc:
>> This isnt specifically a perl question, I'm running squid, and running
>> it through a regex. The question, I'm trying to filter out some spam/ad
>> sites using regex, I started with the "Penis Enlarging" websites.
>> 
>> I want to match "penis" and "enlarge" in any order, So far I've got
>> 
>> [(penis)(large)(.*)]
>> I've also tried [(penis)(large)(.*)]{1,} with no luck. Anyone a regex
>> king that could help me out? :)
> 
> Matching "this" or "that" in any order *can* be done in a single (Perl-)
> regex, but it's a nuisance and doesn't scale.  Use an extra regex for
> each word.

I'm testing for 2 possibilities, to prevent legit websites from being
unnecessarily filtered, I'm thinking of *penis*enlarge* and in reverse
*enlarge*penis*. What's the proper regex way of doing this?

-- 
"Prayer has no place in Public School, just like facts have no place in
organized religion." --The Simpsons




------------------------------

Date: Fri, 5 Sep 2003 12:52:56 +0000 (UTC)
From: Vlad Tepes <minceme@start.no>
Subject: Re: regex help
Message-Id: <bja0v8$fa4$1@troll.powertech.no>

mdew <not@home.com> wrote:

> On Fri, 05 Sep 2003 12:17:38 +0000, Anno Siegel wrote:
>> mdew  <not@home.com> wrote in comp.lang.perl.misc:
>>> 
>>> I want to match "penis" and "enlarge" in any order, So far I've got
>>> 
>>> [(penis)(large)(.*)]
>>> I've also tried [(penis)(large)(.*)]{1,} with no luck. Anyone a regex
>>> king that could help me out? :)
>> 
>> Matching "this" or "that" in any order *can* be done in a single (Perl-)
>> regex, but it's a nuisance and doesn't scale.  Use an extra regex for
>> each word.
>
> I'm testing to 2 possibilities, to prevent legit websites from being
> unnecessarily filtered, I'm thinking of *penis*enlarge* and in reverse
> *enlarge*penis*. Whats the proper regex way of doing this?

print "Spam!" if (/penis enlarge/i || /enlarge penis/i );

-- 
Vlad


------------------------------

Date: 5 Sep 2003 12:53:46 GMT
From: sholden@flexal.cs.usyd.edu.au (Sam Holden)
Subject: Re: regex help
Message-Id: <slrnblh1qq.nm7.sholden@flexal.cs.usyd.edu.au>

On Sat, 06 Sep 2003 00:48:45 +1200, mdew <not@home.com> wrote:
> On Fri, 05 Sep 2003 12:17:38 +0000, Anno Siegel wrote:
> 
>> Matching "this" or "that" in any order *can* be done in a single (Perl-)
>> regex, but it's a nuisance and doesn't scale.  Use an extra regex for
>> each word.
> 
> I'm testing to 2 possibilities, to prevent legit websites from being
> unnecessarily filtered, I'm thinking of *penis*enlarge* and in reverse
> *enlarge*penis*. Whats the proper regex way of doing this?

/enlarge/ || /penis/

You're just trying to make us use rude words (like "enlarge") aren't you :)

-- 
Sam Holden



------------------------------

Date: 05 Sep 2003 12:56:37 GMT
From: Abigail <abigail@abigail.nl>
Subject: Re: regex help
Message-Id: <slrnblh205.esq.abigail@alexandra.abigail.nl>

mdew (not@home.com) wrote on MMMDCLVII September MCMXCIII in
<URL:news:pan.2003.09.05.12.06.47.20627@home.com>:
[]  This isnt specifically a perl question,

Then you shouldn't be asking here.

Goodbye.


Abigail
-- 
perl -MTime::JulianDay -lwe'@r=reverse(M=>(0)x99=>CM=>(0)x399=>D=>(0)x99=>CD=>(
0)x299=>C=>(0)x9=>XC=>(0)x39=>L=>(0)x9=>XL=>(0)x29=>X=>IX=>0=>0=>0=>V=>IV=>0=>0
=>I=>$==-2449231+gm_julian_day+time);do{until($=<$#r){$_.=$r[$#r];$=-=$#r}for(;
!$r[--$#r];){}}while$=;$,="\x20";print+$_=>September=>MCMXCIII=>=>=>=>=>=>=>=>'


------------------------------

Date: 5 Sep 2003 13:36:51 GMT
From: sholden@flexal.cs.usyd.edu.au (Sam Holden)
Subject: Re: regex help
Message-Id: <slrnblh4bj.o42.sholden@flexal.cs.usyd.edu.au>

On 5 Sep 2003 12:53:46 GMT, Sam Holden <sholden@flexal.cs.usyd.edu.au> wrote:
> On Sat, 06 Sep 2003 00:48:45 +1200, mdew <not@home.com> wrote:
>> On Fri, 05 Sep 2003 12:17:38 +0000, Anno Siegel wrote:
>> 
>>> Matching "this" or "that" in any order *can* be done in a single (Perl-)
>>> regex, but it's a nuisance and doesn't scale.  Use an extra regex for
>>> each word.
>> 
>> I'm testing to 2 possibilities, to prevent legit websites from being
>> unnecessarily filtered, I'm thinking of *penis*enlarge* and in reverse
>> *enlarge*penis*. Whats the proper regex way of doing this?
> 
> /enlarge/ || /penis/

/enlarge/ && /penis/

Must stop posting late at night :(

-- 
Sam Holden



------------------------------

Date: Sat, 06 Sep 2003 01:37:23 +1200
From: mdew <not@home.com>
Subject: Re: regex help
Message-Id: <pan.2003.09.05.13.37.22.297272@home.com>

On Fri, 05 Sep 2003 12:53:46 +0000, Sam Holden wrote:

> On Sat, 06 Sep 2003 00:48:45 +1200, mdew <not@home.com> wrote:
>> On Fri, 05 Sep 2003 12:17:38 +0000, Anno Siegel wrote:
>> 
>>> Matching "this" or "that" in any order *can* be done in a single (Perl-)
>>> regex, but it's a nuisance and doesn't scale.  Use an extra regex for
>>> each word.
>> 
>> I'm testing to 2 possibilities, to prevent legit websites from being
>> unnecessarily filtered, I'm thinking of *penis*enlarge* and in reverse
>> *enlarge*penis*. Whats the proper regex way of doing this?
> 
> /enlarge/ || /penis/
> 
> You're just trying to make us use rude words (like "enlarge") aren't you :)

I'm no big perl guru, but doesn't || mean "OR", so any URLs with "enlarge"
in the title would get marked as spam? How about,

(enlarge AND penis) OR (penis AND enlarge)

the OR could be ditched, for say to regex, am i looking at this the right
way?


-- 
"Prayer has no place in Public School, just like facts have no place in
organized religion." --The Simpsons




------------------------------

Date: Fri, 5 Sep 2003 12:29:55 +0200
From: "thorsten schau" <hacker@amazon.de>
Subject: regex weirdness
Message-Id: <bj9ojb$gjfvi$1@ID-64752.news.uni-berlin.de>

Hi,

I wonder why the 'yes' print does not go thru.

if($line =~ /\$\{general-foo\s+(\S+)([^}]*)\}/) {
my $thewhole = $&;

print"thewhole: $thewhole\n";
print"yes\n" if $line =~ m/$thewhole/;

}

$& and $thewhole have exactly the matched string from $line.

any idea?

thanks




------------------------------

Date: 05 Sep 2003 10:39:01 GMT
From: Abigail <abigail@abigail.nl>
Subject: Re: regex weirdness
Message-Id: <slrnblgpu5.kg2.abigail@alexandra.abigail.nl>

thorsten schau (hacker@amazon.de) wrote on MMMDCLVII September MCMXCIII
in <URL:news:bj9ojb$gjfvi$1@ID-64752.news.uni-berlin.de>:
``  Hi,
``  
``  I wonder why the 'yes' print does not go thru.
``  
``  if($line =~ /\$\{general-foo\s+(\S+)([^}]*)\}/ {
``  my $thewhole = $&;
``  
``  print"thewhole: $thewhole\n";
``  print"yes\n" if $line =~ m/$thewhole/;
``  
``  }
``  
``  $& and $thwhole has exactly the matched string from $line.
``  


$& contains at least '$', '{', '}', and perhaps more characters
that have a special meaning in a regexp. 

You probably want:

    print "yes\n" if $line =~ /\Q$thewhole/;

or

    print "yes\n" if 0 <= index $line, $thewhole;
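
A small self-contained comparison of the three variants. The sample $line is invented, and a capture group stands in for $& here:

```perl
use strict;
use warnings;

my $line = 'x = ${general-foo bar baz} y';

# Grab the whole match with a capture group instead of $&.
my ($thewhole) = $line =~ /(\$\{general-foo\s+\S+[^}]*\})/
    or die "pattern did not match";

# Interpolated as-is, '$', '{' and '}' are regex syntax; depending on the
# perl version this either fails to match or fails to compile (hence eval).
my $raw     = eval { $line =~ /$thewhole/ } ? 1 : 0;
my $quoted  = $line =~ /\Q$thewhole\E/      ? 1 : 0;   # metacharacters escaped
my $indexed = index($line, $thewhole) >= 0  ? 1 : 0;   # plain substring search

print "raw=$raw quoted=$quoted index=$indexed\n";
```

When no pattern features are needed at all, index() is the simplest of the three.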


Abigail
-- 
perl -swleprint -- -_='Just another Perl Hacker'


------------------------------

Date: Fri, 5 Sep 2003 12:40:23 +0200
From: "thorsten schau" <hacker@amazon.de>
Subject: Re: regex weirdness
Message-Id: <bj9p6t$gb1td$1@ID-64752.news.uni-berlin.de>

ahh, that makes sense. thanks :)


"Abigail" <abigail@abigail.nl> wrote in message
news:slrnblgpu5.kg2.abigail@alexandra.abigail.nl...
> thorsten schau (hacker@amazon.de) wrote on MMMDCLVII September MCMXCIII
> in <URL:news:bj9ojb$gjfvi$1@ID-64752.news.uni-berlin.de>:
> ``  Hi,
> ``
> ``  I wonder why the 'yes' print does not go thru.
> ``
> ``  if($line =~ /\$\{general-foo\s+(\S+)([^}]*)\}/ {
> ``  my $thewhole = $&;
> ``
> ``  print"thewhole: $thewhole\n";
> ``  print"yes\n" if $line =~ m/$thewhole/;
> ``
> ``  }
> ``
> ``  $& and $thwhole has exactly the matched string from $line.
> ``
>
>
> $& contains at least '$', '{', '}', and perhaps more characters
> that have a special meaning in a regexp.
>
> You probably want:
>
>     print "yes\n" if $line =~ /\Q$thewhole/;
>
> or
>
>     print "yes\n" if 0 <= index $line, $thewhole;
>
>
> Abigail
> --
> perl -swleprint -- -_='Just another Perl Hacker'




------------------------------

Date: Fri, 05 Sep 2003 15:14:59 +0100
From: Graham Wood <Graham.T.Wood@oracle.com>
Subject: Telling perl to expect input data to be in UTF8
Message-Id: <3F589A63.C6FE7476@oracle.com>

Hi all,

I've got this friend who's using perl 5.6.1 and DBI to import data that
is in UTF8 and he is unable to display the data successfully after it imports
from the database.  I read in perldoc perlunicode shipped with 5.6.1
that you can't tell perl to expect UTF8 data from an external file or source
but that it would shortly be addressed in a future version.  Can anyone
tell me if this has happened yet, and if so, in which version of perl it
is available?

Thanks

Graham



------------------------------

Date: 5 Sep 2003 03:35:29 -0700
From: tom@ztml.com (Tom)
Subject: Re: View NG with Net::NNTP
Message-Id: <59b4279a.0309050235.11c4e6ff@posting.google.com>

csdude@hotmail.com (Mike) wrote in message news:<46cdc619.0309041834.7a6db561@posting.google.com>...
> 
> You guys are being an incredible help, I really appreciate that.
> Apparently, my package simply doesn't recognize "$fh =
> $nntp->articlefh;", like you said Tom, because changing it worked
> perfectly.
> 
> Last question (I think). I noticed that, in my own script, I didn't
> need to include the last "print;" statement; the statement

Perl assumes a lot if you don't specify something to operate on. In the
example, you can write print $_ or just print, since print defaults to $_.
Leaving the print out altogether also works, as you discovered, because
passing *STDOUT to article makes it print the article itself.

> "$nntp->article($msgid,*STDOUT);" was printing for me. Based on this,
> how would I modify the output? I can't seem to set a variable equal to
> the output, but I need to do things like changing /n to <br>, and I
> was hoping to create a blacklist to take out profanity.
> 
> Can I do something like this (this didn't work, but it describes what
> I'm needing)?
> $nntp->article($msgid,*STDOUT) =~ tr/\n/<br>/;
> 

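$msgid and *STDOUT are optional arguments. Without a filehandle, article
returns a reference to an array of lines instead of printing them, so to
capture the output you can write:
my $lines = $nntp->article($msgid);
my $text  = join '', @$lines;
# do something with $text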
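
As a sketch of the "do something" step, the lines could be joined, checked
against a blacklist, and given <br> tags. The helper name article_to_html,
the word list, and the sample lines are assumptions for illustration. The
sample array stands in for what article returns when no filehandle is
passed, so the snippet runs without a news server.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Net::NNTP.
# $lines is an array reference of article lines; @blacklist is a
# made-up list of words to mask.
sub article_to_html {
    my ($lines, @blacklist) = @_;
    my $text = join '', @$lines;

    # Mask each blacklisted word, case-insensitively, whole words only.
    for my $word (@blacklist) {
        $text =~ s/\b\Q$word\E\b/'*' x length($word)/gie;
    }

    # tr/// maps single characters, so s/// is needed to insert a
    # multi-character tag such as <br>.
    $text =~ s/\n/<br>\n/g;
    return $text;
}

# Stand-in for: my $lines = $nntp->article($msgid);
my $lines = [ "Subject: test\n", "\n", "What a darn mess.\n" ];
my $html  = article_to_html($lines, 'darn');
print $html;
```

Note that this does not escape characters such as < or & in the article
itself, which real HTML output would also need.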

Tom
ztml.com


------------------------------

Date: Sat, 19 Jul 2003 01:59:56 GMT
From: Bob Walton <bwalton@rochester.rr.com>
Subject: Re: 
Message-Id: <3F18A600.3040306@rochester.rr.com>

Ron wrote:

> Tried this code get a server 500 error.
> 
> Anyone know what's wrong with it?
> 
> if $DayName eq "Select a Day" or $RouteName eq "Select A Route") {

(---^


>     dienice("Please use the back button on your browser to fill out the Day
> & Route fields.");
> }
 ...
> Ron

 ...
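
With the missing parenthesis restored after "if", the quoted condition would
look something like the sketch below. dienice is the poster's own routine
and is not shown in the thread, so a stand-in is defined here; the sample
values are assumptions too.

```perl
use strict;
use warnings;

# Stand-in for the poster's dienice(), which is not shown in the thread.
sub dienice {
    my ($message) = @_;
    die "$message\n";
}

# Made-up form values for illustration.
my $DayName   = 'Select a Day';
my $RouteName = 'Route 7';

# Run the check inside eval so the stand-in's die can be inspected.
my $error = '';
eval {
    if ($DayName eq "Select a Day" or $RouteName eq "Select A Route") {
        dienice("Please use the back button on your browser to fill out the Day & Route fields.");
    }
    1;
} or $error = $@;

print "error: $error";
```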
-- 
Bob Walton



------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address; I don't have time to
answer them, even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 5448
***************************************

