[25162] in Perl-Users-Digest


Perl-Users Digest, Issue: 7411 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Nov 16 21:05:46 2004

Date: Tue, 16 Nov 2004 18:05:06 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Tue, 16 Nov 2004     Volume: 10 Number: 7411

Today's topics:
        87676 Mining the Web :Searches with Kriging, Inverse Di <kevansb2b@yahoo.com>
    Re: Advice needed: threads and queues <swartz@inbox.ru>
    Re: Cookies (Tuba Chuck)
        FAQ 9.23: How do I find out my hostname/domainname/IP a <comdog@panix.com>
        First Perl Program (DGP)
    Re: First Perl Program <toreau@gmail.com>
    Re: First Perl Program <tadmc@augustmail.com>
    Re: First Perl Program <gnari@simnet.is>
    Re: First Perl Program <lawshouse.public@btconnect.com>
        NCBI ASN1 File in Perl (SWus)
    Re: NCBI ASN1 File in Perl <jgibson@mail.arc.nasa.gov>
    Re: problems using taint to check an array be created b <phill@mywebstuff.com>
    Re: RegEx challenge - doesn't work (John R)
    Re: RegEx challenge - doesn't work <1usa@llenroc.ude.invalid>
    Re: RegEx challenge - doesn't work <lv@aol.com>
    Re: RegEx challenge - doesn't work <tadmc@augustmail.com>
    Re: RegEx challenge - doesn't work <gnari@simnet.is>
    Re: RegEx challenge - doesn't work <gnari@simnet.is>
    Re: regexp question <toreau@gmail.com>
    Re: regexp question <nnpospamm@front.org>
    Re: regexp question <gnari@simnet.is>
        Search for string and return file name <misc@actitud.NOSPAM.se>
    Re: Search for string and return file name <jgibson@mail.arc.nasa.gov>
    Re: Search for string and return file name <MrReallyVeryNice.REMOVE.NO.SPAM@Yahoo.REMOVE.NO.SPAM.com>
    Re: Time::HiRes module and timing a command... <toreau@gmail.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Tue, 16 Nov 2004 17:01:50 -0500
From: "Web Science" <kevansb2b@yahoo.com>
Subject: 87676 Mining the Web :Searches with Kriging, Inverse Distance Weighting, eigenVectors and Cross-Pollination 87676
Message-Id: <419a85d1$0$31279$9a6e19ea@news.newshosting.com>

Site and Features:  http://www.eigensearch.com

Search engine, eigenMethod, eigenvector, mathematical, manifolds, science, technical, search tools, eigenmath, Jacobian, quantum, mechanics, manifolds, science, physics, chemistry, law, legal, government, home, office, business, domain lookup, medical, travel, food, university students, searching, searchers, surfing, advanced search, search tools

Chemistry, mathematics, physical sciences, engineering, aerospace, astronomy, photography, news, computers, software, investment, venture capital, stakeholder, Biology, Chemistry, Geosciences, Biotechnology, Medical, Nursing, Anthropology, psychology, psychiatry, Philosophy, History, Business, bachelor, Ph.D., Masters, administrative, MBA, eigenMethod, eigenvector, mathematical, manifolds, science, technical, search tools, eigenmath, Jacobian, quantum, mechanics, manifolds, physics, chemistry, law, legal, health, government, home, office, business, domain, lookup, medical, travel, food, university, students, search, searches, search engine, directory, directories, category, categories, help,  searching, searchers, surfing, advanced search, search help, search tips


Beta Users and advanced features Sign-up here...  http://www.eigensearch.com/inc/constructs/betasignup.htm

Central to eigenSearch Advanced is the freedom to construct complex search explorations, save the forms for later use; and apply weight factor to each phrase and term. EigenSearch processing will apply eigenvector math and Jacobian matrices to construct search terms that are tailored to your exploration. Cross-pollination is also applied as described below. The eigenvector approach is clearly highly advanced and would normally be useful for very sophisticated applications. Nevertheless, anyone may utilize the method. An advanced form is simply a matrix in which the user types words and phrases randomly in a multi-cell form (please click thumbnail to view).

Advanced features

EigenOperator (cross pollination) and eigenvector constructs
Cross document content pollination within every web site directory tree (unlike conventional search engines and tools, eigenSearch checks for your terms and phrases and drills down through multiple directory documents)
EigenSearch cross-pollination is applied to documents within the same (tree) level in a URL (peer documents). Thereby limiting the amount of contamination of results
Illustration:
"Blood Hounds" + "English Breed" will present documents that contain either of these phrases within the same peer level in a document storage structure; for example, within the directory: www.smartdogs/hounds.
eigenSearch limits pollinating occurrences outside a peer level. For example; "blood hounds" + "English breed" found in two different directories would not report an eigenSearch result: i.e. "Blood hounds" found in www.smartdogs/hounds and "English Breed" found in. www.smartdogs/hounds/Europe would not be found. EigenSearch therefore searches one tree (peer) level in a site and looks for multiple occurrences of multiple phrases across all documents within this peer level.
Corporate products can be tailored to drill down infinite levels for eigenOperator (cross-pollinating operator) matching.
eigenSearch single phrase results will find all documents and show the results as independent findings. This way the user can find results across many documents and the combined highly constrained results are reserved for a single level cross pollination.
Extremely high (cross-pollinating) eigenValues will correspond to finely granular and refined search explorations.

Beta users receive the following features:
Login and password
Save search constructs for later use in your own personal construct tables
EigenOperator (cross Pollinating Operator) advanced features as described above (eigenvector to follow)
Database (Table) upload and eigenvector computations
EigenSearch seeks 300,000 beta testers for its advanced eigenOperator based cognitive engine. This engine will allow for a multiplicity of search parameters for users to select so as to mathematically narrow results. The system will employ eigenVectors, eigenValues and eigenMatrices to determine relevance to user searches; thereby rendering high fidelity confirmed search results.
Naturally the computational power for doing such math is why beta testers are required. Each tester is welcome to comment on user friendliness, speed, change and ergonomic elegance. It is an eigenSearch goal to continue advancing the user interface so as to remain intuitively simple to use while at the same time providing hi-fidelity explorations.
All beta testers will receive a login and password, which provides entry into features for saving search constructs and parameters according to their own classification approach. Saved results and parameters can be used at any time and modified to alter search results. Beta users will be able to import their own data sets (2-dimensional) and perform an eigenValue analysis.









------------------------------

Date: Tue, 16 Nov 2004 18:20:32 -0600
From: "Swartz" <swartz@inbox.ru>
Subject: Re: Advice needed: threads and queues
Message-Id: <10pl61ce76t529c@corp.supernews.com>


"Ben Morrow" <usenet@morrow.me.uk> wrote
> Quoth "Swartz" <swartz@inbox.ru>:
> >
> > I'm trying to write a small threaded "proof of concept" application.
> > Basic ideas:
> >  - there is a number of songs users can listen to,
> >  - each song can only be accessed by one user at a time
> >  - each user can only listen to one song at a time,
> You don't use lock to state that a song is being played: use it just to
> sync access to variables. You want something more like.
I tried your method, it results in all threads accessing whatever songs they
picked at the same time. I need to allow only one thread per whatever song
at a time.

[... code snipped]

> sub get_song {
>     my $song = shift;
>     lock %songs;
>
>     if ($songs{$song}{locked}) {
>         return;
>     }
>     else {
>         $songs{$song}{locked} = threads->tid;
>         return $songs{$song};
>     }
> }

[... code snipped]

After playing around and incorporating your code with mine...
This sub above seems to always return ($songs{$song}).
In other words test for $songs{$song}{locked} is always false.
What am I doing wrong below?

Here's what I've ended up with. Mind you, I kept it simple for now (not
using complex hashes for songs and play time, etc)... I'm just trying to get
it to work.
------------------------- < begin > ---------------------------
#!/usr/bin/perl
use threads;
use threads::shared;
use Thread::Queue;

# Hash of songs
my %song : shared =
        ("1" => "Song 1",
         "2" => "Song 2",
         "3" => "Song 3",
         "4" => "Song 4",
         "5" => "Song 5",
         "6" => "Song 6",
         "7" => "Song 7",
         "8" => "Song 8");

# spin off 10 threads (a.k.a users)
for ($i = 0; $i < 10 ; $i++) {
        new threads \&login;
}

# Loop through all the threads and 'join' them (credit: perlthrtut)
foreach $thr (threads->list) {
        # Don't join the main thread or ourselves
        if ($thr->tid && !threads::equal($thr, threads->self)) {
            $thr->join;
        }
}


##########################################################
sub login {
        my $ThreadID = threads->self->tid;
        print "Thread ${ThreadID}: started\n";
        # random song from 1 to 8 for testing
        my $song_num = int rand (8)+1;
        if ( request_song($song_num) ) {
                listen_to_song($song_num);
                return_song($song_num);
        }
        else {
                print "Thread ${ThreadID}: Queueing request!\n";
                # to be finished...
        }
} # --------------------------------------------------------

##########################################################
sub request_song {
        my $song_num = shift @_;
        my $ThreadID = threads->self->tid;
        print "Thread ${ThreadID}: requesting song $song_num...\n";
        lock (%song);

        if ($song{$song_num}{locked}) {
                return;
        }
        else {
                $song{$song_num}{locked} = threads->self->tid;
                return $song{$song_num};

        }
} # -------------------------------------------------------- 

###########################################################
sub return_song {
        my $song_num = shift @_;
        my $ThreadID = threads->self->tid;
        lock (%song);

        $song{$song_num}{locked} == threads->tid
            or die "Thread ${ThreadID}: ATTEMPT TO RELEASE SONG THAT IT DIDNT OWN";

        $song{$song_num}{locked} = undef;

} #--------------------------------------------------------

##########################################################
sub listen_to_song {
        my $song_num = shift @_;
        my $song_time = int rand (9)+1;
        my $ThreadID = threads->self->tid;
        print "Thread ${ThreadID}: listening to song $song_num for $song_time seconds\n";
        # pretending to listen to music
        sleep $song_time;
        print "Thread ${ThreadID}: finished playing song $song_num\n";

} # --------------------------------------------------------

---------------------------- < end > ----------------------------------- 


Any thoughts?

Thanks.








------------------------------

Date: 16 Nov 2004 12:18:09 -0800
From: charles.teague@gmail.com (Tuba Chuck)
Subject: Re: Cookies
Message-Id: <7234602.0411161218.48fb0370@posting.google.com>

Tad McClellan <tadmc@augustmail.com> wrote in message news:<slrncpj10i.2ip.tadmc@magna.augustmail.com>...
> Tuba Chuck <charles.teague@gmail.com> wrote:
> 
> > I seem to be having problems with the cookies.
> > script is:
>  
> > #!/usr/bin/perl
> 
>    use warnings;
>    use strict;
> 
> Ask for all the help you can get!
> 
> 
> > $cookie_jar = HTTP::Cookies->new();
> > 
> > $login = LWP::UserAgent->new();
> 
> 
>    $login->cookie_jar($cookie_jar);
> 
> You never associated that cookie jar with the user agent.
> 
> 
> > my $reql = POST 'http://www.pitas.com/cgi-bin/login.phtml', [
> > 'username' => $uname, 'password' => $pass, 'remember_me' => 'no' ];
> > $responsel = $login->request($reql);
>  
> > $addent = LWP::UserAgent->new();
> 
> 
> Why do you need another UserAgent?
> 
> Just reuse $login.

Thanks!


------------------------------

Date: Tue, 16 Nov 2004 23:03:01 +0000 (UTC)
From: PerlFAQ Server <comdog@panix.com>
Subject: FAQ 9.23: How do I find out my hostname/domainname/IP address?
Message-Id: <cne0v5$47t$1@reader1.panix.com>

This message is one of several periodic postings to comp.lang.perl.misc
intended to make it easier for perl programmers to find answers to
common questions. The core of this message represents an excerpt
from the documentation provided with Perl.

--------------------------------------------------------------------

9.23: How do I find out my hostname/domainname/IP address?

    The normal way to find your own hostname is to call the `hostname`
    program. While sometimes expedient, this has some problems, such as not
    knowing whether you've got the canonical name or not. It's one of those
    tradeoffs of convenience versus portability.

    The Sys::Hostname module (part of the standard perl distribution) will
    give you the hostname after which you can find out the IP address
    (assuming you have working DNS) with a gethostbyname() call.

        use Socket;
        use Sys::Hostname;
        my $host = hostname();
        my $addr = inet_ntoa(scalar gethostbyname($host || 'localhost'));

    Probably the simplest way to learn your DNS domain name is to grok it
    out of /etc/resolv.conf, at least under Unix. Of course, this assumes
    several things about your resolv.conf configuration, including that it
    exists.
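
    A minimal sketch of that approach follows. It assumes resolv.conf
    names the domain on a "domain" or "search" line; the helper name
    domain_from_resolv_conf is made up for this illustration and is not
    part of any module.

```perl
use strict;
use warnings;

# Return the first domain named on a "domain" or "search" line,
# or nothing if no such line is present.
sub domain_from_resolv_conf {
    my ($text) = @_;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*[#;]/;          # skip comment lines
        return $1 if $line =~ /^\s*(?:domain|search)\s+(\S+)/;
    }
    return;
}

# Usage: read the whole file, if it exists, and report what was found.
if (open my $fh, '<', '/etc/resolv.conf') {
    my $text = do { local $/; <$fh> };
    $text = '' unless defined $text;
    my $domain = domain_from_resolv_conf($text);
    print defined $domain ? "$domain\n" : "no domain found\n";
    close $fh;
}
```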

    (We still need a good DNS domain name-learning method for non-Unix
    systems.)



--------------------------------------------------------------------

Documents such as this have been called "Answers to Frequently
Asked Questions" or FAQ for short.  They represent an important
part of the Usenet tradition.  They serve to reduce the volume of
redundant traffic on a news group by providing quality answers to
questions that keep coming up.

If you are somehow irritated by seeing these postings you are free
to ignore them or add the sender to your killfile.  If you find
errors or other problems with these postings please send corrections
or comments to the posting email address or to the maintainers as
directed in the perlfaq manual page.

Note that the FAQ text posted by this server may have been modified
from that distributed in the stable Perl release.  It may have been
edited to reflect the additions, changes and corrections provided
by respondents, reviewers, and critics to previous postings of
these FAQ. Complete text of these FAQ are available on request.

The perlfaq manual page contains the following copyright notice.

  AUTHOR AND COPYRIGHT

    Copyright (c) 1997-2002 Tom Christiansen and Nathan
    Torkington, and other contributors as noted. All rights 
    reserved.

This posting is provided in the hope that it will be useful but
does not represent a commitment or contract of any kind on the part
of the contributors, authors or their agents.


------------------------------

Date: 16 Nov 2004 13:04:41 -0800
From: parkerdg@gmail.com (DGP)
Subject: First Perl Program
Message-Id: <8752e64b.0411161304.440a50ec@posting.google.com>

A few weeks ago I learned that our CAD group was spending way too much
effort to extract point locations from our CAD system. I realized the
problem could be solved by writing a program to do some text file
manipulation.

I decided Perl was the best tool for the job. After reading a beginner
Perl book, I was able to complete the program below within a few
hours. I think this speaks to Perl's ease of use, power, and
flexibility.

It seems to work pretty well, but I wanted to get input to see if it
could be improved.

Thanks Dave

#!C:\Perl\bin\perl.exe
# iges2pt.pl
# Extract point locations from IGES file and write to new text file.
use warnings;
use strict;

my $infile;
my $outfile;
my $line;
my @parts;

# Define Files
print "Enter IGES filename: ";
$infile = <STDIN>;
chomp $infile;
$outfile="points.txt";

# Open Files
open (INF, "<$infile") || die "Cannot open $infile for read.\n";
open (OUTF, ">$outfile") || die "Cannot open $outfile for write.\n";

# Process Data
foreach $line (<INF>) {
    # Find lines defining points (Type=116)
    if ($line =~ /^116/) {
        @parts = split (",",$line);
        # IGES format uses 'D' in scientific notation. Replace D with E.
        for (@parts) { s/D/E/ }
        # Write out point x,y,z location.
        print OUTF "@parts[1..3]\n";
    }
}

close INF;
close OUTF;


------------------------------

Date: Tue, 16 Nov 2004 22:49:08 +0100
From: Tore Aursand <toreau@gmail.com>
Subject: Re: First Perl Program
Message-Id: <pan.2004.11.16.21.49.05.511864@gmail.com>

On Tue, 16 Nov 2004 13:04:41 -0800, DGP wrote:
> #!C:\Perl\bin\perl.exe
> # iges2pt.pl
> # Extract point locations from IGES file and write to new text file.
> use warnings;
> use strict;

Excellent!

> my $infile;
> my $outfile;
> my $line;
> my @parts;

Generally, don't declare variables before you need them; declare each one
in the smallest possible scope.

> # Define Files
> print "Enter IGES filename: ";
> $infile = <STDIN>;

How about making it a bit more user-friendly (and making it possible to be
run from another script)?

  my $infile = ( @ARGV ) ? $ARGV[0] : <STDIN>;

> foreach $line (<INF>) {

Personally, I prefer 'while' here (and I almost always 'chomp' the value,
in addition to almost never using an intermediate variable). :-)

  while ( <INF> ) {
      chomp;
      # ...
  }

>         @parts = split (",",$line);

No need to use double quotes when there's nothing to interpolate.

>         for (@parts) { s/D/E/ }

IMO, better written the other way around;

  s/D/E/ for ( @parts );

Rewritten, your script would look something like this (untested):

  #!C:\Perl\bin\perl.exe
  # iges2pt.pl
  # Extract point locations from IGES file and write to new text file.
  #
  use warnings;
  use strict;

  # Define Files
  my $infile = ( @ARGV ) ? $ARGV[0] : <STDIN>;
  chomp( $infile );
  my $outfile= 'points.txt';

  # Open Files
  open( INF, '<', $infile ) or die "Couldn't open '$infile' for reading; $!\n";
  open( OUTF, '>', $outfile ) or die "Couldn't open '$outfile' for writing; $!\n";

  # Process Data
  while ( <INF> ) {
      chomp;
      next unless ( /^116/ );

      my @parts = split( ',', $_ );
      s/D/E/ for ( @parts );

      print OUTF "@parts[1..3]\n";
  }

  # Close files
  close( INF );
  close( OUTF );


-- 
Tore Aursand <toreau@gmail.com>
"Out of missiles. Out of bullets. Down to harsh language." (Unknown)


------------------------------

Date: Tue, 16 Nov 2004 15:47:25 -0600
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: First Perl Program
Message-Id: <slrncpktbd.826.tadmc@magna.augustmail.com>

DGP <parkerdg@gmail.com> wrote:

> It seems to work pretty well, but I wanted to get input to see if it
> could be improved.


> $infile = <STDIN>;
> chomp $infile;


You can combine those into a single statement:

   chomp( my $infile = <STDIN> );


> foreach $line (<INF>) {


The *entire file* must fit in memory, that is wasteful when you
are only going to process a line at a time anyway.

Read a line at a time instead:

   while ( my $line = <INF> ) {


>         @parts = split (",",$line);


A regex should *look like* a regex:

   my @parts = split (/,/, $line);
or
   my @parts = split /,/, $line;
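
A minimal sketch of why it matters: split() always treats its first
argument as a pattern, so a quoted string containing a regex metacharacter
does not split on that literal character. The sample string is made up
for illustration.

```perl
use strict;
use warnings;

my $ip = '192.168.0.1';

# '.' is compiled as the pattern /./, so every character is a separator
# and only empty fields are left (which split then discards).
my @wrong = split '.', $ip;

# Written as a regex, the need to escape the dot is obvious.
my @right = split /\./, $ip;

print scalar(@wrong), " fields vs ", scalar(@right), " fields\n";
```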


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: Tue, 16 Nov 2004 22:13:54 -0000
From: "gnari" <gnari@simnet.is>
Subject: Re: First Perl Program
Message-Id: <cndtu3$78d$1@news.simnet.is>

"DGP" <parkerdg@gmail.com> wrote in message
news:8752e64b.0411161304.440a50ec@posting.google.com...
> A few weeks ago I learned that our CAD group was spending way too much
> effort to extract point locations from our CAD system. I realized the
> problem could be solved by writing a program to do some text file
> manipulation.
>
> I decided Perl was the best tool for the job. After reading a beginner
> Perl book, I was able to complete the program below within a few
> hours. I think this speaks to Perl's ease of use, power, and
> flexibility.
>
> It seems to work pretty well, but I wanted to get input to see if it
> could be improved.
>
> [snip script]

if you want a oneliner:

perl -ne"s/D/E/g;print qq($1 $2 $3\n) if /^116.*?,(.*?),(.*?),([^,]*)/" igesfile >points.txt

notes:
  a) as I do not know the IGES format, I do not know if there can be
     characters between the 116 and first comma. thus the .*?
  b) hopefully there is no type 1160, but then your code would break too.
  c) I am assuming there may be more fields after the 3rd number
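
a sketch that sidesteps note b), assuming the type number is followed
directly by a comma (I have not checked that against the IGES spec; the
sub name point_xyz and the sample line are made up):

```perl
use strict;
use warnings;

# Return the x, y, z fields of a type-116 line, or an empty list otherwise.
# Assumption: the type number is followed directly by a comma.
sub point_xyz {
    my ($line) = @_;
    return unless $line =~ /^116,/;    # a line starting "1160," is rejected
    my @parts = split /,/, $line;
    return unless @parts >= 4;         # need the type plus three coordinates
    s/D/E/ for @parts;                 # 'D' exponent marker becomes 'E'
    return @parts[1..3];
}

print join(' ', point_xyz('116,1.0D0,2.5D1,3.0D-2,0;')), "\n";
```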

gnari





------------------------------

Date: Tue, 16 Nov 2004 23:21:02 +0000
From: Henry Law <lawshouse.public@btconnect.com>
Subject: Re: First Perl Program
Message-Id: <tj2lp0d6pf23t7sr2jvs2dpd3ios3qeuqg@4ax.com>

On 16 Nov 2004 13:04:41 -0800, parkerdg@gmail.com (DGP) wrote:

>I decided Perl was the best tool for the job. After reading a beginner
>Perl book, I was able to complete the program below within a few
>hours. I think this speaks to Perl's ease of use, power, and
>flexibility.

Having learnt some Perl myself in the last year, I think it also
speaks to your own abilities.  I have learned - taught myself - many
programming languages in the last 30 years and I found Perl harder to
pick up than most others (even including APL ;-).  It's not so hard to
write code that more or less does what you want, but to write perlish
code is a knack which only comes after a bit.   You've fallen into
some of the non-perlish (but non-fatal) traps as others have pointed
out, but believe me your program is pretty good as a first attempt.

Lurk here .. it's very instructive.
-- 

Henry Law       <><     Manchester, England 


------------------------------

Date: 16 Nov 2004 15:59:27 -0800
From: swus@lycos.de (SWus)
Subject: NCBI ASN1 File in Perl
Message-Id: <1620f493.0411161559.2410f35b@posting.google.com>

Hi 

I have an ASN1 file like this (4GB large and about 60000 "Entrezgene"
entries)

___________________________________________________

Entrezgene ::= {
  track-info {
    geneid 1489959,
    create-date std {
      year 2004,
      month 9,
      day 2,
      hour 6,
      minute 12,
      second 7
    },
    update-date std {
      year 2004,
      month 9,
      day 2,
      hour 10,
      minute 12,
      second 0
    }
  }, ....
---------------------------------------------------------------------------------------------

I've found in CPAN the "Convert::ASN1" Module but no idea how to use
it.

I need for every entry for example the "geneid" and the "update-date".

How can I make this in Perl?


Thanks SWus   (and sorry for the bad English)


------------------------------

Date: Tue, 16 Nov 2004 17:23:00 -0800
From: Jim Gibson <jgibson@mail.arc.nasa.gov>
Subject: Re: NCBI ASN1 File in Perl
Message-Id: <161120041723001861%jgibson@mail.arc.nasa.gov>

In article <1620f493.0411161559.2410f35b@posting.google.com>, SWus
<swus@lycos.de> wrote:

> Hi 
> 
> I have an ASN1 file like this (4GB large and about 60000 "Entrezgene"
> entries)
> 

[ASN1 entry snipped; see __DATA__ section below]

> I've found in CPAN the "Convert::ASN1" Module but no idea how to use
> it.

I know nothing about ASN1, but it looks like you need to supply an
encoding rule to a Convert::ASN1 object, which you can then use to
encode or decode a set of data.

> 
> I need for every entry for example the "geneid" and the "update-date".
> 
> How can I make this in Perl?

Here is one way in Perl, assuming that all of the entries in your file
look like the example one:

#!/usr/local/bin/perl
#
use warnings;
use strict;

my( $geneid, @date);
while( <DATA> ) {
  if( /Entrezgene/ ) {
    $geneid = '';
    @date = ();
  }elsif( /geneid\s*(\d+)/ ) {
    $geneid = $1;
  }elsif( /update-date/ ) {
    foreach (1..6) {
      my $line = <DATA>;
      if( $line =~ /\w+\s+(\d+)/ ) {
        push( @date, $1);
      }
    }
    print "$geneid: ", join('/',@date), "\n" if
      $geneid && (@date == 6);
  }
}

__DATA__
Entrezgene ::= {
  track-info {
    geneid 1489959,
    create-date std {
      year 2004,
      month 9,
      day 2,
      hour 6,
      minute 12,
      second 7
    },
    update-date std {
      year 2004,
      month 9,
      day 2,
      hour 10,
      minute 12,
      second 0
    }
  }


------------------------------

Date: Wed, 17 Nov 2004 00:05:12 +0100
From: phill hw <phill@mywebstuff.com>
Subject: Re: problems using taint to check an array be created by cgi
Message-Id: <pan.2004.11.16.23.05.10.55129@mywebstuff.com>

Am Mon, 15 Nov 2004 23:43:05 +0000 schrieb A. Sinan Unur:

> phill hw <phill@mywebstuff.com> wrote in
> news:pan.2004.11.15.21.40.33.966125@mywebstuff.com: 
> 
>> Hello Usenet Perl,
>> I have a html form which produces a load of checkboxes. They all have
>> the same name (sports) and if a check box is ticked(checked) it holds
>> a numeric value which represents the id of the sport.:
>> 
>> Pseudo CGI FORM:
>> <FORM>
>> <input type="text" name="s" value="aslakslad1231">
>> <p><input type="checkbox" name="sport" value="1">Football</p>
>> <p><input type="checkbox" name="sport" value="2">Basketball</p>
>> <p><input type="checkbox" name="sport" value="3">Hockey</p>
>> <SUBMIT BUTTON>
>> </FORM>
> 
> May I suggest that this is probably not a good way to set up the form? Life 
> would probably be easier if you had, e.g.
> 
> <p><input type="checkbox" name="sport" value="Hockey">Hockey</p>
> 
>> This works with taint on, but I have to parse the values in the
> 
> You are not parsing them, you are untainting them. The difference is 
> important.
> 
> Here is how I might do it although I would urge you to take it with a grain 
> of salt:
> 
> use strict;
> use warnings;
> 
> use Data::Dumper;
> 
> my %valid = map { $_ => 1 } qw(Basketball Football Hockey);
> my @sports = qw(Basketball Hockey /etc/password);
> @sports = map { $valid{$_} ? ($_) = ( $_ =~ /^(.+)$/ ) : '' } @sports;
> 
> print Dumper \@sports;
> 
> __END__
> 
> All of the code I show below is untested.
> 
>> #!/usr/bin/perl -Tw
> 
> use warnings;
> 
> rather than -w.
> 
>> use CGI qw/:standard :html3/;
> 
> Below, you use only the object interface, so a
> 
> use CGI;
> 
> would suffice here.
> 
>> use CGI::Carp 'fatalsToBrowser';
>> 
>> if (param())
> 
> This is very unnecessary. Just construct your CGI object.
> 
>> {
>>  my ($query) = new CGI;
>>  my ($s) = $query->param('s') =~ /^([\w]+)$/ if $query->param('s');
> 
> A few errors and style issues here. Your code would fail to untaint $s if 
> $query->param('s') returned '0' or '0E0' or '0 but true'. Also, I remember 
> discussions where it was mentioned that 
> 
> my $s = 'something' if some other thing;
> 
> is not really a safe construct.
> 
> You are better off actually spelling out what you are doing:
> 
> my $s = $query->param('s');
> 
> if(defined $s and $s =~ /^(\w+)$/) {
>     $s = $1;
> } else {
>     $s = '';
> }
> 
> \w itself is a character class. I am not sure [\w] is an error, but it is 
> unnecessary and confusing.
> 
>>  my (@sports) = $query->param('sports');
> 
> I'll recommend setting up a hash of acceptable values for sports as above 
> and using the keys to validate the values. If the value is a key in %valid, 
> then you know it is safe to blindly capture it.
> 
>>                 if ($_ =~ /^([\d]+)$/)
> 
> Again, \d is a character class of its own. And, by the way, is 
> 987327332882727273774746655222411313232638482929399949872873498238971232838
> 123847328723984792837482374982374621347326473624723648723648732648726423444
> 23424 a valid value for sports?
> 
> Sinan.


Thanks Sinan
	You have pointed out a few interesting things that I will look
further into and been very helpful. I find it really difficult to get 
clear answers with the taint subject. Most people just write 
"Oh here is a regexp ... blah .. blah but IMHO it still cannot be considered safe". 
Which does not really help at all.


My code: 
my ($s) = $query->param('s') =~ /^([\w]+)$/ if $query->param('s');

is from the perl cookbook http://www.hk8.org/old_web/linux/cgi/ch08_04.htm 
so I consider it safe as I am very restrictive on params. 
I have to admit your method reads more clearly as the action
on the undefined $s is immediately apparent. I usually add a default value
or fail the script shortly after. It depends on the script.


my $s = $query->param('s');

if(defined $s and $s =~ /^(\w+)$/) {
    $s = $1;
} else {
    $s = '';
}

I will test your method:

use Data::Dumper;

my %valid = map { $_ => 1 } qw(Basketball Football Hockey);
my @sports = qw(Basketball Hockey /etc/password);
@sports = map { $valid{$_} ? ($_) = ( $_ =~ /^(.+)$/ ) : '' } @sports;

print Dumper \@sports;

but what concerns me here is that the values are stored in the cgi script
which I find hard to maintain. That is why I only pass numeric values back
because they are then matched to values in a db. Basketball, Football and
Hockey are easy examples.

What about if my form looked like:

Which of the following have you backed up?
<p><input type="checkbox" name="passwd" value="/etc/passwd">etc</p>
<p><input type="checkbox" name="passwd" value="/var/mysql.users">mysql</p>
<p><input type="checkbox" name="passwd" value="/etc/pppoe.secrets">pppoe</p>


>Again, \d is a character class of its own. And, by the way, is 
>987327332882727273774746655222411313232638482929399949872873498238971232838
>123847328723984792837482374982374621347326473624723648723648732648726423444
>23424 a valid value for sports?

Oops, I really do need to check the length! Thanks for pointing that out.
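
A minimal sketch of the kind of check I have in mind, assuming the ids
never run past four digits (that limit and the name untaint_id are made
up for illustration). It uses \A and \z rather than ^ and $ so that a
trailing newline is rejected as well:

```perl
use strict;
use warnings;

# Return the captured (and therefore untainted) id, or undef if the
# value is missing, non-numeric, or longer than four digits.
sub untaint_id {
    my ($value) = @_;
    return undef unless defined $value;
    return $value =~ /\A(\d{1,4})\z/ ? $1 : undef;
}

for my $candidate ('3', '12345', '/etc/passwd') {
    my $id = untaint_id($candidate);
    print "$candidate => ", (defined $id ? "ok" : "rejected"), "\n";
}
```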

Phill




------------------------------

Date: 16 Nov 2004 12:04:47 -0800
From: google@servangle.net (John R)
Subject: Re: RegEx challenge - doesn't work
Message-Id: <100a40e5.0411161204.3a13a9a9@posting.google.com>

"A. Sinan Unur" <1usa@llenroc.ude.invalid> wrote in message news:<Xns95A2D6215D1Casu1cornelledu@132.236.56.8>...
> while(<DATA>) {
>     if( /^dsr:(".+"); height:(\d+); width:(\d+); client:(".+")\s*$/ ) {
>         push @data, {
>             dsr    => $1,
>             height => $2,
>             width  => $3,
>             client => $4,
>         };
>     }
> }

Thanks for the effort, but this will not work either. Your regex is
pattern matching the specific words "dsr" "height" etc... Maybe I
should have mentioned this earlier, the name/value pairs cannot be
predicted. The expected format is:
name:value; name:value; name:"value blah"; name:"valu;e" 

> Yeah whatever ... An almost identical question was posted and answered 
> within the last couple of days. Doesn't anyone lurk and read the FAQ any 
> more?

Give me some more credit here. I read the FAQ. This is a unique
situation.

> What do you mean "resulting in two arrays"?

RegEx option /m like /(\w+):?([^;]+)?;\s?/m for multiple matches:
you'll get an array for each grouping () each time the pattern is
repeatedly matched.

Does anyone understand my question?

<snip>


------------------------------

Date: 16 Nov 2004 20:28:28 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude.invalid>
Subject: Re: RegEx challenge - doesn't work
Message-Id: <Xns95A39D6A281CAasu1cornelledu@132.236.56.8>

google@servangle.net (John R) wrote in
news:100a40e5.0411161204.3a13a9a9@posting.google.com: 

> "A. Sinan Unur" <1usa@llenroc.ude.invalid> wrote in message
> news:<Xns95A2D6215D1Casu1cornelledu@132.236.56.8>... 
>> while(<DATA>) {
>>     if( /^dsr:(".+"); height:(\d+); width:(\d+); client:(".+")\s*$/ )
>>     { 
>>         push @data, {
>>             dsr    => $1,
>>             height => $2,
>>             width  => $3,
>>             client => $4,
>>         };
>>     }
>> }
> 
> Thanks for the effort, but this will not work either. Your regex is
> pattern matching the specific words "dsr" "height" etc... Maybe I
> should have mentioned this earlier, the name/value pairs cannot be
> predicted. The expected format is:
> name:value; name:value; name:"value blah"; name:"valu;e" 

The quality of the help you get is in direct proportion to the amount of 
effort you put into formulating your question. Your question contained an 
incomplete description of the problem. 

Please see http://www.catb.org/~esr/faqs/smart-questions.html

>> Yeah whatever ... An almost identical question was posted and
>> answered withing the last couple of days. Doesn't anyone lurk and
>> read the FAQ any more?
> 
> Give me some more credit here. I read the FAQ. This is a unique
> situation.

Your situation is not unique. You have not read the FAQ.
> 
>> What do you mean "resulting in two arrays"?
> 
> RegEx option /m like /(\w+):?([^;]+)?;\s?/m for multiple matches
> you'll get an array foreach grouping () eachtime the pattern is
> repeatedly matched.

Each time you capture matches, the match in list context returns a list 
of matches. There are no arrays involved. Again, it is time to hit the 
FAQ list.

> Does anyone understand my question?

Oh yes. And your question has been answered completely by the FAQ 
reference I posted.

Sinan


------------------------------

Date: Tue, 16 Nov 2004 15:04:22 -0600
From: "l v" <lv@aol.com>
Subject: Re: RegEx challenge - doesn't work
Message-Id: <419a68fa$1_4@127.0.0.1>


"John R" <google@servangle.net> wrote in message
news:100a40e5.0411161204.3a13a9a9@posting.google.com...
> "A. Sinan Unur" <1usa@llenroc.ude.invalid> wrote in message
news:<Xns95A2D6215D1Casu1cornelledu@132.236.56.8>...
> > while(<DATA>) {
> >     if( /^dsr:(".+"); height:(\d+); width:(\d+); client:(".+")\s*$/ ) {
> >         push @data, {
> >             dsr    => $1,
> >             height => $2,
> >             width  => $3,
> >             client => $4,
> >         };
> >     }
> > }
>
> Thanks for the effort, but this will not work either. Your regex is
> pattern matching the specific words "dsr" "height" etc... Maybe I
> should have mentioned this earlier, the name/value pairs cannot be
> predicted. The expected format is:
> name:value; name:value; name:"value blah"; name:"valu;e"
>
> > Yeah whatever ... An almost identical question was posted and answered
> > withing the last couple of days. Doesn't anyone lurk and read the FAQ
any
> > more?
>
> Give me some more credit here. I read the FAQ. This is a unique
> situation.
>
> > What do you mean "resulting in two arrays"?
>
> RegEx option /m like /(\w+):?([^;]+)?;\s?/m for multiple matches
> you'll get an array foreach grouping () eachtime the pattern is
> repeatedly matched.
>
> Does anyone understand my question?
>
> <snip>

Try splitting instead of a regex:
my @groups = split(/; /, $line);
for my $group (@groups) {
  # a limit of 2 keeps any ':' inside the value intact
  my ($key, $value) = split(/:/, $group, 2);
  print "$key -> $value\n";
}

Len







------------------------------

Date: Tue, 16 Nov 2004 15:43:07 -0600
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: RegEx challenge - doesn't work
Message-Id: <slrncpkt3b.826.tadmc@magna.augustmail.com>

John R <google@servangle.net> wrote:


> RegEx option /m like /(\w+):?([^;]+)?;\s?/m for multiple matches
> you'll get an array foreach grouping () eachtime the pattern is
> repeatedly matched.


m//m has absolutely nothing to do with multiple matches.

m//m changes the meaning for only the ^ and $ anchor, 
it *does nothing* for your pattern above because your
pattern does not contain ^ or $ anchors.

m// in *list context* returns a list of all the matching memories.
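
The distinction above can be seen in a small sketch. The sub names and the sample strings are made up for illustration; only the m//, /m and /g behaviour comes from Perl itself.

```perl
use strict;
use warnings;

# Without /g, list context yields the captures of the first match only.
# The /m here is a no-op because the pattern has no ^ or $.
sub first_pair { my ($s) = @_; return $s =~ /(\w+):([^;]+);/m }

# With /g, list context yields the captures of every match, flattened.
sub all_pairs { my ($s) = @_; return $s =~ /(\w+):([^;]+);/g }

# /m matters once an anchor is present: ^ also matches after each newline.
sub line_starts { my ($s) = @_; return $s =~ /^(\w+)/mg }

my $line = 'dsr:"a b"; height:10; width:20;';
print join(' | ', first_pair($line)), "\n";
print join(' | ', all_pairs($line)),  "\n";
print join(' | ', line_starts("one\ntwo\nthree")), "\n";
```

So it is /g, not /m, that makes a pattern match repeatedly, and what comes back is one flat list rather than one array per group.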


> Does anyone understand my question?


It sounds just like this question:

       How can I split a [character] delimited string except when inside
       [character]? (Comma-separated files)

Do you understand the answer given along with it?


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: Tue, 16 Nov 2004 21:51:30 -0000
From: "gnari" <gnari@simnet.is>
Subject: Re: RegEx challenge - doesn't work
Message-Id: <cndsk4$71k$1@news.simnet.is>

"John R" <google@servangle.net> wrote in message
news:100a40e5.0411161204.3a13a9a9@posting.google.com...
> "A. Sinan Unur" <1usa@llenroc.ude.invalid> wrote in message
news:<Xns95A2D6215D1Casu1cornelledu@132.236.56.8>...
> >[snip nice solution based on original description]
>
> Thanks for the effort, but this will not work either. Your regex is
> pattern matching the specific words "dsr" "height" etc... Maybe I
> should have mentioned this earlier, the name/value pairs cannot be
> predicted. The expected format is:
> name:value; name:value; name:"value blah"; name:"valu;e"
>
> [snip]
>
> RegEx option /m like /(\w+):?([^;]+)?;\s?/m for multiple matches

actually, your regex seems to imply more than your 'expected format' above.
is the colon optional ?
is the value itself optional ?
doesn't the last name-value pair require a terminating semicolon ?

>
> Does anyone understand my question?

don't be so arrogant when you are asking for help.

maybe you want something like: /(\w+):("[^"]+"|[^;]+);\s?/m
but in any case read the faq entry you have been referred to.
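
For what it is worth, here is a minimal sketch of a quote-aware parser for the 'expected format' quoted above. It is a hand-rolled illustration, not the FAQ's answer: parse_pairs is a hypothetical name, and the sketch assumes that names are \w+, that quoted values contain no embedded double quotes, and that the last pair may omit its semicolon.

```perl
use strict;
use warnings;

# Hypothetical helper: turn 'name:value; name:"quoted; value"' into a list
# of [name, value] pairs. Pairs are kept in order because names may repeat.
sub parse_pairs {
    my ($line) = @_;
    my @pairs;
    # \G continues where the previous match ended; a value is either a
    # double-quoted string (which may contain ';') or a run without ';' or '"'.
    while ( $line =~ /\G\s*(\w+):("[^"]*"|[^;"]*)\s*(?:;|\z)/gc ) {
        my ( $name, $value ) = ( $1, $2 );
        # Strip surrounding quotes; otherwise trim trailing whitespace.
        unless ( $value =~ s/^"(.*)"\z/$1/s ) {
            $value =~ s/\s+\z//;
        }
        push @pairs, [ $name, $value ];
    }
    return @pairs;
}

my $sample = 'name:value; name:value; name:"value blah"; name:"valu;e" ';
print "$_->[0] -> $_->[1]\n" for parse_pairs($sample);
```

Parsing stops at the first piece that does not fit the assumed format, so malformed input yields a short list rather than an error.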

gnari





------------------------------

Date: Tue, 16 Nov 2004 22:17:40 -0000
From: "gnari" <gnari@simnet.is>
Subject: Re: RegEx challenge - doesn't work
Message-Id: <cndu56$7a6$1@news.simnet.is>

"gnari" <gnari@simnet.is> wrote in message
news:cndsk4$71k$1@news.simnet.is...
> "John R" <google@servangle.net> wrote in message
> news:100a40e5.0411161204.3a13a9a9@posting.google.com...
> > [stuff]
> maybe you want something like: /(\w+):("[^"]+"|[^;]+);\s?/m

aargh. the /m was mindlessly copied from John's code.

serves me right to reply to posts that do not really need it.

gnari





------------------------------

Date: Tue, 16 Nov 2004 20:14:52 +0100
From: Tore Aursand <toreau@gmail.com>
Subject: Re: regexp question
Message-Id: <pan.2004.11.16.19.14.51.820445@gmail.com>

On Tue, 16 Nov 2004 18:56:05 +0000, Patty Reynolds wrote:
>>> I'm have the following problem: I have pair of numbers like 111.1234
>>> and 111.1235. But I want to display that than as 111.1234|5 (to save
>>> space on the screen).
>>> Other examples:
>>>
>>> 111.98 and 114.0 -> 111.98|4.0
>>> 1.2345 and 1.2945 -> 1.2345|945

>> If you're only dealing with two numbers at a time, you should store the
>> difference between the numbers instead

> but with the difference he has still to deal with other problems: e.g
> 111.98 and 114.0
> would be 111.98|2.02 and not 111.98|4.0

I know that - I just wanted to give him another solution to his "space
problem".

As long as one knows the difference between two numbers, and knowing one
of the numbers, you can always calculate the other number.  The same can
be done with more than two numbers, ie.:

  111.98|2.02|-3.2 ... etc ...

First, calculate the sum of the first two numbers, then proceed to the
next "sequence" and so on.  No need for a regular expression, and it is
easier to maintain (especially if you some day _need_ to handle more than
two numbers).
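
A rough sketch of that idea, with hypothetical helper names (to_deltas, from_deltas) and made-up sample values. Floating-point subtraction is inexact, so the output is rounded with sprintf:

```perl
use strict;
use warnings;

# Hypothetical helper: first value, then each difference to the previous one.
sub to_deltas {
    my @nums = @_;
    return () unless @nums;
    my @out = ( $nums[0] );
    push @out, $nums[$_] - $nums[ $_ - 1 ] for 1 .. $#nums;
    return @out;
}

# Hypothetical helper: rebuild the values by keeping a running sum.
sub from_deltas {
    my @deltas = @_;
    return () unless @deltas;
    my @out = ( shift @deltas );
    push @out, $out[-1] + $_ for @deltas;
    return @out;
}

my @stored = to_deltas( 111.98, 114.0, 110.8 );
print join( '|', map { sprintf '%.2f', $_ } @stored ), "\n";
print join( ' ', map { sprintf '%.2f', $_ } from_deltas(@stored) ), "\n";
```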


-- 
Tore Aursand <toreau@gmail.com>
"It's not so much what you have to learn if you accept weird theories,
 it's what you have to unlearn." (Isaac Asimov)


------------------------------

Date: Tue, 16 Nov 2004 20:24:45 GMT
From: "Sam" <nnpospamm@front.org>
Subject: Re: regexp question
Message-Id: <hmtmd.383$CK.30@twister.nyroc.rr.com>

> On Tue, 16 Nov 2004 18:56:05 +0000, Patty Reynolds wrote:
> >>> I'm have the following problem: I have pair of numbers like 111.1234
> >>> and 111.1235. But I want to display that than as 111.1234|5 (to save
> >>> space on the screen).
> >>> Other examples:
> >>>
> >>> 111.98 and 114.0 -> 111.98|4.0
> >>> 1.2345 and 1.2945 -> 1.2345|945
>
> >> If you're only dealing with two numbers at a time, you should store the
> >> difference between the numbers instead
>
> > but with the difference he has still to deal with other problems: e.g
> > 111.98 and 114.0
> > would be 111.98|2.02 and not 111.98|4.0
>
> I know that - I just wanted to give him another solution to his "space
> problem".
>
> As long as one knows the difference between two numbers, and knowing one
> of the numbers, you can always calculate the other number.  The same can
> be done with more than two numbers, ie.:
>
>   111.98|2.02|-3.2 ... etc ...
>
> First, calculate the sum of the two first numbers, then proceed one to the
> next "sequence" and so on.  No need for a regular expression, and it is
> easier to maintain (especially if you some day _need_ to handle more than
> two numbers).

sorry for not being clear enough - I want to save these values in a column
(therefore the space limitations) and I would like to have it in a format like
xx.xxx|yy so that you can see the values and not only the difference




------------------------------

Date: Tue, 16 Nov 2004 21:58:09 -0000
From: "gnari" <gnari@simnet.is>
Subject: Re: regexp question
Message-Id: <cndt0j$72g$1@news.simnet.is>

"Sam" <nnpospamm@front.org> wrote in message
news:mDpmd.61$CK.1@twister.nyroc.rr.com...
> Hello,
>
> I'm have the following problem: I have pair of numbers like 111.1234 and
> 111.1235. But I want to display that than as 111.1234|5 (to save space on
> the screen).
> Other examples:
>
> 111.98 and 114.0 -> 111.98|4.0
> 1.2345 and 1.2945 -> 1.2345|945

how can this work ?
given 1.2345|945 , how do you differentiate between the pairs:
  1.2345 and 1.234945 -> 1.2345|945
  1.2345 and 1.23945 -> 1.2345|945
  1.2345 and 1.2945 -> 1.2345|945
  1.2345 and 1.945 -> 1.2345|945
  1.2345 and 1945 -> 1.2345|945
  1.2345 and 945 -> 1.2345|945
?
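
To make the ambiguity concrete, here is a sketch of one possible reading of the requested format: drop the leading characters the second number shares with the first. That rule is an assumption inferred from the three examples, and compress_pair is a hypothetical name. The numbers are passed as strings so that a trailing ".0" survives.

```perl
use strict;
use warnings;

# Hypothetical helper: "first|rest", where rest is the second number with
# the common leading characters removed (assumed rule, see above).
sub compress_pair {
    my ( $first, $second ) = @_;
    my $i = 0;
    $i++ while $i < length($first)
        && $i < length($second)
        && substr( $first, $i, 1 ) eq substr( $second, $i, 1 );
    return $first . '|' . substr( $second, $i );
}

print compress_pair( '111.1234', '111.1235' ), "\n";
print compress_pair( '111.98',   '114.0' ),    "\n";

# Several different second numbers collapse to the same encoded string,
# so the original pair cannot be recovered from the output alone.
print compress_pair( '1.2345', $_ ), "\n" for qw(1.2945 1.23945 1.945 945);
```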

gnari






------------------------------

Date: Wed, 17 Nov 2004 00:13:57 GMT
From: "Örjan Johansson" <misc@actitud.NOSPAM.se>
Subject: Search for string and return file name
Message-Id: <9Jwmd.122063$dP1.422386@newsc.telia.net>

Hi all!

I'm trying to figure out how to go through all files with a certain 
extension in a directory, search for a string and return the names of the 
files that contains the string. I have written scripts before that opens a 
single file for reading and writing, but have no clue how to go through all 
files in a folder, like a wildcard  *.log kind of thing. Any pointers?

TIA,
Örjan




------------------------------

Date: Tue, 16 Nov 2004 17:34:30 -0800
From: Jim Gibson <jgibson@mail.arc.nasa.gov>
Subject: Re: Search for string and return file name
Message-Id: <161120041734303274%jgibson@mail.arc.nasa.gov>

In article <9Jwmd.122063$dP1.422386@newsc.telia.net>, Örjan Johansson
<misc@actitud.NOSPAM.se> wrote:

> Hi all!
> 
> I'm trying to figure out how to go through all files with a certain 
> extension in a directory, search for a string and return the names of the 
> files that contains the string. I have written scripts before that opens a 
> single file for reading and writing, but have no clue how to go through all 
> files in a folder, like a wildcard  *.log kind of thing. Any pointers?

See 'perldoc -f glob' or check out the Find::File module.

Simply put, you can use a loop of the type

   for( <*.ext> ) {
      print "file is $_\n";
   }

to iterate over all of the files with extension 'ext' in the current
directory.
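
Building on that loop, a minimal sketch of the whole task might look like the following. The sub name files_containing, the '*.log' pattern and the search string 'ERROR' are placeholders. The search is a plain substring test via index, not a regex, and glob splits its argument on whitespace, so paths containing spaces would need extra care.

```perl
use strict;
use warnings;

# Hypothetical helper: names of the files matching the glob $pattern
# (e.g. '*.log') that contain the literal string $needle.
sub files_containing {
    my ( $pattern, $needle ) = @_;
    my @hits;
    for my $file ( glob $pattern ) {
        next unless -f $file;    # skip directories and other non-files
        my $fh;
        unless ( open $fh, '<', $file ) {
            warn "Cannot open $file: $!\n";
            next;
        }
        while ( my $line = <$fh> ) {
            if ( index( $line, $needle ) >= 0 ) {
                push @hits, $file;
                last;            # one hit is enough for this file
            }
        }
        close $fh;
    }
    return @hits;
}

print "$_\n" for files_containing( '*.log', 'ERROR' );
```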


------------------------------

Date: Tue, 16 Nov 2004 17:44:26 -0800
From: MrReallyVeryNice <MrReallyVeryNice.REMOVE.NO.SPAM@Yahoo.REMOVE.NO.SPAM.com>
Subject: Re: Search for string and return file name
Message-Id: <HtidnVJmTdhXMQfcRVn-sg@comcast.com>

Jim Gibson wrote:
> In article <9Jwmd.122063$dP1.422386@newsc.telia.net>, Örjan Johansson
> <misc@actitud.NOSPAM.se> wrote:
> 
> 
>>Hi all!
>>
>>I'm trying to figure out how to go through all files with a certain 
>>extension in a directory, search for a string and return the names of the 
>>files that contains the string. I have written scripts before that opens a 
>>single file for reading and writing, but have no clue how to go through all 
>>files in a folder, like a wildcard  *.log kind of thing. Any pointers?
> 
> 
> See 'perldoc -f glob' or check out the Find::File module.
Might it be File::Find instead of Find::File?
> 
> Simply put, you can use a loop of the type
> 
>    for( <*.ext> ) {
>       print "file is $_\n";
>    }
> 
> to iterate over all of the files with extension 'ext' in the current
> directory.


------------------------------

Date: Tue, 16 Nov 2004 20:16:11 +0100
From: Tore Aursand <toreau@gmail.com>
Subject: Re: Time::HiRes module and timing a command...
Message-Id: <pan.2004.11.16.19.16.11.834442@gmail.com>

On Tue, 16 Nov 2004 09:54:08 -0800, Adam wrote:
> I want to write a small perl program to test the performance of
> ClearCase. [...]

Use the 'Benchmark' module.  It comes standard with Perl.
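
A minimal sketch of both approaches, assuming the thing being timed is an external command. The command below is only a placeholder (the running perl doing nothing), and time_command is a hypothetical helper name. Time::HiRes gives sub-second wall-clock time for a single run, while Benchmark's timeit reports totals over several runs.

```perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);
use Time::HiRes qw(gettimeofday tv_interval);

# Placeholder command: substitute the real one (e.g. a ClearCase call).
my @cmd = ( $^X, '-e', '1' );

# Hypothetical helper: wall-clock seconds taken by one run of the command.
sub time_command {
    my @command = @_;
    my $start = [gettimeofday];
    system(@command) == 0 or warn "command failed: $?\n";
    return tv_interval($start);
}

printf "one run: %.4f s wall clock\n", time_command(@cmd);

# Run the command 5 times and let Benchmark format the timing totals.
my $t = timeit( 5, sub { system(@cmd) } );
print "5 runs: ", timestr($t), "\n";
```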


-- 
Tore Aursand <toreau@gmail.com>
"What we do is never understood, but only praised and blamed."
 (Friedrich Nietzsche)


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 7411
***************************************
