[17671] in Perl-Users-Digest


Perl-Users Digest, Issue: 5091 Volume: 9

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Dec 12 14:17:40 2000

Date: Tue, 12 Dec 2000 11:15:26 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <976648526-v9-i5091@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Tue, 12 Dec 2000     Volume: 9 Number: 5091

Today's topics:
        Perl 5.6, Regexp, performance problems hume.spamfilter@bofh.halifax.ns.ca
    Re: Perl 5.6, Regexp, performance problems (Logan Shaw)
    Re: Posting Guidelines for comp.lang.perl.misc ($Revisi (Rafael Garcia-Suarez)
    Re: Posting Guidelines for comp.lang.perl.misc ($Revisi (Eric Bohlman)
    Re: Posting Guidelines for comp.lang.perl.misc ($Revisi <jeff@vpservices.com>
    Re: Posting Guidelines for comp.lang.perl.misc ($Revisi (Jon Bell)
    Re: problem with script (Abigail)
    Re: Profanity Check (OT) (Mark-Jason Dominus)
        quick way to search array members <chris_beaudette@my-deja.com>
    Re: quick way to search array members <tony_curtis32@yahoo.com>
    Re: quick way to search array members <mischief@velma.motion.net>
        Search and Display - Need assistance msalerno@my-deja.com
        Setting timeout with IO::Socket (Dave Sherohman)
        sighandler, and strict (Stan Brown)
    Re: sighandler, and strict nobull@mail.com
    Re: sighandler, and strict <bart.lateur@skynet.be>
    Re: single quote hell <tmartin@tlmartin.com>
    Re: sorting an array of hashes <hayes@sympatico.ca>
    Re: srand() for CGI (Richard Zilavec)
    Re: srand() for CGI <sluppy@my-deja.com>
    Re: srand() for CGI <bart.lateur@skynet.be>
    Re: Testing deliverability with Perl and Sendmail -bv (Monte Phillips)
    Re: Testing deliverability with Perl and Sendmail -bv <elijah@workspot.net>
        unicode question <dwl@slip.net>
        what am i doing wrong? <a.v.a@home.nl>
    Re: what am i doing wrong? (Rafael Garcia-Suarez)
        Win32 --> ppm , Linux --> ??? <cheesy@upb.de>
    Re: Win32 --> ppm , Linux --> ??? <jschauma@netmeister.org>
    Re: Win32 --> ppm , Linux --> ??? <mischief@velma.motion.net>
        Digest Administrivia (Last modified: 16 Sep 99) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Tue, 12 Dec 2000 14:33:21 +0000 (UTC)
From: hume.spamfilter@bofh.halifax.ns.ca
Subject: Perl 5.6, Regexp, performance problems
Message-Id: <915cvh$5t9$1@News.Dal.Ca>

Back with 5.004, I'd written a script to analyze the daily logs on our 
servers.  "Analysis" meant basically looking for and counting patterns it
recognized, and flagging things it didn't.  It was extremely useful, if not
real-time.

The logs in question amount to up and over a million lines per day.  Running
regexps against each of these lines took many hours, up until I installed
the Regexp module... then, I could simply store the regexps permanently in
compiled form for the length of the process.  What took several hours was
reduced to less than one.

Unfortunately, the Regexp module doesn't appear to have been updated for 5.6.
I thought I could simply use the qr// syntax, but even that doesn't seem much
of a performance gain.

My code does something like this:

	load config file
	for every regexp in config file
		store qr/$pattern/ in object
	end for
	for every line in the file
		for every regexp configured for this file
			if line =~ object->pattern
				count, next line
			end if
		end for

		flag unmatched line
	end for

The code does contain provisions for logical structuring... like, if the line
starts with "named", only run named-related patterns against it.  But still,
it's just the sheer volume of data.  An UltraSPARC 1 with Perl 5.5 and Regexp
can process the same volume of data faster than an E220 and 5.6.

What are my options?  When doing the matches, I:
	- Never bother trying to get () values, $1 and so on...
	- Never need interpolation.
	- Need to go as fast as possible.
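
A minimal sketch of the loop described above, assuming the patterns arrive as plain strings. The pattern list, the sample lines, and the name classify_lines are hypothetical stand-ins, not taken from the actual script. Each pattern is compiled once with qr// and then matched directly, so nothing should need recompiling inside the per-line loop.

```perl
use strict;
use warnings;

# Hypothetical patterns standing in for the config file contents.
my @pattern_strings = (
    '^named\[\d+\]: ',
    '^sshd\[\d+\]: Accepted ',
);

# Compile each pattern once, up front, with qr//.
my @compiled = map { qr/$_/ } @pattern_strings;

# Count matches per pattern and collect the lines nothing matched.
sub classify_lines {
    my ($patterns, $lines) = @_;
    my (%count, @unmatched);
    LINE: for my $line (@$lines) {
        for my $i (0 .. $#$patterns) {
            if ($line =~ $patterns->[$i]) {
                $count{$i}++;
                next LINE;      # first match wins; move to the next line
            }
        }
        push @unmatched, $line; # flag lines no pattern recognised
    }
    return (\%count, \@unmatched);
}

my @sample = (
    'named[123]: zone loaded',
    'sshd[456]: Accepted password for hume',
    'something unexpected',
);
my ($count, $unmatched) = classify_lines(\@compiled, \@sample);
print "pattern $_: $count->{$_}\n" for sort keys %$count;
print "unmatched: $_\n" for @$unmatched;
```

Ordering the pattern list so that the most frequently matching patterns come first may also help, since the inner loop stops at the first match.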

-- 
Brandon Hume    - hume -> BOFH.Halifax.NS.Ca, http://WWW.BOFH.Halifax.NS.Ca/
                       -> Solaris Snob and general NOCMonkey


------------------------------

Date: 12 Dec 2000 12:51:50 -0600
From: logan@cs.utexas.edu (Logan Shaw)
Subject: Re: Perl 5.6, Regexp, performance problems
Message-Id: <915s46$mn4$1@boomer.cs.utexas.edu>

In article <915cvh$5t9$1@News.Dal.Ca>,
 <hume.spamfilter@bofh.halifax.ns.ca> wrote:
>The logs in question amount to up and over a million lines per day.  Running
>regexps against each of these lines took many hours, up until I installed
>the Regexp module... then, I could simply store the regexps permanently in
>compiled form for the length of the process.  What took several hours was
>reduced to less than one.
>
>Unfortunately, the Regexp module doesn't appear to have been updated for 5.6.
>I thought I could simply use the qr// syntax, but even that doesn't seem much
>of a performance gain.
>
>My code does something like this:
>
>	load config file
>	for every regexp in config file
>		store qr/$pattern/ in object
>	end for
>	for every line in the file
>		for every regexp configured for this file
>			if line =~ object->pattern
>				count, next line
>			end if
>		end for
>
>		flag unmatched line
>	end for

What you're doing sounds like about the best way to structure such a
thing, assuming you're going to use a bunch of regular expressions.

Probably the best thing you could do at this point is to go through
the regular expressions and see if there is any way you can optimize
them.  It can take exponential time to match certain regular
expressions, and linear time for others.  Obviously, you want to make
sure yours are more on the linear end of things than the exponential.

If you're not too familiar with the internals of how regular
expressions work, you might want to read http://perl.plover.com/Regex/ .
This doesn't exactly describe how Perl regular expressions work, but
it can give you an idea of how things work in general and where the
degenerate cases lie.

A good rule of thumb for making regular expressions fast is that you
want to make the regular expression as specific as is reasonable.
That is, you want to give the engine that runs the regular
expressions as much ammunition as possible to help it in the war of
eliminating possible ways of matching your string early on.

Here's an example.  Imagine you have this string:

	1 2 3 4 foo 567 bar

and you want to match the "567", which you know comes before the "bar".

You could match it with

	/\d+\s+bar/

and this would work, but the engine would see the "1" and would get
sidetracked considering the possibility that this matches.  It
doesn't, and the engine would correctly detect this, but only after
doing some work.  The same thing would happen starting at "2", "3",
and "4", and finally it would find the real match at "567".

If, on the other hand, you use

	/foo\s+\d+\s+bar/

then the engine has enough information to know not to even consider
the possibility that those other numbers start the match.
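
As a small illustration of the two patterns above (the capture groups are added here only so the result can be printed), both find the same number; the difference lies in how many false starts the engine may have to consider on the way.

```perl
use strict;
use warnings;

my $line = "1 2 3 4 foo 567 bar";

# Loose pattern: any run of digits followed by "bar".
my ($loose) = $line =~ /(\d+)\s+bar/;

# More specific pattern: the digits must follow "foo".
my ($specific) = $line =~ /foo\s+(\d+)\s+bar/;

print "loose: $loose, specific: $specific\n";
```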

There are zillions of other ways that similar kinds of regular
expression speedups can be accomplished, so I won't even attempt to
cover them all.  (I'm sure I don't even know what they all are.)

I'm sure there are other ways to do your task that are probably even
faster.  A nice unified grammar for each line could really make a
difference.  It might be a headache to do this, but you might get very
good results using an LALR parser against each line.  If you have a
single carefully-built grammar that understands all the possibilities
for each line, I think you should be able to scan the log file very
efficiently.  If you're interested in pursuing that, you might either
use yacc (i.e. don't use Perl at all) or use the Perl equivalent,
Parse::Yapp.  (See http://search.cpan.org/search?dist=Parse-Yapp .)

  - Logan


------------------------------

Date: Tue, 12 Dec 2000 14:14:35 GMT
From: rgarciasuarez@free.fr (Rafael Garcia-Suarez)
Subject: Re: Posting Guidelines for comp.lang.perl.misc ($Revision: 0.1 $)
Message-Id: <slrn93ccoa.5kf.rgarciasuarez@rafael.kazibao.net>

It may be useful to include some remarks about error and warning
messages: how useful they are and where to find them.

Tad McClellan wrote in comp.lang.perl.misc:
> 
> You can look up any of the messages that perl might issue to
> find out what the message means and how to resolve the
> potential mistake (perldoc perldiag).

"Note that CGI programs issue their error messages in the webserver's
error logfile."

> =head2 Social faux pax to avoid
                       ^
Typo:           faux pas

> =item Beware of saying "doesn't work"
> 
> This is a "red flag" phrase. If you find yourself writing that,
> pause and see if you can't describe what is not working without
> saying "doesn't work". That is, describe how it is not what you want.

"And if your program has produced some error messages, include them in
your post."

-- 
# Rafael Garcia-Suarez / http://rgarciasuarez.free.fr/


------------------------------

Date: 12 Dec 2000 15:30:43 GMT
From: ebohlman@omsdev.com (Eric Bohlman)
Subject: Re: Posting Guidelines for comp.lang.perl.misc ($Revision: 0.1 $)
Message-Id: <915gb3$bbg$1@bob.news.rcn.net>

Tad McClellan <tadmc@metronet.com> wrote:
> If you have no idea at all of how to code up your situation, be
> sure to at least include the 2 things that you I<do> know: input and
> desired output.

Describe precisely what your input should look like.  Providing an example
is helpful, but isn't enough.  Nobody here is psychic; if you say the
input should "look like this" you run the risk that readers a) will skip
your post because they can't be bothered puzzling out what you meant b)
will puzzle out what they think you meant and come up with something
different, thereby wasting both your time and theirs, or c) will give you
a silly answer (e.g. one that works only for the exact sample input you
gave).


------------------------------

Date: Tue, 12 Dec 2000 08:39:58 -0800
From: Jeff Zucker <jeff@vpservices.com>
Subject: Re: Posting Guidelines for comp.lang.perl.misc ($Revision: 0.1 $)
Message-Id: <3A3654DE.6DA4C427@vpservices.com>

Tad McClellan wrote:
> 
> Here is a start on some Posting Guidelines for our newsgroup.

Great work, thanks Tad!

> Since clpmisc is such a high-traffic newsgroup (more than 100 articles
> per day), there are some "crowd control" measures that have evolved
> to keep things manageable.

While "crowd control" is descriptive, it is also a bit off-putting.  I
suggest instead replacing everything after the comma with: "there are a
number of measures that each reader can take to make the group most
useable by the most number of people."  Puts it more in the realm of
personal responsibility rather than rules from the top.

> You can get a rough outline of the points made here by running:
> 
>    perl -ne 'print "   $_" if /^=item/; print if /^=head/' thisfile.pod

or, on windows:

   perl -ne "print qq(   $_) if /^=item/; print if /^=head/" thisfile.pod

> =item Check the Perl Frequently Asked Questions (FAQ)
> ...
> =item Check the other standard Perl docs (*.pod)

I'd suggest a brief description of how to find these on a local install
and a URL of where they can be located on the web, and either a brief
description of -q and -f or a pointer to perldoc perldoc.

-- 
Jeff


------------------------------

Date: Tue, 12 Dec 2000 18:28:47 GMT
From: jtbell@presby.edu (Jon Bell)
Subject: Re: Posting Guidelines for comp.lang.perl.misc ($Revision: 0.1 $)
Message-Id: <G5Gwnz.JqK@presby.edu>

Good start!

How about mentioning proper formatting of posted responses?  Something
like:

   If you post a response to a question or participate in an ongoing
   discussion, please make things easier for your readers by following
   these commonly accepted guidelines for formatting responses:  quote
   *selectively* from the posting you're responding to, *attribute* those
   quotes properly to their author(s), and place your comments *after* the
   text that they respond to.  For a more extensive discussion of this,
   see <http://www.geocities.com/nnqweb/nquote.html>.

-- 
Jon Bell <jtbell@presby.edu>                         Presbyterian College
Dept. of Physics and Computer Science         Clinton, South Carolina USA
[   Help try to keep the Deja.com archive alive!  Sign the petition at  ]
[         http://www2.PetitionOnline.com/dejanews/petition.html         ]


------------------------------

Date: 12 Dec 2000 19:02:07 GMT
From: abigail@foad.org (Abigail)
Subject: Re: problem with script
Message-Id: <slrn93cthf.6jj.abigail@tsathoggua.rlyeh.net>

On 10 Dec 2000 02:47:36 GMT, Ilmari Karonen (iltzu@sci.invalid) wrote in comp.lang.perl.misc <URL: news:<976416137.18207@itz.pp.sci.fi>>:
++ 
++ The rest of my advice still holds, though.  Getting a good editor with
++ syntax highlighting makes programming *so* much easier.  Really.


Nah, all the fancy colours just distract.



Abigail


------------------------------

Date: Tue, 12 Dec 2000 16:29:47 GMT
From: mjd@plover.com (Mark-Jason Dominus)
Subject: Re: Profanity Check (OT)
Message-Id: <3a36527a.6b32$1ba@news.op.net>
Keywords: Kenton, bison, croon, tirade


In article <3A300A59.3C6B0AAC@rac.ray.com>,
Russ Jones  <russ_jones@rac.ray.com> wrote:
>This kind of filtering always ends up throwing out the baby with the
>bath water. The AOL chat room filters used to ban the word "breast,"
>which made the breast cancer support chat pretty silly. One of the
>cyber nanny programs briefly started locking out all of the Latin
>language sites because the Latin word for "with" is "cum". 

Around 1996, one of my clients got complaints from users that the site
was blocked.  It turned out that the site had been blocked because of
the use of the word...wait for it..."pink".

The page in question was a catalog of pink lipsticks.

I called them up in a rage and asked if they were unaware that pink
was the name of a color.

This keyword blocking idea was worth a try, but it really, really does
not work.  At all.



-- 
@P=split//,".URRUU\c8R";@d=split//,"\nrekcah xinU / lreP rehtona tsuJ";sub p{
@p{"r$p","u$p"}=(P,P);pipe"r$p","u$p";++$p;($q*=2)+=$f=!fork;map{$P=$P[$f|ord
($p{$_})&6];$p{$_}=/ ^$P/ix?$P:close$_}keys%p}p;p;p;p;p;map{$p{$_}=~/^[P.]/&&
close$_}%p;wait until$?;map{/^r/&&<$_>}%p;$_=$d[$q];sleep rand(2)if/\S/;print


------------------------------

Date: Tue, 12 Dec 2000 14:48:47 GMT
From: crb <chris_beaudette@my-deja.com>
Subject: quick way to search array members
Message-Id: <915dsc$ec6$1@nnrp1.deja.com>



i'm looking for a quick, shorthand way to search thru an array and see
if a particular string already exists in the array.  the goal is to
build an array of unique members only, by comparing potential array
members to existing array members.

i'd rather not have to use a foreach loop to make the comparison.

i thought i've seen a quick operator to do this before, but now i can't
find it in any of the manuals i have.

ideas?

thx,

~~crb


Sent via Deja.com http://www.deja.com/
Before you buy.


------------------------------

Date: 12 Dec 2000 09:03:19 -0600
From: Tony Curtis <tony_curtis32@yahoo.com>
Subject: Re: quick way to search array members
Message-Id: <87itopadhk.fsf@limey.hpcc.uh.edu>

>> On Tue, 12 Dec 2000 14:48:47 GMT,
>> crb <chris_beaudette@my-deja.com> said:

> i'm looking for a quick, shorthand way to search thru an
> array and see if a particular string already exists in
> the array.  the goal is to build an array of unique
> members only, by comparing potential array members to
> existing array members.

Use a hash.  "perldoc -q duplicate" may help some.

hth
t
-- 
Eih bennek, eih blavek.


------------------------------

Date: Tue, 12 Dec 2000 16:06:23 -0000
From: Chris Stith <mischief@velma.motion.net>
Subject: Re: quick way to search array members
Message-Id: <t3cj7vkdtmlccd@corp.supernews.com>

Tony Curtis <tony_curtis32@yahoo.com> wrote:
>>> On Tue, 12 Dec 2000 14:48:47 GMT,
>>> crb <chris_beaudette@my-deja.com> said:

>> i'm looking for a quick, shorthand way to search thru an
>> array and see if a particular string already exists in
>> the array.  the goal is to build an array of unique
>> members only, by comparing potential array members to
>> existing array members.

> Use a hash.  "perldoc -q duplicate" may help some.

Indeed. Any time I see the words 'unique', 'existing', 
'first', 'last', or 'duplicate' in the same sentence
as the word 'array', it causes me to cringe and become
nauseated.

Would it surprise anyone to find out that the builtin
exists() is available for hashes? If so, there needs to
be some study on data structures and algorithms.
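
A minimal sketch of the hash approach recommended in this thread. The name unique_members and the sample data are hypothetical. The hash records which strings have already been seen, so no explicit loop over the existing array members is needed.

```perl
use strict;
use warnings;

# Return the distinct strings from a list, keeping first-seen order.
sub unique_members {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

my @candidates = qw(apple pear apple fig pear);
my @unique     = unique_members(@candidates);
print "@unique\n";
```

To test a single candidate before adding it, exists $seen{$candidate} on a hash kept alongside the array does the same job.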

Chris
--
Christopher E. Stith
mischief@(?:(?:velma.)?motion.net|pikenet.net|bornnaked.com)


------------------------------

Date: Tue, 12 Dec 2000 16:16:53 GMT
From: msalerno@my-deja.com
Subject: Search and Display - Need assistance
Message-Id: <915j1c$jah$1@nnrp1.deja.com>

I need to search the /etc/passwd file for usernames and show all of the
names that match the search criteria.  I am still new at this and I am
pretty sure that perls grep does not do what I need it to do.  I am not
100% sure what I need to use.

Please help.
Thanks,
Matt

#!/usr/bin/perl -w
use strict;
print "Enter the information that you want to search for : ";
chomp(my $search = <STDIN>);
setpwent();
my @passwd = getpwent;
my($username, $passwd, $uid, $gid, $quota, $comment, $name, $dir,
$shell) = @passwd;
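
One possible way to finish the script, offered as a sketch rather than the only answer: getpwent returns one entry per call, so the single call above only ever sees the first account. Calling it in a loop visits every entry. The name matching_users and the choice of a literal substring match are assumptions.

```perl
use strict;
use warnings;

# Return the login names that contain $search as a literal substring.
sub matching_users {
    my ($search) = @_;
    my @found;
    setpwent();                       # rewind to the first entry
    while (my @entry = getpwent()) {  # one passwd entry per call
        my $username = $entry[0];
        push @found, $username if $username =~ /\Q$search\E/;
    }
    endpwent();
    return @found;
}

# Usage: pass the text to search for as the first argument.
if (@ARGV) {
    print "$_\n" for matching_users($ARGV[0]);
}
```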




------------------------------

Date: Tue, 12 Dec 2000 18:50:07 GMT
From: esper@news.visi.com (Dave Sherohman)
Subject: Setting timeout with IO::Socket
Message-Id: <slrn93csqu.kof.esper@pchan.sherohman.org>

(I asked this question in comp.lang.perl.modules yesterday, but have yet to
get any responses, so I'm bringing it over to .misc.  I'd rather have just
left it in .modules, but that group doesn't seem to be very active...)

I'm working on a little monitor program for a UDP-based server.  The client
sends a copy of requests to the monitor and instructs the server to report
success or failure to the monitor instead of the client.  The monitor then
sits there and matches up requests with responses.  This much works, but
isn't terribly useful by itself, so the monitor also needs to periodically
sweep through its list of outstanding requests and, if there are too many
that are over a certain age, send mail to the admins informing them that the
server needs to be restarted.  (It's ugly and probably not the best way to
go, but this "critical" server was recently down for over a week before
anyone noticed and this seems to be the quickest way to get monitoring in
place.)

Anyhow, I figure the best way to handle this sweep for orphaned requests is
to slap a timeout on my recv call so that it will be performed whenever X
minutes pass without any activity.  All the examples I've been able to turn
up on the web essentially boil down to 'die on timeout' (which is not
appropriate here) implemented using SIGALRM.  I don't think a signal handler
is the correct approach here - sweeping through a hash of hashes (hash of
requests with a hash of request details for each) and examining them all
takes longer than I'm comfortable with in a signal handler.  Plus, some 'old'
entries will be deleted on the assumption that they were orphaned by UDP
delivery failures instead of server failure; modifying data in signal
handlers is generally considered to be a Bad Thing as well.

The obvious solution,

---
my $socket = IO::Socket::INET->new( LocalPort => $port,
                                    Proto => 'udp',
                                    Timeout => 1
                                  );

while (1) {
  my $sender = $socket->recv(my $datagram, 8192);
  print "Recv completed.\n";
}
---

has been running for over an hour and has yet to time out.  (If I send
something to it, the datagram is received intact; no problems there.  But it
never times out.)

What do I need to do to make this work the way I intended?  Also, if a
timeout does occur, am I correct in assuming that $sender would be undef, or
is there some other way to distinguish a timeout from a successful recv?
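
A sketch of one common alternative, assuming (as far as I know) that the Timeout argument of IO::Socket::INET applies to connect and accept rather than to recv. Waiting with IO::Select before calling recv gives a timeout without any signal handler. The name recv_with_timeout and the loopback demo socket are hypothetical.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;

# Wait up to $timeout seconds for a datagram on $socket.
# Returns the datagram, or undef if nothing arrived in time.
sub recv_with_timeout {
    my ($socket, $timeout) = @_;
    my $select = IO::Select->new($socket);
    my @ready  = $select->can_read($timeout);
    return unless @ready;                        # timed out
    my $datagram;
    defined $socket->recv($datagram, 8192) or return;
    return $datagram;
}

# Demo socket on the loopback interface; with no LocalPort given,
# the OS picks a free port.
my $socket = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    Proto     => 'udp',
) or die "cannot create socket: $@";

my $datagram = recv_with_timeout($socket, 0.5);
print defined $datagram ? "got: $datagram\n" : "timed out; sweep here\n";
```

In the monitor, the sweep for orphaned requests would run in the main loop whenever the function returns undef, so the hash of requests is only ever modified in ordinary code.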

-- 
"Two words: Windows survives." - Craig Mundie, Microsoft senior strategist
"So does syphillis. Good thing we have penicillin." - Matthew Alton
Geek Code 3.1:  GCS d? s+: a- C++ UL++$ P++>+++ L+++>++++ E- W--(++) N+ o+
!K w---$ O M- V? PS+ PE Y+ PGP t 5++ X+ R++ tv b+ DI++++ D G e* h+ r y+


------------------------------

Date: 12 Dec 2000 12:40:51 -0500
From: stanb@panix.com (Stan Brown)
Subject: sighandler, and strict
Message-Id: <915nv3$ae1$1@panix6.panix.com>

	How should I set up a signal handler in a script where I have strict
	turned on?

	The perdoc example dose not seem to work with this turned on.
	Specificaly the program won't precompile.



------------------------------

Date: 12 Dec 2000 17:52:49 +0000
From: nobull@mail.com
Subject: Re: sighandler, and strict
Message-Id: <u94s09r0ge.fsf@wcl-l.bham.ac.uk>

stanb@panix.com (Stan Brown) writes:

> 	How should I set up a signal handler in a script where I have strict
> 	turned on?

The same as with it turned off.  With strict turned on you can't use a
symbolic function ref, but this approach is not recommended anyhow.

That's the whole point about strict.  It prevents you doing things you
probably shouldn't be doing anyhow.

Use hard code references for signal handlers.

> 	The perdoc example dose not seem to work with this turned on.

You mean there's only one example of signal handlers in the whole of
perldoc?  Really?

> 	Specificaly the program won't precompile.

LOL!  I guess you and I have different ideas of what "specificaly"
means.  I don't consider "some program I'm not going to show you fails
to do something I'm not going to define in a way that I won't
describe" to be "specific".

-- 
     \\   ( )
  .  _\\__[oo
 .__/  \\ /\@
 .  l___\\
  # ll  l\\
 ###LL  LL\\


------------------------------

Date: Tue, 12 Dec 2000 18:46:31 GMT
From: Bart Lateur <bart.lateur@skynet.be>
Subject: Re: sighandler, and strict
Message-Id: <uisc3tgp2gkkc6t2nci5d3a3h9l3g820eb@4ax.com>

Stan Brown wrote:

>	How should I set up a signal handler in a script where I have strict
>	turned on?
>
>	The perdoc example dose not seem to work with this turned on.
>	Specificaly the program won't precompile.

I can't imagine what you're talking about. But at least, reference your
subs by reference, not by name.

	$SIG{WHATEVER} = \&mysub; #not "mysub"
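
A complete sketch along those lines, assuming a Unix-like system where SIGUSR1 exists. The handler name and the flag variable are hypothetical. Note that under strict every variable the handler touches must be declared; an undeclared global in the handler is one possible reason a copied example fails to compile.

```perl
use strict;
use warnings;

my $got_signal = '';

# Named handler; Perl passes the signal name as the first argument.
sub note_signal {
    my ($name) = @_;
    $got_signal = $name;
}

# Install the handler by hard code reference, not by name.
$SIG{USR1} = \&note_signal;

kill 'USR1', $$;              # send ourselves the signal as a demo
sleep 1 unless $got_signal;   # give the handler a moment if needed
print "caught SIG$got_signal\n";
```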

-- 
	Bart.


------------------------------

Date: Tue, 12 Dec 2000 11:02:41 -0600
From: "Tommy Martin" <tmartin@tlmartin.com>
Subject: Re: single quote hell
Message-Id: <t3cmekm9ffnh05@corp.supernews.com>

Great information. Thanks a lot....

DBI is what I am using...

Tommy

"Doran Barton" <fozz@xmission.com> wrote in message
news:3A35F154.DD838B47@xmission.com...
> Tommy Martin wrote:
> >
> > Thanks.... That got it...
> >
> > I could swear I tried $var =~ s/'/''/; but maybe I didn't. It sure got it
> > though.
> >
> > Out of curiosity what is the g for as in s/'/''/g;
>
> g = global. It matches all occurrences of the pattern.
>
> I thought I should also mention... if you're using DBI to interact with
> your database (and it is usually a good idea to use DBI), you may want
> to use the quote() function that comes with the DBI package. Below is an
> excerpt from the DBI POD:
>
>              $sql = $dbh->quote($value);
>              $sql = $dbh->quote($value, $data_type);
>
>
>            Quote a string literal for use as a literal value in
>            an SQL statement, by escaping any special characters
>            (such as quotation marks) contained within the string
>            and adding the required type of outer quotation marks.
>
>              $sql = sprintf "SELECT foo FROM bar WHERE baz = %s",
>                            $dbh->quote("Don't");
>
>            For most database types, quote would return 'Don''t'
>            (including the outer quotation marks).
>
> I think that's what you want to do. Instead of reinventing the wheel,
> use the quote() method available through DBI.
>
> -=Fozz
> --
> Doran L. Barton -- < fozz@xmission.com >
> "Some people don't know Unix from Meuslix. Others claim to dream in
> Perl."




------------------------------

Date: Tue, 12 Dec 2000 16:06:31 GMT
From: "Wayne Hayes" <hayes@sympatico.ca>
Subject: Re: sorting an array of hashes
Message-Id: <b2sZ5.74181$i%4.2412237@news20.bellglobal.com>

>(It would make sense to make the last name a field by itself, but
>unfortunately, I am extracting data that has already been created.)

If you can't use separate inputs for each field, try splitting on a space to
get each word (name) and evaluate them individually from there.

Find commonalities in your data. For example, is there always a first and
last name?

If so, I would do something like:

Separate data on space:

$name1
$name2
$name3

If you always have a first and last name, then $name1 will always be the
first name.

If $name2 =~ /^\w\.?$/, then $name2 is an initial (a single word character
that may or may not be followed by a period), and $name3 is the last name.

Else $name2 is the last name and $name3, if present, is a title.

This example doesn't take into account 2-word last names, however.
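
The steps above can be sketched roughly as follows. The name parse_name and the returned hash keys are hypothetical, and treating a fourth word after an initial as a title is an added assumption. Like the description, this does not cope with full middle names or two-word last names.

```perl
use strict;
use warnings;

# Split a full name on whitespace and label the parts.
# Assumes the shape "First [Initial] Last [Title]".
sub parse_name {
    my ($full) = @_;
    my ($name1, $name2, $name3, @rest) = split ' ', $full;
    my %part = (first => $name1);
    if (defined $name2 && $name2 =~ /^\w\.?$/) {
        # Single word character, optionally followed by a period.
        $part{initial} = $name2;
        $part{last}    = $name3;
        $part{title}   = $rest[0] if @rest;
    } else {
        $part{last}  = $name2;
        $part{title} = $name3 if defined $name3;
    }
    return \%part;
}

for my $name ('John Smith Jr.', 'John J. Smith', 'Kelly Smith III') {
    my $p = parse_name($name);
    print "$name => last name: $p->{last}\n";
}
```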

If at all possible, I strongly recommend inputting each field separately. It
will save you many a headache!

Cheers!

Wayne




kip wrote in message ...
>>When I sort on my regexed last name, a few times i have a name that's
>>similar to Kelly Smith III, (like Kelly Smith the 3rd.).  The regex i am
>>using now is /(\S+$)/.  what kind of regex can i use to extract the last
>>name, but take the next one to the left if the case turns out like the
>>example above?  i suppose i'd have to disregard any combination of Roman
>>numerals.  Any input would be greatly appreciated.  Thanks Again!!!
>
>>>/(\S+)\s(\S+)\s*(\S*)/
>
>>>Where:
>
>>>$first_name=$1;
>>>$last_name=$2;
>>>$opt_name=$3;
>
>This is also excellent, and works great.  But each little piece of help I
>receive, I unfortunately realize another hurdle.  Some of my names have
>middle names and middle initials.
>
>with /(\S+)\s(\S+)\s*(\S*)/.....
>
>John Smith Jr.  is going to match Smith for the last name, awesome, But....
>John J. Smith Jr. is going to match J. as the last name.
>
>of course it would seem easy enough to disregard anything in the  middle if
>it happened to contain a period, but what about full middle names, or double
>middle initials, like..
>
>John Jerry Smith Jr.  or  John A. G. Smith IV
>
>Also, I am dealing with people from nations all over the world, so they can
>easily have four or five names.
>
>I have too much faith in Perl though to believe that it can't be done.  So
>anybody that has that same faith, lend me a hand.  I'm gonna go hack out
>some regex right now and see what i can do.
>

>
>thanks everyone,
>
>Kip
>
>




------------------------------

Date: Tue, 12 Dec 2000 15:32:33 GMT
From: rzilavec@tcn.net (Richard Zilavec)
Subject: Re: srand() for CGI
Message-Id: <3a3644aa.692078683@news.tcn.net>

On Tue, 12 Dec 2000 04:23:43 GMT, sluppy <sluppy@my-deja.com> wrote:

>Is there a way I can minimize the possibility of two or more visitors
>to the web site getting the same random result?


$uniqkey = $^T . $$;

--
 Richard Zilavec
 rzilavec@tcn.net


------------------------------

Date: Tue, 12 Dec 2000 16:10:45 GMT
From: sluppy <sluppy@my-deja.com>
Subject: Re: srand() for CGI
Message-Id: <915ilt$isv$1@nnrp1.deja.com>

In article <3A35EEEE.23685E63@xmission.com>,
  fozz@xmission.com wrote:
> Ahh, RTFM. The POD documentation for srand() gives an excellent example:
>
>   srand (time ^ $$ ^ unpack "%L*", `ps axww | gzip`);
>
> Nice.

Thanks for the reply. That's a good idea to use the IP. BTW, I
did "RTFM" before posting and saw the line of code you quoted above. It
didn't help me much being on a Windows system; perhaps there is a
Windows equivalent?

Thanks again.

--Tim




------------------------------

Date: Tue, 12 Dec 2000 16:20:36 GMT
From: Bart Lateur <bart.lateur@skynet.be>
Subject: Re: srand() for CGI
Message-Id: <1tjc3t0sfhm1s9sk09ccabbos021mmt2us@4ax.com>

sluppy wrote:

>So... I assume that two visitors to my web site during the exact same
>second would be issued the exact same session ID. (This, of course,
>would be bad.)

Append the value of $$ (possibly formatted to a fixed number of digits)
to your session ID. And put this in your source:

	END {
	     close STDOUT; close STDERR;
	     sleep 1;
	}

This will make the script at least hang on to the process ID, until the
current second has passed. If this script is the only one generating
session ID's, that should definitely fix it.
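
A small sketch of the ID format suggested above. The name session_id, the separator, and the seven-digit padding are assumptions; a process ID wider than the padding would simply print in full.

```perl
use strict;
use warnings;

# Session ID built from the current second and the zero-padded process ID.
sub session_id {
    return sprintf '%d-%07d', time, $$;
}

print session_id(), "\n";
```

The uniqueness argument relies on the END block shown above: no other process can reuse the same process ID within the same second while this one is still holding it.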

-- 
	Bart.


------------------------------

Date: Tue, 12 Dec 2000 14:16:34 GMT
From: montep@hal-pc.org (Monte Phillips)
Subject: Re: Testing deliverability with Perl and Sendmail -bv
Message-Id: <3a37323f.1938186@news.hal-pc.org>

On Tue, 12 Dec 2000 09:17:07 -0400, Gil Vautour <vautourNO@SPAMunb.ca>
wrote:

>Ok, here's a clarification.  I figured out Sendmail won't do what I need.
>What I want to do is for my file of email addresses (which are all
>internal) is query our CCSO (Ph) nameserver to determine if the address is
>valid.  I'm just not sure about how to do the query and test the result.
>We already have a CGI script for public queries, so I was thinking of
>somehow calling that from my script and test the results...  Any ideas?
>Gil Vautour wrote:
>
>> Hello,
>>
>> I was wondering if anyone knew a way using Perl and Sendmail to test an
>> email
>> address to see if it is deliverable before sending the message?  I'm
>> using Perl to call Sendmail to send a standard message to a large file
>> of email addresses and I would like to check to see if the message will
>> get bounced and if so not send it at all.  I know there is the -bv
>> option, will this do the job?  More specifically how
>> in Perl would I be able to test the result of -bv in order to send
>> or not?
>> Thanks,


Well, as I understand the mail system, you have two things you can do.
One is the quick Perl thing of simply testing the email address for
obvious things. (check back on Deja, there was an extensive discussion
on the regex to do this sort of thing).  

This of course in no way checks the existence.  To do that you must
send a test mail to the address and wait the return.  You can then
parse those failure-to-deliver responses, extract the invalid address and
compare it to your addressbook list.  If one matches, delete it.

g'Luk
Monte


------------------------------

Date: 12 Dec 2000 18:13:00 GMT
From: Eli the Bearded <elijah@workspot.net>
Subject: Re: Testing deliverability with Perl and Sendmail -bv
Message-Id: <eli$0012121304@qz.little-neck.ny.us>

In comp.lang.perl.misc, Monte Phillips <montep@hal-pc.org> wrote:
> Gil Vautour <vautourNO@SPAMunb.ca> wrote:
> >Ok, here's a clarification.  I figured out Sendmail won't do what I need.
> >What I want to do is for my file of email addresses (which are all
> >internal) is query our CCSO (Ph) nameserver to determine if the address is
> >valid.  I'm just not sure about how to do the query and test the result.

Would it be possible to get a list of all valid email addresses, 
updated in a regular fashion? That will make the process very easy.
And since they are internal, there is a slight chance the list
of email addresses is known somewhere.

> >We already have a CGI script for public queries, so I was thinking of
> >somehow calling that from my script and test the results...  Any ideas?

Why not just look at its code to see how it does it?

> Well, as I understand the mail system, you have two things you can do.
> One is the quick Perl thing of simply testing the email address for
> obvious things. (check back on Deja, there was an extensive discussion
> on the regex to do this sort of thing).  

From the client side you cannot tell whether mail to an address will
actually be delivered.

> This of course in no way checks the existence.  To do that you must
> send a test mail to the address and wait the return.  You can then

This is sometimes more than is necessary[1], including in this specific
case of *internal* email.

Elijah
------
[1] If you get a '250 ok' to the 'rcpt to: address' request from
the final mail server then most times the mail will go through.
Getting to the final mail server can be a trick though, as there
are a goodly number of sites that have at least one mx host which
just forwards all mail for the domain to the real server, not 
knowing what's good and what's not.
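
The 'rcpt to' check in [1] could be sketched roughly as below. This is
only a sketch, assuming the Net::SMTP module and a server that answers
RCPT honestly; the helper name rcpt_accepted and the host and addresses
in the usage comment are made up for illustration. It asks the server
about one recipient and then resets the transaction, so no mail is sent.

```perl
use strict;
use warnings;

# Returns 1 if the server accepted the recipient, 0 if it refused,
# and undef if the MAIL FROM step itself failed.
# $smtp is any object offering Net::SMTP's mail/to/reset methods.
sub rcpt_accepted {
    my ($smtp, $sender, $recipient) = @_;
    $smtp->mail($sender) or return;           # MAIL FROM:<sender>
    my $ok = $smtp->to($recipient) ? 1 : 0;   # RCPT TO:<recipient>
    $smtp->reset;                             # RSET: drop the transaction, nothing is sent
    return $ok;
}

# Usage sketch (hypothetical host and addresses, needs network access):
#   use Net::SMTP;
#   my $smtp = Net::SMTP->new('mail.example.com', Timeout => 30)
#       or die "cannot connect";
#   print rcpt_accepted($smtp, 'me@example.com', 'someone@example.com')
#       ? "accepted\n" : "refused\n";
#   $smtp->quit;
```

As the footnote says, an "accepted" answer from a forwarding MX host
proves little, so the result is a hint rather than a guarantee.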


------------------------------

Date: Tue, 12 Dec 2000 17:31:27 -0000
From: David Lee <dwl@slip.net>
Subject: unicode question
Message-Id: <t3co7f4nmo9ic8@corp.supernews.com>

Hi,

I'm trying to read and write a utf8 unicode file.
For example if I open my file with Notepad and
try to "Save As" the encoding is Unicode. I want
to read this file, add some strings, and write
another file also in unicode, i.e. if I read
this file with Notepad the encoding should be unicode.
By unicode file I mean a file in utf8 encoding. 
I tried using binmode with discipline ":utf8" but it
didn't seem to work. Is this working in perl 5.6?

For example here's what I tried to create a new utf8
encoded file:

my $string="hello";
open(F,":utf8",">file");
print F $string;

Any help appreciated.

Thanks
David
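
For comparison, a minimal sketch of writing and re-reading a UTF-8 file.
It assumes Perl 5.8 or later, where encoding layers are given in the mode
argument of a three-argument open; it may not carry over to 5.6. The
sample string is made up. Note also that, as far as I know, the encoding
Notepad labels "Unicode" is UTF-16 rather than UTF-8, so a file saved
that way would need a different layer.

```perl
use strict;
use warnings;

my $file   = 'file';              # same name as in the post
my $string = "hello \x{263A}";    # sample text with one non-ASCII character

# Three-argument open: mode and encoding layer go together in the second
# argument, the file name is the third.
open(my $out, '>:encoding(UTF-8)', $file) or die "Cannot write $file: $!";
print {$out} $string;
close($out) or die "Cannot close $file: $!";

# Reading with the same layer decodes the bytes back into characters.
open(my $in, '<:encoding(UTF-8)', $file) or die "Cannot read $file: $!";
my $read_back = do { local $/; <$in> };
close($in);
```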




------------------------------

Date: Tue, 12 Dec 2000 15:18:43 GMT
From: AvA <a.v.a@home.nl>
Subject: what am i doing wrong?
Message-Id: <3A364306.6A8E30A8@home.nl>

when someone enters my webpage (for testing only) i set a cookie and the
value of that cookie generates a file on my system...so a unique file
for every visitor.

when i try it out myself (using netscape on my apache server) it works
as i intended.
but when i ask someone from the outside (with ms explorer)to enter the
page and press the refresh button a few times,  it generates a new file
on every new page load (cookies were enabled).

what is wrong with the following code?
---------------------------------------------
#!/usr/bin/perl

($cookie_name, $cookie_value) = split(/=/, $ENV{'HTTP_COOKIE'});

if ($ENV{'HTTP_COOKIE'}){
 $number=$cookie_value;
} else {
 $number = time;
 $expire="Sat Dec 31 2005 24:00 GMT";
 print "Set-Cookie: id=$number; expires=$expire\n";
}

open(FILE, ">>$number");
print FILE "date|".localtime ."\n";
print FILE "referer|" . $ENV{'HTTP_REFERER'}. "\n";
close(FILE);

print "Content-type: text/html\n\n";

print <<EOF;
 ................
-----------------------------------------------




------------------------------

Date: Tue, 12 Dec 2000 16:22:08 GMT
From: rgarciasuarez@free.fr (Rafael Garcia-Suarez)
Subject: Re: what am i doing wrong?
Message-Id: <slrn93ck7e.5p1.rgarciasuarez@rafael.kazibao.net>

AvA wrote in comp.lang.perl.misc:
> what is wrong with the following code?
> ---------------------------------------------
> #!/usr/bin/perl
> 
> ($cookie_name, $cookie_value) = split(/=/, $ENV{'HTTP_COOKIE'});

You're trying to parse a cookie header by yourself. 
That's wrong and bug-prone (as you're discovering). Use the CGI module
or the CGI::Cookie module instead.
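
One way the hand-rolled split can go wrong, sketched with a made-up
header value: a browser may send several cookies in one HTTP_COOKIE
string, separated by "; ", and "id" need not come first. The second half
is only there to show the difference; the fetch method of CGI::Cookie
does this parsing (and the unescaping) for you.

```perl
use strict;
use warnings;

# Hypothetical header value with two cookies in it.
my $header = 'theme=dark; id=976648526';

# The approach from the question: split the whole header on "=".
my ($cookie_name, $cookie_value) = split /=/, $header;
# $cookie_name is "theme" and $cookie_value is "dark; id",
# which is not the visitor id.

# Splitting on the pair separator first keeps each cookie intact.
my %cookies;
for my $pair (split /;\s*/, $header) {
    my ($name, $value) = split /=/, $pair, 2;
    $cookies{$name} = $value if defined $value;
}
# $cookies{id} now holds "976648526".
```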

-- 
# Rafael Garcia-Suarez / http://rgarciasuarez.free.fr/


------------------------------

Date: Tue, 12 Dec 2000 15:27:32 +0100
From: "Bjoern Kaiser" <cheesy@upb.de>
Subject: Win32 --> ppm , Linux --> ???
Message-Id: <915co1$j0c$05$1@news.t-online.com>

Hi,
under Windows you use ppm (perlPackageManager)
but what do you do if you want to install a package under Linux???

thx bjoern




------------------------------

Date: Tue, 12 Dec 2000 15:29:50 GMT
From: "Jan Schaumann" <jschauma@netmeister.org>
Subject: Re: Win32 --> ppm , Linux --> ???
Message-Id: <OvrZ5.124583$DG3.2469426@news2.giganews.com>

* "Bjoern Kaiser" <cheesy@upb.de> wrote:

> Hi, under Windows you use ppm (perlPackageManager), but what do you do
> if you want to install a package under Linux???

Ask in the right NG (crossposted, with followups set to
de.comp.lang.perl.misc).

Usually you install a perl-module with:

perl Makefile.PL
make
make test
su
make install

-Jan

-- 
Jan Schaumann <http://www.netmeister.org>

"The point is, you shouldn't eat things that feel pain." *BONK!* "Ow!"
"Okay, we won't eat you!" --hippie & Bender


------------------------------

Date: Tue, 12 Dec 2000 15:54:25 -0000
From: Chris Stith <mischief@velma.motion.net>
Subject: Re: Win32 --> ppm , Linux --> ???
Message-Id: <t3cihh5mht72c4@corp.supernews.com>

Bjoern Kaiser <cheesy@upb.de> wrote:
> Hi,

Hello. 

> under Windows you use ppm (perlPackageManager)

Good. This is the preferred method on Windows as I understand.

> but what do you do if you want to install a package under Linux???

On Linux (or other Unixy systems), you should use CPAN.pm, or you
should download the .tar.gz for the proper package, which you can
then unarchive and install.

Usually for a CPAN-distributed package, there will be a file called
Makefile.PL, which you can run using the command 'perl Makefile.PL'.
Then you can run 'make', 'make test', and 'make install'. Some packages
have specific instructions, found in the files README or INSTALL.
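
With CPAN.pm the usual one-liner is perl -MCPAN -e 'install Some::Module'
(Some::Module being a placeholder name). After either route, a quick way
to confirm that perl can see the module is to try loading it. A small
sketch, with the helper name module_installed made up for illustration:

```perl
use strict;
use warnings;

# Returns 1 if the named module can be loaded from @INC, 0 otherwise.
sub module_installed {
    my ($module) = @_;
    return 0 unless $module =~ /\A\w+(?:::\w+)*\z/;   # plain module names only
    (my $file = "$module.pm") =~ s{::}{/}g;           # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Report on every module name given on the command line.
for my $module (@ARGV) {
    printf "%-30s %s\n", $module,
        module_installed($module) ? 'installed' : 'not found';
}
```

Saved as, say, check.pl, it would be run as: perl check.pl Some::Module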

I hope this all helps. I'm not the best with German, but I don't
take that to be a serious flaw in an Anglophonic newsgroup. 

Chris

--
Christopher E. Stith - mischief@(motion.net|pikenet.net|bornnaked.com)
"If we all got along, the world wouldn't have any good action movies."



------------------------------

Date: 16 Sep 99 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 16 Sep 99)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

| NOTE: The mail to news gateway, and thus the ability to submit articles
| through this service to the newsgroup, has been removed. I do not have
| time to individually vet each article to make sure that someone isn't
| abusing the service, and I no longer have any desire to waste my time
| dealing with the campus admins when some fool complains to them about an
| article that has come through the gateway instead of complaining
| to the source.

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V9 Issue 5091
**************************************
