[17075] in Perl-Users-Digest
Perl-Users Digest, Issue: 4487 Volume: 9
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Oct 2 03:10:56 2000
Date: Mon, 2 Oct 2000 00:05:11 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <970470311-v9-i4487@ruby.oce.orst.edu>
Content-Type: text
Perl-Users Digest Mon, 2 Oct 2000 Volume: 9 Number: 4487
Today's topics:
Re: die() ignores tied STDERR? (Martien Verbruggen)
Executing a program from a CGI <scottdellar@hotmail.com>
Re: Expect.pm May Do The Trick <reedfish@ix.netcom.com>
Re: flocking while using the diamond operator? (Martien Verbruggen)
Re: help with a d/l script for pdf files. <scottl@sympac.com.au>
Re: help with a d/l script for pdf files. <simonis@myself.com>
Re: help with a d/l script for pdf files. <scottl@sympac.com.au>
Re: help with a d/l script for pdf files. (Martien Verbruggen)
How can I install DBI? <idleisidle@usa.net>
How to define window and colors with curses? <Torsten.Eymann@ostsee-zeitung.de>
Re: How to get length of scalar? (David H. Adler)
Re: Is this is Regexp bug? <anmcguire@ce.mediaone.net>
Re: Regex comparing street addresses <godzilla@stomp.stomp.tokyo>
Re: Regex comparing street addresses <peter.sundstrom@eds.com>
Re: Regex comparing street addresses <peter.sundstrom@eds.com>
Re: Regex comparing street addresses <anonymous@anonymous.anonymous>
Re: Regex comparing street addresses <mauldin@netstorm.net>
Re: Regex comparing street addresses <anmcguire@ce.mediaone.net>
Re: Regex comparing street addresses <peter.sundstrom@eds.com>
Digest Administrivia (Last modified: 16 Sep 99) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Mon, 2 Oct 2000 16:09:14 +1100
From: mgjv@tradingpost.com.au (Martien Verbruggen)
Subject: Re: die() ignores tied STDERR?
Message-Id: <slrn8tg63q.jr6.mgjv@martien.heliotrope.home>
On Sun, 1 Oct 2000 01:48:26 -0400,
David Coppit <newspost@coppit.org> wrote:
>
> I tied STDERR to a module in order to detect when something goes wrong
why?
> in a CGI script. Unfortunately, die() doesn't seem to do "print
> STDERR". Below is a test script. Any suggestions, besides checking $?
> in addition to checking for output to STDERR?
>
> Thanks,
> David
>
> package CATCH;
>
> sub TIEHANDLE
> {
> my $package = shift;
>
> return bless {},$package;
> }
>
> sub PRINT
> {
> # Temporarily untie the filehandle so that we won't recursively call
> # ourselves
> untie *STDERR;
> print "Caught STDERR:\n";
> print STDERR @_;
> tie *STDERR,__PACKAGE__;
> }
Did you try to run this with -w? If not, try it.
> package main;
>
> tie *STDERR,'CATCH';
> die "died!";
Can I ask WHY you are doing all this? The code above obviously can't be
your ultimate goal, because it doesn't add anything. If all you want to
do is redirect STDERR output to a file, then:
open(STDERR, ">/tmp/stderr") or die $!;
will do that. If you want it redirected to STDOUT, you can do that as
well. If you want formatted output to STDOUT, then use CGI::Carp or
something like that.
If you do that, then the output of a die will nicely follow.
I don't know what die calls internally, but if you redirect the whole of
STDERR instead of trying to do this with tied filehandles, it doesn't
matter :)
# perldoc perlopentut
# perldoc -f open
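A minimal sketch of the redirect-the-real-handle approach, assuming a
reasonably recent perl (three-argument open, lexical filehandles). The
helper name capture_stderr_to is made up for illustration; the
save/reopen/restore pattern is the one described in perldoc -f open.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): run $code with the real
# STDERR pointed at $path, then put STDERR back where it was.
sub capture_stderr_to {
    my ($path, $code) = @_;

    # Keep a duplicate of the current STDERR so it can be restored later.
    open(my $saved, '>&', \*STDERR) or die "Cannot dup STDERR: $!";
    open(STDERR, '>', $path)        or die "Cannot redirect STDERR to $path: $!";

    # Trap exceptions so STDERR is always restored before we re-throw.
    my $ok  = eval { $code->(); 1 };
    my $err = $@;

    # Reopening STDERR closes (and flushes) the file first.
    open(STDERR, '>&', $saved) or die "Cannot restore STDERR: $!";
    die $err unless $ok;
    return;
}

# Usage (anything written to STDERR inside the block lands in the file):
#   capture_stderr_to('/tmp/stderr', sub { warn "something went wrong\n" });
```

Because the handle itself is redirected rather than tied, the final message
of an uncaught die ends up in the same place without any extra work.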
Martien
--
Martien Verbruggen |
Interactive Media Division | If it isn't broken, it doesn't have
Commercial Dynamics Pty. Ltd. | enough features yet.
NSW, Australia |
------------------------------
Date: Mon, 2 Oct 2000 17:12:49 +1100
From: "Scott Dellar" <scottdellar@hotmail.com>
Subject: Executing a program from a CGI
Message-Id: <8r9911$on2$1@news.latrobe.edu.au>
Hey there,
I am trying to execute either (preferably both of) postgres or pg_dump from
within a CGI. I am doing this so I can back up and restore a postgres
database - with a single click from the Internet.
The commands are:
'pg_dump groupa00 > backup.txt' (for backing up to a text file)
'psql -e groupa00 < backup.txt' (for getting the data back from the text
file)
If you can help with the execution of either of these programs it would make
my day.
Thanks (in advance),
Scott
------------------------------
Date: Mon, 2 Oct 2000 00:00:42 -0400
From: "Brian Kelly" <reedfish@ix.netcom.com>
Subject: Re: Expect.pm May Do The Trick
Message-Id: <8r91fs$u9j$1@slb6.atl.mindspring.net>
You can accomplish the same thing with Net::Telnet. There isn't anything
I haven't been able to automate with it. Furthermore, with
Crypt::CBCeasy, Crypt::CBC, Crypt::DES and Digest::MD5, I've been able to
craft my own "sudo" (SUDO is a Unix superuser utility that grants
non-root users the ability to run certain commands as root). This allows
me to automate difficult programs like ClearCase that won't work with
SUDO. The only drawback is that I have to have the Unix administrator
update the embedded root password whenever he changes it (which is
frequently and without warning. Oh well - no one said it's a "perfect"
world!!!). I go a step further and bit-shift the encrypted string before
writing it to a file. The user's own password is used as the salt to
encrypt the root password.
I also use perlcc to create a binary of the script. Granted, it wouldn't
stop a "Super Hacker" who could decompile the script and who also knew
the password that was used as salt - but hell, they've NEVER employed
anyone even HALF that good where I'm currently consulting!!
KDW <jwestover@sprintmail.com> wrote in message
news:TARB5.3240$YY2.142832@newsread2.prod.itd.earthlink.net...
> Thanks for your response. I found expect.pm while doing some research on
> the web. This seems to be a viable approach to dealing with Unix system
> commands (passwd) that expect an interactive response from the terminal.
>
> I suppose in my original message I should have said that I wanted to
> tackle this issue using Perl instead of some other language......Which
> is why I chose to post my question on this forum.........
>
>
------------------------------
Date: Mon, 2 Oct 2000 16:25:03 +1100
From: mgjv@tradingpost.com.au (Martien Verbruggen)
Subject: Re: flocking while using the diamond operator?
Message-Id: <slrn8tg71f.jr6.mgjv@martien.heliotrope.home>
On Sat, 30 Sep 2000 23:19:37 GMT,
tmdryden@my-deja.com <tmdryden@my-deja.com> wrote:
> Howdy:
>
> I'm trying to lock a file I'm using the diamond operator (<>) on, but
> can't figure out how to make it work.
>
> Here's my basic concept:
>
> use Fcntl qw(:flock);
>
> @ARGV = ("file");
> flock(ARGV,LOCK_EX); # <== doesnt work
ARGV isn't open yet here, which you would have known had you checked the
return value of flock:
flock(ARGV,LOCK_EX) or die "Cannot flock: $!";
The message:
flock() on closed filehandle main::ARGV at /tmp/foo.pl line 6.
Cannot flock: Bad file descriptor at /tmp/foo.pl line 6.
You can do this, but you'd have to acquire a lock _after_ the first <>
(which also opens the file to ARGV), and then you are too late. You
would also need to make sure never to try to lock the same file twice.
Can of worms. Just not necessary. Rethink what it is you want to do.
Besides all that... The magic <> will open the file for reading, NOT
writing, so you might not even be granted an exclusive lock on all
platforms.
> $^I = ".bak";
And what is this supposed to do? You're not going to be writing, you
know.. You might _want_ to write, but you won't be able to with the
tools you're using.
> while (<>) {
> do something;
> }
Since you will only be reading, why do you need the exclusive lock? Are
you actually trying to do something like what the -p and -i flags to
perl do?
The -i flag actually doesn't do a real in-place edit. It creates a new
file, into which the output goes, and then renames (or unlinks) the
original file, and gives the target name to the new file.
You could just use the -p and -i flags, you know?
#!/usr/local/bin/perl -pi.bak
s/foo/bar/;
works perfectly fine, and there shouldn't be any need for file locking.
Changing a file's contents by creating a new one, and renaming that to
the original name is a perfectly fine way to avoid concurrency issues.
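A rough sketch of that write-a-new-file-then-rename idea done by hand,
assuming a POSIX-like filesystem where rename() replaces the target in one
step. The helper name replace_via_rename and the ".tmp.PID" suffix are made
up for illustration, and no backup copy is kept here, unlike -i.bak.

```perl
use strict;
use warnings;

# Hypothetical helper: apply $edit (called with each line in $_) to $file by
# writing a new file next to it and renaming that over the original.
sub replace_via_rename {
    my ($file, $edit) = @_;
    my $tmp = "$file.tmp.$$";            # assumed scratch name, same directory

    open(my $in,  '<', $file) or die "Cannot read $file: $!";
    open(my $out, '>', $tmp)  or die "Cannot write $tmp: $!";

    while (my $line = <$in>) {
        $edit->() for $line;             # alias $_ to $line for the callback
        print {$out} $line or die "Cannot write $tmp: $!";
    }

    close($out) or die "Cannot close $tmp: $!";
    close($in);

    # Readers see either the old contents or the new ones, never a mix.
    rename($tmp, $file) or die "Cannot rename $tmp to $file: $!";
    return;
}

# Usage:
#   replace_via_rename('file', sub { s/foo/bar/ });
```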
If you want to actually, really _rewrite_ a file while other programs
might be reading it, just loop over @ARGV explicitly, and open and lock
the files (make sure ALL readers also acquire a LOCK_SH).
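A minimal sketch of that explicit open-and-lock approach for one file,
assuming flock() is honoured on the platform and that every reader takes
LOCK_SH. The helper name rewrite_locked is made up for illustration; call
it in a loop over @ARGV.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET);

# Hypothetical helper (not from the thread): rewrite $file in place under an
# exclusive lock. $edit is called once per line with the line in $_.
sub rewrite_locked {
    my ($file, $edit) = @_;

    # '+<' opens for reading AND writing without clobbering the file.
    open(my $fh, '+<', $file) or die "Cannot open $file: $!";
    flock($fh, LOCK_EX)       or die "Cannot flock $file: $!";

    my @lines = <$fh>;
    $edit->() for @lines;                # $_ is aliased to each line in turn

    seek($fh, 0, SEEK_SET) or die "Cannot seek in $file: $!";
    truncate($fh, 0)       or die "Cannot truncate $file: $!";
    print {$fh} @lines     or die "Cannot write $file: $!";
    close($fh)             or die "Cannot close $file: $!";  # releases the lock
    return scalar @lines;
}

# Usage:
#   rewrite_locked($_, sub { s/foo/bar/ }) for @ARGV;
#
# A cooperating reader would do:
#   open(my $fh, '<', $file) or die $!;
#   flock($fh, LOCK_SH)      or die $!;
```

The whole file is held in memory here, which is fine for small files and a
poor fit for large ones.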
Maybe, instead of assuming that you already know how to do what you
want, you should again go back to the drawing board, and ask yourself
the question what it is you want to achieve. Instead of trying to solve
something that isn't going to be possible, try to solve the problem you
actually are interested in.
Martien
--
Martien Verbruggen |
Interactive Media Division | Useful Statistic: 75% of the people
Commercial Dynamics Pty. Ltd. | make up 3/4 of the population.
NSW, Australia |
------------------------------
Date: Mon, 02 Oct 2000 15:14:10 -0700
From: Scott Laughton <scottl@sympac.com.au>
Subject: Re: help with a d/l script for pdf files.
Message-Id: <39D908B2.1EF22A9F@sympac.com.au>
Thanks for the quick reply ..... is there any way I can do it with
javascript or any other language then...
Thanks for your time.
Scott Laughton.
Tony Curtis wrote:
> >> On Mon, 02 Oct 2000 14:01:49 -0700,
> >> Scott Laughton <scottl@sympac.com.au> said:
>
> > Is there some header I can send with the link to the
> > file that I would normally print to the location bar?
>
> Ooooh, this one again.
>
> The answer is: no.
>
> You cannot force the client software to do something like
> this from the server-side.
>
> hth
> t
> --
> Namaste!
> And an "oogabooga" to you too!
> -- Homer Simpson
------------------------------
Date: 02 Oct 2000 04:23:54 GMT
From: Drew Simonis <simonis@myself.com>
Subject: Re: help with a d/l script for pdf files.
Message-Id: <39D80B4B.599C1541@myself.com>
Scott Laughton wrote:
>
> Thanks for the quick reply ..... is there any way I can do it with
> javascript or any other language then...
no is no. You can make suggestions, but the client is free to do
whatever it wants.
------------------------------
Date: Mon, 02 Oct 2000 15:41:07 -0700
From: Scott Laughton <scottl@sympac.com.au>
Subject: Re: help with a d/l script for pdf files.
Message-Id: <39D90F03.BD8EBA8C@sympac.com.au>
Ok thanks for the info...... the only reason that I wanted to force a
download to a user-specified address instead of spawning the Acrobat
Reader is because the documents that will be available are quite large
and will only be available to our customers as a part of our client
support. The boss wanted me to get them to download the documents as
they are user manuals and then they could refer to them at any time as
they would have the docs saved on their local machines. The people that
are getting these manuals are not very computer literate and could make
the mistake of just waiting for ages to get the documents and then
losing them when they shut the browser.... thus making more calls for us
in support... which is what we are trying to avoid by making the
documents available on the internet.
but if no is no then I will just have to make the instructions very
clear...
thanks again for your time..........
Scott Laughton.
Drew Simonis wrote:
> Scott Laughton wrote:
> >
> > Thanks for the quick reply ..... is there any way I can do it with
> > javascript or any other language then...
>
> no is no. You can make suggestions, but the client is free to do
> whatever it wants.
------------------------------
Date: Mon, 2 Oct 2000 16:46:19 +1100
From: mgjv@tradingpost.com.au (Martien Verbruggen)
Subject: Re: help with a d/l script for pdf files.
Message-Id: <slrn8tg89b.jr6.mgjv@martien.heliotrope.home>
[please, in the future, post your reply _after_ the suitably trimmed
text you reply to.]
[Note the followups]
On Mon, 02 Oct 2000 15:41:07 -0700,
Scott Laughton <scottl@sympac.com.au> wrote:
[snip of large and unnecessary justification of offtopic post,
including jeopardy quote]
This is just to prevent even more verbosity from you on this subject:
If I were you, I would go off to one of the groups in the
comp.infosystems.www.* hierarchy, and ask there whether there are
generally accepted methods of doing this. The answers you got are
technically correct: No, you cannot do this reliably, because browsers
are free to do what they want.
However, there may be practical solutions that work in _many_ cases. We
don't really know, and we really do not want to discuss that here, since
it has NOTHING to do with Perl or perl.
I don't even know where you got the idea that it did.
Martien
--
Martien Verbruggen |
Interactive Media Division | The world is complex; sendmail.cf
Commercial Dynamics Pty. Ltd. | reflects this.
NSW, Australia |
------------------------------
Date: Mon, 2 Oct 2000 00:26:36 -0500
From: "Ben Ben" <idleisidle@usa.net>
Subject: How can I install DBI?
Message-Id: <01VB5.4310$rr.70575@vixen.cso.uiuc.edu>
I run MySQL server and Perl 6.18 server on Win98, and now I want to install
the DBI package.
I perform:
ppm install DBI.ppd
and the result is:
Retrieving package 'DBI.ppd'...
Error installing package 'DBI.ppd': Could not locate a PPM binary of
'DBI.ppd' for this platform
Somebody help me please.
------------------------------
Date: Mon, 02 Oct 2000 08:31:08 +0200
From: Torsten Eymann <Torsten.Eymann@ostsee-zeitung.de>
Subject: How to define window and colors with curses?
Message-Id: <39D82BAB.29E92E26@ostsee-zeitung.de>
Hello,
can anyone say how I must use the functions in Curses.pm to define
and use windows and background/letter colors with the Curses module?
thanks
torsten
------------------------------
Date: 2 Oct 2000 05:48:29 GMT
From: dha@panix.com (David H. Adler)
Subject: Re: How to get length of scalar?
Message-Id: <slrn8tg8dc.e2u.dha@panix6.panix.com>
On Sun, 01 Oct 2000 19:03:48 GMT, David Steuber
<nospam@david-steuber.com> wrote:
>As for on-topic posts, I've tried to post to comp.lang.perl and
>cross-post to comp.lang.perl.misc (or modules) as clp has less traffic
>than clpm.
There's a reason for this. Said reason is that c.l.p *no longer
exists* (cf. Crazy Vaclav, some simpsons episode that I can't be
bothered to look up the code for...)
>Perhaps that alt.perl group I see in the headers could be used for the
>off topic rants?
By definition, off-topic rants should not appear in *any*
newsgroup... :-) (1/2).
dha
--
David H. Adler - <dha@panix.com> - http://www.panix.com/~dha/
"You don't understand. He *had* to murder the nun and harvest her
organs" - overheard at some convention
------------------------------
Date: Mon, 2 Oct 2000 00:50:20 -0500
From: "Andrew N. McGuire " <anmcguire@ce.mediaone.net>
Subject: Re: Is this is Regexp bug?
Message-Id: <Pine.LNX.4.21.0010020044560.1845-100000@hawk.ce.mediaone.net>
On Sun, 1 Oct 2000, Glyndwr quoth:
G> I was recently approached by a friend in my (poorly filled) role as a Perl
G> guru and asked to help with a script: he wanted to check if a string
G> supplied from a CGI input was a valid email address (it's a formmail
G> script). The line I came up with was:
G> if ($contactemail =~ /^[a-zA-Z0-9.\-]+@[a-z-A-Z0-9.\-]+/) {
G> which is a little crude, but it was late and I couldn't concentrate very
G> well ;o)
[ snip of code and other stuff ]
I will let another validate whether what you have is a bug or not, as
it is getting late, and I have to get some rest. However, if you are
trying to validate an email address, do a perldoc -q email and read:
How do I check a valid mail address?
Also, not mentioned there is Abigail's RFC::RFC822::Address module, for
which you will also need Parse::RecDescent.
HTH.
anm
--
perl -wMstrict -MText::ParseWords -e "
system echo => grep defined() ? /./ : q++ => quotewords '\s+', 0, <<JAPH;
"""""""""""""""""""""""""""""""" Just """"""""""""""""""""""""""""""""
"""""""""""""""""""""""""""""""" another """"""""""""""""""""""""""""""""
"""""""""""""""""""""""""""""""" Perl """"""""""""""""""""""""""""""""
"""""""""""""""""""""""""""""""" Hacker """"""""""""""""""""""""""""""""
JAPH
"
------------------------------
Date: Sun, 01 Oct 2000 21:34:43 -0700
From: "Godzilla!" <godzilla@stomp.stomp.tokyo>
Subject: Re: Regex comparing street addresses
Message-Id: <39D81063.49D50BED@stomp.stomp.tokyo>
Peter Sundstrom wrote:
> I'm having problems trying to get the correct regex
> to perform an address comparison.
> The comparison involves:
> Reversing the order of any two adjacent address identifiers (where
> identifier is a group of alphanumeric characters preceded and followed by a
> space and containing at least one numeric character) in the first address
> and then comparing it against the second to see if they match.
> My current attempt is:
(snipped)
Here ya go. Working code. Only tests I have made on
this script is what you see for input data. Output
is precisely per your parameters.
Godzilla!
--
TEST SCRIPT:
____________
#!/usr/local/bin/perl
print "Content-Type: text/plain\n\n";
Compare('4 123A SMITH ST','123A 4 SMITH ST');
Compare('APARTMENT 4 123A SMITH ST', 'APARTMENT 123A 4 SMITH ST');
Compare('TOP APARTMENT 4 123A SMITH ST', 'TOP APARTMENT 123A 4 SMITH ST');
Compare('APARTMENT 4 123A SMITH ST', 'FLAT 123A 4 SMITH ST');
Compare('10 SMITH ST', '12 SMITH ST');
Compare('APARTMENT 4 4567A SMITH ST', 'APARTMENT 4567A 4 SMITH ST');
Compare('1600 PENNSLYVANIA AVE', '1600 PENNSLYVANIA AVE');
Compare('APARTMENT 123 4567A SMITH ST', 'APARTMENT 4567A 123 SMITH ST');
Compare('APARTMENT 14 4567A SMITH ST', 'APARTMENT 4567A 41 SMITH ST');
sub Compare
{
my ($add1,$add2) = @_;
print "$add1 ¦ $add2\n";
if ($add1 =~ /(\s*\d+\s*\d+[a-z])/i)
{
$work = $1;
$add1 =~ s/$1/¦/;
}
$work =~ s/(\d) (\d)/$1¦$2/;
($var1, $var2) = split (/¦/, $work);
$add1 =~ s/¦/ $var2 $var1/;
$add1 =~ s/^ //;
$add1 =~ s/\s+/ /g;
if ($add1 eq $add2)
{ print "SAME\n\n"; }
else
{ print "DIFFERENT\n\n"; }
($work, $var1, $var2) = "";
}
exit;
PRINTED RESULTS:
________________
4 123A SMITH ST ¦ 123A 4 SMITH ST
SAME
APARTMENT 4 123A SMITH ST ¦ APARTMENT 123A 4 SMITH ST
SAME
TOP APARTMENT 4 123A SMITH ST ¦ TOP APARTMENT 123A 4 SMITH ST
SAME
APARTMENT 4 123A SMITH ST ¦ FLAT 123A 4 SMITH ST
DIFFERENT
10 SMITH ST ¦ 12 SMITH ST
DIFFERENT
APARTMENT 4 4567A SMITH ST ¦ APARTMENT 4567A 4 SMITH ST
SAME
1600 PENNSLYVANIA AVE ¦ 1600 PENNSLYVANIA AVE
SAME
APARTMENT 123 4567A SMITH ST ¦ APARTMENT 4567A 123 SMITH ST
SAME
APARTMENT 14 4567A SMITH ST ¦ APARTMENT 4567A 41 SMITH ST
DIFFERENT
------------------------------
Date: Mon, 2 Oct 2000 17:24:34 +1300
From: "Peter Sundstrom" <peter.sundstrom@eds.com>
Subject: Re: Regex comparing street addresses
Message-Id: <8r92tp$61r$1@hermes.nz.eds.com>
Godzilla! wrote in message <39D7FC82.1053C8B@stomp.stomp.tokyo>...
>Nope. Your script does not produce these data
>you have posted.
You must have copied and pasted it *without* preserving the space in regex.
Should be $3 $2 not $3$2
------------------------------
Date: Mon, 2 Oct 2000 17:33:35 +1300
From: "Peter Sundstrom" <peter.sundstrom@eds.com>
Subject: Re: Regex comparing street addresses
Message-Id: <8r93ej$6fu$1@hermes.nz.eds.com>
Bob Walton wrote in message <39D80092.9ABE21B7@rochester.rr.com>...
>Peter Sundstrom wrote:
>>
>> I'm having problems trying to get the correct regex to perform an address
>> comparison.
>>
>> The comparison involves:
>>
>> Reversing the order of any two adjacent address identifiers (where
>> identifier is a group of alphanumeric characters preceded and followed by a
>> space and containing at least one numeric character) in the first address
>> and then comparing it against the second to see if they match.
>>
>...
>> Am I going about this in the correct way? If so, how do I fix my regex?
>
>A regex isn't always the best way to do everything. In this case, try:
>
>use strict;
>Compare('4 123A SMITH ST','123A 4 SMITH ST');
>Compare('APARTMENT 4 123A SMITH ST', 'APARTMENT 123A 4 SMITH ST');
>Compare('TOP APARTMENT 4 123A SMITH ST',
>        'TOP APARTMENT 123A 4 SMITH ST');
>Compare('APARTMENT 4 123A SMITH ST', 'FLAT 123A 4 SMITH ST');
>Compare('10 SMITH ST', '12 SMITH ST');
>
>sub Compare{
> my($one,$two)=@_;
> my(%one,%two);
> print "address 1: $one\n";
> print "address 2: $two\n";
> for(split / /,$one){$one{$_}++}
> for(split / /,$two){$two{$_}++}
> my $diff=0;
> for(keys %one){
> {local $^W=0;
> if($two{$_}!=$one{$_}){$diff++}
> }
> }
> for(keys %two){
> {local $^W=0;
> if($two{$_}!=$one{$_}){$diff++}
> }
> }
> if($diff){
> print "DIFFERENT\n";
> }
> else{
> print "SAME\n";
> }
>}
Thanks for that solution Bob. Certainly a different way to look at it. It
works for the set of data that I supplied, but will not work in other
instances.
For example, if I compare:
'APARTMENT 4 123A UPPER SMITH ST'
and
'APARTMENT 123A 4 SMITH ST UPPER'
Your solution says they are the same, when they should be different.
I've since found the problem with the logic in my regex. I was trying to
match too much stuff. I only needed to worry about the stuff I was
reversing.
Here's the corrected script:
#!/usr/local/bin/perl -w
use strict;
Compare('4 123A SMITH ST','123A 4 SMITH ST');
Compare('APARTMENT 4 123A SMITH ST', 'APARTMENT 123A 4 SMITH ST');
Compare('APARTMENT 4 123A UPPER SMITH ST',
        'APARTMENT 123A 4 SMITH ST UPPER');
Compare('TOP APARTMENT 4 123A SMITH ST', 'TOP APARTMENT 123A 4 SMITH ST');
Compare('APARTMENT 4 123A SMITH ST', 'FLAT 123A 4 SMITH ST');
Compare('10 SMITH ST', '12 SMITH ST');
sub Compare {
    my ($add1,$add2)=@_;
    print "$add1,$add2\n";
    $add1 =~ s/([A-Z]*\d+[A-Z]*) ([A-Z]*\d+[A-Z]*)/$2 $1/;
    if ($add1 eq $add2) {
        print "SAME\n\n";
    }
    else {
        print "DIFFERENT\n\n";
    }
}
------------------------------
Date: Sun, 01 Oct 2000 22:28:44 -0700
From: Anonymous <anonymous@anonymous.anonymous>
Subject: Re: Regex comparing street addresses
Message-Id: <39D81D0C.49F2579F@stomp.stomp.tokyo>
Peter Sundstrom wrote:
> Godzilla! wrote:
> > Nope. Your script does not produce these data
> > you have posted.
> You must have copied and pasted it *without* preserving the space in regex.
> Should be $3 $2 not $3$2
Precisely. Replacing this missing space yielded
correct results upon testing. When you post to
USENET, work on keeping your line length to a
bare minimum to avoid this word wrap plague.
Not using such a high amount of indentation
will help on this.
Quite frequently, before I post, I will actually
click my cursor at a line start, and manually
count how many characters are present. If in
excess of eighty, I will modify this line by
splitting it with a space added as needed or
add a note to be cautious about word wrap if
no space is needed.
I will remind myself to be careful about a
copy and paste of word wrapped lines.
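Counting by hand can also be left to a few lines of Perl. This is only a
sketch: the function name long_lines is made up for illustration, and the
limit of eighty is taken from the paragraph above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the 1-based numbers of the lines in $text
# that are longer than $max characters (default 80).
sub long_lines {
    my ($text, $max) = @_;
    $max = 80 unless defined $max;

    my @hits;
    my $n = 0;
    for my $line (split /\n/, $text) {
        $n++;
        push @hits, $n if length($line) > $max;
    }
    return @hits;
}

# Usage: slurp a draft and report offending lines before posting.
#   my $draft = do { local $/; <STDIN> };
#   print "line $_ is too long\n" for long_lines($draft);
```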
Anonymous Godzilla!
==
--------== Posted Anonymously via Newsfeeds.Com ==-------
Featuring the worlds only Anonymous Usenet Server
-----------== http://www.newsfeeds.com ==----------
------------------------------
Date: Mon, 02 Oct 2000 05:39:51 GMT
From: Jim Mauldin <mauldin@netstorm.net>
Subject: Re: Regex comparing street addresses
Message-Id: <39D81F62.DE33E3A8@netstorm.net>
Peter Sundstrom wrote:
>
> The comparison involves:
>
> Reversing the order of any two adjacent address identifiers (where
> identifier is a group of alphanumeric characters preceded and followed by a
> space and containing at least one numeric character) in the first address
> and then comparing it against the second to see if they match.
I'm sure you meant "identifier ... separated by a space" rather than
"preceded and followed by a space". You also contradict yourself by
saying that the identifiers you're interested in must have at least one
digit, in which case:
> APARTMENT 4 123A SMITH ST,FLAT 123A 4 SMITH ST
> DIFFERENT
should be SAME because FLAT and APARTMENT are irrelevant (don't have a
digit), and the relevant portions 4 and 123A are in fact adjacent and in
reverse order.
This said, and assuming DWIM, here's a slightly shorter version of Bob
Walton's fine solution:
sub Compare {
    my ($one,$two) = @_;
    print "$one\n$two\n";
    my (%one,%two);
    my $same = my $i = 0;
    $one{$_} = $i++ for(split / /,$one);
    $i = 0;
    $two{$_} = $i++ for(split / /,$two);
    for $one (keys %one) {
        $same = exists($two{$one}) && (abs($one{$one}-$two{$one})<2);
        last unless $same;
    }
    $same and print "Same\n" or print "Different\n";
}
-- Jim
------------------------------
Date: Mon, 2 Oct 2000 00:40:32 -0500
From: "Andrew N. McGuire " <anmcguire@ce.mediaone.net>
Subject: Re: Regex comparing street addresses
Message-Id: <Pine.LNX.4.21.0010020033100.1845-100000@hawk.ce.mediaone.net>
On Mon, 2 Oct 2000, Peter Sundstrom quoth:
PS> I'm having problems trying to get the correct regex to perform an address
PS> comparison.
PS>
PS> The comparison involves:
PS>
PS> Reversing the order of any two adjacent address identifiers (where
PS> identifier is a group of alphanumeric characters preceded and followed by a
PS> space and containing at least one numeric character) in the first address
PS> and then comparing it against the second to see if they match.
[ snip code ]
I would consider doing it a different way, you don't necessarily need
a regex for this, and I think that the below is a lot easier to read.
Of course there is a problem with the below, and that is:
'TOP APARTMENT 4 123A SMITH ST'
will match
'ST 4 TOP APARTMENT 123A SMITH'
which probably is not desired, but then again, you may not care.
sub compare {
    my @addr_1 = sort split '\s+' => shift;
    my @addr_2 = sort split '\s+' => shift;
    if ("@addr_1" eq "@addr_2") {
        print "Addresses 1 and 2 match!\n";
    }
    else {
        print "Addresses 1 and 2 differ!\n";
    }
}
anm
--
perl -wMstrict -MText::ParseWords -e "
system echo => grep defined() ? /./ : q++ => quotewords '\s+', 0, <<JAPH;
"""""""""""""""""""""""""""""""" Just """"""""""""""""""""""""""""""""
"""""""""""""""""""""""""""""""" another """"""""""""""""""""""""""""""""
"""""""""""""""""""""""""""""""" Perl """"""""""""""""""""""""""""""""
"""""""""""""""""""""""""""""""" Hacker """"""""""""""""""""""""""""""""
JAPH
"
------------------------------
Date: Mon, 2 Oct 2000 18:19:49 +1300
From: "Peter Sundstrom" <peter.sundstrom@eds.com>
Subject: Re: Regex comparing street addresses
Message-Id: <8r965d$8ef$1@hermes.nz.eds.com>
Godzilla! wrote in message <39D81063.49D50BED@stomp.stomp.tokyo>...
>Peter Sundstrom wrote:
>
>> I'm having problems trying to get the correct regex
>> to perform an address comparison.
>
>> The comparison involves:
>
>> Reversing the order of any two adjacent address identifiers (where
>> identifier is a group of alphanumeric characters preceded and followed by a
>> space and containing at least one numeric character) in the first address
>> and then comparing it against the second to see if they match.
>
>
>Here ya go. Working code. Only tests I have made on
>this script is what you see for input data. Output
>is precisely per your parameters.
Not quite.
>#!/usr/local/bin/perl
>
>print "Content-Type: text/plain\n\n";
>
>Compare('4 123A SMITH ST','123A 4 SMITH ST');
>Compare('APARTMENT 4 123A SMITH ST', 'APARTMENT 123A 4 SMITH ST');
>Compare('TOP APARTMENT 4 123A SMITH ST', 'TOP APARTMENT 123A 4 SMITH ST');
>Compare('APARTMENT 4 123A SMITH ST', 'FLAT 123A 4 SMITH ST');
>Compare('10 SMITH ST', '12 SMITH ST');
>Compare('APARTMENT 4 4567A SMITH ST', 'APARTMENT 4567A 4 SMITH ST');
>Compare('1600 PENNSLYVANIA AVE', '1600 PENNSLYVANIA AVE');
>Compare('APARTMENT 123 4567A SMITH ST', 'APARTMENT 4567A 123 SMITH ST');
>Compare('APARTMENT 14 4567A SMITH ST', 'APARTMENT 4567A 41 SMITH ST');
>
>sub Compare
> {
> my ($add1,$add2) = @_;
>
> print "$add1 ¦ $add2\n";
>
> if ($add1 =~ /(\s*\d+\s*\d+[a-z])/i)
> {
> $work = $1;
> $add1 =~ s/$1/¦/;
> }
>
> $work =~ s/(\d) (\d)/$1¦$2/;
>
> ($var1, $var2) = split (/¦/, $work);
>
> $add1 =~ s/¦/ $var2 $var1/;
> $add1 =~ s/^ //;
> $add1 =~ s/\s+/ /g;
>
> if ($add1 eq $add2)
> { print "SAME\n\n"; }
> else
> { print "DIFFERENT\n\n"; }
>
> ($work, $var1, $var2) = "";
> }
Your solution will fail if you compare:
'APARTMENT 4B 4567A SMITH ST'
and
'APARTMENT 4567A 4B SMITH ST'
This should produce a same result.
I've previously posted an updated version of my script which does produce
correct results.
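For reference, here is the substitution from that updated script wrapped
in a function that returns a truth value instead of printing, which makes
the cases discussed in this thread easy to check. The function name
same_address is made up for illustration; the regex is unchanged, so it
still assumes upper-case input and single spaces between fields.

```perl
use strict;
use warnings;

# Swap the first pair of adjacent identifiers that each contain a digit
# (e.g. "4B 4567A"), then compare with the second address.
sub same_address {
    my ($add1, $add2) = @_;
    $add1 =~ s/([A-Z]*\d+[A-Z]*) ([A-Z]*\d+[A-Z]*)/$2 $1/;
    return $add1 eq $add2;
}

# Usage:
#   print same_address('APARTMENT 4B 4567A SMITH ST',
#                      'APARTMENT 4567A 4B SMITH ST') ? "SAME\n" : "DIFFERENT\n";
```

Only the first qualifying pair is swapped, so an address with several such
pairs would need a different approach.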
------------------------------
Date: 16 Sep 99 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 16 Sep 99)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
| NOTE: The mail to news gateway, and thus the ability to submit articles
| through this service to the newsgroup, has been removed. I do not have
| time to individually vet each article to make sure that someone isn't
| abusing the service, and I no longer have any desire to waste my time
| dealing with the campus admins when some fool complains to them about an
| article that has come through the gateway instead of complaining
| to the source.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V9 Issue 4487
**************************************