[16175] in Perl-Users-Digest

Perl-Users Digest, Issue: 3587 Volume: 9

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Jul 10 18:13:39 2000

Date: Mon, 10 Jul 2000 15:13:26 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <963267206-v9-i3587@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Mon, 10 Jul 2000     Volume: 9 Number: 3587

Today's topics:
        Non-English LAnguage in Perl Script. <omygod@my-deja.com>
    Re: Non-English LAnguage in Perl Script. <stupid@pobox.com>
    Re: Non-English LAnguage in Perl Script. <stupid@pobox.com>
    Re: Non-English LAnguage in Perl Script. <stupid@pobox.com>
        NT=>Unix text formatting (DLachap419)
    Re: Number of matches question <lr@hpl.hp.com>
    Re: Number of matches question (Abigail)
    Re: Number of matches question <abe@ztreet.demon.nl>
        RE: Number of matches question (H. Merijn Brand)
    Re: Number of matches question (Abigail)
        RE: Number of matches question (H. Merijn Brand)
    Re: Number of matches question (Clinton A. Pierce)
    Re: Numbers and Strings and... <joe_beanNOjoSPAM@coffeehome.com.invalid>
    Re: Numbers and Strings and... <tina@streetmail.com>
    Re: Numbers and Strings and... (Tad McClellan)
    Re: Numbers and Strings and... (Tad McClellan)
    Re: Numbers and Strings and... (Craig Berry)
        OCR <news@fido.workone.com>
    Re: OCR <care227@attglobal.net>
        Off topic: Circular mapping logic <eggrock@yahoo.com>
    Re: Off topic: Circular mapping logic (Neil Kandalgaonkar)
    Re: Off topic: Circular mapping logic (Bart Lateur)
        Digest Administrivia (Last modified: 16 Sep 99) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Tue, 04 Jul 2000 05:03:26 GMT
From: Ramya <omygod@my-deja.com>
Subject: Non-English LAnguage in Perl Script.
Message-Id: <8jrr6o$nrr$1@nnrp1.deja.com>

Hi,
  How can I customize a script for a non-English language? For example,
part of the script has code something like this, where the print
statements contain English text:
 ...
 ...
} else {
		print "There was an error opening the log file.  Please tell the webmaster.";
	}
 .....
 ..


I want to change the comment to a different language so that the
generated HTML page is in a non-English language. How can I do it? Where
can I specify the font so that the page is displayed in that particular
language?

Thanks a lot,
Ramya.




------------------------------

Date: Tue, 04 Jul 2000 01:40:55 -0400
From: Michael G Schwern <stupid@pobox.com>
To: Ramya <omygod@my-deja.com>
Subject: Re: Non-English LAnguage in Perl Script.
Message-Id: <396178E7.30B5F9E2@pobox.com>

Ramya wrote:
> I want to change the comment to a different language so that the html
> page generated would be in non-english. How can I do it? Where can I
> mention the font type so that it is generated in that particular
> language?

You'll want to look at the Locale::gettext module on CPAN as well
as the GNU gettext info files.  This is a library giving you the basics
of internationalization.  It has a lot of shortcomings, but it's been
around for a while and is well documented.  If you plan on dealing with
similar languages (i.e., most Western European languages), gettext should
be enough.

Alternatively, you can use the more sophisticated, but less documented,
Locale::Maketext module.  The module itself is poorly documented, but
there was a TPJ article on it in... #15?  It's best for when you have to
deal with widely differing languages (such as Chinese and Swahili).

Internationalization is not simple, so don't expect either module to
simply whisk the problem away.
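
The idea both modules build on, keeping the program's messages in a
per-language catalog and looking them up by key, can be sketched in plain
Perl.  This is a hand-rolled illustration only, not the API of
Locale::gettext or Locale::Maketext; the names %catalog, message() and the
key log_open_failed are made up for the example.  It also shows that the
language of the page comes from the text and the declared character set,
not from a font setting:

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical message catalog: one hash of message keys per language.
my %catalog = (
    en => { log_open_failed =>
        'There was an error opening the log file.  Please tell the webmaster.' },
    fr => { log_open_failed =>
        "Erreur lors de l'ouverture du fichier journal.  Veuillez avertir le webmaster." },
);

# Look up a message, falling back to English and then to the key itself.
sub message {
    my ($lang, $key) = @_;
    for my $try ($lang, 'en') {
        return $catalog{$try}{$key}
            if exists $catalog{$try} && exists $catalog{$try}{$key};
    }
    return $key;
}

my $lang = 'fr';    # in a CGI script this might come from a parameter

# Declare the character set so the browser decodes the text correctly.
print "Content-Type: text/html; charset=ISO-8859-1\n\n";
print '<p>', message($lang, 'log_open_failed'), "</p>\n";
```

With this layout, adding a language means adding one more hash to the
catalog; the print statements in the script do not change.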

--

Michael G Schwern      http://www.pobox.com/~schwern/      schwern@pobox.com
Plus I remember being impressed with Ada because you could write an
 infinite loop without a faked up condition.  The idea being that in Ada
 the typical infinite loop would be normally be terminated by detonation.
         -- Larry Wall in <199911192212.OAA23621@kiev.wall.org>


------------------------------

Date: 10 Jul 2000 03:16:09 GMT
From: dlachap419@aol.com (DLachap419)
Subject: NT=>Unix text formatting
Message-Id: <20000709231609.23770.00000754@ng-fy1.aol.com>

My NT perl program outputs text which is then used on a Unix system. The
problem is that the formatting is incorrect (and therefore unusable in my case)
It seems the file saved out of my perl script on NT has the wrong EOL. I know
there are differences between NT/unix in this regard but have not been able to
come up with the correct one. 
      I saved some properly formatted text on the unix system and made a dummy
script on NT to just read in lines and save them out again without doing any
processing. That text then formatted properly back on unix. But if I do a
simple substitution on one line, no dice. I figure this has to be one of those
idiot things so, ok, I'm an idiot, can anyone help? 
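
A likely cause, assuming the script writes through an ordinary text-mode
filehandle: Perl on NT translates "\n" to CR LF on output, so every line
written there ends in "\r\n".  A minimal sketch of one workaround is to put
the output handle in binary mode with binmode before printing.  The file
name, the sample lines and the substitution are placeholders:

```perl
#!/usr/bin/perl -w
use strict;

my $outfile = 'unix_eol.txt';                        # placeholder file name
my @lines   = ("first line\r\n", "second line\n");   # sample input lines

open(my $out, '>', $outfile) or die "Cannot write $outfile: $!";
binmode($out);               # no "\n" => CR LF translation on NT

foreach my $line (@lines) {
    $line =~ s/\r?\n$//;     # strip either kind of line ending
    $line =~ s/first/1st/;   # any processing goes here
    print $out $line, "\n";  # always write a bare LF
}
close($out) or die "Cannot close $outfile: $!";
```

On Unix the binmode call makes no difference, so the same script can run on
both systems.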


------------------------------

Date: Tue, 4 Jul 2000 16:45:54 -0700
From: Larry Rosler <lr@hpl.hp.com>
Subject: Re: Number of matches question
Message-Id: <MPG.13cc1709e7c82edc98aba3@nntp.hpl.hp.com>

In article <QUo75.2151$4C4.44375@news2-win.server.ntlworld.com>, 
tony@pyxis.blackstar.co.uk says...
> Clinton A. Pierce <clintp@geeksalad.org> wrote:
> > Try something more idiomatic like this:
> > @list=qw/adam bob adams carol dave adamson carol bob/;
> > @match=grep {/^adam$/} @list;
> > print scalar @match;   # This prints 1
> > You get 1 if there's exactly 1 adam.  If there's two, you get 2.
> > If there's no adams...you get 0.  Which is false in Perl.
> 
> Surely, if all you want to know is how many matches there are it's much 
> quicker to just iterate over the array and count them?
> 
> my $count = 0; 
> foreach (@list) { $count++ if ($_ eq "adam") };

Surely it would be faster to use grep in scalar context:

  my $count = grep $_ eq 'adam' => @list;

Benchmarks left as an exercise for the reader.  :-)
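
For reference, a small sketch of that idiom: in scalar context grep returns
the number of elements for which the expression is true, so no intermediate
list or explicit counter is needed.  The list is the one used earlier in the
thread.

```perl
#!/usr/bin/perl -w
use strict;

my @list = qw/adam bob adams carol dave adamson carol bob/;

# Scalar assignment puts grep in scalar context: it returns a count.
my $count = grep { $_ eq 'adam' } @list;

# List assignment returns the matching elements themselves.
my @match = grep { $_ eq 'carol' } @list;

print "adam: $count, carol: ", scalar(@match), "\n";
```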

-- 
(Just Another Larry) Rosler
Hewlett-Packard Laboratories
http://www.hpl.hp.com/personal/Larry_Rosler/
lr@hpl.hp.com


------------------------------

Date: 04 Jul 2000 22:41:35 EDT
From: abigail@delanet.com (Abigail)
Subject: Re: Number of matches question
Message-Id: <slrn8m594h.59a.abigail@alexandra.delanet.com>

Clinton A. Pierce (clintp@geeksalad.org) wrote on MMCDXCVI September
MCMXCIII in <URL:news:P%f75.15895$fR2.185211@news1.rdc1.mi.home.com>:
@@ 
@@ @match=grep {/^adam$/} @list;

Urg. A regex for an exact match. Why not just:

   @match = grep {$_ eq "adam"} @list;   



Abigail
-- 
perl -weprint\<\<EOT\; -eJust -eanother -ePerl -eHacker -eEOT


------------------------------

Date: Wed, 05 Jul 2000 09:25:40 +0200
From: Abe Timmerman <abe@ztreet.demon.nl>
Subject: Re: Number of matches question
Message-Id: <2ep4mssvd37th7r3nrkti1vghpojlvd1tr@4ax.com>

On Sat, 01 Jul 2000 16:22:08 GMT, Tony Bowden
<tony@pyxis.blackstar.co.uk> wrote:

> Clinton A. Pierce <clintp@geeksalad.org> wrote:
> > Try something more idiomatic like this:
> > @list=qw/adam bob adams carol dave adamson carol bob/;
> > @match=grep {/^adam$/} @list;
> > print scalar @match;   # This prints 1
> > You get 1 if there's exactly 1 adam.  If there's two, you get 2.
> > If there's no adams...you get 0.  Which is false in Perl.
> 
> Surely, if all you want to know is how many matches there are it's much 
> quicker to just iterate over the array and count them?
> 
> my $count = 0; 
> foreach (@list) { $count++ if ($_ eq "adam") };
> 
Better still, use the 

	grep EXPR, LIST

form:

#!/usr/bin/perl -w
use strict;

use Benchmark;
use vars qw(@list);

@list = qw/adam bob adams carol dave adamson carol bob/;

timethese(1 << (shift || 16), {
	grep_bl	=> q#my $count = grep { /^adam$/ } @list;#,
	grep_ex	=> q#my $count = grep  /^adam$/ => @list;#,
	iter	=> q#my $count; 
			for (@list) { ++$count if $_ eq 'adam'; }#,
});

__END__

-- 
Good luck,
Abe


------------------------------

Date: Wed, 5 Jul 00 10:09:39 +0200
From: h.m.brand@hccnet.nl (H. Merijn Brand)
Subject: RE: Number of matches question
Message-Id: <8F686B0F0Merijn@192.0.1.90>

abigail@delanet.com (Abigail) wrote in 
<slrn8m594h.59a.abigail@alexandra.delanet.com>:

>Clinton A. Pierce (clintp@geeksalad.org) wrote on MMCDXCVI September
>MCMXCIII in <URL:news:P%f75.15895$fR2.185211@news1.rdc1.mi.home.com>:
>@@ 
>@@ @match=grep {/^adam$/} @list;
>
>Urg. A regex for an exact match. Why not just:
>
>   @match = grep {$_ eq "adam"} @list;   

Won't that be optimized to be the same anyway?

Of course, it's a way of thinking.

--8<---
#!/pro/bin/perl -w

use strict;

my @p = (qw(adam eve jonathan pope zacharias)) x 50000;

use Benchmark;

print + (scalar grep { m/^adam$/    } @p), ", ",
	(scalar grep { $_ eq "adam" } @p), "\n";
timethese (5000000, {
    regex => 'my $n = grep { m/^adam$/    } @p',
    match => 'my $n = grep { $_ eq "adam" } @p',
    });
-->8---
# perl xx.pl
50000, 50000
Benchmark: timing 5000000 iterations of match, regex...
     match:  7 wallclock secs ( 7.63 usr +  0.00 sys =  7.63 CPU)
     regex: 10 wallclock secs ( 7.65 usr +  0.03 sys =  7.68 CPU)
#

Only 0.03 sys CPU seconds more over that many iterations... Yes, I think
it's optimized somewhere :-)
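
One caveat about these numbers, offered as a probable explanation rather
than a certainty: when timethese is given its code as strings, Benchmark
evals them outside this file's lexical scope, where the file-scoped
"my @p" is not visible.  Both snippets would then grep an empty package
array, which would explain the near-identical times.  A sketch that passes
code references instead, which close over the lexical (the list size and
iteration count are arbitrary choices here):

```perl
#!/usr/bin/perl -w
use strict;
use Benchmark;

my @p = (qw(adam eve jonathan pope zacharias)) x 1000;

# Sanity check that both forms count the same matches.
my $n_regex = grep { m/^adam$/    } @p;
my $n_eq    = grep { $_ eq "adam" } @p;
print "$n_regex, $n_eq\n";

# Code refs are closures, so they see the lexical @p.
timethese(200, {
    regex => sub { my $n = grep { m/^adam$/    } @p },
    match => sub { my $n = grep { $_ eq "adam" } @p },
});
```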

-- 
H.Merijn Brand
using perl5.005.03 and 5.6.0 on HP-UX 10.20, HP-UX 11.00, AIX 4.2, AIX 4.3,
  DEC OSF/1 4.0 and WinNT 4.0 SP-6a,  often with Tk800.022 and/or DBD-Unify
ftp://ftp.funet.fi/pub/languages/perl/CPAN/authors/id/H/HM/HMBRAND/
Member of Amsterdam Perl Mongers (http://www.amsterdam.pm.org/)


------------------------------

Date: 05 Jul 2000 04:35:34 EDT
From: abigail@delanet.com (Abigail)
Subject: Re: Number of matches question
Message-Id: <slrn8m5tsc.ibb.abigail@alexandra.delanet.com>

H. Merijn Brand (h.m.brand@hccnet.nl) wrote on MMD September MCMXCIII in
<URL:news:8F686B0F0Merijn@192.0.1.90>:
|| abigail@delanet.com (Abigail) wrote in 
|| <slrn8m594h.59a.abigail@alexandra.delanet.com>:
|| 
|| >Clinton A. Pierce (clintp@geeksalad.org) wrote on MMCDXCVI September
|| >MCMXCIII in <URL:news:P%f75.15895$fR2.185211@news1.rdc1.mi.home.com>:
|| >@@ 
|| >@@ @match=grep {/^adam$/} @list;
|| >
|| >Urg. A regex for an exact match. Why not just:
|| >
|| >   @match = grep {$_ eq "adam"} @list;   
|| 
|| Won't that be optimized to be the same anyway?

The efficiency was irrelevant to me. $_ eq "adam" is a lot clearer
than /^adam$/, IMO.



Abigail
-- 
perl -wle '$, = " "; print grep {(1 x $_) !~ /^(11+)\1+$/} 2 .. shift'


------------------------------

Date: Wed, 5 Jul 00 14:32:28 +0200
From: h.m.brand@hccnet.nl (H. Merijn Brand)
Subject: RE: Number of matches question
Message-Id: <8F6898999Merijn@192.0.1.90>

abigail@delanet.com (Abigail) wrote in 
<slrn8m5tsc.ibb.abigail@alexandra.delanet.com>:

>H. Merijn Brand (h.m.brand@hccnet.nl) wrote on MMD September MCMXCIII in
><URL:news:8F686B0F0Merijn@192.0.1.90>:
>|| abigail@delanet.com (Abigail) wrote in 
>|| <slrn8m594h.59a.abigail@alexandra.delanet.com>:
>|| 
>|| >Clinton A. Pierce (clintp@geeksalad.org) wrote on MMCDXCVI September
>|| >MCMXCIII in <URL:news:P%f75.15895$fR2.185211@news1.rdc1.mi.home.com>:
>|| >@@ 
>|| >@@ @match=grep {/^adam$/} @list;
>|| >
>|| >Urg. A regex for an exact match. Why not just:
>|| >
>|| >   @match = grep {$_ eq "adam"} @list;   
>|| 
>|| Won't that be optimized to be the same anyway?
>
>The efficiency was irrelevant to me. $_ eq "adam" is a lot clearer
>than /^adam$/, IMO.

I agree completely. That's why I said "Of course, it's a way of
thinking." But some people are sometimes interested in the differences
(me, for instance).

-- 
H.Merijn Brand
using perl5.005.03 and 5.6.0 on HP-UX 10.20, HP-UX 11.00, AIX 4.2, AIX 4.3,
  DEC OSF/1 4.0 and WinNT 4.0 SP-6a,  often with Tk800.022 and/or DBD-Unify
ftp://ftp.funet.fi/pub/languages/perl/CPAN/authors/id/H/HM/HMBRAND/
Member of Amsterdam Perl Mongers (http://www.amsterdam.pm.org/)


------------------------------

Date: Wed, 05 Jul 2000 21:03:08 GMT
From: clintp@geeksalad.org (Clinton A. Pierce)
Subject: Re: Number of matches question
Message-Id: <goN85.25097$fR2.228439@news1.rdc1.mi.home.com>

[Posted and mailed]

In article <slrn8m594h.59a.abigail@alexandra.delanet.com>,
	abigail@delanet.com (Abigail) writes:
> Clinton A. Pierce (clintp@geeksalad.org) wrote on MMCDXCVI September
> MCMXCIII in <URL:news:P%f75.15895$fR2.185211@news1.rdc1.mi.home.com>:
> @@ 
> @@ @match=grep {/^adam$/} @list;
> 
> Urg. A regex for an exact match. Why not just:
> 
>    @match = grep {$_ eq "adam"} @list;   


Because, I was trying to make a point.  I was hoping the original poster
noted the difference between his:

	@match=grep {/adam/} @list;

and my

	@match=grep {/^adam$/} @list;

It was not so much a better solution (which "eq" surely is) as it was
a teaching opportunity about anchors.  Sometimes I don't WANT to be
completely obvious.  :)
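
To make the point about anchors concrete, a short sketch with the list from
the original example: the unanchored pattern matches any element containing
"adam", the anchored one only the exact name.  One small difference from eq
that is easy to miss: /^adam$/ also matches "adam\n", because $ can match
just before a trailing newline.

```perl
#!/usr/bin/perl -w
use strict;

my @list = qw/adam bob adams carol dave adamson carol bob/;

my @loose = grep { /adam/ }       @list;   # substring match
my @exact = grep { /^adam$/ }     @list;   # anchored at both ends
my @eq    = grep { $_ eq 'adam' } @list;   # plain string equality

print "unanchored: @loose\n";
print "anchored:   @exact\n";
print "eq:         @eq\n";

# The anchored regex tolerates one trailing newline; eq does not.
my $newline_regex = ("adam\n" =~ /^adam$/) ? 1 : 0;
my $newline_eq    = ("adam\n" eq 'adam')   ? 1 : 0;
print "with newline: regex=$newline_regex eq=$newline_eq\n";
```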

-- 
    Clinton A. Pierce              Teach Yourself Perl in 24 Hours! 
  clintp@geeksalad.org         for details see http://www.geeksalad.org
"If you rush a Miracle Man, 
	you get rotten Miracles." --Miracle Max, The Princess Bride


------------------------------

Date: Mon, 03 Jul 2000 21:16:29 -0700
From: bean <joe_beanNOjoSPAM@coffeehome.com.invalid>
Subject: Re: Numbers and Strings and...
Message-Id: <152d9f87.a63e6c81@usw-ex0102-014.remarq.com>

It's amazing how many of the posts on this group start with "I've
got a CGI problem..." or "How do I make my SSI do this..." I'm
surprised anyone even bothers to answer most of them at this
point.

Anyway, just wanted to say thanks for the help. chomp fixed
everything but the problems I didn't even know I had (thanks
Craig).

Two more quick questions. One, why is this true:

>my @abcs = (a..z, A..Z);
>
>Barewords like that aren't a good idea; if you use -w, Perl
>will tell you this. Better would be:
>
>my @abcs = ('a'..'z', 'A'..'Z');

It seemed to work fine before I put everything in '' but I
changed it anyway.

Second, I've been accessing this group through a site called
RemarQ. Their interface leaves much to be desired, and it's not
exactly fast. I'm assuming that I can access the group using
some sort of (newsreader?) software... Any recommendations? I'm
running a PC w/ win98.

Thanks again.





------------------------------

Date: 4 Jul 2000 06:45:31 GMT
From: Tina Mueller <tina@streetmail.com>
Subject: Re: Numbers and Strings and...
Message-Id: <8js16b$15knk$9@ID-24002.news.cis.dfn.de>

hi,
bean <joe_beanNOjoSPAM@coffeehome.com.invalid> wrote:

> Two more quick questions. One, why is this true:

>>my @abcs = (a..z, A..Z);
>>
>>Barewords like that aren't a good idea; if you use -w, Perl
>>will tell you this. Better would be:
>>
>>my @abcs = ('a'..'z', 'A'..'Z');

> It seemed to work fine before I put everything in '' but I
> changed it anyway.

it worked because perl assumes that a should be 'a', and so on...
but even if perl tries everything to interpret your
code the way you want it to be interpreted, it would
fail if you use month names, for example.
@months = (sep, oct, nov);
would not work as you might think:
perl thinks oct is the function oct().
always write code that passes -w; that saves you
a lot of time, i promise...
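
a small sketch of what happens with the month names. it assumes
strict 'subs' is switched off for the bareword line (it would not compile
otherwise) and sets $_ by hand so the result is predictable, since oct
without an argument works on $_:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @quoted = ('sep', 'oct', 'nov');   # what was meant

my @bare;
{
    no strict 'subs';    # allow barewords for the demonstration
    no warnings;         # silence the "Unquoted string" warnings
    local $_ = '17';     # oct() with no argument reads $_
    @bare = (sep, oct, nov);
}

print "quoted: @quoted\n";
print "bare:   @bare\n";   # the middle element is oct('17'), i.e. 15
```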

> Second, I've been accessing this group through a site called
> RemarQ. Their interface leaves much to be desired, and it's not
> exactly fast. I'm assuming that I can access the group using
> some sort of (newsreader?) software... Any recommendations? I'm
> running a PC w/ win98.

hm, there are some newsreaders, but as i've not worked
with windoze for a long time now, i can only recommend:
use linux and (emacs or tin or ...)

on windows: try to use any other newsreader than outlook or netscape

:-)

regards,
tina

-- 
http://tinita.de    \  enter__| |__the___ _ _ ___
tina's moviedatabase \     / _` / _ \/ _ \ '_(_-< of
search & add comments \    \ _,_\ __/\ __/_| /__/ perception
"The Software required Win98 or better, so I installed Linux."


------------------------------

Date: Tue, 4 Jul 2000 09:11:03 -0400
From: tadmc@metronet.com (Tad McClellan)
Subject: Re: Numbers and Strings and...
Message-Id: <slrn8m3oj7.drs.tadmc@magna.metronet.com>

On Mon, 03 Jul 2000 21:16:29 -0700, bean <joe_beanNOjoSPAM@coffeehome.com.invalid> wrote:

>One, why is this true:
>
>>my @abcs = (a..z, A..Z);
>>
>>Barewords like that aren't a good idea


It is not a good idea, because perl will interpret barewords
as a subroutine call if it has seen a subroutine with the
same name as the bareword.

So, you need to remember the names of every subroutine if
you want to avoid Twilight Zone behavior.

Play around (switch subs, quote barewords, enable strictness)
with the program below and see what happens.


---------------------------------
#!/usr/bin/perl -w
#use strict;

#sub a { return 'xyz' }
sub a { return 'x' }

my @abcs = (a..z, A..Z);

foreach (@abcs) {
   print "$_\n";
}
---------------------------------


-- 
    Tad McClellan                          SGML Consulting
    tadmc@metronet.com                     Perl programming
    Fort Worth, Texas


------------------------------

Date: Tue, 4 Jul 2000 09:04:33 -0400
From: tadmc@metronet.com (Tad McClellan)
Subject: Re: Numbers and Strings and...
Message-Id: <slrn8m3o71.drs.tadmc@magna.metronet.com>

On Mon, 03 Jul 2000 21:16:29 -0700, bean <joe_beanNOjoSPAM@coffeehome.com.invalid> wrote:

>Two more quick questions. 


Two questions should get two posts, each with a different subject.

Someone looking for (or having) recommendations for newsreader
software is not going to think to look in a thread with
the above subject.

The Subject header is the "index entry" for your article.

Help people help themselves.


Even just _asking_ a question contributes to the community,
(if the community can find what they are looking for).



>One, why is this true:
>
>>my @abcs = (a..z, A..Z);
>>
>>Barewords like that aren't a good idea


The primary reason being that using them implies no "use strict",
and that is Very Bad Indeed.

You should be using the strict pragma in your programs
(and declaring your variables before you use them).

   perldoc strict

If you were, the above would be a fatal error (which is a _good_ thing).
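
A quick way to see this for yourself is to compile the offending line under
strict inside a string eval and look at the error.  A minimal sketch; the
exact wording of the message may vary between perl versions:

```perl
#!/usr/bin/perl -w
use strict;

# Compile the bareword version under strict; the eval traps the fatal error.
my $ok_bare = eval q{
    use strict;
    my @abcs = (a..z, A..Z);
    1;
};
my $error = $@;

# The quoted version compiles and runs.
my $ok_quoted = eval q{
    use strict;
    my @abcs = ('a'..'z', 'A'..'Z');
    scalar @abcs;
};

print "bareword version: ", ($ok_bare ? "compiled\n" : "failed: $error");
print "quoted version: $ok_quoted elements\n";
```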



>It seemed to work fine 


In your current particular case.

The whole point is moot though. 

Conscientious Perl programmers always turn on strictness,
(and warnings (-w) and, if doing CGI stuff, taint checking (-T)).

All readers of this newsgroup are conscientious (heh, heh).

So barewords won't even compile (which is a powerful
incentive to quote them :-)

For the UNconscientious who are wondering why it might be
a bad idea even without strictness, see my other followup.


>before I put everything in '' but I
>changed it anyway.


Good Idea.



[ snip newsreader question ]


-- 
    Tad McClellan                          SGML Consulting
    tadmc@metronet.com                     Perl programming
    Fort Worth, Texas


------------------------------

Date: Wed, 05 Jul 2000 22:05:38 GMT
From: cberry@cinenet.net (Craig Berry)
Subject: Re: Numbers and Strings and...
Message-Id: <sm7c9i49gas65@corp.supernews.com>

bean (joe_beanNOjoSPAM@coffeehome.com.invalid) wrote:
: Anyway, just wanted to say thanks for the help. chomp fixed
: everything but the problems I didn't even know I had (thanks
: Craig).

Quite welcome!

: Two more quick questions. One, why is this true:
: 
: >my @abcs = (a..z, A..Z);
: >
: >Barewords like that aren't a good idea; if you use -w, Perl
: >will tell you this. Better would be:
:
: >my @abcs = ('a'..'z', 'A'..'Z');
: 
: It seemed to work fine before I put everything in '' but I
: changed it anyway.

The interpreter tries to treat barewords as functions first; if your
bareword happens to match a function name, things get ugly (and hard to
debug at 3am).  Observe:

  /usr2/people/cberry > perl -w
  @a = (a..y);
  Unquoted string "a" may clash with future reserved word at - line 1.
  print "@a\n";
  Translation pattern not terminated at - line 1.

You see, y is a synonym for tr.  Or even worse:

  /usr2/people/cberry > perl
  sub z { return 'b'; }
  @a = (a..z);
  print "@a\n";

Output:

  a b

I hope this leaves you properly terrified of barewords. :)

-- 
   |   Craig Berry - http://www.cinenet.net/users/cberry/home.html
 --*--  "Beauty and strength, leaping laughter and delicious
   |   languor, force and fire, are of us." - Liber AL II:20


------------------------------

Date: Mon, 10 Jul 2000 17:15:29 +0200
From: Kirill Miazine <news@fido.workone.com>
Subject: OCR
Message-Id: <Pine.LNX.4.21.0007101713510.30738-100000@gilda.uio.no>

Hello world.
What does OCR mean in the newsgroup stats? Is it good or bad to
have a high OCR? Sorry, this is not a Perl-related question, but it would
be nice if someone helped me.



------------------------------

Date: Mon, 10 Jul 2000 11:17:29 -0400
From: Drew Simonis <care227@attglobal.net>
Subject: Re: OCR
Message-Id: <3969E909.F4322949@attglobal.net>

Kirill Miazine wrote:
> 

[replied via email]


------------------------------

Date: Mon, 3 Jul 2000 11:41:14 -0500
From: "edge" <eggrock@yahoo.com>
Subject: Off topic: Circular mapping logic
Message-Id: <8jqfsl$k9$1@shadow.skypoint.net>

This may prove to be a long read, and is a logic question, not specifically
a Perl question. I'd apologize but if I were truly penitent I wouldn't post
this in the first place....

If you've run into this and know of a more efficient way to do this then
please let me know. Here's the scoop:

We have a system in place on a server that allows for e-mail aliasing within
a hosted domain. The e-mail addresses in this domain aren't 'real' e-mail
drops, but simply pointers to a valid address.

For the domain e.com, I may have several maps like this:
a@e.com => realaddress@somedomain.com
b@e.com => anotheraddy@adomain.com

And so on.

I've written a program that allows me to easily view and modify all
information within each database (each domain has a separate file).

What I would like to do is 'idiot proof' the program so that anyone can use
this program without fear of making a crucial mistake like this:

a@e.com => b@e.com
b@e.com => a@e.com

No, no, NO. a maps to b, which maps back to a, which is a circular map and
is NOT desired. I'm not worried about anyone doing this now, but who knows
what kind of riffraff will be using this in the future. :-)

So, I've been attempting to add a subroutine to the program that will check
for this type of thing.

The code that I have now is like this, using e.com as the domain:

#!/usr/bin/perl -w

$domain = "e.com";
%aliases = (
        a => 'b@e.com', #good map, b is valid
        b => 'b@a.com', #valid
        c => 'd@e.com', #circular
        d => 'e@e.com', #circular
        e => 'c@e.com', #circular
        f => 'z@e.com', #no map
        g => 'h@e.com', #good, ends up mapping to b
        h => 'i@e.com', #good, ditto
        i => 'j@e.com', #good, ditto
        j => 'a@e.com', #good, ditto
);

#make duplicate to preserve original data
%dup = %aliases;

#crappy way to count how many possible circular mappings there are using a
#counter
$circ = 0;

foreach $key (keys %dup) {

        #add a newline for pattern matching
        $dup{$key} = $dup{$key} . "\n";

        #get any maps that resolve back to $domain
        if($dup{$key} =~ /(.*?)\@$domain$/) {

                #is the map defined?
                if(defined($dup{$1})) {
                        #yes, add to the list of possible circular mappings
                        #and increment $circ
                        $circular{$key} = $1;
                        $circ++;
                } else {
                        #no, stick it in the @nomap array to be displayed
                        #later
                        push @nomap, "$key\@$domain => $1\@$domain";
                }#if
        } else {
                #maps that don't resolve back to $domain are fine and are
                #set as ! (for this example)
                $circular{$key} = "!";
        }#if
}#foreach

#set $changed to 1 if any changes are made, if not then exit this while loop
$changed = 0;
while(1) {
        foreach $key (keys %circular) {
                #the next two lines are just so I can watch what's happening
                print "Checking: $key => $circular{$key}\n";
                sleep 1;

                #if the map is not known to be valid, then...
                if($circular{$key} ne "!") {
                        #check the map's map to see if it's valid. If it is,
                        #set this as a valid map by changing the value to !,
                        #decrement $circ and set $changed to 1
                        if($circular{$circular{$key}} eq "!") {
                                $circular{$key} = "!";
                                $circ--;
                                $changed = 1;
                        }
                }
        }
        #if we made changes then continue the loop
        if($changed == 1) {
                $changed = 0;
                next;

        #if no changes were made, check to see if there are any unresolved
        #maps and display them
        } else {
                if($circ == 0) {
                        last;
                }#if
                print "Found circular maps.\n";
                foreach $key (keys %circular) {
                        if($circular{$key} ne "!") {
                                print "$key\@$domain => $circular{$key}\@$domain\n";
                        }
                }
                last;
        }
}

#show any maps to $domain that don't resolve to anything
foreach (@nomap) {
        print "Not mapped: $_\n";
}

Done. Now whether or not this is efficient, or even good Perl I don't know.
Point out the ignorance and I'll learn the correct way. For now, my
conundrum is with the logic as explained below.

The problem is this:
Aliases @e.com may map to /multiple/ addresses, like this:
a@e.com => b@d.net,c@rr.com,somebody@somewhere.org

 ...and I must check these as well. I was thinking about finding any mappings
that have commas (signifying multiple recipients) and checking them at a
later time, but I'm not sure how to proceed with it. The only thing I could
think of is to split the recipient addresses into key/value pairs like:
a.0 => 'b@d.net',
a.1 => 'c@rr.com',
a.2 => 'somebody@somewhere.org',

Then check all the a.# keys to make sure they're ALL valid, and if one isn't
display it as an invalid address.


That's about as concise as I can get it, and if somebody has an answer it
would help immensely in any future progs I write that use this. Specific
code examples would be welcome if you deem it feasible, but general answers
(geared somewhat towards the layman) are more welcome since it'll force me
to think rather than cut/paste/modify.

Enjoy!
Barry

The most enjoyable thing about this post is how Microsoft in its infinite
wisdom decided to 'httpify' any words that contain the @ symbol surrounded
by other characters. I mean, redefining standards is one thing but I figure
that plain text is plain text. Even better, in their omniscient knowledge
about all things computer related, they deem it unnecessary to disable this
'feature'. Thanks Microsoft, for doing all of my thinking for me.




------------------------------

Date: Tue, 04 Jul 2000 01:25:44 GMT
From: neil@brevity.org (Neil Kandalgaonkar)
Subject: Re: Off topic: Circular mapping logic
Message-Id: <8jre1q$4cd$1@localhost.localdomain>

In article <8jqfsl$k9$1@shadow.skypoint.net>, edge <eggrock@yahoo.com> wrote:

>What I would like to do is 'idiot proof' the program so that anyone can use
>this program without fear of making a crucial mistake like this:
>
>a@e.com => b@e.com
>b@e.com => a@e.com

Seems like excessive paranoia to me.

IIRC modern MTAs will not forward items more than a certain number of 
times before finally sending it back to the original sender. Of course,
this might be embarrassing if the sender is from outside the company.


>The problem is this:
>Aliases @e.com may map to /multiple/ addresses, like this:
>a@e.com => b@d.net,c@rr.com,somebody@somewhere.org

I had trouble even following your logic. So I wrote it my way.

It's not particularly efficient, and it has the flaw of not always
checking every path. When it finds a certain rule has any circularity,
it screams "circular!" and goes to the next one. I have a feeling
this could be simplified a bit.



#!/usr/bin/perl -w

use strict;

my $domain = "e.com";
my %aliases = (
        a => 'b@e.com', #good map, b is valid
        b => 'b@a.com', #valid
        c => 'd@e.com', #circular
        d => 'e@e.com', #circular
        e => 'c@e.com', #circular
        f => 'z@e.com', #no map
        g => 'h@e.com', #good, ends up mapping to b
        h => 'i@e.com', #good, ditto
        i => 'j@e.com', #good, ditto
        j => 'a@e.com', #good, ditto
        k => 'l@e.com, d@e.com',  # first is ok, second is circular.
        m => 'n@e.com, b@e.com',  # b is ok, but n leads to k, which is circular.
        n => 'o@e.com, joe@foobar.com, k@e.com',  # last is circular.
);


# an address is circular if it appears somewhere in its own chain.


for my $addr (sort keys %aliases) {
	# print "TRYING: $addr\n";
	my @chain = circular ($addr);
	
	if ( @chain ) {
		print "$addr is circular! ";
		print join " -> ", @chain;
		print "\n";
	} else {
		print "$addr is ok\n";
	}
}

sub circular {
	my (@chain) = @_;

	my $this_addr = pop @chain;

	if ( grep { $_ eq $this_addr } @chain ) {
		return @chain, $this_addr;
	}
	
	for my $alias ( aliases_in_our_domain($this_addr) ) {
	#	print "checking out @chain $this_addr $alias...\n";
		my @circ = circular ( @chain, $this_addr, $alias );
		return @circ if @circ;
	}

	return ();
	
}


sub aliases_in_our_domain {
	my ($this_addr) = @_;
	return () unless $aliases{$this_addr};
	
	# the RFC 822 police will whine about this.
	# just ensure it works for whatever your setup is.

	my @next_aliases;
	for my $alias ( split /\s*,\s*/ => $aliases{$this_addr} ) {
		my ($alias_addr, $alias_domain) = split /@/, $alias;
		if ($alias_domain) { 
			next if $alias_domain ne $domain;
		}
		push @next_aliases, $alias_addr;
	}
	# print "next for $this_addr:", (map {"<$_> "} @next_aliases), "\n";
	return @next_aliases;
}




-- 
Neil Kandalgaonkar <neil@brevity.org>


------------------------------

Date: Tue, 04 Jul 2000 17:31:59 GMT
From: bart.lateur@skynet.be (Bart Lateur)
Subject: Re: Off topic: Circular mapping logic
Message-Id: <39661029.2507271@news.skynet.be>

Neil Kandalgaonkar wrote:

>>a@e.com => b@e.com
>>b@e.com => a@e.com
>
>Seems like excessive paranoia to me.
>
>IIRC modern MTAs will not forward a message more than a certain number of
>times before finally sending it back to the original sender. Of course,
>this might be embarrassing if the sender is from outside the company.

You give me a good idea for a testing approach.

This methodology is indeed common in forwarding IP packets. The common
name is TTL, for "Time To Live". Each time a packet is forwarded, a
special integer field (TTL) is decremented. As soon as its value
reaches zero, the packet is dropped. It is indeed intended to prevent
endless loops.

Now, what if you create a test program for these data, using a TTL
scheme? Start with an upper bound for TTL, typically a bit more than the
number of different names. Now, for each name, start following links.
Each time, decrement TTL. When it reaches zero, this means you've
followed a circular reference.
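
A minimal sketch of that TTL idea in Perl. The alias table and the name
ttl_expired are made up for illustration, and each alias is assumed to be
already reduced to a list of local names. Because one alias can have
several targets, the sketch follows every branch and reports a loop as
soon as any one of them uses up its TTL.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical sample data (not from the thread): local name => local targets.
my %aliases = (
    a => ['b'],
    b => [],
    c => ['d'],
    d => ['e'],
    e => ['c'],
    k => ['l', 'd'],
);

# Returns true if some path starting at $name is still being followed
# after $ttl hops, i.e. the TTL has run down to zero.
sub ttl_expired {
    my ($name, $ttl) = @_;
    return 1 if $ttl <= 0;
    for my $next ( @{ $aliases{$name} || [] } ) {
        return 1 if ttl_expired($next, $ttl - 1);
    }
    return 0;
}

# A loop-free path can visit each known name at most once, so one more
# hop than there are names is a safe upper bound.
my $max_ttl = scalar(keys %aliases) + 1;

for my $name (sort keys %aliases) {
    print "$name: ", ( ttl_expired($name, $max_ttl) ? "circular" : "ok" ), "\n";
}
```

Note that this only says a name leads into a loop, not which names form
the loop, and with many multi-target aliases it may walk the same branch
more than once.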

Alternatively, there's the "garbage collection" scheme. Create a marker
flag for each host. Set them all to false. Start following links, setting
the marker flag as you pass by. When you get to a host with the marker
already set, you've followed a circular link.
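
A rough Perl sketch of the marker scheme, again with a made-up alias
table and a hypothetical sub name (marked_twice). One adjustment is an
assumption of mine rather than part of the description above: since an
alias may have several targets, the marker is cleared again when the walk
backs out of a name, so that two separate routes to the same address are
not mistaken for a loop.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical sample data (not from the thread): local name => local targets.
my %aliases = (
    a => ['b', 'c'],   # two routes to d, but no loop
    b => ['d'],
    c => ['d'],
    d => [],
    x => ['y'],
    y => ['x'],        # x and y point at each other
);

# Follows links from $name, marking each name while it is on the current
# path. Returns true when it arrives at a name whose marker is already set.
sub marked_twice {
    my ($name, $marker) = @_;
    return 1 if $marker->{$name};
    $marker->{$name} = 1;
    for my $next ( @{ $aliases{$name} || [] } ) {
        return 1 if marked_twice($next, $marker);
    }
    $marker->{$name} = 0;    # leaving this path: clear the marker again
    return 0;
}

for my $name (sort keys %aliases) {
    my %marker;              # all markers start out false
    print "$name: ", ( marked_twice($name, \%marker) ? "circular" : "ok" ), "\n";
}
```

Unlike the TTL version, this needs no upper bound chosen in advance, at
the cost of keeping one flag per name.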

-- 
	Bart.


------------------------------

Date: 16 Sep 99 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 16 Sep 99)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

| NOTE: The mail to news gateway, and thus the ability to submit articles
| through this service to the newsgroup, has been removed. I do not have
| time to individually vet each article to make sure that someone isn't
| abusing the service, and I no longer have any desire to waste my time
| dealing with the campus admins when some fool complains to them about an
| article that has come through the gateway instead of complaining
| to the source.

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V9 Issue 3587
**************************************

