[31336] in Perl-Users-Digest


Perl-Users Digest, Issue: 2581 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Sep 3 14:10:26 2009

Date: Thu, 3 Sep 2009 11:09:51 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Thu, 3 Sep 2009     Volume: 11 Number: 2581

Today's topics:
        @ARGV array incorrect when calling perl program from sy <udo.grabowski@imk.fzk.REMOVEJUNK.de>
    Re: @ARGV array incorrect when calling perl program fro <peter@makholm.net>
    Re: @ARGV array incorrect when calling perl program fro <udo.grabowski@imk.fzk.REMOVEJUNK.de>
    Re: @ARGV array incorrect when calling perl program fro <yankeeinexile@gmail.com>
    Re: @ARGV array incorrect when calling perl program fro <RedGrittyBrick@spamweary.invalid>
    Re: @ARGV array incorrect when calling perl program fro <udo.grabowski@imk.fzk.REMOVEJUNK.de>
        Can't locate Term/ANSIColor.pm <bubunia2000ster@gmail.com>
        command perl - SR <fred78980@yahoo.com>
    Re: Data cleaning issue involving bad wide characters i <jurgenex@hotmail.com>
    Re: Data cleaning issue involving bad wide characters i <jurgenex@hotmail.com>
    Re: Data cleaning issue involving bad wide characters i <r.ted.byers@gmail.com>
    Re: Data cleaning issue involving bad wide characters i <jurgenex@hotmail.com>
        Data cleaning issue involving bad wide characters in wh <r.ted.byers@gmail.com>
        FAQ 3.7 How do I cross-reference my Perl programs? <brian@theperlreview.com>
        FAQ 3.8 Is there a pretty-printer (formatter) for Perl? <brian@theperlreview.com>
        FAQ 4.35 How do I find the soundex value of a string? <brian@theperlreview.com>
    Re: Help me with authomatic authentication <derykus@gmail.com>
    Re: need help, will pay <rkb@i.frys.com>
        Small date question <tuxedo@mailinator.com>
    Re: Small date question <ben@morrow.me.uk>
    Re: Small date question <tadmc@seesig.invalid>
    Re: Small date question <jurgenex@hotmail.com>
    Re: Small date question <tuxedo@mailinator.com>
    Re: Small date question <tuxedo@mailinator.com>
    Re: Small date question <tuxedo@mailinator.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Thu, 03 Sep 2009 15:11:29 +0200
From: Udo Grabowski <udo.grabowski@imk.fzk.REMOVEJUNK.de>
Subject: @ARGV array incorrect when calling perl program from system
Message-Id: <h7ofa1$jh9$1@news2.rz.uni-karlsruhe.de>

Hi,

the following construction gives additional @ARGV entries which shouldn't
be there:

  @command = ("perl-program","--help",">","logfile");
  eval { system(@command) };

Inside perl-program (which starts with #!/usr/bin/perl),
@ARGV now contains also '>' and 'logfile' as additional
entries. This contradicts the definition of @ARGV:
   "The array @ARGV contains the command-line arguments
    INTENDED FOR THE SCRIPT"

Since the redirection parameters are not script parameters, they
shouldn't appear (and didn't appear before). This bug breaks all
and everything depending on @ARGV and the number of parameters !

Is this a perl bug, or is the system command of the OS responsible
for this broken behaviour ? Or even the distributor of this perl
version ?

Platform: OpenSolaris 2009.06,
Version:  perl 5.8.8 (Coolstack distribution)
           i86pc-solaris-thread-multi


------------------------------

Date: Thu, 03 Sep 2009 15:29:11 +0200
From: Peter Makholm <peter@makholm.net>
Subject: Re: @ARGV array incorrect when calling perl program from system
Message-Id: <87ocpsrwbc.fsf@vps1.hacking.dk>

Udo Grabowski <udo.grabowski@imk.fzk.REMOVEJUNK.de> writes:

>  @command = ("perl-program","--help",">","logfile");
>  eval { system(@command) };

Read the documentation for the system command:

  If there is more than one argument in LIST, or if LIST is an array
  with more than one value, starts the program given by the first
  element of the list with arguments given by the rest of the list. 

So of course '>' and 'logfile' are passed to the script as script
arguments. If you need any kind of shell handling, you have to use
one, and only one, argument to system.

I can't imagine that this has changed recently on any posix like
platform.

//Makholm


------------------------------

Date: Thu, 03 Sep 2009 15:48:30 +0200
From: Udo Grabowski <udo.grabowski@imk.fzk.REMOVEJUNK.de>
Subject: Re: @ARGV array incorrect when calling perl program from system
Message-Id: <h7ohfe$k9e$1@news2.rz.uni-karlsruhe.de>

Peter Makholm wrote:
> Udo Grabowski <udo.grabowski@imk.fzk.REMOVEJUNK.de> writes:
> 
>>  @command = ("perl-program","--help",">","logfile");
>>  eval { system(@command) };
> 
> Read the documentation for the system command:
> 
>   If there is more than one argument in LIST, or if LIST is an array
>   with more than one value, starts the program given by the first
>   element of the list with arguments given by the rest of the list. 
> 
> So of course '>' and 'logfile' is parsed to the script as script
> arguments. If you need any kind of shell handling, you have to use
> one, and only one, argument to system.


So this means that calling 'perl-program --help > logfile' and
using the construction above (which should be equivalent, since
it should be transparent to the program from whom it is called)
will give inconsistent @ARGV arrays. So I still consider this
to be a bug (referring to Unix semantics) instead of a feature.

If this is really Posix' definition, then Posix is inconsistent.
But I think that 'system' called here is the perl implementation,
so maybe that is simply done wrong there.

Workaround is indeed to call it this way:
   $return=system('perl-program --help > logfile')


------------------------------

Date: Thu, 03 Sep 2009 08:58:53 -0500
From: Lawrence Statton <yankeeinexile@gmail.com>
Subject: Re: @ARGV array incorrect when calling perl program from system
Message-Id: <m1ocpsruxu.fsf@mac.gateway.2wire.net>

Udo Grabowski <udo.grabowski@imk.fzk.REMOVEJUNK.de> writes:
> Peter Makholm wrote:
> So this means that calling 'perl-program --help > logfile' and
> using the construction above (which should be equivalent, since
> it should be transparent to the program from whom it is called)

"should" according to whom?  The documentation clearly points out that
they are *not* equivalent.  One builds the argv[] array manually, and
forks and execs your binary without ever going near a shell.  The other
form passes the string to a shell for argument parsing (using the
shell's rules for "argument vs. shell redirection").
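
The difference between the two forms can be seen in a minimal sketch.
It assumes a POSIX-like system with /bin/sh and temporary paths without
spaces; show_argv.pl is a hypothetical stand-in for perl-program that
only prints the arguments it received:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir    = tempdir(CLEANUP => 1);
my $log    = "$dir/logfile";
my $script = "$dir/show_argv.pl";   # hypothetical stand-in for perl-program

# The helper prints the arguments it received, separated by "|".
open my $out, '>', $script or die "Cannot write $script: $!";
print {$out} 'print join("|", @ARGV), "\n";';
close $out or die "Cannot close $script: $!";

# List form: the program is exec'd directly and no shell is involved,
# so '>' and the file name reach the script as ordinary arguments.
system($^X, $script, '--help', '>', $log);
my $log_after_list_form = -e $log ? 'created' : 'not created';

# Single-string form: the string contains the shell metacharacter '>',
# so perl hands it to the shell, which performs the redirection and
# passes only '--help' on to the script.
system("$^X $script --help > $log");

open my $in, '<', $log or die "Cannot read $log: $!";
my $logged = do { local $/; <$in> };
close $in;

print "list form: logfile $log_after_list_form\n";
print "string form logged: $logged";
```

With the list form the helper itself prints all three arguments to the
terminal and no logfile appears; with the string form the logfile
receives the single argument --help.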


> will give inconsistent @ARGV arrays. So I still consider this
> to be a bug (referring to Unix semantics) instead of a feature.


You're still wrong.



------------------------------

Date: Thu, 03 Sep 2009 15:14:51 +0100
From: RedGrittyBrick <RedGrittyBrick@spamweary.invalid>
Subject: Re: @ARGV array incorrect when calling perl program from system
Message-Id: <4a9fcf5e$0$2539$da0feed9@news.zen.co.uk>


Udo Grabowski wrote:
> Peter Makholm wrote:
>> Udo Grabowski <udo.grabowski@imk.fzk.REMOVEJUNK.de> writes:
>>
>>>  @command = ("perl-program","--help",">","logfile");
>>>  eval { system(@command) };
>>
>> Read the documentation for the system command:
>>
>>   If there is more than one argument in LIST, or if LIST is an array
>>   with more than one value, starts the program given by the first
>>   element of the list with arguments given by the rest of the list.
>> So of course '>' and 'logfile' is parsed to the script as script
>> arguments. If you need any kind of shell handling, you have to use
>> one, and only one, argument to system.
> 
> 
> So this means that calling 'perl-program --help > logfile' and
> using the construction above (which should be equivalent, 

No they should not be equivalent.

perldoc -f system

"Note that argument processing varies
depending on the number of arguments. If there is more than one
argument in LIST, or if LIST is an array with more than one
value, starts the program given by the first element of the list
with arguments given by the rest of the list. If there is only
one scalar argument, the argument is checked for shell
metacharacters, and if there are any, the entire argument is
passed to the system's command shell for parsing"

In one case a shell is *not* ever invoked, and therefore shell features 
such as output redirection are not applicable.

In the other case, a shell *will* be invoked to process any shell 
metacharacters.
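
If the list form is wanted (no shell parsing of the arguments) but the
output should still go to a file, the redirection can be done by hand
before the exec. This is only a sketch under assumptions: run_to_file
is a hypothetical helper name, the exit codes 126 and 127 merely mimic
shell conventions, and fork is assumed to be available (Unix-like
platforms):

```perl
use strict;
use warnings;
use POSIX ();

# Run @cmd with STDOUT redirected to $file, without invoking a shell.
# Returns the child's exit status.
sub run_to_file {
    my ($file, @cmd) = @_;

    my $pid = fork;
    die "Cannot fork: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: do the redirection ourselves, then replace this process.
        open STDOUT, '>', $file
            or do { warn "Cannot open $file: $!\n"; POSIX::_exit(126) };
        exec { $cmd[0] } @cmd
            or do { warn "Cannot exec $cmd[0]: $!\n"; POSIX::_exit(127) };
    }

    waitpid $pid, 0;    # parent: wait for the child to finish
    return $? >> 8;
}
```

A call such as run_to_file('logfile', 'perl-program', '--help') would
then leave only '--help' in the program's @ARGV.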

> since
> it should be transparent to the program from whom it is called)
> will give inconsistent @ARGV arrays. So I still consider this
> to be a bug (referring to Unix semantics) instead of a feature.

If a program's behaviour agrees with its documentation I consider it a 
feature. No matter how much I might dislike or be confused by that feature.

Since changing this feature would break many existing Perl programs I 
strongly doubt it will ever get changed in Perl 5.

-- 
RGB


------------------------------

Date: Thu, 03 Sep 2009 16:17:43 +0200
From: Udo Grabowski <udo.grabowski@imk.fzk.REMOVEJUNK.de>
Subject: Re: @ARGV array incorrect when calling perl program from system
Message-Id: <h7oj67$kuv$1@news2.rz.uni-karlsruhe.de>

Lawrence Statton wrote:
> One builds the argv[] array manually, and
> forks and execs your binary without ever going near a shell.  

Thanks for pointing that out, it wasn't clear to me from the doc
that no shell is involved when calling it this way. Maybe
your sentence would be helpful in the doc to avoid confusion.



------------------------------

Date: Wed, 2 Sep 2009 22:19:23 -0700 (PDT)
From: Pradeep <bubunia2000ster@gmail.com>
Subject: Can't locate Term/ANSIColor.pm
Message-Id: <f98e9140-e308-4c28-a3e8-70e8c4715ef3@r39g2000yqm.googlegroups.com>

Hi all,
   I have installed Perl 5.8.8 (ActivePerl-5.8.8.824-MSWin32-x86-287188.msi
from http://downloads.activestate.com/ActivePerl/Windows/5.8/) and also
Perl 5.8.9. I get the following error when my script runs.
c:\perl\bin;c:\perl\lib;c:\perl\site\lib has already been appended to the
PATH environment variable.


     [exec] Can't locate Term/ANSIColor.pm in @INC (@INC contains:
/usr/lib/perl5/5.8/cygwin /usr/lib/perl5/5.8
/usr/lib/perl5/site_perl/5.8/cygwin /usr/lib/perl5/site_perl/5.8
/usr/lib/perl5/site_perl/5.8 /usr/lib/perl5/vendor_perl/5.8/cygwin
/usr/lib/perl5/vendor_perl/5.8 /usr/lib/perl5/vendor_perl/5.8 .)
at -e line 2.
     [exec] BEGIN failed--compilation aborted at -e line 2.

Can anybody help me in this regard?

Thanks
Pradeep


------------------------------

Date: Thu, 3 Sep 2009 10:55:18 -0700 (PDT)
From: fred <fred78980@yahoo.com>
Subject: command perl - SR
Message-Id: <44c40c9e-fd7f-44f4-b5d3-8991ac52fcc7@j9g2000vbp.googlegroups.com>

I would like to replace the (empty) third field with foo on every fifth
line. This command is not correct. Can you help me fix it?

!perl -pe 's(&&)($n++ = 5 ? (&foo&)eg' text.txt

xxxx&(ght)(hgf)&&(yyt)
xx9x&(gg)(ff)&&(yyt)
oixxx&(hfd)(jj)&&(yyt)
xxxx&(jj)(kk)&&(yyt)
xjhxxx&(jj)(j)&&(yyt)


Thanks


------------------------------

Date: Thu, 03 Sep 2009 09:40:33 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: Data cleaning issue involving bad wide characters in what ought  to be ascii data
Message-Id: <duqv9512mmnunm1999bng14ekd3d5itujb@4ax.com>

Ted Byers <r.ted.byers@gmail.com> wrote:
>On Sep 3, 11:51 am, Jürgen Exner <jurge...@hotmail.com> wrote:
>> Ted Byers <r.ted.by...@gmail.com> wrote:
>My program needs to store the data as plain ascii 

I dare to question the wisdom of this requirement. In today's world
restricting your data to ASCII only is a severe limitation and will more
often than not backfire when you least expect it. Does your data contain
e.g. any names? Customers, employees, places, tools or equipment named
after people or places? Can you guarantee that it will never be used
outside of the English-speaking world, not even for Spanish names in the
US?
A much more robust way is to finally accept that ASCII is almost 50
years old, obsolete, and completely inadequate for today's world and to
use Unicode/UTF-8 as the standard throughout.

>regardless of how the original data was encoded.  

If you insist on limiting yourself to ASCII only then obviously you will
have to deal with any non-ASCII character in some way. What do you
propose to do with e.g. my first name?

>And apart from this string, it looks
>like all the data can be safely treated as ascii.  The data comes as a
>text/html attachment to the emails, so I am wondering if the headers
>to the email might tell me something about the encoding ...

Sorry, I'm not a MIME expert.

>> >How can I make certain that
>> >when I either print it or store it in my DB, I get the correct
>> >"rec'd" (or, better, "received")?

Convert it, transform it, remove it, reject it, ....
If it's really, really, really only this one instance ever, then
probably a simple s/// will do. But that will work only until some other
non-ASCII character shows up at your doorstep.
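
A minimal sketch of the "convert it" option, assuming the input really
is UTF-8 (as the byte sequence \342\200\231 suggests) and using the core
Encode module. The name to_ascii and the decision to spell out any
remaining non-ASCII character as <U+XXXX> are illustrative assumptions,
not something proposed in this thread:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Turn UTF-8 bytes into plain ASCII. Typographic quotes are mapped to
# their ASCII look-alikes; anything else outside ASCII is made visible
# as <U+XXXX> instead of being dropped silently.
sub to_ascii {
    my ($bytes) = @_;
    my $text = decode('UTF-8', $bytes);                  # bytes -> characters
    $text =~ tr/\x{2018}\x{2019}\x{201C}\x{201D}/''""/;  # curly -> straight quotes
    return encode('ascii', $text, sub { sprintf '<U+%04X>', shift });
}

print to_ascii("Rec\342\200\231d"), "\n";   # prints: Rec'd
print to_ascii("J\xC3\xBCrgen"), "\n";      # prints: J<U+00FC>rgen
```

Unlike a one-off s///, this keeps working when a different non-ASCII
character turns up, and it makes that character visible so a mapping
for it can be added deliberately.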

>> Does the file have a BOM? AFAIR Notepad uses the BOM to determine if a
>> file is in UTF-8 in disregard of UTF-8 being a byte sequence and thus
>> files in UTF-8 typically neither having nor needing a BOM.
>>
>I don't know what a BOM is, let alone how to tell if a file has one.

See http://en.wikipedia.org/wiki/Byte-order_mark. You might be able to
use it to determine the encoding of your data.
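
Checking for a BOM only requires looking at the first few raw bytes of
the file. The following is a sketch; the function names are made up for
illustration, and a missing BOM says nothing about the encoding (UTF-8
files usually have none):

```perl
use strict;
use warnings;

# Map the leading bytes of a file to the encoding its BOM announces.
# Returns undef when no known BOM is present. The UTF-32 checks come
# first because the UTF-16LE BOM is a prefix of the UTF-32LE one.
sub bom_encoding {
    my ($head) = @_;
    return 'UTF-8'    if $head =~ /\A\xEF\xBB\xBF/;
    return 'UTF-32LE' if $head =~ /\A\xFF\xFE\x00\x00/;
    return 'UTF-32BE' if $head =~ /\A\x00\x00\xFE\xFF/;
    return 'UTF-16LE' if $head =~ /\A\xFF\xFE/;
    return 'UTF-16BE' if $head =~ /\A\xFE\xFF/;
    return;
}

# Read the first four bytes of a file without any decoding layer.
sub file_bom {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $head = '';
    read $fh, $head, 4;
    close $fh;
    return bom_encoding($head);
}
```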

>Is there a safe way to ensure that all the data that is being
>processed is plain ascii?

Only if the character set is explicitly specified as ASCII. Every other
character set does contain non-ASCII characters which you will have to
handle.
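
Whether a given chunk of data happens to be pure ASCII can at least be
verified rather than assumed. A small sketch (non_ascii_bytes is a
hypothetical helper name) that reports every byte outside the 7-bit
range:

```perl
use strict;
use warnings;

# Return a list of [offset, byte value] pairs for every byte outside
# the 7-bit ASCII range; an empty list means the data is pure ASCII.
sub non_ascii_bytes {
    my ($bytes) = @_;
    my @found;
    while ($bytes =~ /([^\x00-\x7F])/g) {
        push @found, [ pos($bytes) - 1, ord $1 ];
    }
    return @found;
}

for my $hit (non_ascii_bytes("Rec\342\200\231d")) {
    printf "offset %d: 0x%02X\n", @$hit;
}
# prints:
# offset 3: 0xE2
# offset 4: 0x80
# offset 5: 0x99
```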

>I have seen email clients displaying this
>data so I know that there are never characters in it, as displayed,
>that would not be valid ascii.

Would you bet your house on it?

jue


------------------------------

Date: Thu, 03 Sep 2009 09:51:17 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: Data cleaning issue involving bad wide characters in what ought  to be ascii data
Message-Id: <2qsv955u66v3pdci3km9siq2pr5g09iq6c@4ax.com>

Ted Byers <r.ted.byers@gmail.com> wrote:
>I thought I'd have to resort to a regex, if I could figure out what to
>scan for, but if there is a perl package that will make it easier to
>deal with this odd character, great.

Forgot to mention:
There is Text::Iconv (see
http://search.cpan.org/~mpiotr/Text-Iconv-1.7/Iconv.pm) which will
convert text between different encodings. However I have no idea what it
does with characters that do not exist in the target character set.

jue


------------------------------

Date: Thu, 3 Sep 2009 09:09:31 -0700 (PDT)
From: Ted Byers <r.ted.byers@gmail.com>
Subject: Re: Data cleaning issue involving bad wide characters in what ought  to be ascii data
Message-Id: <8c0acc8d-231a-437f-9eed-e34a23bb2724@k26g2000vbp.googlegroups.com>

On Sep 3, 11:51 am, Jürgen Exner <jurge...@hotmail.com> wrote:
> Ted Byers <r.ted.by...@gmail.com> wrote:
> >Again, I am trying to automatically process data I receive by email,
> >so I have no control over the data that is coming in.
>
> >The data is supposed to be plain text/HTML, but there are quite a
> >number of records where the contraction "rec'd" is misrepresented when
> >written to standard out as "Rec\342\200\231d"
>
> >When the data is written to a file, these characters are represented
> >by the character ' when it is opened using notepad, but by the string
> 'â€™' when it is opened by open office.
>
> >So how do I tell what character it is when in three different contexts
> >it is displayed in three different ways?
>
> By explicitly telling the displaying program the encoding that was used
> to create/save the file. In your case it very much looks like UTF-8.
>
My program needs to store the data as plain ascii regardless of how
the original data was encoded.  And apart from this string, it looks
like all the data can be safely treated as ascii.  The data comes as a
text/html attachment to the emails, so I am wondering if the headers
to the email might tell me something about the encoding ...

> >How can I make certain that
> >when I either print it or store it in my DB, I get the correct
> >"rec'd" (or, better, "received")?
>
> >I suspect a minor glitch in the software that makes and send the email
> >as this is the ONLY string where what ought to be an ascii ' character
> >is identified as a wide character.
>
> That's not a wide character. A wide character is something totally
> different.
>
I have done almost no programming dealing with i18n, so I called it a
wide character because that's what Emacs called it when my program
wrote the data to standard out.

> >Regardless of how that happens (as
> I don't control that), I need to clean this.  And it gets confusing
> >when different applications handle the i18n differently (Notepad is
> >undoubtedly using the OS i18n support and Open Office is handling it
> >differently, and Emacs is doing it differently from both).
>
> Yep. If the file doesn't contain information about the encoding and/or
> the application either doesn't support this encoding or misinterprets it
> or cannot guess the encoding correctly then you will have to tell the
> application which encoding to use (or use a different application).
>
> Does the file have a BOM? AFAIR Notepad uses the BOM to determine if a
> file is in UTF-8 in disregard of UTF-8 being a byte sequence and thus
> files in UTF-8 typically neither having nor needing a BOM.
>
> jue
I don't know what a BOM is, let alone how to tell if a file has one.

Is there a safe way to ensure that all the data that is being
processed is plain ascii?  I have seen email clients displaying this
data so I know that there are never characters in it, as displayed,
that would not be valid ascii.

I thought I'd have to resort to a regex, if I could figure out what to
scan for, but if there is a perl package that will make it easier to
deal with this odd character, great.

Thanks
Ted


------------------------------

Date: Thu, 03 Sep 2009 08:51:55 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: Data cleaning issue involving bad wide characters in what ought to be  ascii data
Message-Id: <doov95pbgohbg88b1tedbo2b09d88r96mg@4ax.com>

Ted Byers <r.ted.byers@gmail.com> wrote:
>Again, I am trying to automatically process data I receive by email,
>so I have no control over the data that is coming in.
>
>The data is supposed to be plain text/HTML, but there are quite a
>number of records where the contraction "rec'd" is misrepresented when
>written to standard out as "Rec\342\200\231d"
>
>When the data is written to a file, these characters are represented
>by the character ' when it is opened using notepad, but by the string
>'â€™' when it is opened by open office.
>
>So how do I tell what character it is when in three different contexts
>it is displayed in three different ways? 

By explicitly telling the displaying program the encoding that was used
to create/save the file. In your case it very much looks like UTF-8.

>How can I make certain that
>when I either print it or store it in my DB, I get the correct
>"rec'd" (or, better, "received")?
>
>I suspect a minor glitch in the software that makes and send the email
>as this is the ONLY string where what ought to be an ascii ' character
>is identified as a wide character.

That's not a wide character. A wide character is something totally
different.

>Regardless of how that happens (as
>I don't control that), I need to clean this.  And it gets confusing
>when different applications handle the i18n differently (Notepad is
>undoubtedly using the OS i18n support and Open Office is handling it
>differently, and Emacs is doing it differently from both).

Yep. If the file doesn't contain information about the encoding and/or
the application either doesn't support this encoding or misinterprets it
or cannot guess the encoding correctly then you will have to tell the
application which encoding to use (or use a different application).

Does the file have a BOM? AFAIR Notepad uses the BOM to determine if a
file is in UTF-8 in disregard of UTF-8 being a byte sequence and thus
files in UTF-8 typically neither having nor needing a BOM.

jue


------------------------------

Date: Thu, 3 Sep 2009 07:10:36 -0700 (PDT)
From: Ted Byers <r.ted.byers@gmail.com>
Subject: Data cleaning issue involving bad wide characters in what ought to be  ascii data
Message-Id: <33cfe446-b799-4580-ab9d-e45681f00ffd@p15g2000vbl.googlegroups.com>

Again, I am trying to automatically process data I receive by email,
so I have no control over the data that is coming in.

The data is supposed to be plain text/HTML, but there are quite a
number of records where the contraction "rec'd" is misrepresented when
written to standard out as "Rec\342\200\231d"

When the data is written to a file, these characters are represented
by the character ' when it is opened using notepad, but by the string
'â€™' when it is opened by open office.

So how do I tell what character it is when in three different contexts
it is displayed in three different ways?  How can I make certain that
when I either print it or store it in my DB, I get the correct
"rec'd" (or, better, "received")?

I suspect a minor glitch in the software that makes and sends the email
as this is the ONLY string where what ought to be an ascii ' character
is identified as a wide character.  Regardless of how that happens (as
I don't control that), I need to clean this.  And it gets confusing
when different applications handle the i18n differently (Notepad is
undoubtedly using the OS i18n support and Open Office is handling it
differently, and Emacs is doing it differently from both).

A little enlightenment would be appreciated.

Thanks

Ted


------------------------------

Date: Thu, 03 Sep 2009 16:00:03 GMT
From: PerlFAQ Server <brian@theperlreview.com>
Subject: FAQ 3.7 How do I cross-reference my Perl programs?
Message-Id: <7KRnm.7297$nP6.3821@newsfe25.iad>

This is an excerpt from the latest version perlfaq3.pod, which
comes with the standard Perl distribution. These postings aim to 
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

3.7: How do I cross-reference my Perl programs?

    The B::Xref module can be used to generate cross-reference reports for
    Perl programs.

        perl -MO=Xref[,OPTIONS] scriptname.plx



--------------------------------------------------------------------

The perlfaq-workers, a group of volunteers, maintain the perlfaq. They
are not necessarily experts in every domain where Perl might show up,
so please include as much information as possible and relevant in any
corrections. The perlfaq-workers also don't have access to every
operating system or platform, so please include relevant details for
corrections to examples that do not work on particular platforms.
Working code is greatly appreciated.

If you'd like to help maintain the perlfaq, see the details in 
perlfaq.pod.


------------------------------

Date: Thu, 03 Sep 2009 04:00:09 GMT
From: PerlFAQ Server <brian@theperlreview.com>
Subject: FAQ 3.8 Is there a pretty-printer (formatter) for Perl?
Message-Id: <dbHnm.128945$sC1.33437@newsfe17.iad>

This is an excerpt from the latest version perlfaq3.pod, which
comes with the standard Perl distribution. These postings aim to 
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

3.8: Is there a pretty-printer (formatter) for Perl?

    Perltidy is a Perl script which indents and reformats Perl scripts to
    make them easier to read by trying to follow the rules of the perlstyle.
    If you write Perl scripts, or spend much time reading them, you will
    probably find it useful. It is available at
    http://perltidy.sourceforge.net

    Of course, if you simply follow the guidelines in perlstyle, you
    shouldn't need to reformat. The habit of formatting your code as you
    write it will help prevent bugs. Your editor can and should help you
    with this. The perl-mode or newer cperl-mode for emacs can provide
    remarkable amounts of help with most (but not all) code, and even less
    programmable editors can provide significant assistance. Tom
    Christiansen and many other VI users swear by the following settings in
    vi and its clones:

        set ai sw=4
        map! ^O {^M}^[O^T

    Put that in your .exrc file (replacing the caret characters with control
    characters) and away you go. In insert mode, ^T is for indenting, ^D is
    for undenting, and ^O is for blockdenting--as it were. A more complete
    example, with comments, can be found at
    http://www.cpan.org/authors/id/TOMC/scripts/toms.exrc.gz

    The a2ps http://www-inf.enst.fr/%7Edemaille/a2ps/black+white.ps.gz does
    lots of things related to generating nicely printed output of documents.





------------------------------

Date: Thu, 03 Sep 2009 10:00:01 GMT
From: PerlFAQ Server <brian@theperlreview.com>
Subject: FAQ 4.35 How do I find the soundex value of a string?
Message-Id: <BsMnm.112308$nL7.46406@newsfe18.iad>

This is an excerpt from the latest version perlfaq4.pod, which
comes with the standard Perl distribution. These postings aim to 
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

4.35: How do I find the soundex value of a string?

    (contributed by brian d foy)

    You can use the Text::Soundex module. If you want to do fuzzy or close
    matching, you might also try the "String::Approx", and
    "Text::Metaphone", and "Text::DoubleMetaphone" modules.





------------------------------

Date: Wed, 2 Sep 2009 20:25:28 -0700 (PDT)
From: "C.DeRykus" <derykus@gmail.com>
Subject: Re: Help me with authomatic authentication
Message-Id: <f333b59b-475c-47d1-bfb9-afa8f8dd4ead@w37g2000prg.googlegroups.com>

On Sep 2, 3:08 pm, mattia <ger...@gmail.com> wrote:
> I just need to automatically authenticate to the following webpage:
> http://pf.rossoalice.alice.it/login.html
> Notice that the webpage has numerous redirects.
>

The WWW::Mechanize ('Mech') module can make it much easier by handling
cookies and redirecting automatically. You'll still usually  need to
inspect the login page source for form widget names  although 'Mech'
can figure them out if the form is simple.
If there're any embedded Javascript redirects, you'll need to extract
them and customize your Perl code to follow them too.

LWP::UserAgent (which Mech subclasses) can be used as well
but doesn't offer all the conveniences of Mech.

--
Charles DeRykus








------------------------------

Date: Wed, 2 Sep 2009 15:18:59 -0700 (PDT)
From: Ron Bergin <rkb@i.frys.com>
Subject: Re: need help, will pay
Message-Id: <f16c19ff-c476-4ead-9aaf-0730bd1021da@a37g2000prf.googlegroups.com>

On Sep 2, 10:06 am, boman <shamb...@yahoo.com> wrote:
>
> Yes, I've got the $|=1 set, and I made sure apache is not buffering
> output, but can't get around the browsers doing the buffering (I think
> this is what's happening).
>

If you run the script from the command line, you'll see that the issue
is not that the browser is buffering the netstat output; the netstat
command is buffering its output.


------------------------------

Date: Thu, 3 Sep 2009 01:50:11 +0200
From: Tuxedo <tuxedo@mailinator.com>
Subject: Small date question
Message-Id: <h7n0bj$dtg$01$1@news.t-online.com>

I would like to return the date and use it in daily file names, as in 
somefile-2009-09-03.txt etc. Anyway, the date part is where I got stuck....

Where in a Bash shell I would simply do: ...

date +%Y-%m-%d

 ... to return 2009-09-03

How could the same be done in a fairly short Perl code and without use of 
modules? (I guess by the localtime builtin procedures).

Any examples would be much appreciated.

Thanks,
Tuxedo


------------------------------

Date: Thu, 3 Sep 2009 01:14:16 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Small date question
Message-Id: <obe3n6-q4g2.ln1@osiris.mauzo.dyndns.org>


Quoth Tuxedo <tuxedo@mailinator.com>:
> I would like to return the date and use in daily file names as in 
> somefile-2009-09-03.txt etc. Anyway. the date part is where I got stuck....
> 
> Where in a Bash shell I would simply do: ...
> 
> date +%Y-%m-%d
> 
> ... to return 2009-09-03
> 
> How could the same be done in a fairly short Perl code and without use of 
> modules? (I guess by the localtime builtin procedures).

You could use localtime and format the result yourself, but it would be
much easier to use strftime from the POSIX module. POSIX is available
everywhere perl is. Unfortunately the Perl documentation for strftime
just points you to the C documentation for the same function; if you are
on a system that doesn't have C documentation installed, it's easy to
find it on the Web.
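
As a minimal sketch of the strftime approach applied to the file names
from the question (daily_name is a hypothetical helper name; the
optional epoch argument exists only so the result can be reproduced for
a fixed point in time):

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Build a name such as "somefile-2009-09-03.txt" from a prefix and an
# optional epoch time (defaults to now), using the local time zone.
sub daily_name {
    my ($prefix, $epoch) = @_;
    $epoch = time unless defined $epoch;
    my $date = strftime('%Y-%m-%d', localtime $epoch);
    return "$prefix-$date.txt";
}

print daily_name('somefile'), "\n";
```

The %Y, %m and %d conversions are the same ones the date command
accepts, so the format string can be carried over from the shell
unchanged.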

Ben



------------------------------

Date: Wed, 02 Sep 2009 19:18:19 -0500
From: Tad J McClellan <tadmc@seesig.invalid>
Subject: Re: Small date question
Message-Id: <slrnh9u262.3ki.tadmc@tadmc30.sbcglobal.net>

Tuxedo <tuxedo@mailinator.com> wrote:
> I would like to return the date and use in daily file names as in 
> somefile-2009-09-03.txt etc. Anyway. the date part is where I got stuck....
>
> Where in a Bash shell I would simply do: ...
>
> date +%Y-%m-%d
>
> ... to return 2009-09-03
>
> How could the same be done in a fairly short Perl code and without use of 
> modules? (I guess by the localtime builtin procedures).
>
> Any examples would be much appreciated.


sub today {
   my($day, $mon, $year) = (localtime)[3,4,5];

   return sprintf "%4d-%02d-%02d", $year+1900, $mon+1, $day;
} # end sub today


-- 
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"


------------------------------

Date: Wed, 02 Sep 2009 17:43:30 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: Small date question
Message-Id: <gd2u959gvecig3b7uuqpg0clm2gla9cqt9@4ax.com>

Tuxedo <tuxedo@mailinator.com> wrote:
>I would like to return the date and use in daily file names as in 
>somefile-2009-09-03.txt etc. Anyway. the date part is where I got stuck....
>
>Where in a Bash shell I would simply do: ...
>
>date +%Y-%m-%d
>
>... to return 2009-09-03
>
>How could the same be done in a fairly short Perl code 

One way:
use POSIX qw(strftime);
my $today = strftime "%Y-%m-%d", localtime;

>and without use of modules? 

Why would you want to tie your hands behind your back and wear a
blindfold?

>(I guess by the localtime builtin procedures).

Oh, if you already knew the answer, then why were you asking? 

Yes, indeed you could do it that way if you absolutely wanted to
reinvent the wheel. Just capture the 4th, 5th and 6th elements of the
return value of localtime, adjust for the different starting values as
described in the man page of localtime, and use sprintf to enforce your
desired 4-2-2 digit format.
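
A minimal sketch of those steps, for anyone who wants to see the offsets
at work. The sub name ymd is invented for this example, and gmtime with
fixed timestamps is used only so the output does not depend on the local
time zone; it assumes the usual Unix epoch.

```perl
use strict;
use warnings;

# Hypothetical helper: format a broken-down time list (as returned by
# localtime or gmtime) as YYYY-MM-DD.
sub ymd {
    my ($mday, $mon, $year) = @_[3, 4, 5];    # 4th, 5th and 6th elements
    # $year counts from 1900 and $mon from 0, so both need adjusting;
    # $mday already starts at 1.
    return sprintf "%04d-%02d-%02d", $year + 1900, $mon + 1, $mday;
}

print ymd(gmtime(0)), "\n";             # 1970-01-01
print ymd(gmtime(1251936000)), "\n";    # 2009-09-03
print ymd(localtime), "\n";             # today's date in the local time zone
```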

jue


------------------------------

Date: Thu, 3 Sep 2009 08:25:35 +0200
From: Tuxedo <tuxedo@mailinator.com>
Subject: Re: Small date question
Message-Id: <h7nngv$a67$00$1@news.t-online.com>

Ben Morrow wrote:

[...]

> just points you to the C documentation for the same function; if you are
> on a system that doesn't have C documentation installed, it's easy to
> find it on the Web.

Thanks for pointing this out!


------------------------------

Date: Thu, 3 Sep 2009 08:27:56 +0200
From: Tuxedo <tuxedo@mailinator.com>
Subject: Re: Small date question
Message-Id: <h7nnlc$a67$00$2@news.t-online.com>

Tad J McClellan wrote:

[...]


> sub today {
>    my($day, $mon, $year) = (localtime)[3,4,5];
> 
>    return sprintf "%4d-%02d-%02d", $year+1900, $mon+1, $day;
> } # end sub today
> 
> 

Thanks for this example, it works perfectly, and is just what I needed!



------------------------------

Date: Thu, 3 Sep 2009 08:46:57 +0200
From: Tuxedo <tuxedo@mailinator.com>
Subject: Re: Small date question
Message-Id: <h7nop1$o9d$02$1@news.t-online.com>

Jürgen Exner wrote:

[...]

> One way:
> use POSIX qw(strftime);
> my $today = strftime "%Y-%m-%d", localtime;
> 
> >and without use of modules?
> 
> Why would you want to tie your hands behind your back and wear a
> blindfold?

I mean modules that aren't included in perl as pre-installed standard 
modules. If a module is already included, that's good. But I don't want the 
hassle of installing a variety of modules for only small script procedures 
unless it is really needed, for easier portability.

> >(I guess by the localtime builtin procedures).
> 
> Oh, if you already knew the answer, then why were you asking?
> 

Far from being well versed in perl, I wasn't sure and wanted suggestions from 
those who know, like yourself. Asking is the only way to learn from experience 
without having the experience :-)

I definitely didn't know the answer. I only guessed.

> Yes, indeed you could do it that way if you absolutely wanted to
> reinvent the wheel. Just capture the 4th, 5th and 6th elements of the
> return value of localtime, adjust for the different starting values as
> described in the man page of localtime, and use sprintf to enforce your
> desired 4-2-2 digit format.

Thanks for this and the above POSIX example!



------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 2581
***************************************
