[27832] in Perl-Users-Digest
Perl-Users Digest, Issue: 9196 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Apr 24 14:05:54 2006
Date: Mon, 24 Apr 2006 11:05:09 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Mon, 24 Apr 2006 Volume: 10 Number: 9196
Today's topics:
Re: "our" from XS and some other questions (Anno Siegel)
ANNOOUNCE: DBIx::Admin::TableInfo V 2.00 <ron@savage.net.au>
How to remove all duplications of characters <ignoramus21673@NOSPAM.21673.invalid>
Re: How to remove all duplications of characters <David.Squire@no.spam.from.here.au>
Re: How to remove all duplications of characters <rwxr-xr-x@gmx.de>
Re: How to remove all duplications of characters <David.Squire@no.spam.from.here.au>
Re: How to remove all duplications of characters <ignoramus21673@NOSPAM.21673.invalid>
Re: How to remove all duplications of characters <David.Squire@no.spam.from.here.au>
Re: How to remove all duplications of characters <ignoramus21673@NOSPAM.21673.invalid>
Re: How to remove all duplications of characters <tadmc@augustmail.com>
Re: How to remove all duplications of characters <ignoramus21673@NOSPAM.21673.invalid>
Re: How to remove all duplications of characters <David.Squire@no.spam.from.here.au>
Re: How to remove all duplications of characters <rvtol+news@isolution.nl>
Re: How to remove all duplications of characters (Anno Siegel)
Re: How to remove all duplications of characters (Anno Siegel)
Re: Term::ReadKey on Win? 5.005 vs 5.8.8? <rvtol+news@isolution.nl>
Re: XS Progamming with Perl 6 <rvtol+news@isolution.nl>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: 24 Apr 2006 13:23:42 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: "our" from XS and some other questions
Message-Id: <4b41quFvedcnU1@news.dfncis.de>
Ferry Bolhar <bol@adv.magwien.gv.at> wrote in comp.lang.perl.misc:
> Anno Siegel:
>
> >> I wrote about XS programming. So, the question was, how can I mark
> >> a variable as "our"-ed from XS code to suppress the "Global variable..."
> >> message when later using the variable in Perl.
> >
> > That was answered. You can't.
>
> OK. I'm not happy with it, but this answer is better than no answer.
>
> > > Or, to be more precise: what happens behind the scenes, when a
> > > variable is declared with "our"?
> >
> > A lexical alias is created whose name is the variable name without the
> > package part.
>
> Sorry, I can't understand that. I know about "lexical variables" (those
> declared with "my") and about aliases like "*main::bar = *Foo::bar"
> to create them for variables in different namespaces. But what is a "lexical
> alias"?
The situation is much like this:
    for my $variable ( $main::variable ) {
        # $variable is a lexical alias for $main::variable, like the
        # one "our $variable" would create.
    }
except that the one-shot for gives you the alias in a nested block
while "our" creates it in the current block.
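A minimal sketch of that comparison, runnable under strict. Both the one-shot for and "our" provide a short name that refers to the same package variable; the variable name and the strings are only illustrative.

```perl
use strict;
use warnings;

# The package variable, fully qualified so that strict accepts it.
$main::variable = 'package value';

# One-shot for: $alias is a lexical alias, visible only in the nested block.
for my $alias ( $main::variable ) {
    $alias .= ' + for';
}

# "our": the unqualified name is an alias until the end of the enclosing block.
{
    our $variable;
    $variable .= ' + our';
}

print "$main::variable\n";    # package value + for + our
```

Both modifications end up in $main::variable, which is the point: neither construct creates a new variable, only a new name for the existing one.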
> > In subsequent code it overrides any reference to the original
> > package variable and thus avoids the warning. That's why you can't do it
> > from a function: You can't (or, if you can, you shouldn't) create a lexical
> > variable in your caller's scope.
>
> IIRC this can be done by creating a lexical in the caller's scratch pad
> instead of one's own. OK, whether this makes sense is another question...
It would fly in the face of the concept of "lexical scope". But even
if you did that, it wouldn't help. Both the warning "... used only once"
and the error "Global symbol ... requires explicit package name"
are generated at compile time. Your function, XS or otherwise, is only
called at run time, too late to suppress them.
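The timing can be illustrated with a string eval, which compiles its argument before running any of it. This is only a sketch: declare_late() is a hypothetical stand-in for an XS function that would try to "declare" the variable, and here it merely counts its calls.

```perl
use strict;
use warnings;

my $ran = 0;
sub declare_late { $ran++ }    # hypothetical stand-in for the XS function

# The whole string is compiled first. The strict error for $undeclared is
# raised during that compilation, so declare_late() never gets to run.
my $ok = eval q{
    use strict;
    declare_late();
    $undeclared = 1;
    1;
};
my $error = $@;

print "compiled and ran: ", ( $ok ? "yes" : "no" ), "\n";
print "declare_late() calls: $ran\n";
print "error: $error";
```

The call count stays at zero, because the error comes from the compiler and not from executing the assignment.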
Anno
--
If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers.
------------------------------
Date: Mon, 24 Apr 2006 03:40:41 GMT
From: Ron Savage <ron@savage.net.au>
Subject: ANNOOUNCE: DBIx::Admin::TableInfo V 2.00
Message-Id: <Iy89Kv.85E@zorch.sf-bay.org>
The pure Perl module DBIx::Admin::TableInfo V 2.00
is available immediately from CPAN,
and from http://savage.net.au/Perl-modules.html.
On-line docs, and a *.ppd for ActivePerl are also
available from the latter site.
An extract from the docs:
2.00 Thu Apr 20 11:19:00 2006
- Add primary key info
- Add foreign key info
- Rename parameters to new():
o table_catalog is now catalog
o table_schema is now schema
o column_catalog is now catalog
o column_schema is now schema
- Add parameters to new() to support Oracle:
o table
o type
- Document parameter values for:
o MS Access
o MySQL
o Oracle
o PostgreSQL
- Update docs
- Rewrite examples/test-table-info.pl to use Data::Dumper
- Chop examples/test-table-info.cgi because it added nothing useful to the
distro
------------------------------
Date: Mon, 24 Apr 2006 14:43:05 GMT
From: Ignoramus21673 <ignoramus21673@NOSPAM.21673.invalid>
Subject: How to remove all duplications of characters
Message-Id: <Zt53g.59973$ku2.48189@fe58.usenetserver.com>
I am writing a little mail filter:
I receive messages with Subjects such as:
Hardcoore incesst Content
I want to replace that with "Hardcore incest Content" (note removal of
duplicate characters). Is there some regexp that would let me do that?
i
------------------------------
Date: Mon, 24 Apr 2006 15:51:59 +0100
From: David Squire <David.Squire@no.spam.from.here.au>
Subject: Re: How to remove all duplications of characters
Message-Id: <e2iomf$6fl$1@news.ox.ac.uk>
Ignoramus21673 wrote:
> I am writing a little mail filter:
>
> I receive messages with Subjects such as:
>
> Hardcoore incesst Content
>
> I want to replace that with "Hardcore incest Content" (note removal of
> duplicate characters. Is there some regexp that would let me do that.
Yes.
What have you tried so far?
Also, many English words contain perfectly valid double letters (there's
one now :) ). If you want your filtered results to be human-readable,
you will need to take that into account. If you intend just to reduce
things to a standard form before feeding to a filter, then this will not
matter.
DS
------------------------------
Date: Mon, 24 Apr 2006 16:53:13 +0200
From: "Lukas Mai" <rwxr-xr-x@gmx.de>
Subject: Re: How to remove all duplications of characters
Message-Id: <e2ioop$i9m$01$1@news.t-online.com>
Ignoramus21673 <ignoramus21673@nospam.21673.invalid> schrob:
> I am writing a little mail filter:
>
> I receive messages with Subjects such as:
>
> Hardcoore incesst Content
>
> I want to replace that with "Hardcore incest Content" (note removal of
> duplicate characters. Is there some regexp that would let me do that.
Not a regexp, but you can use tr/// with the s modifier. See perldoc
perlop.
HTH, Lukas
------------------------------
Date: Mon, 24 Apr 2006 16:01:25 +0100
From: David Squire <David.Squire@no.spam.from.here.au>
Subject: Re: How to remove all duplications of characters
Message-Id: <e2ip85$6j1$1@news.ox.ac.uk>
David Squire wrote:
> Ignoramus21673 wrote:
>> I am writing a little mail filter:
>>
>> I receive messages with Subjects such as:
>> Hardcoore incesst Content
>>
>> I want to replace that with "Hardcore incest Content" (note removal of
>> duplicate characters. Is there some regexp that would let me do that.
>
> Yes.
>
> What have you tried so far?
>
> Also, many English words contain perfectly valid double letters (there's
> one now :) ). If you want your filtered results to be human-readable,
> you will need to take that into account. If you intend just to reduce
> things to a standard form before feeding to a filter, then this will not
> matter.
OK. Here's an example of one:
echo 'Heelllooo WWWoorrld' | perl -e '{while (<>) {s/([A-Za-z])\1+/$1/g; print}}'
(assuming that you are only interested in alphabetic characters being
duplicated)
DS
------------------------------
Date: Mon, 24 Apr 2006 15:03:11 GMT
From: Ignoramus21673 <ignoramus21673@NOSPAM.21673.invalid>
Subject: Re: How to remove all duplications of characters
Message-Id: <PM53g.61573$k45.54974@fe83.usenetserver.com>
On Mon, 24 Apr 2006 15:51:59 +0100, David Squire <David.Squire@no.spam.from.here.au> wrote:
> Ignoramus21673 wrote:
>> I am writing a little mail filter:
>>
>> I receive messages with Subjects such as:
>>
>> Hardcoore incesst Content
>>
>> I want to replace that with "Hardcore incest Content" (note removal of
>> duplicate characters. Is there some regexp that would let me do that.
>
> Yes.
>
> What have you tried so far?
perldoc perlre
> Also, many English words contain perfectly valid double letters (there's
> one now :) ). If you want your filtered results to be human-readable,
> you will need to take that into account. If you intend just to reduce
> things to a standard form before feeding to a filter, then this will not
> matter.
The corrected text is intended for the consumption of the filter, not
humans.
I need to filter certain spams, one is a sex spammer who sends emails
with subjects similar to the above, and another is a medications
spammer who sends messages with lines like
X a n @ x
etc. I want to write something smart that would detect it.
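One possible starting point for the spaced-out case, offered as a rough sketch and not a tested spam rule: flag a line that contains a run of single characters separated by single blanks. The function name and the threshold of four characters are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the line contains at least four single
# non-blank characters in a row, each separated by one space or tab.
sub looks_spaced_out {
    my ($line) = @_;
    return $line =~ /(?<!\S)(?:\S[ \t]){3,}\S(?!\S)/ ? 1 : 0;
}

for my $line ( 'X a n @ x', 'Hardcore incest Content' ) {
    printf "%-25s => %d\n", $line, looks_spaced_out($line);
}
```

The lookarounds keep the match from starting or ending in the middle of an ordinary word, so a lone "a" or "I" in normal text does not trigger it.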
i
------------------------------
Date: Mon, 24 Apr 2006 16:12:42 +0100
From: David Squire <David.Squire@no.spam.from.here.au>
Subject: Re: How to remove all duplications of characters
Message-Id: <e2ipta$6ue$1@news.ox.ac.uk>
Lukas Mai wrote:
> Ignoramus21673 <ignoramus21673@nospam.21673.invalid> schrob:
>> I am writing a little mail filter:
>>
>> I receive messages with Subjects such as:
>>
>> Hardcoore incesst Content
>>
>> I want to replace that with "Hardcore incest Content" (note removal of
>> duplicate characters. Is there some regexp that would let me do that.
>
> Not a regexp, but you can use tr/// with the s modifier. See perldoc
> perlop.
Yes. This is indeed nicer:
echo 'Heelllooo WWWoorrld' | perl -e '{while (<>) {tr/A-Za-z//s; print}}'
DS
------------------------------
Date: Mon, 24 Apr 2006 15:16:31 GMT
From: Ignoramus21673 <ignoramus21673@NOSPAM.21673.invalid>
Subject: Re: How to remove all duplications of characters
Message-Id: <jZ53g.80460$W9.6780@fe80.usenetserver.com>
On Mon, 24 Apr 2006 16:01:25 +0100, David Squire <David.Squire@no.spam.from.here.au> wrote:
> David Squire wrote:
>> Ignoramus21673 wrote:
>>> I am writing a little mail filter:
>>>
>>> I receive messages with Subjects such as:
>>> Hardcoore incesst Content
>>>
>>> I want to replace that with "Hardcore incest Content" (note removal of
>>> duplicate characters. Is there some regexp that would let me do that.
>>
>> Yes.
>>
>> What have you tried so far?
>>
>> Also, many English words contain perfectly valid double letters (there's
>> one now :) ). If you want your filtered results to be human-readable,
>> you will need to take that into account. If you intend just to reduce
>> things to a standard form before feeding to a filter, then this will not
>> matter.
>
> OK. Here's an example of one:
>
> echo 'Heelllooo WWWoorrld' | perl -e '{while (<>) {s/([A-Za-z])\1+/$1/g;
> print}}'
>
> (assuming that you are only interested in alphabetic characters being
> duplicated)
>
> DS
Thanks, works beautifully.
i
------------------------------
Date: Mon, 24 Apr 2006 10:30:39 -0500
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: How to remove all duplications of characters
Message-Id: <slrne4prov.4co.tadmc@magna.augustmail.com>
Ignoramus21673 <ignoramus21673@NOSPAM.21673.invalid> wrote:
> I am writing a little mail filter:
>
> I receive messages with Subjects such as:
>
> Hardcoore incesst Content
>
> I want to replace that with "Hardcore incest Content" (note removal of
> duplicate characters. Is there some regexp that would let me do that.
Yes, but a regex is not the Right Tool for this job.
You can do it fine without any regular expressions:
tr/a-zA-Z//s;
Note that 'Mississippi' becomes 'Misisipi' ...
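A short sketch comparing the two approaches from this thread on the sample subject, plus the 'Mississippi' case. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $subject = 'Hardcoore incesst Content';

# tr///s with an empty replacement list squeezes runs of the same letter.
( my $squeezed = $subject ) =~ tr/a-zA-Z//s;

# The regex version: one letter followed by one or more copies of itself.
( my $regexed = $subject ) =~ s/([A-Za-z])\1+/$1/g;

# Legitimate double letters are squeezed as well.
( my $river = 'Mississippi' ) =~ tr/a-zA-Z//s;

print "$squeezed\n";    # Hardcore incest Content
print "$regexed\n";     # Hardcore incest Content
print "$river\n";       # Misisipi
```

Both give the same result for plain ASCII letters; tr/// avoids the regex engine entirely, which is the usual argument for preferring it here.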
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: Mon, 24 Apr 2006 15:34:41 GMT
From: Ignoramus21673 <ignoramus21673@NOSPAM.21673.invalid>
Subject: Re: How to remove all duplications of characters
Message-Id: <le63g.78615$3M3.4032@fe19.usenetserver.com>
On Mon, 24 Apr 2006 10:30:39 -0500, Tad McClellan <tadmc@augustmail.com> wrote:
> Ignoramus21673 <ignoramus21673@NOSPAM.21673.invalid> wrote:
>> I am writing a little mail filter:
>>
>> I receive messages with Subjects such as:
>>
>> Hardcoore incesst Content
>>
>> I want to replace that with "Hardcore incest Content" (note removal of
>> duplicate characters. Is there some regexp that would let me do that.
>
>
> Yes, but a regex is not the Right Tool for this job.
>
> You can do it fine without any regular expressions:
>
> tr/a-zA-Z//s;
>
>
> Note that 'Mississippi' becomes 'Misisipi' ...
>
>
Thanks. Someone suggested using a regexp like this:
$s =~ s/([A-Za-z])\1+/$1/g;
which actually works. If tr is somehow better (not sure why), I can
switch to using tr.
i
------------------------------
Date: Mon, 24 Apr 2006 16:35:31 +0100
From: David Squire <David.Squire@no.spam.from.here.au>
Subject: Re: How to remove all duplications of characters
Message-Id: <e2ir83$7e9$1@news.ox.ac.uk>
Tad McClellan wrote:
> Ignoramus21673 <ignoramus21673@NOSPAM.21673.invalid> wrote:
>> I am writing a little mail filter:
>>
>> I receive messages with Subjects such as:
>>
>> Hardcoore incesst Content
>>
>> I want to replace that with "Hardcore incest Content" (note removal of
>> duplicate characters. Is there some regexp that would let me do that.
>
>
> Yes, but a regex is not the Right Tool for this job.
>
> You can do it fine without any regular expressions:
>
> tr/a-zA-Z//s;
>
>
Out of interest, can tr handle more general cases, such as:
s/(.)\1+/$1/g;
or is a regex necessary for this?
DS
PS. Yes, I have tested 'tr/.//s;', and it doesn't remove any dupes.
------------------------------
Date: Mon, 24 Apr 2006 18:40:52 +0200
From: "Dr.Ruud" <rvtol+news@isolution.nl>
Subject: Re: How to remove all duplications of characters
Message-Id: <e2j68k.1cs.1@news.isolution.nl>
Tad McClellan schreef:
> Ignoramus21673:
>> Hardcoore incesst Content
>>
>> I want to replace that with "Hardcore incest Content" (note removal
>> of duplicate characters. Is there some regexp that would let me do
>> that.
>
> Yes, but a regex is not the Right Tool for this job.
Well, it is if you would rather use [:alpha:].
(there can be more in [[:alpha:]] than is in [A-Za-z])
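A small sketch of the difference, assuming a string that holds a letter outside the ASCII range. The sample uses "\x{101}" (LATIN SMALL LETTER A WITH MACRON) purely as an illustration.

```perl
use strict;
use warnings;

binmode STDOUT, ':encoding(UTF-8)';

# A doubled non-ASCII letter between doubled ASCII letters.
my $text = "Hee\x{101}\x{101}llo";

# [A-Za-z] only sees the ASCII letters.
( my $ascii_only = $text ) =~ s/([A-Za-z])\1+/$1/g;

# [[:alpha:]] also matches the accented letter.
( my $any_alpha = $text ) =~ s/([[:alpha:]])\1+/$1/g;

print "$ascii_only\n";    # the doubled a-macron survives
print "$any_alpha\n";     # every doubled letter is reduced to one
```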
--
Affijn, Ruud
"Gewoon is een tijger."
------------------------------
Date: 24 Apr 2006 17:12:32 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: How to remove all duplications of characters
Message-Id: <4b4f80Fvi9voU1@news.dfncis.de>
David Squire <David.Squire@no.spam.from.here.au> wrote in comp.lang.perl.misc:
> Tad McClellan wrote:
> > Ignoramus21673 <ignoramus21673@NOSPAM.21673.invalid> wrote:
> >> I am writing a little mail filter:
> >>
> >> I receive messages with Subjects such as:
> >>
> >> Hardcoore incesst Content
> >>
> >> I want to replace that with "Hardcore incest Content" (note removal of
> >> duplicate characters. Is there some regexp that would let me do that.
> >
> >
> > Yes, but a regex is not the Right Tool for this job.
> >
> > You can do it fine without any regular expressions:
> >
> > tr/a-zA-Z//s;
> >
> >
>
> Out of interest, can tr handle more general cases, such as:
>
> s/(.)\1+/$1/g;
>
> or is a regex necessary for this?
tr/\x00-\x7f//s;
covers the ASCII range. Any set of character ranges can be covered.
See tr/// in perlop.
> PS. Yes, I have tested 'tr/.//s;', and it doesn't remove any dupes.
Do look up tr///. The similarity with s/// is rather superficial. In
particular, "." doesn't do in tr/// what it does in a regex.
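To make the difference concrete, here is a minimal sketch with made-up sample strings: in tr/// a "." is just a literal dot, while a range such as \x00-\x7f names every ASCII character.

```perl
use strict;
use warnings;

# "." in tr/// means a literal dot, so only runs of dots are squeezed.
( my $dots_only = 'aa..bb' ) =~ tr/.//s;

# The ASCII range squeezes runs of any ASCII character.
( my $all_ascii = 'aa..bb  11' ) =~ tr/\x00-\x7f//s;

print "$dots_only\n";    # aa.bb
print "$all_ascii\n";    # a.b 1
```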
Anno
------------------------------
Date: 24 Apr 2006 17:29:04 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: How to remove all duplications of characters
Message-Id: <4b4g70F102r4iU1@news.dfncis.de>
Dr.Ruud <rvtol+news@isolution.nl> wrote in comp.lang.perl.misc:
> Tad McClellan schreef:
> > Ignoramus21673:
>
> >> Hardcoore incesst Content
> >>
> >> I want to replace that with "Hardcore incest Content" (note removal
> >> of duplicate characters. Is there some regexp that would let me do
> >> that.
> >
> > Yes, but a regex is not the Right Tool for this job.
>
> Well, it is if you would rather use [:alpha:].
>
> (there can be more in [[:alpha:]] than is in [A-Za-z])
$_ = 'Heelllooo WWWoorrld';
do {
    my $alpha = join '' =>
        grep /[[:alpha:]]/,
        map chr, 0 .. 255; # or whatever
    eval "sub { tr/$alpha//s }";
}->();
print "$_\n";
:)
Anno
------------------------------
Date: Mon, 24 Apr 2006 18:48:11 +0200
From: "Dr.Ruud" <rvtol+news@isolution.nl>
Subject: Re: Term::ReadKey on Win? 5.005 vs 5.8.8?
Message-Id: <e2j6rj.2io.1@news.isolution.nl>
Ilya Zakharevich schreef:
> Dr.Ruud:
>> I now really like to have a more useful getc() for Win32.
>
> AFAIU, one should debug PerlIO; most probably, the bug is there.
It is not just the bug, but also the lack of support for Function-keys
and special combinations with Alt/Ctrl/Shift, etc.
I have a Dr.Rudimentary getc() (I called it getuc) here that returns
Unicode-codepoints, for example "\x{2193}" for Arrow-Down. These
return-values are in a table, so one can easily change that to Esc-[B,
or "\0\120", or dynamically to whatever the current Win32-console
dictates.
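The table idea might look roughly like the sketch below. This is not the actual getuc code: the hash layout, the key name 'down', the style names and the key_value() helper are all hypothetical; only the three return values are taken from the description above.

```perl
use strict;
use warnings;

# Hypothetical table: one symbolic key name, several output conventions.
my %key_table = (
    down => {
        unicode => "\x{2193}",    # DOWNWARDS ARROW
        ansi    => "\e[B",        # Esc-[B
        dos     => "\0\120",      # NUL followed by octal 120
    },
);

# The convention in use; could be changed at run time.
my $style = 'unicode';

# Look up what a key should return under the current convention.
sub key_value {
    my ($name) = @_;
    return $key_table{$name}{$style};
}

printf "Arrow-Down returns %d character(s) in '%s' style\n",
    length key_value('down'), $style;
```

Keeping the values in a table means that switching conventions is a matter of changing $style, with no change to the reading code.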
I have not yet looked into existing Unicode-console solutions, that may
well be already doing much of what I have in mind.
> One
> should check with pre-5.6.0 Perl first, which has no PerlIO. Or write
> a simple C test with read(in, buf, 1) (the C code to replace TRK on
> Win is provided in one of the (p5p?) threads I mentioned before;
> google for CON CONIN ReadKey).
In the code in my previous post, the first-13-gets-eaten-bug is not
there, as long as you leave USE_GETC at 0.
With SET_BINMODE 1, there are no eaten or delayed 13s, even if USE_GETC
is 1 and/or USE_CONIN is 0.
>>> (some of my patches made it into
>>> 5.8.8, so it has a significant chance to work better/differently
>>> than 5.8.7).
>
>> Which patches are you addressing here?
>
> My patches for PerlIO. Should not be hard to google for.
OK, PerlIO, I'll see.
Wow, that
http://search.cpan.org/src/ILYAZ/Term-ReadLine-Perl-1.03/ReadLine/readline.pm
is pretty loaded with externalizable stuff, like Vi-mode and keymaps
etc.
While googling around, I found
http://rt.cpan.org/Public/Bug/Display.html?id=17773
about the bug.
That thread also mentions "SetConsoleMode failed, LastError=|6| at ..."
which is indeed solved by the "+<" in the open, but which can also occur
if you got the console's explicit filehandle to close prematurely.
--
Affijn, Ruud
"Gewoon is een tijger."
------------------------------
Date: Mon, 24 Apr 2006 18:46:52 +0200
From: "Dr.Ruud" <rvtol+news@isolution.nl>
Subject: Re: XS Progamming with Perl 6
Message-Id: <e2j6ri.2io.1@news.isolution.nl>
Ferry Bolhar schreef:
> Sisyphus:
>> My understanding is that, with the advent of perl6, XS becomes
>> non-existent.
>
> Can't believe that. Many, many modules depend on XS code. And consider
> mod_perl? How to recode it without XS?
>
> Maybe, there's no XS with Perl6, but there must be a way to include
> native C/C++ code and so give programmers a way to access API's in
> other libraries from a Perl script or to embed Perl in C code.
>
> Well, and this brings me back to my initial question: Is there
> already some information about this, about the successor of XS?
Make a C-compiler on Parrot? I don't see one here yet:
http://www.parrotcode.org/languages/
Maybe the code of http://www.tinycc.org is a nice starting point.
(C-- is Microsoft related)
--
Affijn, Ruud
"Gewoon is een tijger."
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 9196
***************************************