[24613] in Perl-Users-Digest


Perl-Users Digest, Issue: 6789 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat Jul 10 21:05:54 2004

Date: Sat, 10 Jul 2004 18:05:08 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sat, 10 Jul 2004     Volume: 10 Number: 6789

Today's topics:
    Re: Code Style -- here-docs -- How do you make them loo <me@privacy.net>
    Re: Code Style -- here-docs -- How do you make them loo <noreply@gunnar.cc>
    Re: double quotes vs. single quotes (was Re: hash as ar <nilram@hotpop.com>
    Re: hash as argument (Anno Siegel)
    Re: how perl set envirment variable (Anno Siegel)
    Re: Installing seperate version of Perl. <olczyk2002@yahoo.com>
    Re: Installing seperate version of Perl. <spamtrap@dot-app.org>
    Re: Negation of RegEx <tadmc@augustmail.com>
        Perl Regex Question: how to translate only the leading  (Yu)
    Re: Perl Regex Question: how to translate only the lead <1usa@llenroc.ude>
    Re: Perl Regex Question: how to translate only the lead (Jay Tilton)
    Re: Perl Regex Question: how to translate only the lead <noreply@gunnar.cc>
    Re: Perl Regex Question: how to translate only the lead <noreply@gunnar.cc>
        Perl regex to remove c-comments, taking into account st <sr_ng@goawaynms-sys-lts.demon.co.uk>
    Re: Perl regex to remove c-comments, taking into accoun <rwxr-xr-x@gmx.de>
    Re: Regexp substitution problem - suggestions? <jhalbrook@bjc.org>
    Re: Regexp substitution problem - suggestions? <jhalbrook@bjc.org>
    Re: Regexp substitution problem - suggestions? <1usa@llenroc.ude>
    Re: Regexp substitution problem - suggestions? <1usa@llenroc.ude>
    Re: Regular expression to match surrounding parenthesis <1usa@llenroc.ude>
    Re: why utf8::upgrade is needed? (Anno Siegel)
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Sun, 11 Jul 2004 09:40:13 +1200
From: "Tintin" <me@privacy.net>
Subject: Re: Code Style -- here-docs -- How do you make them look good?
Message-Id: <2lb60uFasonnU1@uni-berlin.de>


"MST" <mst@fiftyvolts.com> wrote in message
news:ec46364f.0407100910.2346bf2@posting.google.com...
> Thanks for the advice, as I said I'm not usually treading in CGI land
> so my CGI programming style is still being worked out. I didn't
> realize that the quote like operators were ok with embedded newlines;
> I guess you learn something new every day :)

CGI has no "programming style" nor "quote like operators".  Perhaps you are
confusing CGI with Perl?




------------------------------

Date: Sun, 11 Jul 2004 00:37:04 +0200
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: Code Style -- here-docs -- How do you make them look good?
Message-Id: <2lb93fFatpj8U1@uni-berlin.de>

Tintin wrote:
> MST wrote:
>> Thanks for the advice, as I said I'm not usually treading in CGI
>> land so my CGI programming style is still being worked out. I
>> didn't realize that the quote like operators were ok with
>> embedded newlines; I guess you learn something new every day :)
> 
> CGI has no "programming style" nor "quote like operators".  Perhaps
> you are confusing CGI with Perl?

You should have read the whole thread before making such a comment. If
you had, you would have realized that the OP is well aware of the distinction
between Perl and CGI. (And you wouldn't have posted.)

-- 
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl


------------------------------

Date: 10 Jul 2004 12:24:55 -0500
From: Dale Henderson <nilram@hotpop.com>
Subject: Re: double quotes vs. single quotes (was Re: hash as argument)
Message-Id: <87acy7ojvc.fsf@camel.tamu-commerce.edu>

>>>>> "Abigail" == Abigail  <abigail@abigail.nl> writes:


Abigail> Now, clearly you weren't talking about an interpreter
Abigail> that executes output of the compiler. You were talking
Abigail> about an interpreter in the classical sense - one that
Abigail> takes a unit of code, interprets and executes it.

Actually, I did mean the interpreter that executes the output of
the compiler. For some bizarre reason, I was thinking that the
compiler would just store the double quote string somewhere and
the interpreter would have to recompile it every time it was
used. In short I was being stupid. Sorry. 

Abigail> If you want to call whatever is executing the compile
Abigail> code an "interpreter", that's fine with me. Just don't
Abigail> confuse matters by suggesting that same thing will
Abigail> actually compile the code as well.

I'll try to be more careful in the future.

-- 
Dale Henderson 

"Imaginary universes are so much more beautiful than this stupidly-
constructed 'real' one..."  -- G. H. Hardy


------------------------------

Date: 10 Jul 2004 19:50:09 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: hash as argument
Message-Id: <ccph9h$caq$1@mamenchi.zrz.TU-Berlin.DE>

Abigail  <abigail@abigail.nl> wrote in comp.lang.perl.misc:
> Anno Siegel (anno4000@lublin.zrz.tu-berlin.de) wrote on MMMCMLXVI
> September MCMXCIII in <URL:news:ccotts$1na$1@mamenchi.zrz.TU-Berlin.DE>:
> **  
> **  The current leaning seems to be "Single quotes when possible, double when
> **  needed".  Preferring single quotes is reader-friendly because they are
> **  easier to parse, but less programmer-friendly because (applied strictly)
> **  it calls for frequent changes of quotes during maintenance.  I could
> **  also live with the opposite rule, "Double quotes unless they are positively
> **  unwieldy", but at least among the majority of clpm regulars, reader-
> **  friendliness appears to have won.
> 
> 
> I think the majority of the clpm regulars have not stated a preference,
> only a handful of people did.

Okay.  The preference for '' is more often stated than the opposite.

>                               As for 'readerfriendliness', that too I find
> debatable. It may be reader friendly to *you*, because that's what you
> are used to. It's not reader friendly to someone with different habits.
> And I highly doubt the suggestion that single quoted strings are easier
> to parse than double quoted strings.

That was indeed my premise, as opposed to personal habits.  As a perl
construct, '' is clearly the simpler one.

>                                      It's not reader friendly towards
> those with a C background, for whom 'a' is 97, and not a single character
> string. I don't program that often in C, but even I occasionally think
> "integer" when seeing a single quoted single character string, instead of
> "string".

I didn't consider cross-effects with other languages.

Anyhow, I'm not so much in favor of one particular rule as of having
any rule at all.  Since the '' crowd is the one with an outspoken lobby,
I give it better chances and side with that one.  Opportunistic?  Sue
me.

Anno


------------------------------

Date: 10 Jul 2004 21:18:46 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: how perl set envirment variable
Message-Id: <ccpmfm$ejp$2@mamenchi.zrz.TU-Berlin.DE>

A. Sinan Unur <1usa@llenroc.ude> wrote in comp.lang.perl.misc:
> jl_post@hotmail.com (J. Romano) wrote in
> news:b893f5d4.0407100619.7a09dd66@posting.google.com: 
> 
> ...
> 
> > #!/usr/bin/perl -w
> > use strict;
> > $ENV{PATH} = "/usr/bin";
> > exec $ENV{SHELL};
> > __END__
> > 
> > 
> >    All I did was set the PATH by setting $ENV{PATH}, and then adding
> > "exec $ENV{SHELL}" as the last line of the Perl script.  That way,
> > your environment changes will "stick" when the Perl script ends.
> 
> ...
> 
> Changes will not stick. You have just invoked a new copy of your shell 
> with  a new environment.
> 
> Each time you run this script, a new copy of your shell will run. 
> 
> Why would you want to do that for something that can be handled much more 
> easily using your shell's facilities?

One of those is the eval function that most shells have.  You can say

    eval `perl_script`

in your shell and pull in the power of Perl, if needed.

The perl_script must print a bit of shell code which "eval" will
execute, say "PATH=something; export PATH" for a Bourne-like shell.
The eval command can also be conserved in an alias for interactive
use.

That is a much better approach than heavy-handedly starting another shell.
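
As a minimal sketch of that approach (the helper name shell_export and
the example value are made up for illustration, and a Bourne-like shell
is assumed), the script only has to print the assignment:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build one Bourne-shell assignment plus export for a variable.
# The value is single-quoted so the shell takes it literally;
# an embedded single quote is written as '\'' .
sub shell_export {
    my ($name, $value) = @_;
    (my $quoted = $value) =~ s/'/'\\''/g;
    return "$name='$quoted'; export $name";
}

# Hypothetical example value; a real script would compute it.
print shell_export('PATH', '/usr/bin:/bin'), "\n";
```

Saved as perl_script and run through eval `perl_script`, this changes
PATH in the calling shell itself, because the shell, not the Perl
process, executes the printed assignment.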

Anno


------------------------------

Date: Sat, 10 Jul 2004 13:45:22 -0500
From: TLOlczyk <olczyk2002@yahoo.com>
Subject: Re: Installing seperate version of Perl.
Message-Id: <n7e0f0dqrvogotatrvgeadb6upgml5vb6m@4ax.com>

On Sat, 10 Jul 2004 08:56:02 -0400, Sherm Pendley
<spamtrap@dot-app.org> wrote:

>TLOlczyk wrote:
>
>> I am using Linux and want to debug some code written in a slightly
>> older version of Perl. So I want to setup a user who uses that old
>> version. How do I install it, without mucking up any of the present
>> perl stuff?
>
>That's described in the standard installation docs. The key word to look for
>there is "prefix".
>
>Let's say you used a prefix of /usr/local/oldperl. The Perl binary would
>then be in /usr/local/oldperl/bin, so add that to your user's PATH. Or,
>begin scripts that use the old perl with #!/usr/local/oldperl/bin/perl.
>
Sorry your answer shows me that I asked the wrong question.
The right question should have been:
"How do I get two different versions of perl to coexist on the same
machine."


The reply-to email address is olczyk2002@yahoo.com.
This is an address I ignore.
To reply via email, remove 2002 and change yahoo to
interaccess,

**
Thaddeus L. Olczyk, PhD

There is a difference between
*thinking* you know something,
and *knowing* you know something.


------------------------------

Date: Sat, 10 Jul 2004 15:11:21 -0400
From: Sherm Pendley <spamtrap@dot-app.org>
Subject: Re: Installing seperate version of Perl.
Message-Id: <fKSdne0kN-vEom3dRVn-tw@adelphia.com>

TLOlczyk wrote:

> Sorry your answer shows me that I asked the wrong question.
> The right question should have been:
> "How do I get two different versions of perl to coexist on the same
> machine."

How is that question any different than your first one?

Like I said - build each version using a different install prefix, as
described in the standard docs that come with the Perl source. You can
install as many different versions that way as disk space and patience
allow.

sherm--

-- 
Cocoa programming in Perl: http://camelbones.sourceforge.net
Hire me! My resume: http://www.dot-app.org


------------------------------

Date: Sat, 10 Jul 2004 15:26:36 -0500
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: Negation of RegEx
Message-Id: <slrncf0k7s.f16.tadmc@magna.augustmail.com>

Dan <dan_yuan@trendmicro.com> wrote:

[ Please learn (and then use) the proper form for a followup posting.
  Soon! 
]
> can't be any other formats. :-(


Why can't it be in any other format?



[ snip yet more TOFU. Sheesh! ]

-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: 10 Jul 2004 15:52:24 -0700
From: enstrophy_2000@yahoo.com (Yu)
Subject: Perl Regex Question: how to translate only the leading letters of a line
Message-Id: <75b30916.0407101452.37cad795@posting.google.com>

Hi, 
  I wonder if there is an elegant way of converting
number 1-9 into letter A-I for the LEADING letter 
of a line. For example:

Input:
1 xxxx1234....

Output:
A xxxx1234....

The tr operator does not take the special position
character such as '^' and '$', so it would operate
on every match in the input line. Any input will be
greatly appreciated. 

-Yu


------------------------------

Date: 10 Jul 2004 23:12:06 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude>
Subject: Re: Perl Regex Question: how to translate only the leading letters of a line
Message-Id: <Xns9522C356B325Basu1cornelledu@132.236.56.8>

enstrophy_2000@yahoo.com (Yu) wrote in news:75b30916.0407101452.37cad795
@posting.google.com:

> Hi, 
>   I wonder if there is an elegant way of converting
> number 1-9 into letter A-I for the LEADING letter 
> of a line. For example:
> 
> Input:
> 1 xxxx1234....
> 
> Output:
> A xxxx1234....
> 
> The tr operator does not take the special position
> character such as '^' and '$', so it would operate
> on every match in the input line.

What is preventing you from using s/// then?

#! perl

use strict;
use warnings;

my @repl = ('A' .. 'I');

while(<DATA>) {
    s/^([1-9]) /$repl[$1-1] /;
    print;
}

__DATA__
1 xxxx1234....
2 xxxx1234....
3 xxxx1234....
4 xxxx1234....
5 xxxx1234....
6 xxxx1234....
1other kind of line
7 xxxx1234....
8 xxxx1234....
9 xxxx1234....
0 xxxx1234....
0 xxxx1234....
3 xxxx1234....
5 xxxx1234....
6 xxxx1234....
7 xxxx1234....

-- 
A. Sinan Unur
1usa@llenroc.ude (reverse each component for email address)


------------------------------

Date: Sat, 10 Jul 2004 23:57:33 GMT
From: tiltonj@erols.com (Jay Tilton)
Subject: Re: Perl Regex Question: how to translate only the leading letters of a line
Message-Id: <40f081b2.71861251@news.erols.com>

enstrophy_2000@yahoo.com (Yu) wrote:

:   I wonder if there is an elegant way of converting
: number 1-9 into letter A-I for the LEADING letter 
: of a line. For example:
: 
: Input:
: 1 xxxx1234....
: 
: Output:
: A xxxx1234....

    $_ ^= 'p';
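
The one-liner appears to rely on Perl's bitwise string XOR: when both
operands of ^ are strings, the bytes are XORed pairwise and the shorter
operand is padded with zero bits, so only the first character changes.
A small sketch follows (the sub name flip_lead is made up here). It
assumes every line really starts with a digit 1-9, since any other
leading character is flipped as well, and it assumes the default
operator semantics, without the later "bitwise" feature.

```perl
use strict;
use warnings;

# XOR the first byte of a line with 'p' (0x70); the remaining bytes
# are XORed with implicit zero bits and so stay the same.
sub flip_lead {
    my ($line) = @_;
    return $line ^ 'p';
}

printf "%s\n", flip_lead('1 xxxx1234....');   # 0x31 ^ 0x70 = 0x41
printf "%s\n", flip_lead('9 xxxx1234....');   # 0x39 ^ 0x70 = 0x49
printf "%s\n", flip_lead('0 xxxx1234....');   # 0x30 ^ 0x70 = 0x40
```

The last call shows the catch: a leading 0 becomes '@' rather than
staying a digit, which the tr-based and s///-based answers avoid.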



------------------------------

Date: Sun, 11 Jul 2004 01:19:57 +0200
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: Perl Regex Question: how to translate only the leading letters of a line
Message-Id: <2lbbjsFaot6eU1@uni-berlin.de>

Yu wrote:
> I wonder if there is an elegant way of converting number 1-9 into
> letter A-I for the LEADING letter of a line. For example:
> 
> Input:
> 1 xxxx1234....
> 
> Output:
> A xxxx1234....
> 
> The tr operator does not take the special position character such
> as '^' and '$', so it would operate on every match in the input
> line.

     substr($line, 0, 1) =~ tr/1-9/A-I/;
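
A short, self-contained sketch of the same idea (the sub name
translate_lead and the sample lines are only for illustration): substr
is used as an lvalue, so tr/// edits just that one-character slice.

```perl
use strict;
use warnings;

# Translate only the first character; substr() is used as an lvalue,
# so tr/// modifies that one-character slice of $line in place.
sub translate_lead {
    my ($line) = @_;
    substr($line, 0, 1) =~ tr/1-9/A-I/ if length $line;
    return $line;
}

print translate_lead($_), "\n"
    for '1 xxxx1234....', '0 xxxx1234....', 'x1 2 3';
```

Lines that do not start with 1-9 pass through unchanged, and digits
later in the line are never touched.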

-- 
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl


------------------------------

Date: Sun, 11 Jul 2004 02:27:30 +0200
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: Perl Regex Question: how to translate only the leading letters of a line
Message-Id: <2lbfijFarkejU1@uni-berlin.de>

Jay Tilton wrote:
> 
>     $_ ^= 'p';

Hmm ...  I know what $_ is. Would you mind explaining ^= and 'p'? Are 
they explained in the Perl docs, or are they C things that happen to 
work in Perl as well?

-- 
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl


------------------------------

Date: Thu, 8 Jul 2004 20:51:08 +0100
From: Saeed <sr_ng@goawaynms-sys-lts.demon.co.uk>
Subject: Perl regex to remove c-comments, taking into account string literals
Message-Id: <pOO4YXCsWa7AFwH+@nms-sys-ltd.demon.co.uk>


I have been searching for a code example that removes c-style comments, 
but none of these take into account string literals, e.g.

----------------------------------------------------
/*
** a comment
*/

printf /* blah */ ("Comments begin with /*\n" );

printf ( "Comments end with */\n" ); /* blah */
----------------------------------------------------

I want this stripped to:

----------------------------------------------------
printf ("Comments begin with /*\n" );

printf ( "Comments end with */\n" );
----------------------------------------------------

but the samples I've seen would most probably give:

----------------------------------------------------
printf ("Comments begin with

\n" );
----------------------------------------------------



------------------------------

Date: 10 Jul 2004 21:12:03 GMT
From: Lukas Mai <rwxr-xr-x@gmx.de>
Subject: Re: Perl regex to remove c-comments, taking into account string literals
Message-Id: <ccpm33$sgq$1@wsc10.lrz-muenchen.de>

Saeed schrob:

> I have been searching for a code example that removes c-style comments, 
> but none of these take into account string literals, e.g.
[...]

That's a FAQ; see perldoc -q comments. But that solution is incomplete,
too:

/??/
* foo *\
/
is a single comment, according to the C standard. "??/" is a trigraph
expanding to "\", and backslash-newline pairs are deleted before
tokenizing the program, so the above is equivalent to

/* foo */

The following script should do the job:

#!/usr/local/bin/perl -wp0777
use strict;

# this script reads files, removes C comments,
# and prints the results to stdout

s{
   / 
   (?: (?: \\ | \?\?/) \n)*
   (?:
      / (?: (?: \\ | \?\?/) \n | [^\n] )*
   |
      \* [^*]* \*+ (?: (?: \\ | \?\?/) \n)*
      (?: [^/*][^*]* \*+ (?: (?: \\ | \?\?/) \n)* )*
      (/)
   )
|
   (
      " (?: (?: \\ | \?\?/) . | [^"])* "
   |
      ' (?: (?: \\ | \?\?/) . | [^'])* '
   |
      . [^'"/]*
   )
}{
    (defined $1 ? ' ' : '') . (defined $2 ? $2 : '') 
}gsex
__END__

HTH, Lukas
-- 
print+74.117.115.116,,qq.\c!..not::.her,Perl=>q$hacker,$,!($,=$")


------------------------------

Date: Sat, 10 Jul 2004 14:43:04 -0500
From: "news.socket.net" <jhalbrook@bjc.org>
Subject: Re: Regexp substitution problem - suggestions?
Message-Id: <10f0hmcj2qjtp5e@corp.supernews.com>


"A. Sinan Unur" <1usa@llenroc.ude> wrote in message
news:Xns952270FAA2587asu1cornelledu@132.236.56.8...
> "Joe Halbrook" <jhalbrook@bjc.org> wrote in
> news:10evrgbrqee6208@corp.supernews.com:
>
> [ Please do not top-post ]
>
> > The file I was using to test contained the following:
> >
> > E-mail me at:  mailto:inquiries@domain.com
> > E-mail me at:  mailto:inquiries@domain.com.
> > E-mail me at:  mailto:inquiries@domain again.
> > Try this:      mailto:addr@domain.com?Subject=Testing-blues
> >
> > re:  Spam
> >
> > I hate it as much as the next person.
> > I assure you that I am NOT in any way, shape, or form trying
> > to facilitate spamming.  Quite to the contrary.
>
> Well, now, it looks like you are trying to add addresses from domain.com
> to spammers' databases.
>
> -- 
> A. Sinan Unur
> 1usa@llenroc.ude (reverse each component for email address)


How did you derive that inaccurate conclusion?

I'm simply trying to hyperlink email addresses in an email discussion
group for an HTML version of those emails.  The sample lines I sent
in the previous post are to allow testing of a solution, depending on
the positions of an email address in the file being processed, in this case,
an email from a discussion group.

Besides, why is it any of your business WHY I asked a question?

Joe




------------------------------

Date: Sat, 10 Jul 2004 14:47:59 -0500
From: "news.socket.net" <jhalbrook@bjc.org>
Subject: Re: Regexp substitution problem - suggestions?
Message-Id: <10f0hvl9c1g2o4b@corp.supernews.com>

"Gunnar Hjalmarsson" <noreply@gunnar.cc> wrote in message
news:2lai67Fa1k6vU1@uni-berlin.de...
> Joe Halbrook wrote:
> > I tried all the tips you've suggested, but just have not been
> > successful.
>
> Then post a short but complete program that illustrates the problem,
> and someone may help you fix it.
>
> I suggest that you read the posting guidelines for this group:
> http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
>
> > re:  Spam
> >
> > I hate it as much as the next person.
> > I assure you that I am NOT in any way, shape, or form trying to
> > facilitate spamming.  Quite to the contrary.
>
> If you publish email addresses on publicly available web pages, you
> *are* facilitating spamming, whether that's your intention or not.
> (Not to mention the point Sinan made.) ;-)
>
> -- 
> Gunnar Hjalmarsson
> Email: http://www.gunnar.cc/cgi-bin/contact.pl


Sorry, for the top post.

I'll post my script along with the sample file lines (as in my previous
post) later today.  Thank you for your suggestions and offer to help.

Joe




------------------------------

Date: 10 Jul 2004 19:57:55 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude>
Subject: Re: Regexp substitution problem - suggestions?
Message-Id: <Xns9522A26ABDA12asu1cornelledu@132.236.56.8>

"news.socket.net" <jhalbrook@bjc.org> wrote in
news:10f0hmcj2qjtp5e@corp.supernews.com: 

> "A. Sinan Unur" <1usa@llenroc.ude> wrote in message
> news:Xns952270FAA2587asu1cornelledu@132.236.56.8...
>> "Joe Halbrook" <jhalbrook@bjc.org> wrote in
>> news:10evrgbrqee6208@corp.supernews.com:
>>
>> [ Please do not top-post ]
>>
>> > The file I was using to test contained the following:
>> >
>> > E-mail me at:  <address snipped>
>> > E-mail me at:  <address snipped>
>> > E-mail me at:  <address snipped>
>> > Try this:      <address snipped>
>> >
>> > re:  Spam
>> >
>> > I hate it as much as the next person.
>> > I assure you that I am NOT in any way, shape, or form trying
>> > to facilitate spamming.  Quite to the contrary.
>>
>> Well, now, it looks like you are trying to add addresses from
>> domain.com to spammers' databases.

 ...

> How did you derive that inaccurate conclusion?

You are posting unmunged email addresses from domain.com to a newsgroup for 
spambots to harvest. I doubt you own the domain. IMNSHO, you will be 
responsible for a lot of the future spam received at domain.com. If I were 
the administrator of that site, I would complain loudly to your ISP.

If you want to use an example, try something like domain.tld. That would be 
responsible.

> I'm simply trying to hyperlink email addresses in an email discussion
> group for an HTML version of those emails.

I frankly do not care what you are doing. Your message contained a major 
blunder and I pointed that out.

Other people kindly pointed out the fact that if you were going to post a 
page with a whole bunch of mailto links, you would be making those 
addresses available to spambots to harvest. It was an advance warning that 
may or may not have been applicable to your situation, but given that your 
cluemeter seems to be stuck at somewhere below zero, it definitely was 
necessary to tell you the obvious.

> The sample lines I sent in the previous post 

Your previous post was poorly formatted.

> Besides, why is it any of your business WHY I asked a question?

Frankly, I couldn't care less. I just pointed out a serious blunder which will 
cause increased spam to domain.com. That is responsible and rude.


-- 
A. Sinan Unur
1usa@llenroc.ude (reverse each component for email address)


------------------------------

Date: 10 Jul 2004 20:08:28 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude>
Subject: Re: Regexp substitution problem - suggestions?
Message-Id: <Xns9522A434EB69Casu1cornelledu@132.236.56.8>

"A. Sinan Unur" <1usa@llenroc.ude> wrote in
news:Xns9522A26ABDA12asu1cornelledu@132.236.56.8: 

> That is responsible 

s/responsible/irresponsible/

-- 
A. Sinan Unur
1usa@llenroc.ude (reverse each component for email address)


------------------------------

Date: 10 Jul 2004 18:34:44 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude>
Subject: Re: Regular expression to match surrounding parenthesis
Message-Id: <Xns952294502EE6Easu1cornelledu@132.236.56.8>

Ilya Zakharevich <nospam-abuse@ilyaz.org> wrote in
news:ccml0r$1bqb$1@agate.berkeley.edu: 

> [A complimentary Cc of this posting was sent to

I think you mean 'complementary' rather than 'complimentary'.

Just a friendly heads-up from one nonnative speaker to another :)

-- 
A. Sinan Unur
1usa@llenroc.ude (reverse each component for email address)


------------------------------

Date: 10 Jul 2004 21:01:33 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: why utf8::upgrade is needed?
Message-Id: <ccplfd$ejp$1@mamenchi.zrz.TU-Berlin.DE>

Petr Pajas  <pajas@ufal.ms.mff.cuni.cz> wrote in comp.lang.perl.misc:

[...]

> Without going into boring details, my situation is as follows: in my
> program, the user provides arbitrary Perl expression which I parse using
> Text::Balanced. The expression is expected to result in an ASCII or UTF8
> string (or maybe some other perl object). Due to reported (and already
> fixed) bugs in substr of Perl<=5.8.3, this module fails to handle utf8 code
> correctly, so the users are forced to use ASCII code. To insert literal
> utf8 data into ascii code, the user has to use \x{...}. After I evaluate
> the expression, I'm passing it to a XS module, which is utf8 aware, but
> treats non-utf8-flagged non-ascii strings in a specific way. On the other
> hand, having a blood-signed treaty with the user on my desk:-), I know that
> when he says "\x{e1}", he means characters, not bytes. But, since "\x{e1}"
> evaluates to a non-ascii non-UTF8-flagged string, the module behaves
> incorrectly. So, in order to resolve it, I have to manually force upgrade
> at all entry points to the library (hundreds). Another solution would be to
> remove the "special treatment" of non-utf8 non-ascii data from the XS
> module (being one of the developers I could try to establish that), but
> unfortunately, lots of users rely on that behavior.

Let me just throw in a reminder that the behavior of literals can be
overloaded.  If the problem can be solved by changing the way string
literals are interpreted, this may help:

    use overload;
    overload::constant( q => \ &make_utf8);
    sub make_utf8 {
        my ( $orig, $perl, $mode) = @_;
        utf8::upgrade( $perl) if grep ord() >= 128, split //, $perl;
        $perl;
    }

That would enforce utf8 interpretation of any string containing a character
in the 128 - 255 range.  If the code is put in a library, the call to
overload::constant() should go in the import() routine.

Then again, I may be entirely on the wrong track...

Anno


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 6789
***************************************

