[31876] in Perl-Users-Digest


Perl-Users Digest, Issue: 3139 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Sep 22 06:09:50 2010

Date: Wed, 22 Sep 2010 03:09:20 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Wed, 22 Sep 2010     Volume: 11 Number: 3139

Today's topics:
        Displaying 'umlaut' character <dn.perl@gmail.com>
    Re: Displaying 'umlaut' character <ben@morrow.me.uk>
    Re: Displaying 'umlaut' character (Jens Thoms Toerring)
    Re: Displaying 'umlaut' character <fbortel@home.nl>
    Re: Displaying 'umlaut' character <hjp-usenet2@hjp.at>
    Re: Displaying 'umlaut' character <fbortel@home.nl>
    Re: Displaying 'umlaut' character <hhr-m@web.de>
    Re: Displaying 'umlaut' character <ben@morrow.me.uk>
    Re: Displaying 'umlaut' character <ben@morrow.me.uk>
    Re: Displaying 'umlaut' character <ben@morrow.me.uk>
    Re: FAQ 1.14 What is a JAPH? <bart.lateur@telenet.be>
    Re: FAQ 1.14 What is a JAPH? <ben@morrow.me.uk>
        Getopt:Long arguments that are not (options or option v <nahum_barnea@yahoo.com>
    Re: Getopt:Long arguments that are not (options or opti <NoSpamPleaseButThisIsValid3@gmx.net>
    Re: Getopt:Long arguments that are not (options or opti <jl_post@hotmail.com>
    Re: Getopt:Long arguments that are not (options or opti <peter@makholm.net>
    Re: Removing tag + closing tag <jwcarlton@gmail.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Tue, 21 Sep 2010 21:50:31 -0700 (PDT)
From: "dn.perl@gmail.com" <dn.perl@gmail.com>
Subject: Displaying 'umlaut' character
Message-Id: <54c79751-70b5-40e5-831d-b5322a724286@z34g2000pro.googlegroups.com>


My aim is to display the ‘special’ (non-ASCII) German character/
diacritic umlaut or diaeresis correctly in a browser. The browser calls
a cgi perl-script which resides on a linux server. The browser which
calls the perl-script displays Vietnamese characters correctly (but
not the umlaut) without any special setting. The script sets the
NLS_LANG variable to AMERICAN_AMERICA.UTF8 and uses the utf8 module,
but that’s about it.

$ENV{'NLS_LANG'}='AMERICAN_AMERICA.UTF8';
    Works for Vietnamese characters, but not with umlaut (ö).

But even before we get to a perl-script, perhaps the LC_CTYPE env
variable needs to be set correctly. From my windows laptop, if I
access Oracle through Oracle Query Server, I can see the umlaut. But
if I open a linux-window, initiate an sqlplus session, and run the
same SQL, I do not see the umlaut correctly. I have tried a few values
for the env variable LC_CTYPE (like iso_8859_1, en_US,
en_US.iso88591), but with no luck. The surprising thing is that
the umlaut is a well-known character while Vietnamese characters are
less well known, yet the Vietnamese characters are being displayed
correctly.

What settings should I use in a perl-script or for a linux-window to
see the umlaut correctly? Please advise.



------------------------------

Date: Wed, 22 Sep 2010 07:01:31 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Displaying 'umlaut' character
Message-Id: <rmiom7-n45.ln1@osiris.mauzo.dyndns.org>


Quoth "dn.perl@gmail.com" <dn.perl@gmail.com>:
> 
> My aim is to display the ‘special’ (NON-Ascii) German character/
> diacritic umlaut or diaresis correctly on a browser. The browser calls
> a cgi perl-script which resides on a linux server. The browser which
> calls the perl-script displays Vietnamese characters correctly (but
> not the umlaut) without any special setting. The script sets NLS_LANG
> variable to AMERICAN_AMERICA.UTF8 and uses utf8 module, but that’s
> about it.

You almost certainly don't want to do either of those. 'use utf8' does
exactly one thing: it tells Perl your script itself is written in UTF-8.
If that isn't the case you don't want to use it. Perl also doesn't take
any notice of NLS_LANG or any of the other locale envvars unless you ask
it to (and, normally, that's a bad idea). However, it's possible that
whatever database interface you're using does.

> $ENV{'NLS_LANG'}='AMERICAN_AMERICA.UTF8';
>     Works for Vietnamese characters, but not with umlaut (ö).

I don't think that's usually a valid locale on a Linux system. Usually
they are of the form 'en_US.UTF-8', but in any case if you need locales
at all you will want to check which locales are available on your
system.

> But even before we get to a perl-script, perhaps the LC_CTYPE env
> variable needs to be set correctly. From my windows laptop, if I
> access Oracle through Oracle Query Server, I can see the umlaut. But
> if I open a linux-window, initiate an sqlplus session, and run the
> same SQL, I do not see the umlaut correctly. I have tried a few values
> for the env variable LC_CTYPE (like iso_8859_1, en_US,
> en_US.iso88591), but with no luck. The surprising thing is that
> ‘umalut’ is a muck-known alphabet, Vietnamese alphabets are less-
> known. Yet the Vietnamese characters are being displayed correctly.
> 
> What settings should I use in a perl-script or for a linux-window to
> see the umlaut correctly? Please advise.

OK. What is actually stored in the database (what data types are you
using, and how is the data encoded before being stored)? How are you
getting the data out of the database (the only correct answer here is
'DBI', or possibly a wrapper around that)? Have you read the DBI and
DBD::Oracle docs for anything concerning character encodings? Have you
read perlunitut and the other docs that refers you to?

FWIW when I do this sort of thing I use Postgres with DBD::Pg, I set the
database encoding to UTF-8 (this is a Pg-specific feature, but I
wouldn't be surprised if Ora has got something similar), I push an
:encoding(utf8) layer onto any filehandles, I make sure to send a
'Content-type: text/html; charset=utf-8' header, and everything Just
Works. There are variations on that which work just as well, but that's
by far the simplest approach.
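
A minimal sketch of that recipe, assuming a plain CGI script that
prints to STDOUT. The variable $city and the helper response() are
hypothetical names used only for illustration, and the strict
:encoding(UTF-8) spelling is used here in place of :encoding(utf8):

```perl
use strict;
use warnings;

# Hypothetical value standing in for text fetched from the database;
# it is a character string containing one character above 0x7F.
my $city = "K\x{F6}ln";

# Build the whole CGI response: HTTP header, blank line, HTML body.
sub response {
    my ($text) = @_;
    return "Content-type: text/html; charset=utf-8\r\n\r\n"
         . "<html><body><p>$text</p></body></html>\n";
}

# Encode characters to UTF-8 bytes on the way out, so the bytes sent
# agree with the charset announced in the header.
binmode STDOUT, ':encoding(UTF-8)'
    or die "Cannot set output layer: $!";
print response($city);
```

The point of the sketch is only that the output layer and the charset
parameter have to agree; how the data arrives from the database is a
separate question.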

Ben



------------------------------

Date: 22 Sep 2010 07:18:28 GMT
From: jt@toerring.de (Jens Thoms Toerring)
Subject: Re: Displaying 'umlaut' character
Message-Id: <8ftou4Fnd8U1@mid.uni-berlin.de>

In comp.lang.perl.misc dn.perl@gmail.com <dn.perl@gmail.com> wrote:

> My aim is to display the ‘special’ (NON-Ascii) German character/
> diacritic umlaut or diaresis correctly on a browser. The browser calls
> a cgi perl-script which resides on a linux server. The browser which
> calls the perl-script displays Vietnamese characters correctly (but
> not the umlaut) without any special setting.

Stop right here. If by "browser" you mean something like
Firefox, Internet Explorer etc. then there's some misunderstanding
here. The browser does not "call" a cgi-script. The
browser just sends a request to the server which in turn may
call a cgi-script (that may be written in Perl) and then sends
the results back to the browser. And a web server normally
sends a HTML header with the page that may contain a line
like

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

That tells the browser which type of character set to use
when displaying the page it got from the server. And when
the browser has the necessary fonts it will display the
page correctly (otherwise some or all of the characters
may be replaced by a square or something like that). (And
it also may require that the web-server isn't configured
to send conflicting information in the HTTP reply header..)

> The script sets NLS_LANG
> variable to AMERICAN_AMERICA.UTF8 and uses utf8 module, but that’s
> about it.

> $ENV{'NLS_LANG'}='AMERICAN_AMERICA.UTF8';
>     Works for Vietnamese characters, but not with umlaut (ö).

> But even before we get to a perl-script, perhaps the LC_CTYPE env
> variable needs to be set correctly.

> From my windows laptop, if I
> access Oracle through Oracle Query Server, I can see the umlaut. But
> if I open a linux-window, initiate an sqlplus session, and run the
> same SQL, I do not see the umlaut correctly. I have tried a few values
> for the env variable LC_CTYPE (like iso_8859_1, en_US,
> en_US.iso88591), but with no luck. The surprising thing is that
> ‘umalut’ is a muck-known alphabet, Vietnamese alphabets are less-
> known. Yet the Vietnamese characters are being displayed correctly.

> What settings should I use in a perl-script or for a linux-window to
> see the umlaut correctly? Please advise.

All this doesn't seem to be a Perl problem but one of how your
terminal is set up. If the terminal isn't started with the correct
setting for LC_CTYPE then it won't display Unicode characters, no
matter what you set afterwards - that is only seen by programs that
you start from that terminal. They then might try to output UTF-8
but the terminal doesn't know how to display it. The simplest thing
probably would be to start a new terminal with LC_CTYPE set to
something reasonable, for example with the command

LC_CTYPE=en_US.UTF-8 xterm

Then the new xterm you started should display UTF-8 quite
fine (assuming that the en_US.UTF-8 locale is installed on
your machine).

To make that setting of LC_CTYPE the default you could add a
line of

export LC_CTYPE=en_US.UTF-8

into your .bashrc file, or to make it the system-wide default,
into /etc/bash.bashrc.

Now, getting a Perl script to deal correctly with UTF-8 is still
another thing. If it takes input from files etc. it may have to
indicate that it expects UTF-8 from them in the call of open(),
 e.g. by using

open my $f, '<:utf8', $filename;

But that's just one point. And since you don't show any Perl
code it's too hard to guess what you may need.

                               Regards, Jens
-- 
  \   Jens Thoms Toerring  ___      jt@toerring.de
   \__________________________      http://toerring.de


------------------------------

Date: Wed, 22 Sep 2010 09:20:05 +0200
From: Frank van Bortel <fbortel@home.nl>
Subject: Re: Displaying 'umlaut' character
Message-Id: <92c84$4c99ae25$524ba3af$28867@cache6.tilbu1.nb.home.nl>

On 09/22/2010 06:50 AM, dn.perl@gmail.com wrote:
>
> My aim is to display the ‘special’ (NON-Ascii) German character/
> diacritic umlaut or diaresis correctly on a browser. The browser calls
> a cgi perl-script which resides on a linux server. The browser which
> calls the perl-script displays Vietnamese characters correctly (but
> not the umlaut) without any special setting. The script sets NLS_LANG
> variable to AMERICAN_AMERICA.UTF8 and uses utf8 module, but that’s
> about it.
>
> $ENV{'NLS_LANG'}='AMERICAN_AMERICA.UTF8';
>      Works for Vietnamese characters, but not with umlaut (ö).
>
> But even before we get to a perl-script, perhaps the LC_CTYPE env
> variable needs to be set correctly. From my windows laptop, if I
> access Oracle through Oracle Query Server, I can see the umlaut. But
> if I open a linux-window, initiate an sqlplus session, and run the
> same SQL, I do not see the umlaut correctly. I have tried a few values
> for the env variable LC_CTYPE (like iso_8859_1, en_US,
> en_US.iso88591), but with no luck. The surprising thing is that
> ‘umalut’ is a muck-known alphabet, Vietnamese alphabets are less-
> known. Yet the Vietnamese characters are being displayed correctly.
>
> What settings should I use in a perl-script or for a linux-window to
> see the umlaut correctly? Please advise.
>
Maybe this helps: (shameless self promotion)
http://vanbortel.blogspot.com/2009/04/special-characters-part-i.html
Last part is here:
http://vanbortel.blogspot.com/2010/01/special-characters-part-iv.html
-- 

Regards,

Frank van Bortel


------------------------------

Date: Wed, 22 Sep 2010 09:36:59 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: Displaying 'umlaut' character
Message-Id: <slrni9jcgt.9om.hjp-usenet2@hrunkner.hjp.at>

On 2010-09-22 06:01, Ben Morrow <ben@morrow.me.uk> wrote:
> Quoth "dn.perl@gmail.com" <dn.perl@gmail.com>:
>> My aim is to display the ‘special’ (NON-Ascii) German character/
>> diacritic umlaut or diaresis correctly on a browser. The browser calls
>> a cgi perl-script which resides on a linux server. The browser which
>> calls the perl-script displays Vietnamese characters correctly (but
>> not the umlaut) without any special setting. The script sets NLS_LANG
>> variable to AMERICAN_AMERICA.UTF8 and uses utf8 module, but that’s
>> about it.
>
> You almost certainly don't want to do either of those. 'use utf8' does
> exactly one thing: it tells Perl your script itself is written in UTF-8.
> If that isn't the case you don't want to use it. Perl also doesn't take
> any notice of NLS_LANG or any of the other locale envvars unless you ask
> it to (and, normally, that's a bad idea). However, it's possible that
> whatever database interface you're using does.
>
>> $ENV{'NLS_LANG'}='AMERICAN_AMERICA.UTF8';
>>     Works for Vietnamese characters, but not with umlaut (ö).
>
> I don't think that's usually a valid locale on a Linux system. Usually
> they are of the form 'en_US.UTF-8', but in any case if you need locales
> at all you will want to check which locales are available on your
> system.

The NLS_LANG environment variable is for Oracle. He does need that if he
wants to get anything but US-ASCII out of (or into) an Oracle database.
AMERICAN_AMERICA.UTF8 is a valid locale for Oracle, but for Oracle 9 or
later you should use .AL32UTF8 instead of .UTF8 (.AL32UTF8 is real
UTF-8, .UTF8 is a weird mixture of UTF-8 and UTF-16).



>> But even before we get to a perl-script, perhaps the LC_CTYPE env
>> variable needs to be set correctly. From my windows laptop, if I
>> access Oracle through Oracle Query Server, I can see the umlaut. But
>> if I open a linux-window,

Whatever "a linux window" may be. Putty? An X server? A VM running on
the windows host? Whatever it is, NLS_LANG must match the character set
used by the terminal emulator.

>> initiate an sqlplus session, and run the same SQL, I do not see the
>> umlaut correctly. I have tried a few values for the env variable
>> LC_CTYPE (like iso_8859_1, en_US, en_US.iso88591), but with no luck.
>> The surprising thing is that ‘umalut’ is a muck-known alphabet,
>> Vietnamese alphabets are less- known. Yet the Vietnamese characters
>> are being displayed correctly.
>> 
>> What settings should I use in a perl-script or for a linux-window to
>> see the umlaut correctly? Please advise.
>
> OK. What is actually stored in the database (what data types are you
> using, and how is the data encoded before being stored)? How are you
> getting the data out of the database (the only correct answer here is
> 'DBI', or possibly a wrapper around that)? Have you read the DBI and
> DBD::Oracle docs for anything concerning character encodings? Have you
> read perlunitut and the other docs that refers you to?
>
> FWIW when I do this sort of thing I use Postgres with DBD::Pg, I set the
> database encoding to UTF-8 (this is a Pg-specific feature, but I
> wouldn't be surprised if Ora has got something similar),

DBD::Oracle does this if NLS_LANG includes a UTF-8-like character set.
Since he has set that correctly he gets wide characters back from the
database. The umlauts all have character codes <= 0xFF, so they can be
printed as a single byte and Perl does that. The Vietnamese characters
have codes >= 0x0100, so Perl converts them to UTF-8 (I bet he has a lot
of "Wide character in print" warnings in his log file).

> I push an :encoding(utf8) layer onto any filehandles, I make sure to
> send a 'Content-type: text/html; charset=utf-8' header, and everything
> Just Works. There are variations on that which work just as well, but
> that's by far the simplest approach.

ACK. The OP is probably missing the :encoding(utf8) layer.
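
The single-byte behaviour described above can be checked with a small
sketch, assuming an in-memory filehandle stands in for STDOUT. The
helper name bytes_written is hypothetical:

```perl
use strict;
use warnings;

# Print $text to an in-memory filehandle opened with the given mode
# and return the raw bytes that were produced.
sub bytes_written {
    my ($mode, $text) = @_;
    my $buffer = '';
    open my $out, $mode, \$buffer
        or die "Cannot open in-memory file: $!";
    print {$out} $text;
    close $out or die "Cannot close in-memory file: $!";
    return $buffer;
}

my $umlaut = "\x{F6}";    # o with diaeresis, code point 0xF6

my $raw     = bytes_written('>', $umlaut);
my $encoded = bytes_written('>:encoding(UTF-8)', $umlaut);

# %vX prints the ordinal of each byte in hex, joined by dots.
printf "no layer:        %d byte(s): %vX\n", length($raw),     $raw;
printf ":encoding layer: %d byte(s): %vX\n", length($encoded), $encoded;
```

Without a layer the umlaut leaves Perl as the single byte F6, which is
not valid UTF-8 on its own; with the layer it leaves as the two bytes
C3 B6 that a browser told "charset=utf-8" expects.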

	hp



------------------------------

Date: Wed, 22 Sep 2010 10:13:42 +0200
From: Frank van Bortel <fbortel@home.nl>
Subject: Re: Displaying 'umlaut' character
Message-Id: <2801b$4c99bab7$524ba3af$13858@cache1.tilbu1.nb.home.nl>

On 09/22/2010 06:50 AM, dn.perl@gmail.com wrote:
>
> My aim is to display the ‘special’ (NON-Ascii) German character/
> diacritic umlaut or diaresis correctly on a browser. The browser calls
> a cgi perl-script which resides on a linux server. The browser which
> calls the perl-script displays Vietnamese characters correctly (but
> not the umlaut) without any special setting. The script sets NLS_LANG
> variable to AMERICAN_AMERICA.UTF8 and uses utf8 module, but that’s
> about it.
>
> $ENV{'NLS_LANG'}='AMERICAN_AMERICA.UTF8';
>      Works for Vietnamese characters, but not with umlaut (ö).
>
> But even before we get to a perl-script, perhaps the LC_CTYPE env
> variable needs to be set correctly. From my windows laptop, if I
> access Oracle through Oracle Query Server, I can see the umlaut. But
> if I open a linux-window, initiate an sqlplus session, and run the
> same SQL, I do not see the umlaut correctly. I have tried a few values
> for the env variable LC_CTYPE (like iso_8859_1, en_US,
> en_US.iso88591), but with no luck. The surprising thing is that
> ‘umalut’ is a muck-known alphabet, Vietnamese alphabets are less-
> known. Yet the Vietnamese characters are being displayed correctly.
>
> What settings should I use in a perl-script or for a linux-window to
> see the umlaut correctly? Please advise.
>

Apart from what I replied earlier, the correct way to encode
is of course "&ouml;" (without the quotes...)
As this is all ASCII, no problems should arise.
-- 

Regards,

Frank van Bortel


------------------------------

Date: Wed, 22 Sep 2010 10:16:01 +0200
From: Helmut Richter <hhr-m@web.de>
Subject: Re: Displaying 'umlaut' character
Message-Id: <Pine.LNX.4.64.1009220955010.4549@lxhri01.lrz.lrz-muenchen.de>

On Wed, 22 Sep 2010, Jens Thoms Toerring wrote:

> And a web server normally
> sends a HTML header with the page that may contain a line
> like
> 
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
> 
> That tells the browser which type of character set to use
> when displaying the page it got from the server.

Caution: A Web server sends an HTTP header (this is *not* a part of the 
text of the Web page, in particular, it has *not* the form of an HTML tag 
like <meta>) telling the MIME type, e.g. "text/html" and *optionally* 
containing a charset specification, e.g. "charset=utf-8". The Web page may 
*optionally* contain such a <meta> tag. The Web server is not obliged to 
send the HTTP header suggested in the <meta> tag, and most servers don't 
-- I am not sure any of them does.

In the special case that the page is generated by a CGI script, the output 
of the script contains *both* the HTTP header and the HTML text.

If there is a character code specified in the HTTP header, it takes 
precedence. If there is none, the one in the <meta> tag is honoured. 
Opinions are divided whether one should use the <meta> tag, as it has not 
always the intended effect, to wit when the server sends an HTTP header 
with a diverging code specification. I prefer using it for documentation 
and for those cases where there is no other code specification. More 
important than its presence is that it is true when present -- if true, 
it never does any harm. All these specifications only describe what 
code is used in the content; they do not enforce the code.

Back to perl: Whatever your problem is (which is by no means obvious), you 
won't be able to understand it, let alone fix it, before knowing what is 
written in http://perldoc.perl.org/perlunitut.html.

-- 
Helmut Richter


------------------------------

Date: Wed, 22 Sep 2010 09:09:02 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Displaying 'umlaut' character
Message-Id: <u5qom7-io5.ln1@osiris.mauzo.dyndns.org>


Quoth jt@toerring.de (Jens Thoms Toerring):
> In comp.lang.perl.misc dn.perl@gmail.com <dn.perl@gmail.com> wrote:
> 
> > My aim is to display the ‘special’ (NON-Ascii) German character/
> > diacritic umlaut or diaresis correctly on a browser. The browser calls
> > a cgi perl-script which resides on a linux server. The browser which
> > calls the perl-script displays Vietnamese characters correctly (but
> > not the umlaut) without any special setting.
> 
> Stop right here. If you mean with "browser" something like
> firefox, Internet Explorer etc. then there's some mis-under-
> standing here. The browser does not "call" a cgi-script. The
> browser just sends a request to the server which in turn may
> call a cgi-script (that may be written in Perl) and then sends
> the results back to the browser. And a web server normally
> sends a HTML header with the page that may contain a line
> like
> 
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

I think you mean 'the page may contain a <head> section which may
contain a line like...'. Also, it's pretty-much always better to put
that in the HTTP header.

<snip>
> Now, getting a Perl script to deal correctly with UTF-8 is still
> another thing. If it takes input from files etc. it may have to
> indicate that it expects UTF-8 from them in the call of open(),
>  e.g. by using
> 
> open my $f, '<:utf8', $filename;

Don't do that. If the file contains invalid UTF-8, you will get strange
behaviour up to and including perl segfaults. Use :encoding(utf8)
instead: it's a little slower, but much safer. (This doesn't, in
general, apply to filehandles open for output only.)
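
A short sketch of reading through the checking layer, assuming the
file is meant to hold UTF-8 text. The helper name slurp_utf8 and the
throwaway temp file are illustrative only, and the strict UTF-8
spelling is used in place of utf8:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read a whole file as UTF-8 text, decoding through the checking
# :encoding layer rather than the unchecked :utf8 flag.
sub slurp_utf8 {
    my ($filename) = @_;
    open my $in, '<:encoding(UTF-8)', $filename
        or die "Cannot open $filename: $!";
    local $/;                 # slurp mode: read everything at once
    my $text = <$in>;
    close $in;
    return $text;
}

# Demonstration with a throwaway file holding the UTF-8 bytes for
# the word "Koeln" spelled with an o-umlaut.
my ($tmp, $tmpname) = tempfile(UNLINK => 1);
binmode $tmp;                 # write raw bytes, no layer
print {$tmp} "K\xC3\xB6ln\n";
close $tmp or die "Cannot close $tmpname: $!";

my $text = slurp_utf8($tmpname);
printf "%d characters read\n", length $text;
```

The string returned is a character string: the two bytes C3 B6 in the
file become the single character U+00F6.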

Ben



------------------------------

Date: Wed, 22 Sep 2010 09:19:25 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Displaying 'umlaut' character
Message-Id: <dpqom7-io5.ln1@osiris.mauzo.dyndns.org>


Quoth "Peter J. Holzer" <hjp-usenet2@hjp.at>:
> On 2010-09-22 06:01, Ben Morrow <ben@morrow.me.uk> wrote:
> >
> > You almost certainly don't want to do either of those. 'use utf8' does
> > exactly one thing: it tells Perl your script itself is written in UTF-8.
> > If that isn't the case you don't want to use it. Perl also doesn't take
> > any notice of NLS_LANG or any of the other locale envvars unless you ask
> > it to (and, normally, that's a bad idea). However, it's possible that
> > whatever database interface you're using does.
> >
> >> $ENV{'NLS_LANG'}='AMERICAN_AMERICA.UTF8';
> >>     Works for Vietnamese characters, but not with umlaut (ö).
> >
> > I don't think that's usually a valid locale on a Linux system. Usually
> > they are of the form 'en_US.UTF-8', but in any case if you need locales
> > at all you will want to check which locales are available on your
> > system.
> 
> The NLS_LANG environment variable is for Oracle. He does need that if he
> wants to get anything but US-ASCII out of (or into) an Oracle database.
> AMERICAN_AMERICA.UTF8 is a valid locale for Oracle, but for Oracle 9 or
> later you should use .AL32UTF8 instead of .UTF8 (.AL32UTF8 is real
> UTF-8, .UTF8 is a weird mixture of UTF-8 and UTF-16).

Ah, I see. (I don't use Oracle.) I was getting confused with NLSPATH
used by catgets(3), I think.

Weird choice of environment variable: I would expect something prefixed
with OC8 or some such. <shrug> I guess it's just part of the 'we own the
whole world' Oracle mentality... :)

> > FWIW when I do this sort of thing I use Postgres with DBD::Pg, I set the
> > database encoding to UTF-8 (this is a Pg-specific feature, but I
> > wouldn't be surprised if Ora has got something similar),
> 
> DBD::Oracle does this if NLS_LANG includes a UTF-8-like character set.

In Pg this is a per-database setting indicating how the strings are
stored as well as how they are returned by default; asking for
per-connection on-the-fly reencoding is different. (Not really important
here, I know.)

> Since he has set that correctly he gets wide characters back from the
> database. The umlauts all have character codes <= 0xFF, so they can be
> printed as a single byte and perl does that. The vietnamese characters
> have codes >= 0x0100, so Perl converts them to UTF-8 (I bet he has a lot
> of "Wide character in print" warnings in log file).

Yup. This presumably means he *is* correctly sending the charset
Content-type parameter, otherwise the situation would be exactly
reversed.

Ben



------------------------------

Date: Wed, 22 Sep 2010 09:22:26 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Displaying 'umlaut' character
Message-Id: <2vqom7-io5.ln1@osiris.mauzo.dyndns.org>


Quoth frank.van.bortel@gmail.com:
> On 09/22/2010 06:50 AM, dn.perl@gmail.com wrote:
> >
> > My aim is to display the ‘special’ (NON-Ascii) German character/
> > diacritic umlaut or diaresis correctly on a browser. The browser calls
> > a cgi perl-script which resides on a linux server. The browser which
> > calls the perl-script displays Vietnamese characters correctly (but
> > not the umlaut) without any special setting. The script sets NLS_LANG
> > variable to AMERICAN_AMERICA.UTF8 and uses utf8 module, but that’s
> > about it.
> >
> > $ENV{'NLS_LANG'}='AMERICAN_AMERICA.UTF8';
> >      Works for Vietnamese characters, but not with umlaut (ö).
> >
> > But even before we get to a perl-script, perhaps the LC_CTYPE env
> > variable needs to be set correctly. From my windows laptop, if I
> > access Oracle through Oracle Query Server, I can see the umlaut. But
> > if I open a linux-window, initiate an sqlplus session, and run the
> > same SQL, I do not see the umlaut correctly. I have tried a few values
> > for the env variable LC_CTYPE (like iso_8859_1, en_US,
> > en_US.iso88591), but with no luck. The surprising thing is that
> > ‘umalut’ is a muck-known alphabet, Vietnamese alphabets are less-
> > known. Yet the Vietnamese characters are being displayed correctly.
> >
> > What settings should I use in a perl-script or for a linux-window to
> > see the umlaut correctly? Please advise.
> >
> 
> Apart from what I replied earlier, the correct way to encode
> is of course "&ouml;" (without the quotes...)
> As this is all ASCII, no problems should arise.

Also note that if you push :encoding(US-ASCII) with
$PerlIO::encoding::fallback set to Encode::FB_XMLCREF Perl will do the
conversion for you (well, it'll give you &#xHHHH; entities, but that's
equivalent). (Yes, this is a really nasty interface.)
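
The same fallback can be tried without the PerlIO layer by calling
Encode directly. This is a sketch of that simpler route, not of the
$PerlIO::encoding::fallback interface itself; the helper name
to_ascii_with_entities is hypothetical:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Convert a character string to pure ASCII, replacing anything that
# ASCII cannot represent with a hexadecimal character reference.
sub to_ascii_with_entities {
    my ($text) = @_;
    # encode() can modify its input for some CHECK values, so it is
    # given the local copy in $text rather than the caller's variable.
    return encode('US-ASCII', $text, Encode::FB_XMLCREF);
}

my $ascii = to_ascii_with_entities("K\x{F6}ln");
print "$ascii\n";
```

The result contains only ASCII bytes, so it survives whatever charset
the server or browser assumes, at the cost of less readable HTML
source.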

Ben



------------------------------

Date: Wed, 22 Sep 2010 08:44:19 +0200
From: Bart Lateur <bart.lateur@telenet.be>
Subject: Re: FAQ 1.14 What is a JAPH?
Message-Id: <1d9j96h4l4tf5nefr32ofuhsc78vh6nu8o@4ax.com>

Randal L. Schwartz wrote:

>I think the first non-Randal JAPH was in fact, Japhy.  What ever
>happened to him?

He's been back on Perlmonks for a week or so. I think PHP was what
happened to him.

-- 
	Bart.


------------------------------

Date: Wed, 22 Sep 2010 08:13:15 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: FAQ 1.14 What is a JAPH?
Message-Id: <btmom7-5g5.ln1@osiris.mauzo.dyndns.org>


Quoth Bart Lateur <bart.lateur@telenet.be>:
> Randal L. Schwartz wrote:
> 
> >I think the first non-Randal JAPH was in fact, Japhy.  What ever
> >happened to him?
> 
> Since a week or so he's back on Perlmonks. I think PHP was what happened
> to him.

Eww. I wish him a speedy recovery.

Ben

(SCNR)



------------------------------

Date: Tue, 21 Sep 2010 10:33:17 -0700 (PDT)
From: nahum_barnea <nahum_barnea@yahoo.com>
Subject: Getopt:Long arguments that are not (options or option values)
Message-Id: <02dc3d7d-3465-4568-ab39-dabd0d3b6ad8@f26g2000vbm.googlegroups.com>

Hi Group.
I use Getopt::Long in my Perl script.
Now, I know how to get command line options and their values.
But I want to get into a Perl array all the command line arguments
that are NOT options and NOT values.

For example:

 ./myscript.pl -f option_val_of_f -g option_val_of_g   arg1 arg2 arg3

I would like to get a Perl array containing arg1, arg2, arg3.

Do you know how to do such stuff?

Thanks,
NAHUM


------------------------------

Date: Tue, 21 Sep 2010 19:58:18 +0200
From: Wolf Behrenhoff <NoSpamPleaseButThisIsValid3@gmx.net>
Subject: Re: Getopt:Long arguments that are not (options or option values)
Message-Id: <4c98f23b$0$6976$9b4e6d93@newsspool4.arcor-online.net>

On 21.09.2010 19:33, nahum_barnea wrote:
> Hi Group.
> I use Perl "use Getopt::Long;" .
> Now, I know how to get command line options and their values.
> But I want to get into a Perl array all the command line arguments
> that are NOT options and NOT values.
> 
> For example:
> 
> ./myscript.pl -f option_val_of_f -g option_val_of_g   arg1 arg2 arg3
> 
> I would like to get a Perl array with arg1,arg2,arg3  .

Look at @ARGV.

Wolf


------------------------------

Date: Tue, 21 Sep 2010 10:59:18 -0700 (PDT)
From: "jl_post@hotmail.com" <jl_post@hotmail.com>
Subject: Re: Getopt:Long arguments that are not (options or option values)
Message-Id: <15072431-14b0-4aea-9113-de290b87bc52@c21g2000vba.googlegroups.com>

On Sep 21, 11:33 am, nahum_barnea <nahum_bar...@yahoo.com> wrote:
> I use Perl "use Getopt::Long;" .
> Now, I know how to get command line options and their values.
> But I want to get into a Perl array all the command line arguments
> that are NOT options and NOT values.
>
> For example:
>
> ./myscript.pl -f option_val_of_f -g option_val_of_g   arg1 arg2 arg3
>
> I would like to get a Perl array with arg1,arg2,arg3  .


   Just check @ARGV after you call GetOptions().  Calling GetOptions()
will cause the options (and their values) to be removed from the @ARGV
array, leaving only the non-option-arguments behind.

   So then, @ARGV has exactly what you're looking for (but only AFTER
you call GetOptions()).
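
   A small sketch of that, assuming -f and -g each take a string
value. The option specs 'f=s' and 'g=s' and the variable names are
guesses for illustration, and @ARGV is filled in by hand here to
simulate the command line from the question:

```perl
use strict;
use warnings;
use Getopt::Long;

# Simulate the command line from the question (illustrative values).
local @ARGV = qw(-f option_val_of_f -g option_val_of_g arg1 arg2 arg3);

my ($f, $g);
GetOptions(
    'f=s' => \$f,    # -f takes a string value
    'g=s' => \$g,    # -g takes a string value
) or die "Bad options\n";

# Whatever GetOptions() did not consume is still in @ARGV.
my @rest = @ARGV;
print "f=$f g=$g rest=@rest\n";
```

   After the call, @rest holds arg1, arg2 and arg3, while the two
options and their values are gone from @ARGV.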

   I hope this helps, Nahum.

   -- Jean-Luc


------------------------------

Date: Tue, 21 Sep 2010 20:28:28 +0200
From: Peter Makholm <peter@makholm.net>
Subject: Re: Getopt:Long arguments that are not (options or option values)
Message-Id: <87vd5z3q9f.fsf@vps1.hacking.dk>

nahum_barnea <nahum_barnea@yahoo.com> writes:

> Now, I know how to get command line options and their values.
> But I want to get into a Perl array all the command line arguments
> that are NOT options and NOT values.

Read the documentation for the module and look for the 'pass_through'
configuration option.

//Makholm



------------------------------

Date: Wed, 22 Sep 2010 00:13:52 -0700 (PDT)
From: jwcarlton <jwcarlton@gmail.com>
Subject: Re: Removing tag + closing tag
Message-Id: <ea8d65b7-22cd-484e-846d-d5709cf981cb@c21g2000vba.googlegroups.com>

On Sep 20, 11:25 pm, Tad McClellan <ta...@seesig.invalid> wrote:
> jwcarlton <jwcarl...@gmail.com> wrote:
> > On Sep 20, 9:18 pm, Jürgen Exner <jurge...@hotmail.com> wrote:
> >> jwcarlton <jwcarl...@gmail.com> wrote:
> >> >Let's say I have something like this:
>
> >> >$var = "<font background='#F5F5F5'>Here is some <font
> >> >color='#DADADA'>text</font>. Cool, huh?</font>";
>
> >> >I want to remove <font background='#F5F5F5'> and its matching
> >> ></font>, but not the nested font tags.
> > Tad, I guess you got me! LOL TECHNICALLY, that works on my sample,
>
> My (secondary) point was that you have not taken care to make a
> good sample.
>
> > but doesn't quite solve the overall problem.
>
> I don't intend to solve your problems.
>
> I don't normally see any of your posts.  I was "slumming" down in
> the killfiled score range (I was bored).
>
> You make your reputation, and then you live with it.
>
> --
> Tad McClellan
> email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
> The above message is a Usenet post.
> I don't recall having given anyone permission to use it on a Web site.

Reputation? To my knowledge, I have no enemies here.

Methinks you may just be filtering people that use Google Groups. If
so, no skin off my back; others did a grand job answering, so all you
really contributed was unnecessary BS.


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests. 

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 3139
***************************************

