[29540] in Perl-Users-Digest


Perl-Users Digest, Issue: 784 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Aug 22 18:09:42 2007

Date: Wed, 22 Aug 2007 15:09:08 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Wed, 22 Aug 2007     Volume: 11 Number: 784

Today's topics:
    Re: Can I load a package's Autoload-able subs without e anno4000@radom.zrz.tu-berlin.de
    Re: FAQ 8.39 How do I set CPU limits? <brian.d.foy@gmail.com>
    Re: FAQ 8.39 How do I set CPU limits? xhoster@gmail.com
        Mac: Perl script that will run when double-clicked  amirkarger@gmail.com
    Re: Mac: Perl script that will run when double-clicked <spamtrap@dot-app.org>
    Re: On redhat, different users = different @INC <m@rtij.nl.invlalid>
    Re: Problem Creating Socket : Permission Denied <m@rtij.nl.invlalid>
        Starting with SOAP  <perl4hire@softouch.on.ca>
    Re: Starting with SOAP xhoster@gmail.com
    Re: Starting with SOAP <perl4hire@softouch.on.ca>
    Re: Stumped: returning a read pipe from a function xhoster@gmail.com
    Re: Symbolic representation of logical operators jgraber@ti.com
    Re: Unexpected 1 in Error File From DBI->connect <bsmith@sudleydeplacespam.com>
    Re: UTF-8 problem <vachkov@math.tu-berlin.de>
    Re: UTF-8 problem <m@rtij.nl.invlalid>
    Re: UTF-8 problem <vachkov@math.tu-berlin.de>
    Re: UTF-8 problem <m@rtij.nl.invlalid>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 22 Aug 2007 15:42:44 GMT
From: anno4000@radom.zrz.tu-berlin.de
Subject: Re: Can I load a package's Autoload-able subs without executing them?
Message-Id: <5j33rkF3s0bbqU1@mid.dfncis.de>

 <w.c.humann@arcor.de> wrote in comp.lang.perl.misc:
> I want to be able to really load all subs in a given package,
> including all load-on-demand subs. I don't want to execute them in the
> process. Ideally it should work irrespective of the mechanism (e.g.
> AutoLoader, SelfLoader, load-pragma) used inside the module. But the
> next best thing would be a solution just for "use AutoLoader"-modules.
> 
> I could manually look for ".al"-files and require them but that's not
> very elegant as it uses knowledge about internals I shouldn't need to
> have.

That would only cover cases where an external *.al file is used.

> Why do I want to do this? I'm deriving from Devel::TraceCalls to see
> what's going on inside my Perl/Tk application. For a while I wondered
> why some functions never got traced. Well, when the tracer is
> instantiated they are not there -- they are autoloaded later -- so the
> tracer can't wrap them...

Programmers frequently would like to do that.  Yours is as good a reason
as any.

The sad truth is that it can't be done.  All the mechanisms you mention
are based on the behavior of the AUTOLOAD routine, which is called
before the interpreter gives up on a sub name.  This is described
in perlsub.  One fact is that AUTOLOAD doesn't even have to define
the function it is called to "autoload".  It could perform the
function directly, depending on the sub name, and never leave a trace.
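
A minimal sketch of that last point, using a hypothetical package Lazy
(not part of any module mentioned in this thread).  Its AUTOLOAD answers
the call itself and never installs a sub, so there is nothing for a
tracer to find or wrap afterwards:

```perl
use strict;
use warnings;

package Lazy;    # hypothetical package, for illustration only

our $AUTOLOAD;

# Handles any undefined Lazy::* call directly; it never defines a sub.
sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;               # strip the package qualifier
    return if $name eq 'DESTROY';    # ignore object destruction
    return "handled $name(@_)";
}

package main;

my $result  = Lazy::greet('world');
my $defined = defined &Lazy::greet ? 1 : 0;

print "$result\n";                                     # handled greet(world)
print "Lazy::greet defined afterwards? $defined\n";    # 0
```

Even after the call has succeeded, Lazy::greet does not exist in the
symbol table, which is why no generic "load everything" step can
enumerate such subs in advance.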

Anno


------------------------------

Date: Wed, 22 Aug 2007 08:21:43 -0700
From: brian d  foy <brian.d.foy@gmail.com>
Subject: Re: FAQ 8.39 How do I set CPU limits?
Message-Id: <220820070821437799%brian.d.foy@gmail.com>

In article <1187735666.719070.239920@e9g2000prf.googlegroups.com>, Bill
H <bill@ts1000.us> wrote:

> I like these perl faq postings and have learned a lot from them, but
> is it possible to get more details in them? For instance this one.
> This is a very interesting subject:
> 
>  8.39: How do I set CPU limits?
> 
> But it only gives a one line answer:
> 
>      Use the BSD::Resource module from CPAN.
> 
> Could we get a little more detail in these? 

That answer is a bit light, but most FAQ answers give you what you need
to know to do more research on your own. They aren't meant to be
everything you need to know on the subject.

If someone has more information on any of the entries, I can
incorporate that into the current answers.

Thanks,

-- 
Posted via a free Usenet account from http://www.teranews.com



------------------------------

Date: 22 Aug 2007 16:43:51 GMT
From: xhoster@gmail.com
Subject: Re: FAQ 8.39 How do I set CPU limits?
Message-Id: <20070822124353.549$V4@newsreader.com>

Bill H <bill@ts1000.us> wrote:
> On Aug 21, 9:03 am, PerlFAQ Server <br...@stonehenge.com> wrote:
> >
> > --------------------------------------------------------------------
> >
> > 8.39: How do I set CPU limits?
> >
> >     Use the BSD::Resource module from CPAN.
> >
> > --------------------------------------------------------------------
> >
>
> I like these perl faq postings and have learned a lot from them, but
> is it possible to get more details in them? For instance this one.
> This is a very interesting subject:
>
>  8.39: How do I set CPU limits?
>
> But it only gives a one line answer:
>
>      Use the BSD::Resource module from CPAN.
>
> Could we get a little more detail in these?

I agree.  The docs for BSD::Resource seem to be targeted at people who
already understand the use of these limits from C and are now just
trying to port that knowledge to Perl.  They aren't very good for people
who are new to the concept in the first place.  Also, many of the examples
they give are incomplete--they depend on variables that have never been
set, and it isn't obvious what they are examples of in the first place.


> A cursory (and I do mean
> cursory) look at BSD::Resource on CPAN gives details on how to
> implement it but really doesn't explain upfront what it will actually
> do.

Down somewhere in the guts it does say:

       Processes have soft and hard resource limits.  On crossing
       the soft limit they receive a signal (for example the
       "SIGXCPU" or "SIGXFSZ", corresponding to the "RLIMIT_CPU"
       and "RLIMIT_FSIZE", respectively).  The processes can trap
       and handle some of these signals, please see "Signals" in
       perlipc.  After the hard limit the processes will be ruthlessly
       killed by the "KILL" signal which cannot be caught.

But it isn't easy to find and understand for someone who doesn't already
know the answer.


> Does it set how much CPU time a program can use, or a percentage
> of usage, or a time period before it will be stopped?

CPU time, not percentage, and not wall time.

How about something like this:

8.39: How do I set CPU limits?

    Use the BSD::Resource module from CPAN.
    As an example:

    use BSD::Resource;
    setrlimit(RLIMIT_CPU,10,20) or die $!;

    This sets the soft and hard limits to 10 and 20 seconds, respectively.
    After 10 seconds of time spent running on the CPU (not "wall" time),
    the process will be sent a signal (XCPU on some systems) which, if not
    trapped, will cause the process to terminate.  If that signal is
    trapped, then after 10 more seconds (20 seconds in total) the process
    will be killed with a non-trappable signal.

    See the BSD::Resource documentation and your system's documentation
    for the gory details.

It would also be nice if it described what other modules to use on
systems that don't support BSD::Resource, if anyone can contribute
that information.


Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service                        $9.95/Month 30GB


------------------------------

Date: Wed, 22 Aug 2007 10:28:03 -0700
From:  amirkarger@gmail.com
Subject: Mac: Perl script that will run when double-clicked
Message-Id: <1187803683.519241.275480@q3g2000prf.googlegroups.com>

(This may be more a Mac question than a Perl question.)

When I double-click on a Perl script, it opens in TextEdit. Can I tell
Mac to run Perl scripts when they're double-clicked? The "Open With"
menu only lets me pick .apps, and perl isn't one.

It's hard to get good Google results with words like "mac double click
perl". Is there something simple like Windows' file types?

-Amir Karger



------------------------------

Date: Wed, 22 Aug 2007 14:47:32 -0400
From: Sherm Pendley <spamtrap@dot-app.org>
Subject: Re: Mac: Perl script that will run when double-clicked
Message-Id: <m2tzqrv22j.fsf@dot-app.org>

amirkarger@gmail.com writes:

> When I double-click on a Perl script, it opens in TextEdit. Can I tell
> Mac to run Perl scripts when they're double-clicked? The "Open With"
> menu only lets me pick .apps, and perl isn't one.

Open it with Terminal.app.

sherm--

-- 
Web Hosting by West Virginians, for West Virginians: http://wv-www.net
Cocoa programming in Perl: http://camelbones.sourceforge.net


------------------------------

Date: Wed, 22 Aug 2007 22:03:22 +0200
From: Martijn Lievaart <m@rtij.nl.invlalid>
Subject: Re: On redhat, different users = different @INC
Message-Id: <pan.2007.08.22.20.03.22@rtij.nl.invlalid>

On Mon, 20 Aug 2007 12:00:34 -0700, Russ wrote:

> Hi,
> 
> We have RedHat 4EL and perl 5.8.5.  Per a user's request I installed
> Date::Simple, using perl -MCPAN -e shell as the root user.

Next time, do a 'yum install perl-Date-Simple'.

HTH,
M4


------------------------------

Date: Wed, 22 Aug 2007 22:04:25 +0200
From: Martijn Lievaart <m@rtij.nl.invlalid>
Subject: Re: Problem Creating Socket : Permission Denied
Message-Id: <pan.2007.08.22.20.04.25@rtij.nl.invlalid>

On Mon, 20 Aug 2007 15:23:27 +0000, vivekian wrote:

> Hi,
> 
> I have a cgi script which opens a telnet session to a switch. The script
> is listed below. It executes fine on command line. When i call it via a
> web browser, the error log shows the following error :
> 
> problem creating socket: Permission denied at
> /var/www/html/vlab/cgi-bin/init_lab.cgi line 10
> 
> Other scripts which don't have sockets execute fine.

Just a guess, but could it be SELinux that gets in the way?

M4


------------------------------

Date: Wed, 22 Aug 2007 13:24:06 -0400
From: Amer Neely <perl4hire@softouch.on.ca>
Subject: Starting with SOAP 
Message-Id: <nh_yi.361$k22.139@read2.cgocable.net>

I need to update a script on one server with data from a form on another 
server. It has been suggested that SOAP would work for this. I've never 
used SOAP, and am overwhelmed with the number of 'SOAP*' modules on 
CPAN. I've read that perhaps I should use a language with better support 
for SOAP (PHP ?) but the existing script is in Perl and I'd prefer to 
stick with that if possible.

I've got some bookmarked tutorials / documents etc. which I am reading 
through but really need some very basic direction as to what modules I 
might need to get started with this. I've successfully installed 
SOAP-0.28 on the server where the data will be coming from (the 
client?), but just need a nudge in the right direction.
-- 
Amer Neely
Perl | MySQL programming for all data entry forms.
"Others make web sites. We make web sites work!"


------------------------------

Date: 22 Aug 2007 20:39:56 GMT
From: xhoster@gmail.com
Subject: Re: Starting with SOAP
Message-Id: <20070822163958.537$sI@newsreader.com>

Amer Neely <perl4hire@softouch.on.ca> wrote:
> I need to update a script on one server with data from a form on another
> server. It has been suggested that SOAP would work for this. I've never
> used SOAP, and am overwhelmed with the number of 'SOAP*' modules on
> CPAN. I've read that perhaps I should use a language with better support
> for SOAP (PHP ?) but the existing script is in Perl and I'd prefer to
> stick with that if possible.

It sounds like the tail is wagging the dog.  For one thing, you probably
shouldn't update scripts based on form submissions.  Why not update some
database that the script accesses?  That would probably solve the problem
right there.  But if you want Perl script-to-Perl script communication,
pick a protocol that Perl is good at, rather than picking a random protocol
and then figuring out how to implement it in Perl.

Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service                        $9.95/Month 30GB


------------------------------

Date: Wed, 22 Aug 2007 17:44:16 -0400
From: Amer Neely <perl4hire@softouch.on.ca>
Subject: Re: Starting with SOAP
Message-Id: <46CCAE30.3000100@softouch.on.ca>

xhoster@gmail.com wrote:
> Amer Neely <perl4hire@softouch.on.ca> wrote:
>> I need to update a script on one server with data from a form on another
>> server. It has been suggested that SOAP would work for this. I've never
>> used SOAP, and am overwhelmed with the number of 'SOAP*' modules on
>> CPAN. I've read that perhaps I should use a language with better support
>> for SOAP (PHP ?) but the existing script is in Perl and I'd prefer to
>> stick with that if possible.
> 
> It sounds like the tail is wagging the dog.  For one thing, you probably
> shouldn't update scripts based on form submissions.  Why not update some
> database that the script accesses?  That would probably solve the problem
> right there.  But if you want Perl script-to-Perl script communication,
> pick a protocol that Perl is good at, rather than picking a random protocol
> and then figure out to implement in Perl.
> 
> Xho
> 

Sounds like good advice. However the 'other script' is not in my 
control, and I'm not even sure it is Perl - likely PHP. The owner is the 
one looking for a SOAP solution. They are asking for an XML document of 
the form data.

At present the form data is not being saved in a database, so that is 
not an immediate solution, although I could present that to my client 
and the 3rd party.

I have managed to get some headway on some test scripts. But an error 
message is confusing me.

The server code:
#! /usr/bin/perl
## test using SOAP to display values from another script

BEGIN
{
	open (STDERR,">>$0-err.txt");
	print STDERR "\n",scalar localtime,"\n";
}

use strict;
use warnings;

use lib '/home/softouch/public_html/cgi-bin/PerlMods/SOAP-0.28/blib/lib';
use SOAP::Transport::HTTP;
SOAP::Transport::HTTP::CGI
-> dispatch_to('ShowMe')
-> handle;

package LarMar;

sub ShowMe
{
	my $incoming = shift;
	return "$incoming\n";
}

######################################

The error:
Can't locate SOAP/Transport/HTTP.pm in @INC (@INC contains: 
/home/softouch/public_html/cgi-bin/PerlMods/SOAP-0.28/blib/lib 
/usr/lib/perl5/5.8.8/i686-linux /usr/lib/perl5/5.8.8 
/usr/lib/perl5/site_perl/5.8.8/i686-linux /usr/lib/perl5/site_perl/5.8.8 
/usr/lib/perl5/site_perl/5.8.7/i686-linux /usr/lib/perl5/site_perl/5.8.7 
/usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl/5.8.4 
/usr/lib/perl5/site_perl/5.8.3 /usr/lib/perl5/site_perl/5.8.2 
/usr/lib/perl5/site_perl/5.8.1 /usr/lib/perl5/site_perl/5.8.0 
/usr/lib/perl5/site_perl/5.6.2 /usr/lib/perl5/site_perl .) at 
larmar_server.pl line 14.
BEGIN failed--compilation aborted at larmar_server.pl line 14.
#######################################

It seems that it is looking for HTTP.pm, but HTTP is a directory under 
SOAP/Transport. CGI.pm is in the HTTP directory.

This is modified from a script in the SOAP::Lite distribution.
-- 
Amer Neely
Perl | MySQL programming for all data entry forms.
"Others make web sites. We make web sites work!"


------------------------------

Date: 22 Aug 2007 16:03:16 GMT
From: xhoster@gmail.com
Subject: Re: Stumped: returning a read pipe from a function
Message-Id: <20070822120317.864$pe@newsreader.com>

anno4000@radom.zrz.tu-berlin.de wrote:
>  <xhoster@gmail.com> wrote in comp.lang.perl.misc:
>
> [...]
>
> > Hmm.  That makes me wonder, when you do an ordinary pipe open
> > (not IPC::Open? open), the corresponding close automatically waits on
> > the child.  How does it know what pid to wait on?
>
> It has been my understanding that close() waits for one child to finish,
> never mind the PID.

A Linux strace of a simple program that does a forking open and a close
on my system shows that it waits for the specific pid, using "wait4".
Of course this could be a system-dependent thing.

> That's why specific "waitpid $my_known_pid" doesn't
> mix well with system() and friends -- Perl's child handling may have
> snatched the PID you're waiting for.

I've never noticed that problem (again, on linux).  In my experience it is
the *unspecific* waitpid (i.e. waitpid -1,...  Or just regular wait) done
in a $SIG{CHLD} handler that doesn't play nicely with system and qx.

>
> > The pid must be stored
> > somewhere in the resulting file handle, but where?  I tried finding it
> > with Devel::Peek, but couldn't.  Considering my lack of experience with
> > Devel::Peek, I guess that that isn't surprising.
>
> Well, there's glob magic involved.  If it's hidden there, it wouldn't be
> evident from a Devel::Peek::Dump of the file handle.  Another (less
> likely) possibility would be to associate the PID via the handle's
> refaddr (inside- out style).  That would leave no traces at all in the
> file handle.

I just looked at the source for 5.8.7.  It seems to store the pid in
some secret array, with the fd as the index.
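
A minimal sketch of the behaviour discussed above, assuming a Unix-like
system with "sh" on the PATH (the child command is made up for the
example).  A pipe open returns the child's pid, close() waits for that
child and sets $?, and a later waitpid for the same pid finds nothing
left to reap:

```perl
use strict;
use warnings;

# List-form pipe open forks a child; the return value is the child's pid.
my $pid = open(my $fh, '-|', 'sh', '-c', 'echo hello; exit 3')
    or die "Cannot fork: $!";

my $line = <$fh>;
chomp $line;

# close() on a pipe handle waits for the child and stores its wait status
# in $?.  It returns false when the child exited non-zero, so it is not
# followed by "or die" here.
close($fh);
my $exit_code = $? >> 8;

# The child has already been reaped, so waiting for it again fails.
my $reaped_again = waitpid($pid, 0);

print "read '$line' from child $pid, exit code $exit_code\n";
print "waitpid after close returned $reaped_again\n";
```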

Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service                        $9.95/Month 30GB


------------------------------

Date: 22 Aug 2007 10:17:11 -0500
From: jgraber@ti.com
Subject: Re: Symbolic representation of logical operators
Message-Id: <yvnps1ffvk8.fsf@famous02.dal.design.ti.com>



Paul Lalli <mritty@gmail.com> writes:

> On Aug 21, 10:08 am, markhob...@hotpop.deletethisbit.com (Mark Hobley)
> wrote:
> > Paul Lalli <mri...@gmail.com> wrote:
> 
> > print (2 || 7);  # 7, both true, but the first value is 2,
> 
> No it doesn't, unless you have some very old or obscure version of
> Perl installed:
> 
> $ perl -le'print (2 || 7);'
> 2
> $
> 
> When you run that, it produces 7?  Really?  If so, please copy and
> paste the output of this:
> perl -v

This is likely due to a typo on the OP's part: | instead of ||.
 perl -le 'print (2||7)' # prints  2,  logical or
 perl -le 'print (2|7)'  # prints  7,  bitwise num or: 0010b | 0111b = 0111b
 perl -le 'print (2|8)'  # prints 10,  bitwise num or: 0010b | 1000b = 1010b
also note:
 perl -le 'print ("2"|"7")' # prints 7, bitwise ascii or:
        #   00110010b = '2'
        # | 00110111b = '7' =  00110111b = '7'
 perl -le 'print ("2"|"8")' # prints :, bitwise ascii or: 
        #   00110010b = '2'
        # | 00111000b = '8' =  00111010b = ':'
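
The one-liners above can be collected into a single script.  This is
only a sketch; the variable names are made up for the example, and it
assumes the "bitwise" feature is not enabled, so that | picks numeric or
string behaviour from its operands:

```perl
use strict;
use warnings;

my $logical  = (2 || 7);        # logical or: first true operand
my $num_or   = (2 | 7);         # numeric bitwise or: 0010b | 0111b
my $num_or2  = (2 | 8);         # numeric bitwise or: 0010b | 1000b
my $str_or   = ("2" | "7");     # string bitwise or: 0x32 | 0x37 = 0x37
my $str_or2  = ("2" | "8");     # string bitwise or: 0x32 | 0x38 = 0x3A

# prints: 2, 7, 10, 7, :
print join(", ", $logical, $num_or, $num_or2, $str_or, $str_or2), "\n";
```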
-- 
Joel


------------------------------

Date: Wed, 22 Aug 2007 16:06:06 GMT
From: Bob Smith <bsmith@sudleydeplacespam.com>
Subject: Re: Unexpected 1 in Error File From DBI->connect
Message-Id: <O9Zyi.14381$ya1.4260@news02.roc.ny>

On 8/22/2007 10:12 AM, Paul Lalli wrote:
> On Aug 22, 10:04 am, Bob Smith <bsm...@sudleydeplacespam.com> wrote:
>> Using perl 5.8.7 and DBI 1.53 on a Linux system, the following
>> function outputs a spurious "1" to the web server's error file
>> on the DBI->connect line:
>>
>> sub DBConnect
>> {
>>    my ($DataBase) = @_;
>>    my $DSN_SFS  = "DBI:mysql:$DataBase";  # Data Source Name
>>    my $DSN_USER = "root";       # ...              (user name)
>>    my $DSN_PWD  = "secret";     # ...              (password)
>>
>>    my %attr = (PrintError => 0,  ## Don't report errors via warn ()
>>                RaiseError => 0   ## Don't report errors via die ()
>>               );
>>    $dbh = DBI->connect ($DSN_SFS, $DSN_USER, $DSN_PWD, \%attr) or die
>>                  print "Can't open database &lt;$DSN_SFS&gt;"
>>                      . "<br />$DBI::errstr";
>>    return $dbh;
>>
>> }
>>
>> Otherwise, the function works just fine.  Any ideas on what
>> could be triggering the spurious output?
> 
> die() takes a string to print to STDERR and exits the program.
> print() takes a string to print to STDOUT and returns 1 if successful.
> 
> die(print("whatever"));
> will therefore print "whatever" to STDOUT, and return 1 to die().
> die() will then print 1 to STDERR and exit the program.
> 
> Change die(print("whatever")) to die("whatever");

Excellent explanation!  Many thanks!
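
The explanation quoted above can be checked without a database.  This
is a small sketch in which eval stands in for the failing connect; the
message text is shortened from the original function:

```perl
use strict;
use warnings;

# print returns 1 on success, and that return value becomes die's message.
eval { die print "Can't open database\n" };
my $wrong_msg = $@;

# Passing the string to die directly puts the text in the exception.
eval { die "Can't open database\n" };
my $right_msg = $@;

print STDERR "die print(...) raised: $wrong_msg";
print STDERR "die '...'      raised: $right_msg";
```

The first message starts with "1 at", which is the spurious 1 that
ended up in the error file.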

-- 
_________________________________________
Bob Smith -- bsmith@sudleydeplacespam.com

To reply to me directly, delete "despam".


------------------------------

Date: Wed, 22 Aug 2007 17:55:50 +0200
From: Todor Vachkov <vachkov@math.tu-berlin.de>
Subject: Re: UTF-8 problem
Message-Id: <5j34k5F3sc7vgU1@mid.dfncis.de>

Thanks for your replies!

The XML file is really huge - it has 666,025 lines and is the result of an export from a piece of software.

It contains:
        - the meta description of the software itself (I am pretty sure that it conforms to UTF-8)
        - form inputs made by users, who fill in information about several
          databases. The goal is to have a distributed search engine. (Again, I assume that the software
          also saves the inputs in UTF-8.)
        - Perl scripts for each database, written by various programmers. The scripts are
          the interfaces between the databases and the software (the UTF-8 encoding of the scripts is not guaranteed)
All of this is contained in the huge XML file.

Parsing the file with XML::LibXML gives:

        >Entity: line 315442: parser error : Input is not proper UTF-8, indicate                                
        >encoding ! 
        >Bytes: 0xE2 0x26 0x6C 0x74

I've figured out that these are the characters:

* U+00E2 LATIN SMALL LETTER A WITH CIRCUMFLEX
  â (Â)

* U+0026 AMPERSAND
  &

* U+006C LATIN SMALL LETTER L
  l (L)

* U+0074 LATIN SMALL LETTER T
  t (T)

Line 315442 looks like this:
        ><line>&lt;refpt id=&quot;bafn1&quot;/&gt;&lt;lk refid=&quot;afn1&quot;&gt;&lt;sup&gt;â&lt;/sup&gt;&lt;/lk&gt;</line>
                                                                                              ^

The element <line></line> contains a single line from a Perl script, as mentioned above. The byte 0xE2 is the point
where the parser stopped, at line 315442; it got almost halfway through the file.

It seems that the Perl scripts inside the file are my problem. I am wondering why the parser treats this single character
as a non-UTF-8 code point. Could I somehow tell the parser to ignore it?

Thanks for your help!

Greetings, Todor




------------------------------

Date: Wed, 22 Aug 2007 21:08:44 +0200
From: Martijn Lievaart <m@rtij.nl.invlalid>
Subject: Re: UTF-8 problem
Message-Id: <pan.2007.08.22.19.08.44@rtij.nl.invlalid>

On Wed, 22 Aug 2007 17:55:50 +0200, Todor Vachkov wrote:

> Parsing the file with XML::LibXML gives:
> 
>         >Entity: line 315442: parser error : Input is not proper UTF-8,
>         >indicate encoding !
>         >Bytes: 0xE2 0x26 0x6C 0x74
> 
> I've figured out that this are the characters :
> 
> * U+00E2 LATIN SMALL LETTER A WITH CIRCUMFLEX
>   â (Â)

U+00E2 is the Unicode code point. In UTF-8 encoding it would be a
two-byte sequence (0xC3 0xA2), so your input is not proper UTF-8.

HTH,
M4


------------------------------

Date: Wed, 22 Aug 2007 21:52:16 +0200
From: Todor Vachkov <vachkov@math.tu-berlin.de>
Subject: Re: UTF-8 problem
Message-Id: <5j3ifgF3sbgj9U1@mid.dfncis.de>

Martijn Lievaart wrote:

>> Parsing the file with XML::LibXML gives:
>> 
>>         >Entity: line 315442: parser error : Input is not proper UTF-8,
>>         >indicate encoding !
>>         >Bytes: 0xE2 0x26 0x6C 0x74
>> 
>> I've figured out that this are the characters :
>> 
>> * U+00E2 LATIN SMALL LETTER A WITH CIRCUMFLEX
>>   â (Â)
> 
> U+00E2 is Unicode. In utf-8 encoding this would be a two character
> sequence. So your input is not proper utf-8.

Thanks for your posting!

The parser says: 
        >Bytes: 0xE2 0x26 0x6C 0x74
So 0xE2 is meant to be the problematic character.

U+00E2 was not in the error message, I've just pasted the output of my check on linux with:
        user@timemashine:~$ unicode 0xe2
        U+00E2 LATIN SMALL LETTER A WITH CIRCUMFLEX
        UTF-8: c3 a2  UTF-16BE: 00e2  Decimal: &#226;
        â (Â)
        Uppercase: U+00C2
        Category: Ll (Letter, Lowercase)
        Bidi: L (Left-to-Right)
        Decomposition: 0061 0302

Greetings Todor


------------------------------

Date: Wed, 22 Aug 2007 22:32:01 +0200
From: Martijn Lievaart <m@rtij.nl.invlalid>
Subject: Re: UTF-8 problem
Message-Id: <pan.2007.08.22.20.31.58@rtij.nl.invlalid>

On Wed, 22 Aug 2007 21:52:16 +0200, Todor Vachkov wrote:

> Martijn Lievaart wrote:
> 
>>> Parsing the file with XML::LibXML gives:
>>> 
>>>         >Entity: line 315442: parser error : Input is not proper
>>>         >UTF-8, indicate encoding !
>>>         >Bytes: 0xE2 0x26 0x6C 0x74
>>> 
>>> I've figured out that this are the characters :
>>> 
>>> * U+00E2 LATIN SMALL LETTER A WITH CIRCUMFLEX
>>>   â (Â)
>> 
>> U+00E2 is Unicode. In utf-8 encoding this would be a two character
>> sequence. So your input is not proper utf-8.
> 
> Thanks for your posting!
> 
> The parser says:
>         >Bytes: 0xE2 0x26 0x6C 0x74
> So 0xE2 is meant to be the problematic character.
> 
> U+00E2 was not in the error message, I've just pasted the output of my
> check on linux with:
>         user@timemashine:~$ unicode 0xe2
>         U+00E2 LATIN SMALL LETTER A WITH CIRCUMFLEX UTF-8: c3 a2 
>         UTF-16BE: 00e2  Decimal: &#226; â (Â)
>         Uppercase: U+00C2
>         Category: Ll (Letter, Lowercase)
>         Bidi: L (Left-to-Right)
>         Decomposition: 0061 0302

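But 0xE2 does seem to be the problematic byte. On its own it is not valid
UTF-8: 0xE2 starts a three-byte sequence, and the 0x26 that follows is not
a continuation byte. Your input file is most probably encoded in Latin-1
(ISO-8859-1) or ISO-8859-15, not UTF-8.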
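
A small sketch of how this could be checked with the core Encode module,
using the four bytes from the parser error.  Treating the data as
Latin-1 is only the guess made above, not something the error message
proves:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The four bytes from the parser error: 0xE2 followed by "&lt".
my $bytes = "\xE2\x26\x6C\x74";

# Strict UTF-8 decoding dies on the lone 0xE2 lead byte.  decode() may
# modify its input when a CHECK argument is given, so work on a copy.
my $copy    = $bytes;
my $is_utf8 = eval { decode('UTF-8', $copy, Encode::FB_CROAK()); 1 } ? 1 : 0;

# Decoding as Latin-1 always succeeds: byte 0xE2 maps to U+00E2.
my $text = decode('ISO-8859-1', $bytes);

# Re-encoded as UTF-8, U+00E2 becomes the two bytes c3 a2.
my $hex = unpack('H*', encode('UTF-8', $text));

print "valid UTF-8? $is_utf8\n";     # 0
print "re-encoded bytes: $hex\n";    # c3a2266c74
```

Converting the whole file this way before parsing, or declaring the real
encoding in the XML declaration, would be alternatives to asking the
parser to ignore the byte.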

M4


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 784
**************************************
