[31495] in Perl-Users-Digest


Perl-Users Digest, Issue: 2754 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Jan 6 06:09:43 2010

Date: Wed, 6 Jan 2010 03:09:10 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Wed, 6 Jan 2010     Volume: 11 Number: 2754

Today's topics:
        correct way to call a perl subroutine <ptrajkumar@gmail.com>
    Re: correct way to call a perl subroutine <ben@morrow.me.uk>
    Re: correct way to call a perl subroutine <ptrajkumar@gmail.com>
    Re: correct way to call a perl subroutine <uri@StemSystems.com>
    Re: Determine physical location of IP <justin.1001@purestblue.com>
    Re: Determine physical location of IP <ben@morrow.me.uk>
    Re: Determine physical location of IP <sysadmin@example.com>
    Re: I need to make some cash here <john1949@yahoo.com>
    Re: I need to make some cash here <uri@StemSystems.com>
    Re: passing argument to a subroutine (aka ? the Platypus)
        Regex, spaces in pattern stored in variable. <justin.0911@purestblue.com>
    Re: Regex, spaces in pattern stored in variable. <peter@makholm.net>
    Re: significant figures <OJZGSRPBZVCX@spammotel.com>
        trouble processing non-English text <umass.vizlab@gmail.com>
    Re: trouble processing non-English text <rvtol+usenet@xs4all.nl>
    Re: trouble processing non-English text <ben@morrow.me.uk>
    Re: trouble processing non-English text <umass.vizlab@gmail.com>
    Re: unicode newbie, can you help? <m@rtij.nl.invlalid>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Tue, 5 Jan 2010 17:02:43 -0800 (PST)
From: Parapura  Rajkumar <ptrajkumar@gmail.com>
Subject: correct way to call a perl subroutine
Message-Id: <dbbc9885-60b3-4ef1-929f-ac83f5582aae@x15g2000vbr.googlegroups.com>

hey all

   I am attaching a small script to illustrate my problem.  The
attached script executes correctly.

But if I make Call 2 same as Call1 ie

do

WriteLine  "Test2" ; instead of WriteLine( "Test2" );

I get an error

String found where operator expected at /Users/foobar/bin/stest.pl
line 26, near "WriteLine  "Test2""
	(Do you need to predeclare WriteLine?)
syntax error at /Users/foobar/bin/stest.pl line 26, near "WriteLine
"Test2""
Execution of /Users/foobar/bin/stest.pl aborted due to compilation
errors.

Is this by design? Is there a workaround?

Thanks in advance
Raj


-----------------------------------------------------

use strict;
use warnings;

package LogHelper;
use base 'Exporter';
our @EXPORT = ('WriteLine');

sub WriteLine
{
    print  @_ , "\n";
}

package main;
import LogHelper;

sub WriteLine2
{
    print  @_ , "\n";
}

#Call 1
WriteLine2  "Test1";

#Call 2
WriteLine(  "Test2" );


------------------------------

Date: Wed, 6 Jan 2010 01:12:08 +0000
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: correct way to call a perl subroutine
Message-Id: <8k4d17-23q.ln1@osiris.mauzo.dyndns.org>


Quoth Parapura  Rajkumar <ptrajkumar@gmail.com>:
> hey all
> 
>    I am attaching a small script to illustrate my problem.  The
> attached script executes correctly.
> 
> But if I make Call 2 same as Call1 ie
> 
> do
> 
> WriteLine  "Test2" ; instead of WriteLine( "Test2" );
> 
> I get an error
> 
> String found where operator expected at /Users/foobar/bin/stest.pl
> line 26, near "WriteLine  "Test2""
> 	(Do you need to predeclare WriteLine?)
> syntax error at /Users/foobar/bin/stest.pl line 26, near "WriteLine
> "Test2""
> Execution of /Users/foobar/bin/stest.pl aborted due to compilation
> errors.
> 
> Is this by design? Is there a workaround?
> 
> use strict;
> use warnings;
> 
> package LogHelper;
> use base 'Exporter';
> our @EXPORT = ('WriteLine');
> 
> sub WriteLine
> {
>     print  @_ , "\n";
> }
> 
> package main;
> import LogHelper;
> 
> sub WriteLine2
> {
>     print  @_ , "\n";
> }
> 
> #Call 1
> WriteLine2  "Test1";
> 
> #Call 2
> WriteLine(  "Test2" );

This is by design, more-or-less. Calling a sub without parens requires
that the sub be defined, or at least declared, at the time the call is
compiled. (This is because of the many different things such a bareword
can mean in Perl.) When the call to WriteLine is compiled, the sub
&LogHelper::WriteLine exists, but the sub &main::WriteLine doesn't,
since the import hasn't run yet. This means that you have to write the
call with parens.

The 'workaround' (or rather, the correct way to do things) is to put
LogHelper in its own .pm file and pull it in with 'use'. Since this
calls import at compile time, &main::WriteLine will be visible at the
point where the call is compiled. If you have some particular reason for
doing the import at runtime, you just have to put up with needing the
parens on the call.

Ben
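
A minimal single-file sketch of the compile-time import described above,
for cases where both packages have to stay in one script. The two BEGIN
blocks are an assumption of this sketch and are not part of the original
script: they make the @EXPORT assignment and the import call happen while
the file is still being compiled, which is roughly what 'use' would do
for a separate .pm file.

```perl
use strict;
use warnings;

package LogHelper;
use base 'Exporter';

# Assumption: @EXPORT is filled in a BEGIN block so that it is already
# populated when the import below runs at compile time.
our @EXPORT;
BEGIN { @EXPORT = ('WriteLine') }

sub WriteLine
{
    print @_, "\n";
}

package main;

# Run the import at compile time instead of at runtime.
BEGIN { LogHelper->import }

# main::WriteLine is now known to the parser, so the parens are optional.
WriteLine "Test2";
```

With a plain runtime 'import LogHelper;' the last line would fail to
compile, as in the original report.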



------------------------------

Date: Tue, 5 Jan 2010 21:56:37 -0800 (PST)
From: Parapura  Rajkumar <ptrajkumar@gmail.com>
Subject: Re: correct way to call a perl subroutine
Message-Id: <52f21f46-3f8c-4ae0-96db-1c271cf2daec@q41g2000vba.googlegroups.com>

On Jan 5, 8:12 pm, Ben Morrow <b...@morrow.me.uk> wrote:
> Quoth Parapura  Rajkumar <ptrajku...@gmail.com>:
>
> > hey all
>
> >    I am attaching a small script to illustrate my problem.  The
> > attached script executes correctly.
>
> > But if I make Call 2 same as Call1 ie
>
> > do
>
> > WriteLine  "Test2" ; instead of WriteLine( "Test2" );
>
> > I get an error
>
> > String found where operator expected at /Users/foobar/bin/stest.pl
> > line 26, near "WriteLine  "Test2""
> >     (Do you need to predeclare WriteLine?)
> > syntax error at /Users/foobar/bin/stest.pl line 26, near "WriteLine
> > "Test2""
> > Execution of /Users/foobar/bin/stest.pl aborted due to compilation
> > errors.
>
> > Is this by design, is there an workaround ?
>
> > use strict;
> > use warnings;
>
> > package LogHelper;
> > use base 'Exporter';
> > our @EXPORT = ('WriteLine');
>
> > sub WriteLine
> > {
> >     print  @_ , "\n";
> > }
>
> > package main;
> > import LogHelper;
>
> > sub WriteLine2
> > {
> >     print  @_ , "\n";
> > }
>
> > #Call 1
> > WriteLine2  "Test1";
>
> > #Call 2
> > WriteLine(  "Test2" );
>
> This is by design, more-or-less. Calling a sub without parens requires
> that the sub be defined, or at least declared, at the time the call is
> compiled. (This is because of the many different things such a bareword
> can mean in Perl.) When the call to WriteLine is compiled, the sub
> &LogHelper::WriteLine exists, but the sub &main::WriteLine doesn't,
> since the import hasn't run yet. This means that you have to write the
> call with parens.
>
> The 'workaround' (or rather, the correct way to do things) is to put
> LogHelper in its own .pm file and pull it in with 'use'. Since this
> calls import at compile time, &main::WriteLine will be visible at the
> point where the call is compiled. If you have some particular reason for
> doing the import at runtime, you just have to put up with needing the
> parens on the call.
>
> Ben

Thanks for the explanation. It does seem to work with 'use'. But
unfortunately using 'use' doesn't seem to be legal when you have all
your packages defined in the same file.

  A bit disappointed with perl flexibility here :(

Raj


------------------------------

Date: Wed, 06 Jan 2010 03:06:33 -0500
From: "Uri Guttman" <uri@StemSystems.com>
Subject: Re: correct way to call a perl subroutine
Message-Id: <87637fk67q.fsf@quad.sysarch.com>

>>>>> "PR" == Parapura Rajkumar <ptrajkumar@gmail.com> writes:

  PR> Thanks for the explanation. It does seem to work with 'use'. But
  PR> unfortunately using 'use' doesn't seem to be legal when you have all
  PR> your packages defined in the same file.

then don't do that. use does 2 main things, it loads your module (as
does require) and calls the import method of the class name passed to
use. so how would use know to call the import of class names inside a
module with multiple package names? 

  PR>   A bit disappointed with perl flexibility here :(

a bit better understanding of how use works would help you. rarely do
you want multiple package names in one file. and if you do, it should be
OO style and not procedural with exported names. so learn a better
coding style and improve your perl. it is easier than you think and it
is much harder to change perl in this area.

and if you really want this, then you can have the primary package (the
one with the same name as the file) have its own import method and it
can explicitly call the other packages import methods. it would also
have to export its own subs and such. you can have Exporter.pm do the
work for you or use other modules that support this. but as i said, that
is a poor design and will cause more pain down the road. just put the
packages into their own files or switch to an OO style.

uri
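
A rough sketch of the forwarding import described above, under stated
assumptions: 'Primary' is a hypothetical package name, everything is kept
in one file so the sketch runs on its own, and the BEGIN blocks stand in
for what 'use Primary;' would do if Primary lived in Primary.pm. It relies
on Exporter's export_to_level method to export LogHelper's symbols into
the caller of Primary's import.

```perl
use strict;
use warnings;

package LogHelper;
use base 'Exporter';
BEGIN { our @EXPORT = ('WriteLine') }

sub WriteLine
{
    print @_, "\n";
}

# Hypothetical primary package; in a real module this would be the
# package named after the .pm file.
package Primary;

sub import
{
    my ($class) = @_;

    # Level 1 means "the package that called Primary->import".
    # With no symbol list, LogHelper's default @EXPORT is exported.
    LogHelper->export_to_level(1, $class);
}

package main;

# Stands in for 'use Primary;', which would look for Primary.pm on disk.
BEGIN { Primary->import }

WriteLine "Test2";
```

As the post says, this adds indirection that separate files or an OO
interface would avoid.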

-- 
Uri Guttman  ------  uri@stemsystems.com  --------  http://www.sysarch.com --
-----  Perl Code Review , Architecture, Development, Training, Support ------
---------  Gourmet Hot Cocoa Mix  ----  http://bestfriendscocoa.com ---------


------------------------------

Date: Tue, 5 Jan 2010 23:28:43 +0000
From: Justin C <justin.1001@purestblue.com>
Subject: Re: Determine physical location of IP
Message-Id: <biuc17-6q8.ln1@purestblue.com>

In article <IqWdnQP-M6YzKN7WnZ2dnUVZ_ridnZ2d@insightbb.com>, monkeys paw wrote:
> I know some www sites perform this service, i'm interested in
> the underlying network code that could accomplish this. A working
> example would be fab, a point in the right direction much appreciated.

It's far from reliable. I'm on the Sussex coast (UK) and those things
put me either in Windsor, or in High Wycombe. There are about 30 million
people between me and those locations.

   Justin.

-- 
Justin C, by the sea.


------------------------------

Date: Wed, 6 Jan 2010 01:04:59 +0000
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Determine physical location of IP
Message-Id: <r64d17-23q.ln1@osiris.mauzo.dyndns.org>


Quoth Justin C <justin.1001@purestblue.com>:
> In article <IqWdnQP-M6YzKN7WnZ2dnUVZ_ridnZ2d@insightbb.com>, monkeys paw wrote:
> > I know some www sites perform this service, i'm interested in
> > the underlying network code that could accomplish this. A working
> > example would be fab, a point in the right direction much appreciated.
> 
> It's far from reliable. I'm on the Sussex coast (UK) and those things
> put me either in Windsor, or in High Wycombe. There are about 30 million
> people between me and those locations.

Generally speaking they have only the location of your ISP, at best,
since they are the people who actually own that IP. ISPs selling
customer addresses to random websites would almost certainly fall foul
of the Data Protection Act.

Ben



------------------------------

Date: Wed, 06 Jan 2010 02:18:24 -0800
From: Wanna-Be Sys Admin <sysadmin@example.com>
Subject: Re: Determine physical location of IP
Message-Id: <QrZ0n.1855$YP1.713@newsfe15.iad>

monkeys paw wrote:

> Has anyone coded anything to determine the physical location
> of an IP address? Something that takes an IP and tells you where
> geographically the server exists?
> 
> e.g.
> 
> my $location = ip2location('216.14.104.174');
> 
> print $location;
> 
> RESULT:
> New York, New York
> 
> I know some www sites perform this service, i'm interested in
> the underlying network code that could accomplish this. A working
> example would be fab, a point in the right direction much appreciated.

Geo::IP (and similar) on CPAN, but no list is going to be accurate (paid
for or not), so be aware of that.  For general things, you should be
okay, assuming it doesn't have to be perfect/accurate every time.
-- 
Not really a wanna-be, but I don't know everything.


------------------------------

Date: Wed, 6 Jan 2010 09:23:22 -0000
From: "John" <john1949@yahoo.com>
Subject: Re: I need to make some cash here
Message-Id: <hi1kqa$p77$1@news.albasani.net>


<sln@netherlands.com> wrote in message 
news:spg0k5p8e9hrqutb3j964ilusosjf4s7ji@4ax.com...
> My unemployment is running out here in California.
> I can do anything, give me a job.
>
> -sln

Just to redress the balance. I have seen a number of postings by 
sln@netherlands.com that I found helpful.

Regards
John




------------------------------

Date: Wed, 06 Jan 2010 04:49:23 -0500
From: "Uri Guttman" <uri@StemSystems.com>
Subject: Re: I need to make some cash here
Message-Id: <871vi3k1gc.fsf@quad.sysarch.com>

>>>>> "J" == John  <john1949@yahoo.com> writes:

  J> <sln@netherlands.com> wrote in message 
  J> news:spg0k5p8e9hrqutb3j964ilusosjf4s7ji@4ax.com...
  >> My unemployment is running out here in California.
  >> I can do anything, give me a job.
  >> 
  >> -sln

  J> Just to redress the balance. I have seen a number of postings by 
  J> sln@netherlands.com that I found helpful.

even a broken clock is right twice a day. whatever meager perl he spouts
here is overridden by his bad perl, his lack of really knowing his own
perl skill level, his refusal to take any feedback and his general
asshole attitude. you would do better to ignore him and let others help
you. this happened with moronzilla, it would always snag a newbie or
two who fell for its lures and trivial perl. don't do the same with this
one.

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  --------  http://www.sysarch.com --
-----  Perl Code Review , Architecture, Development, Training, Support ------
---------  Gourmet Hot Cocoa Mix  ----  http://bestfriendscocoa.com ---------


------------------------------

Date: 06 Jan 2010 05:22:38 GMT
From: "David Formosa (aka ? the Platypus)" <dformosa@usyd.edu.au>
Subject: Re: passing argument to a subroutine
Message-Id: <00cda2fe$0$15616$c3e8da3@news.astraweb.com>

On Thu, 03 Dec 2009 19:11:45 +0100, Jochen Lehmeier <OJZGSRPBZVCX@spammotel.com> wrote:
[...]
> So, if you're pedantic, you might ask why all those bytes have to be  
> shifted around just to get at the first argument.

The bytes are not shifted around, the pointer to the start of the array
is moved forward one slot.



------------------------------

Date: Wed, 06 Jan 2010 10:47:57 -0000
From: Justin C <justin.0911@purestblue.com>
Subject: Regex, spaces in pattern stored in variable.
Message-Id: <eed.4b446a5d.b2d45@zem>

I have a list of strings, some contain spaces. The strings are used as
patterns for a regex match, they don't seem to be working! I tried
substituting the space with '\s' but I got warnings, and still no match.

I'm sure there's a way to do this, but Google isn't providing any
answers (more likely I don't know how to formulate a useful search).

Anyway, here's what I have:

#!/usr/bin/perl

use warnings;
use strict;

my @list = (
    "Fred Flintstone",
    "Barney Rubble",
);

while (<DATA>) {
    my $string = chomp $_;
    foreach (@list) {
	if ( $string =~ /($_)/ ) {
	    print "Matched ", $1, "\n";
	}
    }
}

__DATA__
A man called Fred Flintstone lives in a cave.
Fred's neighbour is called Barney Rubble.
Fred is married to Wilma Flintstone and Barney Rubble is married to Betty.

	Justin.

-- 
Justin C, by the sea.


------------------------------

Date: Wed, 06 Jan 2010 12:01:35 +0100
From: Peter Makholm <peter@makholm.net>
Subject: Re: Regex, spaces in pattern stored in variable.
Message-Id: <87aawrh4z4.fsf@vps1.hacking.dk>

Justin C <justin.0911@purestblue.com> writes:

> #!/usr/bin/perl
>
> use warnings;
> use strict;
>
> my @list = (
>     "Fred Flintstone",
>     "Barney Rubble",
> );
>
> while (<DATA>) {
>     my $string = chomp $_;

'perldoc -f chomp' says:

  [...] It returns the total number of characters removed from all its
  arguments. [...]

So $string would have the value 1. Using the perl debugger it would
have been very easy to check whether your assumptions about the content
of $string and $_ were true at the conditional.
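
A minimal sketch of the loop with the chomp fix applied. The helper name
matches_in and the in-line test data are assumptions made so the example
is self-contained, and the \Q...\E quoting is an addition that is not in
the original post: it keeps any regex metacharacters in the list entries
literal. A plain space in an interpolated pattern matches a space as-is.

```perl
use strict;
use warnings;

my @list = (
    "Fred Flintstone",
    "Barney Rubble",
);

# Hypothetical helper: returns the entries of @list found in one line.
sub matches_in
{
    my ($line) = @_;

    # chomp works in place and returns a count, so copy first and
    # chomp the copy rather than assigning chomp's return value.
    chomp(my $string = $line);

    return grep { $string =~ /\Q$_\E/ } @list;
}

my @lines = (
    "A man called Fred Flintstone lives in a cave.\n",
    "Fred's neighbour is called Barney Rubble.\n",
);

for my $line (@lines) {
    print "Matched $_\n" for matches_in($line);
}
```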
 

>     foreach (@list) {
> 	if ( $string =~ /($_)/ ) {
> 	    print "Matched ", $1, "\n";
> 	}
>     }
> }
>
> __DATA__
> A man called Fred Flintstone lives in a cave.
> Fred's neighbour is called Barney Rubble.
> Fred is married to Wilma Flintstone and Barney Rubble is married to Betty.
>
> 	Justin.


------------------------------

Date: Wed, 06 Jan 2010 11:36:49 +0100
From: "Jochen Lehmeier" <OJZGSRPBZVCX@spammotel.com>
Subject: Re: significant figures
Message-Id: <op.u53w3nlfmk9oye@frodo>

On Tue, 05 Jan 2010 22:38:41 +0100, dan <spam.meplease@ntlworld.com> wrote:

> Whilst trying to create something that would parse a number into one with
> an appropriate number of significant figures, I accidentally wrote this:

> sub sigfig {
>   my ($sigfigs, $number) = @_;

    $number = sprintf("%d",$number); # just in case...
    return $number if $sigfigs > length($number);

>   my $divisor = 10**(length(int $number) - $sigfigs);
>   $number /= $divisor;
>   $number = sprintf "%1.0f", $number;

    # instead:
    $number = sprintf("%d",$number);

>   $number *= $divisor;
>
>   return $number
> }

> inadequacies,foibles and associated notwithstandings of the above code,

Well, if it works, it works.

A bit shorter:

    $number = sprintf("%d",$number); # just in case...
    return substr($number,0,$sigfigs) . ("0" x (length($number) - $sigfigs));
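
Put together as a complete sub, the shorter version might look like the
sketch below. It assumes a non-negative number, since a minus sign would
be counted by length(), and it truncates rather than rounds, which
differs from the "%1.0f" step in the quoted code.

```perl
use strict;
use warnings;

# Truncates (does not round) a non-negative number to $sigfigs figures.
sub sigfig
{
    my ($sigfigs, $number) = @_;

    $number = sprintf("%d", $number);               # force an integer string
    return $number if $sigfigs > length($number);   # nothing to trim

    # Keep the leading digits and pad the rest with zeros.
    return substr($number, 0, $sigfigs)
         . ("0" x (length($number) - $sigfigs));
}

print sigfig(2, 12345), "\n";    # prints 12000
```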
 


------------------------------

Date: Tue, 5 Jan 2010 15:32:51 -0800 (PST)
From: DavidK <umass.vizlab@gmail.com>
Subject: trouble processing non-English text
Message-Id: <14499b76-e21f-40dc-88fa-39db9cd59655@j24g2000yqa.googlegroups.com>

Hello,

I am trying to process some Greek text using Perl.  Strangely, I can
print out the text properly but when I try to assign the text to a
variable or do some processing, it fails.

The data file is:

1 και
2 να

My program is:

#!/usr/bin/perl -w
use strict;
use encoding "greek";

my %symbols = ();

open(FILE, "$file");

while (my $line = <FILE>) {
    chomp($line);

    my @fields = split(/\s+/, $line);

    my $num_fields = @fields;

    if ($num_fields == 2) {

	my $freq = shift(@fields);
	my $word = shift(@fields);

	print "$word\n";

	my @letters = split(//, $word);

	foreach my $letter (@letters) {
	    $symbols{$letter} = 1;

	    print "$letter -> $letter_test\n";
	}

	print "\n";
    }
}

The output is:

και
� ->
� ->
� ->
� ->
� ->
� ->

να
� ->
� ->
� ->
� ->

I've done some reading on the web and I still can't figure out what's
happening.

I'd appreciate any help.  Thanks!


------------------------------

Date: Wed, 06 Jan 2010 01:01:01 +0100
From: "Dr.Ruud" <rvtol+usenet@xs4all.nl>
Subject: Re: trouble processing non-English text
Message-Id: <4b43d2be$0$22916$e4fe514c@news.xs4all.nl>

DavidK wrote:

> I am trying to process some Greek text using Perl.  Strangely, I can
> print out the text properly but when I try to assign the text to a
> variable or do some processing, it fails.
> [...]
> use encoding "greek";
> [...]
> The output is:
> 
> και
> � ->
> [...]

In what sense does it fail?

What does `echo $LANG` show you?

--
Ruud


------------------------------

Date: Wed, 6 Jan 2010 00:17:02 +0000
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: trouble processing non-English text
Message-Id: <uc1d17-dup.ln1@osiris.mauzo.dyndns.org>


Quoth DavidK <umass.vizlab@gmail.com>:
> 
> I am trying to process some Greek text using Perl.  Strangely, I can
> print out the text properly but when I try to assign the text to a
> variable or do some processing, it fails.
> 
> The data file is:
> 
> 1 και
> 2 να
> 
> My program is:
> 
> #!/usr/bin/perl -w

'use warnings' is preferred to -w nowadays.

> use strict;
> use encoding "greek";

Don't do that. In principle 'encoding' specifies the encoding of your
*source* file, and also pushes encoding layers onto STD{IN,OUT}; it has
no effect on other filehandles. In practice it has never worked properly
and should be avoided.

I don't know how your data file is encoded, but AFAIK "greek" is not a
valid encoding name. You might have meant "iso-8859-7", which I believe
is the usual pre-Unicode encoding for Greek, or you might have meant
"UTF-8" (or you might have meant something else entirely). You will need
to find out which.

> my %symbols = ();
> 
> open(FILE, "$file");

Always check the return value of open.
Use 3-arg open instead of magic 2-arg open, unless you've got a good
reason not to.
Don't quote variables when you don't need to.
Use lexical filehandles instead of global barewords.
In your case, you want to push an encoding PerlIO layer when you open
the file.

    open(my $FILE, "<:encoding(iso-8859-7)", $file)
        or die "can't open '$file': $!";

You might also consider using the 'autodie' module, which will do the
'or die' check for you.

> 
> while (my $line = <FILE>) {
>     chomp($line);
> 
>     my @fields = split(/\s+/, $line);
> 
>     my $num_fields = @fields;
> 
>     if ($num_fields == 2) {

There's no need for this. '==' gives scalar context to both sides, so

    if (@fields == 2) {

will suffice.

> 	my $freq = shift(@fields);
> 	my $word = shift(@fields);
> 
> 	print "$word\n";
> 
> 	my @letters = split(//, $word);
> 
> 	foreach my $letter (@letters) {
> 	    $symbols{$letter} = 1;
> 
> 	    print "$letter -> $letter_test\n";

Where does $letter_test come from? Did you actually run the code you
posted?

> 	}
> 
> 	print "\n";
>     }
> }
> 
> The output is:
> 
> και
> � ->
> � ->
> � ->
> � ->
> � ->
> � ->

This suggests your file is not in ISO8859-7, but in some multi-byte
encoding like UTF-8 or UTF-16. If you're on a Unix machine it's probably
UTF-8.

Ben
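
A small self-contained sketch of the advice above, assuming the data
turns out to be UTF-8. To avoid depending on a real file, it reads from
an in-memory filehandle holding the raw UTF-8 bytes of the two sample
lines; that stand-in, and the encoding layer on STDOUT, are assumptions
of this sketch rather than part of the original program.

```perl
use strict;
use warnings;

# Assumption: raw UTF-8 bytes for "1 και\n2 να\n" stand in for the
# contents of the real data file.
my $bytes = "1 \xCE\xBA\xCE\xB1\xCE\xB9\n2 \xCE\xBD\xCE\xB1\n";

# 3-arg open, lexical filehandle, decoding layer, checked return value.
open(my $FILE, "<:encoding(UTF-8)", \$bytes)
    or die "can't open in-memory data: $!";

# Encode characters back to UTF-8 on output.
binmode(STDOUT, ":encoding(UTF-8)");

my %symbols = ();

while (my $line = <$FILE>) {
    chomp($line);

    my @fields = split(/\s+/, $line);

    if (@fields == 2) {
        my ($freq, $word) = @fields;

        print "$word\n";

        # After decoding, split(//) yields characters, not bytes.
        foreach my $letter (split(//, $word)) {
            $symbols{$letter} = 1;
        }
    }
}

close($FILE);

print scalar(keys %symbols), " distinct letters\n";
```

Without the decoding layer each Greek letter arrives as two bytes, which
is why the original loop printed twice as many replacement characters as
there were letters.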



------------------------------

Date: Tue, 5 Jan 2010 19:24:22 -0800 (PST)
From: DavidK <umass.vizlab@gmail.com>
Subject: Re: trouble processing non-English text
Message-Id: <638b1341-40dd-4880-8028-a1295a35424a@s3g2000yqs.googlegroups.com>

Thanks for the responses!

My $LANG variable is set to en_US.UTF-8.

The file I thought was in ISO8859-7 is actually UTF-8.  I should have
been opening the file with

    open(my $FILE, "<:encoding(UTF-8)", $file)
        or die "can't open '$file': $!";

I also had to format the output with

binmode STDOUT, ":utf8";

to view it properly.

Thanks again.  It seems to be working now.  Thank you, Ben, for the Perl
style tips.

I'm sorry about the confusing source code.  I tried to simplify it and
I removed some lines by mistake.


On Jan 5, 7:17 pm, Ben Morrow <b...@morrow.me.uk> wrote:
> Quoth DavidK <umass.viz...@gmail.com>:
>
>
>
> > I am trying to process some Greek text using Perl.  Strangely, I can
> > print out the text properly but when I try to assign the text to a
> > variable or do some processing, it fails.
>
> > The data file is:
>
> > 1 και
> > 2 να
>
> > My program is:
>
> > #!/usr/bin/perl -w
>
> 'use warnings' is preferred to -w nowadays.
>
> > use strict;
> > use encoding "greek";
>
> Don't do that. In principle 'encoding' specifies the encoding of your
> *source* file, and also pushes encoding layers onto STD{IN,OUT}; it has
> no effect on other filehandles. In practice it has never worked properly
> and should be avoided.
>
> I don't know how your data file is encoded, but AFAIK "greek" is not a
> valid encoding name. You might have meant "iso-8859-7", which I believe
> is the usual pre-Unicode encoding for Greek, or you might have meant
> "UTF-8" (or you might have meant something else entirely). You will need
> to find out which.
>
> > my %symbols = ();
>
> > open(FILE, "$file");
>
> Always check the return value of open.
> Use 3-arg open instead of magic 2-arg open, unless you've got a good
> reason not to.
> Don't quote variables when you don't need to.
> Use lexical filehandles instead of global barewords.
> In your case, you want to push an encoding PerlIO layer when you open
> the file.
>
>     open(my $FILE, "<:encoding(iso-8859-7)", $file)
>         or die "can't open '$file': $!";
>
> You might also consider using the 'autodie' module, which will do the
> 'or die' check for you.
>
>
>
> > while (my $line = <FILE>) {
> >     chomp($line);
>
> >     my @fields = split(/\s+/, $line);
>
> >     my $num_fields = @fields;
>
> >     if ($num_fields == 2) {
>
> There's no need for this. '==' gives scalar context to both sides, so
>
>     if (@fields == 2) {
>
> will suffice.
>
> >    my $freq = shift(@fields);
> >    my $word = shift(@fields);
>
> >    print "$word\n";
>
> >    my @letters = split(//, $word);
>
> >    foreach my $letter (@letters) {
> >        $symbols{$letter} = 1;
>
> >        print "$letter -> $letter_test\n";
>
> Where does $letter_test come from? Did you actually run the code you
> posted?
>
> >    }
>
> >    print "\n";
> >     }
> > }
>
> > The output is:
>
> > και
> > ->
> > ->
> > ->
> > ->
> > ->
> > ->
>
> This suggests your file is not in ISO8859-7, but in some multi-byte
> encoding like UTF-8 or UTF-16. If you're on a Unix machine it's probably
> UTF-8.
>
> Ben



------------------------------

Date: Wed, 6 Jan 2010 08:13:09 +0100
From: Martijn Lievaart <m@rtij.nl.invlalid>
Subject: Re: unicode newbie, can you help?
Message-Id: <5ppd17-f0a.ln1@news.rtij.nl>

On Tue, 05 Jan 2010 07:58:56 -0800, alexxx.magni@gmail.com wrote:

> wow, so many things I didnt know... there isnt any utility able to
> detect the coding of a file ?

Easy, as this file is html. Look at the encoding tag. If there is none, 
it's iso-latin-1 by default.

M4


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests. 

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 2754
***************************************
