[23991] in Perl-Users-Digest


Perl-Users Digest, Issue: 6192 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat Feb 28 09:06:46 2004

Date: Sat, 28 Feb 2004 06:05:06 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sat, 28 Feb 2004     Volume: 10 Number: 6192

Today's topics:
        [newbie] printing a hash <auto87829@hushmail.com>
    Re: [newbie] printing a hash <tassilo.parseval@rwth-aachen.de>
    Re: [newbie] printing a hash <beable+unsenet@beable.com.invalid>
    Re: [newbie] printing a hash <spamtrap@deepsea.force9.co.uk>
    Re: Comments requested: brief summary of Perl <bik.mido@tiscalinet.it>
        FIND: Parameterformat falsch (parameter format not corr (FMAS)
    Re: FIND: Parameterformat falsch (parameter format not  <Joe.Smith@inwap.com>
    Re: FIND: Parameterformat falsch (parameter format not  (Anno Siegel)
        Finding a string in an other string, then.. <joel@hotmail.ru>
    Re: Finding a string in an other string, then.. <beable+unsenet@beable.com.invalid>
    Re: function reference and -> <pkent77tea@yahoo.com.tea>
    Re: gd 1.41 install errors in red hat 9 <Joe.Smith@inwap.com>
    Re: hash of hashes with lists <gnari@simnet.is>
    Re: How to access filehandle through globref? <bik.mido@tiscalinet.it>
        HTML in utf8 and perl <niewiap@NOSPAM.widzew.net.INVALID>
    Re: HTML in utf8 and perl <flavell@ph.gla.ac.uk>
    Re: HTML in utf8 and perl <andy@andyh.co.uk>
    Re: Parameterformat falsch (parameter format not correc <gnari@simnet.is>
    Re: regex <gnari@simnet.is>
    Re: using sed from with a perl script <bik.mido@tiscalinet.it>
    Re: using sed from with a perl script <fifo@despammed.com>
    Re: XML best practices (was: Python as replacement for  <matthew.garrish@sympatico.ca>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Sat, 28 Feb 2004 20:42:15 +1100
From: "David" <auto87829@hushmail.com>
Subject: [newbie] printing a hash
Message-Id: <40406278$0$5871$afc38c87@news.optusnet.com.au>

I can't really visualise what's going on in this code.  We're simply reading
and splitting each line of a file and saving the result in a hash, right?

How can I print the contents of the hash?

open(FILE, 'orig.txt') or die "Can't open orig.txt: $!";

my @fields = qw/a b c d e f g h i j k/;
my $record = <FILE>;

while (<FILE>) {
    chomp $record;
    my %hash;
    @hash{@fields} = split(/\t/, $record);
}

print "$. records processed.\n";

close FILE;




------------------------------

Date: 28 Feb 2004 09:56:45 GMT
From: "Tassilo v. Parseval" <tassilo.parseval@rwth-aachen.de>
Subject: Re: [newbie] printing a hash
Message-Id: <c1pokt$ag1$1@nets3.rz.RWTH-Aachen.DE>

Also sprach David:

> I can't really visualise what's going on in this code.  We're simply reading
> and splitting each line of a file and saving the result in a hash, right?
> 
> How can I print the contents of the hash?
> 
> open(FILE, 'orig.txt') or die "Can't open orig.txt: $!";
> 
> my @fields = qw/a b c d e f g h i j k/;
> my $record = <FILE>;
> 
> while (<FILE>) {
>     chomp $record;
>     my %hash;
>     @hash{@fields} = split(/\t/, $record);
> }

This makes no sense. You read the first record from FILE into $record
and then loop over the rest of the file, but without ever assigning the
next record to $record. The above should read:

    # drop this:
    # my $record = <FILE>;

    while (<FILE>) {
	chomp;
	my %hash;
	@hash{ @fields } = split /\t/, $_;
	# do something with %hash now as 
	# it will be recreated each
	# iteration
    }
	
> print "$. records processed.\n";
> 
> close FILE;

Tassilo
-- 
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval


------------------------------

Date: Sat, 28 Feb 2004 10:12:24 GMT
From: Beable van Polasm <beable+unsenet@beable.com.invalid>
Subject: Re: [newbie] printing a hash
Message-Id: <dyad33345b.fsf@dingo.beable.com>

"David" <auto87829@hushmail.com> writes:

> I can't really visualise what's going on in this code.  We're simply
> reading and splitting each line of a file and saving the result in a
> hash, right?

Yep. The hash keys are the letters [a-k], and the hash values are the
fields from a line of the file. The fields in the file are separated by
tabs.

> How can I print the contents of the hash?
> 
> open(FILE, 'orig.txt') or die "Can't open orig.txt: $!";
> 
> my @fields = qw/a b c d e f g h i j k/;

These next two lines appear to be slightly wrong:
> my $record = <FILE>;

Delete the above line, and...

> 
> while (<FILE>) {

change the above line to:

while (my $record = <FILE>) {

That way, $record will be updated every time around the loop, instead
of only once before the loop.

>     chomp $record;
>     my %hash;
>     @hash{@fields} = split(/\t/, $record);

    # print the hash
    my @keys = keys %hash;
    foreach my $key (@keys) {
        print("key: $key => value: $hash{$key}\n");
    }
    print("-" x 72, "\n");

> }
> 
> print "$. records processed.\n";
> 
> close FILE;


-- 
   


------------------------------

Date: Sat, 28 Feb 2004 10:20:05 GMT
From: Iain <spamtrap@deepsea.force9.co.uk>
Subject: Re: [newbie] printing a hash
Message-Id: <Xns949D696D6C228coraldeepsea@212.159.13.1>

"David" <auto87829@hushmail.com> wrote in
news:40406278$0$5871$afc38c87@news.optusnet.com.au: 

> I can't really visualise what's going on in this code.  We're simply
> reading and splitting each line of a file and saving the result in a
> hash, right? 
> 
> How can I print the contents of the hash?

As you will no doubt learn very rapidly if you read this group, you 
ought to start every Perl program with the incantation:

use strict;
use warnings;

which allows Perl to help you by spitting out lots of useful information 
about possible mistakes in your program.

> open(FILE, 'orig.txt') or die "Can't open orig.txt: $!";
> 
> my @fields = qw/a b c d e f g h i j k/;
> my $record = <FILE>;

<> in scalar context returns a single line from a file, so this will 
read the first line of the contents of FILE into $record. It's not 
really necessary in this instance (see below).
 
> while (<FILE>) {

Whereas <> as the sole condition for a while loop is magic, and is 
shorthand for:

while ( defined( $_ = <FILE> ) ) {

So in this case I suspect you would be better off with the condition

while ( my $record = <FILE> ) {

and then you can eliminate the initial 'my $record = <FILE>;' line 
above.

>     chomp $record;

This will work the first time around the loop, but in your original 
program you're not updating the value of $record anywhere, making it a 
fairly worthless statement.

>     my %hash;

You are using my() within a block (the {..} construct of the while 
loop). This will lexically scope the variable to a *single iteration* of 
the loop. In other words, you are creating a new %hash variable every 
time you read another line from FILE. And losing it almost immediately. 
Do you really want a new hash for each line of the file, or are you 
trying to aggregate the data in some form?

>     @hash{@fields} = split(/\t/, $record);

You are trying to assign to a hash slice of exactly 11 keys. Is this 
what you want -- in other words, can you guarantee that each line of 
FILE contains 11 items each separated by a single tab character?
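
One way to check that assumption before assigning to the slice, as a
minimal sketch. The three-column field list and the sample lines are made
up for illustration; the -1 limit makes split keep trailing empty fields,
so a line that ends in tabs is still counted correctly.

```perl
use strict;
use warnings;

# Hypothetical three-column layout and sample lines (not from the post).
my @fields = qw/a b c/;
my @lines  = ("1\t2\t3", "4\t5", "7\t\t");

my @good;
for my $line (@lines) {
    # A limit of -1 keeps trailing empty fields instead of dropping them.
    my @values = split /\t/, $line, -1;
    if (@values != @fields) {
        warn "Skipping line with " . scalar(@values) . " fields: $line\n";
        next;
    }
    my %hash;
    @hash{@fields} = @values;
    push @good, \%hash;
}

print scalar(@good), " well-formed lines\n";
```

Here the second line is skipped because it has only two fields.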

> }
> 
> print "$. records processed.\n";

This works, but it's probably clearer to use an explicit line-counter 
variable. That will also have the advantage that it won't get reset if 
you close() the file.
 
> close FILE;
> 
> 

And I haven't yet touched upon how you might view the contents of your 
hash, but see the following. Note that it probably isn't doing what you 
expect or want (unless you really DO want the hash to be overwritten on 
each iteration), but it should give you a clearer idea of what's going 
on.



#!perl

use strict;
use warnings;

open(FILE, 'orig.txt') or die "Can't open orig.txt: $!";

my @fields = qw/a b c d e f g h i j k/;

my $lines = 0;
while (my $record =<FILE>) {
  chomp $record;
  my %hash;
  @hash{@fields} = split(/\t/, $record);
  $lines++;

  print "After $lines lines, hash contains:\n";
  # A simple way to print the contents of a hash:
  foreach ( sort keys %hash ) {
    print "  $_ => $hash{$_}\n";
  }

}

close FILE;
print "$lines records processed.\n";
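
If the aim is to keep every record rather than overwrite the hash on each
pass, one possible approach is an array of hash references. A minimal
sketch, assuming a three-column layout and reading from an in-memory
string instead of orig.txt (both are illustrative choices, not taken from
the original post):

```perl
use strict;
use warnings;

# Hypothetical field names and data standing in for orig.txt.
my @fields = qw/a b c/;
my $data   = "1\t2\t3\n4\t5\t6\n";

open(my $fh, '<', \$data) or die "Can't open in-memory file: $!";

my @records;
while (my $line = <$fh>) {
    chomp $line;
    my %hash;                      # a fresh hash for every line
    @hash{@fields} = split /\t/, $line;
    push @records, \%hash;         # keep a reference so the data survives
}
close $fh;

for my $i (0 .. $#records) {
    my $rec = $records[$i];
    print "record ", $i + 1, ": ",
          join(", ", map { "$_=$rec->{$_}" } sort keys %$rec), "\n";
}
print scalar(@records), " records processed.\n";
```

Because "my %hash" creates a new variable on each iteration, every
reference pushed onto @records points at a different hash.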


-- 
# Iain | PGP mail preferred: pubkey@www.deepsea.f9.co.uk/misc/iain.asc
($=,$,)=split m$"13/$,qq;13"13/tl\.rnh  r   HITtahkPctacriAneeeusaoJ;;
for(@==sort@$=split m,,,$,){$..=$$[$=];$$=$=[$=];$@=1;$@++while$=[--$=
]eq$$&&$=>=$?;$==$?;for(@$){$@--if$$ eq$_;;last if!$@;$=++}}print$..$/


------------------------------

Date: Sat, 28 Feb 2004 12:57:24 +0100
From: Michele Dondi <bik.mido@tiscalinet.it>
Subject: Re: Comments requested: brief summary of Perl
Message-Id: <pde2as8n9r8u8qc79f9c3mtgru12077c9r@4ax.com>

On 27 Feb 2004 10:07:10 -0800, yf110@vtn1.victoria.tc.ca (Malcolm
Dew-Jones) wrote:

>: | As a shortcut for lists of strings, you can use qw (the letter q followed by the letter w):
>
>: Of "words"!
>
>not really words, actually fairly arbitrary strings, just as long as they
>don't have white space in them, try the following

Wild guess: may this be a good reason why I wrote qq|"words"| instead
of qq|words|?!?


Michele
-- 
you'll see that it shouldn't be so. AND, the writting as usuall is
fantastic incompetent. To illustrate, i quote:
- Xah Lee trolling on clpmisc,
  "perl bug File::Basename and Perl's nature"


------------------------------

Date: 28 Feb 2004 01:02:14 -0800
From: massion@gmx.de (FMAS)
Subject: FIND: Parameterformat falsch (parameter format not correct)
Message-Id: <f0b3f4c9.0402280102.4c81126d@posting.google.com>

I have downloaded a professional script to do some tagging on a text.
This script should read files from a directory but I get an error
message:
FIND: parameter format not correct

Here are the critical parts of the script:

use Tk;
use Tk::BrowseEntry;
use Tk::Dialog;

if ($#ARGV < 2) {
    die "\nUsage: general_tagger.pl <input_dir> <output_dir>
<label_file>\n\nIMPORTANT: This script enables you to delete files in
the input directory\nwith the Skip button. Create a backup now!!\n\n";
}
else {
    $intextfiledir = "${ARGV[0]}\";
    $outtextfiledir = "${ARGV[1]}\";
    $labelfile = $ARGV[2];
}

(@infilelist) = read_filelist($intextfiledir);
($label_width, @labels) = read_labels($labelfile);

my $num_labels = $#labels + 1;
my $current_start = "0.0";
my $current_end = "0.0";
my $text_start = "0.0";
my $text_end = "0.0";

$current_file = "";
$fileonly = "";
$index = -1;
$num_files = $#infilelist + 1;
$done = $index + 1;

(...)


sub read_filelist {
    my ($dir) = @_;
    my ($file, @filelist);

    open(FIND, "find $dir | ") or die "Couldn't run find...\n";
    $file = <FIND>;  # Get rid of bogus first line
    while ($file = <FIND>) {
        chop $file;

        next if ($file !~ /\w/);
        push(@filelist, $file);
    }
  
    return (@filelist);
}

The script was written in '99. I am running it on an XP computer. Is
this the reason for the error message?

Thanks in advance for any suggestion

Francois


------------------------------

Date: Sat, 28 Feb 2004 09:49:43 GMT
From: Joe Smith <Joe.Smith@inwap.com>
Subject: Re: FIND: Parameterformat falsch (parameter format not correct)
Message-Id: <XuZ%b.428750$na.935931@attbi_s04>

FMAS wrote:

> I have downloaded a professional script to do some tagging on a text.
> This script should read files from a directory but I get an error
> message:
> FIND: parameter format not correct

> open(FIND, "find $dir | ") or die "Couldn't run find...\n";
> 
> The script was written in '99. I am running it on an XP computer. Is
> this the reason for the error message?

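Yes.  The script expects the Unix 'find' command; on Windows, FIND is an
unrelated text-search utility, which is what printed that error.  You need
to use File::Find instead.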

#! perl
use strict;
use warnings;
use File::Find;

my @filelist;

sub wanted {
   return if -d;   # Don't include directory names
   push @filelist, $File::Find::name if /\w/;
}

sub read_filelist {
   my $dir = shift;
   @filelist = ();   # start fresh on each call
   find(\&wanted, $dir);
   return @filelist;
}

print join("\n", read_filelist("C:/temp")), "\n";


------------------------------

Date: 28 Feb 2004 13:49:45 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: FIND: Parameterformat falsch (parameter format not correct)
Message-Id: <c1q69p$28a$1@mamenchi.zrz.TU-Berlin.DE>

FMAS <massion@gmx.de> wrote in comp.lang.perl.misc:
> I have downloaded a professional script to do some tagging on a text.
                      ^^^^^^^^^^^^
No.  Like much Perl software that is freely downloadable, this is not
professional Perl code.

> This script should read files from a directory but I get an error
> message:
> FIND: parameter format not correct
> 
> Here are the critical parts of the script:

No strict, no warnings.

> use Tk;
> use Tk::BrowseEntry;
> use Tk::Dialog;
> 
> if ($#ARGV < 2) {

Better written as "@ARGV < 3".  The author doesn't seem to know that an
array returns the number of its elements in scalar context.
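
A small sketch of the difference between the element count and the last
index. The @args array here is a hypothetical stand-in for @ARGV:

```perl
use strict;
use warnings;

# Hypothetical stand-in for @ARGV.
my @args = qw/input_dir output_dir label_file/;

my $count = @args;     # array in scalar context: number of elements
my $last  = $#args;    # index of the last element, i.e. count - 1

print "count=$count last_index=$last\n";    # count=3 last_index=2
print "too few arguments\n" if @args < 3;   # numeric comparison gives scalar context
```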

>     die "\nUsage: general_tagger.pl <input_dir> <output_dir>
> <label_file>\n\nIMPORTANT: This script enables you to delete files in
> the input directory\nwith the Skip button. Create a backup now!!\n\n";
> }
> else {
>     $intextfiledir = "${ARGV[0]}\";
>     $outtextfiledir = "${ARGV[1]}\";
>     $labelfile = $ARGV[2];
> }

This doesn't compile, the backslashes are bogus.  Ignoring them, the code
inconsistently and uselessly quotes two of the three arguments.

> (@infilelist) = read_filelist($intextfiledir);
> ($label_width, @labels) = read_labels($labelfile);
> 
> my $num_labels = $#labels + 1;

"$num_labels = @labels".  Wherever "$num_labels" is used, "@labels" in
scalar context could be used.  The variable is superfluous.

> my $current_start = "0.0";
> my $current_end = "0.0";
> my $text_start = "0.0";
> my $text_end = "0.0";

Quoting numeric values is unnecessary and can lead to subtle errors.

[rest of code snipped]

The code bears all the hallmarks of an inexperienced Perl programmer.
The error you received won't be the only one, just judging from the first
few lines.  Find something better.

Anno


------------------------------

Date: Sat, 28 Feb 2004 13:46:22 +0300
From: Joel <joel@hotmail.ru>
Subject: Finding a string in an other string, then..
Message-Id: <d9s040dhdvu4pg1pcljl4qrnm5dm1e9d96@4ax.com>

I have a text $alltext, and want to find a specific string
$stringtofind that is in the text.

Then, I want the script to read a number that is right after the
string.

What is the best way to get it ?

Thanks !


------------------------------

Date: Sat, 28 Feb 2004 10:58:25 GMT
From: Beable van Polasm <beable+unsenet@beable.com.invalid>
Subject: Re: Finding a string in an other string, then..
Message-Id: <ef1xof31z9.fsf@dingo.beable.com>

Joel <joel@hotmail.ru> writes:

> I have a text $alltext, and want to find a specific string
> $stringtofind that is in the text.
> 
> Then, I want the script to read a number that is right after the
> string.

This sounds like a job for... Regular Expressions! Please read these
documents:

perldoc perlrequick
perldoc perlretut
perldoc perlre

> What is the best way to get it ?

It depends upon your definition of "number", and so on. This
program might do something like what you want.

#!/usr/bin/perl

use strict;
use warnings;

my $alltext = "some string with some stuff in it 34454 and a number";
my $stringtofind = "stuff in it ";

if ($alltext =~ m/\Q$stringtofind\E(\d+)/)
{
    my $number = $1;
    print(" found string, number is $number\n");
}
else
{
    print("string not found\n");
}

__END__
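
If $stringtofind can contain characters that are special in a regex, such
as parentheses or dots, it is probably safer to quote it with \Q...\E so
that it is matched literally. A minimal sketch with made-up sample text:

```perl
use strict;
use warnings;

# Made-up sample data; the search string contains regex metacharacters.
my $alltext      = "total (net): 42 units";
my $stringtofind = "total (net): ";

my $number;
# \Q...\E escapes the metacharacters, so "(net)" is matched literally
# rather than being treated as a capture group.
if ($alltext =~ /\Q$stringtofind\E(\d+)/) {
    $number = $1;
    print "number is $number\n";    # number is 42
}
else {
    print "string not found\n";
}
```

Without the \Q...\E, the parentheses would form a capture group and this
particular text would not match at all.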


   


------------------------------

Date: Sat, 28 Feb 2004 12:18:49 +0000
From: pkent <pkent77tea@yahoo.com.tea>
Subject: Re: function reference and ->
Message-Id: <pkent77tea-96E46B.12184828022004@pth-usenet-02.plus.net>

In article <403d7adf$1_2@rain.i-cable.com>,
 toylet <toylet_at_mail.hongkong.com> wrote:

> my $rf = \&{"afunction"};

as the other poster says, you'd call this as $rf->() or $rf->($arg) or 
$rf->(@args) - although I'd personally write it as:

my $rf = \&afunction;

sub afunction {
   # blah
}

for what it's worth :-)
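
A short sketch of the calling forms mentioned above. The body of
afunction is invented here so that there is something to print:

```perl
use strict;
use warnings;

# Illustrative body; the original post left it as "# blah".
sub afunction {
    return "called with: @_";
}

my $rf   = \&afunction;
my @args = (1, 2, 3);

print $rf->(), "\n";         # no arguments
print $rf->('x'), "\n";      # one argument
print $rf->(@args), "\n";    # a list of arguments
print &$rf(@args), "\n";     # older, equivalent dereferencing syntax
```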

P

-- 
pkent 77 at yahoo dot, er... what's the last bit, oh yes, com
Remove the tea to reply


------------------------------

Date: Sat, 28 Feb 2004 10:00:24 GMT
From: Joe Smith <Joe.Smith@inwap.com>
Subject: Re: gd 1.41 install errors in red hat 9
Message-Id: <XEZ%b.139573$uV3.667604@attbi_s51>

Stephen Strong wrote:

> how do i get this module installed

GD.pm requires that libgd.a, libgd.so, and gd.h be installed first.
Did you read the last 20 lines of the GD docs?
	-Joe


------------------------------

Date: Sat, 28 Feb 2004 11:55:17 -0000
From: "gnari" <gnari@simnet.is>
Subject: Re: hash of hashes with lists
Message-Id: <c1pvh8$fog$1@news.simnet.is>

"cousin_bubba" <cousin_bubba@hotmail.com> wrote in message
news:cf64e035.0402271856.3560fa18@posting.google.com...

> ...
> The hash of hashes looks like this:
>
[snipped wrong hash declaration]

> Basically I want to cycle through all of the four letter accronyms to
> see if they match a current value called $current_acc.  The numbers
> "one" "two" etc are command line arguments.  The $head_acc variable is
> the key of the inner hashes.  I have tried the following but it
> doesn't work:
>
[snip sort of code]

>
> Is there a way I can do this?

it is not quite clear exactly what values you are looking for, but it looks
to me like you have your data structure backwards.

you should have the things that you look for as the *keys* of a hash.

if you need to look up both ways, keep 2 hashes
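
A minimal sketch of the two-hash idea. The original declaration was
snipped, so the acronyms, values and hash names below are purely
hypothetical:

```perl
use strict;
use warnings;

# Hypothetical data: acronym => number word.
my %number_for = (ABCD => 'one', EFGH => 'two', IJKL => 'three');

# Inverse lookup. This only works cleanly if the values are unique.
my %acronym_for = reverse %number_for;

my $current_acc = 'EFGH';
if (exists $number_for{$current_acc}) {
    print "$current_acc => $number_for{$current_acc}\n";    # EFGH => two
}
print "two => $acronym_for{two}\n";                         # two => EFGH
```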

gnari






------------------------------

Date: Sat, 28 Feb 2004 12:57:25 +0100
From: Michele Dondi <bik.mido@tiscalinet.it>
Subject: Re: How to access filehandle through globref?
Message-Id: <2f0140du4ugv256lhdk25epaj43agganic@4ax.com>

On Fri, 27 Feb 2004 10:59:31 +0000 (UTC), Ben Morrow
<usenet@morrow.me.uk> wrote:

>open my $SAVE_DBOUT, '>&=', DB::OUT;
>open DB::OUT, '>', 'file';
>...
>open DB::OUT, '>&=', $SAVE_DBOUT;
>
>If you're not using 5.8 you'll need to change those to
>
>open my $SAVE_DBOUT, '>&=' . fileno DB::OUT;

Hey, I hope you won't think I'm starting some sort of personal
competition, but... may this be an occasion you were sloppy enough to
give me a chance of saying: aren't we checking if open()s succeeded
any more?!?
;-)


Michele
-- 
you'll see that it shouldn't be so. AND, the writting as usuall is
fantastic incompetent. To illustrate, i quote:
- Xah Lee trolling on clpmisc,
  "perl bug File::Basename and Perl's nature"


------------------------------

Date: Sat, 28 Feb 2004 13:02:24 +0000 (UTC)
From: Pawel Niewiadomski <niewiap@NOSPAM.widzew.net.INVALID>
Subject: HTML in utf8 and perl
Message-Id: <Xns949D8D1281DDBniewiapNOSPAMwidzewn@212.51.192.18>

I have been looking all over for an answer to this and haven't found a 
satisfactory one. Please tell me what's going on. I want to write a perl 
script generating an html page encoded in utf8. I was wondering why the 
following code

#!/usr/bin/perl
binmode (STDOUT, ":utf8");
use charnames ':full';
printf "\N{CYRILLIC SMALL LETTER EF}\n";
printf "\x{d184}\n";

produces two characters encoded differently, although theoretically it
should generate two Russian ef's identically encoded. The first character
is normally visible in a browser (provided I set utf8 encoding) and the
second is not. Other than that, the second character is coded by three,
not two bytes, as I would expect. Changing :utf8 to :raw in the second
line only produces additional "Wide character in print at..." warnings
but doesn't change the general output. Writing printf "\xd1\x84\n" would
be a solution, but I am wondering what the problem here is with
"\x{d184}". If what I am asking has an obvious answer, please be so kind
as to refer me to a sensible source of information.
Thanks very much in advance,
Pawel


------------------------------

Date: Sat, 28 Feb 2004 13:20:55 +0000
From: "Alan J. Flavell" <flavell@ph.gla.ac.uk>
Subject: Re: HTML in utf8 and perl
Message-Id: <Pine.LNX.4.53.0402281305100.23691@ppepc56.ph.gla.ac.uk>

On Sat, 28 Feb 2004, Pawel Niewiadomski wrote:

> #!/usr/bin/perl

Where's your strict and warnings ?  Please read the group guidelines
and help yourself before asking others to help you.  (Even if it isn't
the actual issue here).

> binmode (STDOUT, ":utf8");
> use charnames ':full';
> printf "\N{CYRILLIC SMALL LETTER EF}\n";

Why on Earth use printf() instead of print() here?  Again not the
specific issue here - but one day that's going to bite.
http://www.perldoc.com/perl5.8.0/pod/func/printf.html

> printf "\x{d184}\n";

The character U+D184 is in the Hangul Syllables area, not Cyrillic
http://www.unicode.org/charts/PDF/UAC00.pdf

Your CYRILLIC SMALL LETTER EF character is \x{0444}

> produces two characters encoded differently, although theoretically it
> should generate two Russian ef's identically encoded.

No, theoretically the second one should generate the Unicode character
which you specified.  You're confusing Unicode values with their utf-8
encodings.

> Other than that, the second character is coded by three,
> not two bytes,

Indeed it is, although you shouldn't normally be concerned with that
if you are using Perl 5.8 Unicode characters as intended.  It's not
the character that you wanted.

> refer me to a sensible source of information.

perldoc perluniintro, perldoc perlunicode, and
http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html

good luck


------------------------------

Date: Sat, 28 Feb 2004 14:04:01 +0000
From: Andy Hassall <andy@andyh.co.uk>
Subject: Re: HTML in utf8 and perl
Message-Id: <577140l8sdgcl0dcmehcc8ndorqhstkbvo@4ax.com>

On Sat, 28 Feb 2004 13:02:24 +0000 (UTC), Pawel Niewiadomski
<niewiap@NOSPAM.widzew.net.INVALID> wrote:

>I have been looking all over for an answer to this and haven't found a 
>satisfactory one. Please tell me what's going on. I want to write a perl 
>script generating an html page encoded in utf8. I was wondering why the 
>following code
>
>#!/usr/bin/perl
>binmode (STDOUT, ":utf8");
>use charnames ':full';
>printf "\N{CYRILLIC SMALL LETTER EF}\n";
>printf "\x{d184}\n";
>
>produces two characters encoded differently, although theoretically it
>should generate two Russian ef's identically encoded. The first character
>is normally visible in a browser (provided I set utf8 encoding) and the
>second is not. Other than that, the second character is coded by three,
>not two bytes, as I would expect. Changing :utf8 to :raw in the second
>line only produces additional "Wide character in print at..." warnings
>but doesn't change the general output. Writing printf "\xd1\x84\n" would
>be a solution, but I am wondering what the problem here is with
>"\x{d184}". If what I am asking has an obvious answer, please be so kind
>as to refer me to a sensible source of information.

 \x{} produces a 'wide hex char' (see perlop). The main point here, I think, is
that it is a char, and not just dumping a series of bytes out.

 CYRILLIC SMALL LETTER EF is U+0444, which in UTF-8 encoding is represented in
two bytes by 0xd1 0x84.

>printf "\N{CYRILLIC SMALL LETTER EF}\n";

 You'd expect that to output 0xd1 0x84, no surprises here.

>printf "\x{d184}\n";

 This outputs 0xed 0x86 0x84.
 This is the UTF-8 representation of U+D184, HANGUL SYLLABLE TYE.

 Do you really mean:

printf "\x{444}\n";

 i.e. print U+0444 CYRILLIC SMALL LETTER EF, which gets encoded by the :utf8
specification as two bytes 0xd1 0x84?

#!/usr/bin/perl
use strict; 
use warnings;

binmode (STDOUT, ":utf8");
use charnames ':full';
printf "\N{CYRILLIC SMALL LETTER EF}\n";
printf "\x{444}\n";
printf "\x{d184}\n";

__END__

andyh@server:~$ test.pl | hexdump -C
00000000  d1 84 0a d1 84 0a ed 86  84 0a                    |..........|
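
The same distinction between a code point and its UTF-8 bytes can be
checked without a terminal dump. This is only a sketch: it uses the core
Encode module, and the helper name utf8_hex is made up for the example.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: hex dump of the UTF-8 encoding of one code point.
sub utf8_hex {
    my ($codepoint) = @_;
    my $bytes = encode('UTF-8', chr($codepoint));
    return join ' ', map { sprintf '%02x', ord($_) } split //, $bytes;
}

for my $cp (0x0444, 0xD184) {
    printf "U+%04X -> %s\n", $cp, utf8_hex($cp);
}
# U+0444 -> d1 84
# U+D184 -> ed 86 84
```

So "d184" is the byte sequence for U+0444, not a code point in its own
right, which is why \x{d184} selects a different character.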

-- 
Andy Hassall <andy@andyh.co.uk> / Space: disk usage analysis tool
<http://www.andyh.co.uk> / <http://www.andyhsoftware.co.uk/space>


------------------------------

Date: Sat, 28 Feb 2004 12:17:18 -0000
From: "gnari" <gnari@simnet.is>
Subject: Re: Parameterformat falsch (parameter format not correct)
Message-Id: <c1q0qi$fv7$1@news.simnet.is>

"FMAS" <massion@gmx.de> wrote in message
news:f0b3f4c9.0402280102.4c81126d@posting.google.com...
> I have downloaded a professional script to do some tagging on a text.
> This script should read files from a directory but I get an error
> message:
> FIND: parameter format not correct
>
> Here are the critical parts of the script:
>
[snip]
>
>
> sub read_filelist {
>     my ($dir) = @_;
>     my ($file, @filelist);
>
>     open(FIND, "find $dir | ") or die "Couldn't run find...\n";
>     $file = <FIND>;  # Get rid of bogus first line
>     while ($file = <FIND>) {
>         chop $file;

replace these 4 lines with:
       opendir(DIR,$dir) or die "Couldn't open dir '$dir': $!";
       while (defined($file = readdir DIR)) {

>
>         next if ($file !~ /\w/);
>         push(@filelist, $file);
change to:
           push(@filelist, "$dir/$file");

>     }
add here:
       closedir DIR;

>
>     return (@filelist);
> }

> The script was written in '99. I am running it on an XP computer. Is
> this the reason for the error message?

no the reason is that the script used a stupid method to read
a directory
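
For reference, a self-contained sketch of the opendir/readdir approach.
It builds a throwaway directory with File::Temp so it can run anywhere;
the file names are invented, and unlike find it does not descend into
subdirectories.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a scratch directory with two files and one subdirectory
# (names are made up for this example).
my $dir = tempdir(CLEANUP => 1);
for my $name (qw/a.txt b.txt/) {
    open(my $fh, '>', "$dir/$name") or die "Can't create $dir/$name: $!";
    close $fh;
}
mkdir("$dir/sub") or die "Can't create $dir/sub: $!";

# List plain files only; this also drops '.', '..' and the subdirectory.
opendir(my $dh, $dir) or die "Couldn't open dir '$dir': $!";
my @filelist = map { "$dir/$_" } grep { -f "$dir/$_" } readdir $dh;
closedir $dh;

print "$_\n" for sort @filelist;
```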

gnari





------------------------------

Date: Sat, 28 Feb 2004 11:43:39 -0000
From: "gnari" <gnari@simnet.is>
Subject: Re: regex
Message-Id: <c1purf$fmp$1@news.simnet.is>

"yamini" <yamini_rajan@nospam.com> wrote in message
news:e74a9f9632a8c4a6b497c3cde10d6072@localhost.talkaboutprogramming.com...
> hi,

hi yourself.
instead of posting a new thread, you made a follow up on your own post,
but with an unrelated question. this is why you get comments about regexes

> print "\t" x ($tab/8),' ' x ($tab%8);

this creates an indent denoted by $tab by using
TAB and SPACE characters. it assumes that a TAB is
equivalent to 8 SPACES
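
A tiny worked example of that expression, with $tab set to an arbitrary
value of 10. The int() is added only to make explicit that the repetition
count is truncated to a whole number:

```perl
use strict;
use warnings;

my $tab = 10;    # arbitrary indent width for the example

# 10 / 8 truncates to 1 tab, and 10 % 8 leaves 2 spaces.
my $indent = "\t" x int($tab / 8) . ' ' x ($tab % 8);

# Show the tab as a visible \t so the result can be read on screen.
(my $visible = $indent) =~ s/\t/\\t/g;
print "[$visible]\n";    # [\t  ]
```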

gnari






------------------------------

Date: Sat, 28 Feb 2004 12:57:24 +0100
From: Michele Dondi <bik.mido@tiscalinet.it>
Subject: Re: using sed from with a perl script
Message-Id: <ugg2as4a6denq39cq80n2q4ujder1lpbr8@4ax.com>

On Fri, 27 Feb 2004 16:07:57 +0000, fifo <fifo@despammed.com> wrote:

>>   #!/usr/bin/perl -ni
>>   
>>   use strict;
>>   use warnings;
>>   
>>   close ARGV if eof;
>>   print our $l unless $.==1;
>>   $l=$_;
>>     
>>   __END__
>> 
>
>I think this would be a bit easier:
>
>#!/usr/bin/perl -pi
>last if eof
>__END__

But since we're not in a Perl golf competition, I would tend to count
this solution as not reliable since, as you surely know, it works for
one file only:

  # more test*
  ::::::::::::::
  test1
  ::::::::::::::
  aaaa
  bbbb
  cccc
  dddd
  ::::::::::::::
  test2
  ::::::::::::::
  aaaa
  bbbb
  cccc
  dddd
  # ./foo.pl test*
  # more test*
  ::::::::::::::
  test1
  ::::::::::::::
  aaaa
  bbbb
  cccc
  ::::::::::::::
  test2
  ::::::::::::::
  aaaa
  bbbb
  cccc
  dddd

(where of course I called your script foo.pl)

And OTOH next won't work either, since print()ing takes place in a
continue block.

Of course you got the point about "my" solution/example being overly
complex:

  #!/usr/bin/perl -ni
  
  next if eof;
  print;
  
  __END__

works perfectly though.

As a side note, again since we're not doing a Perl golf competition, I
think it is stylistically recommended to put a semicolon after the
(only) statement in your script in any case...

Well, "my" solution's only advantage is that it can be adapted to the
more general situation in which #n lines are to be discarded:

  #!/usr/bin/perl -ni
  
  use strict;
  use warnings;
  
  close ARGV if eof;
  undef our @l if $. == 1;
  print shift @l unless $. < 5;
  push @l, $_;
    
  __END__

For some reason I was thinking of something like this when I wrote the
snippet of code in my other posts, but of course I won't ask you to
count this as an excuse...


Michele
-- 
you'll see that it shouldn't be so. AND, the writting as usuall is
fantastic incompetent. To illustrate, i quote:
- Xah Lee trolling on clpmisc,
  "perl bug File::Basename and Perl's nature"


------------------------------

Date: Sat, 28 Feb 2004 13:03:50 +0000
From: fifo <fifo@despammed.com>
Subject: Re: using sed from with a perl script
Message-Id: <20040228130347.GB32004@fleece>

At 2004-02-28 12:57 +0100, Michele Dondi wrote:
> On Fri, 27 Feb 2004 16:07:57 +0000, fifo <fifo@despammed.com> wrote:
> 
> >>   #!/usr/bin/perl -ni
> >>   
> >>   use strict;
> >>   use warnings;
> >>   
> >>   close ARGV if eof;
> >>   print our $l unless $.==1;
> >>   $l=$_;
> >>     
> >>   __END__
> >> 
> >
> >I think this would be a bit easier:
> >
> >#!/usr/bin/perl -pi
> >last if eof
> >__END__
> 
> But since we're not in a Perl golf competition, I would tend to count
> this solution as not reliable since, as you surely know, it works for
> one file only:

Argh, never even considered calling it with more than one file.  In my
defence, I was thinking in terms of replacing the OP's sed call,

  `sed '\$d' file.csv > file.csvnew`

with something like

  {
    open my $in, '<', 'file.csv' or die $!;
    open my $out, '>', 'file.csvnew' or die $!;
    while (<$in>) {
      last if eof;
      print $out $_;
    }
    close $out or die $!;
  }
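
The same loop can be tried without touching any files by using in-memory
filehandles. This is only a sketch; the sample text mirrors the test files
shown earlier in the thread, and the variable names are illustrative.

```perl
use strict;
use warnings;

# Sample input matching the test files shown earlier in the thread.
my $input  = "aaaa\nbbbb\ncccc\ndddd\n";
my $output = '';

open(my $in,  '<', \$input)  or die "Can't read input: $!";
open(my $out, '>', \$output) or die "Can't write output: $!";

while (my $line = <$in>) {
    last if eof($in);        # true once the final line has been read
    print {$out} $line;      # print to the output handle, not STDOUT
}

close $out or die "Can't close output: $!";
close $in;

print $output;    # aaaa, bbbb and cccc, without the last line
```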

> As a side note, again since we're not doing a Perl golf competition, I
> think it is stylistically recommended to put a semicolon after the
> (only) statement in your script in any case...
> 

My personal preference is to omit the semicolon at the end of a
statement that's on its own in a block, but to put semicolons at the end
of all statements if there are more than one.  I was just applying this
rule to the admittedly unusual case of a script consisting of one
statement!


------------------------------

Date: Sat, 28 Feb 2004 07:56:45 -0500
From: "Matt Garrish" <matthew.garrish@sympatico.ca>
Subject: Re: XML best practices (was: Python as replacement for PHP?)
Message-Id: <ee00c.21660$253.1494407@news20.bellglobal.com>


"Cameron Laird" <claird@lairds.com> wrote in message
news:1040489b2ookafd@corp.supernews.com...
> In article <30260531.0402271901.1bbdda99@posting.google.com>,
> simo <simoninusa2001@yahoo.co.uk> wrote:
> .
> .
> .
> >Perl - bloody fast, if you're doing lots of text processing/regex
> >(e.g. XML parsing) then Perl is it, probably best for sysadmin tasks
> >too. We use this for large reports at work. I love its hash handling
> .
> Let me get this straight:  your preferred vehicle for XML
> parsing is Perl regular expressions?  That's ... well, it's
> a different impression than I've ever gained from anyone
> else with deep XML experience.  It's sure not my first
> instinct.
>

Not sure what the entire context of this thread is, but I get the impression
that you've misread what he was saying. XML parsing is an example of text
processing, and there are a number of Perl modules that can be used for that
task. Perl probably wouldn't be my first choice for parsing markup languages
(I'd personally use Omnimark), but it is all things to all people.

Matt




------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 6192
***************************************
