[32559] in Perl-Users-Digest


Perl-Users Digest, Issue: 3825 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Nov 27 00:09:21 2012

Date: Mon, 26 Nov 2012 21:09:07 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Mon, 26 Nov 2012     Volume: 11 Number: 3825

Today's topics:
    Re: Creating visual graphics <jimsgibson@gmail.com>
    Re: Creating visual graphics <bernie@fantasyfarm.com>
    Re: Creating visual graphics <*@eli.users.panix.com>
    Re: New perl books in the last year or two (or three)?  <jimsgibson@gmail.com>
    Re: using templates effectively  act Two scene 1 <cal@example.invalid>
    Re: using templates effectively  act Two scene 1 <jimsgibson@gmail.com>
    Re: using templates effectively  act Two scene 1 <cal@example.invalid>
    Re: using templates effectively  act Two scene 1 <cal@example.invalid>
    Re: UTF-8 read & print? <rweikusat@mssgmbh.com>
    Re: UTF-8 read & print? <tuxedo@mailinator.com>
    Re: UTF-8 read & print? <tuxedo@mailinator.com>
    Re: UTF-8 read & print? <rweikusat@mssgmbh.com>
    Re: UTF-8 read & print? <tuxedo@mailinator.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Mon, 26 Nov 2012 13:17:52 -0800
From: Jim Gibson <jimsgibson@gmail.com>
Subject: Re: Creating visual graphics
Message-Id: <261120121317523338%jimsgibson@gmail.com>

In article <g3t4b81lblhjmufb4rla6qg7c4labqgc5o@library.airnews.net>,
Bernie Cosell <bernie@fantasyfarm.com> wrote:

> I'm VERY new to all this, but I've started fooling around with the GD
> package on my win7/pro system [using ActiveState Perl] and it seems simple
> enough that even a dolt like me can manage to do some graphics-generation.
> 
> What I'm wondering is if there's some [not too complicated? :o)] package
> that'd allow "active" graphics.  that is, it'd open a window and draw in it
> in some simple way [ala GD, if not GD::Simple :o)], and I could be typing
> into the command window to mess with things and have the display change.
> Thanks!

Any graphics package that is "fast enough" can do "active" graphics,
even if it means redrawing the entire graph from scratch each time. Of
course, it depends on how complicated the graphics are, how fast the
graphics package, and how powerful the hardware. 

For example, I have had success in the past using gnuplot and Perl to
implement "quasi-active" graphics. My Perl program would open gnuplot
as a separate process, write plot data to a file, pipe plot commands to
gnuplot to plot the data in the file, and wait for terminal input. I
could then type arrow, plus, and minus keys to navigate around the plot
region and zoom in and out. Even though the program was re-writing the
entire data file and redrawing the graph each time, it was fast enough
to be used as an interactive analysis and data visualization tool.
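
A minimal sketch of that kind of loop, under some assumptions: the file
name plot.dat, the sub names, and the key bindings (l, r, +, - and q,
each followed by Enter) are made up for illustration, and gnuplot is
assumed to be on the PATH. Unlike the program described above, it writes
the data file once and only re-sends the plot commands.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Handle;    # for autoflush on the pipe

# Write (x, y) pairs to a data file, one "x y" pair per line.
sub write_data {
    my ($file, $points) = @_;
    open my $fh, '>', $file or die "Can't write $file: $!";
    print {$fh} "$_->[0] $_->[1]\n" for @$points;
    close $fh or die "Can't close $file: $!";
}

# Build the gnuplot commands that redraw the data for a given x window.
sub plot_commands {
    my ($file, $xmin, $xmax) = @_;
    return "set xrange [$xmin:$xmax]\n"
         . "plot '$file' using 1:2 with lines notitle\n";
}

# New window for a key: l/r pan by a quarter width, +/- zoom about the centre.
sub adjust_view {
    my ($key, $xmin, $xmax) = @_;
    my $width = $xmax - $xmin;
    my $mid   = ($xmin + $xmax) / 2;
    return ($xmin - $width / 4, $xmax - $width / 4) if $key eq 'l';
    return ($xmin + $width / 4, $xmax + $width / 4) if $key eq 'r';
    return ($mid - $width / 4, $mid + $width / 4)   if $key eq '+';
    return ($mid - $width,     $mid + $width)       if $key eq '-';
    return ($xmin, $xmax);    # unknown key: leave the view alone
}

# The interactive part only runs with --plot, since it needs gnuplot.
if (@ARGV && $ARGV[0] eq '--plot') {
    my $file = 'plot.dat';
    my ($xmin, $xmax) = (0, 10);
    write_data($file, [ map { [ $_ / 10, sin($_ / 10) ] } 0 .. 200 ]);

    open my $gnuplot, '|-', 'gnuplot' or die "Can't start gnuplot: $!";
    $gnuplot->autoflush(1);
    print {$gnuplot} plot_commands($file, $xmin, $xmax);

    while (defined(my $key = <STDIN>)) {
        chomp $key;
        last if $key eq 'q';
        ($xmin, $xmax) = adjust_view($key, $xmin, $xmax);
        print {$gnuplot} plot_commands($file, $xmin, $xmax);
    }
    close $gnuplot;
}
```

Reading single keystrokes such as the arrow keys without pressing Enter
needs a terminal module, which is left out here to keep the sketch to
core Perl.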

-- 
Jim Gibson


------------------------------

Date: Mon, 26 Nov 2012 19:23:24 -0500
From: Bernie Cosell <bernie@fantasyfarm.com>
Subject: Re: Creating visual graphics
Message-Id: <eg18b8tmd6auipj1dfdkd7fnfmtqa4ac4n@library.airnews.net>

Jim Gibson <jimsgibson@gmail.com> wrote:

} In article <g3t4b81lblhjmufb4rla6qg7c4labqgc5o@library.airnews.net>,
} Bernie Cosell <bernie@fantasyfarm.com> wrote:
} 
} > What I'm wondering is if there's some [not too complicated? :o)] package
} > that'd allow "active" graphics.  that is, it'd open a window and draw in it
} > in some simple way [ala GD, if not GD::Simple :o)], and I could be typing
} > into the command window to mess with things and have the display change.
} > Thanks!
} 
} Any graphics package that is "fast enough" can do "active" graphics,
} even if it means redrawing the entire graph from scratch each time. Of
} course, it depends on how complicated the graphics are, how fast the
} graphics package, and how powerful the hardware. 

Interesting approach.  I expect it'll be fast enough and I'll look at
gnuplot.  What did you use to generate the plot data?  I appreciate the
reminder from Willem about Tk::Canvas, but I've used tk in the past and
didn't like it a lot.  Using a scratch file in temp seems easy enough.

It'd still be nice if I could open a window somehow and talk to it with
GD... :o)

tnx.  /b\
-- 
Bernie Cosell                     Fantasy Farm Fibers
bernie@fantasyfarm.com            Pearisburg, VA
    -->  Too many people, too few sheep  <--          


------------------------------

Date: Tue, 27 Nov 2012 01:35:17 +0000 (UTC)
From: Eli the Bearded <*@eli.users.panix.com>
Subject: Re: Creating visual graphics
Message-Id: <eli$1211262028@qz.little-neck.ny.us>

In comp.lang.perl.misc, Bernie Cosell  <bernie@fantasyfarm.com> wrote:
> reminder from Willem about Tk::Canvas, but I've used tk in the past and
> didn't like it a lot.  Using a scratch file in temp seems easy enough.
> 
> It'd still be nice if I could open a window somehow and talk to it with
> GD... :o)

Non Perl response:

On linux, I like feh as an image viewer. You can have feh reload an image
every (integer) N seconds. Not quite realtime, but good enough for
some purposes. I have used it for reloading electoral result maps every
few minutes.

Perl response: X11::Protocol. It's a bit cumbersome, though.

Elijah
------
use X11::Protocol;$X=new X11::Protocol;END{$X->FreeGC($G);undef$X}map{$$_=$X->
new_rsrc}(W,F,G);$X->event_handler('queue');$X->CreateWindow($W,$X->root,'Inp'
 .'utOutput',$D=$X->root_depth,'CopyFromParent',(0,0),300,30,4,'event_mask',01,
background_pixel=>2**$D-1);$X->CreateGC($G,$W);$X->MapWindow($W);$X->PolyText8
($W,$G,25,28,[0,'Eli the Bearded:Just Another Perl Hacker']);$X->handle_input;


------------------------------

Date: Mon, 26 Nov 2012 13:04:58 -0800
From: Jim Gibson <jimsgibson@gmail.com>
Subject: Re: New perl books in the last year or two (or three)?  [Borders, etc, gone]
Message-Id: <261120121304586887%jimsgibson@gmail.com>

In article <k8s0fa$khp$1@panix2.panix.com>, David Combs
<dkcombs@panix.com> wrote:

> Used to be, I could discover new books by browsing eg
> Borders (wonderful bookstore) or, best of all for NYC,
> the McGraw-Hill bookstore.
> 
> All gone, M-H *long* gone.
> 
> And the computer section of Barnes and Noble keeps shrinking,
> ditto even the big B&N in NYC at 5th ave and 18th st.
> 
> Amazon, unfortunately, doesn't (as far as I know) list books
> by publication date.  Who wants to look through a hundred or
> more books every few months, 99% of which you already know
> about.
> 
> So, as far as I know, the only practical way to discover (good) new
> Perl texts is by asking here.
> 
> So, that's my question: are there any?  And any *good* ones?  Or
> ones that cover new Perl features?

The latest (4th) edition of "Programming Perl", by Christiansen, foy,
Wall, & Orwant was published in February, 2012, and covers Perl 5.12,
5.14, and some previews of 5.16 features.
<http://shop.oreilly.com/product/9780596004927.do>

"Modern Perl", by chromatic also covers 5.12 and 5.14.
<http://onyxneon.com/books/modern_perl/>

-- 
Jim Gibson


------------------------------

Date: Mon, 26 Nov 2012 12:24:28 -0700
From: Cal Dershowitz <cal@example.invalid>
Subject: Re: using templates effectively  act Two scene 1
Message-Id: <YIKdnVx0mdtwXC7NnZ2dnUVZ_sudnZ2d@supernews.com>

On 11/26/2012 12:49 AM, Ben Morrow wrote:
>
> Quoth Cal Dershowitz <cal@example.invalid>:
>>
>> # main control
>> for my $name (@files) {
>>        print "name is $name\n";
>>        my ($ext) = $name =~ /([^.]*)$/;
>>        print "ext is $ext\n";
>>
>>        @matching = map /image_(\d+)\.$ext$/, @list;
>>        print "matching is @matching\n";
>>        push( @matching, 1 );
>>        @matching = sort { $a <=> $b } @matching;
>>        $winner = pop @matching;
>>        my $newnum    = $winner + 1;
>
> There is no need to redo the search through the list every time. If you
> move this code to a sub you can just remember the next number for each
> $ext, something like
>
>      my %next_for_ext;
>      sub next_for_ext {
>          my ($ext) = @_;
>
>          unless (exists $next_for_ext{$ext}) {
>              my @matching = map ... @list;
>              ...;
>              $next_for_ext{$ext} = pop @matching;
>          }
>
>          return ++$next_for_ext{$ext};
>      }
>
> then in the loop just call the sub
>
>      my $newnum = next_for_ext $ext;
>
> (and, of course, you don't then have to keep adding the new names to the
> list).

ok
>
>>        my $new_file2 = "image_$newnum.$ext";
>
> Give your variables more sensible names. This should be called something
> like $remote_file.

ok
>
>>        print "newfile is $new_file2\n";
>>        $ftp->put( $name, $new_file2 ) or die "put failed $!\n";
>>        push( @list, $new_file2 );
>>        # unlink($name);
>>
>>        print $fh "<img src=\"/images/$new_file2\"/>\n\n";
>>        print $fh "<p>caption for $new_file2 <\/p>\n";
>>
>> }
> <snip>
>
>>
>> So now, how do I write this so that this line:
>>
>>      print $fh "<img src=\"/images/$new_file2\"/>\n\n";
>>
>> is instead from a text file, one for each caption, in a template system
>> that mimics what I did here?
>
> The simplest template system is sprintf. It is rather limited, but
> sufficient for what you are doing here. First you create a template
> outside the loop, with printf %-codes in it:
>
>      my $template = <<'TEMPLATE';
>      <img src="/images/%s"/>
>
>      <p>%s</p>
>      TEMPLATE

What would be the filename this is in?
>
> then you open your text file
>
>      open my $CAPTIONS, "<", "captions.txt" or die ...;
>
> then in the loop you read a line from the file and fill in the template
>
>      my $remote_file = ...;
>      my $caption = <$CAPTIONS>;
>
>      $ftp->put($name, $remote_file) or ...;
>
>      printf $fh $template, $remote_file, $caption;
>
> This assumes, of course, that the captions are listed in the file in the
> order this script will find the files to be uploaded, and that the
> captions in the file are already in HTML. If they might not be in the
> right order you will need a more complicated file format than 'one line
> per file', so you can read it into a hash and look up the caption you
> need. If they aren't in HTML (if they are plain text which might contain
> special characters like <) you need to convert them to HTML using
> something like HTML::Entities.
>
> Ben
>


Alright, I don't see how this divides the captions up so that there is 
the appropriate one for every picture.  Are you suggesting that all the 
captions could be in one file?  If so, what would it split on?
-- 
Cal


------------------------------

Date: Mon, 26 Nov 2012 13:21:30 -0800
From: Jim Gibson <jimsgibson@gmail.com>
Subject: Re: using templates effectively  act Two scene 1
Message-Id: <261120121321306378%jimsgibson@gmail.com>

In article <YIKdnVx0mdtwXC7NnZ2dnUVZ_sudnZ2d@supernews.com>, Cal
Dershowitz <cal@example.invalid> wrote:

> On 11/26/2012 12:49 AM, Ben Morrow wrote:
> >
> > Quoth Cal Dershowitz <cal@example.invalid>:
> >>

> > The simplest template system is sprintf. It is rather limited, but
> > sufficient for what you are doing here. First you create a template
> > outside the loop, with printf %-codes in it:
> >
> >      my $template = <<'TEMPLATE';
> >      <img src="/images/%s"/>
> >
> >      <p>%s</p>
> >      TEMPLATE
> 
> What would be the filename this is in?

That would be in your Perl program source file.
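
As a self-contained sketch (the file name and caption below are made-up
sample values), the template is just a string in the program, and
sprintf fills the two %s slots in order:

```perl
use strict;
use warnings;

# The template is an ordinary string held in the program source.
# A single-quoted heredoc interpolates nothing, so %s survives for sprintf.
# The closing TEMPLATE marker must start in column 0.
my $template = <<'TEMPLATE';
<img src="/images/%s"/>

<p>%s</p>
TEMPLATE

# Fill the slots: first the remote file name, then the caption.
my $html = sprintf $template, 'image_7.jpg', 'Kitchen sink';
print $html;
```

A literal percent sign in such a template has to be written as %%.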

> >
> > then you open your text file
> >
> >      open my $CAPTIONS, "<", "captions.txt" or die ...;
> >
> > then in the loop you read a line from the file and fill in the template
> >
> >      my $remote_file = ...;
> >      my $caption = <$CAPTIONS>;
> >
> >      $ftp->put($name, $remote_file) or ...;
> >
> >      printf $fh $template, $remote_file, $caption;
> >
> > This assumes, of course, that the captions are listed in the file in the
> > order this script will find the files to be uploaded, and that the
> > captions in the file are already in HTML. If they might not be in the
> > right order you will need a more complicated file format than 'one line
> > per file', so you can read it into a hash and look up the caption you
> > need. If they aren't in HTML (if they are plain text which might contain
> > special characters like <) you need to convert them to HTML using
> > something like HTML::Entities.
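
One possible shape for the "more complicated file format" is a
tab-separated file with one "filename<TAB>caption" line per picture,
read into a hash. This is only a sketch: the format, the sub names and
the sample captions are assumptions, and the small escape_html sub
stands in for HTML::Entities so that the example needs nothing beyond
core Perl.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML text and attribute values.
sub escape_html {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Parse "filename<TAB>caption" lines into a hash keyed by file name.
sub read_captions {
    my ($fh) = @_;
    my %caption_for;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*(?:#|$)/;    # skip blank lines and comments
        my ($file, $caption) = split /\t/, $line, 2;
        $caption_for{$file} = $caption // '';
    }
    return \%caption_for;
}

# In-memory stand-in for captions.txt so the sketch runs on its own.
my $captions_txt = "26-img_0004.jpg\tOld sink & taps\n"
                 . "26-img_0005.jpg\tCabinet doors <oak>\n";
open my $in, '<', \$captions_txt or die "Can't open in-memory file: $!";
my $caption_for = read_captions($in);
close $in;

my $template = qq{<img src="/images/%s"/>\n\n<p>%s</p>\n};
for my $file (sort keys %$caption_for) {
    printf $template, escape_html($file), escape_html($caption_for->{$file});
}
```

Looking the caption up by file name means the order of lines in the
captions file no longer has to match the order of the uploads.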

-- 
Jim Gibson


------------------------------

Date: Mon, 26 Nov 2012 18:39:52 -0700
From: Cal Dershowitz <cal@example.invalid>
Subject: Re: using templates effectively  act Two scene 1
Message-Id: <e_GdnRzF6eN7hCnNnZ2dnUVZ_vKdnZ2d@supernews.com>

On 11/26/2012 02:21 PM, Jim Gibson wrote:
> In article <YIKdnVx0mdtwXC7NnZ2dnUVZ_sudnZ2d@supernews.com>, Cal
> Dershowitz <cal@example.invalid> wrote:
>
>> On 11/26/2012 12:49 AM, Ben Morrow wrote:
>>>
>>> Quoth Cal Dershowitz <cal@example.invalid>:
>>>>
>
>>> The simplest template system is sprintf. It is rather limited, but
>>> sufficient for what you are doing here. First you create a template
>>> outside the loop, with printf %-codes in it:
>>>
>>>       my $template = <<'TEMPLATE';
>>>       <img src="/images/%s"/>
>>>
>>>       <p>%s</p>
>>>       TEMPLATE
>>
>> What would be the filename this is in?
>
> That would be in your Perl program source file.

Ok, Jim. I found the link to the website for the badger book and then 
the tutorial, and it helped straighten me out on what files exist in all 
this:


http://www.template-toolkit.org/docs/tutorial/Web.html


$ pwd
/home/fred/Pictures/2012/11
$ tpage example.html
file error - header: not found$
$ tpage example.html
<html>
   <head>
     <title>This is an HTML example</title>
   </head>
   <body>

    <h1>Some Interesting Links</h1>
    <ul>

      <li><a href="http://foo.org">The Foo Organisation</a>

      <li><a href="http://bar.org">The Bar Organisation</a>

    </ul>

     <div class="copyright">
       &copy; Copyright 2007 Arthur Dent
     </div>
   </body>
</html>

$ cat header
<html>
   <head>
     <title>[% title %]</title>
   </head>
   <body>
$ cat footer
     <div class="copyright">
       &copy; Copyright 2007 Arthur Dent
     </div>
   </body>
</html>
$

So there's gonna be a header file and a footer file.  Header files in C 
look like header.h, so I was a little thrown by this until my compiler 
could straighten me out.

I always like to plug the author when I profit from reading a book, but 
I think I'll go with this from here on out:

$ cat header
<!DOCTYPE html>
   <head>
     <title>[% title %]</title>
   </head>
   <body>
$ cat footer
   </body>
</html>
$ tpage example.html
<!DOCTYPE html>
   <head>
     <title>This is an HTML example</title>
   </head>
   <body>

    <h1>Some Interesting Links</h1>
    <ul>

      <li><a href="http://foo.org">The Foo Organisation</a>

      <li><a href="http://bar.org">The Bar Organisation</a>

    </ul>

   </body>
</html>

$

I don't know whether this puts me closer or farther from my goal to get 
captions under pictures, tho.
-- 
Cal



------------------------------

Date: Mon, 26 Nov 2012 20:11:32 -0700
From: Cal Dershowitz <cal@example.invalid>
Subject: Re: using templates effectively  act Two scene 1
Message-Id: <nMydnS3Tm-b-sinNnZ2dnUVZ_oWdnZ2d@supernews.com>

On 11/26/2012 12:49 AM, Ben Morrow wrote:
>
> Quoth Cal Dershowitz <cal@example.invalid>:
>>
>> # main control
>> for my $name (@files) {
>>        print "name is $name\n";
>>        my ($ext) = $name =~ /([^.]*)$/;
>>        print "ext is $ext\n";
>>
>>        @matching = map /image_(\d+)\.$ext$/, @list;
>>        print "matching is @matching\n";
>>        push( @matching, 1 );
>>        @matching = sort { $a <=> $b } @matching;
>>        $winner = pop @matching;
>>        my $newnum    = $winner + 1;
>
> There is no need to redo the search through the list every time. If you
> move this code to a sub you can just remember the next number for each
> $ext, something like
>
>      my %next_for_ext;
>      sub next_for_ext {
>          my ($ext) = @_;
>
>          unless (exists $next_for_ext{$ext}) {
>              my @matching = map ... @list;
>              ...;
>              $next_for_ext{$ext} = pop @matching;
>          }
>
>          return ++$next_for_ext{$ext};
>      }
>
> then in the loop just call the sub
>
>      my $newnum = next_for_ext $ext;
>
> (and, of course, you don't then have to keep adding the new names to the
> list).
>

$ ./ftp4.pl my_ftp
 ...
name is target4/26-img_0004.jpg
ext is jpg
Can't locate object method "next_for_ext" via package "jpg" (perhaps you 
forgot to load "jpg"?) at ./ftp4.pl line 72.
$ cat ftp4.pl
#!/usr/bin/perl -w
use strict;
use 5.010;
use Net::FTP;

#identity and config
my $ident = 'my_ftp.txt';
my ( $config, $domain );
$config = do($ident);
unless ($config) {
     die("read error: $!")  if $!;
     die("parse error: $@") if $@;
}

$domain = $config->{ $ARGV[0] };
die("unknown domain: $ARGV[0]") unless $domain;

#preliminaries at top scope
my $word = "monday";
my %next_for_ext;


#dial up the server
my $ftp = Net::FTP->new( $domain->{domain}, Debug => 1, Passive => 1 )
   or die "Can't connect: $@\n";
$ftp->login( $domain->{username}, $domain->{password} )
   or die "Couldn't login\n";
$ftp->binary();

# get files from remote root that end in html:
my @remote_files = $ftp->ls();

# print "remote files are: @remote_files\n";
my @matching = map /${word}_(\d+)\.html/, @remote_files;
print "matching is @matching\n";
push( @matching, 0 );

@matching = sort { $a <=> $b } @matching;
my $winner    = pop @matching;
my $newnum1   = $winner + 1;
my $html_file = "${word}_$newnum1.html";
print "html file is  $html_file\n";

# create file for html stubouts
open( my $fh, '>', $html_file )
   or die("Can't open $html_file for writing: $!");
print $fh "<!DOCTYPE html>\n";
print $fh '<html lang="en">' . "\n";
print $fh "<head>\n";
print $fh '<meta charset="utf-8">' . "\n";
print $fh "<title>House Sale</title>\n";
print $fh "</head>\n";
print $fh "<body>\n";
print $fh "<h1>Kitchen Upgrade</h1>\n";
print $fh "<h2>stuff hitting the curb too</h2>\n";

# get files from Desktop/images/
my $path  = 'target4/';
my @files = <$path*>;
print "@files\n";

# get ls from remote image directory
$ftp->cwd('/images/') or die "cwd failed: ", $ftp->message;
my @list = $ftp->ls();

# main control
for my $name (@files) {
     print "name is $name\n";
     my ($ext) = $name =~ /([^.]*)$/;
     print "ext is $ext\n";

     my $newnum = next_for_ext $ext;
     my $new_file2 = "image_$newnum.$ext";
     print "newfile is $new_file2\n";
     $ftp->put( $name, $new_file2 ) or die "put failed: ", $ftp->message;
     push( @list, $new_file2 );
     # unlink($name);
     print $fh "<img src=\"/images/$new_file2\"/>\n\n";
     print $fh "<p>caption for $new_file2 <\/p>\n";
}
print $fh "</body>\n";
print $fh "</html>\n";
close $fh;
$ftp->cdup() or die "cdup failed: ", $ftp->message;
$ftp->put($html_file) or die "put failed: ", $ftp->message;

sub next_for_ext {
     my ($ext) = @_;

     unless ( exists $next_for_ext{$ext} ) {
         my @matching = map /image_(\d+)\.$ext$/, @list;
         @matching = sort { $a <=> $b } @matching;
         $next_for_ext{$ext} = pop @matching;
     }

     return ++$next_for_ext{$ext};
}
$

I'm really not getting this right now.  Am I correct to have the 
declaration of the hash at top scope?
-- 
Cal
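
The hash declaration at file scope is fine. The error message points at
the call instead: at line 72 the sub has not been seen yet, so perl
parses "next_for_ext $ext" as an indirect-object method call, that is
$ext->next_for_ext, and $ext holds "jpg". Calling the sub with
parentheses avoids this; so does moving the sub definition, or a forward
declaration "sub next_for_ext;", above the loop. A reduced sketch, with
a made-up @list standing in for the result of $ftp->ls():

```perl
use strict;
use warnings;

my @list = qw(image_1.jpg image_3.jpg image_2.png);   # stand-in for $ftp->ls()
my %next_for_ext;   # file-scoped cache, declared before the sub that uses it

for my $name (qw(a.jpg b.jpg c.png d.gif)) {
    my ($ext) = $name =~ /([^.]*)$/;

    # Parentheses make this a plain function call even though the
    # sub is defined further down the file.
    my $newnum = next_for_ext($ext);
    print "image_$newnum.$ext\n";
}

sub next_for_ext {
    my ($ext) = @_;

    unless (exists $next_for_ext{$ext}) {
        # Highest number already used for this extension, or 0 if none.
        my @matching = sort { $a <=> $b } map { /^image_(\d+)\.\Q$ext\E$/ } @list;
        $next_for_ext{$ext} = @matching ? $matching[-1] : 0;
    }

    return ++$next_for_ext{$ext};
}
```

With this sample list the loop prints image_4.jpg, image_5.jpg,
image_3.png and image_1.gif.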



------------------------------

Date: Mon, 26 Nov 2012 12:46:31 +0000
From: Rainer Weikusat <rweikusat@mssgmbh.com>
Subject: Re: UTF-8 read & print?
Message-Id: <87mwy47crc.fsf@sapphire.mobileactivedefense.com>

Ben Morrow <ben@morrow.me.uk> writes:
> Quoth Tuxedo <tuxedo@mailinator.com>:

[...]

> If you're just copying a file, it's better to do it in blocks than
> line-by-line.
>
>     local $/ = \4096;
>     while (...) { ... }

As soon as an application starts to do any explicit buffer management,
using the supposedly transparent buffer management embedded in the
buffered I/O subsystem is not only pointless but actually a bad idea
(one would assume that it should be self-evident that reading data
into a buffer of size x, copying it into a buffer of size y, copying
it into another buffer of size x and finally 'writing' it out isn't a
particularly sensible thing to do ...)

NB: It is interesting to observe the effect of using a larger buffer
size. For the test I made, 8192 seemed to be the best choice and this
improves the 'blocks' version significantly but the fread version only
marginally (in the first case, the speed increase was 34% of the
slower speed, for the second, it was only 6%).

---------
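use strict;
use warnings;
use Benchmark;

# Run with STDIN redirected from a regular file: each test seeks back
# to the start of the input before copying it.
# Output is discarded so that only read and write overhead is measured.
open(my $out, '>', '/dev/null') or die "Can't open /dev/null: $!";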

timethese(-5,
	  {
	   lines => sub {
	       my $line;
	       
	       seek(STDIN, 0, 0);
	       print $out ($line) while $line = <>;
	   },
	   
	   fread => sub {
	       my $block;
	       local $/ = \4096;
	       
	       seek(STDIN, 0, 0);
	       print $out ($block) while $block = <>;
	   },

	   blocks => sub {
	       my $block;

	       seek(STDIN, 0, 0);
	       syswrite($out, $block) while sysread(STDIN, $block, 4096);
	  }});
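
Outside a benchmark, the sysread/syswrite loop usually wants error
checks and has to allow for syswrite writing fewer bytes than it was
given. A sketch of such a copy routine follows; the sub name
copy_blocks, the 8192-byte default and the temporary-file demo are
illustrative choices, not part of the benchmark above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Copy everything from $in to $out with unbuffered reads and writes.
# Returns the number of bytes copied.
sub copy_blocks {
    my ($in, $out, $size) = @_;
    $size ||= 8192;
    my $total = 0;
    while (1) {
        my $got = sysread($in, my $block, $size);
        die "read failed: $!" unless defined $got;
        last if $got == 0;                   # end of file
        my $off = 0;
        while ($off < $got) {                # syswrite may write less than asked
            my $put = syswrite($out, $block, $got - $off, $off);
            die "write failed: $!" unless defined $put;
            $off += $put;
        }
        $total += $got;
    }
    return $total;
}

# Demo on temporary files, which are removed when the program exits.
my ($src_fh, $src) = tempfile(UNLINK => 1);
my ($dst_fh, $dst) = tempfile(UNLINK => 1);
binmode $_ for $src_fh, $dst_fh;
print {$src_fh} 'x' x 20000;
close $src_fh or die "Can't close $src: $!";

open my $in, '<:raw', $src or die "Can't read $src: $!";
my $copied = copy_blocks($in, $dst_fh, 8192);
close $in;
print "copied $copied bytes\n";
```

For a plain file-to-file copy, the core File::Copy module already does
this job.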


------------------------------

Date: Mon, 26 Nov 2012 19:50:49 +0100
From: Tuxedo <tuxedo@mailinator.com>
Subject: Re: UTF-8 read & print?
Message-Id: <k90dma$oi$1@news.albasani.net>

Helmut Richter wrote:

> On Sun, 25 Nov 2012, Tuxedo wrote:

[...]

> So you read the demo file and print it out again. If you print it to a
> file, why not do a diff of the two files and see what has changed, if
> anything? If the printing goes to HTTP output, why not give us the URL so
> that we all can see whether your server serves exactly the same text as
> the URL you gave us. We can hardly guess what happens when we are denied
> access to the difference of the two versions.

No denial intended. I have no online version, although you are right, a 
header sent by different servers may vary, for example. I'm just trying to gain 
a better understanding of the various issues in submitting, writing, 
reading and printing UTF-8 and have some difficulty doing all of that in 
my localhost environment. However, I now understand that at least the most 
basic part is to set the charset. Thereafter, I'm not sure if encoding and 
decoding user input is always necessary, at least not for simply echoing 
some UTF-8 user input for example. For this, the below seems to work Ok: 

use strict;
use warnings;
use CGI ':standard';

print header(-charset => 'UTF-8'),
start_html,
start_form,
textfield('unicode'),
submit,
end_form;

print param('unicode');
print end_html;
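
Where decoding starts to matter is as soon as the program looks inside
the text instead of passing it through. The sketch below does not use
CGI at all: the byte string stands in for a submitted form value, and
escape_html is a deliberately small helper, not a library function. It
also shows the escaping step that an echo of user input into HTML
needs.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Minimal HTML escaping for text content.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Hypothetical form value as raw UTF-8 bytes: "Zoe <3" with e-diaeresis.
my $raw  = "Zo\xC3\xAB <3";
my $text = decode('UTF-8', $raw);            # bytes -> characters

# The undecoded value counts bytes, the decoded one counts characters.
printf "%d bytes, %d characters\n", length($raw), length($text);

my $safe = escape_html($text);
print encode('UTF-8', "$safe\n");            # characters -> bytes for output
```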




------------------------------

Date: Mon, 26 Nov 2012 20:31:42 +0100
From: Tuxedo <tuxedo@mailinator.com>
Subject: Re: UTF-8 read & print?
Message-Id: <k90g2u$7bt$1@news.albasani.net>

Ben Morrow wrote:

> 
> Quoth Tuxedo <tuxedo@mailinator.com>:
> > In reading and printing a file that may contain UTF-8 characters and
> > print it into a web browser, my first attempt is:
> > 
> > #!/usr/bin/perl -w
> 
> You don't need -w if you use warnings.
> 
> > 
> > use warnings;
> > use strict;
> > use CGI qw(:standard);
> > 
> > print "Content-type: text/plain; charset=UTF-8\n\n";
> > 
> > open my $fh, "<:encoding(UTF-8)", 'UTF-8-demo.txt';
> > binmode STDOUT, ':utf-8';
> 
>     binmode STDOUT, ':utf8';
> 
> You should have got a warning about this. If you had been using autodie,
> you would have got an error (which is better, IMHO).
> 
> > while (my $line = <$fh>) {
> > print $line;
> > }
> 
> If you're just copying a file, it's better to do it in blocks than
> line-by-line.
> 
>     local $/ = \4096;
>     while (...) { ... }
> 
> Ben
> 

Thanks for these comments. I must have misunderstood utf-8 vs. utf8, 
thinking utf-8 caters to a broader spectrum of unicode charsets. I don't 
know what I'm doing with the file yet, as I'm just learning by testing.
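
For the record, a sketch of the two layer spellings that do exist:
":utf8" marks the handle as UTF-8 without strict validation, while
":encoding(UTF-8)" goes through the Encode machinery and checks the
data. The example reads from and writes to in-memory files so that it
runs without UTF-8-demo.txt; the sample text is made up.

```perl
use strict;
use warnings;

# UTF-8 encoded bytes for "naive cafe" with i-diaeresis and e-acute.
my $bytes = "na\xC3\xAFve caf\xC3\xA9\n";

# Reading through a decoding layer yields characters, not bytes.
open my $in, '<:encoding(UTF-8)', \$bytes or die "Can't open for reading: $!";
my $line = <$in>;
close $in;
chomp $line;
print length($line), " characters read\n";

# Writing through an encoding layer turns characters back into UTF-8 bytes.
my $out_bytes = '';
open my $out, '>:encoding(UTF-8)', \$out_bytes or die "Can't open for writing: $!";
print {$out} "$line\n";
close $out or die "Can't close: $!";
print length($out_bytes), " bytes written\n";
```

The same layer can be applied to an already open handle, for example
binmode STDOUT, ':encoding(UTF-8)'.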

I will look into autodie as well as skip the -w flag from now on.

Tuxedo 


------------------------------

Date: Mon, 26 Nov 2012 19:38:01 +0000
From: Rainer Weikusat <rweikusat@mssgmbh.com>
Subject: Re: UTF-8 read & print?
Message-Id: <87obikdujq.fsf@sapphire.mobileactivedefense.com>

Tuxedo <tuxedo@mailinator.com> writes:
> Helmut Richter wrote:
>
>> On Sun, 25 Nov 2012, Tuxedo wrote:
>
> [...]
>
>> So you read the demo file and print it out again. If you print it to a
>> file, why not do a diff of the two files and see what has changed, if
>> anything? If the printing goes to HTTP output, why not give us the URL so
>> that we all can see whether your server serves exactly the same text as
>> the URL you gave us. We can hardly guess what happens when we are denied
>> access to the difference of the two versions.
>
> No denial intended. I have no online version, although you are right, a 
> header sent by different servers may vary, for example. I'm just trying to gain 
> a better understanding of the various issues in submitting, writing, 
> reading and printing UTF-8 and have some difficulty doing all of that in 
> my localhost environment. However, I now understand that at least the most 
> basic part is to set the charset. Thereafter, I'm not sure if encoding and 
> decoding user input is always necessary, at least not for simply echoing 
> some UTF-8 user input for example.

Practically, encoding or decoding UTF-8 explicitly is not necessary
because perl was designed to work with UTF-8 encoded Unicode strings
which are supposed to be decoded (and possibly, re-encoded) when and
if this has to be done because of a processing step which needs
this. Theoretically, this is considered to be too difficult to
implement correctly and hence, users of the language are encouraged to
behave as if Perl wasn't capable of working with UTF-8 and always use
the three pass algorithm:

1. Decode all of the input into some internal representation the
   processing code can work with.
2. Perform whatever processing is necessary.
3. Re-encode all of the processed data into whatever output format
   happens to be desired.

The plan9 paper on UTF-8 support contains the following, nice
statement:

	To decide whether to compute using runes or UTF-encoded byte
	strings requires balancing the cost of converting the data
	when read and written against the cost of converting relevant
	text on demand. For programs such as editors that run a long
	time with a relatively constant dataset, runes are the better
	choice.
        
        http://plan9.bell-labs.com/sys/doc/utf.html

Since most Perl programs run a relatively short time with a highly
variable data set, the statement above suggests that the
implementation choice to do on-demand decoding was sensible. Eg, let's
assume someone is using some Perl code to do log file analysis. Log
files are often big and since this will usually involve doing regexp
matches on all input lines, decoding the input while trying to match
the regexp in a single processing loop will possibly be a lot cheaper
than first decoding everything and then looking for matches: When a
line of input is discarded as not being of interest, the hitherto
undecoded remainder doesn't need to be touched anymore.
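
The same trade-off can be imitated at the application level. The
following is only a sketch of the idea, not a description of what perl
does internally: lines are tested as raw bytes against an ASCII-only
pattern and decoded only if they are kept. The sub name and the log
lines are invented for the example.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Return the decoded text of every raw line that matches $pattern.
# The pattern is pure ASCII, so it can be tested against undecoded bytes:
# UTF-8 never uses ASCII byte values inside a multi-byte sequence.
sub matching_lines {
    my ($pattern, @raw_lines) = @_;
    my @hits;
    for my $raw (@raw_lines) {
        next unless $raw =~ $pattern;         # cheap test on the bytes
        push @hits, decode('UTF-8', $raw);    # decode only what is kept
    }
    return @hits;
}

my @log = (
    "INFO  user=andr\xC3\xA9 login ok\n",
    "ERROR user=zo\xC3\xAB disk full\n",
    "INFO  user=bob logout\n",
);
my @errors = matching_lines(qr/^ERROR\b/, @log);
print scalar(@errors), " error line(s)\n";
```

Patterns that contain non-ASCII characters or character classes such as
\w need decoded input to give correct results, so this shortcut only
covers the simple case.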


------------------------------

Date: Mon, 26 Nov 2012 21:30:29 +0100
From: Tuxedo <tuxedo@mailinator.com>
Subject: Re: UTF-8 read & print?
Message-Id: <k90jh6$g9t$1@news.albasani.net>

Rainer Weikusat wrote:

> Tuxedo <tuxedo@mailinator.com> writes:
> > Helmut Richter wrote:
> >
> >> On Sun, 25 Nov 2012, Tuxedo wrote:
> >
> > [...]
> >
> >> So you read the demo file and print it out again. If you print it to a
> >> file, why not do a diff of the two files and see what has changed, if
> >> anything? If the printing goes to HTTP output, why not give us the URL
> >> so that we all can see whether your server serves exactly the same text
> >> as the URL you gave us. We can hardly guess what happens when we are
> >> denied access to the difference of the two versions.
> >
> > No denial intended. I have no online version, although you are right, a
> > header sent by different servers may vary, for example. I'm just trying
> > to gain a better understanding of the various issues in submitting,
> > writing, reading and printing UTF-8 and have some difficulty doing all
> > of that in my localhost environment. However, I now understand that at
> > least the most basic part is to set the charset. Thereafter, I'm not
> > sure if encoding and decoding user input is always necessary, at least
> > not for simply echoing some UTF-8 user input for example.
> 
> Practically, encoding or decoding UTF-8 explicitly is not necessary
> because perl was designed to work with UTF-8 encoded Unicode strings
> which are supposed to be decoded (and possibly, re-encoded) when and
> if this has to be done because of a processing step which needs
> this. Theoretically, this is considered to be too difficult to
> implement correctly and hence, users of the language are encouraged to
> behave as if Perl wasn't capable of working with UTF-8 and always use
> the three pass algorithm 1. Decode all of the input into some internal
> representation the processing code can work with. 2. Perform whatever
> processing is necessary. 3. Re-encode all of the processed data into
> whatever output format happens to be desired.
> 
> The plan9 paper on UTF-8 support contains the following, nice
> statement:
> 
> To decide whether to compute using runes or UTF-encoded byte
> strings requires balancing the cost of converting the data
> when read and written against the cost of converting relevant
> text on demand. For programs such as editors that run a long
> time with a relatively constant dataset, runes are the better
> choice.
>         
>         http://plan9.bell-labs.com/sys/doc/utf.html
> 
> Since most Perl programs run a relatively short time with a highly
> variable data set, the statement above suggests that the
> implementation choice to do on-demand decoding was sensible. Eg, let's
> assume someone is using some Perl code to do log file analysis. Log
> files are often big and since this will usually involve doing regexp
> matches on all input lines, decoding the input while trying to match
> the regexp in a single processing loop will possibly be a lot cheaper
> than first decoding everything and then looking for matches: When a
> line of input is discarded as not being of interest, the hitherto
> undecoded remainder doesn't need to be touched anymore.

Thanks for the intel including the plan9 link, adding to my must-read-about 
list of subjects....

Tuxedo



------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests. 

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 3825
***************************************

