[31522] in Perl-Users-Digest


Perl-Users Digest, Issue: 2781 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Jan 21 14:09:47 2010

Date: Thu, 21 Jan 2010 11:09:13 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Thu, 21 Jan 2010     Volume: 11 Number: 2781

Today's topics:
        BEGIN and lexicals <catebekensail@yahoo.com>
    Re: BEGIN and lexicals <marc.girod@gmail.com>
    Re: BEGIN and lexicals <cartercc@gmail.com>
    Re: BEGIN and lexicals <uri@StemSystems.com>
    Re: BEGIN and lexicals <uri@StemSystems.com>
    Re: BEGIN and lexicals <jurgenex@hotmail.com>
    Re: macros: return or exit <marc.girod@gmail.com>
    Re: macros: return or exit <uri@StemSystems.com>
    Re: macros: return or exit <marc.girod@gmail.com>
    Re: macros: return or exit <uri@StemSystems.com>
    Re: macros: return or exit <marc.girod@gmail.com>
        Modules for PDFs especially tables. <justin.0911@purestblue.com>
    Re: Modules for PDFs especially tables. <RedGrittyBrick@spamweary.invalid>
    Re: Modules for PDFs especially tables. <john@castleamber.com>
    Re: Strip control characters in a file <marc.girod@gmail.com>
    Re: Strip control characters in a file sln@netherlands.com
    Re: Strip control characters in a file sln@netherlands.com
    Re: Strip control characters in a file <RedGrittyBrick@spamweary.invalid>
    Re: Strip control characters in a file sln@netherlands.com
    Re: Strip control characters in a file <marc.girod@gmail.com>
    Re: Strip control characters in a file <marc.girod@gmail.com>
    Re: Strip control characters in a file sln@netherlands.com
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Thu, 21 Jan 2010 06:12:21 -0800 (PST)
From: cate <catebekensail@yahoo.com>
Subject: BEGIN and lexicals
Message-Id: <1373c121-018f-4f0c-9fb6-56e7fed5a4af@p24g2000yqm.googlegroups.com>

I have a large list of my vars that I would like to get out of the
way; place them at the end of the script - kinda a class thing.

Is there a way to use BEGIN some how?   Something like this.  I
suspect you can't, but I'm asking the pros.

use strict;
code using $var1 ...
code using $var1 ...
more code


BEGIN {
  my $var1 = 'sfsdf';
  my $var2 = 'sdfsdf';
}

thank you


------------------------------

Date: Thu, 21 Jan 2010 07:08:28 -0800 (PST)
From: Marc Girod <marc.girod@gmail.com>
Subject: Re: BEGIN and lexicals
Message-Id: <d623ba61-4fb5-4d65-850d-a1a27f0be233@j14g2000yqm.googlegroups.com>

On Jan 21, 2:12 pm, cate <catebekens...@yahoo.com> wrote:

> Is there a way to use BEGIN some how?   Something like this.  I
> suspect you can't

Their scope will be that of the BEGIN block.
You could do:

use vars qw($var1 $var2);

 ...

BEGIN {
  $var1 = 'sdfsdf';
  $var2 = 'sdfsd';
}

Now, should I say that I am not convinced it buys you much...

Marc


------------------------------

Date: Thu, 21 Jan 2010 08:02:51 -0800 (PST)
From: ccc31807 <cartercc@gmail.com>
Subject: Re: BEGIN and lexicals
Message-Id: <b9a0b88d-a07d-41f2-acc7-4b937cfd2a2a@r19g2000yqb.googlegroups.com>

On Jan 21, 9:12 am, cate <catebekens...@yahoo.com> wrote:
> I have a large list of my vars that I would like to get out of the
> way; place them at the end of the script - kinda a class thing.
>
> Is there a way to use BEGIN some how?   Something like this.  I
> suspect you can't, but I'm asking the pros.

Put them in a separate file, either an ordinary file or a PM.

If in an ordinary file, say vars.txt, like this:
$var1=GWashington
$var2=12
$var3=Washington, D.C.

You can do this in your script:
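my %vars;
open my $fh, '<', 'vars.txt' or die "Cannot open vars.txt: $!";
while (<$fh>) {
  chomp;
  my ($key, $val) = split /=/, $_, 2;  # split on the first '=' only
  $vars{$key} = $val;                  # keys keep the leading '$', e.g. '$var1'
}
close $fh;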

If in a Perl module, say VARS.pm, use them like this in your script:

use VARS;
print $VARS::var1; # prints GWashington
my $sum = $VARS::var2 + 3; # $sum is 15

In package VARS declare your variables with our.
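
A minimal sketch of what that could look like, with the package inlined so that it runs as a single file. In practice the block would live in VARS.pm and end with a true value (1;). The variable names and values are just the ones from the example above.

```perl
use strict;
use warnings;

# What would normally be the body of VARS.pm, inlined for illustration.
{
    package VARS;
    our $var1 = 'GWashington';   # 'our' makes these package variables,
    our $var2 = 12;              # reachable as $VARS::var1, $VARS::var2
}

# The calling script refers to them by their fully qualified names.
my $name = $VARS::var1;
my $sum  = $VARS::var2 + 3;
print "$name $sum\n";
```

Note that these are package globals, so any code in the program can read or change them.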

CC.


------------------------------

Date: Thu, 21 Jan 2010 11:38:28 -0500
From: "Uri Guttman" <uri@StemSystems.com>
Subject: Re: BEGIN and lexicals
Message-Id: <87aaw7qupn.fsf@quad.sysarch.com>

>>>>> "MG" == Marc Girod <marc.girod@gmail.com> writes:

  MG> On Jan 21, 2:12 pm, cate <catebekens...@yahoo.com> wrote:
  >> Is there a way to use BEGIN some how?   Something like this.  I
  >> suspect you can't

  MG> Their scope will be this of the BEGIN block.
  MG> You could do:

  MG> use vars qw($var1 $var2);

  MG> ...

  MG> BEGIN {
  MG>   $var1 = 'sdfsdf';
  MG>   $var2 = 'sdfsd';
  MG> }

  MG> Now, should I say that I am not convinced it buys you much...

it buys you the loss of lexicals. those are now package globals and can
be accessed from anywhere in the program.
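
to make that concrete, here is a small sketch. the package Other and its subs are made up for illustration; the point is only that code in any package can read and overwrite a variable declared with use vars.

```perl
use strict;
use warnings;
use vars qw($var1);

# A hypothetical, unrelated package that reaches into main's global.
{
    package Other;
    sub peek { return $main::var1 }
    sub poke { $main::var1 = 'changed elsewhere' }
}

my $before = Other::peek();   # sees the value set in BEGIN below
Other::poke();                # silently overwrites it
my $after  = $var1;
print "before: $before\nafter:  $after\n";

# Runs at compile time, so the value is set before the code above executes.
BEGIN { $var1 = 'sdfsdf' }
```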

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  --------  http://www.sysarch.com --
-----  Perl Code Review , Architecture, Development, Training, Support ------
---------  Gourmet Hot Cocoa Mix  ----  http://bestfriendscocoa.com ---------


------------------------------

Date: Thu, 21 Jan 2010 11:45:06 -0500
From: "Uri Guttman" <uri@StemSystems.com>
Subject: Re: BEGIN and lexicals
Message-Id: <87636vquel.fsf@quad.sysarch.com>

>>>>> "c" == cate  <catebekensail@yahoo.com> writes:

  c> I have a large list of my vars that I would like to get out of the
  c> way; place them at the end of the script - kinda a class thing.

  c> Is there a way to use BEGIN some how?   Something like this.  I
  c> suspect you can't, but I'm asking the pros.

  c> use strict;
  c> code using $var1 ...
  c> code using $var1 ...
  c> more code


  c> BEGIN {
  c>   my $var1 = 'sfsdf';
  c>   my $var2 = 'sdfsdf';
  c> }

the lexicals will be scoped only to the BEGIN block so they won't be
seen by the rest of the code. but needing to declare a mess of lexicals
tells me you have a weak design for this program. they are effectively
file globals and needing many globals is a poor design. try declaring
them in tighter scopes where they are just needed. use subs to organize
mainline code into smaller scopes where you can declare lexicals you
only need there. there should be almost no mainline code (code outside
subs) in any decent sized script. this will help with flow control,
understanding the code, maintaining it, etc. if you need a long flow,
still break it up into subs and call them from higher level subs. and do
that again if you have long higher level subs.

another solution is to use a single lexical hash with many/most of your
lexical data. it may need you to rewrite code that refers to them but
that can be done quickly with a search/replace edit call. then you declare
the lexical hash at the top and initialize it in the BEGIN at the
bottom.
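
a minimal sketch of that layout, assuming the hash is called %cfg (a made-up name) and reusing the values from the original post:

```perl
use strict;
use warnings;

# Declared at the top so that all code below, including the BEGIN
# block at the bottom, can see it.
my %cfg;

# Mainline code: by the time this runs, BEGIN has already filled %cfg.
print "var1 is $cfg{var1}\n";
print "var2 is $cfg{var2}\n";

# BEGIN runs at compile time, as soon as it has been parsed, even
# though it sits at the end of the file.
BEGIN {
    %cfg = (
        var1 => 'sfsdf',
        var2 => 'sdfsdf',
    );
}
```

this relies on the declaration being a plain "my %cfg;". writing "my %cfg = ();" at the top would empty the hash again at run time.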

uri



------------------------------

Date: Thu, 21 Jan 2010 09:43:03 -0800
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: BEGIN and lexicals
Message-Id: <n44hl5dgq0vkgm59qg8br0gpln45o8smrs@4ax.com>

cate <catebekensail@yahoo.com> wrote:
>I have a large list of my vars that I would like to get out of the
>way; place them at the end of the script - kinda a class thing.

As a general rule you should try to avoid global variables, they are
rarely necessary. And a large number of global variables usually
indicates poor design of the algorithm or the data structure. 

Instead of trying to hide the variables I would rather investigate how
to improve my code or data structure and eliminate them.  

jue


------------------------------

Date: Thu, 21 Jan 2010 07:01:40 -0800 (PST)
From: Marc Girod <marc.girod@gmail.com>
Subject: Re: macros: return or exit
Message-Id: <15cb6a0a-7270-421a-b942-6014dedfce31@m25g2000yqc.googlegroups.com>

Thanks Ben,

On Jan 21, 1:14 pm, Ben Morrow <b...@morrow.me.uk> wrote:

> It's not clear what procedure you're trying to follow here. Are you
> going to create your own fork of the module, and possibly try to feed
> the changes upstream, or are you trying to find a way to avoid that?

It is not clear to me.
I'll make my choice based on the options.
If I maintain a branch, I won't publish it to CPAN: I'll keep it in my
Google Code site.
Until I have something good enough to convince the author?
Earlier if the solution is not invasive?
Never if it ends up shooting the performance down or similar.

> The syntax $cmd->(@ARGV) is usually clearer, and generalises better to
> situations like $cmd{$which}->(@ARGV).

Thanks. I agree.
I'll remember it if I push a change to the base module.

> One option is to accept that the module you are using will terminate the
> process, and fork before calling it in order to retain control. You will
> need to think carefully about whether the exit calls are going to do
> anything unpleasant to your filehandle buffers.

I don't think it is an option: I already have one fork, and I'd very
much like the forked process to remain a single shared background
server.
And it is a proprietary binary, to which I talk via stdin/stdout (I
mean that it won't 'select').

> Perl doesn't have macros, I'm afraid.

This far, I knew...

> If you can afford to maintain a fork of the module,
> a simple search-and-replace in your text editor
> would seem the easiest option.

Many nasty merges for every change I'll do...
Or generate the branched version every time? Not impossible...
But double testing... double environment, double installs...

> Otherwise, one nasty option (and you're pretty much into nasty options
> if you need to wrap a module with such an unpleasant interface) would be
> to use the (undocumented and unexported) Want::double_return function
> provided by the Want module. This sets up the next return to return
> 'twice', so you could do something like

Should I say that this is exactly the kind of answer I was hoping for?
I have to investigate it a bit.
It even seems like an example for making my own, if I could not use
this one precisely...

Thanks again.
Marc


------------------------------

Date: Thu, 21 Jan 2010 11:37:02 -0500
From: "Uri Guttman" <uri@StemSystems.com>
Subject: Re: macros: return or exit
Message-Id: <87eiljqus1.fsf@quad.sysarch.com>

>>>>> "MG" == Marc Girod <marc.girod@gmail.com> writes:

  MG> And it is a proprietary binary, to which I talk via stdin/stdout (I
  MG> mean that it won't 'select').

huh? do you mean select as in the single arg select, which sets the
handle print uses by default? or 4 arg select which multiplexes handles
for i/o? if you mean the 4 arg, you can definitely use that on the
subproc's stdio as that is a common way to manage i/o from a forked process.
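
a rough sketch of the 4 arg approach, using the core IO::Select wrapper around select and IPC::Open2 to start the subprocess. the child here is only a stand-in (a second perl that upper-cases its input), not the real binary, and the 5 second timeout is arbitrary.

```perl
use strict;
use warnings;
use IPC::Open2;
use IO::Select;

# Hypothetical stand-in for the background binary: a child perl that
# upper-cases every line it reads and writes it back unbuffered.
my @cmd = ($^X, '-e', '$| = 1; print uc($_) while <STDIN>');

my ($from_child, $to_child);
my $pid = open2($from_child, $to_child, @cmd);

# open2 turns on autoflush for the write handle, so this goes out at once.
print {$to_child} "hello\n";

# IO::Select wraps 4-arg select: wait up to 5 seconds for output.
my $sel   = IO::Select->new($from_child);
my $reply = '';
if ($sel->can_read(5)) {
    # sysread rather than <$from_child>: buffered reads and select
    # do not mix well.
    sysread($from_child, $reply, 4096);
}

close $to_child;      # EOF on the child's stdin ends its loop
waitpid($pid, 0);
print "child said: $reply";
```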

uri



------------------------------

Date: Thu, 21 Jan 2010 10:31:01 -0800 (PST)
From: Marc Girod <marc.girod@gmail.com>
Subject: Re: macros: return or exit
Message-Id: <c4b1b219-c891-4837-9a3d-fa8a8a85fde3@u41g2000yqe.googlegroups.com>

On Jan 21, 4:37 pm, "Uri Guttman" <u...@StemSystems.com> wrote:

> huh?

Sorry for the confusion.
The problem is probably that I didn't explain what binary I was
talking about.
I don't know how to make this story short, and it goes beyond our
scope.
If you are interested, you can read the docs on CPAN, for the modules
I mentioned, plus ClearCase::Argv which they all use.
But I doubt it is worth your interest unless you may try them, and
this depends on your having access to a ClearCase installation...

Marc


------------------------------

Date: Thu, 21 Jan 2010 13:41:01 -0500
From: "Uri Guttman" <uri@StemSystems.com>
Subject: Re: macros: return or exit
Message-Id: <87vdevnvwi.fsf@quad.sysarch.com>

>>>>> "MG" == Marc Girod <marc.girod@gmail.com> writes:

  MG> On Jan 21, 4:37 pm, "Uri Guttman" <u...@StemSystems.com> wrote:
  >> huh?

  MG> Sorry for the confusion.
  MG> The problem is probably that I didn't explain what binary I was
  MG> talking about.

which binary it is is irrelevant. the issue is how do you manage its
i/o via its stdio. that is the same problem for any binary subprocess
you want to run with its stdio. now you never answered the question
about which select function you meant.

  MG> I don't know how to make this story short, and it goes beyond our
  MG> scope.  If you are interested, you can read the docs on CPAN, for
  MG> the modules I mentioned, plus ClearCase::Argv which they all use.
  MG> But I doubt it is worth your interest unless you may try them, and
  MG> this depends on your having access to a ClearCase installation...

no chance in hell of my having clearcase. :)

uri



------------------------------

Date: Thu, 21 Jan 2010 11:08:35 -0800 (PST)
From: Marc Girod <marc.girod@gmail.com>
Subject: Re: macros: return or exit
Message-Id: <e8471e1f-d1d3-4d07-b398-cc51dfb440c0@c34g2000yqn.googlegroups.com>

On Jan 21, 6:41 pm, "Uri Guttman" <u...@StemSystems.com> wrote:

> it is irrelevant about which binary. the issue is how do you manage its
> i/o via its stdio. that is the same problem for any binary subprocess
> you want to run with its stdio. now you never answered the question
> about which select function you meant.

The C function that the cleartool binary does not use.

I was not saying that my code does select in any way. I meant that
cleartool is *not* doing select.
It may be me who does not understand...
If I fork part of my script, there will be two instances sharing the
same background process.
I cannot see this work.

Does it make better sense now?

Thanks,
Marc


------------------------------

Date: Thu, 21 Jan 2010 16:46:59 -0000
From: Justin C <justin.0911@purestblue.com>
Subject: Modules for PDFs especially tables.
Message-Id: <45d2.4b588503.708ce@zem>

I want to produce a PDF containing two tables side by side. Each 51
rows (including a header) by three columns. Columns 1 and 2 in each 
table are to contain centred text, and column three is to be
left-aligned.

I've been experimenting with PDF::API2, and PDF::Table, but PDF::Table
doesn't appear to do centred text, and PDF::API2 is hard work - the
documentation leaves a lot to be desired, for example, surfing the web
for hints on using PDF::API2 I find references to methods not mentioned
in the PDF::API2 documentation.

Does anyone have any suggestions on how I might proceed? TeX looks like
it might be the best way forward now, I have Lamport's LaTeX book here
so can pull together the relevant TeX/LaTeX commands. I suppose that, if
I can knock up what I want in TeX to start with I won't even need a TeX
module, I can just use some templating.

I'll probably still need LaTeX::Driver to get my PDF. 
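
As a rough sketch of the templating idea: plain Perl that builds the LaTeX source for two tabular environments set side by side, each with two centred columns and one left-aligned column (column spec "ccl"). The row data and column headings are invented, no escaping of LaTeX special characters is attempted, and the page geometry or font size would probably need tuning to fit 51 rows on one page.

```perl
use strict;
use warnings;

# Hypothetical data: 50 body rows per table, plus a header row each.
my @left  = map { [ $_, "L$_", "left item $_"  ] } 1 .. 50;
my @right = map { [ $_, "R$_", "right item $_" ] } 1 .. 50;

# Build one tabular: columns 1 and 2 centred (c), column 3 left-aligned (l).
sub tabular {
    my (@rows) = @_;
    my $body = join '', map { join(' & ', @$_) . " \\\\\n" } @rows;
    return "\\begin{tabular}{ccl}\n"
         . "No. & Code & Description \\\\ \\hline\n"
         . $body
         . "\\end{tabular}";
}

# Two tabulars in the same paragraph sit next to each other;
# \hfill pushes them to opposite margins.
my $tex = "\\documentclass{article}\n"
        . "\\begin{document}\n"
        . "\\noindent\n"
        . tabular(@left) . "\n\\hfill\n" . tabular(@right) . "\n"
        . "\\end{document}\n";

print $tex;
```

The output can be redirected to a .tex file and run through pdflatex, or handed to whichever driver module ends up being used.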

Thank you for any suggestions.

	Justin.

-- 
Justin C, by the sea.


------------------------------

Date: Thu, 21 Jan 2010 18:54:15 +0000
From: RedGrittyBrick <RedGrittyBrick@spamweary.invalid>
Subject: Re: Modules for PDFs especially tables.
Message-Id: <4b58a2d9$0$2531$da0feed9@news.zen.co.uk>


Justin C wrote:
> I want to produce a PDF containing two tables side by side. Each 51
> rows (including a header) by three columns. Columns 1 and 2 in each 
> table are to contain centred text, and column three is to be
> left-aligned.
> 
> I've been experimenting with PDF::API2, and PDF::Table, but PDF::Table
> doesn't appear to do centred text, and PDF::API2 is hard work - the
> documentation leaves a lot to be desired, for example, surfing the web
> for hints on using PDF::API2 I find references to methods not mentioned
> in the PDF::API2 documentation.
> 
> Does anyone have any suggestions on how I might proceed? TeX looks like
> it might be the best way forward now, I have Lamport's LaTeX book here
> so can pull together the relevant TeX/LaTeX commands. I suppose that, if
> I can knock up what I want in TeX to start with I won't even need a TeX
> module, I can just use some templating.
> 

Since I know PS better than TeX, I'd just print PS statements directly 
and use GS to convert that to PDF.

--------------8<----------------
#!/usr/bin/perl
use strict;
use warnings;

print <<EndPS;
%!PS
/Times-Roman 12 selectfont
100 700 moveto
(Hello) show
showpage

EndPS
--------------8<----------------
Untested - caveat emptor. I have PS boilerplate for centering text etc.
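
For the centring itself, a common PostScript idiom is to move to the centre point, measure the string with stringwidth, and step back by half its width before showing it. The sketch below generates such statements from Perl; the sub names, coordinates and sample rows are all made up for illustration, and the result has not been tuned for a real 51-row layout.

```perl
use strict;
use warnings;

# Escape the characters that are special inside a PostScript (string).
sub ps_escape {
    my ($s) = @_;
    $s =~ s/([\\()])/\\$1/g;
    return $s;
}

# Text centred horizontally on $x_centre.
sub centred_cell {
    my ($x_centre, $y, $text) = @_;
    return sprintf "%d %d moveto (%s) dup stringwidth pop 2 div neg 0 rmoveto show\n",
        $x_centre, $y, ps_escape($text);
}

# Text starting at $x.
sub left_cell {
    my ($x, $y, $text) = @_;
    return sprintf "%d %d moveto (%s) show\n", $x, $y, ps_escape($text);
}

# Hypothetical sample rows: two centred columns, one left-aligned.
my @rows = ( [ '1', 'A', 'first item' ], [ '2', 'B', 'second (item)' ] );

my $ps = "%!PS\n/Times-Roman 12 selectfont\n";
my $y  = 700;
for my $row (@rows) {
    $ps .= centred_cell(100, $y, $row->[0]);
    $ps .= centred_cell(160, $y, $row->[1]);
    $ps .= left_cell(200, $y, $row->[2]);
    $y -= 14;   # next line down
}
$ps .= "showpage\n";

print $ps;
```

Saved as table.ps, the output could then be converted with something like: gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=table.pdf table.ps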

-- 
RGB


------------------------------

Date: Thu, 21 Jan 2010 12:57:22 -0600
From: John Bokma <john@castleamber.com>
Subject: Re: Modules for PDFs especially tables.
Message-Id: <87vdevffql.fsf@castleamber.com>

Justin C <justin.0911@purestblue.com> writes:

> I want to produce a PDF containing two tables side by side. Each 51
> rows (including a header) by three columns. Columns 1 and 2 in each 
> table are to contain centred text, and column three is to be
> left-aligned.
>
> I've been experimenting with PDF::API2, and PDF::Table, but PDF::Table
> doesn't appear to do centred text, and PDF::API2 is hard work - the
> documentation leaves a lot to be desired, for example, surfing the web
> for hints on using PDF::API2 I find references to methods not mentioned
> in the PDF::API2 documentation.
>
> Does anyone have any suggestions on how I might proceed? TeX looks like
> it might be the best way forward now, I have Lamport's LaTeX book here
> so can pull together the relevant TeX/LaTeX commands. I suppose that, if
> I can knock up what I want in TeX to start with I won't even need a TeX
> module, I can just use some templating.

For a project that required XML invoices to be turned into PDF or TIFF I
used two external (to the Perl program this was part of) programs:
AltovaXML (free, closed source) and Apache FOP.  The former because it
was the only free software (albeit closed) I could find that supports
XSLT 2, and Apache FOP to convert XSL-FO to PDF (or TIFF, or several
other formats)

Tables are supported that way (and quite easy IMO).

-- 
John Bokma                                                               j3b

Hacking & Hiking in Mexico -  http://johnbokma.com/
http://castleamber.com/ - Perl & Python Development


------------------------------

Date: Thu, 21 Jan 2010 06:23:03 -0800 (PST)
From: Marc Girod <marc.girod@gmail.com>
Subject: Re: Strip control characters in a file
Message-Id: <3d3979e9-5adb-4159-9e63-6833b7ce6aaf@p24g2000yqm.googlegroups.com>

On Jan 21, 9:28 am, "Uri Guttman" <u...@StemSystems.com> wrote:

> use tr///. untested (need to check the chars):

> perl -pi2 -e 'tr/0x00-0x090x11-0x1f//d'

Thanks Uri.
I couldn't make tr work with hexadecimal...
But it grokked octal very nicely:

perl -pi2 -e 'tr/\000-\011\013-\037\177//d'
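
For the record, tr does accept hexadecimal when it is written as \xHH escapes (the 0x00 notation is just taken as literal characters). Note also that the octal range \000-\011 above includes the tab (octal 011). A small sketch of a hex variant that keeps tab and newline, with a made-up test string:

```perl
use strict;
use warnings;

# Sample text with NUL, tab, unit separator (0x1F), DEL and a newline.
my $text = "a\x00b\tc\x1Fd\x7Fe\n";

# Delete 0x00-0x08, 0x0B-0x1F and 0x7F; tab (0x09) and newline (0x0A) stay.
(my $clean = $text) =~ tr/\x00-\x08\x0B-\x1F\x7F//d;

print $clean eq "ab\tcde\n" ? "ok\n" : "not ok\n";
```

As a one-liner that would be: perl -pi2 -e 'tr/\x00-\x08\x0B-\x1F\x7F//d' /tmp/fff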

Marc


------------------------------

Date: Thu, 21 Jan 2010 08:22:46 -0800
From: sln@netherlands.com
Subject: Re: Strip control characters in a file
Message-Id: <3kvgl55tgmubhkmnbd7rn5ekj0tl7003m7@4ax.com>

On Thu, 21 Jan 2010 01:20:51 -0800 (PST), Marc Girod <marc.girod@gmail.com> wrote:

>Hello,
>
>I was asked a way to strip control characters from a text file.
>Soon, it became clear that newlines must be kept, as well as
>(probably) tabs.
>The context was however unix only.
>
>Inspired in part by recent posts in this group, I came up with the
>following one-liner:
>
>perl -pi2 -e 'BEGIN{$rep{chr($_)}=q() for 0..31,127;$rep{chr(10)}=chr
>(10)}s/([[:cntrl:]])/$rep{$1}/g' /tmp/fff
>
>...assuming the file was /tmp/fff, and keeping a backup of it.
>
>I would now humbly turn to you for critique and improvements.
>
>Thanks,
>marc

Or, you could use that newfangled \K thing:

  perl -pi2 -e 's/(?:\t|\n)\K|([[:cntrl:]])//g'  filename

As a program:

  use strict;
  use warnings;
  use 5.010;   # \K needs Perl 5.10 or newer

  $_ = " bs: '\x{08}', tab: '\t', cr: '\x{0d}', zero: '\x{00}', newline: '\n', 127: '\x{7f}'";
  s/(?:\t|\n)\K|([[:cntrl:]])//g;
  print "$_\n\n";

Output:
 bs: '', tab: ' ', cr: '', zero: '', newline: '
', 127: ''

-sln


------------------------------

Date: Thu, 21 Jan 2010 08:25:54 -0800
From: sln@netherlands.com
Subject: Re: Strip control characters in a file
Message-Id: <8uvgl55nvnhbutn672fmphj1ehdm4lh26g@4ax.com>

On Thu, 21 Jan 2010 08:22:46 -0800, sln@netherlands.com wrote:

>On Thu, 21 Jan 2010 01:20:51 -0800 (PST), Marc Girod <marc.girod@gmail.com> wrote:
>
>>Hello,
>>
>>I was asked a way to strip control characters from a text file.
>>Soon, it became clear that newlines must be kept, as well as
>>(probably) tabs.
>>The context was however unix only.
>>
>>Inspired in part by recent posts in this group, I came up with the
>>following one-liner:
>>
>>perl -pi2 -e 'BEGIN{$rep{chr($_)}=q() for 0..31,127;$rep{chr(10)}=chr
>>(10)}s/([[:cntrl:]])/$rep{$1}/g' /tmp/fff
>>
>>...assuming the file was /tmp/fff, and keeping a backup of it.
>>
>>I would now humbly turn to you for critique and improvements.
>>
>>Thanks,
>>marc
>
>Or, you could use than new fangle \K thing:
>
>  perl -pi2 -e 's/(?:\t|\n)\K|([[:cntrl:]])//g'  filename
                               ^
Don't even need the capture parentheses:
                 s/(?:\t|\n)\K|[[:cntrl:]]//g;
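
For a perl older than 5.10, where \K is not available, a negative lookahead is one way to get the same effect. This is only a sketch with a made-up test string:

```perl
use strict;
use warnings;

# Backspace, tab, carriage return, newline and DEL in one string.
my $text = "bs:\x08 tab:\t cr:\x0D nl:\n del:\x7F";

# Match a control character only if it is not a tab or a newline.
(my $clean = $text) =~ s/(?![\t\n])[[:cntrl:]]//g;

print $clean eq "bs: tab:\t cr: nl:\n del:" ? "ok\n" : "not ok\n";
```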

-sln


------------------------------

Date: Thu, 21 Jan 2010 16:26:49 +0000
From: RedGrittyBrick <RedGrittyBrick@spamweary.invalid>
Subject: Re: Strip control characters in a file
Message-Id: <4b58804b$0$2486$db0fefd9@news.zen.co.uk>


Marc Girod wrote:
> Hello,
> 
> I was asked a way to strip control characters from a text file.
> Soon, it became clear that newlines must be kept, as well as
> (probably) tabs.
> The context was however unix only.
> 
> Inspired in part by recent posts in this group, I came up with the
> following one-liner:
> 
> perl -pi2 -e 'BEGIN{$rep{chr($_)}=q() for 0..31,127;$rep{chr(10)}=chr
> (10)}s/([[:cntrl:]])/$rep{$1}/g' /tmp/fff
> 
> ...assuming the file was /tmp/fff, and keeping a backup of it.
> 
> I would now humbly turn to you for critique and improvements.

I guess you don't need to worry about character sets other than ASCII or 
its simpler supersets?

ISO-8859 has "control characters" assigned to 0x80-0x9F
Unicode has "control characters" assigned to U+0080 - U+009F and U+2029?
EBCDIC?
Others?

-- 
RGB


------------------------------

Date: Thu, 21 Jan 2010 09:02:29 -0800
From: sln@netherlands.com
Subject: Re: Strip control characters in a file
Message-Id: <ct1hl5dp8q42tps4ri7vff466kg92nha7o@4ax.com>

On Thu, 21 Jan 2010 16:26:49 +0000, RedGrittyBrick <RedGrittyBrick@spamweary.invalid> wrote:

>
>Marc Girod wrote:
>> Hello,
>> 
>> I was asked a way to strip control characters from a text file.
>> Soon, it became clear that newlines must be kept, as well as
>> (probably) tabs.
>> The context was however unix only.
>> 
>> Inspired in part by recent posts in this group, I came up with the
>> following one-liner:
>> 
>> perl -pi2 -e 'BEGIN{$rep{chr($_)}=q() for 0..31,127;$rep{chr(10)}=chr
>> (10)}s/([[:cntrl:]])/$rep{$1}/g' /tmp/fff
>> 
>> ...assuming the file was /tmp/fff, and keeping a backup of it.
>> 
>> I would now humbly turn to you for critique and improvements.
>
>I guess you don't need to worry about character sets other than ASCII or 
>it's simpler supersets?
>
>ISO-8859 has "control characters" assigned to 0x80-0x9F
>Unicode has "control characters" assigned to U+0080 - U+009F and U+2029?
>EBCDIC?
>Others?

Shouldn't [[:cntrl:]] read these as Unicode control chars?
Coerced to utf8:

$_ = " u1200: '\x{1200}', bs: '\x{08}', tab: '\t', cr: '\x{0d}', zero: '\x{00}', 81h: '\x{81}', newline: '
', 7fh: '\x{7f}', u009F: '\x{009F}', u2029: '\x{2029}' ";

s/(?:\t|\n)\K|[[:cntrl:]]//g;

binmode(STDOUT, ":encoding(UTF-8)");
print "$_\n\n";

Gives:
 u1200: 'ሀ', bs: '', tab: '   ', cr: '', zero: '', 81h: '', newline: '
', 7fh: '', u009F: '', u2029: 'GǬ'

All but U+2029 which seems kind of strange.
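
A likely explanation, offered as a sketch rather than a full diagnosis: Unicode puts U+2029 PARAGRAPH SEPARATOR in general category Zp (and U+2028 LINE SEPARATOR in Zl), not in Cc, and [[:cntrl:]] only covers Cc. The snippet below prints the category of a few sample characters and shows one possible variant of the substitution that removes the separators as well.

```perl
use strict;
use warnings;

my %char = (
    'U+0009 (tab)' => "\x{09}",
    'U+009F'       => "\x{9F}",
    'U+2028'       => "\x{2028}",
    'U+2029'       => "\x{2029}",
);

# Report which of the three general categories each character is in.
for my $name (sort keys %char) {
    my $c   = $char{$name};
    my $cat = $c =~ /\p{Cc}/ ? 'Cc (control)'
            : $c =~ /\p{Zl}/ ? 'Zl (line separator)'
            : $c =~ /\p{Zp}/ ? 'Zp (paragraph separator)'
            :                  'other';
    print "$name: $cat\n";
}

# Variant that strips controls and both separators,
# while still keeping tab and newline.
my $text = "a\x{9F}b\x{2029}c\td\n";
(my $clean = $text) =~ s/(?![\t\n])[\p{Cc}\p{Zl}\p{Zp}]//g;
```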

-sln


------------------------------

Date: Thu, 21 Jan 2010 10:03:31 -0800 (PST)
From: Marc Girod <marc.girod@gmail.com>
Subject: Re: Strip control characters in a file
Message-Id: <e944de92-6489-4ec1-92a3-233a36f3dba2@k17g2000yqh.googlegroups.com>

On Jan 21, 4:26 pm, RedGrittyBrick <RedGrittyBr...@spamweary.invalid>
wrote:

> I guess you don't need to worry about character sets other than ASCII or
> it's simpler supersets?

That was not in the request I got, no.

Marc


------------------------------

Date: Thu, 21 Jan 2010 10:04:12 -0800 (PST)
From: Marc Girod <marc.girod@gmail.com>
Subject: Re: Strip control characters in a file
Message-Id: <1f1a0a4b-db50-4e8f-9bb8-4185456ac1f3@c29g2000yqd.googlegroups.com>

On Jan 21, 4:22 pm, s...@netherlands.com wrote:

> Or, you could use than new fangle \K thing:

Thanks. Interesting: I had missed this completely.

Marc


------------------------------

Date: Thu, 21 Jan 2010 10:34:14 -0800
From: sln@netherlands.com
Subject: Re: Strip control characters in a file
Message-Id: <0s6hl51990l03osscgkttrc7on98tbp28g@4ax.com>

On Thu, 21 Jan 2010 10:03:31 -0800 (PST), Marc Girod <marc.girod@gmail.com> wrote:

>On Jan 21, 4:26 pm, RedGrittyBrick <RedGrittyBr...@spamweary.invalid>
>wrote:
>
>> I guess you don't need to worry about character sets other than ASCII or
>> it's simpler supersets?
>
>That was not in the request I got, no.
>
>Marc

Request or not, your first judgement to use '[[:cntrl:]]' was
correct because it generally recognises characterset control
characters on the host platform, and files of different encodings.
Since you don't care about encoding, just use the tr/// form and
deal with rewriting the whole thing when it fails on a Unicode file.

-sln


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests. 

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 2781
***************************************

