[28074] in Perl-Users-Digest

Perl-Users Digest, Issue: 9438 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat Jul 8 18:05:40 2006

Date: Sat, 8 Jul 2006 15:05:04 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sat, 8 Jul 2006     Volume: 10 Number: 9438

Today's topics:
        converting line input into columns vanagas99@yahoo.com
    Re: converting line input into columns <1usa@llenroc.ude.invalid>
    Re: converting line input into columns <mumia.w.18.spam+nospam.usenet@earthlink.net>
    Re: converting line input into columns <DJStunks@gmail.com>
    Re: converting line input into columns <1usa@llenroc.ude.invalid>
    Re: Get the reference to an array from a function... <tadmc@augustmail.com>
    Re: Get the reference to an array from a function... <sherm@Sherm-Pendleys-Computer.local>
    Re: Get the reference to an array from a function... <David.Squire@no.spam.from.here.au>
    Re: How to force formatted date (month) language ? <ynleder@nspark.org>
    Re: How to force formatted date (month) language ? <DJStunks@gmail.com>
    Re: How to force formatted date (month) language ? <bart@nijlen.com>
    Re: kill the process <wcooley@nakedape.cc>
    Re: kill the process <ced@blv-sam-01.ca.boeing.com>
        Need help to find byte offsets for regexps in a file <robert.dodier@gmail.com>
        Pls excuse if you consider this off-topic. Conceptual a M_Mann@artenom.com
        Profanity checking, phonetically. <shrike@cyberspace.org>
    Re: Profanity checking, phonetically. <john@castleamber.com>
        Using References to Formats? <vtatila@mail.student.oulu.fi>
    Re: Using References to Formats? <attn.steven.kuo@gmail.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 8 Jul 2006 07:29:50 -0700
From: vanagas99@yahoo.com
Subject: converting line input into columns
Message-Id: <1152368990.670158.258930@m79g2000cwm.googlegroups.com>

hi,

I have a file with formatted output like this:

Severity: Important
Status: Unknown
PDI ID: 1895
Finding Details
       This vulnerability... blah, blah, blah
Vulnerability Discussion
          blah, blah, blah text
Fix recommendations
      blah, blah, blah text

Please advise on how to parse such a file so that I can put it in a
column-type format. As you can see, I can't use : as a separator since
not all categories have it. Plus, some of the details of these
categories are plopped on a separate line instead of next to the name. Best
way would probably be to put all of it in one tab-separated line
(cleaning out severity, status, etc. later); I just don't know how to do
that. Please advise.

Thanks,
AV



------------------------------

Date: Sat, 08 Jul 2006 18:52:10 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude.invalid>
Subject: Re: converting line input into columns
Message-Id: <Xns97FA975B35B23asu1cornelledu@127.0.0.1>

vanagas99@yahoo.com wrote in news:1152368990.670158.258930
@m79g2000cwm.googlegroups.com:

> I have a file with formatted output like this:
> 
> Severity: Important
> Status: Unknown
> PDI ID: 1895
> Finding Details
>        This vulnerability... blah, blah, blah
> Vulnerability Discussion
>           blah, blah, blah text
> Fix recommendations
>       blah, blah, blah text
> 
> Please advise on how to parse such a file allowing me to put it in a
> column type format. 

Please consult the posting guidelines for this group. You can help 
others help you by posting what you have tried so far, and explaining 
the problems you have encountered.

On the other hand, it has been a month or so since I wrote any code, so 
I thought this might be a good warm-up exercise for me.

I am sure someone will correct my errors.

#!/usr/bin/perl

use strict;
use warnings;

use Data::Dumper;

my @single_line_items = ( 'Severity', 'Status', 'PDI ID' );
my @multi_line_items  = ( 
    'Finding Details', 
    'Vulnerability Discussion',
    'Fix recommendations',
);

my @records;
my $current = 1;

while ( my $line = <DATA> ) {
    next unless $line =~ /^Severity/;
    my $text = $line;

    while ( <DATA> ) {
        last if /^\s+$/;
        $text .= $_;
    }
    $text .= "\n";

    eval {
        push @records, parse_record( \$text );
    };

    $@ and warn "Malformed record: $current: $@\n";
    ++ $current;
}

print Dumper \@records;

sub parse_record {
    my ($text_ref) = @_;

    my $record = { };

    for my $item ( @single_line_items ) {
        if ( $$text_ref =~ /$item:\s+([^\n]+)/mg ) {
            $record->{$item} = $1;
        }
        else {
            die "Missing '$item' in\n$$text_ref";
        }
    }

     for my $item ( @multi_line_items ) {
        $$text_ref =~ /^$item$/mg
            or die "Missing '$item' in\n$$text_ref";
        if ( $$text_ref =~ /\s+(.+?)\n(?:\n|\w)/sg ) {
            $record->{$item} = $1;
            pos $$text_ref -= 1;
        }
        else {
            die "Missing text for '$item' in\n$$text_ref";
        }
    }

    return $record;
}


__DATA__

Severity: Trivial
Status: Uppity
PDI ID: 1895
Finding Details
    Finding details for id 1895
Vulnerability Discussion
    Vulnerability discussion for id 1895
    more discussion
Fix recommendations
    Fix recommendations for id 1895
    more recommendations


Severity: Severe
Status: Fixed
PDI ID: 1897
Finding Details
    Finding details for id 1897
Vulnerability Discussion
    Vulnerability discussion for id 1897
    more discussion
Fix recommendations
    Fix recommendations for id 1897
    more recommendations

Severity: Offensive
Status: What's That?
PDI ID: 1898
Finding Details
    Finding details for id 1898
Vulnerability Discussion
    Vulnerability discussion for id 1898
    more discussion
Fix recommendations
    Fix recommendations for id 1898
    more recommendations


-- 
A. Sinan Unur <1usa@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)

comp.lang.perl.misc guidelines on the WWW:
http://augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html


------------------------------

Date: Sat, 08 Jul 2006 19:03:43 GMT
From: "Mumia W." <mumia.w.18.spam+nospam.usenet@earthlink.net>
Subject: Re: converting line input into columns
Message-Id: <jkTrg.5039$ye3.1658@newsread1.news.pas.earthlink.net>

vanagas99@yahoo.com wrote:
> hi,
> 
> I have a file with formatted output like this:
> 
> Severity: Important
> Status: Unknown
> PDI ID: 1895
> Finding Details
>        This vulnerability... blah, blah, blah
> Vulnerability Discussion
>           blah, blah, blah text
> Fix recommendations
>       blah, blah, blah text
> 
> Please advise on how to parse such a file allowing me to put it in a
> column type format. [...]

Let's make the problem simpler by breaking it into pieces. You need two 
reg-ex's, one for grabbing things like 'Severity: Important' and one for 
grabbing things like 'Finding Details....'

Can you think of a reg-ex that'll match 'Severity: Important' and grab 
'Important'?
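
One possible answer to that first question, as a minimal sketch only: the variable names below are illustrative and not from the thread, and the pattern assumes the value sits on the same line as the label.

```perl
use strict;
use warnings;

# Illustrative input line; $line and $value are hypothetical names.
my $line = "Severity: Important\n";

# Anchor on the label, skip any whitespace after the colon,
# and capture the rest of the line (. does not match the newline).
my ($value) = $line =~ /^Severity:\s*(.+)$/;

print "$value\n";
```

The multi-line items ('Finding Details' and friends) have no colon, so they need a second pattern that starts at the label line and captures the indented lines that follow.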



------------------------------

Date: 8 Jul 2006 12:54:21 -0700
From: "DJ Stunks" <DJStunks@gmail.com>
Subject: Re: converting line input into columns
Message-Id: <1152388461.361579.37150@m79g2000cwm.googlegroups.com>

A. Sinan Unur wrote:
> vanagas99@yahoo.com wrote in news:1152368990.670158.258930
> @m79g2000cwm.googlegroups.com:
>
> > I have a file with formatted output like this:
> >
> > Severity: Important
> > Status: Unknown
> > PDI ID: 1895
> > Finding Details
> >        This vulnerability... blah, blah, blah
> > Vulnerability Discussion
> >           blah, blah, blah text
> > Fix recommendations
> >       blah, blah, blah text
> >
> > Please advise on how to parse such a file allowing me to put it in a
> > column type format.
>
> Please consult the posting guidelines for this group. You can help
> others help you by posting what you have tried so far, and explaining
> the problems you have encountered.
>
> On the other hand, it has been a month or so since I wrote any code, so
> I thought this might be a good warm-up exercise for me.

I was wondering if you were on vacation..... :p

> I am sure someone will correct my errors.
> <script snipped>

I don't see any errors, but it does seem needlessly complex?  Perhaps
you were trying to stretch your Perl muscles after your hiatus.

If the record is as static as presented, I would just parse the whole
thing in one fell swoop, repairing leading, trailing and multiline
spacing afterward:

  #!/usr/bin/perl

  use strict;
  use warnings;

  use Data::Dumper;
  use English qw{ -no_match_vars };

  $INPUT_RECORD_SEPARATOR = '';

  RECORD:
  while (my $record = <DATA>) {
  	my (%record) = $record =~ m{\A \s*
  	                              (Severity)                  :(.+?)
  	                              (Status)                    :(.+?)
  	                              (PDI . ID)                  :(.+?)
  	                              (Finding . Details)          (.+?)
  	                              (Vulnerability . Discussion) (.+?)
  	                              (Fix . recommendations)      (.+?)
  	                            \z}xms;
  	if (not %record) {
  		warn "Malformed record";
  		next RECORD;
  	}
  	else {
  		# fix up spacing
  		for my $entry ( values %record ) {
  			$entry =~ s/^\s+//gm;
  			$entry =~ s/\s+$//gm;
  			$entry =~ s/\n/ /g;
  		}
  		print Dumper \%record;
  	}
  }

  __DATA__

  Severity: Trivial
  Status: Uppity
  PDI ID: 1895
  Finding Details
      Finding details for id 1895
  Vulnerability Discussion
      Vulnerability discussion for id 1895
      more discussion
  Fix recommendations
      Fix recommendations for id 1895
      more recommendations

  Severity: Severe
  Status: Fixed
  PDI ID: 1897
  Finding Details
      Finding details for id 1897
  Vulnerability Discussion
      Vulnerability discussion for id 1897
      more discussion
  Fix recommendations
      Fix recommendations for id 1897
      more recommendations

  Severity: Offensive
  Status: What's That?
  PDI ID: 1898
  Finding Details
      Finding details for id 1898
  Vulnerability Discussion
      Vulnerability discussion for id 1898
      more discussion
  Fix recommendations
      Fix recommendations for id 1898
      more recommendations 

Comments welcome,
-jp



------------------------------

Date: Sat, 08 Jul 2006 20:05:56 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude.invalid>
Subject: Re: converting line input into columns
Message-Id: <Xns97FAA3DD5812Basu1cornelledu@127.0.0.1>

"DJ Stunks" <DJStunks@gmail.com> wrote in
news:1152388461.361579.37150@m79g2000cwm.googlegroups.com: 

> A. Sinan Unur wrote:
>> vanagas99@yahoo.com wrote in news:1152368990.670158.258930
>> @m79g2000cwm.googlegroups.com:
>>
>> > I have a file with formatted output like this:
>> >
>> > Severity: Important
>> > Status: Unknown
>> > PDI ID: 1895
>> > Finding Details
>> >        This vulnerability... blah, blah, blah
>> > Vulnerability Discussion
>> >           blah, blah, blah text
>> > Fix recommendations
>> >       blah, blah, blah text
>> >
>> > Please advise on how to parse such a file allowing me to put it in
>> > a column type format.
>>
>> Please consult the posting guidelines for this group. You can help
>> others help you by posting what you have tried so far, and explaining
>> the problems you have encountered.
>>
>> On the other hand, it has been a month or so since I wrote any code,
>> so I thought this might be a good warm-up exercise for me.
> 
> I was wondering if you were on vacation..... :p

Thanks for noticing. Some vacation ... some family business ;-)

>> I am sure someone will correct my errors.
>> <script snipped>
> 
> I don't see any errors, but it does seem needlessly complex?

Agreed. Your solution is quite elegant.

I do need the warm up.

Sinan
-- 
A. Sinan Unur <1usa@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)

comp.lang.perl.misc guidelines on the WWW: 
http://augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html


------------------------------

Date: Sat, 8 Jul 2006 08:17:50 -0500
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: Get the reference to an array from a function...
Message-Id: <slrneavc3u.njt.tadmc@magna.augustmail.com>

David Squire <David.Squire@no.spam.from.here.au> wrote:

> This implies that the subroutine knows the 
> context in which it was called, 


So you can make your own subroutines that have different scalar context 
vs. list context behaviors, just like Perl's builtin functions do.

   perldoc -f wantarray


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: Sat, 08 Jul 2006 11:01:41 -0400
From: Sherm Pendley <sherm@Sherm-Pendleys-Computer.local>
Subject: Re: Get the reference to an array from a function...
Message-Id: <m2veq8m97u.fsf@Sherm-Pendleys-Computer.local>

David Squire <David.Squire@no.spam.from.here.au> writes:

> True, and, to me, surprising. This implies that the subroutine knows
> the context in which it was called

It does - have a look at "perldoc -f wantarray".
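
A minimal sketch of what that looks like in practice. The sub name get_items is hypothetical and not from the thread; it returns a plain list in list context and an array reference in scalar context.

```perl
use strict;
use warnings;

# Hypothetical helper: wantarray is true when the caller wants a list,
# false (but defined) in scalar context, and undef in void context.
sub get_items {
    my @items = (1, 2, 3);
    return wantarray ? @items : \@items;
}

my @list = get_items();    # list context: the elements themselves
my $ref  = get_items();    # scalar context: a reference to the array

print scalar(@list), " items; ref type: ", ref($ref), "\n";
```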

sherm--

-- 
Cocoa programming in Perl: http://camelbones.sourceforge.net
Hire me! My resume: http://www.dot-app.org


------------------------------

Date: Sat, 08 Jul 2006 16:32:17 +0100
From: David Squire <David.Squire@no.spam.from.here.au>
Subject: Re: Get the reference to an array from a function...
Message-Id: <e8oj61$fdj$1@gemini.csx.cam.ac.uk>

Sherm Pendley wrote:
> David Squire <David.Squire@no.spam.from.here.au> writes:
> 
>> True, and, to me, surprising. This implies that the subroutine knows
>> the context in which it was called
> 
> It does - have a look at "perldoc -f wantarray".

Yes. If I had thought longer, I would have remembered wantarray. I guess 
it is just a bit of implicit polymorphism in how the argument to return 
is handled when putting stuff on the stack.

Regards,

DS


------------------------------

Date: Sat, 8 Jul 2006 19:29:39 +0200
From: Yohan N. Leder <ynleder@nspark.org>
Subject: Re: How to force formatted date (month) language ?
Message-Id: <MPG.1f1a09f1865c5f4e9897c8@news.free.fr>

In article <44af432e$0$25284$afc38c87@news.optusnet.com.au>, sisyphus1
@nomail.afraid.org says...
> For Win32 (and perhaps others ?), try:
> setlocale(LC_TIME, "English_USA.1252");
> [...]
> I'm on an "English" Win32 Machine - and, with the second strftime() call, I
> still got 'Jul' instead of 'juil.' However, when I changed your second
> setlocale() call to:
> setlocale(LC_TIME, "French_France.1252");
> 
> I then got the desired 'juil.' in the output.
> 
> Cheers,
> Rob

Indeed, I've tried 'setlocale(LC_TIME, "English_USA.1252");' and 
'setlocale(LC_TIME, "French_France.1252");' under Win 2K FR and they 
give 'Jul' and 'juil.' respectively, as expected. Does this mean that 
'en_US' and 'fr_FR' are not supported under Win32? Thanks ;)

Also, I've done the same test under a FreeBSD US and it still gives :
ENGLISH => 08 Jul 2006 @ 17:26:58 GMT
FRENCH => 08 Jul 2006 @ 17:26:58 GMT

What's the right LC_TIME specification for Unix flavors and, more 
generally, non-Win32 operating systems?



------------------------------

Date: 8 Jul 2006 12:22:02 -0700
From: "DJ Stunks" <DJStunks@gmail.com>
Subject: Re: How to force formatted date (month) language ?
Message-Id: <1152386522.438934.268850@35g2000cwc.googlegroups.com>

Yohan N. Leder wrote:
> In article <44af432e$0$25284$afc38c87@news.optusnet.com.au>, sisyphus1
> @nomail.afraid.org says...
> > For Win32 (and perhaps others ?), try:
> > setlocale(LC_TIME, "English_USA.1252");
> > [...]
> > I'm on an "English" Win32 Machine - and, with the second strftime() call, I
> > still got 'Jul' instead of 'juil.' However, when I changed your second
> > setlocale() call to:
> > setlocale(LC_TIME, "French_France.1252");
> >
> > I then got the desired 'juil.' in the output.
> >
> > Cheers,
> > Rob
>
> Indeed, I've tried 'setlocale(LC_TIME, "English_USA.1252");' and
> 'setlocale(LC_TIME, "French_France.1252");' under Win 2K FR and they
> give 'Jul' and 'juil.' respectively, as expected. Does this mean that
> 'en_US' and 'fr_FR' are not supported under Win32? Thanks ;)

To build up a locale string in Win32 follow the instructions on this
page:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclib/html/_crt_language_and_country_strings.asp

The possible choices for lang are listed:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclib/html/_crt_language_strings.asp

The optional Country/Region and Code Pages supported by Windows are:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclib/html/_crt_country_strings.asp
http://www.microsoft.com/globaldev/reference/wincp.mspx

> Also, I've done the same test under a FreeBSD US and it still gives :
> ENGLISH => 08 Jul 2006 @ 17:26:58 GMT
> FRENCH => 08 Jul 2006 @ 17:26:58 GMT
>
> What's the right LC_TIME specification for Unix flavors and, more
> generally, non-Win32 operating systems?

as perllocale states, you should be able to produce a list of supported
locales on a UNIX-ish OS by using
  $ locale -a

in this case, I would say the easiest way to get the same script to
produce similar** results is to simply use 'english' and 'french' as
your locale strings.

HTH,
-jp

Note **: on redhat linux, 'french' produced "jui" rather than "juil.".



------------------------------

Date: 8 Jul 2006 12:40:03 -0700
From: "Bart Van der Donck" <bart@nijlen.com>
Subject: Re: How to force formatted date (month) language ?
Message-Id: <1152387603.366980.217680@p79g2000cwp.googlegroups.com>

Yohan N. Leder wrote:

> Indeed, I've tried 'setlocale(LC_TIME, "English_USA.1252");' and
> 'setlocale(LC_TIME, "French_France.1252");' under Win 2K FR and they
> give 'Jul' and 'juil.' respectively, as expected. Does this mean that
> 'en_US' and 'fr_FR' are not supported under Win32?

Both English and French locales should be available.

> Also, I've done the same test under a FreeBSD US and it still gives :
> ENGLISH => 08 Jul 2006 @ 17:26:58 GMT
> FRENCH => 08 Jul 2006 @ 17:26:58 GMT

1252 is a Microsoft proprietary code page and thus not very useful for
general use across various OS's.

The code of your first post should be okay, but there is something
going on you're not aware of.

The "official" French month abbreviations are the following:
janvier : janv.
février : févr.
mars : mars
avril : avr.
mai : mai
juin : juin
juillet : juil. or juill.
août : août
septembre : sept.
octobre : oct.
novembre : nov.
décembre : déc.
(http://fr.wikipedia.org/wiki/Mois)

It seems that MS' French_France.1252 follows these rules. But if we
want to force a three-char month notation (more English & informatics
style), then Houston has a problem:

French for June = Juin
French for July = Juillet
So, the three first chars are 'jui' for both months.

The logic would then be to take the first following character, like
Juin = jui
Juillet = jul

> What's the right LC_TIME specification for Unix flavors and, more
> generally, non-Win32 operating systems?

General UNIX/Linux: setlocale('LC_TIME', 'fr_FR.ISO_8859-1');
Win32: setlocale('LC_TIME', 'fr');
Solaris: setlocale("LC_TIME", "fr");
FreeBSD: setlocale("LC_TIME", "fr_FR.ISO8859-1");
(out of http://www.oscommerce-fr.info/faq/qa_info.php?qID=52)

I also met:
 French_France.1252
 french.ISO_8859-1
 french
 fr_FR
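
Since the accepted name varies by platform, one approach is to try several names in turn and keep the first one that works. This is a rough sketch only: the sub name pick_time_locale is made up for illustration, the candidate names are taken from the lists above, and which of them (if any) succeeds depends entirely on the locales installed on the machine.

```perl
use strict;
use warnings;
use POSIX qw(setlocale strftime LC_TIME);

# Candidate names from the lists above; availability is system-dependent.
my @candidates = (
    'fr_FR.ISO8859-1', 'fr_FR.ISO_8859-1', 'fr_FR',
    'French_France.1252', 'french', 'fr',
);

# Hypothetical helper: return the first name setlocale() accepts,
# or undef if none of them is available.
sub pick_time_locale {
    for my $name (@_) {
        return $name if defined setlocale(LC_TIME, $name);
    }
    return;
}

my $chosen = pick_time_locale(@candidates);
if (defined $chosen) {
    print "Using $chosen: ", strftime('%b', gmtime), "\n";
}
else {
    print "No French locale found; check the output of 'locale -a'\n";
}
```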

A last super-safe option is to switch to the month number (like '7'
instead of July) and then manually code the part that maps it to the
actual month name/abbreviation you wish (according to the language the
user specified on the web page).
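
A minimal sketch of that lookup-table idea. The names %MONTH_ABBR and month_abbr are hypothetical; the French entries follow the list of abbreviations given earlier in this post, and the two-letter language keys are an assumption about how the web page would identify the language.

```perl
use strict;
use warnings;
use utf8;
binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical table, indexed by language code and then by month - 1.
my %MONTH_ABBR = (
    en => [qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)],
    fr => [qw(janv. févr. mars avr. mai juin juil. août sept. oct. nov. déc.)],
);

# $month runs from 1 to 12; $lang must be a key of %MONTH_ABBR.
sub month_abbr {
    my ($lang, $month) = @_;
    my $names = $MONTH_ABBR{$lang}
        or die "Unsupported language '$lang'\n";
    die "Month out of range: $month\n" if $month < 1 || $month > 12;
    return $names->[ $month - 1 ];
}

my $month = (gmtime)[4] + 1;    # gmtime's month field is 0-based
print "$_: ", month_abbr($_, $month), "\n" for qw(en fr);
```

This sidesteps setlocale() completely, at the cost of maintaining the table by hand.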

-- 
 Bart



------------------------------

Date: Sat, 08 Jul 2006 12:13:28 -0700
From: Wil Cooley <wcooley@nakedape.cc>
Subject: Re: kill the process
Message-Id: <pan.2006.07.08.19.13.28.82503@nakedape.cc>

On Fri, 07 Jul 2006 12:51:15 -0700, blackdog wrote:

> I have a perl script, I like to kill it (commit suicide)  if the script
> is running on the system for more than one hour. What is the best way
> to do it?

If you're not otherwise using sleep() or anything else that sets an alarm,
you can use alarm()--see 'perldoc -f alarm'.  Here's a little example:

#!/usr/bin/perl
#
# alarm-test.pl - Simple alarm() demonstration
# 

use strict;
use warnings;
use Carp;

# How long is the alarm set for?
my $alarm_after_secs = 3;

# Set a handler for the ALRM signal; the POSIX default action is to
# terminate the program.  See perlipc and signal(7).
$SIG{'ALRM'} = sub { croak 'Received alarm signal'; };

# Set the alarm
alarm $alarm_after_secs;

# Do something to kill time here
while (1) {
    my $x = <STDIN>;
}

Wil



------------------------------

Date: Sat, 8 Jul 2006 20:26:45 GMT
From: Charles DeRykus <ced@blv-sam-01.ca.boeing.com>
Subject: Re: kill the process
Message-Id: <J23q4J.G0q@news.boeing.com>

blackdog wrote:
> I have a perl script, I like to kill it (commit suicide)  if the script
> is running on the system for more than one hour. What is the best way
> to do it?
> 

Here's a possible Unix solution if you mean what I think you mean:

     # near the top of your script
     $SIG{ALRM} = sub { die 'internal timeout'; };
     alarm(3600);

Of course, you should read all the caveats in the docs about
mixing sleep and alarm, stacking alarms, etc. Also, sometimes
signals may be lost, particularly across fork boundaries.

Hth,
-- 
Charles DeRykus


------------------------------

Date: 8 Jul 2006 13:18:03 -0700
From: "Robert Dodier" <robert.dodier@gmail.com>
Subject: Need help to find byte offsets for regexps in a file
Message-Id: <1152389883.658944.161960@75g2000cwc.googlegroups.com>

Hello,

I am hoping to find byte offsets of regular expressions in a file.

I'm working on the built-in doc system for Maxima, an open-source
computer algebra system. The doc text is a Texinfo output file. I want
to find the strings " -- Function: FOO (x, y, z) ..."
and print their byte offsets, and the number of bytes from one such
string to the end of the corresponding documentation item
(which might be the next " -- Function: " item or a different regex).

Here is some pseudocode to illustrate what I am attempting --

  let re1 = " -- Function: <some name>"
  let re2 = FOO (not sure what to put here yet)
  slurp file into string S (this is OK, texinfo limits file to 300 k)
  byte_offset_1 = 0
  while search for re1 beginning from byte_offset_1 succeeds
    extract <some name> from re1 match
    search for re2 beginning from byte_offset_1
    let byte_offset_2 = byte offset of re2 match
    print <some name>, byte_offset_1, byte_offset_2
    let byte_offset_1 = byte_offset_2


I'm planning to slurp the resulting output into another program
that will then carry out matching on the list of <some name> strings
and use file seek to grab the corresponding texts. That program
will be written in another programming language so let's not worry
about that now.

If anyone has some advice about making a workable Perl
program from this pseudocode, I'll be very grateful.
Thanks in advance & all the best.
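
One way the pseudocode above might translate to Perl, as a rough sketch under stated assumptions: the sub name function_items and the sample text are made up, and since re2 is still undecided, an item is simply assumed to end where the next " -- Function: " line begins (or at the end of the text). The start of each match is read from $-[0].

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [name, offset, length] triples.
sub function_items {
    my ($text) = @_;

    # Record the name and the start offset of every " -- Function: " line.
    my @starts;
    while ($text =~ /^ -- Function: (\S+)/mg) {
        push @starts, [ $1, $-[0] ];
    }

    # Assumption: an item runs up to the next item, or to the end of the text.
    my @items;
    for my $i (0 .. $#starts) {
        my ($name, $offset) = @{ $starts[$i] };
        my $end = $i < $#starts ? $starts[ $i + 1 ][1] : length $text;
        push @items, [ $name, $offset, $end - $offset ];
    }
    return @items;
}

# Made-up sample standing in for the Texinfo output.
my $sample = join '',
    "intro\n",
    " -- Function: foo (x)\n",
    "  Docs for foo.\n",
    " -- Function: bar (x, y)\n",
    "  Docs for bar.\n";

printf "%s %d %d\n", @$_ for function_items($sample);
```

For a real file, the offsets are byte offsets only if the text was read as bytes, for example by opening the file with the '<:raw' layer and slurping it with a localized $/; on decoded text the same code would yield character offsets.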

Robert Dodier



------------------------------

Date: 8 Jul 2006 10:46:08 -0700
From: M_Mann@artenom.com
Subject: Pls excuse if you consider this off-topic. Conceptual artists seek programmers here.
Message-Id: <1152380768.581234.255960@p79g2000cwp.googlegroups.com>

Hello,

Pls excuse if you consider this off-topic. Conceptual artists seek
programmers here.

We are the authors of "Exhibition of Living Managers" (MANAGEX,
www.managex.info), which is a global conceptual art project performed in
the world's leading contemporary art centres. The art objects at MANAGEX are
real employed managers who volunteer to exhibit themselves in a
gallery setting. Our new project is "Exhibition of Living Programmers"
(PROGRAMEX), which is similar to MANAGEX but focuses on professional
programmers.

Managex' official website is www.managex.info. We have also just opened
a Google Group here on Managex project:
http://groups.google.com/group/Exhibition-of-Living-Managers-MANAGEX,
where you are welcome to register and participate. Once we have a
substantial number of interested programmers, we will open a dedicated
group on Programex.

Hope to hear from you and see you at Programex. Again, sorry if this
posting disturbed anybody.

Best, 

MANAGEX / PROGRAMEX team www.managex.info



------------------------------

Date: 8 Jul 2006 13:35:00 -0700
From: "shrike@cyberspace.org" <shrike@cyberspace.org>
Subject: Profanity checking, phonetically.
Message-Id: <1152390900.669510.327250@s13g2000cwa.googlegroups.com>

Howdy,

I have a randomly generated alphabetic string, and I need to profanity
check it,  phonetically. I didn't see anything like this on CPAN.

Anybody done anything like this? 

-Thanks in advance 
-Matt



------------------------------

Date: 8 Jul 2006 20:42:02 GMT
From: John Bokma <john@castleamber.com>
Subject: Re: Profanity checking, phonetically.
Message-Id: <Xns97FA9FB64366Dcastleamber@130.133.1.4>

"shrike@cyberspace.org" <shrike@cyberspace.org> wrote:

> Howdy,
> 
> I have a randomly generated alphabetic string, and I need to profanity
> check it,  phonetically. I didn't see anything like this on CPAN.
> 
> Anybody done anything like this? 

Soundex? And there is a better algorithm IIRC.

OTOH, why bother, people start using fsck, or f*kc etc.

-- 
John Bokma          Freelance software developer
                                &
                    Experienced Perl programmer: http://castleamber.com/


------------------------------

Date: Sat, 8 Jul 2006 17:26:05 +0300
From: "Veli-Pekka Tätilä" <vtatila@mail.student.oulu.fi>
Subject: Using References to Formats?
Message-Id: <e8ofa7$3i1$1@news.oulu.fi>

Hi,
Browsing perldiag, I noticed messages related to format references. So being 
curious and wishing to continue my exploration of Perl's dark and archaic 
corners, I decided to write a sample program to see how format references 
could be used in Perl. First is an account of what I've attempted, the 
relevant code in small chunks and the output received. The mail ends with 
the full program source and output from a sample run.

Curiously, references to formats are not documented in perlref, perlform 
etc... I'm running ActiveState Perl v5.8.7 (build 815, XP Pro SP2 English).

And now to the program:

To motivate taking references to formats I started out with a rather useless 
toy function that generates formats using eval. The format name and the 
number of chars it extracts from a global named $text can be parameterized.

sub genForm
{ # A simple format named $name outputting $n chars of $text.
   my($name, $length) = @_;
   eval
   (
      "format $name =\nFirst $length chars: @" .
      '<' x $length . "\n\$text\n."
   ); # eval
   die $@ unless $@ eq ''; # Eval failed.
} # sub

There's a related output function, which given a format name, writes it out 
to the default file handle:

sub writeForm
{ # Write out the specified format.
   local $~ = shift;
   write;
} # sub

At this point I started wondering whether I could use a real reference to a 
format instead of an "indirect format" (in analogy to indirect file 
handles). First I had to use the *foo{THING} syntax to get at a format. The 
following statement, using main's symbol table and *foo{FORMAT} did the 
trick for me:

my $formRef = *{ $::{$name} }{FORMAT}; # *foo{FORMAT} syntax in main package.

To test what info could be gleaned from a format reference I made a function 
for that, too. Here it is:

sub dumpForm
{ # Dump info on a format reference.
   my $formRef = shift;
   print "The formref $name is:";
   print "Stringified: $formRef";
   print "of type: " . ref($formRef);
   print "Dumped: ";
   eval { print Dumper($formRef) };
} # sub

Oddly, neither the docs for the ref built-in nor Data::Dumper mentioned 
references to formats. Despite this the ref function and stringification 
worked all right but Dumper didn't. Here's some output:

The formref eight is:
Stringified: FORMAT(0x18c4ecc)
of type: FORMAT
Dumped:
cannot handle ref type 14 at C:/Perl/lib/Data/Dumper.pm line 167.
$VAR1 = ;

I wonder if the debugger does any better. I have not tested it yet.

To make format references useful at all, I suppose one would have to be able 
to dereference them somehow. Is that possible, and if so how? Formats have 
no sigil so I myself have absolutely no idea how they could be dereferenced. 
Would being able to work with format references bring any benefits compared 
to referring to formats by name? I suppose not, though using format references 
does seem to sort of work.

The first thing that occurred to me was to try assigning a format reference 
to $~, as opposed to a format name. The same writeForm function could be 
used, just passing it a reference:

eval { writeForm($formRef) };
print "Using formref for $~: $@";

This strategy didn't work all that well. The statement printing the eval 
error outputs:

Using formref for STDOUT: Undefined format "FORMAT(0x18c4eb4)" called at 
C:\programming\plx\test.plx line 29.

Apparently no magical dereferencing is going on here. Starting to run out of 
ideas, I thought of testing what would happen if I tried to dereference the 
format as a scalar. I have no real rationale for that apart from scalar 
derefs working for elements in arrays and hashes. I did realize right from 
the start this wouldn't work for formats but typed in the following 
nevertheless:

eval { writeForm(${$formRef}) };
print "Using desperate scalar deref for $~: $@";

And the output is:

Using desperate scalar deref for STDOUT: Not a format reference at 
C:\programming\plx\test.plx line 29.

Quite right, not a format reference. But the thing that puzzles me here is 
that the error is phrased as though Perl expected a format reference. Yet 
when I give it one, as in the previous attempt, it doesn't seem to like it 
any better, either. It just takes the stringified form of the reference to 
be a format name which is no good.
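
For comparison, here is a condensed sketch of the two behaviours described so far, using a statically declared format rather than the eval-generated one. The format name EIGHT and the variable $form_ref are illustrative only. It shows that *NAME{FORMAT} yields something ref() reports as FORMAT, while write still picks its format by the name stored in $~.

```perl
use strict;
use warnings;

our $text = 'this is a test';

format EIGHT =
First 8 chars: @<<<<<<<
$text
.

# Taking the reference works, and ref() names its type.
my $form_ref = *EIGHT{FORMAT};
print ref($form_ref), "\n";

# write() looks the format up by the name held in $~.
{
    local $~ = 'EIGHT';
    write;
}
```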

Finally, here's the full code followed by some sample output:

Full code:

use strict; use warnings;
use Data::Dumper;
our $text = 'this is a test';
(my $name, local $\) = ('eight', "\n");
genForm($name , 8);
writeForm($name);
my $formRef = *{ $::{$name} }{FORMAT}; # *foo{FORMAT} syntax in main package.
dumpForm($formRef);
# Try using formatref instead of format name for writing the data.
eval { writeForm($formRef) };
print "Using formref for $~: $@";
eval { writeForm(${$formRef}) };
print "Using desperate scalar deref for $~: $@";

sub genForm
{ # A simple format named $name outputting $n chars of $text.
   my($name, $length) = @_;
   eval
   (
      "format $name =\nFirst $length chars: @" .
      '<' x $length . "\n\$text\n."
   ); # eval
   die $@ unless $@ eq ''; # Eval failed.
} # sub

sub writeForm
{ # Write out the specified format.
   local $~ = shift;
   write;
} # sub

sub dumpForm
{ # Dump info on a format reference.
   my $formRef = shift;
   print "The formref $name is:";
   print "Stringified: $formRef";
   print "of type: " . ref($formRef);
   print "Dumped: ";
   eval { print Dumper($formRef) };
} # sub

Sample output:

First 8 chars: this is a
The formref eight is:
Stringified: FORMAT(0x18c4eb4)
of type: FORMAT
Dumped:
cannot handle ref type 14 at C:/Perl/lib/Data/Dumper.pm line 167.
$VAR1 = ;

Using formref for STDOUT: Undefined format "FORMAT(0x18c4eb4)" called at 
C:\programming\plx\test.plx line 29.

Use of uninitialized value in scalar assignment at 
C:\programming\plx\test.plx line 28.
Using desperate scalar deref for STDOUT: Not a format reference at 
C:\programming\plx\test.plx line 29.

-- 
With kind regards Veli-Pekka Tätilä (vtatila@mail.student.oulu.fi)
Accessibility, game music, synthesizers and programming:
http://www.student.oulu.fi/~vtatila/ 




------------------------------

Date: 8 Jul 2006 10:15:03 -0700
From: "attn.steven.kuo@gmail.com" <attn.steven.kuo@gmail.com>
Subject: Re: Using References to Formats?
Message-Id: <1152378903.285640.148110@75g2000cwc.googlegroups.com>

Veli-Pekka Tätilä wrote:

(snipped)

> At this point I started wondering whether I could use a real reference to a
> format instead of an "indirect format" (in analogy to indirect file
> handles). First I had to use the *foo{THING} syntax to get at a format. The
> following statement, using main's symbol table and *foo{FORMAT} did the
> trick for me:
>
> my $formRef = *{ $::{$name} }{FORMAT}; # *foo{FORMAT} syntax in main package.
>
> To test what info could be gleaned from a format reference I made a function
> for that, too. Here it is:
>
> sub dumpForm
> { # Dump info on a format reference.
>    my $formRef = shift;
>    print "The formref $name is:";
>    print "Stringified: $formRef";
>    print "of type: " . ref($formRef);
>    print "Dumped: ";
>    eval { print Dumper($formRef) };
> } # sub
>
> Oddly, neither the docs for the ref built-in nor Data::Dumper mentioned
> references to formats. Despite this the ref function and stringification
> worked all right but Dumper didn't. Here's some output:
>
> The formref eight is:
> Stringified: FORMAT(0x18c4ecc)
> of type: FORMAT
> Dumped:
> cannot handle ref type 14 at C:/Perl/lib/Data/Dumper.pm line 167.
> $VAR1 = ;


You can use Devel::Peek instead of Data::Dumper
if you want to look at the guts of a format reference.

For me Devel::Peek::Dump prints:

SV = RV(0x1821258) at 0x182ad40
  REFCNT = 1
  FLAGS = (PADBUSY,PADMY,ROK)
  RV = 0x182ae60
  SV = PVFM(0x4489e0) at 0x182ae60
    REFCNT = 3
    FLAGS = (CLONE)
    IV = 0
    NV = 0
    COMP_STASH = 0x0
    START = 0x448b90 ===> 5120
    ROOT = 0x448c90
    XSUB = 0x0
    XSUBANY = 0
    GVGV::GV = 0x182ae9c        "main" :: "eight"
etc.,
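
For reference, a dump like the one above can be produced with something
along these lines. This is only a sketch: the format is declared statically
here rather than built with a string eval as in the original post, and the
addresses and flags printed will differ between perl builds.

```perl
use strict;
use warnings;
use Devel::Peek ();

our $text = 'this is a test';

# Same layout as the "eight" format from the original post,
# declared statically for brevity.
format eight =
First 8 chars: @<<<<<<<<
$text
.

# Fetch a reference to the format from the glob's FORMAT slot.
my $formRef = *eight{FORMAT};

# Devel::Peek::Dump writes its report to STDERR.
Devel::Peek::Dump($formRef);
```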




> I wonder if the debugger does any better. I have not tested it yet.
>
> To make format references useful at all, I suppose one would have to be able
> to dereference them somehow. Is that possible, and if so how? Formats have
> no sigil so I myself have absolutely no idea how they could be dereferenced.
> Would being able to work with format references bring any benefits compared
> to referring to formats by name? I suppose not though using format references
> does seem to sort of work.


Well, just use another typeglob for dereferencing.
Throw in the 'reftype' function from the
Scalar::Util module and then you can
pass either a format NAME or format
reference to writeForm:


> sub writeForm
> { # Write out the specified format.
>    local $~ = shift;
>    write;
> } # sub


becomes:

use Scalar::Util ('reftype');

sub writeForm {
   if (reftype $_[0] and reftype $_[0] eq 'FORMAT') {
       local *FOO;
       *FOO = $_[0];
       local $~ = 'FOO';
       write;
   } else {
       local $~ = shift;
       write;
   }
}
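
A self-contained sketch of the same glob trick, for anyone who wants to try
it in isolation. The format name "eight" and $text come from the original
post; write_format and the glob name FMT_ALIAS are made up for this
illustration, and the script is assumed to run in package main.

```perl
use strict;
use warnings;
use Scalar::Util 'reftype';

our $text = 'this is a test';

# Statically declared here; the original post builds it with a string eval.
format eight =
First 8 chars: @<<<<<<<<
$text
.

# Reference to the format, taken from the glob's FORMAT slot.
my $formRef = *eight{FORMAT};

# Hypothetical helper: accepts either a format name or a format reference.
sub write_format {
    my ($fmt) = @_;
    my $type = reftype $fmt;    # undef when $fmt is a plain name
    if (defined $type and $type eq 'FORMAT') {
        # Alias the referenced format into a temporary glob, then
        # select that glob's name as the current format.
        local *FMT_ALIAS;
        *FMT_ALIAS = $fmt;
        local $~ = 'FMT_ALIAS';
        write;
    }
    else {
        local $~ = $fmt;
        write;
    }
    return;
}

write_format('eight');     # by name
write_format($formRef);    # by reference
```

Both calls should emit the same line, since the temporary glob only gives
the existing format a second name for the duration of the call.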

-- 
Hope this helps,
Steven



------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 9438
***************************************

