[23412] in Perl-Users-Digest


Perl-Users Digest, Issue: 5630 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Oct 7 18:05:46 2003

Date: Tue, 7 Oct 2003 15:05:08 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Tue, 7 Oct 2003     Volume: 10 Number: 5630

Today's topics:
        //= operator alternative (Roy Johnson)
        [OT] please ignore (Swen Killer)
    Re: [OT] please ignore (Tad McClellan)
        ANNOUNCE: Class::Colon 0.02 (Phil Crow)
        C++ example in perldoc perlxs <tmohr@s.netic.de>
    Re: GDBM_File problems (Tad McClellan)
    Re: Opinions on "new SomeObject" vs. "SomeObject->new() <perl@my-header.org>
    Re: Opinions on "new SomeObject" vs. "SomeObject->new() <perl@my-header.org>
    Re: Opinions on "new SomeObject" vs. "SomeObject->new() <grazz@pobox.com>
    Re: Perl Command line for stat <thens@NOSPAMti.com>
    Re: Perl Command line for stat (Tad McClellan)
    Re: regex behavior <michael.p.broida@boeing_oops.com>
    Re: require and do - relative vs absolute? <invalid-email@rochester.rr.com>
        set some fiile data as a variable <feurry@hotmail.com>
    Re: set some fiile data as a variable <mbudash@sonic.net>
    Re: set some fiile data as a variable <feurry@hotmail.com>
    Re: Strange behaviour with '\r' character [[ sorry my o <j.m.f.dev.null@gmx.net>
    Re: sysopen problem (ko)
        TCP Listener on Windows XP <C.J.Robbins@ntlworld.com>
    Re: Teach me how to fish, regexp <henryn@zzzspacebbs.com>
    Re: Teach me how to fish, regexp <henryn@zzzspacebbs.com>
    Re:  <bwalton@rochester.rr.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 7 Oct 2003 12:27:12 -0700
From: rjohnson@shell.com (Roy Johnson)
Subject: //= operator alternative
Message-Id: <3ee08638.0310071127.62f1718a@posting.google.com>

I apologize for being so late to this party, but I do think I have an
idea worth considering.

//= will be forthcoming in perl6 to handle defaulting undefined values
(vs. merely false ones). In the threads discussing it, there was some
noise made about needing another operator for exists, and maybe a
special alternative to the (ever so rarely used) &&=, etc.

It occurred to me that maybe what ought to be done was to make tests
act as assigners (or yield lvalues, sort of) in certain situations --
in particular, in situations where ||= or &&= is specified, assign to
the argument being tested:

defined($var) ||= 'default';
exists($hash{$key}) &&= 'replacement';

To me, this reads pretty well. From a syntax standpoint, it's very
irregular, but that hasn't stopped Perl in the past.
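
For comparison, here is a minimal sketch of what the two proposed
statements would have to mean, spelled out in syntax that already works.
The variable and key names are made up for illustration. (The
defined-or assignment itself later shipped as //= in Perl 5.10.)

```perl
use strict;
use warnings;

my $var;                         # undefined, so it should receive the default
my %hash = (key => 0);           # the key exists although its value is false
my $key  = 'key';

# Intended effect of:  defined($var) ||= 'default';
$var = 'default' unless defined $var;

# Intended effect of:  exists($hash{$key}) &&= 'replacement';
$hash{$key} = 'replacement' if exists $hash{$key};

# A key that is absent must stay absent.
$hash{missing} = 'replacement' if exists $hash{missing};

print "$var $hash{$key}\n";
```

Note that neither test can be written with plain ||= or &&=, because
those operators look at truth, not at definedness or existence.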


------------------------------

Date: 7 Oct 2003 12:54:38 -0700
From: swen_killer@yahoo.co.uk (Swen Killer)
Subject: [OT] please ignore
Message-Id: <b8e66b67.0310071154.61f172a0@posting.google.com>

-


------------------------------

Date: Tue, 7 Oct 2003 16:09:05 -0500
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: [OT] please ignore
Message-Id: <slrnbo6arh.jrp.tadmc@magna.augustmail.com>

Swen Killer <swen_killer@yahoo.co.uk> wrote:

> Subject: [OT] please ignore


Please do not post articles that are to be ignored.


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: Tue, 7 Oct 2003 18:32:25 GMT
From: philcrow2000@yahoo.com (Phil Crow)
Subject: ANNOUNCE: Class::Colon 0.02
Message-Id: <HMEIAy.1MJ6@zorch.sf-bay.org>

The new perl module Class::Colon, available from CPAN, turns colon delimited
files into lists of objects.  It also turns those objects, or lists of objects,
back into delimited records.  Simply say:

    use Class::Colon Person => [ qw(given family other) ];

To turn a file into a list of Person objects:

    my $names = Person->READ_FILE("people.dat");
    # or:
    # open INPUT, "people.dat";
    # my $names = Person->READ_HANDLE(*INPUT);

Then later you can loop through these names:

    foreach my $name (@$names) {
        my $full = join " ", ($name->given, $name->other, $name->family);
        print "$full\n";
    }

Alternatively, you can control the input yourself as in:

    use Class::Colon Person => [ qw(given family other) ];

    my @people;
    while(<>) {
        chomp;
        push @people, Person->OBJECTIFY($_);
    }

Each hash key mentioned in the use statement will become a new class with
an accessor for each field you name in its list.  If a field is itself
an object, you can tell Class::Colon how to construct the proper object.
See the perldoc for details.

You can pick the delimiter on a class by class basis (but not field by field).
It can be any literal string.

You can turn the object back into a delimited record with

    my $output = $object->STRINGIFY();

I'd be happy to hear from anyone who could make STRINGIFY work with overload so
the above is just:

    my $output = "$object";  # NOT YET AVAILABLE, MAY BE IMPOSSIBLE.

There are WRITE_FILE and WRITE_HANDLE methods to help you with output.




------------------------------

Date: Tue, 07 Oct 2003 22:09:52 +0200
From: Torsten Mohr <tmohr@s.netic.de>
Subject: C++ example in perldoc perlxs
Message-Id: <blv6ig$gvf$1@schleim.qwe.de>

Hi,

i'd like to compile and install the C++ example mentioned
in "perldoc perlxs".  I use perl 5.8 on my Linux system.

I've put the whole test project here:

http://www.s.netic.de/tmohr/cc.tar.gz

I can successfully compile and install the underlying
libAmod.so, that contains code for a C++ class:

class aMod {
  private:
    int a;

  public:
    aMod();
    ~aMod();

    int getA(void);
    void setA(int v);
};

The XS file i wrote generates C code and contains code and
the perl module installs without errors.  But when i call:

#! /usr/bin/perl -w

use aMod;

$a = aMod->new();

i get:

Use of inherited AUTOLOAD for non-method aMod::aMod() is deprecated at
 ./qwe.pl line 5.
Can't locate auto/aMod/aMod.al in @INC (@INC contains:
/usr/lib/perl5/5.8.0/i586-linux-thread-multi /usr/lib/perl5/5.8.0 .....

It seems to me that "aMod" is somehow still unknown to perl, maybe
due to a wrong startup code in aMod.pm (i've attached it below)?


Has anybody got a hint for me?


Best regards,
Torsten.



--------- aMod.pm:
package aMod;

#use strict;
use warnings;

require Exporter;
require DynaLoader;

our @ISA = qw(Exporter DynaLoader);

our @EXPORT = (
        aMod
);

our $VERSION = '0.01';

bootstrap aMod $VERSION;


1;


__END__




------------------------------

Date: Tue, 7 Oct 2003 16:06:33 -0500
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: GDBM_File problems
Message-Id: <slrnbo6amp.jrp.tadmc@magna.augustmail.com>

Mike Hunter <mhunter@uclink.berkeley.edu> wrote:

> Can somebody point me to the perl tgz file for this?  I am getting:


Go to:  http://search.cpan.org/


> "Can't locate loadable object for module GDBM_File in @INC (@INC contains..."


Type 

   GDBM_File 

into the little box.

Click on "CPAN Search" button.


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: Tue, 07 Oct 2003 20:50:34 +0200
From: Matija Papec <perl@my-header.org>
Subject: Re: Opinions on "new SomeObject" vs. "SomeObject->new()"
Message-Id: <gj06ov8h7gkrgv458kt9hda03n53senaj2@4ax.com>

X-Ftn-To: Randal L. Schwartz 

merlyn@stonehenge.com (Randal L. Schwartz) wrote:
>>> Don't make $instance->new.  It obscures more than it communicates.
>
>Matija> I think you really should explain that in foreword of Damian book. :)
>
>I make the point in the Alpaca.  Is that enough?

I guess so.  By the way, where is your sense of humor?  Note the smiley above.

As for cargo cult programming, how about some general programming guidelines
which should have community consensus? Like perlstyle.pod which could make
some guidelines more official?



-- 
Matija


------------------------------

Date: Tue, 07 Oct 2003 21:00:23 +0200
From: Matija Papec <perl@my-header.org>
Subject: Re: Opinions on "new SomeObject" vs. "SomeObject->new()"
Message-Id: <g236ov047sqnn268sinbriscnpjpeb3jk8@4ax.com>

X-Ftn-To: Lack Mr G M 

gml4410@ggr.co.uk (Lack Mr G M) wrote:
> my $tko1 = MyToken->from_token('token' => $token);
> my $tko2 = MyToken->via_login('id' => yourid, 'password' => yourpassword);
>
>   Some people seem to be advocating this as conveying more:
>
> my $tko1 = from_token MyToken('token' => $token);
> my $tko2 = via_login MyToken('id' => yourid, 'password' => yourpassword);
> 
>   I can't see it myself.

Isn't your last example potentially troublesome in case of class
inheritance?

-- 
Matija


------------------------------

Date: Tue, 07 Oct 2003 20:02:24 GMT
From: Steve Grazzini <grazz@pobox.com>
Subject: Re: Opinions on "new SomeObject" vs. "SomeObject->new()"
Message-Id: <kZEgb.28284$kD3.5646@nwrdny03.gnilink.net>

Matija Papec <perl@my-header.org> wrote:
>gml4410@ggr.co.uk (Lack Mr G M) wrote:
>> my $tko1 = from_token MyToken('token' => $token);
>> my $tko2 = via_login MyToken('id' => yourid, 'password' => yourpassword);
> 
> Isn't your last example potentially troublesome in case of class
> inheritance?

No.  

It's troublesome if there are functions in the current package 
called from_token() or via_login().  This has nothing to do with 
constructors or inheritance -- it's a parsing issue.  Indirect 
object syntax is ugly and ambiguous and the parser doesn't always
get it right.

-- 
Steve


------------------------------

Date: Tue, 7 Oct 2003 23:57:13 +0530
From: Thens <thens@NOSPAMti.com>
Subject: Re: Perl Command line for stat
Message-Id: <20031007235713.28809307.thens@NOSPAMti.com>

On 7 Oct 2003 10:52:23 -0700
nos41@hotmail.com (nos) wrote:

# I am trying to use a command line Perl -e to stat a file with in
# Solaris.  I was looking for someone that can modify the line so it
# will work from a command line. I have not been able to figure out the
# right format to use. Can someone please help me with this.
# 
# Perl -e '(stat($filename)) [10]' any suggestions

perl -e 'print [stat $ARGV[0]]->[10]' <filename>

  is what you want.

perldoc perlrun  for more information.


Regards,
Thens.


------------------------------

Date: Tue, 7 Oct 2003 16:13:06 -0500
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: Perl Command line for stat
Message-Id: <slrnbo6b32.jrp.tadmc@magna.augustmail.com>

nos <nos41@hotmail.com> wrote:
> I am trying to use a command line Perl -e to stat a file with in
> Solaris.  I was looking for someone that can modify the line so it
> will work from a command line. I have not been able to figure out the
> right format to use. Can someone please help me with this.
> 
> Perl -e '(stat($filename)) [10]' any suggestions


   perl -le 'print( (stat shift)[10] )' file
or
   perl -le 'print +(stat shift)[10]' file
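
The same list slice works in a script. The sketch below is only an
illustration: it stats a scratch file created with File::Temp, so the
file name is an assumption and not part of the original question. The
extra parentheses (or the leading +) in the one-liners are needed
because "print (...)" would otherwise be parsed as a complete call to
print, leaving the [10] dangling.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file so that there is certainly something to stat.
my ($fh, $file) = tempfile(UNLINK => 1);

# stat returns a list; a list slice picks out single fields.
my $ctime = (stat $file)[10];    # index 10: inode change time
my $mtime = (stat $file)[9];     # index 9:  last modification time

print "ctime: ", scalar localtime($ctime), "\n";
print "mtime: ", scalar localtime($mtime), "\n";
```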


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: Tue, 7 Oct 2003 20:12:20 GMT
From: "Michael P. Broida" <michael.p.broida@boeing_oops.com>
Subject: Re: regex behavior
Message-Id: <3F831E24.C1B52D47@boeing_oops.com>

Abigail wrote:
> 
> Michael P. Broida (michael.p.broida@boeing_oops.com) wrote on
> MMMDCLXXXVIII September MCMXCIII in <URL:news:3F81D592.208F5420@boeing_oops.com>:
> ''  Abigail wrote:
> '' >
> '' > Michael P. Broida (michael.p.broida@boeing_oops.com) wrote on
> '' > MMMDCLXXXIII September MCMXCIII in <URL:news:3F7B532C.7878A3BB@boeing_oops.com>:
> '' > ,,  Abigail wrote:
> '' > ,, >
> '' > ,, > Matija Papec (mpapec@yahoo.com) wrote on MMMDCLXXXIII September MCMXCIII
> '' > ,, > in <URL:news:4bdmnvcb9or2nbm2ne1euvhqp1e64s84g7@4ax.com>:
> '' > ,, > --
> '' > ,, > --  I went through perldoc but didn't found similar regex,
> '' > ,, > --  print join ',', 'a bb ccc dddd' =~ /(\w)+/g;
> '' > ,, > --
> '' > ,, > --  the question is, what it exactly matches and why?
> '' > ,, >
> '' > ,, > /(\w)+/ matches a set of consecutive word characters, capturing
> '' > ,, > the *last* one. //g in list context means, do this as often as
> '' > ,, > possible (without overlap), returning a list of each of the submatches.
> '' > ,, >
> '' > ,, > So, 'a bb ccc dddd' =~ /(\w)+/g; returns for each substring of
> '' > ,, > consecutive word characters the last one, resulting in 'a', 'b', 'c' and 'd'.
> '' > ,,
> '' > ,,      That tests out as you said, so it's MY thinking that's off.  :)
> '' > ,,      Hopefully, you can clue me in.  :)
> '' > ,,
> '' > ,,      I expected it to result in "a,bb,ccc,dddd". Now I realize that
> '' > ,,      it's the positioning of the + that causes it to get a single
> '' > ,,      character from each group.  If the + is inside the (), it
> '' > ,,      prints what I expected.
> '' > ,,
> '' > ,,      But...  What is causing the original /(\w)+/ to get the LAST
> '' > ,,      character from each group instead of the FIRST character from
> '' > ,,      each group?
> '' >
> '' > Would you expect:
> '' >
> '' >     $x = $_ for qw /a b c d/
> '' >     print $x;
> '' >
> '' > to print 'a' as well?
> ''
> ''      It doesn't print anything without a semi-colon on the first line.
> ''      <grin>
> ''
> ''      At first glance, I thought it would print each letter.  Then I
> ''      looked deeper and realized it's basically assigning and re-assigning
> ''      $x (via $_) during the "for" loop, but only printing it when it's all
> ''      done.  Thus it only prints "d".
> ''
> ''      But the prior discussion was about a regex, not a "for" loop.
> ''      If your point is that the regex processing works similarly to
> ''      the "for" loop in your example, then I see what you mean.
> ''
> ''      If that's NOT what your point was, then you've lost me.  <grin>
> 
> My point is, if you repeatedly assign something to a variable, do you
> expect the variable to retain the first value it was set to, or the
> last value? Because that's happening in both the match, and the for loop.

	Ah.  No, I wouldn't expect that.  But then, I didn't know
	that the *regex* was repeatedly assigning to the variable
	WITHIN the (\w)+ portion.  I -DID- expect it to assign a
	new result for each letter group (a, bb, ccc, and dddd)
	due to the //g.  I did NOT know it was reassigning for
	the \w within the () for each letter in a single group.

	But now I do know that, thanks to the discussion here.  :)

	Thanks everyone!

		Mike
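
A short sketch of the two patterns side by side, using the string from
the thread; only the array names are new.

```perl
use strict;
use warnings;

my $text = 'a bb ccc dddd';

# Quantifier outside the group: the group is captured again for every
# character, so only the last character of each run is kept.
my @last  = $text =~ /(\w)+/g;

# Quantifier inside the group: each whole run is captured once.
my @whole = $text =~ /(\w+)/g;

print join(',', @last),  "\n";   # a,b,c,d
print join(',', @whole), "\n";   # a,bb,ccc,dddd
```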


------------------------------

Date: Tue, 07 Oct 2003 21:59:26 GMT
From: Bob Walton <invalid-email@rochester.rr.com>
Subject: Re: require and do - relative vs absolute?
Message-Id: <3F833622.6050702@rochester.rr.com>

Derf wrote:

> Bob Walton <invalid-email@rochester.rr.com> wrote in news:3F820680.403
> @rochester.rr.com:
> 
> 
>>CGI debugging is a FAQ, and, as the FAQ mentions, is also off-topic here 
>>(your question and its answer would be the same if your script was 
>>written in C, Befunge, etc).  Please see:
>>
>>    perldoc -q 500
>>
>>
> 
> 
> it is a Perl question to me, and I only mentioned the Web factor so folks 
> didn't start asking me if I was stating my path from my web root or actual 
> root. The CGI part is really irrelevant to the problem, it happens from 
> command line as well. Unfortunately, this is just confusing to describe.
> 
> Derf
> 

Well, if it happens from the command line as well, it would seem only 
one thing remains:  your "/path/from/root/" contains a typo.  Trying 
that, however, reveals that the error message you claim ("Can't do: No 
such file or directory at /path/from/root/cgi-bin/dir/requirefile.pl 
line ##.") isn't generated by either do() or require() in the event the 
specified file cannot be found.  do() is silent given the code you 
showed, although $! has "No such file or directory" in it; require() 
generates something on the order of:

Can't locate /path/from/root/cgi-bin/dir/blah.pl in @INC (@INC contains: 
C:/Perl/lib C:/Perl/site/lib .) at -e line 1.

It appears from the docs that require() won't work with an absolute path 
in its input string.

Is there an "or die" clause after the do() with the error text you 
mentioned?
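
To see the difference for yourself, a minimal sketch along these lines
can help. The path is hypothetical and chosen only because it should
not exist; the exact wording of the require() message varies between
perl versions.

```perl
use strict;
use warnings;

# Hypothetical path that is assumed not to exist.
my $file = '/no/such/dir/requirefile.pl';

# do() does not die on a missing file: it returns undef and sets $!.
my $result = do $file;
my $do_err = "$!";

# require() dies instead, so trap the message with eval.
my $req_ok  = eval { require $file; 1 };
my $req_err = $@;

print "do:      ", (defined $result ? "ok" : "failed: $do_err"), "\n";
print "require: ", ($req_ok ? "ok\n" : "failed: $req_err");
```

Any text such as "Can't do:" in front of the system error would have to
come from the script itself, for example from an "or die" clause.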

Could you copy/paste the *exact text* of your *real error message* and 
show us that?  If you don't want to reveal your actual path for some 
reason, set up a fake path, put the code there, generate the error, and 
copy/paste the exact error, along with the results of a pwd from the 
directory where the file you want to do() is.

And finally, why would you want to make the path absolute anyway?  If it 
is relative, then it'll still work if you move the whole shebang to a 
different web server; if it's absolute, you'll probably have to modify 
the path.

-- 
Bob Walton
Email: http://bwalton.com/cgi-bin/emailbob.pl



------------------------------

Date: Tue, 07 Oct 2003 16:13:49 -0400
From: Peter <feurry@hotmail.com>
Subject: set some fiile data as a variable
Message-Id: <_7Fgb.89575$PD3.4844595@nnrp1.uunet.ca>

I'm trying to get some information from a file
and i can print the right data to shell using

print ($line =~ m/\(-vop crop\=(\d+)/g )

but i want to be able to set this as a variable instead.
I know if i set that line equal to a variable i will just
get a value of 1 for true.  How can i capture the information
print would show in a variable instead of true or false.



------------------------------

Date: Tue, 07 Oct 2003 20:24:38 GMT
From: Michael Budash <mbudash@sonic.net>
Subject: Re: set some fiile data as a variable
Message-Id: <mbudash-9C7C62.13243807102003@typhoon.sonic.net>

In article <_7Fgb.89575$PD3.4844595@nnrp1.uunet.ca>,
 Peter <feurry@hotmail.com> wrote:

> I'm trying to get some information from a file
> and i can print the right data to shell using
> 
> print ($line =~ m/\(-vop crop\=(\d+)/g)
> 
> but i want to be able to set this as a variable 

or variables, since you've specified the /g flag

> instead.
> I know if i set that line equal to a variable i will just
> get a value of 1 for true.  How can i capture the information
> print would show in a variable instead of true or false.

($var) = $line =~ m/\(-vop crop\=(\d+)/g;

or to get 'em all into an array:

@vars = $line =~ m/\(-vop crop\=(\d+)/g;

hth-
-- 
Michael Budash
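
The difference is purely one of context. A small sketch, assuming a
made-up input line that contains the "(-vop crop=" text twice:

```perl
use strict;
use warnings;

# Hypothetical input; only the "(-vop crop=NNN" parts matter.
my $line = 'mplayer (-vop crop=720:560) then (-vop crop=704:480)';

my $re = qr/\(-vop crop=(\d+)/;

my $flag    = $line =~ $re;       # scalar context: only true or false
my ($first) = $line =~ $re;       # list context: the captured digits
my @all     = $line =~ /$re/g;    # list context with /g: every capture

print "$flag $first @all\n";      # 1 720 720 704
```

The parentheses around the variable on the left are what put the match
into list context.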


------------------------------

Date: Tue, 07 Oct 2003 16:40:38 -0400
From: Peter <feurry@hotmail.com>
Subject: Re: set some fiile data as a variable
Message-Id: <8xFgb.89586$PD3.4844290@nnrp1.uunet.ca>

Thanks for your help that worked perfectly!!

($var) = $line =~ m/\(-vop crop\=(\d+)/g;



------------------------------

Date: Tue, 07 Oct 2003 21:44:28 +0200
From: Jens M. Felderhoff <j.m.f.dev.null@gmx.net>
Subject: Re: Strange behaviour with '\r' character [[ sorry my other post was wrong typed ]]
Message-Id: <blv52t$j29$1@newsreader2.netcologne.de>

Helgi Briem <HelgiBriem_1@hotmail.com> wrote:

> On 6 Oct 2003 10:40:49 -0700, i5513@hotmail.com (i5513) wrote:
>
>>Thank you! I have read in perlre:
>>\s	Match a whitespace character
>
>>But I didn't know '\r' was whitespace (I thought \s was ( |\t)).
>
> \s contains 4 different kinds of white space:

No, that's 5 different characters:

> \n line feed
> \r carriage return
> \t tab
>  space

\f form feed

Cheers

Jens
-- 
(Intentionally left blank.)
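
A quick way to check the list is to test each character against \s.
This sketch covers only the five characters named in the thread; the
hash and variable names are made up for illustration.

```perl
use strict;
use warnings;

# The five characters named above.
my %names = (
    "\n" => 'line feed',
    "\r" => 'carriage return',
    "\t" => 'tab',
    " "  => 'space',
    "\f" => 'form feed',
);

# Keep the characters that \s matches on their own.
my @matched = grep { /\A\s\z/ } keys %names;
print "$_\n" for sort map { $names{$_} } @matched;

# Typical consequence: a trailing "\r\n" is removed in one go.
my $dos_line = "value\r\n";
(my $clean = $dos_line) =~ s/\s+\z//;
print "[$clean]\n";               # [value]
```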


------------------------------

Date: 7 Oct 2003 14:57:54 -0700
From: kuujinbo@hotmail.com (ko)
Subject: Re: sysopen problem
Message-Id: <92d64088.0310071357.4ee9382c@posting.google.com>

Hon Guin Lee - Web Producer - SMI Marketing <Hon.Lee@Sun.COM> wrote in message news:<3F828F88.B3BAF345@Sun.COM>...
> My objective is to open a file if it exists, and if not, create a new
> file from the string variable $FILE. The problem is that in some
> occurrences the file can be opened and modified without the compiler
> throwing error messages; otherwise it throws a Permission Denied error
> (not owner) from just modifying a file within the local web server.
> 
>   # open the file and allow it to be modified.  
>   # modify file by adding the content or create a new file.
> 
>   open($FILE, "> $page") || die $!;
>   sysopen($FILE, $page, O_WRONLY|O_CREAT)        || die $!;
>   sysopen($FILE, $page, O_WRONLY|O_CREAT, 0777)  || die $!;  
>   
>   OR
>   
>   sysopen($FILE, $page, O_WRONLY | O_CREAT, 0777) || die $!;

This isn't the right way to open the file the way you want to. It will
destroy any data in an existing file, not add/append. From
'perlopentut':

To open a file for appending, creating one if necessary:

    open(FH, ">> $path");
    sysopen(FH, $path, O_WRONLY | O_APPEND | O_CREAT);

The rest of the problem is in the error message. You have a
permissions problem.  Identify the user account that the program is
running under, and you should have your answer.

HTH - keith
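
A self-contained sketch of the append-or-create open, in case it helps.
It writes to a scratch directory from File::Temp, so the file name is
hypothetical. Note that the O_* constants have to be imported from
Fcntl before sysopen can use them.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_APPEND O_CREAT);   # the O_* constants live here
use File::Temp qw(tempdir);

# Scratch directory so that no real page is touched.
my $dir  = tempdir(CLEANUP => 1);
my $page = "$dir/page.html";

for my $text ("first\n", "second\n") {
    # The first pass creates the file, the second pass appends to it.
    sysopen(my $fh, $page, O_WRONLY | O_APPEND | O_CREAT, 0644)
        or die "Cannot open $page: $!";
    print {$fh} $text;
    close $fh or die "Cannot close $page: $!";
}

open(my $in, '<', $page) or die "Cannot read $page: $!";
my @lines = <$in>;
close $in;
print @lines;
```

Putting the file name into the die message makes a permissions problem
much easier to track down.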


------------------------------

Date: Tue, 7 Oct 2003 21:03:29 +0100
From: "Colin Robbins" <C.J.Robbins@ntlworld.com>
Subject: TCP Listener on Windows XP
Message-Id: <A_Egb.101$B71.34@newsfep1-gui.server.ntli.net>

I have a network application that works fine on Linux, and I want to run it
on a Windows XP machine.   For some reason I cannot get the TCP listener
to respond.

I have cut the code down to the bare minimum to try and debug...

        use IO::Socket;
        use Net::hostent;
        $PORT = 2345;
        $server = IO::Socket::INET->new( Proto     => 'tcp',
                                         LocalPort => $PORT,
                                         Listen    => SOMAXCONN,
                                         Reuse     => 1);
        die "can't setup server" unless $server;
        print "[Server $0 accepting clients]\n";
        while ($client = $server->accept()) {
                $hostinfo = gethostbyaddr($client->peeraddr);
                printf "Connect from %s\n",
                       $hostinfo->name || $client->peerhost;
                $line = <$client>;
                print "Got:  $line \n";
                close $client;
        }

When I run this, and connect a client, I get as far as the "Connect from..."
message, but the $line=<$client> never returns anything.

Any ideas why this will not work on XP?

I am using ActiveState perl 5.8.0.


-- 
Colin Robbins
http://www.robbins4.freeserve.co.uk




------------------------------

Date: Tue, 07 Oct 2003 19:12:29 GMT
From: Henry <henryn@zzzspacebbs.com>
Subject: Re: Teach me how to fish, regexp
Message-Id: <BBA85E24.1558D%henryn@zzzspacebbs.com>

Martien Verbruggen:

Thank you for your response to my post:

in article slrnbo4p6t.pv1.mgjv@verbruggen.comdyn.com.au, Martien Verbruggen
at mgjv@tradingpost.com.au wrote on 10/7/03 12:01 AM:

> On Tue, 07 Oct 2003 04:34:05 GMT, Henry <henryn@zzzspacebbs.com> wrote: Folks:
> 
<snip>
>> 
> Is the first paragraph also preceded by three blank lines? And do you mean
> three blank lines, or three newlines? I will assume three newlines (i.e. two
> blank lines).

It appears that there are _usually_ three blank lines, i.e. four newlines
preceding each new section.  It appears that all other breaks --the ones I
don't want to find-- are shorter (fewer newlines) but I don't know how
reliable this is.
> 
> [snip of example records, see code, below]
> 
>> Seems the best way to deal with this is to slurp, and use "split" with the
>> appropriate regexp.  Wrinkle: I need to retain the section numbers in the
>> return strings.
>> 
> I would probably set the input record separator ($/, see perlvar) to "", which
> will treat two or more consecutive newlines as the record separator. Then each
> record starts with the number you're interested in.

Right, that's what I finally did, in effect. (I did something similar at the
"split".)  But this  isn't very robust, I think: it depends on some typist
somewhere _always_ following the rules.

I think you are saying that slurp mode may not be the best choice.

As far as your setting

   $/ = "";

This is not exactly intuitive from the point of view of a newcomer.
Sorry, could you help me understand  (or give me a blind rule of thumb) how
what looks like setting a variable to an empty string implies "two or more
successive newlines"?

> 
> #!/usr/local/bin/perl
> use warnings;
> use strict;
> 
> $/ = "";
> while (<DATA>) {
>     chomp;
>     if (my ($num, $para) = /^(\d+(?:\.\d)?)\.  (.*)/s) {
>         print "[$num] $para\n";
>     } else {
>         print "MALFORMED RECORD\n";
>     }
> }
> 
<snip> 
<snip>
> 
> === End example program===

Thanks for taking all the trouble to explain the components in detail:
> /
>   ^           # from the beginning of the record

Right.  

>   (           # start capture

Capture?  I guess you mean the mysterious "save the stuff you match"
mechanisms I've found in some perl references.  The explanations I've found
are very short and not very useful.   Also:  I find it hard to discriminate
between parens used for operation grouping and this use.

>     \d+       # one or more digits

Yes.

>     (?:       # start grouping, but no capturing

Sorry, could you speak more fully about this?  Again, I haven't found a good
reference for this stuff.

>       \.\d    # A literal . followed by a digit

Right.

>     )         # end grouping

OK, as above.

>     ?         # previous (group) one or zero times, i.e. it's optional

OK.

>   )           # end capturing

OK, as above.

>   .\ \        # literal . followed by two spaces

Sorry, I don't get that.  Could you explain more fully?  I think that I
understand that a period, unescaped, matches any character, so I would
expect that you'd have to escape before the period to match a literal
period/decimal point.

>   (.*)        # capture the rest of the record

I think I understand that

   .*

means "any character, repeated 0 or more times", but I don't get how the
parens lead to capture (and not operation grouping, as above) and eventual
appearance of the captured data somewhere.

> /sx
> 
> 
> The s modifier makes . match newlines, and the x modifier allows the comments
> I put in (which is also why I needed to escape the spaces in this version, and
> not in the one above).

OK.  (The modifiers mechanism takes some getting used to.)

> The first capturing set of parentheses returns the paragraph number, including
> the sub-number, if present, and the second capturing parentheses set returns
> the "Blah, blah.." bit up to the end of the record.

Right, as I said above, I can't figure out how this aspect works.   This may
seem obvious to you but looks like a hidden (or magical) side-effect to me.
> 
> Also see the perlvar and perlre documentation for more information.

My desk and my screen are littered with various references.   Thanks for
pointing out these man "subreferences" -- I had not noticed them.
> 
> If two newlines is not a record splitter, and you _have_ to use a minimum of
> three, this won't work.

Sorry, could you speak more fully about this?  Is there a restriction I'm
not seeing?

> You can't even check after reading a record whether it
> ends in more than two newlines, since it always will end in exactly two, no
> matter how many are in the input (which is pretty annoying), so you'd have to
> probably set $/ to "\n\n\n", and remove any leading and trailing whitespace
> yourself and skip "empty" records:

Right.  This is exactly where I arrived before I decided I needed help and
posted my original question, except that I stayed with slurping the data
instead of sort-of line-at-a-time processing.
> 
> #!/usr/local/bin/perl
> use warnings;
> use strict;
> 
> $/ = "\n\n\n";
> while (<DATA>) {
>     s/^\s+//;
>     s/\s+$//;
>     next if $_ eq "";
> 
>     if (my ($num, $para) = /^(\d+(?:\.\d)?)\.  (.*)/s) {
>         print "[$num] $para\n";
>     } else {
>         print "ILLEGAL RECORD\n";
>     }
> }
> 
<snip>

Thanks. It would seem that some of the mystifications I asked about above
appear here also. 

Thanks for your patience.

Thanks,

Henry

henryn@zzzspacebbs.com  remove 'zzz'
> 
> 
> Martien
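
On the $/ = "" question: the empty string is a special value that
switches the input operator into "paragraph mode", where any run of
blank lines ends a record. The sketch below tries this on made-up
sample text, read from an in-memory filehandle instead of DATA; the
section texts are assumptions for illustration only.

```perl
use strict;
use warnings;

# Sample text: records separated by two, then by four, newlines.
my $text = "1.  First section\nmore text\n\n2.1.  Second\n\n\n\n3.  Third\n";

open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";

my @records;
{
    local $/ = "";            # paragraph mode: blank lines end a record
    while (my $record = <$fh>) {
        chomp $record;        # in paragraph mode this drops all trailing newlines
        push @records, $record;
    }
}
close $fh;

for my $record (@records) {
    # The parentheses capture the section number and the rest of the text.
    if (my ($num, $para) = $record =~ /^(\d+(?:\.\d)?)\.  (.*)/s) {
        print "[$num] $para\n";
    }
}
```

The captures are not a hidden side effect: in list context the match
returns what each pair of capturing parentheses matched, and the list
assignment stores those values in $num and $para.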



------------------------------

Date: Tue, 07 Oct 2003 19:12:49 GMT
From: Henry <henryn@zzzspacebbs.com>
Subject: Re: Teach me how to fish, regexp
Message-Id: <BBA85E37.1558D%henryn@zzzspacebbs.com>

Roy Johnson:

Thanks for your response to my post:

in article 3ee08638.0310070536.697f7ef3@posting.google.com, Roy Johnson at
rjohnson@shell.com wrote on 10/7/03 6:36 AM:

> If I may take the last question first:
> 
> Henry <henryn@zzzspacebbs.com> wrote in message
> news:<BBA79042.154E5%henryn@zzzspacebbs.com>...
>> Or should I go back to my awk hack that works and which I actually
>> understand?
> 
> You could always run your awk hack through a2p to see how it deals
> with your situation. Could be ugly, could be enlightening.

Good idea.   I already did that.  It _was_ ugly.  (Kind of like seeing
myself on TV.  Yech! )
> 
> If you do this:
> my @paras = split(/\n{3}(\d+\.(?:\d+.)?)  /, $whole_file);
> 
> You will have your section headers split out as their own paragraphs,
> followed by the paragraphs themselves. Then you just have to put them
> back together as you go through the list. (Decide for yourself whether
> you want the spaces after the section number retained. My split is
> throwing them away.)
> 
> If you want three or more newlines, change the {3} to {3,}. Season to
> taste.

Thanks! I plugged your expression in to my test scaffolding.  The result is
quite workable; I just need to traverse the resulting array appropriately;
which should be no problem.  So you've given me a fish (a solution).

I need to learn how to catch my own fish.   So I'll backtrack and make sure
I understand your solution.  Please have patience with me; I'm (obviously)
new to all this.

#1 What does using "my" mean? (Give me a clue -- a keyword; I'll look it up.
Googling for "my" and "perl" has not been particularly enlightening.)

#2 Is there any difference --except brevity-- between writing

    \n{3}

and

     \n\n\n

-- both mean "exactly 3 newlines in succession" right?

I understand the fundamental match expression components, excepting your use
of parens and "?:".

#3 To get started, is there a difference between uttering

  my @paras = split(/\n{3}(\d+\.(?:\d+.)?)  /, $whole_file);

and

  my @paras = split /\n{3}(\d+\.(?:\d+.)?)   /, $whole_file;

Both seem to work the same way in my test scaffolding.

#4 .... And I _really_ don't see how this expression leads to getting
alternating saved section numbers and section contents in the output array.
This seems to be based on a bit of assumptions/side effects/magic, and I
haven't yet found the right perl reference to explain it.
> 
>> I can't even figure out why I seem to need "[0-9][0-9]+" for my 5 digit test
>> case when it seems "[0-9]+" ought to suffice.
> 
> I agree (although I recommend \d instead of [0-9]).

OK, these are equivalent, though, right?  Is this a matter of style,
dialect, common usage, modernity, or what?

My preference for [0-9] is only this:  that construct seems more versatile
and so perhaps I can do more with less demands on my internal (brain) memory
or fewer references to a perl regexp cheat-sheet.

> What was [0-9]+ doing wrong that [0-9][0-9]+ fixed?

Hmmmm, I forget. There were so many...

Thanks,

Henry

henryn@zzzspacebbs.com  remove 'zzz'
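
On question #4: when the split pattern contains capturing parentheses,
split also returns whatever they captured, placed between the pieces.
The sketch below shows this on made-up sample text. It uses a variant
of the pattern with the second dot escaped, which is an assumption
about the intent and not the pattern exactly as posted.

```perl
use strict;
use warnings;

# Sample text: each section starts after three newlines.
my $whole_file = "\n\n\n1.  Intro text\n\n\n2.1.  Details here";

# The capture group makes split return the section numbers as well.
my @paras = split /\n{3}(\d+\.(?:\d+\.)?)  /, $whole_file;

# The text before the first separator is empty here; drop it.
shift @paras if @paras && $paras[0] eq '';

# Pair each section number with the text that follows it.
my @sections;
while (my ($num, $body) = splice(@paras, 0, 2)) {
    push @sections, "$num  $body";
}

print "$_\n" for @sections;
```

The (?:...) group only groups, so it adds nothing to the returned list.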



------------------------------

Date: Sat, 19 Jul 2003 01:59:56 GMT
From: Bob Walton <bwalton@rochester.rr.com>
Subject: Re: 
Message-Id: <3F18A600.3040306@rochester.rr.com>

Ron wrote:

> Tried this code get a server 500 error.
> 
> Anyone know what's wrong with it?
> 
> if $DayName eq "Select a Day" or $RouteName eq "Select A Route") {

(---^


>     dienice("Please use the back button on your browser to fill out the Day
> & Route fields.");
> }
 ...
> Ron

 ...
-- 
Bob Walton



------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 5630
***************************************

