[22113] in Perl-Users-Digest


Perl-Users Digest, Issue: 4335 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Jan 2 21:05:52 2003

Date: Thu, 2 Jan 2003 18:05:05 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Thu, 2 Jan 2003     Volume: 10 Number: 4335

Today's topics:
    Re: AWK vs PERL - splitting fields <krahnj@acm.org>
        FORMAT problem <never@home.com>
    Re: FORMAT problem <goldbb2@earthlink.net>
    Re: FORMAT problem <mgjv@tradingpost.com.au>
    Re: FORMAT problem <never@home.com>
    Re: FORMAT problem <mgjv@tradingpost.com.au>
    Re: LWP & Proxy/Firewalls <mgjv@tradingpost.com.au>
    Re: Perl for spliting vcf files (palm->iPod) <krahnj@acm.org>
        Prototype declaration with built-in function (Jeff Mott)
    Re: Prototype declaration with built-in function <goldbb2@earthlink.net>
    Re: Sorting hash tree from Xml::simple. (stew dean)
    Re: Sorting hash tree from Xml::simple. <goldbb2@earthlink.net>
    Re: vectors & large amounts of data - time & space prob <goldbb2@earthlink.net>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Fri, 03 Jan 2003 01:49:34 GMT
From: "John W. Krahn" <krahnj@acm.org>
Subject: Re: AWK vs PERL - splitting fields
Message-Id: <3E14EBAC.ADA2868C@acm.org>

Christopher Hamel wrote:
> 
> On that note, 'cut' is likely faster than AWK if the only goal is
> splitting fields, but neither AWK nor cut nor <insert favorite OS tool
> here> is really a programming language.  AWK is a nice tool, and I
> like it a lot, but it's no more a programming language than 'cat.'

I'm sure Alfred Aho, Brian Kernighan, and Peter Weinberger would
disagree with you.  :-)

http://cm.bell-labs.com/cm/cs/awkbook/


John
-- 
use Perl;
program
fulfillment


------------------------------

Date: Thu, 02 Jan 2003 18:29:13 -0500
From: Jason Lixfeld <never@home.com>
Subject: FORMAT problem
Message-Id: <v19iren10qsif6@corp.supernews.com>

I've been racking my brain for the last week or so trying to figure out what 
I'm doing wrong here.  I have a bunch of data that I need to sort, format 
and write out into different files.  I want to assign the format name to be 
the value of a variable so I can write to different files, but it doesn't 
seem to want to work for me.  All I get is:

Undefined format "w7" called at ./db2nag.pl line 148, <NAGIOS> chunk 240.

Line 148 in the full script points to the "write;" shown below.

If I comment out $~ = "$hg" and select($~), it will write everything to 
STDOUT without a problem so the script appears to be doing what it's 
supposed to do, except I can't figure out how to get it to write to 
different filehandles and change the filehandles and actually have it work.

sub hg_parse {
        foreach $key (keys %w_type) {
                $fh{$w_type{$key}{TYPE}} = new IO::File;
                $fh{$w_type{$key}{TYPE}}->open(">>$w_type{$key}{TYPE}.hostgroup.cfg");
                $hg = $type = $w_type{$key}{TYPE};
                push @$type, "$key\.$type";
                $$hg = join(',', @$type);
        };
        foreach $key (keys %w_list) {
                $hg = $w_list{$key};
                $~ = "$hg";
                select($~);
format =
define hostgroup {
       hostgroup_name  nexxia_pppoe_dsl
       alias           Nexxia PPPoE DSL (w7)
       contact_groups  w7.email,w7.pager
       members         @*
                       $$hg
       }
.
write;
        };

Any ideas?


------------------------------

Date: Thu, 02 Jan 2003 18:50:12 -0500
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: FORMAT problem
Message-Id: <3E14D034.EE4CF423@earthlink.net>

Jason Lixfeld wrote:
[snip]
> Any ideas?

Rewrite your code so that it works with 'use strict' enabled, and
perhaps we can help you.

-- 
$..='(?:(?{local$^C=$^C|'.(1<<$_).'})|)'for+a..4;
$..='(?{print+substr"\n !,$^C,1 if $^C<26})(?!)';
$.=~s'!'haktrsreltanPJ,r  coeueh"';BEGIN{${"\cH"}
|=(1<<21)}""=~$.;qw(Just another Perl hacker,\n);


------------------------------

Date: Fri, 03 Jan 2003 00:17:11 GMT
From: Martien Verbruggen <mgjv@tradingpost.com.au>
Subject: Re: FORMAT problem
Message-Id: <slrnb19loo.307.mgjv@verbruggen.comdyn.com.au>

On Thu, 02 Jan 2003 18:29:13 -0500,
	Jason Lixfeld <never@home.com> wrote:
> I've been racking my brain for the last week or so trying to figure out what 
> I'm doing wrong here.  I have a bunch of data that I need to sort, format 
> and write out into different files.  I want to assign the format name to be 
> the value of a variable so I can write to different files, but it doesn't 
> seem to want to work for me.  All I get is:
> 
> Undefined format "w7" called at ./db2nag.pl line 148, <NAGIOS> chunk 240.

That means that you're using write on a file handle that is called w7,
or you have set $~ explicitly to w7, and there is no format with that
name.

> Line 148 in the full script points to the "write;" shown below.
> 
> If I comment out $~ = "$hg" and select($~), it will write everything to 

Why those quotes around $hg?

And what is $hg? I bet it is "w7", which means that you have just told
write that the format name for the current output channel (normally
STDOUT) is w7. So the next write will look for a format called "w7".

Why are you using the select as well? You can't do that, unless you
have actually opened a file handle there. I think you are terribly
confused about how this all works. You have read the entries for
select and format in perlfunc, and the perlform documentation, right?

> STDOUT without a problem so the script appears to be doing what it's 

If you want to print to STDOUT, then don't use $~, or select(), and
create a format with the name STDOUT, or leave the name off.

> supposed to do, except I can't figure out how to get it to write to 
> different filehandles and change the filehandles and actually have it work.
> 
> sub hg_parse {
>         foreach $key (keys %w_type) {
>                 $fh{$w_type{$key}{TYPE}} = new IO::File;

You need to write your code with use strict and use warnings on. It'll
really help you figure out things.

I doubt very, very much that formats work with IO::File objects, like
this.

> $fh{$w_type{$key}{TYPE}}->open(">>$w_type{$key}{TYPE}.hostgroup.cfg");
>                 $hg = $type = $w_type{$key}{TYPE};

You should really tell us what $hg is supposed to be, because

>                 push @$type, "$key\.$type";

here you use $type as an ARRAY reference, and

>                 $$hg = join(',', @$type);

here you are treating $hg as a reference to a scalar, which seems a bit
suspect, since you set them to the same value, just two lines ago.

>         };
>         foreach $key (keys %w_list) {
>                 $hg = $w_list{$key};
>                 $~ = "$hg";
>                 select($~);

You select something that isn't a valid file handle, and you set $~ to
something that does not have a format associated with it.

> format =

This format will only work for STDOUT.

> define hostgroup {
>        hostgroup_name  nexxia_pppoe_dsl
>        alias           Nexxia PPPoE DSL (w7)
>        contact_groups  w7.email,w7.pager
>        members         @*
>                        $$hg

What is this $$hg supposed to be?

>        }
> .
> write;
>         };

> Any ideas?

1 - If you must use format, give the format a name that you will use
    for the file handle, and actually open the file you are trying to
    write to.

    Something like this would probably work, but I would _strongly_
    advise against it:

#! /usr/bin/perl -w
use strict;

our $hg;

my %w_list = (key1 => "val1", key2 => "val2");

for my $key (keys %w_list)
{
    open FOO, ">>$w_list{$key}" or die "open '$w_list{$key}': $!";
    $hg = $w_list{$key};
    write FOO;
    close FOO;
}

format FOO =
define hostgroup {
       hostgroup_name  nexxia_pppoe_dsl
       alias           Nexxia PPPoE DSL (w7)
       contact_groups  w7.email,w7.pager
       members         @*
                       $hg
       }
.

__END__


2 - Don't use format. Write a subroutine that uses printf, and pass it
    a file handle to print to. Much easier, and much nicer. Perl
    formats are very powerful, but a bit odd, and hard to use. It also
    makes scoping variables hard. Making your code look nice and
    organised, and keeping the scope of your variables (and file
    handles!) under control is hard in the presence of formats.

    Something like this, although I really have no idea about your data
    structures, and what is supposed to be in them:

#! /usr/bin/perl -w
use strict;

my %w_list = (key1 => "val1", key2 => "val2");

for my $key (keys %w_list)
{
    open my $foo, ">>$w_list{$key}" or die "open '$w_list{$key}': $!";
    write_hostgroup($foo, $w_list{$key});
}

sub write_hostgroup
{
    my ($fh, $hg) = @_;

    print $fh <<EOHG;
define hostgroup {
       hostgroup_name  nexxia_pppoe_dsl
       alias           Nexxia PPPoE DSL (w7)
       contact_groups  w7.email,w7.pager
       members         $hg
       }
EOHG
}
__END__


If you have an older Perl that doesn't understand the "open my $foo"
bit, then use your IO::File stuff. It should work pretty much the same
way.

Martien
-- 
                        | 
Martien Verbruggen      | Begin at the beginning and go on till you
Trading Post Australia  | come to the end; then stop.
                        | 


------------------------------

Date: Thu, 02 Jan 2003 20:17:10 -0500
From: Jason Lixfeld <never@home.com>
Subject: Re: FORMAT problem
Message-Id: <v19p67p7pf0t7e@corp.supernews.com>

Martien Verbruggen wrote:

> On Thu, 02 Jan 2003 18:29:13 -0500,
> Jason Lixfeld <never@home.com> wrote:
>> I've been racking my brain for the last week or so trying to figure out
>> what
>> I'm doing wrong here.  I have a bunch of data that I need to sort, format
>> and write out into different files.  I want to assign the format name to
>> be the value of a variable so I can write to different files, but it
>> doesn't
>> seem to want to work for me.  All I get is:
>> 
>> Undefined format "w7" called at ./db2nag.pl line 148, <NAGIOS> chunk 240.
> 
> That means that you're using write on a file handle that is called w7,
> or you have set $~ explicitly to w7, and there is no format with that
> name.

First off, thanks for taking the time to write your very informative reply.

Second off, I'm quite the newbie using what I think is rather complex code 
for my experience level :)

>> Line 148 in the full script points to the "write;" shown below.
>> 
>> If I comment out $~ = "$hg" and select($~), it will write everything to
> 
> Why those quotes around $hg?

No real reason (see above about being a newbie) :)

> And what is $hg? I bet it is "w7", which means that you have just told
> write that the format name for the current output channel (normally
> STDOUT) is w7. So the next write will look for a format called "w7".

It's rather difficult for me to try and articulate what I'm trying to 
accomplish, but I will try...

I'm pulling entries out of a database.  Each entry consists of a username 
and IP address pair, ie:  ("joeuser", "192.168.196.10").  The information 
gets parsed and evaluated and put into a hash.  The evaluation is based on 
the value of the 3rd octet of the IP address (196 in the case of the above 
example).  There is a hash further up in the script that contains each 
possible 3rd octet and a corresponding value:

%w_list =       (197 => "w7",
                196 => "w8",
                193 => "w6"
                );

The structure of the username/ip hash after it is parsed is as follows:

%w_type =       (<username>     => { TYPE => <w_list_val>,
                                     IP   => <ip_address> }
                );

So the entry for joeuser, for example would look like:

$w_type{joeuser} = {
                TYPE => "w8",
                IP   => "192.168.196.10"
                };
                
The code snippet I'm trying to work out now looks through each key of 
%w_type and creates a filehandle named $w_type{$key}{TYPE}.  In the 
end, 3 file handles should be created based on what's in the w_list hash.

w6, w7 and w8 (in answer to your question above, yes $hg = w7).

The goal is to look through each $key{TYPE} of %w_type and, whatever the TYPE 
is, add the $username.$type to an array called @$type (i.e. @w8).  Once 
those arrays are all populated, then write out the contents of each array 
to the filehandle with the same name.  In the end, I'm looking to have 3 
files, w6.hostgroup.cfg, w7.hostgroup.cfg and w8.hostgroup.cfg.  Any 
contents of %w_type that have TYPE w6 get put into the w6.hostgroup.cfg 
file, TYPE w7 into the w7.hostgroup.cfg, etc.

I was going to use format because I am familiar with them and I have already 
used them at an earlier point in the same script.

> Why are you using the select as well? You can't do that, unless you
> have actually opened a file handle there. I think you are terribly
> confused about how this all works. You have read the entries for
> select and format in perlfunc, and the perlform documentation, right?

The filehandle was actually opened.  I was using select because I thought 
you needed to use select to tell write what output channel you wanted to 
use.

>> STDOUT without a problem so the script appears to be doing what it's
> 
> If you want to print to STDOUT, then don't use $~, or select(), and
> create a format with the name STDOUT, or leave the name off.

Yes, I know.  I was just using the STDOUT as an example.  That's not the 
production solution, it was just a test.

>> supposed to do, except I can't figure out how to get it to write to
>> different filehandles and change the filehandles and actually have it
>> work.
>> 
>> sub hg_parse {
>>         foreach $key (keys %w_type) {
>>                 $fh{$w_type{$key}{TYPE}} = new IO::File;
> 
> You need to write your code with use strict and use warnings on. It'll
> really help you figure out things.

I turned on -w and it spat out some concatenation errors but I didn't really 
want to get into trying to fix them because the script was working fine up 
until that point.  I'll read up on strict and try to get the script to work 
with that.

> I doubt very, very much that formats work with IO::File objects, like
> this.

That's what it's shaping up to look like :(

>> $fh{$w_type{$key}{TYPE}}->open(">>$w_type{$key}{TYPE}.hostgroup.cfg");
>>                 $hg = $type = $w_type{$key}{TYPE};
> 
> You should really tell us what $hg is supposed to be, because
> 
>>                 push @$type, "$key\.$type";
> 
> here you use $type as an ARRAY reference, and
> 
>>                 $$hg = join(',', @$type);
> 
> here you are treating $hg as a reference to a scalar, which seems a bit
> suspect, since you set them to the same value, just two lines ago.

Yes, the code is messy and not optimal by any stretch, I know.  That's a 
product of my inexperience.

>>         };
>>         foreach $key (keys %w_list) {
>>                 $hg = $w_list{$key};
>>                 $~ = "$hg";
>>                 select($~);
> 
> You select something that isn't a valid file handle, and you set $~ to
> something that does not have a format associated with it.

Actually, the filehandle was created here:

$fh{$w_type{$key}{TYPE}} = new IO::File;

$w_type{$key}{TYPE} has the same data that $w_list{$key} has.

(I know, messy and bad and stupid and <your choice four letter word here>) :)

>> format =
> 
> This format will only work for STDOUT.

Yeah, see that's where I am confused.  I thought that $~ would not use 
STDOUT and define format = as ($~ = $hg).

The filehandles are dynamic so I can't give them static names so I thought 
$~ would do what I wanted.

>> define hostgroup {
>>        hostgroup_name  nexxia_pppoe_dsl
>>        alias           Nexxia PPPoE DSL (w7)
>>        contact_groups  w7.email,w7.pager
>>        members         @*
>>                        $$hg
> 
> What is this $$hg supposed to be?

That is: $$hg = join(',', @$type);

>>        }
>> .
>> write;
>>         };
> 
>> Any ideas?
> 
> 1 - If you must use format, give the format a name that you will use
>     for the file handle, and actually open the file you are trying to
>     write to.

My problem is that I (think I) need everything to be dynamic since I'm 
trying to write to multiple files depending on content.

I guess because I don't have much experience, I can't see any of the many 
other ways to do it so I keep thinking along these lines because they are 
the ones that make sense to me so far.

>     Something like this would probably work, but I would _strongly_
>     advise against it:
> 
> #! /usr/bin/perl -w
> use strict;
> 
> our $hg;
> 
> my %w_list = (key1 => "val1", key2 => "val2");
> 
> for my $key (keys %w_list)
> {
>     open FOO, ">>$w_list{$key}" or die "open '$w_list{$key}': $!";
>     $hg = $w_list{$key};
>     write FOO;
>     close FOO;
> }
> 
> format FOO =
> define hostgroup {
>        hostgroup_name  nexxia_pppoe_dsl
>        alias           Nexxia PPPoE DSL (w7)
>        contact_groups  w7.email,w7.pager
>        members         @*
>                        $hg
>        }
> .
> 
> __END__
> 
> 
> 2 - Don't use format. Write a subroutine that uses printf, and pass it
>     a file handle to print to. Much easier, and much nicer. Perl
>     formats are very powerful, but a bit odd, and hard to use. It also
>     makes scoping variables hard. Making your code look nice and
>     organised, and keeping the scope of your variables (and file
>     handles!) under control is hard in the presence of formats.

I will read up on the <<EOHG snippet you posted below.  I don't 
understand what all that means :)

Like I said, the format method is what the documentation suggests (as far as 
what is in my scope of understanding).

>     Something like this, although I really have no idea about your data
>     structures, and what is supposed to be in them:
> 
> #! /usr/bin/perl -w
> use strict;
> 
> my %w_list = (key1 => "val1", key2 => "val2");
> 
> for my $key (keys %w_list)
> {
>     open my $foo, ">>$w_list{$key}" or die "open '$w_list{$key}': $!";
>     write_hostgroup($foo, $w_list{$key});
> }
> 
> sub write_hostgroup
> {
>     my ($fh, $hg) = @_;
> 
>     print $fh <<EOHG;
> define hostgroup {
>        hostgroup_name  nexxia_pppoe_dsl
>        alias           Nexxia PPPoE DSL (w7)
>        contact_groups  w7.email,w7.pager
>        members         $hg
>        }
> EOHG
> }
> __END__
> 
> 
> If you have an older Perl that doesn't understand the "open my $foo"
> bit, then use your IO::File stuff. It should work pretty much the same
> way.

Nope, I'm using 5.8.0.

> Martien



------------------------------

Date: Fri, 03 Jan 2003 02:02:56 GMT
From: Martien Verbruggen <mgjv@tradingpost.com.au>
Subject: Re: FORMAT problem
Message-Id: <slrnb19ruk.307.mgjv@verbruggen.comdyn.com.au>

On Thu, 02 Jan 2003 20:17:10 -0500,
	Jason Lixfeld <never@home.com> wrote:
> Martien Verbruggen wrote:
> 
>> On Thu, 02 Jan 2003 18:29:13 -0500,
>> Jason Lixfeld <never@home.com> wrote:

> I was going to use format because I am familiar with them and I have already 
> used them at an earlier point in the same script.
> 
>> Why are you using the select as well? You can't do that, unless you
>> have actually opened a file handle there. I think you are terribly
>> confused about how this all works. You have read the entries for
>> select and format in perlfunc, and the perlform documentation, right?
> 
> The filehandle was actually opened.  I was using select because I thought 
> you needed to use select to tell write what output channel you wanted to 
> use.

Ah, I see now what you were trying to do...

I'd avoid select and formats, both. It is much nicer to be explicit
about the file handle you use when you call print, or write. Since
your filehandles are actually IO::File objects, they are also easy to
pass around to subroutines (although, since you are using perl 5.8.0,
you should probably just use open with a lexical (my-ed) variable as
the first argument, see my example from the previous post).

Also, formats have this action-at-a-distance thing that I dislike.
Whenever write is called, you have to figure out which format it would
require (based on the file handle and the setting of $~ for that file
handle) and then you'd have to find that format, which could be
anywhere in the file, and you'd have to track down the variables used
by this format. See how that can lead to messy programs? A print is
much easier to understand, parse and read by humans, and much easier
to debug.

Formats are very useful things if you need to create reports on
terminals or fixed-width printers, and you create a template to fill
out with a bunch of global variables. The limitations of formats are
very perl-4-ish. You could wrap a format and some lexical variables in
a subroutine, together with a localised file handle, but that is a bit
sub-optimal, to me.  While it is possible to do all this, it quickly
becomes ugly, and you find yourself using globals all over the place,
for no apparent reason except that you're using format.

I have only used formats in perl once, in the last 5 years, and that
was to print out a multipage credit report on a fixed-type printer,
and the program was small (basically a wrapper around a database
stored procedure). All other output is done with print, or printf
(sometimes pack) if I need to enforce certain formatting.

>> 
>> You need to write your code with use strict and use warnings on. It'll
>> really help you figure out things.
> 
> I turned on -w and it spat out some concatenation errors but I didn't really 
> want to get into trying to fix them because the script was working fine up 
> until that point.  I'll read up on strict and try to get the script to work 
> with that.

I'd advise you, again, to fix those problems, and to always code with
-w and strict and to scope your variables as tightly as possible with
my.  It will encourage you to write much cleaner and easier to debug
code, and to fix problems before they become real problems.

>> I doubt very, very much that formats work with IO::File objects, like
>> this.
> 
> That's what it's shaping up to look like :(

As I said, formats are very perl-4-ish, but some work has been done to
allow it to work a bit better with non-glob file handles. I believe
that the FileHandle::format_name function could work with IO::File
objects or lexically scoped file handles, but I'm not sure. You'd
still need a name to associate with each handle...

>>>         };
>>>         foreach $key (keys %w_list) {
>>>                 $hg = $w_list{$key};
>>>                 $~ = "$hg";
>>>                 select($~);

while here you're using the name as the handle. Or actually, since $hg
is a reference to a scalar, you use the stringified version of that
thing as the name as well as the handle. Neither is what it should be.

> I will read up on the <<EOHG snippet you posted below.  I don't 
> understand what all that means :)

See the perldata documentation. It's called a here-doc. It is
basically a (double-quoted, in this case) string starting at the line
just after <<EOHG, and continuing to the line that starts with EOHG.
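A minimal sketch of the here-doc described above. The variable $members
and its value are made up for illustration; only the <<EOHG syntax
comes from the earlier example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical value, only for illustration.
my $members = "joeuser.w8,janeuser.w8";

# An unquoted <<EOHG behaves like a double-quoted string, so $members
# is interpolated. The text ends at the line containing only EOHG.
my $text = <<EOHG;
define hostgroup {
       members         $members
       }
EOHG

print $text;
```

Writing <<'EOHG' (single quotes) instead would switch interpolation off,
which is handy when the text itself contains $ or @ characters.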

> Like I said, the format method is what the documentation suggests (as far as 
> what is in my scope of understanding).

If you want to use write(), then yes, you need format. However, you
should probably not use write() :)

> Nope, I'm using 5.8.0.

Then I would get rid of the IO::File objects and use open in the way I
did in this example (they're IO::File objects behind the scenes
anyway).  Then I'd use a subroutine either like the one above, with a
print, or if you still feel more comfortable with format, you could
use this:

use IO::Handle;
sub write_hostgroup 
{
    my ($fh, $hg) = @_;
    $fh->format_name("host_group_format");
    write $fh;

format host_group_format =
define hostgroup {     
       hostgroup_name  nexxia_pppoe_dsl
       alias           Nexxia PPPoE DSL (w7)
       contact_groups  w7.email,w7.pager
       members         @*
                       $hg
       }
.
}

At least now it is all grouped together, even though the format is
still a global one (with only access to the lexical local variable
$hg). Using the format outside of this subroutine will result in
oddness.

But again, if I were you, I'd give up on formats when print can do the
job just as easily, and more readably. The version of this subroutine
with print is even shorter than this one, and that has to be a good
thing, no? :)
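
Putting the pieces of this thread together, one possible shape for the
whole task is sketched below. It is only a sketch under assumptions: the
sample users, the helper members_by_type and the scratch directory are
invented for illustration, and the stanza layout is copied from the
examples above. It groups the members in a hash of arrays rather than in
variables named after the data, and writes one file per type with print.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Made-up sample data in the shape described earlier in the thread.
my %w_type = (
    joeuser  => { TYPE => 'w8', IP => '192.168.196.10' },
    janeuser => { TYPE => 'w8', IP => '192.168.196.11' },
    bobuser  => { TYPE => 'w7', IP => '192.168.197.20' },
);

# Hypothetical helper: collect "username.type" strings per type.
sub members_by_type {
    my ($users) = @_;
    my %members;
    for my $user (sort keys %$users) {
        my $type = $users->{$user}{TYPE};
        push @{ $members{$type} }, "$user.$type";
    }
    return %members;
}

# Print one hostgroup stanza to an already opened filehandle.
sub write_hostgroup {
    my ($fh, $type, @members) = @_;
    my $list = join ',', @members;
    print {$fh} <<EOHG;
define hostgroup {
       hostgroup_name  nexxia_pppoe_dsl
       alias           Nexxia PPPoE DSL ($type)
       contact_groups  $type.email,$type.pager
       members         $list
       }
EOHG
}

my $dir     = tempdir(CLEANUP => 1);   # scratch directory for the sketch
my %members = members_by_type(\%w_type);

for my $type (sort keys %members) {
    my $file = File::Spec->catfile($dir, "$type.hostgroup.cfg");
    open my $fh, '>>', $file or die "open '$file': $!";
    write_hostgroup($fh, $type, @{ $members{$type} });
    close $fh or die "close '$file': $!";
}
```

In a real script the scratch directory would be replaced by wherever the
.cfg files are supposed to live.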

Martien
-- 
                        | 
Martien Verbruggen      | Useful Statistic: 75% of the people make up
Trading Post Australia  | 3/4 of the population.
                        | 


------------------------------

Date: Thu, 02 Jan 2003 23:44:57 GMT
From: Martien Verbruggen <mgjv@tradingpost.com.au>
Subject: Re: LWP & Proxy/Firewalls
Message-Id: <slrnb19jsb.307.mgjv@verbruggen.comdyn.com.au>

On Thu, 02 Jan 2003 09:31:46 GMT,
	AnonyMoose <echao27@ameritech.net> wrote:
> Hi All:
> 
> Any suggestions on handling proxy/firewall servers that require a UID &
> password before permitting access to any external sites ?
> 
> I already tried using the UserAgent basic-authentication option, which
> failed.

But you're supposed to authenticate against the _proxy_, right? Not
the remote site?

The lwpcook documentation has this example:

$ua->proxy(['http', 'ftp'] => 'http://username:password@proxy.myorg.com');


Martien
-- 
                        | 
Martien Verbruggen      | This matter is best disposed of from a great
Trading Post Australia  | height, over water.
                        | 


------------------------------

Date: Fri, 03 Jan 2003 01:32:15 GMT
From: "John W. Krahn" <krahnj@acm.org>
Subject: Re: Perl for spliting vcf files (palm->iPod)
Message-Id: <3E14E79E.F218E9E9@acm.org>

Michael Robbins wrote:
> 
> Palm software outputs a vcf file that contains multiple records, with
> spaces in between but my iPod won't accept that.
> 
> I must remove the spaces and break up the file into pieces.
> 
> I am not very good at Perl and I was hoping you guys could give me
> some suggestions.
> 
> I plan to post the finished code on the iPod website so I was hoping
> to make it more complete than what I would make for myself.
> 
> I haven't tested this, but this is kind-of what I was thinking about:
> 
> $pathname="d:\\xfer\\";
> $sourcefilename="Palm20021206.vcf";
> $tempfilename="temp.vcf";
> $begintoken="BEGIN:VCARD";
> $endtoken="END:VCARD";
> $nametoken="FN:";
> 
> open(SOURCE, "< $pathname$sourcefilename")
>         or die "Couldn't open $sourcefilename for reading: $!";
> while (<SOURCE>) {
>         if (/$begintoken/ .. /$endtoken/) {
>            # line falls between begin and end, inclusive
>            if (/$begintoken/) {
>                    open(SINK, "> $pathname$tempfilename")
>                           or die "Couldn't open $tempfilename for writing: $!";
>            } #if
>            print SINK $_ or die "can't write $sinkfilename: $!";
>            $sinkfilename="$1.vcf\n" if (/$nametoken(.*?)\n/);
>            if (/$endtoken/) {
>               # TO DO: What if a file by that name already exists?
>                   # or if there is no FN?
>                   # John Doe1, John Doe2, ...
>                   close(SINK) or die "couldn't close $sinkfilename: $!";
>                   rename("$pathname$tempfilename","$pathname$sinkfilename");
>            } # if
>    } # if
> } # while (<>)
> close(SOURCE) or die "couldn't close $sourcefilename: $!";


If the records are separated by blank lines you can use paragraph mode to read each record.

#!/usr/bin/perl -w
use strict;
# vcard 2.1 - rfc2425,rfc2426

my $pathname       = 'd:/xfer';
my $sourcefilename = 'Palm20021206.vcf';

$/ = ''; # set paragraph mode
open SOURCE, "< $pathname/$sourcefilename"
    or die "Couldn't open $sourcefilename for reading: $!";
while ( <SOURCE> ) {
    chomp;
    my $sinkfilename;
    if ( /^(fn[;:].+)/im ) {
        ( undef, $sinkfilename ) = split /(?<!\\):/, $1, 2;
        }
    elsif ( /^(n[;:].+)/im ) {
        ( undef, $sinkfilename ) = split /(?<!\\):/, $1, 2;
        # n: field is "lastname;firstname"
        # change to "firstname lastname"
        $sinkfilename = join ' ', reverse split /(?<!\\);/, $sinkfilename;
        }
    my $count = '';
    if ( -e "$pathname/$sinkfilename" ) {
        1 while -e "$pathname/$sinkfilename" . ++$count;
        }
    open SINK, "> $pathname/$sinkfilename$count"
        or die "Couldn't open $sinkfilename$count for writing: $!";
    print SINK "$_\n" or die "can't write $sinkfilename$count: $!";
    close SINK or die "couldn't close $sinkfilename$count: $!";
    }
close SOURCE or die "couldn't close $sourcefilename: $!";

__END__
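
To see paragraph mode in isolation, here is a small sketch that reads
two made-up records from an in-memory filehandle (available from perl
5.8 on). The record contents are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Two made-up vCard records separated by blank lines.
my $data = "BEGIN:VCARD\nFN:John Doe\nEND:VCARD\n\n\n"
         . "BEGIN:VCARD\nFN:Jane Roe\nEND:VCARD\n";

open my $in, '<', \$data or die "Couldn't open in-memory file: $!";

my @names;
{
    local $/ = '';            # paragraph mode, limited to this block
    while ( my $record = <$in> ) {
        chomp $record;        # in paragraph mode this strips all trailing newlines
        push @names, $1 if $record =~ /^FN:(.+)/m;
    }
}
close $in;

print "$_\n" for @names;
```

Using local keeps the change to $/ from leaking into the rest of a
larger program.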



John
-- 
use Perl;
program
fulfillment


------------------------------

Date: 2 Jan 2003 17:25:00 -0800
From: jeffmott@twcny.rr.com (Jeff Mott)
Subject: Prototype declaration with built-in function
Message-Id: <f9c0ce19.0301021724.12b1033f@posting.google.com>

I usually define my routines with their respective prototypes at the
beginning of a program so they can be called properly from anywhere
else within the program. But I get a prototype mismatch error when
doing this with an internal function name.

# example
use subs 'open';
sub open();


------------------------------

Date: Thu, 02 Jan 2003 20:43:51 -0500
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: Prototype declaration with built-in function
Message-Id: <3E14EAD7.EEAC7212@earthlink.net>

Jeff Mott wrote:
> 
> I usually define my routines with their respective prototypes at the
> beginning of a program so they can be called properly from anywhere
> else within the program. But I get a prototype mismatch error when
> doing this with an internal function name.
> 
> # example
> use subs 'open';
> sub open();

The proper prototype for open is *;$@.  This means that you need to have
a subroutine stub like the following:

   sub open(*;$@);

Also, if you're providing a stub with a prototype, then you should leave
off the 'use subs ...' for that particular subname.
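
If in doubt about the prototype of a built-in, perl can be asked
directly with the prototype function and a CORE:: name. A minimal
sketch (the variable name is arbitrary):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Ask perl for the prototype of the built-in open.
my $proto = prototype("CORE::open");
print "CORE::open has prototype ($proto)\n";
```

The same call works for other overridable built-ins; for those that
cannot be overridden, prototype returns undef.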

-- 
$..='(?:(?{local$^C=$^C|'.(1<<$_).'})|)'for+a..4;
$..='(?{print+substr"\n !,$^C,1 if $^C<26})(?!)';
$.=~s'!'haktrsreltanPJ,r  coeueh"';BEGIN{${"\cH"}
|=(1<<21)}""=~$.;qw(Just another Perl hacker,\n);


------------------------------

Date: 2 Jan 2003 15:21:32 -0800
From: stewart@webslave.dircon.co.uk (stew dean)
Subject: Re: Sorting hash tree from Xml::simple.
Message-Id: <2b68957a.0301021521.7ed06e78@posting.google.com>

tadmc@augustmail.com (Tad McClellan) wrote in message news:<slrnb1912i.msd.tadmc@magna.augustmail.com>...
> stew dean <stewart@webslave.dircon.co.uk> wrote:
> 
> > Let me start by saying I'm newish to the perl stuff
> 
> 
> You should check out the Posting Guidelines that are posted
> here weekly. You've already hurt your chances of getting
> help with future Perl problems.

Bit of a catch-22, that: if you aren't a regular you wouldn't know. And
if you are a regular you would know more about perl than I do :)  It's
what I call a 'beware of the leopard' problem - see 'The Hitchhiker's
Guide to the Galaxy' to see what I mean :)  It's a bit like RTFM -
often with internet material it's 'what manual?' or the manual is so
badly written it's only of use to the original author.

> Also, the Data::Dumper module is invaluable for debugging
> complex data structures.

Seen that in a few places - I'll read up on it.

> 
> 
> > I've now got a problem that I have found answers for but don't
> > understand the answers.
> 
> 
> I expect that the answers you've found are for a different problem.
> 
> Most "sort a hash" questions are about sorting values that are
> within the same hash...
> 
> 
> > I am using XML::Simple to read an xml file into a hash thingie.
> 
> 
> ... but XML::Simple puts the "values of interest" in _separate_ hashes.

Um, okay. Obviously I'm new here.

> 
> > Now what I
> > need to do is sort my hash in alphabetical order.
>  
> > #!/usr/bin/perl
> 
> 
> You should ask for all the help you can get:
> 
>    #!/usr/bin/perl
>    use strict;
>    use warnings;
 




> 
> > my $venueml = XMLin('../myxmlfile.xml');
> 
> 
> It would have been easier to answer your question about sorting
> data if you had included the data to be sorted.
> 
> 
> > $x=0;
> > foreach my $venue (@{$venueml->{venueList}->{venue}})
> >    {
> >    print "<font size=+2>
> > $venueml->{venueList}->{venue}[$x]->{venueName} <br></font>";
> >    $x++;
> >    }
> 
> 
> There are 2 "red flags" there.
> 
> You seldom need to maintain array indexes in Perl.
> 
> You never use $venue in the body of your loop.
> 
> You could replace all of that code with just:
> 
>    foreach my $venue (@{$venueml->{venueList}->{venue}}) {
>       print "<font size=+2>$venue->{venueName} <br></font>";
>    }
> 
> 
> > I'm kinda stumped here.
> 
> 
> I suggest using map to grab the values first, then sort them:
> 
>    my @names = map $_->{venueName}, @{$venueml->{venueList}->{venue}};
> 
>    foreach my $name ( sort ascend_alpha @names ) {
>       print "$name\n";
>    }
> 
> 
> Of course you don't really need the @names temporary variable,
> though it may be easier to read and understand if you left it in:
> 
>    foreach my $name (sort ascend_alpha
>                      map $_->{venueName}, @{$venueml->{venueList}->{venue}} ) {
>       print "$name\n";
>    }
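
For reference, the quoted snippets call ascend_alpha without showing
it. A self-contained sketch of the same idea follows; the data
structure is a hand-built stand-in for what XMLin() might return, the
venue names are made up, and this definition of ascend_alpha
(case-insensitive, ascending) is only one plausible choice:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the XMLin() result; the names are invented.
my $venueml = {
    venueList => {
        venue => [
            { venueName => 'Zebra Lounge' },
            { venueName => 'apollo' },
            { venueName => 'Matrix' },
        ],
    },
};

# One possible ascend_alpha: compare names case-insensitively, A to Z.
sub ascend_alpha { lc($a) cmp lc($b) }

# Pull the name out of each venue hash, then sort the plain strings.
my @names = sort ascend_alpha
            map { $_->{venueName} } @{ $venueml->{venueList}{venue} };

print "$_\n" for @names;
```

Printing the real structure with Data::Dumper first would show whether
{venue} is an array reference at all.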

Thank you for the help. 

Stew Dean


------------------------------

Date: Thu, 02 Jan 2003 18:43:03 -0500
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: Sorting hash tree from Xml::simple.
Message-Id: <3E14CE87.F5912D27@earthlink.net>

stew dean wrote:
> Tad McClellan wrote:
> > stew dean wrote:
> > > Let me start by saying I'm newish to the perl stuff
> >
> > You should check out the Posting Guidelines that are posted
> > here weekly. You've already hurt your chances of getting
> > help with future Perl problems.
> 
> Bit of a catch-22, that - if you aren't a regular you wouldn't know. And
> if you are a regular you would know more about Perl than I do :)  It's
> what I call a 'beware of a leopard' problem - see 'The Hitchhiker's
> Guide to the Galaxy' to see what I mean :)  It's a bit like RTFM -
> often with internet material it's 'what manual?' or the manual is so
> badly written it's only of use to the original author.

The posting guidelines are automatically posted here once a week.

Unlike walking through a jungle with a leopard hunting you, while you
search for the "beware of a leopard" sign, it's completely harmless for
you to read the newsgroup without posting anything (aka lurking) for a
week or two, with the intent of learning the proper etiquette for asking
questions.

[snip]
> > > I am using XML::Simple to read an xml file into a hash thingie.
> >
> > ... but XML::Simple puts the "values of interest" in _separate_
> > hashes.
> 
> Um, okay. Obviously I'm new here.

You shouldn't have to ask *here* to learn that.

You should have read the documentation that came with XML::Simple.  You
can do this by simply typing:

   perldoc XML::Simple

in your console window.

-- 
$..='(?:(?{local$^C=$^C|'.(1<<$_).'})|)'for+a..4;
$..='(?{print+substr"\n !,$^C,1 if $^C<26})(?!)';
$.=~s'!'haktrsreltanPJ,r  coeueh"';BEGIN{${"\cH"}
|=(1<<21)}""=~$.;qw(Just another Perl hacker,\n);


------------------------------

Date: Thu, 02 Jan 2003 18:34:32 -0500
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: vectors & large amounts of data - time & space problems
Message-Id: <3E14CC88.BC15E1A0@earthlink.net>

Robert McArthur wrote:
> 
> anno4000@lublin.zrz.tu-berlin.de (Anno Siegel) writes:
> >It is hard to come up with a space-saving data structure without
> >knowing what kind of access those vectors have to support.
> 
> Sorry, you're right. The algorithm is called HAL - hyperspace analogue
> to language (see http://locutus.ucr.edu/abstracts/ab-comput.html ).

There are a bunch of documents there.

> The vectors are created by working through text. The
> dimensions are words associated with the word that is the name of the
> vector; the dimension's value is the 'strength' of the association.

So it's something like $strength = $vector{$word1}{$word2} ?

> As new associations are found when you go through the text, new
> dimensions are added. When an existing association is found again,
> the weight (value) is updated.
> So for every association in the text we need to determine whether
> the vector exists; if it does, does the dimension exist; if it does,
> then add something to the value, otherwise create the dimension in
> the vector and give it the initial value. If the vector doesn't exist,
> then create it.

In other words, something like:

   if( exists $vector{$word1}{$word2} ) {
      update( $vector{$word1}{$word2} );
   } else {
      $vector{$word1}{$word2} = initial_value();
   }
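
Note that testing exists on the inner key already creates
$vector{$word1} if it was missing (autovivification), so the "create
the vector" step comes for free. A rough sketch of the whole loop on a
toy sentence follows. The window of 2 and the weighting (closer words
count more) are assumptions for illustration, not taken from the HAL
description:

```perl
use strict;
use warnings;

my %vector;
my @words  = qw(the cat sat on the mat);
my $window = 2;    # assumed tiny window; the post mentions six

for my $i ( 0 .. $#words ) {
    for my $d ( 1 .. $window ) {
        my $j = $i + $d;
        last if $j > $#words;

        # += creates the inner hash and the entry on first use,
        # and updates the existing value on later uses.
        # The weight ($window - $d + 1) is a made-up example.
        $vector{ $words[$i] }{ $words[$j] } += $window - $d + 1;
    }
}

for my $w1 ( sort keys %vector ) {
    for my $w2 ( sort keys %{ $vector{$w1} } ) {
        print "$w1 -> $w2: $vector{$w1}{$w2}\n";
    }
}
```

With a numeric weight like this, the if/else collapses into the
single += line.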

> This happens a *lot* - the window size (see HAL) is
> six words so for *every word* in the text, we're doing these checks
> and updates 12 times. We're trying to work with about 250,000
> documents, each of which is about a page long :-) That's a lot of
> looking and updating! Hence the need for a memory-based approach.

Or at least, the need for something with an excellent cache.

> The hashes in Perl have been great for the lookups - quite possible
> in, say, C, but Perl has them for free. It looks like we just need
> SV's to take up less room!

If you've got a really fast tied hash implementation (one which uses
really good caching and/or mmapping), then you could do something like
this:

   if( exists $vector{"$word1,$word2"} ) {
      update( $vector{"$word1,$word2"} );
   } else {
      $vector{"$word1,$word2"} = initial_value();
   }

(The reason for the change from $vector{$word1}{$word2} to
$vector{"$word1,$word2"} is that tied databases can generally store
only flat strings, not hashrefs, as their values.)
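
A small sketch of the flat-key version against an on-disk tied hash.
SDBM_File is used here only because it ships with perl; it is not a
recommendation for data of this size, and the file name, the word
pairs and the "add 1" update are all made up for the example:

```perl
use strict;
use warnings;
use Fcntl;                       # provides O_RDWR and O_CREAT
use SDBM_File;
use File::Temp qw(tempdir);

# Keep the database files in a throwaway directory.
my $dir = tempdir( CLEANUP => 1 );

tie my %vector, 'SDBM_File', "$dir/hal", O_RDWR | O_CREAT, 0666
    or die "Cannot tie SDBM file: $!";

# Made-up associations; 'the cat' occurs twice.
my @pairs = ( [qw(the cat)], [qw(the mat)], [qw(the cat)] );

for my $pair (@pairs) {
    my ( $word1, $word2 ) = @$pair;
    my $key = "$word1,$word2";   # one flat string per association

    if ( exists $vector{$key} ) {
        $vector{$key} += 1;      # stand-in for update()
    } else {
        $vector{$key} = 1;       # stand-in for the initial value
    }
}

my @report = map { "$_ => $vector{$_}" } sort keys %vector;
print "$_\n" for @report;

untie %vector;
```

The comma separator assumes no word contains a comma; any character
that cannot appear in a word would do.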

-- 
$..='(?:(?{local$^C=$^C|'.(1<<$_).'})|)'for+a..4;
$..='(?{print+substr"\n !,$^C,1 if $^C<26})(?!)';
$.=~s'!'haktrsreltanPJ,r  coeueh"';BEGIN{${"\cH"}
|=(1<<21)}""=~$.;qw(Just another Perl hacker,\n);


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending Perl questions to the -request address; I don't have time to
answer them, even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 4335
***************************************

