[22111] in Perl-Users-Digest


Perl-Users Digest, Issue: 4333 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Jan 2 09:08:17 2003

Date: Thu, 2 Jan 2003 06:05:07 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Thu, 2 Jan 2003     Volume: 10 Number: 4333

Today's topics:
    Re: AWK vs PERL - splitting fields <Miguel.Duarte@tmn.pt>
    Re: Charting/Graphs (Joe Smith)
        Emacs modules for Perl programming (Jari Aalto+mail.perl)
    Re: Freebsd memory leak. <mongrol@REMOVE.btinternet.com>
        How to use radio buttons as an array in a POST (Ben Williams)
    Re: Literal and numeric declarations- are the same? <bart.lateur@pandora.be>
    Re: LWP & Proxy/Firewalls <echao27@ameritech.net>
    Re: Newbie Reg. Exp. questions.. (Joe Smith)
        Passing an array to a subroutine (HM)
        Perl for spliting vcf files (palm->iPod) (Michael Robbins)
    Re: Printed string truncated. news@roaima.freeserve.co.uk
        RecDescent and variables (Peter H.J. v.d. Kamp)
    Re: system command and $_ variable (juha)
    Re: The diamond operator (trwww)
        vectors & large amounts of data - time & space problems (Robert McArthur)
    Re: vectors & large amounts of data - time & space prob (Anno Siegel)
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Thu, 02 Jan 2003 10:25:56 +0000
From: Miguel Angelo Lapa Duarte <Miguel.Duarte@tmn.pt>
Subject: Re: AWK vs PERL - splitting fields
Message-Id: <3e141036$0$18110$a729d347@news.telepac.pt>

Benjamin Goldberg wrote:
> Miguel Angelo Lapa Duarte wrote:
> [snip]
> 
>>for($l = 0; $l < $ARGV[0]; ++$l) {
>>                  for($c = 0; $c < $ARGV[1] - 1; ++$c) {
>>                                  print int(rand(100)) . ',';
>>                  }
>>                  print int(rand(100));
>>                  print "\n";
>>}
> 
> 
> This will run faster if you rewrite it as:
> 
>    for my $n ( @ARGV ) {
>       print int(rand(100)), "\n" for 1 .. $n;
>    }
> 

Faster and cleaner :D
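
For comparison, a minimal sketch that keeps the rows-by-columns shape
of the first loop (first argument: number of lines, second argument:
number of comma-separated fields per line) without the C-style
counters. The name random_row and the fallback of 3 rows by 4 columns
are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Build one line of $cols comma-separated random integers in 0..99.
sub random_row {
    my ($cols) = @_;
    return join ',', map { int(rand(100)) } 1 .. $cols;
}

# Fall back to a small table when two arguments are not supplied
# (the fallback values are an assumption, not part of the thread).
my ($rows, $cols) = @ARGV == 2 ? @ARGV : (3, 4);

print random_row($cols), "\n" for 1 .. $rows;
```

Building each line with join avoids the special case for the last
field, which is what the inner loop plus the extra print was doing.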



------------------------------

Date: Thu, 02 Jan 2003 06:46:42 GMT
From: inwap@inwap.com (Joe Smith)
Subject: Re: Charting/Graphs
Message-Id: <mfRQ9.1612$io.71308@iad-read.news.verio.net>

In article <3E0FFD55.6030003@thecouch.homeip.net>,
Mina Naguib  <spam@thecouch.homeip.net> wrote:
>
>Koos Pol wrote:
>| Robert Sipe wrote (Monday 30 December 2002 05:48):
>|
>|
>|>I need to chart performance data I extract from an SNMP daemon.  Thus, I
>|>will periodically poll the system for a few performance values, store
>|>the data to a file, then chart it out in on a Web page.  I have the SNMP
>|>polling and data collection all taken care of.  It is simple graph of
>|>perf data vs. time.  What is the recommended module/method to
>|>chart/graph this data using a cgi script?  Thanx in advance!
>|
>|
>|
>| http://search.cpan.org/search?query=chart
>
>Or there's a non-Perl solution if you don't feel like doing much coding,
>however it's initial learning curve might be more than you'd like. Take
>a look at RRDTool ( http://people.ee.ethz.ch/~oetiker/webtools/rrdtool/ )

The C + Perl solution from that person makes pretty graphs of SNMP stats.
Check out http://www.stat.ee.ethz.ch/mrtg/ (www.mrtg.org goes there).
	-Joe

-- 
See http://www.inwap.com/ for PDP-10 and "ReBoot" pages.


------------------------------

Date: 02 Jan 2003 08:59:19 GMT
From: <jari.aalto@poboxes.com> (Jari Aalto+mail.perl)
Subject: Emacs modules for Perl programming
Message-Id: <perl-faq/emacs-lisp-modules_1041497631@rtfm.mit.edu>

Archive-name: perl-faq/emacs-lisp-modules
Posting-Frequency: 2 times a month
URL: http://tiny-tools.sourceforge.net/
Maintainer: Jari Aalto <jari.aalto@poboxes.com>

Announcement: "What Emacs lisp modules can help with programming Perl"

    Preface

        Emacs is your friend if you have to do anything concerning software
        development: It offers plug-in modules, written in the Emacs lisp
        (elisp) language, that make all your programming wishes come
        true. Please introduce yourself to Emacs and your programming era
        will get a new light.

    Where to find Emacs/XEmacs

        o   Unix:
            http://www.gnu.org/software/emacs/emacs.html
            http://www.xemacs.org/

        o   Unix Windows port (for Unix die-hards):
            install http://www.cygwin.com/  which includes native Emacs 21.x.
            XEmacs port is bundled in XEmacs setup.exe available from
            XEmacs site.

        o   Pure Native Windows port
            http://www.gnu.org/software/emacs/windows/ntemacs.html
            ftp://ftp.xemacs.org/pub/xemacs/windows/setup.exe

        o   More Emacs resources at
            http://tiny-tools.sourceforge.net/  => Emacs resource page

Emacs Perl Modules

    Cperl -- Perl programming mode

        ftp://ftp.math.ohio-state.edu/pub/users/ilya/perl
        http://www.perl.com/CPAN-local/misc/emacs/cperl-mode/
        <ilya@math.ohio-state.edu>    Ilya Zakharevich

        CPerl is a major mode for editing perl files. Forget the default
        `perl-mode' that comes with Emacs, this is much better. Comes
        standard in newest Emacs.

    TinyPerl -- Perl related utilities

        http://tiny-tools.sourceforge.net/

        If you ever wonder how to deal with Perl POD pages or how to find
        documentation from all perl manpages, this package is for you.
        A couple of keystrokes and all the documentation is in your hands.

        o   Instant function help: See documentation of `shift', `pop'...
        o   Show Perl manual pages in *pod* buffer
        o   Grep through all Perl manpages (.pod)
        o   Follow POD references e.g. [perlre] to next pod with RETURN
        o   Coloured pod pages with `font-lock'
        o   Separate `tiperl-pod-view-mode' for jumping topics and pages
            forward and backward in *pod* buffer.

        o   Update `$VERSION' variable with YYYY.MMDD on save.
        o   Load source code into Emacs, like Devel::DProf.pm
        o   Prepare script (version numbering) and Upload it to PAUSE
        o   Generate autoload STUBS (Devel::SelfStubber) for your
            Perl Module (.pm)

    TinyIgrep -- Perl Code browsing and easy grepping

        [TinyIgrep is included in Tiny Tools Kit]

        To grep from all installed Perl modules, define a database for
        TinyIgrep. There is an example file, emacs-rc-tinyigrep.el, that
        shows how to set up databases for Perl5, Perl4, or whatever you
        have installed.

        TinyIgrep calls Igrep.el to do the search. You can adjust
        recursive grep options, set search case sensitivity, add user grep
        options etc.

        You can find the latest `igrep.el' module at
        <http://groups.google.com/groups?group=gnu.emacs.sources>. The
        maintainer is Kevin Rodgers <kevinr@ihs.com>.

    TinyCompile -- To Browse grep results in Emacs *compile* buffer

        TinyCompile is a minor mode for *compile* buffer from where
        you can collapse unwanted lines or shorten file URLs:

            /asd/asd/asd/asd/ads/as/da/sd/as/as/asd/file1:NNN: MATCHED TEXT
            /asd/asd/asd/asd/ads/as/da/sd/as/as/asd/file2:NNN: MATCHED TEXT

            -->

            cd /asd/asd/asd/asd/ads/as/da/sd/as/as/asd/
            file1:NNN: MATCHED TEXT
            file2:NNN: MATCHED TEXT

End



------------------------------

Date: Thu, 2 Jan 2003 11:09:52 -0000
From: "Brian Skreeg" <mongrol@REMOVE.btinternet.com>
Subject: Re: Freebsd memory leak.
Message-Id: <v187gcsr29s120@corp.supernews.com>


"Benjamin Goldberg" <goldbb2@earthlink.net> wrote in message
news:3E137247.68585653@earthlink.net...
> Brian Skreeg wrote:
>
> First suggestion:  Use strict and warnings.  Without them, you'll get a
> lot less help than you'll get with them.

I have use warnings on. It warns on nothing at all. Guess I`ll have to bite
one and turn on strict too. :)

> [snip]
> >       foreach $row ($ts->rows) {
> >         if (@$row[0] =~ /Norm/)
>
> This isn't related to your problem, but why are you asking perl for an
> array slice (@$row[0]) when what you appear to want is a single element
> from the arrayref (either $$row[0], or $row->[0])?

Yep, you`re not the first to point that out. From
http://search.cpan.org/author/MSISK/HTML-TableExtract-1.08/lib/HTML/TableExtract.pm

foreach $row ($ts->rows) {
      print join(',', @$row), "\n";
   }

rows()
Return all rows within a matched table. Each row returned is a reference to
an array containing the text of each cell.
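
The difference between a slice and a single element of an array
reference can be seen in a small sketch. The row contents here are
made-up sample data, not output from HTML::TableExtract.

```perl
use strict;
use warnings;

# Stand-in for one row as returned by rows(): a reference to an array of cell texts.
my $row = [ 'Normal', 42, 'ok' ];

my $first = $row->[0];       # one element of the referenced array (same as $$row[0])
my @slice = @$row[0, 2];     # a slice: the list of elements 0 and 2

print "matched\n" if $row->[0] =~ /Norm/;
print "first=$first slice=@slice\n";
```

A one-element slice such as @$row[0] happens to work in the match, but
the arrow form says directly that a single cell is wanted.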






------------------------------

Date: 2 Jan 2003 03:08:53 -0800
From: ben.williams@ocado.com (Ben Williams)
Subject: How to use radio buttons as an array in a POST
Message-Id: <ec75e27e.0301020308.5dd3b909@posting.google.com>

Hi

I'm using the LWP::Agent to construct a POST command to a php page,
which works fine for passing most variable types, but not for radio
buttons.  I thought I could pass it like :
POST '<http address>','[<name> => '<value>']'
in the vain hope that differentiating the <value> would mean the
correct one was passed.  It doesn't seem to be, and looking at other
postings would suggest that I should be doing it using an array.  It
may be that I'm not asking the right questions of the online docco,
but does anyone have any pointers on how I might do this ?

Many thanks
Ben
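
A minimal sketch, assuming the post refers to LWP::UserAgent together
with the POST helper from HTTP::Request::Common. A browser submits a
radio group as a single name/value pair (the value of the checked
button), so the request only needs that one pair. The URL, the field
names and the values below are hypothetical.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# Hypothetical form: a radio group named "size" offering small/medium/large,
# plus one ordinary text field.
my $request = POST 'http://www.example.com/form.php',
    [ size => 'medium', name => 'Ben' ];

# Show the form body that would be sent.
print $request->content, "\n";

# To actually send it (needs network access):
#   use LWP::UserAgent;
#   my $response = LWP::UserAgent->new->request($request);
```

Note that the second argument is a real array reference, not a quoted
string containing brackets.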


------------------------------

Date: Thu, 02 Jan 2003 08:34:03 GMT
From: Bart Lateur <bart.lateur@pandora.be>
Subject: Re: Literal and numeric declarations- are the same?
Message-Id: <cau71v4kt71m7tli2gnani3p8tb8o3p1gd@4ax.com>

Sara wrote:

>but I
>like being able to spot "constant" expressions immediately without
>having to scan the like for variables.

What variables?

	"foo\nbar"

won't do what's desired if you change the quotes to single quotes.
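
A short sketch of the point: the same characters between the quotes
give different strings, even though no variable is involved.

```perl
use strict;
use warnings;

my $double = "foo\nbar";    # \n is an escape: one newline character
my $single = 'foo\nbar';    # \n stays as a backslash followed by "n"

print length($double), " ", length($single), "\n";   # 7 and 8 characters
```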

-- 
	Bart.


------------------------------

Date: Thu, 02 Jan 2003 09:31:46 GMT
From: "AnonyMoose" <echao27@ameritech.net>
Subject: Re: LWP & Proxy/Firewalls
Message-Id: <6GTQ9.4732$qU5.3695619@newssrv26.news.prodigy.com>

Hi All:

Any suggestions on handling proxy/firewall servers that require a UID &
password
before permitting access to any external sites ?

I already tried using the UserAgent basic-authentication option, which
failed.

I've heard that this is not supported too well by LWP.

Thanks in Advance,

Eisen




------------------------------

Date: Thu, 02 Jan 2003 07:11:52 GMT
From: inwap@inwap.com (Joe Smith)
Subject: Re: Newbie Reg. Exp. questions..
Message-Id: <YCRQ9.1614$io.71308@iad-read.news.verio.net>

In article <aukr4g$qvd$1@nobel.pacific.net.sg>,
-SmC- <invalid-email@melodyland.net> wrote:
>Can anyone teach me how to validate IP address?
>have 4 group of 3 number and each group not over 255 ?

There are valid IP addresses that do not match that definition.
  127.1        is valid and equivalent to 127.0.0.1
  127.16777215 is valid and equivalent to 127.255.255.255
  128.0.65535  is valid and equivalent to 128.0.255.255

	-Joe
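
A sketch of both views, under stated assumptions: is_dotted_quad (a
hypothetical helper name) implements the strict four-group check the
question asks about, while inet_aton from the core Socket module is
used to see how the system itself treats a short form such as 127.1.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# Strict check: exactly four groups of 1-3 digits, each no larger than 255.
sub is_dotted_quad {
    my ($addr) = @_;
    my @parts = split /\./, $addr, -1;    # -1 keeps trailing empty groups
    return 0 unless @parts == 4;
    for my $part (@parts) {
        return 0 unless $part =~ /\A\d{1,3}\z/ && $part <= 255;
    }
    return 1;
}

for my $addr ('192.168.1.10', '256.1.1.1', '127.1') {
    printf "%-14s strict check: %s\n", $addr,
        is_dotted_quad($addr) ? 'ok' : 'rejected';
}

# The system's view of a short form; inet_aton returns undef if it cannot
# turn the string into an address.
my $packed = inet_aton('127.1');
print "127.1 is treated as ", inet_ntoa($packed), "\n" if defined $packed;
```

Which check is right depends on whether the goal is "looks like a
dotted quad" or "would be accepted as an address".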

-- 
See http://www.inwap.com/ for PDP-10 and "ReBoot" pages.


------------------------------

Date: 2 Jan 2003 05:57:29 -0800
From: happyman_132000@yahoo.com (HM)
Subject: Passing an array to a subroutine
Message-Id: <71c71f98.0301020557.5de53405@posting.google.com>

I am having trouble passing an array into the validateString subroutine
and then walking through the array and comparing its contents.  Could
someone please take a look at what I have and tell me what the problem
is? I can't believe how much trouble I am having with something simple.
Thank you.


This is my .pl file

## passes in an array containing 4 elements (each a string) and the size of the array
$Response = Validation::validateString(@Strings, $Size);

####################################################

This is my .pm file
sub validateString
{
    # my $self = shift;
    (@a_String, $ArraySize) = shift;
    
    ## Possible strings
    my $ValidString1 = "This is a test.";
    my $ValidString2 = "This line is contained in the file.";
    my $ValidString3 = "So is this one.";
    my $ValidString4 = "And this one too.";
        
    ## For some reason I have to decrement 2 times, otherwise
    ## the for loop runs two additional times???
    $ARRAYSIZE--;
    $ARRAYSIZE--;
    
    ## Does not continue past the 0th element in the array (first string)
    for($i=0; $i<=$ArraySize; $i++)
    {
        ## This one checks ok. Which it should not.
        ## Check for string1
        if($a_String[$i] =~ $ValidString1){print"Expected String    : $a_String[$i]\n";}
    
        ## Check for string2
        elsif($a_String[$i] =~ $ValidString2){print"Expected String    : $a_String[$i]\n";}
        
        ## Check for string3
        elsif($a_String[$i] =~ $ValidString3){print"Expected String    : $a_String[$i]\n";}
    
        ## Check for string4
        elsif($a_String[$i] =~ $ValidString4){print"Expected String    : $a_String[$i]\n";}
    
        ## Does not catch ValidString1.
        ## Response does not match any strings 
        elsif($a_String[$i] !~ $ValidString1 or $a_String[$i] !~ $ValidString2
              or $a_String[$i] !~ $ValidString3 or $a_String[$i] !~ $ValidString4)
        {
            print"Unexpected String found in file    : $a_String[$i]";
        } 
    
    }
}
    
    


1;


################################################
This is what is contained in the array:

This is a test. Not a valid string.
This line is contained in the file.
So is this one.
And this one too.
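
Three things in the posted code would explain the symptoms. First,
shift returns only the first argument, so (@a_String, $ArraySize) =
shift puts one string in the array and leaves the size undefined; an
array in a list assignment also swallows every remaining value, so the
size could never arrive that way. Second, $ARRAYSIZE and $ArraySize
are different variables. Third, =~ treats the right-hand side as an
unanchored pattern in which "." matches any character, so "This is a
test. Not a valid string." matches the first valid string. A minimal
sketch of one way around all three, using exact hash lookup; the names
validate_strings and %is_valid are made up for illustration.

```perl
use strict;
use warnings;

# The four strings the post treats as valid, used as hash keys for exact lookup.
my %is_valid = map { $_ => 1 } (
    'This is a test.',
    'This line is contained in the file.',
    'So is this one.',
    'And this one too.',
);

# Takes the strings themselves; the count is available as scalar(@strings),
# so no separate size argument is needed.
sub validate_strings {
    my @strings = @_;
    my @unexpected;
    for my $string (@strings) {
        if ($is_valid{$string}) {
            print "Expected String    : $string\n";
        }
        else {
            print "Unexpected String found in file    : $string\n";
            push @unexpected, $string;
        }
    }
    return @unexpected;
}

my @unexpected = validate_strings(
    'This is a test. Not a valid string.',
    'This line is contained in the file.',
    'So is this one.',
    'And this one too.',
);
print scalar(@unexpected), " unexpected string(s)\n";
```

If the strings come from a file, chomp them first, since a trailing
newline would make the exact comparison fail.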


------------------------------

Date: 2 Jan 2003 05:02:44 -0800
From: michael.robbins@us.cibc.com (Michael Robbins)
Subject: Perl for spliting vcf files (palm->iPod)
Message-Id: <c6c65b14.0301020502.d739ba9@posting.google.com>

Palm software outputs a vcf file that contains multiple records, with
spaces in between but my iPod won't accept that.

I must remove the spaces and break up the file into pieces.

I am not very good at Perl and I was hoping you guys could give me
some suggestions.

I plan to post the finished code on the iPod website so I was hoping
to make it more complete than what I would make for myself.


I haven't tested this, but this is kind-of what I was thinking about:

$pathname="d:\\xfer\\";
$sourcefilename="Palm20021206.vcf";
$tempfilename="temp.vcf";
$begintoken="BEGIN:VCARD";
$endtoken="END:VCARD";
$nametoken="FN:";

open(SOURCE, "< $pathname$sourcefilename")
	or die "Couldn't open $sourcefilename for reading: $!";
while (<SOURCE>) {
	if (/$begintoken/ .. /$endtoken/) {
	   # line falls between begin and end, inclusive
	   if (/$begintoken/) {
	   	   open(SINK, "> $pathname$tempfilename")
		   	  or die "Couldn't open $tempfilename for writing: $!";
	   } #if
	   print SINK $_ or die "can't write $sinkfilename: $!";
	   $sinkfilename="$1.vcf\n" if (/$nametoken(.*?)\n/);
	   if (/$endtoken/) {
	      # TO DO: What if a file by that name already exists?
		  # or if there is no FN?
		  # John Doe1, John Doe2, ...
		  close(SINK) or die "couldn't close $sinkfilename: $!";
	   	  rename("$pathname$tempfilename","$pathname$sinkfilename");
	   } # if
   } # if
} # while (<>)
close(SOURCE) or die "couldn't close $sourcefilename: $!";
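
For the two open TO DO items (no FN line, and two cards with the same
name), one possible approach is sketched below. It works on the file
contents as a string and returns file name/contents pairs, leaving the
actual writing to the caller. The function name split_vcards, the
"unnamed" fallback, the numbering scheme and the sample cards are all
assumptions for illustration.

```perl
use strict;
use warnings;

# Split vCard text into records; returns a list of [file name, card text] pairs.
sub split_vcards {
    my ($text) = @_;
    my (@cards, %seen);
    while ($text =~ /^(BEGIN:VCARD\b.*?^END:VCARD[^\n]*\n?)/msg) {
        my $card = $1;
        $card =~ s/^\s*\n//mg;                    # drop blank lines inside the record
        my ($name) = $card =~ /^FN:([^\r\n]*)/m;  # formatted name, if present
        $name = 'unnamed' unless defined $name && length $name;
        $name =~ s/[^\w.-]+/_/g;                  # keep the file name portable
        my $count = ++$seen{$name};               # 2nd, 3rd, ... card with this name
        push @cards, [ ($count > 1 ? "$name$count" : $name) . '.vcf', $card ];
    }
    return @cards;
}

my $sample = join "\n",
    'BEGIN:VCARD', 'FN:John Doe', 'END:VCARD', '',
    'BEGIN:VCARD', 'FN:John Doe', 'END:VCARD', '',
    'BEGIN:VCARD', 'N:Nameless',  'END:VCARD', '';

for my $pair (split_vcards($sample)) {
    my ($file, $card) = @$pair;
    print "$file: ", length($card), " bytes\n";
    # To write the card out:
    #   open(my $out, '>', $file) or die "Couldn't open $file for writing: $!";
    #   print {$out} $card;
    #   close($out) or die "Couldn't close $file: $!";
}
```

Deciding the file name before opening the output also removes the need
for a temporary file and a rename.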


------------------------------

Date: Thu, 2 Jan 2003 11:48:25 +0000
From: news@roaima.freeserve.co.uk
Subject: Re: Printed string truncated.
Message-Id: <9u81va.tds.ln@moldev.cmagroup.co.uk>

Chris Snow <chris_snow@bigfoot.com> wrote:
> If I assign the output of a command to a scalar variable then print
> that scalar variable to the screen the output is truncated.

Perl won't have truncated that output.

> $string = `somecommand -last 100`;
> print $string;

> (data protection forbids that I post the output or the command!

Data Protection law in the UK (in which you appear to be located) also
requires that you test against data that cannot be identified with any
individuals (i.e. dummy data). So the output shouldn't matter. However,
instead of nitpicking further I'll also offer some suggestions:

How much of the data is truncated?

What happens if you simply run "somecommand -last 100" from the
screen? Does that also get truncated?

What happens if you dump the output to a file and then examine the file
using an editor (vi, emacs, wordpad,...)

Regards,
Chris
-- 
@s=split(//,"Je,\nhn ersloak rcet thuarP");$k=$l=@s;for(;$k;$k--){$i=($i+1)%$l
until$s[$i];$c=$s[$i];print$c;undef$s[$i];$i=($i+(ord$c))%$l}


------------------------------

Date: 2 Jan 2003 11:32:53 GMT
From: kamp@inl.nl_nospam (Peter H.J. v.d. Kamp)
Subject: RecDescent and variables
Message-Id: <av1815$qmr$1@highway.leidenuniv.nl>

When trying to run the following script I got
the following error:
Global symbol "%tables" requires explicit package name.

I can't figure out what I'm doing wrong; from Damian's documentation
I understand that it must be possible to use variables in actions.
Some things in the code are in Dutch, no problem I hope.

Any ideas?

use Parse::RecDescent;

my $grammar = q{
   queryInterpreter: Query(s) /\Z/
   
   Query:
      queryExpressions
      
   queryExpressions:
      opQueryExpression
      { print "Op expresie\n"; }
      | queryExpression
      { print "Single expressie\n"; }
      |
      <error: Fout in query>

   opQueryExpression:
      queryExpression operator queryExpression
         
   queryExpression:
      mediumQuery
      |
      topicQuery
      |
      auteurQuery
      |
      periodQuery
      
   mediumQuery:
      '/m/' querystring
      { print "Gevonden: $item[1] $item[2]\n";
         $tables{'bron'}=1;
      }
            
   topicQuery:
      '/t/' querystring
      { print "$item[1] $item[2]\n"; }
      
   auteurQuery:
      '/a/' querystring
      { print "$item[1] $item[2]\n"; }
   
   periodQuery:
      rangePeriod
      { print "Range\n"; }
      | singlePeriod
      
   singlePeriod:
      '/p/' digits
      { print "$item[1] $item[2]\n"; }

   rangePeriod:
      '/p/' digits '-' digits
      | '/p/' '-' digits
      | '/p/' digits '-'

   operator:
      '/o/' opstring
      { print "$item[1] $item[2]\n"; }
      
   querystring: /[a-zA-Z:\. ]+/
   digits: /[0-9]{4}/
   opstring: 'and' | 'or' | 'not'
};

my %tables = ('bron', 0, 'topic', 0, 'auteur', 0);

my $parser = new Parse::RecDescent($grammar);
undef $/;
my $text = <STDIN>;
$parser->queryInterpreter($text);
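
One likely cause, offered as a sketch rather than a confirmed
diagnosis: Parse::RecDescent compiles the grammar actions in its own
namespace, so a file-scoped "my %tables" in the calling script is not
visible there, while a package variable referred to by its full name
is. The cut-down grammar below keeps only mediumQuery and querystring
from the post; the "our" declaration and the $main:: qualifier are the
assumed fix, and the input string is made up.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Package variable, so that the action code can reach it as %main::tables.
our %tables = ('bron', 0, 'topic', 0, 'auteur', 0);

my $grammar = q{
   mediumQuery:
      '/m/' querystring
      { $main::tables{'bron'} = 1; $item[2] }

   querystring: /[a-zA-Z:\. ]+/
};

my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar\n";
my $result = $parser->mediumQuery('/m/ krant');

print "querystring: ", (defined $result ? $result : '(no match)'), "\n";
print "bron = $tables{'bron'}\n";
```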



------------------------------

Date: 2 Jan 2003 01:19:48 -0800
From: salmjuh@hotmail.com (juha)
Subject: Re: system command and $_ variable
Message-Id: <c9858ca5.0301020119.4006ae79@posting.google.com>

"Ian.H [dS]" <ian@WINDOZEdigiserv.net> wrote in message news:<3ok61v00q3331ged4g8kr1he2gjg3nnbtd@4ax.com>...
> -----BEGIN xxx SIGNED MESSAGE-----
> Hash: SHA1
> 
> In a fit of excitement on 1 Jan 2003 12:42:59 -0800,
> salmjuh@hotmail.com (juha) managed to scribble:
> 
> > Hi all
> > 
> > My script try to wake up another program and give some variables at
> > the same time to that program.
> > 
> > Here is what doesn't work:
> > 
> > system ('ffmpeg -i /tmp/$_ -b 1300 -s 352*288  /tmp/ready/$_');
> > 
> > It has read the real file name before OK, but when I run it, it gives
> > me:
> > 
> > ffmpeg -i /tmp/$_ -b 1300 -s 352*288 /tmp/ready/$_  
> > 
> > I have tryit almost everything and running out of ideas.
> > So people, how do I get real filenames to right places ??
> > 
> > Thanks for any help
> > 
> > JS
> 
> 
> Use " " rather than ' ' for the system call.
> 
> 

I have also tried that method. Doesn't work ...


Thanks anyway 
-JS


> HTH.
> 
> 
> 
> Regards,
> 
>   Ian
> 
> -----BEGIN xxx SIGNATURE-----
> Version: PGP Personal Privacy 6.5.3
> 
> iQA/AwUBPhNTGGfqtj251CDhEQITFACg/WxAnx0gpxjfaPMZ0PXunZEZcncAoLIY
> A1xiXPd6rKhwCoasdH4f+Ico
> =msDL
> -----END PGP SIGNATURE-----
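
Two things can still go wrong after switching to double quotes, and
the sketch below avoids both under stated assumptions: a command given
as one string goes through the shell, where the * in 352*288 is a glob
character, and a file name read from a listing or a pipe may still end
in a newline. Passing system a list bypasses the shell. ffmpeg_args is
a hypothetical helper and clip1.mpg a made-up file name; the ffmpeg
options are the ones from the post.

```perl
use strict;
use warnings;

# Build the argument list for one input file; nothing here is seen by a shell.
sub ffmpeg_args {
    my ($file) = @_;
    chomp $file;    # remove a trailing newline if the name still has one
    return ('ffmpeg', '-i', "/tmp/$file",
            '-b', '1300', '-s', '352*288', "/tmp/ready/$file");
}

for ('clip1.mpg') {
    my @cmd = ffmpeg_args($_);
    print join(' ', @cmd), "\n";
    # Where ffmpeg is installed:
    #   system(@cmd) == 0 or warn "ffmpeg failed: $?";
}
```

Printing the command first, as here, also shows at once whether $_
holds what it is expected to hold at that point in the script.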


------------------------------

Date: 2 Jan 2003 05:43:51 -0800
From: toddrw69@excite.com (trwww)
Subject: Re: The diamond operator
Message-Id: <d81ecffa.0301020543.3a0701c0@posting.google.com>

Benjamin Goldberg <goldbb2@earthlink.net> wrote in message news:<3E138672.95D8DE5C@earthlink.net>...
> James Tate wrote:
> > 
> > Hello,
> > 
> > I have looked through perldoc.com and have read what the have about
> > the diamond operator. I'm currently reading /Learning Perl/ by
> > O'Reilly Publshing. I don't really understand their definition of
> > using the diamond operator. Does anyone have any links or information
> > they can give me to explain it?
> 
> The diamond operator is the same as the readline() builtin function.
> 
> > An example they give me is:
> > > while (defined($line = <>)) {
> > >       chomp($line);
> > >       print "It was $line that I saw!\n";
> > > }
> > 
> > I don't really understand why the diamond operator is there in the
> > while loop...
> 
> The diamond operator with no argument (that is, <>) is a syntactic
> shortcut for <ARGV>.  And keeping in mind that <> and readline() are the
> same, this means that the above loop is the same as:
> 
>    while( defined( $line = readline(*ARGV) ) ) {
>       chomp($line);
>       print "It was $line that I saw!\n";
>    }
> 
<snip />

This is a good reply, but the two constructs are not exactly the same
=0)

from perldoc perlop:

The null filehandle <> is special: it can be used to emulate the
behavior of sed and awk. Input from <> comes either from standard
input, or from each file listed on the command line. Here's how it
works: the first time <> is evaluated, the @ARGV array is checked, and
if it is empty, $ARGV[0] is set to ``-'', which when opened gives you
standard input. The @ARGV array is then processed as a list of
filenames. The loop

    while (<>) {
        ...                     # code for each line
    }

is equivalent to the following Perl-like pseudo code:

    unshift(@ARGV, '-') unless @ARGV;
    while ($ARGV = shift) {
        open(ARGV, $ARGV);
        while (<ARGV>) {
            ...         # code for each line
        }
    }

Todd W.
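
The behaviour described in that perlop excerpt can be tried with a
small sketch. The temporary file stands in for a file name given on
the command line; assigning to @ARGV by hand and collecting the lines
in @seen are choices made for the example only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small input file so the example does not depend on real arguments.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "first\nsecond\n";
close($fh) or die "Couldn't close $path: $!";

@ARGV = ($path);    # <> opens each name in @ARGV in turn
my @seen;
while (my $line = <>) {
    chomp $line;
    push @seen, "$ARGV line $.: $line";   # $ARGV holds the current file name
}
print "$_\n" for @seen;
```

With an empty @ARGV the same loop would read standard input instead.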


------------------------------

Date: 2 Jan 2003 11:30:51 GMT
From: mcarthur@dstc.edu.au (Robert McArthur)
Subject: vectors & large amounts of data - time & space problems
Message-Id: <1041507051.134610@eeyore.dstc.edu.au>

Could anyone please help:
We are doing research using an algorithm from cog sci. It basically
requires the creation of a set of vectors, where each vector is a
scalar string as a name, and has, on average, about 200 dimensions
(there may be many more than 200, and some with 1). Each dimension
is a string and real value.
We've implemented this fine in Perl using a hash for a vector, the
keys of the hash being the strings and the values being the real
numbers of the dimensions. The vector is an object with name and a
couple of other meta-information things as well as the dimension's
hash. A set of vectors is simply another object which is a hash
storing the name (keys) and the vector (values).
It all works fine...

except we're now trying it with large amounts of data and running
into problems with both space and time. We're using a vocab of about
100,000 strings, each of which is on average 6 characters long.
Doing the maths, assuming we store each dimension name as a string
rather than a pointer to a look-up table, it should be about 1GB
of memory. We're running out of memory at around 7GB on the hefty
machine we've chosen :-( 

Looking around, I came across a quote saying that Perl stores an SV,
or rather an integer IV, in at least 28 bytes. I was assuming 4 bytes
in my calculation above, notwithstanding we're using reals and not
integers, and so I can see why we're running out of memory.

Can anyone give me any ideas for a better way to do this in Perl?

We're doing too many lookups, I believe, to have it on disk rather
than in memory so that road's out. We can code it in C, but would
rather stick to pure perl if possible. If I can fix the memory
problem, I can start work on the time problem :-) I suspect at
least part of the time problem (36 hours before it crashed out
of memory) is that it's swapping so much, so I'm happy to do the
time after the space has been sorted.

Thanks for any ideas you can give!
Robert
--
Robert McArthur		CRC for Enterprise Distributed System Technology
  BSc(Hons)		  Ph. +61 7 3365 4310        Brisbane, Australia
  MInfTech		  Fax +61 7 3365 4311	
  Grad.Cert.Ed.		  mcarthur@dstc.edu.au	


------------------------------

Date: 2 Jan 2003 12:29:50 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: vectors & large amounts of data - time & space problems
Message-Id: <av1bbu$n24$1@mamenchi.zrz.TU-Berlin.DE>

Robert McArthur <mcarthur@dstc.edu.au> wrote in comp.lang.perl.misc:
> Could anyone please help please:
> We are doing research using a algorithm from cog sci. It basically
> requires the creation of a set of vectors, where each vector is a
> scalar string as a name, and has, on average, about 200 dimensions
> (there may be many more than 200, and some with 1). Each dimension
> is a string and real value.
> We've implemented this fine in Perl using a hash for a vector, the
> keys of the hash being the strings and the values being the real
> numbers of the dimensions. The vector is an object with name and a
> couple of other meta-information things as well as the dimension's
> hash. A set of vectors is simply another object which is a hash
> storing the name (keys) and the vector (values).
> It all works fine...

[except it's eating too much memory]

It is hard to come up with a space-saving data structure without
knowing what kind of access those vectors have to support.

Here is one suggestion:  Use a compressed form of the hash for
storage and only expand to real hashes the one(s) you are actually
working with.  If there is a character (like "\0") that can never
appear in the hash keys or values, you can use it as a separator
as follows:

    sub compress_hash {
        my $href = shift;
        join "\0", %$href;
    }

    sub expand_hash {
        my $string = shift;
        return { split /\0/, $string };
    }

This stores the hash in a single string, which should be much more
economical than the explicit hash form.
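
If the values are plain numbers and the keys come from a shared
vocabulary, a binary variant of the same idea is possible. The sketch
below is an illustration under those assumptions, not part of the
suggestion above: it maps each dimension name to a small integer and
stores a vector with pack as a 4-byte index plus a native double per
dimension (12 bytes on typical platforms). The names intern,
pack_vector and unpack_vector are hypothetical.

```perl
use strict;
use warnings;

my (%id_of, @name_of);    # shared vocabulary: string <-> small integer

# Return the integer id for a dimension name, assigning a new one if needed.
sub intern {
    my ($name) = @_;
    unless (exists $id_of{$name}) {
        push @name_of, $name;
        $id_of{$name} = $#name_of;
    }
    return $id_of{$name};
}

# Pack a { dimension name => weight } hash into one binary string.
sub pack_vector {
    my ($href) = @_;
    return pack '(Nd)*', map { (intern($_), $href->{$_}) } keys %$href;
}

# Expand a packed string back into a hash reference.
sub unpack_vector {
    my ($packed) = @_;
    my %vector;
    my @flat = unpack '(Nd)*', $packed;
    while (my ($id, $weight) = splice @flat, 0, 2) {
        $vector{ $name_of[$id] } = $weight;
    }
    return \%vector;
}

my $packed = pack_vector({ cat => 0.5, dog => 1.25 });
my $again  = unpack_vector($packed);
print length($packed), " bytes for ", scalar(keys %$again), " dimensions\n";
```

As with the string form, only the vectors currently being worked on
need to exist as real hashes.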

Anno


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 4333
***************************************
