[28185] in Perl-Users-Digest
Perl-Users Digest, Issue: 9549 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Aug 2 14:10:16 2006
Date: Wed, 2 Aug 2006 11:10:08 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Wed, 2 Aug 2006 Volume: 10 Number: 9549
Today's topics:
Re: Perl hash of hash efficiency. xhoster@gmail.com
Re: Perl hash of hash efficiency. <yekasi@gmail.com>
Re: Perl hash of hash efficiency. <mritty@gmail.com>
Re: Perl hash of hash efficiency. <yekasi@gmail.com>
Re: Q: ActivePerl - calling an ActiveX object bubbabubbs@yahoo.com
Re: Q: ActivePerl - calling an ActiveX object <benmorrow@tiscali.co.uk>
Re: Recursion xhoster@gmail.com
Re: Recursion <tzz@lifelogs.com>
Re: Recursion <usenet.x.octomancer@spamgourmet.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: 02 Aug 2006 16:09:49 GMT
From: xhoster@gmail.com
Subject: Re: Perl hash of hash efficiency.
Message-Id: <20060802121750.708$HU@newsreader.com>
"tak" <yekasi@gmail.com> wrote:
> Hi,
>
> I have a script, that loads a txt file, with 240k lines in it to a hash
> currently. And when it loads the data to the hash - it becomes slower
> and slower when it reaches may be around 150k
How much memory do you have? How much are you using at this point?
> (probably due to
> collision, since perl's hash uses linear chaining...).
That is a rather unlikely bit of speculation, especially on a modern Perl.
How many buckets does your hash have and use? (print scalar %hash).
> So, i am thinking of implementing it in hash of hash... using the first
> letter of these records, and divide them into 26 sub hashes. (Since
> these records' first letter are pretty random from A-Z).
>
> So, I tried it, the performance is about the same... why??
Solving the imagined problem rarely solves the real problem. (And if that
were the problem, and it were that easy to fix, don't you think Perl would
already have made that fix itself in the core hashing code?)
When the facts don't fit your theory, re-examine your theory. You probably
have a swapping problem, not a hash collision problem. And if you do have
a collision problem, the better way to fix it would be to start out with a
higher number of buckets, by assigning to the keys function.
Xho
--
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB
------------------------------
Date: 2 Aug 2006 10:20:15 -0700
From: "tak" <yekasi@gmail.com>
Subject: Re: Perl hash of hash efficiency.
Message-Id: <1154539215.048056.104280@75g2000cwc.googlegroups.com>
xhoster@gmail.com wrote:
> "tak" <yekasi@gmail.com> wrote:
> > Hi,
> >
> > I have a script, that loads a txt file, with 240k lines in it to a hash
> > currently. And when it loads the data to the hash - it becomes slower
> > and slower when it reaches may be around 150k
>
> How much memory do you have? How much are you using at this point?
From looking at the PF Usage, it is about 1.9 GB on a 1 GB machine. The
available physical memory is down to about 10 MB when loading, but the
CPU usage remains at only about 5%.
>
> > (probably due to
> > collision, since perl's hash uses linear chaining...).
>
> That is a rather unlikely bit of speculation, especially on a modern Perl.
> How many buckets does your hash have and use? (print scalar %hash).
>
I have 1 main_hash, which stores 27 hashes in it. And out of each 27
hashes, it averages about 9k unique strings. print scalar %hash
reports, 23/32. What does this number mean?
> > So, i am thinking of implementing it in hash of hash... using the first
> > letter of these records, and divide them into 26 sub hashes. (Since
> > these records' first letter are pretty random from A-Z).
> >
> > So, I tried it, the performance is about the same... why??
>
> Solving the imagined problem rarely solves the real problem. (And if that
> were the problem, and it were that easy to fix, don't you think Perl would
> already have made that fix itself in the core hashing code?)
>
So, what do you suggest?
> When the facts don't fit your theory, re-examine your theory. You probably
> have a swapping problem, not a hash collision problem. And if you do have
> a collision problem, the better way to fix it would be to start out with a
> higher number of buckets, by assigning to the keys function.
>
Can you elaborate on what you mean by a swapping problem? And I thought
about assigning higher number of bucket to the hash itself , but i
cannot find the related function to set that... I am a Java programmer,
and this is my first perl script.. I tried looking into the constructor
for the hash itself, but it doesnt seem like it accepts argument...?
Last question,
How Do you delete an element within a hoh? Say i have a hash of hash,
like the following.
my %hoh();
loop() { # say this is the loop of each line of my txtFile
my $value = "TheRecordFromMyTxtFile";
my $letter = substr $value, 0, 1; # say, i am using the first letter as the key for subhash.
my $myKey = substr $value, 5, 9; # Say position 5 - 9 is the key for the element.
$hoh{$letter}{$myKey} = $value
}
Now, I want to delete a particular value from one of the subhash...
I tried doing this,
delete $hoh{$letter}{$value};
But it doesnt seem like it is deleting... B/c if I try to get the
length of the $hoh{$letter}, it still reports the same number...
Thanks!
tak
> Xho
------------------------------
Date: 2 Aug 2006 10:32:55 -0700
From: "Paul Lalli" <mritty@gmail.com>
Subject: Re: Perl hash of hash efficiency.
Message-Id: <1154539975.247139.30100@i42g2000cwa.googlegroups.com>
tak wrote:
> xhoster@gmail.com wrote:
> > "tak" <yekasi@gmail.com> wrote:
> I have 1 main_hash, which stores 27 hashes in it. And out of each 27
> hashes, it averages about 9k unique strings. print scalar %hash
> reports, 23/32. What does this number mean?
It means that Perl has allocated 32 buckets for this hash, and that 23
of them are currently in use. So, only 4 collisions in the "main"
hash.
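To make the "23/32" figure concrete, here is a minimal sketch of reading a hash's bucket usage. The hash contents are made up for illustration. It also rests on one assumption worth stating loudly: scalar(%hash) reports "used/allocated" only on the Perls current in this thread (before 5.26); newer Perls return the key count there and expose the ratio through Hash::Util::bucket_ratio instead, so the sketch looks for that function first.

```perl
use strict;
use warnings;
use Hash::Util ();

# Hypothetical stand-in for the poster's top-level hash: 27 keys.
my %main_hash = map { ("key$_", 1) } 1 .. 27;

# Perl 5.26+ provides Hash::Util::bucket_ratio; on older Perls the same
# "used/allocated" string comes from scalar(%hash).
my $bucket_ratio = Hash::Util->can('bucket_ratio');
my $ratio = $bucket_ratio ? $bucket_ratio->(\%main_hash) : scalar(%main_hash);

my ($used, $allocated) = split m{/}, $ratio;
my $keys = keys %main_hash;

print "$keys keys in $used of $allocated buckets\n";
print "keys that landed in an already occupied bucket: ", $keys - $used, "\n";
```

Read this way, 27 keys spread over 23 used buckets means 4 keys share a bucket with an earlier key, which is presumably where the "4" above comes from.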
> > And if you do have
> > a collision problem, the better way to fix it would be to start out with a
> > higher number of buckets, by assigning to the keys function.
> >
>
> And I thought
> about assigning higher number of bucket to the hash itself , but i
> cannot find the related function to set that...
Xho just gave it to you. The `keys` function. If you read the Perl
documentation on this function, by typing at your console window:
perldoc -f keys
you will find:
====================
As an lvalue "keys" allows you to increase the
number of hash buckets allocated for the given hash.
This can gain you a measure of efficiency if you
know the hash is going to get big. (This is similar
to pre-extending an array by assigning a larger
number to $#array.) If you say
keys %hash = 200;
then %hash will have at least 200 buckets allocated
for it--256 of them, in fact, since it rounds up to
the next power of two.
====================
> I am a Java programmer, and this is my first perl script..
Welcome to the world of Perl. You'll love it, I promise. :-)
> I tried looking into the constructor for the hash itself,
There is no such thing. Constructors are methods of classes. Hashes
are native data types, not objects. They are simply declared and used.
> but it doesnt seem like it accepts argument...?
I don't know what you were trying to give arguments to. If you mean
simply the `my` keyword, then you are correct - you cannot pre-allocate
buckets when you declare the hash. Instead, declare it, and then
assign buckets, using the keys function as described above.
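As a minimal sketch of "declare it, then assign buckets": the hash name, record names, and loop below are hypothetical stand-ins for the real file-loading code.

```perl
use strict;
use warnings;

my %main_hash;                 # declare first ...
keys(%main_hash) = 300_000;    # ... then request at least 300,000 buckets
                               # (Perl rounds up to the next power of two)

# Hypothetical load loop standing in for reading the text file.
for my $n (1 .. 1000) {
    $main_hash{"record$n"} = "value $n";
}

my $loaded = keys %main_hash;  # keys in scalar context gives the count
print "$loaded records loaded\n";
```

Pre-sizing only spares Perl from growing the bucket array as the hash fills. It does not reduce the memory the data itself needs, so it is unlikely to help if the machine is swapping.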
> How Do you delete an element within a hoh? Say i have a hash of hash,
> like the following.
>
> my %hoh();
>
> loop() { # say this is the loop of each line of my txtFile
I don't know what this means. This is not Perl code. Please show real
code whenever possible.
> my $value = "TheRecordFromMyTxtFile";
> my $letter = substr $value, 0, 1; # say, i am using the first letter as the key for subhash.
> my $myKey = substr $value, 5, 9; # Say position 5 - 9 is the key for the element.
> $hoh{$letter}{$myKey} = $value
> }
>
>
> Now, I want to delete a particular value from one of the subhash...
>
> I tried doing this,
>
> delete $hoh{$letter}{$value};
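That is the right syntax for deleting an entry from that particular
"subhash", provided the second subscript is the key the entry was
stored under. In your loop that key is $myKey, not $value.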
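A minimal sketch of the delete, using made-up records in place of the lines of the text file. Note that the loop in the question stores each entry under $myKey, so that is the subscript delete needs; deleting by $value names a key that was never stored and silently removes nothing.

```perl
use strict;
use warnings;

# Hypothetical records; the real ones come from the text file.
my @records = ("A0001KEY000001rest", "A0002KEY000002rest", "B0003KEY000003rest");

my %hoh;
for my $value (@records) {
    my $letter = substr $value, 0, 1;   # first character picks the subhash
    my $myKey  = substr $value, 5, 9;   # 9 characters starting at offset 5
    $hoh{$letter}{$myKey} = $value;
}

my $value  = $records[0];
my $letter = substr $value, 0, 1;
my $myKey  = substr $value, 5, 9;

# Wrong subscript: no entry was stored under $value, so nothing changes.
delete $hoh{$letter}{$value};
my $still_there = exists $hoh{$letter}{$myKey};

# Right subscript: the entry stored under $myKey is removed.
delete $hoh{$letter}{$myKey};
my $gone = !exists $hoh{$letter}{$myKey};

print "after delete by value: ", ($still_there ? "still present" : "removed"), "\n";
print "after delete by key:   ", ($gone ? "removed" : "still present"), "\n";
```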
> But it doesnt seem like it is deleting... B/c if I try to get the
> length of the $hoh{$letter}, it still reports the same number...
What, exactly, do you mean by "get the length of the $hoh{$letter}"?
Again, please show real code whenever possible. Did you, by any
chance, do something like:
print length $hoh{$letter};
?
That does not, at all, give you what you want. When you use a
reference as a string, you get a string representing the type
of reference and its memory address. To see what I mean, try printing
out:
print $hoh{$letter};
You should see something like
HASH(0x14d410)
It is this string that you were passing to the length function.
Obviously, the length of that string isn't going to change simply
because you removed one of the key/value pairs.
To determine the "size" (that is, number of keys) of a hash, you again
use the keys function, this time in scalar context:
print scalar(keys %{$hoh{$letter}});
The additional punctuation around $hoh{$letter} is what we call
"dereferencing" a reference. You can read all about it by typing at
your console window:
perldoc perlreftut
perldoc perllol
perldoc perldsc
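To put the pieces above side by side, here is a small sketch with a made-up hash of hashes. It shows the stringified reference, the key count of one subhash, and how to print a single entry or walk the whole structure.

```perl
use strict;
use warnings;

# Hypothetical data in the same shape as the structure discussed above.
my %hoh = (
    A => { KEY000001 => "A0001KEY000001rest", KEY000002 => "A0002KEY000002rest" },
    B => { KEY000003 => "B0003KEY000003rest" },
);
my $letter = 'A';

my $as_string = "$hoh{$letter}";           # something like HASH(0x14d410)
my $count     = keys %{ $hoh{$letter} };   # number of keys in the subhash

print "reference as a string: $as_string\n";
print "keys under '$letter': $count\n";

# Printing one entry: the second subscript is the key, not the stored value.
print $hoh{$letter}{KEY000001}, "\n";

# Walking the whole structure, one subhash at a time.
for my $l (sort keys %hoh) {
    for my $k (sort keys %{ $hoh{$l} }) {
        print "$l / $k => $hoh{$l}{$k}\n";
    }
}
```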
Hope this helps,
Paul Lalli
------------------------------
Date: 2 Aug 2006 11:04:48 -0700
From: "tak" <yekasi@gmail.com>
Subject: Re: Perl hash of hash efficiency.
Message-Id: <1154541888.560731.131420@p79g2000cwp.googlegroups.com>
Paul Lalli wrote:
> tak wrote:
> > xhoster@gmail.com wrote:
> > > "tak" <yekasi@gmail.com> wrote:
> > I have 1 main_hash, which stores 27 hashes in it. And out of each 27
> > hashes, it averages about 9k unique strings. print scalar %hash
> > reports, 23/32. What does this number mean?
>
> It means that Perl has allocated 32 buckets for this hash, and that 23
> of them are currently in use. So, only 4 collisions in the "main"
> hash.
Why 4 collisions? Do you mean 32 - 23 = 9? Or because you knew that I have
27 subhashes, so 27 - 23? Without using the 27 hash of hashes, print
scalar %mainHash reports 16k / 23k. (Of course, it reported the exact
numbers, but I didn't write them down; I just remember it's 16k and 23k.)
That is a huge amount of collision.
>
> > > And if you do have
> > > a collision problem, the better way to fix it would be to start out with a
> > > higher number of buckets, by assigning to the keys function.
> > >
> >
> > And I thought
> > about assigning higher number of bucket to the hash itself , but i
> > cannot find the related function to set that...
>
> Xho just gave it to you. The `keys` function. If you read the Perl
> documentation on this function, by typing at your console window:
> perldoc -f keys
> you will find:
> ====================
> As an lvalue "keys" allows you to increase the
> number of hash buckets allocated for the given hash.
> This can gain you a measure of efficiency if you
> know the hash is going to get big. (This is similar
> to pre-extending an array by assigning a larger
> number to $#array.) If you say
>
> keys %hash = 200;
>
> then %hash will have at least 200 buckets allocated
> for it--256 of them, in fact, since it rounds up to
> the next power of two.
> ====================
> > I am a Java programmer, and this is my first perl script..
>
> Welcome to the world of Perl. You'll love it, I promise. :-)
>
> > I tried looking into the constructor for the hash itself,
>
> There is no such thing. Constructors are methods of classes. Hashes
> are native data types, not objects. They are simply declared and used.
>
> > but it doesnt seem like it accepts argument...?
>
> I don't know what you were trying to give arguments to. If you mean
> simply the `my` keyword, then you are correct - you cannot pre-allocate
> buckets when you declare the hash. Instead, declare it, and then
> assign buckets, using the keys function as described above.
I tried to do this, keys %main_hash = 300000; but it is still running
slow when it reaches 150000... perhaps it is not the collision problem,
as xho mentioned?
>
> > How Do you delete an element within a hoh? Say i have a hash of hash,
> > like the following.
> >
> > my %hoh();
> >
> > loop() { # say this is the loop of each line of my txtFile
>
> I don't know what this means. This is not Perl code. Please show real
> code whenever possible.
>
> > my $value = "TheRecordFromMyTxtFile";
> > my $letter = substr $value, 0, 1; # say, i am using the first letter as the key for subhash.
> > my $myKey = substr $value, 5, 9; # Say position 5 - 9 is the key for the element.
> > $hoh{$letter}{$myKey} = $value
> > }
> >
> >
> > Now, I want to delete a particular value from one of the subhash...
> >
> > I tried doing this,
> >
> > delete $hoh{$letter}{$value};
>
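> That is the right syntax for deleting an entry from that particular
> "subhash", provided the second subscript is the key the entry was
> stored under. In your loop that key is $myKey, not $value.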
>
Say I have the key of subhash, as $letter, and the item in subhash as,
$value.
delete $hoh{$letter}{$value};
This should delete that from the hash, right?
> > But it doesnt seem like it is deleting... B/c if I try to get the
> > length of the $hoh{$letter}, it still reports the same number...
>
> What, exactly, do you mean by "get the length of the $hoh{$letter}"?
> Again, please show real code whenever possible. Did you, by any
> chance, do something like:
> print length $hoh{$letter};
> ?
>
> That does not, at all, give you what you want. When you use a
> reference as a string, you get a string representing the type
> of reference and its memory address. To see what I mean, try printing
> out:
> print $hoh{$letter};
> You should see something like
> HASH(0x14d410)
> It is this string that you were passing to the length function.
> Obviously, the length of that string isn't going to change simply
> because you removed one of the key/value pairs.
> To determine the "size" (that is, number of keys) of a hash, you again
> use the keys function, this time in scalar context:
> print scalar(keys %{$hoh{$letter}});
>
> The additional punctuation around $hoh{$letter} is what we call
> "dereferencing" a reference. You can read all about it by typing at
> your console window:
> perldoc perlreftut
> perldoc perllol
> perldoc perldsc
>
Say I want to look at the value in this key,
$hoh{$letter}{$value}. How do you print it?
I tried print $hoh{$letter}{$value}; but it prints nothing...
Thanks,
Tak
> Hope this helps,
> Paul Lalli
------------------------------
Date: 2 Aug 2006 08:18:46 -0700
From: bubbabubbs@yahoo.com
Subject: Re: Q: ActivePerl - calling an ActiveX object
Message-Id: <1154531926.519764.60020@i3g2000cwc.googlegroups.com>
Sinan:
Thanks for your feedback. I made the code changes you suggested, but am
still not getting correct results from the ActiveX object. But I am not
getting any errors/warnings, either.
You are right, the method is not called 'foo', but something different.
I was hesitant, however, to provide the method's real name in the
code snippet, as I'm not supposed to reveal proprietary information.
But I'm 100% sure I am calling the right method.
What the method does is it implements a simple formula like arg4 =
(arg1-X)*arg2/arg3. At least that's what I've been able to figure out
by running some test cases and having the knowledge of the problem
domain. You are probably going to ask: "why don't you just implement
the formula in your code and forget about using the ActiveX object?" I
wish I could, but I can't... long story... the client and the 'suits'
insist on me using the ActiveX object.
Thanks
A. Sinan Unur wrote:
> bubbabubbs@yahoo.com wrote in news:1154481724.075186.314530
> @i42g2000cwa.googlegroups.com:
>
> > I have a 3rd-party ActiveX object (as a DLL) that I am trying to call
> > from Perl (ActivePerl 5.8) The function that I am calling takes three
> > 'in' parameters, and returns the result via the fourth 'out'
> > parameter. So I need to pass the last parameter by reference, which I
> > know Perl supports. (Unfortunately, the 3rd party cannot/will not
> > modify the interface.)
>
> Given that we have no idea what this component is, it is hard to come up
> with specific recommendations. I'll make some general comments below.
>
>
> > Here is my ActivePerl script (slightly simplified):
> >
> > ################################################################
>
> use strict;
> use warnings;
>
> > use Win32::OLE;
> > use Win32::OLE::Const 'Microsoft ActiveX Data Objects';
>
> Win32::OLE->Option(Warn => 3);
>
> > my $sLogFileName = "log.txt";
> > open( LOG, ">$sLogFileName" );
>
> open my $log_fh, '>', $sLogFileName
> or die "Cannot open '$sLogFileName': $!";
>
> > $arg4 = 0.0;
>
> my $arg4 = 0.0;
>
> > my $service = Win32::OLE->new( '3rdPartyComponent.Service' );
> > print LOG "before: " . $arg4 . "\n";
>
> print $log_fh "before: $arg4\n";
>
> > $service->foo( 31, 4000, 28000, \$arg4 );
>
> I seriously doubt the service provides a method named 'foo'. There is no
> way for us to determine if you are using the correct method.
>
> It is possible the higher warning setting for Win32::OLE will give some
> useful information.
>
> Sinan
> --
> A. Sinan Unur <1usa@llenroc.ude.invalid>
> (remove .invalid and reverse each component for email address)
>
> comp.lang.perl.misc guidelines on the WWW:
> http://augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
------------------------------
Date: Wed, 2 Aug 2006 16:58:49 +0100
From: Ben Morrow <benmorrow@tiscali.co.uk>
Subject: Re: Q: ActivePerl - calling an ActiveX object
Message-Id: <pmu6q3-uta.ln1@osiris.mauzo.dyndns.org>
Quoth bubbabubbs@yahoo.com:
> I have a 3rd-party ActiveX object (as a DLL) that I am trying to call
> from Perl (ActivePerl 5.8) The function that I am calling takes three
> 'in' parameters, and returns the result via the fourth 'out'
> parameter. So I need to pass the last parameter by reference, which I
> know Perl supports.
Perl supports it for calling Perl subs, but Win32::OLE does not. You
need to use explicit Variants: see the section at the end of
Win32::OLE::Variant "Variants by reference".
Ben
--
Giles: It's very common for Indian spirits to change to animal form.
Buffy: [...] and, 'Native American'. G: Sorry? B: We don't say 'Indian'.
G: Oh, right, yes; always behind on the terms... yes, still trying not to refer
to you lot as 'bloody colonials'. [Buffy, 'Pangs'] benmorrow@tiscali.co.uk
------------------------------
Date: 02 Aug 2006 16:34:46 GMT
From: xhoster@gmail.com
Subject: Re: Recursion
Message-Id: <20060802124247.030$Bg@newsreader.com>
"kokolo" <koko_loko_0@yahoo.co.uk> wrote:
> Thx a lot.
> At this very moment i know nothing about referencing in Perl so I'll
> have to learn it.
I recommend perldoc perlreftut as a starting point.
> I just don't know if I want to pass the entire set to subroutine as my
> concept was to keep "pivot" while "sorting" left and right side
> recursively.
Well, you have hit one of those famous tradeoffs. There is no doubt that
your partitioning method is simpler, less error prone, less subtle, etc.
than one of the traditional in-place pivot methods. But it is also slower
due to all the allocation and copying going on.
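To illustrate the tradeoff, here is a hedged sketch of both styles. Neither is the original poster's code, which does not appear in this digest; the sub names are invented, the comparisons assume numeric data, and the in-place version uses a simple last-element pivot (Lomuto partition) chosen only for brevity.

```perl
use strict;
use warnings;

# Copying style: each call builds fresh left/right lists around the pivot.
sub qsort_copy {
    my @list = @_;
    return @list if @list <= 1;
    my $pivot = shift @list;
    my @left  = grep { $_ <  $pivot } @list;
    my @right = grep { $_ >= $pivot } @list;
    return (qsort_copy(@left), $pivot, qsort_copy(@right));
}

# In-place style: takes an array reference and swaps elements inside it,
# so nothing is copied, at the cost of more index bookkeeping.
sub qsort_inplace {
    my ($aref, $lo, $hi) = @_;
    $lo = 0       unless defined $lo;
    $hi = $#$aref unless defined $hi;
    return if $lo >= $hi;

    my $pivot = $aref->[$hi];
    my $store = $lo;
    for my $i ($lo .. $hi - 1) {
        if ($aref->[$i] < $pivot) {
            @$aref[$i, $store] = @$aref[$store, $i];   # swap via array slice
            $store++;
        }
    }
    @$aref[$store, $hi] = @$aref[$hi, $store];         # move pivot into place

    qsort_inplace($aref, $lo, $store - 1);
    qsort_inplace($aref, $store + 1, $hi);
    return;
}

my @data   = (5, 3, 9, 1, 5, 7);
my @sorted = qsort_copy(@data);    # @data itself is left untouched
print "copying:  @sorted\n";

qsort_inplace(\@data);             # @data is reordered where it sits
print "in place: @data\n";
```

The second version is where the reference syntax from perlreftut comes in: passing \@data lets the sub modify the caller's array instead of a copy.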
Xho
------------------------------
Date: Wed, 02 Aug 2006 12:47:05 -0400
From: Ted Zlatanov <tzz@lifelogs.com>
Subject: Re: Recursion
Message-Id: <g69zmenjd9y.fsf@CN1374059D0130.kendall.corp.akamai.com>
On 2 Aug 2006, bernard.el-haginDODGE_THIS@lido-tech.net wrote:
> Most questions asked here are rejected immediately because they
> don't conform to *every* *single* *fucking* *point* in the
> guidelines, even if they are perfectly clear. The only ones left are
> so simple, they can be answered with a perldoc reference.
I have to agree, I've been in c.l.p.misc on and off for a while, and I
find the reactions people have to most newbie posts too strong lately.
Sure, explain what they did wrong, but do it gently and in addition to
the answer. Everyone has an ego, and a new programmer is especially
vulnerable to criticism from people he may consider wiser.
Ted
------------------------------
Date: Wed, 02 Aug 2006 18:59:54 +0100
From: Richard Gration <usenet.x.octomancer@spamgourmet.com>
Subject: Re: Recursion
Message-Id: <pan.2006.08.02.17.59.53.719430@spamgourmet.com>
On Wed, 02 Aug 2006 09:47:06 +0200, Bernard El-Hagin wrote:
> "Paul Lalli" <mritty@gmail.com> wrote:
>> 3) Get over yourself.
>
>
> This is advice that a *lot* of the regulars should take. Mostly the
> newer ones.
<SNIP>
> The usefulness of this group has become questionable at best. The likes
> of you have made it so. Most questions asked here are rejected
> immediately because they don't conform to *every* *single* *fucking*
> *point* in the guidelines, even if they are perfectly clear. The only
> ones left are so simple, they can be answered with a perldoc reference.
> It's pathetic. In fact, I think I need to take a loooong break from this
> place. I'll miss the *true* wisdom of Tad, Anno, Abigail, Randal and
> some others, but this new bunch of regulars, who have no clue, but make
> up for it in arrogance are ruining even their input for me.
This NG has been off my subscribe list for 6 months for *exactly* the
reasons so eloquently expressed by Bernard in this thread and in
particular the 2 quotes above. I drop in every month or so just to find
out if it has become worth reading again. Haha, the joke is still on me
(and anyone expecting help here :-( ).
In the end it was the bitterness and venom and ... well, just good old
fashioned nastiness that stopped me *being* *able* to read here. It really
did make me sad to see this once witty, friendly, tolerant and above all
*informative* NG brought so low.
To Bernard: Very well done for calling a spade a spade, and for having the
guts to. I tried but wasn't up to the task.
To the spades: Get some psychoanalysis and figure out what's eating you
before you die of cancer. No-one can possibly have that much hatred
reserved only for inexperienced usenauts/programmers ...
Rich
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 9549
***************************************