[22114] in Perl-Users-Digest

Perl-Users Digest, Issue: 4336 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Jan 3 03:05:50 2003

Date: Fri, 3 Jan 2003 00:05:09 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Fri, 3 Jan 2003     Volume: 10 Number: 4336

Today's topics:
    Re: AWK vs PERL - splitting fields <bongie@gmx.net>
    Re: AWK vs PERL - splitting fields <mgjv@tradingpost.com.au>
    Re: conditional search and replace <Jodyman@hotmail.com>
    Re: Learn about natural male enhancement          (qk07 <uri@stemsystems.com>
        Loop with Array or Loop and Read File? (Andrew Burton)
    Re: Loop with Array or Loop and Read File? <goldbb2@earthlink.net>
    Re: Need help with split <Jodyman@hotmail.com>
    Re: Need help with split <krahnj@acm.org>
    Re: Perl for spliting vcf files (palm->iPod) <goldbb2@earthlink.net>
    Re: Prototype declaration with built-in function <mgjv@tradingpost.com.au>
    Re: Prototype declaration with built-in function <mgjv@tradingpost.com.au>
    Re: Prototype declaration with built-in function (Jeff Mott)
    Re: system command and $_ variable (juha)
    Re: vectors & large amounts of data - time & space prob ctcgag@hotmail.com
    Re: vectors & large amounts of data - time & space prob ctcgag@hotmail.com
        What is wrong in my SYSTEM commad (juha)
        XS memory management <eric.anderson@cordata.net>
    Re: XS memory management <goldbb2@earthlink.net>
    Re: XS memory management <mgjv@tradingpost.com.au>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Fri, 03 Jan 2003 03:08:26 +0100
From: "Harald H.-J. Bongartz" <bongie@gmx.net>
Subject: Re: AWK vs PERL - splitting fields
Message-Id: <3058460.fpdVzALBLH@nyoga.dubu.de>

Christopher Hamel wrote:
> AWK is a nice tool, and I
> like it a lot, but it's no more a programming language than 'cat.'

I disagree on that (although I rarely use awk anymore).
awk has conditionals, loops and other control structures, arrays, 
functions etc., and I knew people who wrote awk programs that filled
pages ... well, most of them use Perl now. :)

Ciao,
        Harald
-- 
Harald H.-J. Bongartz <bongie@gmx.net>
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
"The Law of Self Sacrifice"
When you starve with a tiger, the tiger starves last.



------------------------------

Date: Fri, 03 Jan 2003 03:00:08 GMT
From: Martien Verbruggen <mgjv@tradingpost.com.au>
Subject: Re: AWK vs PERL - splitting fields
Message-Id: <slrnb19vac.307.mgjv@verbruggen.comdyn.com.au>

On 2 Jan 2003 08:32:22 -0800,
	Christopher Hamel <hamelcd@hotmail.com> wrote:
> Martien Verbruggen <mgjv@tradingpost.com.au> wrote in message news:<slrnb10cqr.4tt.mgjv@martien.heliotrope.home>...
>> On Mon, 30 Dec 2002 11:11:36 +0000,
>> 	Miguel Angelo Lapa Duarte <Miguel.Duarte@tmn.pt> wrote:
>> > 
>> > Once I argued with an Un*x old-timer at my company that perl was better
>> > than awk. He told me that perl, although more flexible, could be
>> > orders of magnitude slower than AWK while splitting fields.
>> 
>> He's right. And awk has other advantages over perl. 
>> 
>> $ man perlvar
>> [snip]
>>                Remember: the value of "$/" is a string, not a
>>                regex.  awk has to be better for something. :-)
>> [snip]
>> 
> 
> On that note, 'cut' is likely faster than AWK if the only goal is

$ /usr/bin/time awk -F, '{ print $10 }' test > /dev/null
0.65user 0.05system 0:00.72elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (177major+28minor)pagefaults 0swaps

$ /usr/bin/time cut -d, -f 10 test > /dev/null      
1.28user 0.05system 0:01.40elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (127major+20minor)pagefaults 0swaps

Doesn't seem to be for me (linux) :)

On Solaris /usr/xpg4/bin/awk is slower than cut (/usr/bin/awk doesn't
work with that many fields).
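
For reference, the Perl side of this comparison would be something like
perl -F, -lane 'print $F[9]' test. Inside a program, a minimal sketch
(the sub name tenth_field is made up for illustration) can give split()
a limit, so it does not have to cut up the rest of the line once the
tenth field is complete. Whether that closes the gap with awk would
need measuring on your own data:

```perl
use strict;
use warnings;

# Hypothetical helper: return the 10th comma-separated field of a line,
# or undef if the line has fewer than 10 fields.
sub tenth_field {
    my ($line) = @_;
    chomp $line;
    # A limit of 11 stops split() after the 10th field; whatever follows
    # stays unsplit in the 11th element.
    my @fields = split /,/, $line, 11;
    return $fields[9];
}

print tenth_field("a,b,c,d,e,f,g,h,i,j,k,l\n"), "\n";   # prints "j"
```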

> splitting fields, but neither AWK nor cut nor <insert favorite OS tool
> here> is really a programming language.  AWK is a nice tool, and I
> like it a lot, but it's no more a programming language than 'cat.'

I disagree.

awk is much more a programming language than cat. It is probably not
a general purpose programming language, but it has most of the
constructs needed for it to be a programming language. It's got
conditionals, loops, variables, boolean operators, mathematical
operators and functions, and loads more. It may be a specialised
programming language, but I'm pretty sure it is one :)

> If performance is REALLY that big of an issue, I personally have no
> problem with imbedding the OS tools into the Perl program:
> 
>   open IN1, "cut -d\\| -f23 $file1 |" or die;
>   open IN2, "grep -v ^M- $file2 |" or die;
> 
> I realize this is frowned upon, as it makes the program non-portable,
> and the performance increase is typically marginal in the grand scheme
> of the overall program, but it can help if you're fighting for seconds
> here and there.

Indeed. And there's nothing wrong with wrapping this sort of thing in
a subroutine or package that tries one thing first (the fast way) and
falls back on a slower way if that fails or is impossible for some
other reason.
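
A minimal sketch of such a wrapper, assuming a cut(1) on the PATH and
Perl 5.8 or later for the list form of pipe open. The name nth_field
and its $force_perl switch are made up for illustration. One known
difference: cut passes lines that contain no delimiter through
unchanged, and the fallback below does not reproduce that.

```perl
use strict;
use warnings;

# Hypothetical helper: return field $n (1-based) of every line in $file.
# Tries the external cut(1) first, then falls back to plain split().
# Pass a true $force_perl to skip the external tool.
sub nth_field {
    my ($file, $delim, $n, $force_perl) = @_;

    unless ($force_perl) {
        my @out;
        my $ok = eval {
            # List-form pipe open bypasses the shell, so $delim and
            # $file need no quoting.
            open(my $pipe, '-|', 'cut', "-d$delim", "-f$n", $file)
                or die "cannot run cut: $!\n";
            @out = <$pipe>;
            close($pipe) or die "cut exited with an error\n";
            1;
        };
        if ($ok) {
            chomp @out;
            return @out;
        }
    }

    # The slower, portable way.
    open(my $in, '<', $file) or die "Cannot open $file: $!";
    my @out;
    while (my $line = <$in>) {
        chomp $line;
        my @fields = split /\Q$delim\E/, $line, -1;
        push @out, defined $fields[$n - 1] ? $fields[$n - 1] : '';
    }
    close $in;
    return @out;
}
```

The caller never needs to know which path was taken, which is the
whole point of hiding the choice in one place.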

Of course, with things like Inline::C you could even write a really
fast, specialised line parsing routine, but at some point you have to
decide not to write everything yourself :)

Martien
-- 
                        | 
Martien Verbruggen      | Make it idiot proof and someone will make a
Trading Post Australia  | better idiot.
                        | 


------------------------------

Date: Fri, 03 Jan 2003 04:10:20 GMT
From: "Jodyman" <Jodyman@hotmail.com>
Subject: Re: conditional search and replace
Message-Id: <M28R9.9416$134.1005858@newsread1.prod.itd.earthlink.net>

"Tad McClellan" <tadmc@augustmail.com> wrote in message
news:slrnb194jt.n88.tadmc@magna.augustmail.com...
> Damian Ibbotson <member@dbforums.com> wrote:
>
> > I have a requirement to modify a string to correct assumed typos.
> > Basically the logic I have to apply is to convert the letters "I" or "O"
> > to the numbers "1" or "0" respectively if they occur in positions 12 or
> > 13 of the string. I apply the same logic for the 14th to the last (the
> > 17th) character but in this instance must also map the letter "Z" to the
> > number "2".
>
>
> > Any ideas (with explanation)?
>
>
> Use substr() as an lvalue to restrict where a tr/// is to be applied:
>
>    perl -pe 'substr($_, 11) =~ tr/OI/01/; substr($_, 13) =~ tr/Z/2/'
>
>
> > An elegant solution in awk or similar
> > would also be appreciated.
>
>
> From the Perl newsgroup?
>
> Pffft!   :-)
>

Good One!  LOL.
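
For anyone skimming past the one-liner quoted above: substr() used as
an lvalue hands tr/// a window onto the original string, so only the
characters from that offset onwards are translated. A small sketch
(the sub name fix_typos and the sample string are made up, and it
assumes the string has at least 13 characters):

```perl
use strict;
use warnings;

# Hypothetical wrapper around the two lvalue-substr translations.
# Assumes at least 13 characters; an lvalue substr() that starts
# beyond the end of the string is a fatal error.
sub fix_typos {
    my ($s) = @_;
    substr($s, 11) =~ tr/OI/01/;   # offset 11 = position 12 to the end
    substr($s, 13) =~ tr/Z/2/;     # offset 13 = position 14 to the end
    return $s;
}

print fix_typos('ABCDEFGHIJKIOZOIZ'), "\n";   # prints ABCDEFGHIJK102012
```

Note that the "I" in position 9 is left alone, because it sits before
the first window.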

Jody




------------------------------

Date: Fri, 03 Jan 2003 05:02:00 GMT
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: Learn about natural male enhancement          (qk07dqj9y4)
Message-Id: <x7of6y3jy0.fsf@mail.sysarch.com>

>>>>> "h" == htkd  <htkd@kibtckgcrm.com> writes:

  h> * Find out about the "Penis Enhancement" plan rated #1 by Health Web
  h> 2000. *

while( $dick < $max_dick_size ) {

	$dick++ ;
	$wallet-- ;
	$dick-- ;
}

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  -------- http://www.stemsystems.com
----- Stem and Perl Development, Systems Architecture, Design and Coding ----
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org
Damian Conway Perl Classes - January 2003 -- http://www.stemsystems.com/class


------------------------------

Date: 03 Jan 2003 02:10:13 GMT
From: tuglyraisin@aol.commune (Andrew Burton)
Subject: Loop with Array or Loop and Read File?
Message-Id: <20030102211013.16044.00000424@mb-cs.aol.com>

I'm working on an aggregator that will pull URLs from a file, perform commands
based on the URL, and then move on to another URL until the list runs out. 
Looking at the program, I see two ways to go.  The first way is to read the URL
list into an array, and later cycle through the array with a loop.  The other
way is to read from the file one line at a time, work from that line, then move
to the next.  Can someone recommend which of these two ways is better?  If my
query has been badly written, I can describe it better.  Thanks!

Andrew Burton -- tuglyraisin at aol dot com
Felecia Station on Harvestgain
"well, it's software, it can do anything :-)" - Ankh
"A racist alien robot, now that's cutting-edge children's programming, screw
the AIDS puppets." - Derik Smith



------------------------------

Date: Thu, 02 Jan 2003 21:35:01 -0500
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: Loop with Array or Loop and Read File?
Message-Id: <3E14F6D5.5B41DD44@earthlink.net>

Andrew Burton wrote:
> 
> I'm working on an aggregator that will pull URLs from a file, perform
> commands based on the URL, and then move on to another URL until the
> list runs out. Looking at the program, I see two ways to go.  The
> first way is to read the URL list into an array, and later cycle through
> the array with a loop.  The other way is to read from the file one
> line at a time, work from that line, then move to the next.  Can
> someone recommend which of these two ways is better?  If my query has
> been badly written, I can describe it better.  Thanks!

It's a tradeoff between various kinds of resources.

If you read into an array, you're using more memory.

If you read one line at a time, then you're keeping a filehandle open
until you've processed every URL, and some OSes give programs a limited
number of filehandles.

If the file is small, or if you need to loop over it a number of times,
then read it into an array.  Otherwise, the one line at a time technique
is preferred.

PS: You may get higher throughput on your url-downloading program by
using LWP::Parallel, or by forking.
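
The two approaches side by side, as a rough sketch. handle_url is a
made-up stand-in for whatever the aggregator does with each URL, and
the sub names are illustrative only:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real per-URL work.
sub handle_url {
    my ($url) = @_;
    return length $url;
}

# Way 1: read the whole list into an array, close the file, then loop.
# Uses memory proportional to the file size.
sub process_slurped {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    chomp(my @urls = <$fh>);
    close $fh;
    return map { handle_url($_) } @urls;
}

# Way 2: read and process one line at a time.
# Only one line is held in memory, but the filehandle stays open
# until the last URL has been processed.
sub process_streamed {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    my @results;
    while (my $url = <$fh>) {
        chomp $url;
        push @results, handle_url($url);
    }
    close $fh;
    return @results;
}
```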

-- 
$..='(?:(?{local$^C=$^C|'.(1<<$_).'})|)'for+a..4;
$..='(?{print+substr"\n !,$^C,1 if $^C<26})(?!)';
$.=~s'!'haktrsreltanPJ,r  coeueh"';BEGIN{${"\cH"}
|=(1<<21)}""=~$.;qw(Just another Perl hacker,\n);


------------------------------

Date: Fri, 03 Jan 2003 03:45:59 GMT
From: "Jodyman" <Jodyman@hotmail.com>
Subject: Re: Need help with split
Message-Id: <XH7R9.9359$134.1043839@newsread1.prod.itd.earthlink.net>

"Jim Janovich" <bigpun@mindspring.com> wrote in message
news:aupjv7$pe7$1@slb9.atl.mindspring.net...
> I have a string that looks like:
>
> c:\folder1\folder2\folder3\blah.jpg
>
> But the folders can be any number of folders.  All I need is the last
> piece (blah.jpg).  Can someone help?  I was going to split on \ but since
> there can be any number of them I cannot figure it out.  Any help would be
> appreciated.

TMTOWTDI with Perl. You don't even need split; try this too:

#!c:\perl\bin\perl -w
use strict;

my $filename1 = 'c:\windows\system32\test.jpg';
my $filename2 = '/home/jodyman/pics/test.jpg';

my ($results1) = $filename1 =~ /\\(\w+\.\w+)$/;
my ($results2) = $filename2 =~ /\/(\w+\.\w+)$/;
my ($results3) = $filename2 =~ /[\\|\/](\w+\.\w+)$/;
my ($results4) = $filename1 =~ /[\\|\/](\w+\.\w+)$/;

print "Results for DOS Regex $filename1 = $results1\n";
print "Results for UNIX Regex $filename2 = $results2\n";
print "Results for MULTI Regex Unix Name: $filename2 = $results3\n";
print "Results for MULTI Regex DOS Name: $filename1 = $results4\n";

HTH,

Jody




------------------------------

Date: Fri, 03 Jan 2003 04:47:11 GMT
From: "John W. Krahn" <krahnj@acm.org>
Subject: Re: Need help with split
Message-Id: <3E15154F.E518EBC7@acm.org>

Jodyman wrote:
> 
> TMTOWTDI with Perl. You don't even need split; try this too:
> 
> #!c:\perl\bin\perl -w
> use strict;
> 
> my $filename1 = 'c:\windows\system32\test.jpg';
> my $filename2 = '/home/jodyman/pics/test.jpg';
> 
> my ($results1) = $filename1 =~ /\\(\w+\.\w+)$/;
> my ($results2) = $filename2 =~ /\/(\w+\.\w+)$/;
> my ($results3) = $filename2 =~ /[\\|\/](\w+\.\w+)$/;
> my ($results4) = $filename1 =~ /[\\|\/](\w+\.\w+)$/;
                                     ^
Which file system uses | to separate directory names?
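
Inside a character class the | is not alternation, it is just another
literal character, so [\\|\/] matches a backslash, a pipe, or a slash.
A sketch of a class without the pipe, which also drops the requirement
that the name look like \w+\.\w+ (the sub name last_component is made
up for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: everything after the last slash or backslash.
# Returns undef for a path that ends in a separator.
sub last_component {
    my ($path) = @_;
    my ($name) = $path =~ m{([^\\/]+)\z};
    return $name;
}

print last_component('c:\folder1\folder2\folder3\blah.jpg'), "\n";  # blah.jpg
print last_component('/home/jodyman/pics/test.jpg'), "\n";          # test.jpg

# The class from the quoted code happily treats a pipe as a separator:
my ($oops) = 'pics|test.jpg' =~ /[\\|\/](\w+\.\w+)$/;
print "$oops\n";                                                    # test.jpg
```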



John
-- 
use Perl;
program
fulfillment


------------------------------

Date: Thu, 02 Jan 2003 21:18:08 -0500
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: Perl for spliting vcf files (palm->iPod)
Message-Id: <3E14F2E0.BF95B007@earthlink.net>

Michael Robbins wrote:
> 
> Palm software outputs a vcf file that contains multiple records, with
> spaces in between but my iPod won't accept that.
> 
> I must remove the spaces and break up the file into pieces.
> 
> I am not very good at Perl and I was hoping you guys could give me
> some suggestions.

My first suggestion is for you to read RFC 2426, to know the precise
format for vcards.

Then, (after you see how much there is to do to *properly* process
vcards), look on CPAN to see if anyone else has done it.  The module
XML::SAXDriver::vCard looks fairly promising.

If you don't want to use that, then consider using Parse::RecDescent and
writing a grammar for vcards.

-- 
$..='(?:(?{local$^C=$^C|'.(1<<$_).'})|)'for+a..4;
$..='(?{print+substr"\n !,$^C,1 if $^C<26})(?!)';
$.=~s'!'haktrsreltanPJ,r  coeueh"';BEGIN{${"\cH"}
|=(1<<21)}""=~$.;qw(Just another Perl hacker,\n);


------------------------------

Date: Fri, 03 Jan 2003 02:23:41 GMT
From: Martien Verbruggen <mgjv@tradingpost.com.au>
Subject: Re: Prototype declaration with built-in function
Message-Id: <slrnb19t60.307.mgjv@verbruggen.comdyn.com.au>

On 2 Jan 2003 17:25:00 -0800,
	Jeff Mott <jeffmott@twcny.rr.com> wrote:
> I usually define my routines with their respective prototypes at the
> beginning of a program so they can be called properly from anywhere
> else within the program. But I get a prototype mismatch error when
> doing this with an internal function name.
> 
> # example
> use subs 'open';
> sub open();

This works under 5.8.0 (and later), but is broken under 5.6.1 (and
before). I know quite a bit of work was done to fix some other
problems with prototypes for 5.8.0, but this one doesn't ring a bell
with me.

Martien
-- 
                        | 
Martien Verbruggen      | Make it idiot proof and someone will make a
Trading Post Australia  | better idiot.
                        | 


------------------------------

Date: Fri, 03 Jan 2003 02:34:15 GMT
From: Martien Verbruggen <mgjv@tradingpost.com.au>
Subject: Re: Prototype declaration with built-in function
Message-Id: <slrnb19tps.307.mgjv@verbruggen.comdyn.com.au>

On Thu, 02 Jan 2003 20:43:51 -0500,
	Benjamin Goldberg <goldbb2@earthlink.net> wrote:
> Jeff Mott wrote:
>> 
>> I usually define my routines with their respective prototypes at the
>> beginning of a program so they can be called properly from anywhere
>> else within the program. But I get a prototype mismatch error when
>> doing this with an internal function name.
>> 
>> # example
>> use subs 'open';
>> sub open();
> 
> The proper prototype for open is *;$@.  This means that you need to have
> a subroutine stub like the following:
> 
>    sub open(*;$@);

I think the OP is overriding the builtin open with their own, and
their own happens to not take any arguments. I'm sure that in the real
program the definition of open() would actually appear somewhere else
in the program.

> Also, if you're providing a stub with a prototype, then you should leave
> off the 'use subs ...' for that particular subname.

unless you want to override a builtin the way perlsub suggests you
do it :)
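
Roughly what that looks like, as a sketch only: it assumes 5.8.0 or
later, handles just the three-argument call form, and leaves the
prototype off the override to keep things simple. The $calls counter
is purely illustrative.

```perl
use strict;
use warnings;
use subs 'open';     # predeclare, so a plain open(...) here means main::open

my $calls = 0;       # illustrative bookkeeping only

# The override: count the call, then hand over to the real builtin.
# $_[0] is an alias for the caller's variable, so the filehandle that
# CORE::open autovivifies ends up in the caller's scalar.
sub open {
    $calls++;
    return CORE::open($_[0], $_[1], $_[2]);
}

my $text = "hello\n";
open(my $fh, '<', \$text) or die "open failed: $!";   # in-memory file
print scalar <$fh>;                                   # prints "hello"
close $fh;

print prototype('CORE::open'), "\n";                  # prints *;$@
```

The override only applies to the package it is declared in; other
packages keep calling the builtin.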

Martien

PS. Note that this works fine in 5.8.0. I suspect the OP has 5.6.1 or
earlier.
-- 
                        | 
Martien Verbruggen      | 
Trading Post Australia  | The gene pool could use a little chlorine.
                        | 


------------------------------

Date: 2 Jan 2003 22:58:54 -0800
From: jeffmott@twcny.rr.com (Jeff Mott)
Subject: Re: Prototype declaration with built-in function
Message-Id: <f9c0ce19.0301022258.8c02771@posting.google.com>

Yes, I was using 5.6.1, updated to 5.8.0 and it does work correctly
now. In that case I'd also like to know how many other systems have
been updated (in particular Web servers).


------------------------------

Date: 2 Jan 2003 22:47:15 -0800
From: salmjuh@hotmail.com (juha)
Subject: Re: system command and $_ variable
Message-Id: <c9858ca5.0301022247.185fe3e8@posting.google.com>

Benjamin Goldberg <goldbb2@earthlink.net> wrote in message news:<3E136CA5.70B49B42@earthlink.net>...
> juha wrote:
> > 
> > Hi all
> > 
> > My script tries to start another program and pass some variables
> > to that program at the same time.
> > 
> > Here is what doesn't work:
> > 
> > system ('ffmpeg -i /tmp/$_ -b 1300 -s 352*288  /tmp/ready/$_');
> 
> Try:
> 
> system(qw(ffmpeg -i), "/tmp/$_", qw(-b 1300 -s 352*288),
>        "/tmp/ready/$_")
> 

This is close! However, there is a newline in there that breaks it.
When I echo this command, it gives me two separate lines:

[root@testsrv videohuolto]# ./convertfile

ffmpeg -i /home/TVtampere/LidlTre2.mpg
 -b 1300 -s 352*288 /home/intravideot/LidlTre2.mpg

What do I need to do to keep those lines as one command line?


Thanks so far
-JS


> Note that I use "" around the parts containing $_, so that string
> interpolation occurs.


------------------------------

Date: 03 Jan 2003 03:25:18 GMT
From: ctcgag@hotmail.com
Subject: Re: vectors & large amounts of data - time & space problems
Message-Id: <20030102222518.098$wG@newsreader.com>

mcarthur@dstc.edu.au (Robert McArthur) wrote:
> Could anyone please help:
> We are doing research using an algorithm from cog sci. It basically
> requires the creation of a set of vectors, where each vector is a
> scalar string as a name, and has, on average, about 200 dimensions
> (there may be many more than 200, and some with 1). Each dimension
> is a string and real value.
> We've implemented this fine in Perl using a hash for a vector, the
> keys of the hash being the strings and the values being the real
> numbers of the dimensions. The vector is an object with name and a
> couple of other meta-information things as well as the dimension's
> hash. A set of vectors is simply another object which is a hash
> storing the name (keys) and the vector (values).
> It all works fine...

so you have essentially:
  $pdata{$vector_name}{$dimension_name}=$weight;
  $mdata{$vector_name}{$meta_data_field1}=$meta_data1;
  ...
right?

>
> except we're now trying it with large amounts of data and running
> into problem with both space and time. We're using a vocab of about
> 100,000 strings, each of which is on average 6 characters long.
> Doing the maths, assuming we store each dimension name as a string
> rather than a pointer to a look-up table, it should be about 1GB
> of memory. We're running out of memory at around 7GB on the hefty
> machine we've chosen :-(

How do you arrive at the 1GB estimate?

My own estimate is based on the assumption that the top-level stuff and
the meta-data are insignificant compared to the deeper nesting,
so just look at:

100,000 hashes each with 200 key/value pairs
or 2e7 key/value pairs.  A 6 character hash key is ~65 bytes
each, (actually less, because once Perl's seen a certain key, each
further use of it only takes ~49 bytes.), a floating point value that
is never used in a string context is ~35 bytes, so that comes to about
2 gig.  However, a fp value that has been used in a string context
is ~95 bytes, which would add up to a total of  3.2 Gig, still comfortably
less than the 7 Gig-death you report.

So, I would conclude that your perl implementation is more wasteful than
mine, that the meta-data you are storing is voluminous, that your memory
is getting rather fragmented, that your estimates of the numbers
involved are too low, or that you are using a much higher-overhead
data structure than I anticipate.

>
> Looking around, I came across a quote saying that Perl stores a SV,
> or rather an integer IV, in at least 28 bytes. I was assuming 4 bytes
> in my calculation above, notwithstanding that we're using reals and not
> integers, and so I can see why we're running out of memory.
>
> Can anyone give me any ideas for a better way to do this in Perl?

I don't see why it isn't working right now, so any specific advice may be
attacking something that isn't the problem.

Monitor the data structure to make sure it isn't getting more things than
you think it is.

Comment out the portions of your code that create the dimensions, run the
program on a sub-set of the data, and see how much space just the vectors
and the meta-data take up.  On Linux, I just put

  print "Mem:\t",(`ps -p $$ -o rss`)[1];

just before exit, or at other strategic points, to get approximate usage
in kB.  Replacing rss with size might be more accurate if you suspect
swapping.
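
Wrapped up as a sub, that check might look like the sketch below. It
assumes a ps that understands -p and -o rss (as on Linux); the name
rss_kb is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: resident set size of this process in kB as
# reported by ps, or undef if ps is missing or prints something odd.
sub rss_kb {
    my @lines = `ps -p $$ -o rss 2>/dev/null`;
    return unless @lines >= 2;            # line 0 is the "RSS" header
    my ($kb) = $lines[1] =~ /(\d+)/;
    return $kb;
}

my $kb = rss_kb();
print "Mem:\t", (defined $kb ? "$kb kB" : "unknown"), "\n";
```

Calling it before and after building the big hash gives a rough
per-structure cost without any extra modules.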

Get a sub-set of data that doesn't kill your program, run the analysis and
see how much memory it takes, then write out the results in some format.

Write another program that simply reads those results into the same
data structure, and see how much memory this takes.  If it takes
substantially less space when read than when computed, that indicates
fragmentation or a memory leak in the computation process.

Does vector cat, dimension dog have the same weight as vector dog,
dimension cat?  It sounds like they do; if so, you could cut the data
size in half just by skipping any pairing where $vector_name gt
$dimension_name.
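
A sketch of how that could look, under the (unconfirmed) assumption
that the weights really are symmetric. All names here are made up:
each pair is stored once, under the name that sorts first, and both
lookup orders go through the same ordering step.

```perl
use strict;
use warnings;

my %weight;   # $weight{$first}{$second}, always with $first le $second

# Put the two names into a canonical (sorted) order.
sub ordered_pair {
    my ($x, $y) = @_;
    return $x le $y ? ($x, $y) : ($y, $x);
}

sub set_weight {
    my ($x, $y, $w) = @_;
    my ($first, $second) = ordered_pair($x, $y);
    $weight{$first}{$second} = $w;
    return;
}

sub get_weight {
    my ($x, $y) = @_;
    my ($first, $second) = ordered_pair($x, $y);
    # The exists test avoids autovivifying an empty inner hash.
    return exists $weight{$first} ? $weight{$first}{$second} : undef;
}

set_weight('dog', 'cat', 0.25);
print get_weight('cat', 'dog'), "\n";   # prints 0.25
```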


If you are printing a float that has never been used in a stringy way
before, use
print sprintf "%f", $hash{$key1}{$key2};
rather than
print $hash{$key1}{$key2};
as the second method will attach the string representation to the variable,
taking up space, while the first way throws away the string representation
when it is done with it.



> We're doing too many lookups, I believe, to have it on disk rather
> than in memory so that road's out.

Don't knock it until you try it.  As I see it, you can spend an enormous
amount of time tuning the snot out of your program so it fits into physical
memory, only to have it die when N increases from 100,000 to 110,000 and
you're left at square one; or you can switch to a scalable approach, see
how it performs, and then tune the snot out of that if necessary.

Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service              New Rate! $9.95/Month 50GB


------------------------------

Date: 03 Jan 2003 05:09:02 GMT
From: ctcgag@hotmail.com
Subject: Re: vectors & large amounts of data - time & space problems
Message-Id: <20030103000902.531$NH@newsreader.com>

ctcgag@hotmail.com wrote:
>
> 100,000 hashes each with 200 key/value pairs
> or 2e7 key/value pairs.  A 6 character hash key is ~65 bytes
> each, (actually less, because once Perl's seen a certain key, each
> further use of it only takes ~49 bytes.), a floating point value that
> is never used in a string context is ~35 bytes, so that comes to about
> 2 gig.  However, a fp value that has been used in a string context
> is ~95 bytes, which would add up to a total of  3.2 Gig, still
> comfortably less than the 7 Gig-death you report.

Correction, a 6 character hash key that has already been used (in
a different hash) uses ~12 bytes, not ~49 bytes, on my machine.

So, assuming only 1e5 distinct words are used as keys in these
2e7 hash slots, that drops the memory usage down to 1G.
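
Spelling out the arithmetic behind these figures, using only the
per-item byte counts quoted in this thread (they were measured on one
machine and will differ elsewhere; the sub name is made up):

```perl
use strict;
use warnings;

my $pairs = 100_000 * 200;   # 2e7 key/value pairs

# Rough total, in units of 1e9 bytes, for a given per-key and
# per-value cost in bytes.
sub estimate_gb {
    my ($key_bytes, $value_bytes) = @_;
    return $pairs * ($key_bytes + $value_bytes) / 1e9;
}

printf "fresh keys, numeric-only values:  %.2f\n", estimate_gb(65, 35);  # 2.00
printf "fresh keys, stringified values:   %.2f\n", estimate_gb(65, 95);  # 3.20
printf "shared keys, numeric-only values: %.2f\n", estimate_gb(12, 35);  # 0.94
```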

Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service              New Rate! $9.95/Month 50GB


------------------------------

Date: 2 Jan 2003 23:35:43 -0800
From: salmjuh@hotmail.com (juha)
Subject: What is wrong in my SYSTEM commad
Message-Id: <c9858ca5.0301022335.7a577d31@posting.google.com>

Hi All


I wrote earlier about the same problem; thanks to everyone who answered my
question.
However, I still have a little problem. The script doesn't work quite
right yet. It prints the system command on two separate lines instead of
one. Here is the command:

 .
 .

system(qw(echo ffmpeg -i), "/tmp/$_", qw(-b 1300 -s 352*288),
"/tmp/ready/$_");

 .
 .

and the result it gives me:

ffmpeg -i /tmp/testfile.mpg
 -b 1300 -s 352*288 /tmp/ready/testfile.mpg

So after " /tmp/testfile.mpg " a newline appears. How can I keep it all
on the same line?
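
A line break right after the first file name suggests that $_ still
carries the newline it was read with (from a file, a pipe, or
backticks; which one is a guess, since the surrounding code is not
shown). If so, chomp removes it. A sketch, with a made-up sub name:

```perl
use strict;
use warnings;

# Hypothetical helper: build the argument list for one input name.
# chomp strips a trailing newline if there is one and is harmless
# if there is not.
sub ffmpeg_args {
    my ($name) = @_;
    chomp $name;
    return ('ffmpeg', '-i', "/tmp/$name",
            qw(-b 1300 -s 352*288), "/tmp/ready/$name");
}

my @cmd = ffmpeg_args("testfile.mpg\n");
print "@cmd\n";
# prints: ffmpeg -i /tmp/testfile.mpg -b 1300 -s 352*288 /tmp/ready/testfile.mpg

# To actually run it:
# system(@cmd) == 0 or die "ffmpeg failed: $?";
```

The second file name carries the same newline; it just goes unnoticed
because it is the last thing on the line.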


I appreciate any help and hints.

-js


------------------------------

Date: Thu, 02 Jan 2003 21:22:48 -0500
From: "Eric Anderson" <eric.anderson@cordata.net>
Subject: XS memory management
Message-Id: <pan.2003.01.03.02.22.42.382917@cordata.net>

OK, I'm less than a beginner at Perl XS. To learn about it more I am
looking through some source code of a project that interests me that uses
Perl XS.

This project has a place where it receives a pointer to a character
(string) from a library. The library states that it is the caller's job to
free this memory. The source code I am looking at never frees this memory,
so my 'C' instincts send up memory leak alerts.

But this is Perl XS, not strictly C. So perhaps the code is not wrong and
Perl's garbage collector is taking care of it. I read somewhere that Perl
has its own version of malloc since it does its own memory management.
And since this code is dynamically linked to the Perl binary, and the
library being used is also dynamically linked, perhaps the malloc this
library is using is the one Perl has created. Perhaps the memory is
being garbage collected, and therefore the calling code does not need to
free the memory because Perl is doing this automatically.

But that is just my hunch. Perhaps not and the code I am looking at just
has memory leaks. I wanted to run this by the Perl XS experts before I
emailed the author of the code about memory leak problems.

Thanks for any info you can provide.

--
Eric Anderson


------------------------------

Date: Thu, 02 Jan 2003 21:28:56 -0500
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: XS memory management
Message-Id: <3E14F568.A3B6754A@earthlink.net>

Eric Anderson wrote:
> 
> OK, I'm less than a beginner at Perl XS. To learn about it more I am
> looking through some source code of a project that interests me that
> uses Perl XS.
> 
> This project has a place where it receives a pointer to a character
> (string) from a library. The library states that it is the caller's job
> to free this memory. The source code I am looking at never frees this
> memory, so my 'C' instincts send up memory leak alerts.

What is done with this string?  If it is put into a PV*, then it's
possible that perl will garbage collect it, depending on what you do
with that PV*.

> But this is Perl XS, not strictly C. So perhaps the code is not wrong
> and Perl's garbage collector is taking care of it. I read somewhere
> that Perl has its own version of malloc since it does its own memory
> management.

Perl does have its own version of malloc, but that is independent of
perl's garbage collection.  Perl's malloc is used on those systems where
the system malloc is either buggy or inefficient, but it is not
*always* used.  On systems with a fast, nonbuggy malloc, the system
malloc is used.

Perl's garbage collection, however, is always used... but it only
applies to SV* and its various subtypes.

> And since this code is dynamically linked to the Perl binary, and the
> library being used is also dynamically linked, perhaps the malloc this
> library is using is the one Perl has created.

You can build a dynamically linked library, and have it statically link
to other libraries -- I hope that wasn't done here, or else your library
might be using libc's or glibc's malloc at a time that perl is using
its own malloc.  But this is unlikely.  (And not related to whether
data that's malloced gets garbage collected).

> Perhaps the memory is being garbage collected, therefore the calling
> code does not need to free the memory because Perl is doing this
> automatically.

Perl has magic, but not that much magic.  If you're returning the string
by storing it in a PV*, then when that PV goes out of scope, it will get
freed.  But otherwise, your string is probably getting leaked.

> But that is just my hunch. Perhaps not and the code I am looking at
> just has memory leaks. I wanted to run this by the Perl XS experts
> before I emailed the author of the code about memory leak problems.

If the XS function is small, (presumably merely a wrapper around calling
the library function), then perhaps you could post it here, so we can
give you more help?

-- 
$..='(?:(?{local$^C=$^C|'.(1<<$_).'})|)'for+a..4;
$..='(?{print+substr"\n !,$^C,1 if $^C<26})(?!)';
$.=~s'!'haktrsreltanPJ,r  coeueh"';BEGIN{${"\cH"}
|=(1<<21)}""=~$.;qw(Just another Perl hacker,\n);


------------------------------

Date: Fri, 03 Jan 2003 03:17:41 GMT
From: Martien Verbruggen <mgjv@tradingpost.com.au>
Subject: Re: XS memory management
Message-Id: <slrnb1a0b9.307.mgjv@verbruggen.comdyn.com.au>

On Thu, 02 Jan 2003 21:22:48 -0500,
	Eric Anderson <eric.anderson@cordata.net> wrote:
> OK, I'm less than a beginner at Perl XS. To learn about it more I am
> looking through some source code of a project that interests me that uses
> Perl XS.

You might be interested in /Extending and Embedding Perl/ by Tim
Jenness and Simon Cozens.

> This project has a place where it receives a pointer to a character
> (string) from a library. The library states that it is the caller's job to
> free this memory. The source code I am looking at never frees this memory,
> so my 'C' instincts send up memory leak alerts.
> 
> But this is Perl XS, not strictly C. So perhaps the code is not wrong and
> Perl's garbage collector is taking care of it. I read somewhere that Perl

It depends on what is done with that pointer. There are several ways
in the perl API to create an SV from a char pointer in such a way that
Perl will free the memory once the SV is no longer needed, with its
normal garbage collection process. You can even do this with more
complex structures, provided you provide the free() function that goes
with it.

In XS some C types are automatically treated this way when returned
from an XS function, depending on the typemap entries.

There could also be a DESTROY function defined somewhere, to serve as
an XS destructor for the memory returned (see perlxs for an example of
getnetconfigent XS wrappers).

You don't mention the name of the project or the function concerned,
so it's a bit hard for anyone to guess what it does, and how it is
wrapped.

Martien
-- 
                        | 
Martien Verbruggen      | I took an IQ test and the results were
Trading Post Australia  | negative.
                        | 


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 4336
***************************************

