[28542] in Perl-Users-Digest


Perl-Users Digest, Issue: 9906 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Oct 30 14:05:51 2006

Date: Mon, 30 Oct 2006 11:05:05 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Mon, 30 Oct 2006     Volume: 10 Number: 9906

Today's topics:
    Re: Handling a 450,000x450,000 array with Perl <quetzalcotl@consultant.com>
    Re: Handling a 450,000x450,000 array with Perl francescomoi@usa.com
    Re: Handling a 450,000x450,000 array with Perl <bugbear@trim_papermule.co.uk_trim>
    Re: How do I do full access logging including HTTP head <john@castleamber.com>
        How to get a full trace of a prog's execution? <socyl@987jk.com.invalid>
        Interesting behaviour with lexical variable <bol@adv.magwien.gv.at>
    Re: Interesting behaviour with lexical variable <jl_post@hotmail.com>
    Re: Naive threading performance questions anno4000@radom.zrz.tu-berlin.de
    Re: Naive threading performance questions <worky.workerson@gmail.com>
    Re: Naive threading performance questions xhoster@gmail.com
        Perl equivalent to unix script <mikedawg@gmail.com>
    Re: Perl equivalent to unix script <mikedawg@gmail.com>
    Re: Perl equivalent to unix script <jl_post@hotmail.com>
    Re: Perl equivalent to unix script <glex_no-spam@qwest-spam-no.invalid>
    Re: Perl equivalent to unix script usenet@DavidFilmer.com
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 30 Oct 2006 06:24:38 -0800
From: "Ingo Menger" <quetzalcotl@consultant.com>
Subject: Re: Handling a 450,000x450,000 array with Perl
Message-Id: <1162218278.252766.285630@f16g2000cwb.googlegroups.com>


francescomoi@usa.com wrote:
> Hi.
>
> I'd like to create a program to handle data (natural numbers)
> within a 450,000 x 450,000 array. I'm considering Perl, but
> I don't know if it's powerful enough to manage it.

Perl is, but you'll find few computers that could store 202 billion
"natural numbers".
Remember, a PC has a RAM of around 2 billion bytes, give or take a few.
You could hold a maximum of 16 billion natural numbers as long as they
are 0 or 1. But, 16 billion is still not 202 billion.

In any case, this will be a system where native integers have 64 bits.
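
The arithmetic behind those figures can be checked with a short sketch. The 2-billion-byte RAM figure is the one quoted in the post; the variable names are only illustrative.

```perl
use strict;
use warnings;

my $side        = 450_000;          # cells per side of the grid
my $cells       = $side * $side;    # cells in a dense 450,000 x 450,000 array
my $ram_bytes   = 2_000_000_000;    # RAM size assumed in the post above
my $bits_in_ram = $ram_bytes * 8;   # how many one-bit "numbers" would fit

printf "cells needed     : %.4g\n", $cells;
printf "bits in RAM      : %.4g\n", $bits_in_ram;
printf "bytes at 1 bit   : %.4g\n", $cells / 8;
printf "RAM short by     : %.1fx\n", $cells / $bits_in_ram;
```

Even at one bit per cell the dense array would need roughly 25 billion bytes, which is why a dense representation is impractical on such a machine.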



------------------------------

Date: 30 Oct 2006 07:32:44 -0800
From: francescomoi@usa.com
Subject: Re: Handling a 450,000x450,000 array with Perl
Message-Id: <1162222364.737405.90510@e64g2000cwd.googlegroups.com>

Hi.

Thank you very much for your quick answers. I'm trying to handle data from a
450x450 meter surface.

I've got 100,000 elements distributed on this surface, and each
square millimeter can hold one or more elements. So the array to create
is a sparse one: most square millimeters hold zero elements, and
the rest hold 20 or fewer.

I'm trying to find the square meter with the most elements inside.

Thank you very much.

On Oct 30, 3:24 pm, "Ingo Menger" <quetzalc...@consultant.com> wrote:
> francesco...@usa.com wrote:
> > Hi.
>
> > I'd like to create a program to handle data (natural numbers)
> > within a 450,000 x 450,000 array. I'm considering Perl, but
> > I don't know if it's powerful enough to manage it.
>
> Perl is, but you'll find few computers that could store 202 billion
> "natural numbers".
> Remember, a PC has a RAM of around 2 billion bytes, give or take a few.
> You could hold a maximum of 16 billion natural numbers as long as they
> are 0 or 1. But, 16 billion is still not 202 billion.
>
> In any case, this will be a system where native integers have 64 bits.



------------------------------

Date: Mon, 30 Oct 2006 16:36:38 +0000
From: bugbear <bugbear@trim_papermule.co.uk_trim>
Subject: Re: Handling a 450,000x450,000 array with Perl
Message-Id: <45462a16$0$8729$ed2619ec@ptn-nntp-reader02.plus.net>

francescomoi@usa.com wrote:
> Hi.
> 
> Thank you very much for your quick answers. I'm trying to handle data from a
> 450x450 meter surface.
> 
> I've got 100,000 elements distributed on this surface, and each
> square millimeter can hold one or more elements. So the array to create
> is a sparse one: most square millimeters hold zero elements, and
> the rest hold 20 or fewer.
> 
> I'm trying to find the square meter with the most elements inside.

Ah. OK.

Your question should really be "what data structure should
I use to..."

It's not really a perl question at all, it's
really an algorithm and data structure question.

   BugBear
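
One possible data structure for the sparse case is a hash keyed by square-meter cell, so that only occupied cells take memory. This is a minimal sketch, assuming the elements arrive as millimeter coordinates and that "square meter" means a grid-aligned cell; the sample data and all names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample data: [x_mm, y_mm] positions on the 450 m x 450 m surface.
my @elements = (
    [   1_500,   2_250 ],
    [   1_999,   2_001 ],
    [   1_000,   2_999 ],
    [ 449_999, 449_999 ],
    [ 120_000,      10 ],
);

# Count elements per grid-aligned square meter; empty cells are never stored.
my %count_in;
for my $element (@elements) {
    my ($x_mm, $y_mm) = @$element;
    my $cell = int($x_mm / 1000) . ',' . int($y_mm / 1000);
    $count_in{$cell}++;
}

# Pick the busiest cell (ties are broken by cell name for a stable result).
my ($busiest) =
    sort { $count_in{$b} <=> $count_in{$a} or $a cmp $b } keys %count_in;

print "Busiest square meter: $busiest ($count_in{$busiest} elements)\n";
```

With 100,000 elements the hash holds at most 100,000 keys, regardless of how large the surface is.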


------------------------------

Date: 30 Oct 2006 18:42:07 GMT
From: John Bokma <john@castleamber.com>
Subject: Re: How do I do full access logging including HTTP headers?
Message-Id: <Xns986C81369F03Dcastleamber@130.133.1.4>

"Nu" <no@spam.com> wrote:

Don't top-post. If you don't know what top-posting means, look it up on
Wikipedia.

> I'm on a shared hosting and don't have root access and my host just
> said their logs are the only ones they give.

What header(s) do you want to log, and for which request(s)?

-- 
John                Experienced Perl programmer: http://castleamber.com/

          Perl help, tutorials, and examples: http://johnbokma.com/perl/


------------------------------

Date: Mon, 30 Oct 2006 18:07:47 +0000 (UTC)
From: kj <socyl@987jk.com.invalid>
Subject: How to get a full trace of a prog's execution?
Message-Id: <ei5f1j$edc$1@reader2.panix.com>




I'm trying to exterminate a bug that shows up very sporadically
(but with serious results).  The program in question is a CGI script
and I'm having a very hard time rooting out the bug.

It would be extremely useful to be able to toggle on something akin
to "Trace Mode" in the Perl debugger, sending the trace to some
user-specified file.  (Since the bug shows up sporadically and
unpredictably, it is not practical to try to find it by using the
Perl debugger.)  Is there some other way to get such Trace output?

TIA!

kj
-- 
NOTE: In my address everything before the first period is backwards;
and the last period, and everything after it, should be discarded.


------------------------------

Date: Mon, 30 Oct 2006 17:37:43 +0100
From: "Ferry Bolhar" <bol@adv.magwien.gv.at>
Subject: Interesting behaviour with lexical variable
Message-Id: <1162226268.131319@proxy.dienste.wien.at>

Hi to all,

while playing with code like this:

use warnings;
while (my $input = <STDIN>){
  my $num = abs $input;
  print add();

  sub add {
    $num + $num;
  }
}

I found out that once the loop gets executed for the second
time, the variable $num gets "split", taking the new value
from $input in the loop, but leaving the old, previous value
in add().

I was a little surprised because I couldn't find this behaviour
described anywhere in the docs.

If, however, the entire code is placed inside another function,
and this function is then called:

use warnings;
sub main {
  ...                # See above, starting with "while..."
}
main();

I get this message:

Variable "$num" will not stay shared at x.pl line 11.

explaining what actually will happen here (although I can't
understand _why_ it does happen - it's the same lexical
variable within the same scope - so why will/can it no
longer stay shared?).

So just for curiosity this question: the same behaviour
occurs twice, but gets reported only once. Is this expected
or perhaps a (small) bug?

BTW: This is perl 5.8.8 (as well as 5.6.1).

Greetings, Ferry

-- 
Ing Ferry Bolhar
Magistrat der Stadt Wien - MA 14
A-1010 Wien
E-Mail: bol@adv.magwien.gv.at






------------------------------

Date: 30 Oct 2006 09:47:41 -0800
From: "jl_post@hotmail.com" <jl_post@hotmail.com>
Subject: Re: Interesting behaviour with lexical variable
Message-Id: <1162230461.495338.286540@e3g2000cwe.googlegroups.com>

Ferry Bolhar wrote:
>
> Hi to all,
>
> while playing with code like this:
>
> use warnings;
> while (my $input = <STDIN>){
>   my $num = abs $input;
>   print add();
>
>   sub add {
>     $num + $num;
>   }
> }

   Unless you really know what you're doing, you should never define a
function inside a loop or another function.  In fact, many Perl
programmers (me included) recommend that all your functions be defined
at the top of your script, before you start your main code.  Do that,
and your problem will go away.

> I found out that once the loop gets executed for the second
> time, the variable $num gets "split", taking the new value
> from $input in the loop, but leaving the old, previous value
> in add().

   I believe that's because the add() function is defined only once
(a named sub is compiled a single time, not on every pass through the
loop).  It binds to the first instance of $num, which appears to go out
of scope at the end of the loop.  However, a reference is still retained
(by the add() function), so $num (declared the first time through
the loop) never completely goes out of scope.

   So the second (and subsequent) times through the loop, a new $num is
created, which is a DIFFERENT instance of the one that the add()
function uses (remember:  the add() function uses the very first
instance of $num).  It never stops using that first instance of $num,
which is why the add() function will always return the same thing.

   You may mistakenly think that the add() function will always use the
latest value of $num, but it won't:  the add() function only needs to
be defined once, and after that it never gets defined again.  So it
never stops using the first instance of $num.

   If this sounds confusing, it's because it kind of is.  But change
your coding style so that all your functions are defined before your
main code (and not in any loops or functions either), and you should
never encounter this problem again.

   I hope this helps, Ferry.

   -- Jean-Luc
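
The behaviour described above can be reproduced without STDIN. This is a minimal sketch: a fixed list stands in for the input lines, and add_value() is a hypothetical alternative that takes the number as an argument instead of reaching into the loop's scope.

```perl
use strict;
use warnings;

my (@from_closure, @from_argument);

for my $input (3, 5, 7) {
    my $num = abs $input;
    push @from_closure,  add();             # relies on the $num that add() captured
    push @from_argument, add_value($num);   # passes the current $num explicitly

    # A named sub is compiled once, so it stays bound to the first $num only.
    sub add { return $num + $num }
}

# Defined outside the loop; it sees only what it is handed.
sub add_value { my ($n) = @_; return $n + $n }

print "closure : @from_closure\n";    # 6 6 6
print "argument: @from_argument\n";   # 6 10 14
```

Passing the value as an argument sidesteps the question of which instance of $num the sub is bound to.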



------------------------------

Date: 30 Oct 2006 14:53:51 GMT
From: anno4000@radom.zrz.tu-berlin.de
Subject: Re: Naive threading performance questions
Message-Id: <4qmhvvFnoc1cU1@news.dfncis.de>

 <xhoster@gmail.com> wrote in comp.lang.perl.misc:
> "Worky Workerson" <worky.workerson@gmail.com> wrote:
> >
> > And here is my code.  I've factored out the line processing so that it
> > would show up in the dprofpp.  Again, sorry if there are any hand-copy
> > errors ....

[...]

> I made a few changes that sped it up, but not by much:

[...]

> ## This takes a surprising amount of time, but I don't know what you can
> ##   do about it.
>   # fix up column with random bad bytes
>   $ip_details{keya} = s/[^\x20-\x7e]//g;

I suppose "=" should be "=~" here.

    $ip_details{keya} =~ tr/\x20-\x7e//cd;

may be faster.

Anno
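
Whether tr/// is actually faster is easy to measure. This is a sketch using the core Benchmark module, with a made-up sample string; it first checks that both forms strip the same bytes and then times them. No timings are claimed here, since they depend on the perl build and the data.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample value containing a few bytes outside \x20-\x7e.
my $dirty = "ab\x01c\xffd e~\x7f";

# Both forms delete every character that is NOT in the printable range.
(my $via_s  = $dirty) =~ s/[^\x20-\x7e]//g;
(my $via_tr = $dirty) =~ tr/\x20-\x7e//cd;   # c = complement, d = delete

print "same result: '$via_s'\n" if $via_s eq $via_tr;

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    's///'  => sub { (my $copy = $dirty) =~ s/[^\x20-\x7e]//g },
    'tr///' => sub { (my $copy = $dirty) =~ tr/\x20-\x7e//cd },
});
```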


------------------------------

Date: 30 Oct 2006 08:48:44 -0800
From: "Worky Workerson" <worky.workerson@gmail.com>
Subject: Re: Naive threading performance questions
Message-Id: <1162226924.767862.44990@i42g2000cwa.googlegroups.com>


> > And here is my code.  I've factored out the line processing so that it
> > would show up in the dprofpp.  Again, sorry if there are any hand-copy
> > errors ....
> I had assumed you were using Text::CSV_XS to parse lines rather than
> printing them.  You might want to try printing them in Perl, you never
> know, it might be faster.

That's another data set, where the input and output data is in CSV.  I
actually have several data sets that I convert into a single,
normalized, CSV which is then loaded into the DB with Text::CSV_XS.
Essentially, for most of the other data sets, I have set up translation
maps that convert the various input fields into the correct output
fields.  Upon load, each of the fields either gets inserted into the DB
directly or translated into an integer key by the perl before being
inserted.

For the other transforms, I have modularized and factored out much of
the translation code, but I think that is part of the slowdown
 ... too many function calls.  It's a bit too unwieldy to post here
(especially when I hand-copy it), but this thread has given me a bunch
of ideas, especially on the handling/conversions of arrays/hashes.

> > #!/usr/bin/perl
>
> > use strict; use warnings;
> > use IO::File; use Text::CSV_XS;
>
> > my @valid_columns = qw/ keya keyb keyc keyd keye keyf keyg keybig
> > keyanother /;
> > my %valid_columns = map {$_ => 1} @valid_columns;
>
> > my $output_csv = Text::CSV_XS->new({eol=>"\n", 'binary' => 1});
> > $output_csv->print(*STDOUT, process_line($_)) while (<>);
>
> > sub process_line {
> >   my ($line) = @_;
> >   my ($ip_range, $rest) = split /\s+/, $line, 2;
> >   chomp($rest);
>
> >   my %ip_details = (ip_range => $ip_range);
>
> >   # split on ':', then split each element on '=' and stick in hash
> >   map { my ($k, $v) = split /=/; $ip_details{$k} = $v } split(/:/,
> > $rest);
>
> >   # fix up column with random bad bytes
> >   $ip_details{keya} = s/[^\x20-\x7e]//g;
>
> >   my @cols = map { $ip_details{$_} } @valid_columns;
> >   return \@cols;
> > }
>
> I made a few changes that sped it up, but not by much:
>
>   my ($ip_range, $rest) = split /\s+/, $line, 2;
>   chomp($rest);
>
> ## This will give the same answer as the your map way for "well-formed"
> ## data.  For malformed data, it will give a different, but probably
> ##  equally meaningless, answer
>   my %ip_details = (split /[=:]/, $rest);

Thanks ... very simple and elegant.  I sometimes tend to complicate
things needlessly.
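
For well-formed input the two approaches can be compared side by side. This is a small sketch with a made-up $rest value; the hash names are illustrative only.

```perl
use strict;
use warnings;

my $rest = 'keya=alpha:keyb=beta:keyc=gamma';   # hypothetical well-formed input

# Original approach: split on ':', then split each pair on '='.
my %via_pairs;
for my $pair (split /:/, $rest) {
    my ($key, $value) = split /=/, $pair;
    $via_pairs{$key} = $value;
}

# Suggested approach: one split on either delimiter yields
# key, value, key, value, ... which assigns straight into a hash.
my %via_single_split = split /[=:]/, $rest;

for my $key (sort keys %via_pairs) {
    print "$key: $via_pairs{$key} / $via_single_split{$key}\n";
}
```

As the quoted comment warns, the two only agree when every field is a clean key=value pair.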

> # I don't see what the point of this is, as you never use the ip_range key.
>   $ip_details{ip_range} = $ip_range;

Sorry, guess I missed putting that at the beginning of the
@valid_columns

> ## This takes a surprising amount of time, but I don't know what you can
> ##   do about it.
>   # fix up column with random bad bytes
>   $ip_details{keya} = s/[^\x20-\x7e]//g;

Is this from experience or profiling?  Is there an easy way to profile
single lines like this, without factoring them out into a subroutine?

> ## Use a hash slice rather than map:
>
>   return [@ip_details{@valid_columns}];

OK, this is a big question of mine, and what (I think) is the major
slowdown in some of my other code.  My data flow usually looks like
this:

CSV -> arrayref -> hashref -> new hashref with transformed values and
names -> arrayref -> CSV

I did it this way so that I could factor out a lot of the common code
associated with transforming and outputting a new input format.  In
order to do this, I usually pass around a line of input in various
forms.  For example, one sub will read the CSV into an arrayref, pass
it to another that will convert it to a hashref, which will then pass
it to another to transform the hashref's values, etc.  I realize that
this is not necessarily the fastest way of doing these things, but it
helps a lot when I have 10 different input types being translated into
the same output type.

Because I have a few subs being called *many* times, I have a couple
local optimization questions:

-What is the most efficient way of calling a module subroutine and/or
object's method?  I'm assuming that, like C, its cheaper to pass a
reference to a hash/array than to pass the actual hash/array, right?
Does this also hold true for the return value from a sub?

-What is the most efficient way of converting back and forth between a
hash and an array, when the key->index mapping is known?  Does the
answer change at all when dealing with references?

-As above, when returning a value, does it make a difference if you
create a new local variable to return, or just return the computation
directly?  I.e. (a very simple example):

my $a = {a=>'1', b=>'2'}; return $a;
# vs
return {a=>'1', b=>'2'};

I'd like to think that perl's compiler might be able to figure out that
these are equivalent, but perhaps I am wrong.

-Sometimes I assign one of the hash/array elements to a local value so
that I can transform it, and eventually assign it back to the hash.  Is
this a "win" vs just transforming the hash value directly?  I.e.:

sub transform {
  my $hashref = (@_);
  my $name = $hashref->{name_key};

  $name = s/A/a/g;
  # ... more $name transforms ...

  $hashref->{name_key} = $name;
  return $hashref;
}

# ... vs ...

sub transform {
  my $hashref = (@_);

  $hashref->{name_key} = s/A/a/g;
  # ... more $hashref->{name_key} transforms ...

  return $hashref;
}

I'm sure that the answer is "it depends", but my next question would be
"On what?".  My naive thoughts would be that it would depend on the
number of hash lookups that are needed, which relies on the (C)
assumption that perl would not be able to cache the hashref lookup into
a "register".  Are there any other costs to the hash lookup/update
implementation?

-When transforming values, is it more efficient to use the same
variable to hold the new value or to create a new variable?  I'm
thinking that this is one of those space vs. time questions, but since
I have a lot of memory, I'd like to optimize for time.  I.e.:

sub transform {
  my ($manufacturer) = (@_);
  %translation_of = ( 'Mercedes' => 'Luxury', 'BMW' => 'Luxury',
'Honda' => 'Normal');

  $manufacturer = $translation_of{$manufacturer};
  # do more stuff
}

# ... vs ...

sub transform {
  my ($manufacturer) = (@_);
  %translation_of = ( 'Mercedes' => 'Luxury', 'BMW' => 'Luxury',
'Honda' => 'Normal');

  my $transformed_manufacturer = $translation_of{$manufacturer};
  # do more stuff
}


> I get about 44,000 lines per second.  Anyway, I don't see any obvious
> inefficiencies.  Maybe parallelization is the better route after all.

Thanks, I'll see what I can do with that, and the previous examples.



------------------------------

Date: 30 Oct 2006 18:09:31 GMT
From: xhoster@gmail.com
Subject: Re: Naive threading performance questions
Message-Id: <20061030130956.921$GG@newsreader.com>

"Worky Workerson" <worky.workerson@gmail.com> wrote:
>
> > ## This takes a surprising amount of time, but I don't know what you can
> > ##   do about it.
> >   # fix up column with random bad bytes
> >   $ip_details{keya} = s/[^\x20-\x7e]//g;
>
> Is this from experience or profiling?  Is there an easy way to profile
> single lines like this, without factoring them out into a subroutine?

I created a simple test case and timed its run (on linux, "time sample.pl").

Then I commented out that line, and re-ran it.  Since the sample input file
I used had no weird characters, the presence or absence of this line should
have no effect on downstream processing.  Doing it this way is more work
than DProf, but gives you better answers.


>
> > ## Use a hash slice rather than map:
> >
> >   return [@ip_details{@valid_columns}];
>
> OK, this is a big question of mine, and what (I think) is the major
> slowdown in some of my other code.  My data flow usually looks like
> this:
>
> CSV -> arrayref -> hashref -> new hashref with transformed values and
> names -> arrayref -> CSV
>
> I did it this way so that I could factor out a lot of the common code
> associated with transforming and outputting a new input format.  In
> order to do this, I usually pass around a line of input in various
> forms.  For example, one sub will read the CSV into an arrayref, pass
> it to another that will convert it to a hashref, which will then pass
> it to another to transform the hashref's values, etc.  I realize that
> this is not necessarily the fastest way of doing these things, but it
> helps a lot when I have 10 different input types being translated into
> the same output type.

Yes, extreme optimization and flexibility/maintainability are often mortal
enemies.  I thought there might be one obvious dramatic speed up, but since
there isn't, I'd probably continue what you were doing, focusing on
parallelization rather than extreme optimization.

>
> Because I have a few subs being called *many* times, I have a couple
> local optimization questions:
>
> -What is the most efficient way of calling a module subroutine and/or
> object's method?  I'm assuming that, like C, its cheaper to pass a
> reference to a hash/array than to pass the actual hash/array, right?
> Does this also hold true for the return value from a sub?

I believe so, yes.

time perl -le 'sub foo {my @x=1..500; return \@x}; my $y; \
          $y+= scalar (()=@{foo()}) foreach 1..1e6; print $y'
500000000
67.792u 0.002s 1:07.80 99.9%    0+0k 0+0io 0pf+0w

time perl -le 'sub foo {my @x=1..500; return @x}; my $y; \
          $y+= scalar (()=foo()) foreach 1..1e6; print $y'
500000000
97.425u 0.006s 1:37.45 99.9%    0+0k 0+0io 0pf+0w

(Notice I need the scalar (()=...) construct.  This is to force the sub
and/or dereference to be called in a list context.  That is one danger with
doing these types of tests--it is easy to screw them up and test something
different from what you wanted to.)
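
The effect of the scalar(()=...) construct can be seen with wantarray. This is a sketch in which foo() records the context it was called in; the @contexts array is only there for illustration.

```perl
use strict;
use warnings;

my @contexts;

sub foo {
    # wantarray is true in list context, false (but defined) in scalar context.
    push @contexts, wantarray ? 'list' : 'scalar';
    my @x = 1 .. 500;
    return @x;
}

my $plain  = foo();                  # scalar context: returns the element count
my $forced = scalar( () = foo() );   # list context first, then counts the list

print "contexts: @contexts\n";       # scalar list
print "counts  : $plain, $forced\n"; # 500, 500
```

Both calls yield the same number here, but only the second one makes the sub build and return the whole list, which is the work the timing test is meant to measure.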

> -What is the most efficient way of converting back and forth between a
> hash and an array, when the key->index mapping is known?  Does the
> answer change at all when dealing with references?

Probably a hash slice.
my %hash;
@hash{@key_list}=@value_list;

or going the other way:

my @value_list = @hash{@key_list};
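
A round trip with the slices above, including the hash-reference form, might look like this. The key names and values are made up for the example.

```perl
use strict;
use warnings;

my @key_list   = qw(ip_range keya keyb);        # known key -> index mapping
my @value_list = ('10.0.0.0/8', 'alpha', 42);   # hypothetical row

# Array to hash: assign to a hash slice.
my %hash;
@hash{@key_list} = @value_list;

# Hash to array: read the same slice back, in key_list order.
my @round_trip = @hash{@key_list};

# The same slice through a hash reference.
my $hashref  = \%hash;
my @from_ref = @{$hashref}{@key_list};

print "round trip: @round_trip\n";
print "via ref   : @from_ref\n";
```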

>
> -As above, when returning a value, does it make a difference if you
> create a new local variable to return, or just return the computation
> directly?  I.e. (a very simple example):
>
> my $a = {a=>'1', b=>'2'}; return $a;
> # vs
> return {a=>'1', b=>'2'};

It will make a difference, but it will be slight.


time perl -le 'sub foo {return [1..500]}; my $y; \
        $y+= scalar (()=@{foo()}) foreach 1..1e6; print $y'
500000000
60.099u 0.002s 1:01.86 97.1%    0+0k 0+0io 0pf+0w

(Compare this to the 67.792 seconds above)

>
> I'd like to think that perl's compiler might be able to figure out that
> these are equivalent, but perhaps I am wrong.

At present, the Perl compiler does almost no optimization.  The upside
of that is that it makes conducting hands-on tests easy, as you know
your experimental constructs aren't going to be optimized away.

Your other questions can be answered by similar tests.
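
As one example of such a test, the "local copy vs. direct hash element" question could be timed with the core Benchmark module. This is only a sketch: via_copy() and in_place() are hypothetical stand-ins for the two transform() variants, written with =~, and the sample string is made up. No result is claimed; the table depends on the perl build and the data.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Variant 1: copy the element to a lexical, transform it, store it back.
sub via_copy {
    my ($hashref) = @_;
    my $name = $hashref->{name_key};
    $name =~ s/A/a/g;
    $name =~ s/B/b/g;
    $hashref->{name_key} = $name;
    return $hashref;
}

# Variant 2: transform the hash element directly, one lookup per statement.
sub in_place {
    my ($hashref) = @_;
    $hashref->{name_key} =~ s/A/a/g;
    $hashref->{name_key} =~ s/B/b/g;
    return $hashref;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    via_copy => sub { via_copy({ name_key => 'ABBA CABANA' }) },
    in_place => sub { in_place({ name_key => 'ABBA CABANA' }) },
});
```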

(BTW, in your other questions, you've also used = rather than =~ in several
places, something I didn't notice in your original until Anno pointed it
out.)

Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service                        $9.95/Month 30GB


------------------------------

Date: 30 Oct 2006 09:52:07 -0800
From: "Mike" <mikedawg@gmail.com>
Subject: Perl equivalent to unix script
Message-Id: <1162230727.836071.158630@k70g2000cwa.googlegroups.com>

Ok. . . Well, I'm sure here comes another dumb question.  I'm fairly
handy with unix and unix scripting, however, I'm terrible at perl.

What I'm looking to do with a perl script is the equivalent of the
following unix (bash) script:

cat tempfile1 | sort > newfile2; rm tempfile1

I'm not completely comfortable with file handling in perl, and I think
it should be easy to do, but I've been unsuccessful in trying to do it.

I'm essentially trying to sort the lines in a file (alphabetically),
and then output them to another file.

Thanks

Mike



------------------------------

Date: 30 Oct 2006 09:59:10 -0800
From: "Mike" <mikedawg@gmail.com>
Subject: Re: Perl equivalent to unix script
Message-Id: <1162231150.713737.142340@b28g2000cwb.googlegroups.com>


Mike wrote:
> Ok. . . Well, I'm sure here comes another dumb question.  I'm fairly
> handy with unix and unix scripting, however, I'm terrible at perl.
>
> What I'm looking to do with a perl script is the equivalent of the
> following unix (bash) script:
>
> cat tempfile1 | sort > newfile2; rm tempfile1
>
> I'm not completely comfortable with file handling in perl, and I think
> it should be easy to do, but I've been unsuccessful in trying to do it.
>
> I'm essentially trying to sort the lines in a file (alphabetically),
> and then output them to another file.
>
> Thanks
>
> Mike

I guess I should have said that I don't want to handle this with a
system call either, I'd prefer to figure out, and learn how to do it
with purely perl.

Thanks

Mike



------------------------------

Date: 30 Oct 2006 10:09:15 -0800
From: "jl_post@hotmail.com" <jl_post@hotmail.com>
Subject: Re: Perl equivalent to unix script
Message-Id: <1162231755.698250.238780@m73g2000cwd.googlegroups.com>

Mike wrote:
> What I'm looking to do with a perl script is the equivalent of the
> following unix (bash) script:
>
> cat tempfile1 | sort > newfile2; rm tempfile1

   Just use the command:

      perl -e 'print sort <>' tempfile1 > newfile2;  rm tempfile1

> I'm fairly handy with unix and unix
> scripting, however, I'm terrible at perl.

   Ah... yeah... I was the same way a few years ago:  I was (fairly)
good at Unix scripting, but didn't know any Perl.  All that changed
when I read O'Reilly's "Learning Perl" book (by Randal L. Schwartz and
Tom Phoenix).  The chapters are easy to understand and the exercises
are laid out very well.  The book will teach you things all decent Perl
programmers should know, as well as things that will save you literally
hours of debugging time later on.  (It will also explain '<>' (the
diamond operator) in sufficient detail so that you can use it to
program quickly and more efficiently.)

   I hope this helps, Mike.

   -- Jean-Luc



------------------------------

Date: Mon, 30 Oct 2006 12:11:50 -0600
From: "J. Gleixner" <glex_no-spam@qwest-spam-no.invalid>
Subject: Re: Perl equivalent to unix script
Message-Id: <45464046$0$10310$815e3792@news.qwest.net>

Mike wrote:
> Mike wrote:
>> Ok. . . Well, I'm sure here comes another dumb question.  I'm fairly
>> handy with unix and unix scripting, however, I'm terrible at perl.
>>
>> What I'm looking to do with a perl script is the equivalent of the
>> following unix (bash) script:
>>
>> cat tempfile1 | sort > newfile2; rm tempfile1
>>
>> I'm not completely comfortable with file handling in perl, and I think
>> it should be easy to do, but I've been unsuccessful in trying to do it.

Post what you've tried.

>>
>> I'm essentially trying to sort the lines in a file (alphabetically),
>> and then output them to another file.

> 
> I guess I should have said that I don't want to handle this with a
> system call either, I'd prefer to figure out, and learn how to do it
> with purely perl.

If you want to figure it out, give the documentation a try.

perldoc -f open
perldoc -f sort
perldoc -f unlink
perldoc opentut

Helpful Web sites:

http://perldoc.perl.org/
http://bookmarks.cpan.org/search.cgi?cat=Training%2FTutorials


------------------------------

Date: 30 Oct 2006 11:00:13 -0800
From: usenet@DavidFilmer.com
Subject: Re: Perl equivalent to unix script
Message-Id: <1162234813.014276.223590@e64g2000cwd.googlegroups.com>

Mike wrote:

> cat tempfile1 | sort > newfile2;

That's rather convoluted even for UNIX scripting.  Why not:

   sort tempfile1 > newfile2;

You cannot do this as a pure-Perl approach (ie, no shell redirects,
etc) any more simply.  You could do something like this:

#!/usr/bin/perl
   use strict; use warnings;

   open (my $in,  '<', '/tmp/tempfile1') or die "oops - $!\n";
   open (my $out, '>', '/tmp/newfile2')  or die "oops - $!\n";

   print $out sort(<$in>);

   close $in;
   close $out;

   unlink '/tmp/tempfile1' or die "oops - $!\n";   # unlink takes a file name, not a handle

__END__

--
The best way to get a good answer is to ask a good question.
David Filmer (http://DavidFilmer.com)



------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 9906
***************************************
