[10856] in Perl-Users-Digest


Perl-Users Digest, Issue: 4457 Volume: 8

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Dec 18 20:07:31 1998

Date: Fri, 18 Dec 98 17:00:18 -0800
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Fri, 18 Dec 1998     Volume: 8 Number: 4457

Today's topics:
    Re: $&, $', and $` and parens.... <uri@sysarch.com>
    Re: $&, $', and $` and parens.... <ebohlman@netcom.com>
    Re: An example of good file locking please <richgrise@entheosengineering.com>
    Re: An example of good file locking please (Sam Holden)
        Calling Perl from Perl? <arvindk@pa.dec.com>
        Error when trying to use flock() example from perlfaq5 <richgrise@entheosengineering.com>
    Re: First German Perl Workshop 1.0 birgitt@my-dejanews.com
    Re: hashes <aqumsieh@matrox.com>
    Re: I know this sounds stupid... <gellyfish@btinternet.com>
    Re: multi-dimensional arrays <aqumsieh@matrox.com>
    Re: Nested sorting (Bill Moseley)
    Re: Nested sorting <ebohlman@netcom.com>
    Re: Nested sorting (Larry Rosler)
    Re: Origin of 'local'? (M.J.T. Guy)
        RSA and PERL (Kevin Scott)
    Re: Searching through a 10MB file <ebohlman@netcom.com>
    Re: Searching through a 10MB file <uri@sysarch.com>
    Re: STANDARD PERL for WIN 95/NT EXECUTABLE <newspost@morlock.net>
    Re: What's the Right Way to detect an Array Ref? (Bart Lateur)
    Re: What's the Right Way to detect an Array Ref? <ebohlman@netcom.com>
    Re: Why references get created in an r-value context? <aqumsieh@matrox.com>
        Writing many files efficiently (K. Krueger)
        Special: Digest Administrivia (Last modified: 12 Dec 98) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 18 Dec 1998 19:08:44 -0500
From: Uri Guttman <uri@sysarch.com>
Subject: Re: $&, $', and $` and parens....
Message-Id: <x7ogp1m1rn.fsf@sysarch.com>

>>>>> "LR" == Larry Rosler <lr@hpl.hp.com> writes:

  LR> Before I change any of my code, I'd like to hear more, from those who 
  LR> have repeatedly imprecated against the special variables.  It is 
  LR> possible that improvements in the regex engine have obsoleted those 
  LR> warnings.
 
larry, here are some thoughts that may clear your mind.

the special vars are around all the time so storing results to them may
be faster than creating $1, $2 on the fly. 

the caveat about $& and friends was that if you didn't use them ANYWHERE
in the program things would run faster. using them once triggered a
compiler flag to make them effectively used every time you do a
regex. that is the loss, not the actual use of them but the hidden
global use of them.

try your benchmarks again with one program not having any mention of $&
and friends. then try the same code again with a single (non
benchmarked) use of $&. then run again with all possible
benchmarks. that might illuminate some things.

also ilya may have sped things up so the global penalty of $& is
lessened so the local speedup (by not allocating $1, etc), is a win now.
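
For readers on later perls, here is a hedged sketch of getting the same three pieces without mentioning $& and friends at all. It assumes Perl 5.10 or newer, where the /p modifier and the ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH} variables exist; the helper name split_match and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return (pre, match, post) around the first digit
# in $str, or an empty list when there is no digit.
sub split_match {
    my ($str) = @_;
    # /p asks perl to preserve the pre/match/post pieces for this match only,
    # so no other regex in the program is affected.
    return unless $str =~ /\d/p;
    return (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

my ($pre, $match, $post) = split_match('xx1yy');
print "pre=$pre match=$match post=$post\n";
```

Plain capture groups, as in the Anchor and Unanch cases of the benchmark, remain the portable choice on older perls.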

uri

-- 
Uri Guttman  -----------------  SYStems ARCHitecture and Software Engineering
Perl Hacker for Hire  ----------------------  Perl, Internet, UNIX Consulting
uri@sysarch.com  ------------------------------------  http://www.sysarch.com
The Best Search Engine on the Net -------------  http://www.northernlight.com


------------------------------

Date: Sat, 19 Dec 1998 00:09:24 GMT
From: Eric Bohlman <ebohlman@netcom.com>
Subject: Re: $&, $', and $` and parens....
Message-Id: <ebohlmanF46r3p.J56@netcom.com>

Larry Rosler <lr@hpl.hp.com> wrote:
: So I just did benchmark it.  (Surprise!  ;-)  And you are right.  There 
: must be something wrong -- not with the perl interpreter, but with the 
: conventional wisdom.  Here are my code and results:

: Benchmark: timing 65536 iterations of Anchor, Special, Unanch...
:     Anchor:  6 wallclock secs ( 5.36 usr +  0.00 sys =  5.36 CPU)
:    Special:  3 wallclock secs ( 2.33 usr +  0.00 sys =  2.33 CPU)
:     Unanch:  6 wallclock secs ( 5.83 usr +  0.00 sys =  5.83 CPU)

Your example doesn't include any other regexp matches, so I decided to 
see how they'd be affected:

#!/usr/local/bin/perl -w
use Benchmark;

$a = 'x' x 50 . '1' . 'y' x 50;

timethese(1 << (shift || 0), {
  Anchor  => sub { my ($pre, $match, $post) = $a =~ /^(.*?)(\d)(.*)/; otm(); },
  Unanch  => sub { my ($pre, $match, $post) = $a =~ /(.*?)(\d)(.*)/; otm(); },
  Special => sub { $a =~ /\d/; my ($pre, $match, $post) = ($`, $&, $'); otm(); },
});

sub otm {
  for (my $i=0;$i<10;++$i) {
    my $hasy=($a=~/y/);
  }
}
__END__

Here's what I got:


This is perl, version 5.004

Copyright 1987-1997, Larry Wall

Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5.0 source kit.

Benchmark: timing 65536 iterations of Anchor, Special, Unanch...
    Anchor: 125 secs (54.25 usr  0.07 sys = 54.32 cpu)
   Special: 54 secs (43.25 usr  0.03 sys = 43.28 cpu)
    Unanch: 57 secs (53.68 usr  0.07 sys = 53.75 cpu)

The discrepancy in total time between Anchor and Unanch appears to be 
an artifact of system loading.

It appears that at least as of 5.004, using the special variables doesn't 
appreciably slow down other expressions, unless something's going on that 
I'm not aware of (like the match in my loop getting optimized away).



------------------------------

Date: Fri, 18 Dec 1998 16:51:18 +0000
From: Rich Grise <richgrise@entheosengineering.com>
Subject: Re: An example of good file locking please
Message-Id: <367A8806.3D99C791@entheosengineering.com>

Tom Christiansen wrote:
> [...]
> I just posted a FMTEYEWTK on open.  Please grab that.
> 
OK, I'll bite. What's FMTEYEWTK the acronym for, if you don't mind my
asking?

TIA :)
-- 
Rich Grise
richgrise@entheosengineering.com
(No need to futz with my e-mail: I have a "delete" button!)


------------------------------

Date: 18 Dec 1998 23:22:23 GMT
From: sholden@pgrad.cs.usyd.edu.au (Sam Holden)
Subject: Re: An example of good file locking please
Message-Id: <slrn77lote.beu.sholden@pgrad.cs.usyd.edu.au>

On Fri, 18 Dec 1998 16:51:18 +0000, Rich Grise
	<richgrise@entheosengineering.com> wrote:
>Tom Christiansen wrote:
>> [...]
>> I just posted a FMTEYEWTK on open.  Please grab that.
>> 
>OK, I'll bite. What's FMTEYEWTK the acronym for, if you don't mind my
>asking?

Far More than Everything You've Ever Wanted to Know.

-- 
Sam

You can blame it all on the internet. I do...
	--Larry Wall


------------------------------

Date: Fri, 18 Dec 1998 16:13:32 -0800
From: Arvind Krishnaswamy <arvindk@pa.dec.com>
Subject: Calling Perl from Perl?
Message-Id: <367AEFAC.BF61CE48@pa.dec.com>

Is the only way of calling one Perl script from another by using exec or
system calls?
If there are other ways, can the two Perl scripts communicate with each
other (basically to send status messages such as 'Completed' etc)?

Thanks





------------------------------

Date: Fri, 18 Dec 1998 17:11:35 +0000
From: Rich Grise <richgrise@entheosengineering.com>
Subject: Error when trying to use flock() example from perlfaq5
Message-Id: <367A8CC7.11F8BD1F@entheosengineering.com>

Hi. I'm hacking Matt's wwwboard script, and thought file locking might be a
good idea (somebody once told me that Matt's scripts are not good examples of
Perl coding - as I learn more, bit by bit {no pun intended, but it is kinda
cute!} I'm seeing that that's true).

So I lifted this snippet right out of perlfaq5:

use Fcntl qw(:flock);

    sysopen(FH, "numfile", O_RDWR|O_CREAT, 0644) or die "can't open numfile: $!";
    flock(FH, 2)                                 or die "can't flock numfile: $!";
    $num = <FH> || 0;
    seek(FH, 0, 0)                               or die "can't rewind numfile: $!";
    truncate(FH, 0)                              or die "can't truncate numfile: $!";
    (print FH $num+1, "\n")                      or die "can't write numfile: $!";
    # DO NOT UNLOCK THIS UNTIL YOU CLOSE
    close FH                                     or die "can't close numfile: $!";


supposedly to lock, for updating, the "message number" file (only the file
name is different, for my particular application) and I get the following
error message in my error_log:

can't truncate numfile: Permission denied at
/mnt/web/guide/entheosengineering/SB/cgibin/secureboard.cgi line 316,
<FH> chunk 1.
[Fri Dec 18 16:42:28 1998] access to
/mnt/web/guide/entheosengineering/SB/cgibin/secureboard.cgi failed for
localhost, reason: Premature end of script headers

Of course the premature end is because of the error on line 316.

But what have I missed, such that the truncate fails? 
Permission of data.txt (my numfile) is 666, and it was working before, when I
left Matt's method the way it was, which I realized could cause a collision
because of the amount of time between getting the number and writing the
incremented number.

But how come the code snippet right out of the faq fails?

perl, version 5.004_03
Slackware Linux 3.3.0, kernel 2.0.30,
Server version Apache/1.2.0.
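
One plausible cause, offered as an assumption rather than a diagnosis of this particular server: "use Fcntl qw(:flock)" imports only the LOCK_* constants, so O_RDWR and O_CREAT are never defined. Without "use strict" they are taken as bareword strings, the mode evaluates to 0 (read-only), and truncate() on a read-only handle fails. Below is a minimal sketch of the same counter with :DEFAULT added to the import list; the function name next_number and its path argument are illustrative, not part of the FAQ.

```perl
use strict;
use warnings;
# :DEFAULT supplies O_RDWR and O_CREAT; :flock supplies LOCK_EX.
use Fcntl qw(:DEFAULT :flock);

# Hypothetical helper: atomically increment the number stored in $path
# and return the new value.
sub next_number {
    my ($path) = @_;
    sysopen(my $fh, $path, O_RDWR | O_CREAT, 0644)
        or die "can't open $path: $!";
    flock($fh, LOCK_EX)   or die "can't flock $path: $!";
    my $num = <$fh>;                  # undef when the file is empty
    $num = 0 unless defined $num;
    chomp $num;
    $num ||= 0;
    seek($fh, 0, 0)       or die "can't rewind $path: $!";
    truncate($fh, 0)      or die "can't truncate $path: $!";
    print {$fh} $num + 1, "\n" or die "can't write $path: $!";
    # The lock is released when the handle is closed, not before.
    close($fh)            or die "can't close $path: $!";
    return $num + 1;
}

# Usage (file name is an example): my $id = next_number('data.txt');
```

Running under "use strict" would also have turned the undefined constants into a compile-time error instead of a silent read-only open.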

TIA :)
-- 
Rich Grise
richgrise@entheosengineering.com
(No need to futz with my e-mail: I have a "delete" button!)


------------------------------

Date: Sat, 19 Dec 1998 00:33:22 GMT
From: birgitt@my-dejanews.com
Subject: Re: First German Perl Workshop 1.0
Message-Id: <75es8i$t99$1@nnrp1.dejanews.com>

In article <75dlmu$b0o@alice.gmd.de>,
  jc@gmd.de (Juergen Christoffel) wrote:
> [mailed and posted]
>
> In <75abr0$hi4$3@penthesilea.Materna.DE> Juergen.Puenter@materna.de (Jürgen Pünter) writes:
>
> >In article <7596r6$3vu$1@nnrp1.dejanews.com>, birgitt@my-dejanews.com
> >says...
> >>
> >>This announcement was NOT made in de.comp.lang.perl, nor in clpm, nor in
> >>clp.announce. Just in clp.moderated.
> >>
> >>Is it worth wondering why ?


> It was first submitted to clp.announce by
> the end of November but Randal (or some semi-automated incarnation of him)
> rejected it because it didn't contain the "right keywords" without further
> regard to the contents.

I mentioned months ago, that IMHO such things should be announced in
clp.announce and I haven't changed my mind about it.

> And, mind you, cross-posting between
> clp.moderated and clp.misc wouldn't be that good an idea.
>

IMHO for that kind of a post it would have been appropriate.

> And, yes, it wasn't posted to de.comp.lang.perl because 90% of the
> readership wouldn't be able to attend the workshop anyway (see below for
> details). But we expect the remaining 10% of the readership to also follow
> at least clp.moderated and so feel free to join the mailing list if you
> read clp.moderated regularly.

I lurk regularly in all Perl NGs, have nothing perlish to post and
nothing perlish (yet) to ask (just a lot to read). I don't qualify
at all to join clp.moderated. I could not post my first reaction to the
announcement to clp.moderated. I still want to know what is going on
among Perl programmers in Germany. Is that too much to ask ?

>  only. If you've read the camel book and regularly
> consult the perl faqs (which, btw, 90% of the people posting in
> de.comp.lang.perl won't do :-())

from which one can _not_ conclude that 90% of all _readers_ of
de.comp.lang.perl or clpm don't read the perl faqs or read Perl books...

> you might qualify already.

I have nothing to lose on an expert workshop for Perl. I am a first-semester
programming student at a German University. I have no problem
with an expert meeting of Perl programmers in Germany at all. All the
reasons you give I can support wholeheartedly.

My argument goes against HOW it was announced, not WHAT was announced.
And therefore I post this again here, in clpm.

	--jc
>
> P.S. When I came up with the idea of organizing "such a thing" last year
> (after having been to the first perl conference in san jose) I thought that
> a German or maybe even a European conference might be a good idea. But
> after talking with various people who supported the idea and even running a
> BOF at this year's perl conference, we decided to start with a small
> workshop and scale it up afterwards if the workshop succeeds.

No problem, again, with all of that, though I wish you would put more
emphasis on teaching and would push academic circles at German universities
to include Perl classes.

>    GMD - German National Research Center for Information Technology
>
>    "Lesen erspart Ueberraschungen!" Hermann Heimpel, Historiker
>
     I like your signature, that's why I even read clp.moderated :-)
     (signature translated roughly: "Reading saves (you some) surprises".)

birgitt



------------------------------

Date: Fri, 18 Dec 1998 14:02:58 -0500
From: Ala Qumsieh <aqumsieh@matrox.com>
Subject: Re: hashes
Message-Id: <x3yogp19sta.fsf@tigre.matrox.com>


Tk Soh <r28629@email.sps.mot.com> writes:

> 
> > $num = scalar keys %hash;
>          ^^^^^
> $num is already a scalar.

I know .. but I like to be more explicit sometimes.
Example:

I've seen a lot of postings from Perl experts that say something like:

$length = @ary;

or for ($i = 0; $i < @ary; $i++) { ... }

I don't like that. I prefer to explicitly say:

$length = scalar @ary;

It's easier to read (and safer?).
It's just a matter of taste and style.
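
A small sketch of the distinction under discussion, with made-up variable names: an array in scalar context yields its element count, scalar() spells that out, and the place it actually changes the result is inside a list, where a bare array would be flattened instead.

```perl
use strict;
use warnings;

my @ary = qw(a b c);

my $implicit = @ary;          # scalar assignment: element count
my $explicit = scalar @ary;   # same result, context spelled out
my ($first)  = @ary;          # parentheses make this a list assignment

# Inside a list, a bare @ary flattens; scalar() is needed to get the count.
my @flattened = ('count:', @ary);
my @counted   = ('count:', scalar @ary);

print "implicit=$implicit explicit=$explicit first=$first\n";
print "flattened=@flattened counted=@counted\n";
```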

Ala



------------------------------

Date: 18 Dec 1998 23:31:30 -0000
From: Jonathan Stowe <gellyfish@btinternet.com>
Subject: Re: I know this sounds stupid...
Message-Id: <75eoki$jk$1@gellyfish.btinternet.com>

On Fri, 18 Dec 1998 22:50:31 GMT Devin Redlich <devin@pctc.com> wrote:
> 
<snip tale of woe>
>
> rm -rf libwww-perl-5.36
> 
> And what happens?  Different system, same result.  Another kernel panic and 
> another crash.  I couldn't believe it.  Now, I'm no perl expert - I just 
> use it for a few quick utilities now and then - but I *am* allowed to 
> delete the distribution directory after I do a "make install", am I not?
> 
> If I'm doing something wrong, please clue me in.  Thx.

Ouch! I'm not being weird here, but I don't think anyone here can help you.
It sounds like you've either got a seriously screwed up filesystem or a
seriously screwed up operating system - I'd fsck the hell out of the
filesystem in question and if that doesn't remedy the problem I'd go to the
OS vendor and give them a raving ...

/J\
-- 
Jonathan Stowe <jns@btinternet.com>
Some of your questions answered:
<URL:http://www.btinternet.com/~gellyfish/resources/wwwfaq.htm>
Hastings: <URL:http://www.newhoo.com/Regional/UK/England/East_Sussex/Hastings>


------------------------------

Date: Fri, 18 Dec 1998 13:57:20 -0500
From: Ala Qumsieh <aqumsieh@matrox.com>
Subject: Re: multi-dimensional arrays
Message-Id: <x3ypv9h9t2n.fsf@tigre.matrox.com>


"Teun" <teun@bye.nl> writes:

> 
> Is it possible to use two- or three dimensional arrays in perl?

Yes .. have a look at 

perldoc perldata
perldoc perldsc

> what would a foreach-loop look like then?

What foreach loop??
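
For what the original question seems to be after, a minimal sketch: a two-dimensional array in Perl is an array of array references, and a nested foreach walks it row by row. The data and the names @grid, $row and $value are invented for the example.

```perl
use strict;
use warnings;

# Each element of @grid is a reference to an anonymous array (one row).
my @grid = (
    [ 1, 2, 3 ],
    [ 4, 5, 6 ],
);

# Direct access: row 1, column 2.
my $cell = $grid[1][2];

# Nested foreach: outer loop over rows, inner loop over the cells of a row.
my @lines;
foreach my $row (@grid) {
    my @cells;
    foreach my $value (@$row) {
        push @cells, $value * 10;
    }
    push @lines, join(' ', @cells);
}

print "cell=$cell\n";
print "$_\n" for @lines;
```

A third dimension works the same way: $cube[$x][$y][$z], with one more level of loop.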



------------------------------

Date: 18 Dec 1998 23:17:11 GMT
From: moseley@best.com (Bill Moseley)
Subject: Re: Nested sorting
Message-Id: <367ae277$0$205@nntp1.ba.best.com>

In article <MPG.10e4750c33ead7bc9898db@nntp.hpl.hp.com>, lr@hpl.hp.com says...
>You didn't do enough pre-processing.  The goal should be to reduce the 
>sort subroutine to triviality, or -- even better -- to non-existence.  

>All the processing you do in your current routine on a comparison-by-
>comparison basis should be done once, before actually sorting.

I'm not clear on what you mean, here.

>> I don't know the format of the sort keys (numeric vs. alpha), and if the first
>> character is a "!" then I do a reverse sort.
>
>These characteristics can and should be resolved during the pre-
>processing that computes a single key per datum.

Well, the problem I'm having is that the program doesn't know the type
of data or number of keys until run time.  That's why I'm using the for loop
to compare the nested keys one-by-one based on the size of the @sort_keys
array.

And I would like to build one string of all the keys to sort, but because 
the reverse sort can be on any or none of the key positions, it seems that a
single key won't work.


Removing the lc() makes sense, of course.



>
>I have made a couple of comments below

Me too.


> about your code, but I don't 
>endorse this approach.

That's why I'm here!



>
>> sub list_sort {
>> 
>>    my $compare;
>>    
>>    # Loop through all the sort keys
>>    
>>    for (0..$#{$a->[1]} )  {
>> 
>>        # create local copies of the data 
>> 
>>        my $aa = $a->[1][$_];
>>        my $bb = $b->[1][$_];
>>        
>>        # look for reverse sort flag
>> 
>>        if ( $aa =~ /^!/ ) {      # "!" is a reverse flag
>>           my $cc = $aa;          # Would index() be better than m//?
>
>Yes, but considering how pokey the rest of the code is, you would 
>*never* note the difference.
>
>>           $aa = $bb;
>>           $bb = $cc;
>
>Perl supports a one-line swap:  ($aa, $bb) = ($bb, $aa);  But where do 
>you get rid of the '!'?  What if it is on one of the values but not the 
>other?

For a given key position it is on all the keys.  That's placed there in the
pre-processing on each key of that sort position.  So it doesn't matter
that the character is there.


>
>>        }
>> 
>>        $compare =  (
>>                        $aa <=> $bb 
>
>With '-w' on, this would be a very noisy way to disambiguate numbers 
>from letters.  Convert the numbers to strings in the pre-processing, as 
>discussed and demonstrated in the recent discussions in this group.

I could do this, but I don't know the range of numbers (to set the width of
sprintf field to make numbers sort correctly).


>
>>                             or
>>                     lc($aa) cmp lc($bb)
>
>Similarly, do the case-smashing in the pre-processing.
>
>>                    );
>> 
>> 
>>        # Return if they don't match
>> 
>>        return $compare if $compare;
>>    }
>> 
>>    0;  # return something if falls through
>> }
>
>There is also a FMTEYEWTK about sorting, and a FAQ that you should 
>absorb:  perlfaq4:  "How do I sort an array by (anything)?"

Yes, I've read both of those, and the articles that led up to the FMTEYEWTK.

I'll look for more on Dejanews.



--------------
Bill Moseley
moseley@best.com



------------------------------

Date: Sat, 19 Dec 1998 00:42:48 GMT
From: Eric Bohlman <ebohlman@netcom.com>
Subject: Re: Nested sorting
Message-Id: <ebohlmanF46snC.KDJ@netcom.com>

Bill Moseley <moseley@best.com> wrote:
: In article <MPG.10e4750c33ead7bc9898db@nntp.hpl.hp.com>, lr@hpl.hp.com says...
: >You didn't do enough pre-processing.  The goal should be to reduce the 
: >sort subroutine to triviality, or -- even better -- to non-existence.  

: >All the processing you do in your current routine on a comparison-by-
: >comparison basis should be done once, before actually sorting.

: I'm not clear on what you mean, here.

When you sort a list, each item gets compared multiple times.  Since perl 
uses the quicksort algorithm, each item participates in an average of 
log2(N) comparisons.  So if you have 1000 items, sort() has to do 10,000 
comparisons.

Therefore, you want to minimize the amount of computation you do when 
sort() has to compare two items.  The usual way of doing this is to 
pre-process the items, before sorting them, into a form that can be 
compared as quickly as possible, sort those transformed items, and then 
extract the original data from the sorted list.  That way, if you have 
1000 items you only need to do 1000 transformations, not 10,000.
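
The pre-process, sort, extract sequence described above can be sketched as follows. This is only an illustration: sort_key() is a hypothetical stand-in for whatever expensive per-item work the real comparison does, and the call counter exists purely to show how often it runs.

```perl
use strict;
use warnings;

my $calls = 0;

# Hypothetical stand-in for an expensive per-item computation.
sub sort_key {
    my ($item) = @_;
    $calls++;
    return lc $item;
}

my @items = qw(pear Apple fig Banana cherry Date);

# Naive: the key is recomputed for both operands of every comparison.
$calls = 0;
my @naive = sort { sort_key($a) cmp sort_key($b) } @items;
my $naive_calls = $calls;

# Transformed: compute each key once, sort on the stored key,
# then pull the original items back out.
$calls = 0;
my @fast = map  { $_->[1] }
           sort { $a->[0] cmp $b->[0] }
           map  { [ sort_key($_), $_ ] } @items;
my $fast_calls = $calls;

print "naive: $naive_calls key computations, transformed: $fast_calls\n";
print "@fast\n";
```

The gap between the two counts grows with the list length, which is the point of moving the work out of the comparison.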



------------------------------

Date: Fri, 18 Dec 1998 16:48:01 -0800
From: lr@hpl.hp.com (Larry Rosler)
Subject: Re: Nested sorting
Message-Id: <MPG.10e4979bfeac0119898dc@nntp.hpl.hp.com>

[Posted to comp.lang.perl.misc and a copy mailed.]

In article <367ae277$0$205@nntp1.ba.best.com> on 18 Dec 1998 23:17:11 
GMT, Bill Moseley <moseley@best.com> says...
+ In article <MPG.10e4750c33ead7bc9898db@nntp.hpl.hp.com>, lr@hpl.hp.com says...
+ >You didn't do enough pre-processing.  The goal should be to reduce the 
+ >sort subroutine to triviality, or -- even better -- to non-existence.  
+ 
+ >All the processing you do in your current routine on a comparison-by-
+ >comparison basis should be done once, before actually sorting.
+ 
+ I'm not clear on what you mean, here.

What I mean is that when the data are ready for sorting, you do one pass 
over them to prepare keys that can be used to do the actual sorting.

+ >> I don't know the format of the sort keys (numeric vs. alpha), and if the first
+ >> character is a "!" then I do a reverse sort.
+ >
+ >These characteristics can and should be resolved during the pre-
+ >processing that computes a single key per datum.
+ 
+ Well, the problem I'm having is that the program doesn't know the type
+ of data or number of keys until run time.  That's why I'm using the for loop
+ to compare the nested keys one-by-one based on the size of the @sort_keys
+ array.
+ 
+ And I would like to build one string of all the keys to sort, but because 
+ the reverse sort can be on any or none of the key positions, it seems that a
+ single key won't work.

Then you are in real trouble, because all the sorting techniques 
described require the existence of a single-valued key per datum in the 
array to be sorted.

 ...

+ >>        $compare =  (
+ >>                        $aa <=> $bb 
+ >
+ >With '-w' on, this would be a very noisy way to disambiguate numbers 
+ >from letters.  Convert the numbers to strings in the pre-processing, as 
+ >discussed and demonstrated in the recent discussions in this group.
+ 
+ I could do this, but I don't know the range of numbers (to set the width of
+ sprintf field to make numbers sort correctly).

If they are really numbers (and not arbitrarily long strings of digits), 
then on a 32-bit implementation, 10 digits will suffice.  If any of the 
numbers can be negative, bias them all by adding 1 << 31.
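
A hedged sketch of the biased, fixed-width key suggested above. It assumes signed 32-bit integer scores and invented sample records; the helper name make_key and the choice of "score descending, then name ascending" are illustrative. A descending numeric field is handled here by subtracting the biased value from the 32-bit maximum, which is one way of folding a reverse sort into a single string key.

```perl
use strict;
use warnings;

# Hypothetical records: [ name, signed 32-bit score ].
my @records = ( [ 'carol', -5 ], [ 'alice', 42 ], [ 'Bob', 42 ], [ 'dave', 7 ] );

# Build one string key per record:
#   - score: bias by 1 << 31 so it is non-negative, complement it so that
#     larger scores sort first, and pad to 10 digits so string order
#     matches numeric order;
#   - name: lower-cased once, here, instead of in every comparison.
sub make_key {
    my ($name, $score) = @_;
    my $biased     = $score + (1 << 31);      # 0 .. 2**32 - 1
    my $descending = 0xFFFFFFFF - $biased;    # reverse the numeric order
    return sprintf('%010u', $descending) . lc $name;
}

# Compute keys once, sort with a plain string comparison, drop the keys.
my @sorted = map  { $_->[1] }
             sort { $a->[0] cmp $b->[0] }
             map  { [ make_key(@$_), $_ ] } @records;

print join(', ', map { "$_->[0] ($_->[1])" } @sorted), "\n";
```

With the key built this way the comparison needs neither <=> nor lc(), so it stays quiet under -w.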

-- 
(Just Another Larry) Rosler
Hewlett-Packard Company
http://www.hpl.hp.com/personal/Larry_Rosler/
lr@hpl.hp.com


------------------------------

Date: 18 Dec 1998 21:02:09 GMT
From: mjtg@cus.cam.ac.uk (M.J.T. Guy)
Subject: Re: Origin of 'local'?
Message-Id: <75efsh$ji8$1@pegasus.csx.cam.ac.uk>

[ Off topic, like the rest of this thread, so I'll just mention Perl here ]

Peter A Fein  <p-fein@uchicago.edu> wrote:
>
>I believe Algol did lexical as well; some of the earlier LISP variants
>used dynamic (though unintentionally, IIRC).

That's exactly right.    The original Algol60 Report and the first
version of Lisp, published at about the same time, both had the bug
of defining binding as dynamic rather than lexical.   This happened
because the evaluation of a subroutine call was defined (roughly) as
"Substitute the text of the subroutine body in place of the call".
This gives you dynamic binding rather than the intended lexical.

The Algol Committee noticed the bug, and corrected it in the Revised
Report.   So most languages in the Algol tradition (including Pascal)
have lexical scoping.    But the bug in Lisp was left unmended for more
than 15 years.  Since Lisp was the preferred language for Computer
Scientists at that time,  this has created a whole generation of
CS people with very confused ideas about binding.

Another moral of the story is that committees don't always do a worse
job than gifted individuals.
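
To tie this back to Perl: the two kinds of binding can be seen side by side with local (dynamic) and my (lexical). A minimal sketch, with invented variable and sub names:

```perl
use strict;
use warnings;

# Dynamic binding: local() temporarily changes the package variable, and
# every sub called during that time sees the temporary value.
our $x = 'global';
sub show_x  { return $x }
sub dynamic { local $x = 'dynamic'; return show_x() }

# Lexical binding: a my variable is visible only in the text of its own
# block, so show_y() keeps seeing the $y in whose scope it was written.
my $y = 'outer';
sub show_y  { return $y }
sub lexical { my $y = 'inner'; return show_y() }

print 'dynamic(): ', dynamic(), "\n";   # show_x() sees the caller's local value
print 'lexical(): ', lexical(), "\n";   # show_y() ignores the caller's my $y
print 'afterwards: ', show_x(), "\n";   # local's change is undone on exit
```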


Mike Guy


------------------------------

Date: 18 Dec 1998 23:44:13 GMT
From: johnDoe@nowhere.com (Kevin Scott)
Subject: RSA and PERL
Message-Id: <75epcd$ek1$1@newshost.cyberramp.net>

Has anyone been able to get PERL to use RSA's BSAFE product?

I need to find a way (in the USA) to read an https:// connection.

I really do not want to have to use C instead of PERL!


Kevin



------------------------------

Date: Fri, 18 Dec 1998 23:37:39 GMT
From: Eric Bohlman <ebohlman@netcom.com>
Subject: Re: Searching through a 10MB file
Message-Id: <ebohlmanF46pMr.HwC@netcom.com>

Christian M. Aranda <christian.aranda@iiginc.com> wrote:
: Thanks for the explanation!  The new version of my code is as follows
: (it smokes!) -

: sub get_ddts_record
: {
: 	open(DDTS_DATA, "$ddts_file") ||
: 		&err_msg("fatal", "Unable to open data file", "open",
: undef ) ;

:    $start = "Start: $ddts_value{identifier}";
:    $end_record = "End: $ddts_value{identifier}";
:    undef($attached_record);

:    while (<DDTS_DATA>) {
:       if (index($_, $start) == 0) {

If your start and end patterns are always going to appear at the beginning
of the line, it might be faster to precompute the lengths of $start 
and $end_record and then replace the index comparisons with something like:

if (substr($_,0,$start_length) eq $start) {

which would eliminate any time spent by index() in looking for matches 
anywhere but the start.  You would of course need to actually measure the 
times to see if there's any real improvement; IIRC index() uses the 
Boyer-Moore algorithm, which among other things actually starts comparing 
from the *end* of the string being searched for, so if most of your lines 
are not much longer than your start and end lines, index() will be pretty 
quick in rejecting them and the extra overhead of a substr() might eat up 
any savings.
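
A minimal sketch of the substr() variant, with the length computed once outside the loop. The identifier value and the helper names are made up; whether this actually beats index() would still have to be measured, as noted above.

```perl
use strict;
use warnings;

my $start        = 'Start: 12345';    # hypothetical record marker
my $start_length = length $start;     # computed once, not per line

# True when $line begins with $start; only the first $start_length
# characters of the line are ever examined.
sub begins_with_start {
    my ($line) = @_;
    return substr($line, 0, $start_length) eq $start;
}

# The index() form from the original code, for comparison: it is also
# true only for a match at position 0, but may scan the whole line first.
sub begins_with_start_index {
    my ($line) = @_;
    return index($line, $start) == 0;
}

for my $line ("Start: 12345\n", "  Start: 12345\n", "St\n") {
    printf "%-18s substr=%d index=%d\n",
        join('', $line =~ /^(.*)$/),
        begins_with_start($line) ? 1 : 0,
        begins_with_start_index($line) ? 1 : 0;
}
```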



------------------------------

Date: 18 Dec 1998 19:00:05 -0500
From: Uri Guttman <uri@sysarch.com>
Subject: Re: Searching through a 10MB file
Message-Id: <x7r9txm262.fsf@sysarch.com>

>>>>> "CMA" == Christian M Aranda <christian.aranda@iiginc.com> writes:

  CMA> On 17 Dec 1998 16:08:52 -0500, Uri Guttman <uri@ibnets.com> wrote:
  >> anyway, as i said, using a single loop with flags is old
  >> fashioned. learn to use multiple loops. the logic is simpler and it is
  >> faster to boot. my earlier post has most of the code you need for that
  >> design.

  CMA> Thanks for the explanation!  The new version of my code is as follows
  CMA> (it smokes!) -

as i would have expected!

  CMA> sub get_ddts_record
  CMA> {
  CMA> 	open(DDTS_DATA, "$ddts_file") ||
  CMA> 		&err_msg("fatal", "Unable to open data file", "open",
  CMA> undef ) ;

  CMA>    $start = "Start: $ddts_value{identifier}";
  CMA>    $end_record = "End: $ddts_value{identifier}";

  CMA>    undef($attached_record);

why do you do this? in general don't undef variables. assign to
them. undef as a function also frees the variable's storage, which is more
than you need to do. since all you do is append to it, assigning '' is
proper and simpler.

  CMA>    while (<DDTS_DATA>) {
  CMA>       if (index($_, $start) == 0) {

  CMA>          while (<DDTS_DATA>) {
  CMA>             last if (index($_, "History") == 0);
  CMA>          }

  CMA>          while (<DDTS_DATA>) {
  CMA>             chop($_);
  CMA>             $attached_record .= "$_\n";
  CMA>             last if (index($_, $end_record) == 0);
  CMA>          }
  CMA>       }
  CMA>    }

as someone else commented you could use substr to compare only the
beginning part of each string. regexes are fine here too but need to be
recompiled each time.

  CMA> @comments = split("Related-file:",$attached_record);
  CMA> close(DDTS_DATA);
       
  CMA> }

i never found out from you if you are searching for multiple records per
invocation of the program. if so you should make a hash of the record
keys you want, grab the start lines and see if the record name is in
the hash. then you can do one pass of the large file and get multiple
records very efficiently.
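
a rough sketch of that one-pass design, under simplifying assumptions: records are delimited by "Start: ID" and "End: ID" lines, the "History" skip from the original code is left out, and the names grab_records and %want are invented for the example.

```perl
use strict;
use warnings;

# Read $fh once and return a hash ref mapping each wanted identifier to
# the text between its "Start: ID" and "End: ID" lines.
sub grab_records {
    my ($fh, @ids) = @_;
    my %want = map { $_ => 1 } @ids;    # identifiers to extract
    my %found;

    while (my $line = <$fh>) {
        next unless substr($line, 0, 7) eq 'Start: ';
        chomp(my $id = substr($line, 7));
        next unless $want{$id};         # not a record we care about

        my $end    = "End: $id";
        my $record = '';
        while (defined($line = <$fh>)) {
            last if substr($line, 0, length $end) eq $end;
            $record .= $line;
        }
        $found{$id} = $record;
    }
    return \%found;
}

# Usage with an in-memory file standing in for the 10MB data file.
my $data = "Start: A\nline a1\nEnd: A\n"
         . "Start: B\nline b1\nEnd: B\n"
         . "Start: C\nline c1\nline c2\nEnd: C\n";
open(my $fh, '<', \$data) or die "can't open in-memory file: $!";
my $records = grab_records($fh, 'A', 'C');
close($fh);

print "$_:\n$records->{$_}" for sort keys %$records;
```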

uri

-- 
Uri Guttman  -----------------  SYStems ARCHitecture and Software Engineering
Perl Hacker for Hire  ----------------------  Perl, Internet, UNIX Consulting
uri@sysarch.com  ------------------------------------  http://www.sysarch.com
The Best Search Engine on the Net -------------  http://www.northernlight.com


------------------------------

Date: Fri, 18 Dec 1998 18:24:53 -0500
From: "Steven Morlock" <newspost@morlock.net>
Subject: Re: STANDARD PERL for WIN 95/NT EXECUTABLE
Message-Id: <GwBe2.1669$qF5.4807311@lwnws01.ne.mediaone.net>


Since the ActiveState folks are not releasing the
source code for their port can it really be considered
the _standard_ release?

When, if ever, will ActiveState's port of perl be fully rolled
into the common source archive?

Steve

--
Foliage Software Systems
aka The Nerd Farm
http://www.foliage.com

Matthew Bafford wrote in message ...
>[This followup was posted to comp.lang.perl.misc and a copy was sent to
>the cited author.]
>
>In article <76d9Tp03UkB@link-n-j.poehlmann.link-n.cl.sub.de>,
>j.poehlmann@link-n.cl.sub.de says...
>=> (Weekly posting to comp.lang.perl.misc and de.lang.perl)
>             ^ false
>=> _Where to find the *STANDARD* windows 95/NT port of perl (binary) ?_
>
>Other people have already tried, but maybe my response will be the last
>stick on the camel's back.  Doubtful, though.
>
>The most *RECENT* port is the *STANDARD* port.
>
>And, since AS's release is 5.005_02, and GS's is only 5.004_02 (IIRC,
>over a year's span between the two), I'd say AS's is much more recent than
>GS's.
>
>Since it's more recent, that makes it the *STANDARD* port.
>
>=> (The "standard" has a number of advantages over Microsoft's "Activeware"
>=> port. E.g. you can install perl modules  not contained in the binary
>=> distribution. As long as these  Modules do not use C-code you do not need a C-
>=> compiler )
>
>This comment was true during the days when AS (or whatever they were
>called) didn't support MakeMaker.  Now that they do, you are able to use
>any non C module with AS's release just as easily as you can with GS's.
>
>=> Worldwide (Redirecting you to a "near" site )
>=>     ftp://ftp.perl.com/pub/CPAN/ports/win32/Standard/x86/
>
>The ftp site does not redirect you.
>
>=>     Look for a File like
>=>           perl5.00402-bindist04-bc.zip
>=>     there is a new Version (perl 5.005 ) released
>
>No, there is not.
>
>[snip]
>
>=> #------------------------------------
>=> #  find here Gurusamy Sarathy's readme file
>=> #------------------------------------
>
>Find here GS's comment about the perl.5.00402-bindist04-bc.*:
>
>perl5.00402-bindist04-bc.zip
>perl5.00402-bindist04-bc.tar.gz
>    A popular binary distribution of Perl for the Win32 platform (intel).
>    Comes with many widely-used modules and all the tools for adding
>    new ones from CPAN.  This distribution is rather aged; see
>    www.activestate.com for binaries built around Perl 5.005.  I plan to
>    update this distribution with Perl 5.004_05 when that maintenance
>    update of Perl becomes available.  There are no current plans to make
>                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>    5.005 binaries, but I may make one if there is sufficient interest
>    ^^^^^^^^^^^^^^
>    and/or the ActiveState offering doesn't cut it, for some reason that
>    they're unwilling to fix.
>
>
>And, since he now WORKS for ActiveState, AS's port should be considered
>the *STANDARD*.
>
>[snip]
>
>=> Enjoy!
>=>
>=>
>=> Gurusamy Sarathy (Just Another Perl Porter)
>=> gsar@umich.edu
>=> 08-AUG-1997
>
>You know, it's fairly rude to sign a message with someone else's name.
>
>Hope This Helps!
>
>--Matthew




------------------------------

Date: Fri, 18 Dec 1998 22:56:40 GMT
From: bart.lateur@skynet.be (Bart Lateur)
Subject: Re: What's the Right Way to detect an Array Ref?
Message-Id: <367bdcfe.1256738@news.skynet.be>

dg50@chrysler.com wrote:

>What I want is for - in this one instance - to ALWAYS return an array ref,
>even if there is only one element in the name=value hash.
>
>To do this, my tortured and overworked little brain came up with:
># Ohhhh, this is ugly!
>$promotees_raw = [ ("$promotees_raw") ] if ($promotees_raw !~ /^ARRAY/);

Try the function "ref".

   $promotees_raw = [ $promotees_raw ] 
	unless  ref($promotees_raw) eq 'ARRAY';

but I think that in this case, if ref returns anything true, it must be
an array ref:

   $promotees_raw = [ $promotees_raw ] unless ref($promotees_raw);
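
A minimal sketch of the ref() test, wrapped in a helper so it can be tried
on both kinds of input. The name as_array_ref and the sample values are
made up for illustration; they are not from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: return the argument unchanged if it is already
# an array ref, otherwise wrap it in a new one-element anonymous array.
sub as_array_ref {
    my ($value) = @_;
    return ref($value) eq 'ARRAY' ? $value : [ $value ];
}

my $single = as_array_ref('alice');             # plain scalar
my $many   = as_array_ref([ 'alice', 'bob' ]);  # already an array ref

# Both results can now be dereferenced the same way.
foreach my $string (@$single, @$many) {
    print "$string\n";
}
```

Comparing against 'ARRAY' rather than testing ref() for truth means a hash
or code ref is wrapped too, so the later @{...} dereference cannot fail on it.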

	Bart.


------------------------------

Date: Sat, 19 Dec 1998 00:28:12 GMT
From: Eric Bohlman <ebohlman@netcom.com>
Subject: Re: What's the Right Way to detect an Array Ref?
Message-Id: <ebohlmanF46rz1.Ju2@netcom.com>

dg50@chrysler.com wrote:
: # Ohhhh, this is ugly!
: $promotees_raw = [ ("$promotees_raw") ] if ($promotees_raw !~ /^ARRAY/);

: foreach $string (@{$promotees_raw}) {
: ...etc.

: What's the Right Way to do this?

: I didn't find anything immediately obvious in perldoc or perlfaq, but I
: suppose I may have missed something....

perldoc -f ref



------------------------------

Date: Fri, 18 Dec 1998 14:17:56 -0500
From: Ala Qumsieh <aqumsieh@matrox.com>
To: tomek@rentec.com
Subject: Re: Why references get created in an r-value context?
Message-Id: <x3yn24l9s4c.fsf@tigre.matrox.com>


Tomasz Kozlowski <tomek@rentec.com> writes:

> 
> I ran into problems with Perl while using the following kind of code
> 
>    $val = $data->{$date}->{$attr} || '';
> 
>    The variable $val gets either the value from the hash
> or gets initialized to the empty string. Everything works as expected.
> If the hash doesn't contain anything for the specified values, however,
> some references magically spring into existence and the hash gets updated.
> This isn't what I expected. For example, the following statements

This is called 'autovivification' and is properly documented.

> 
>    $data = {};
>    $date = '19981215';
>    $attr = 'name';
>    $val = $data->{$date}->{$attr} || '';
> 
> will cause the hash reference to become
> 
>    $data->{'19981215'} = {}

Yep .. If you don't want that behaviour, then you have to test for the
hash element's existence before trying to access it.

$val = $data->{$date}{$attr} if exists $data->{$date} and
	exists $data->{$date}{$attr};

or something like that.
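
A small sketch comparing the guarded lookup with the unguarded one from
the question. The variable names $before and $after are only there to
record the key counts; everything else follows the example above.

```perl
use strict;
use warnings;

my $data = {};
my $date = '19981215';
my $attr = 'name';

# Test the outer key first; 'and' short-circuits, so the inner
# lookup never runs (and never vivifies) when the date is missing.
my $val = '';
$val = $data->{$date}{$attr}
    if exists $data->{$date} and exists $data->{$date}{$attr};
my $before = keys %$data;    # 0: the hash is still empty

# The unguarded form from the question, for comparison.
my $unguarded = $data->{$date}{$attr} || '';
my $after = keys %$data;     # 1: $data->{'19981215'} is now {}

print "keys after guarded lookup:   $before\n";
print "keys after unguarded lookup: $after\n";
```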

>   To me this is a bug in Perl either in the implementation or specification.

I agree .. but it's too late to fix it. Way too many programs depend
on this behaviour, and fixing it would break a lot of code. It's not
worth it as it can be easily avoided by testing for existence before.

(This was discussed in a thread in the ng a few months ago. Search
DejaNews if you want to know more).

> I consulted 'Programming Perl', 2nd Ed by Wall, Christiansen and Schwartz
> and on page 250 the authors say:
> 
> "This is one of the cases mentioned earlier in which references spring
> into existence when used in an lvalue context ...
> Nothing would spring into existence if you were just trying to print out
> the value. You'd just get the undefined value out of it."
> 
> I talked to quite a few people around and nobody said that's a desirable
> behavior but people tended to defend Perl along the lines "It's easy
> to imagine why it does that". Well, yes, it's easy to imagine what's
> happening but that doesn't mean Perl should do that. It seems to me that
> using a hash reference in an r-value context should not update the hash.

Agreed. But, it's too late (IMO).

> The keys construct behaves similarly. I have many loops for iterating
> over the contents of a hash that look like
> 
>    foreach $date (keys %$data)
>    {
>       foreach $attr (keys %{$data->{$date}})
>       {
>          $val = $data->{$date}->{$attr};
>          ...
>       }
>    }
> 
>    This snippet of code doesn't give me any surprises. But if I change it to

Of course not since you are using keys() to get a list of keys. So you
are guaranteed that they exist in the hash.

> 
>    foreach $date (@dates)
>    {
>       foreach $attr (keys %{$data->{$date}})
>       {
>          $val = $data->{$date}->{$attr};
>          ...
>       }
>    }
> 
>    and the hash initially doesn't contain anything for some date, well,
> too bad, Perl will create a spot for it and assign to it a reference to an
> empty hash. To give a shorter example:

Yep .. that's the same behaviour. This is also the behaviour that
allows you to freely say:

$hash{$key1}{$key2}{$key3} = $val;

Would you complain about that one?

> 
>    $data = {};
>    @attrs = keys %{$data->{'19981215'}};
> 
>    $data is now
>       $data->{'19981215'} = {}
> 
>   Again, things get changed when a reference is used in an r-value context.
> To make things worse the defined function has the same 'feature':

defined() checks if its argument is defined. If the argument is a hash
element, as you show, it will try to access that element. This is the
same as

$var = $data->{'19981215'}->{'attr'};

Perl tries to access that element; if the intermediate hash key doesn't
exist, it creates it.

I think you are confused here; you should use exists() instead of
defined(), testing the outer key before the inner one.
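
A short sketch of why the order of the tests matters. exists() applied
directly to the nested element still has to look inside the outer
element, so it vivifies the outer key just as defined() does; testing one
level at a time avoids that. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $data = {};

# exists() on the nested element dereferences $data->{'19981215'},
# so the outer key springs into existence here too.
my $found_direct = exists $data->{'19981215'}{'attr'};
my $keys_direct  = keys %$data;    # 1

# Start again and test one level at a time; && short-circuits
# before the inner lookup is ever attempted.
$data = {};
my $found_guarded = exists $data->{'19981215'}
    && exists $data->{'19981215'}{'attr'};
my $keys_guarded = keys %$data;    # 0

print "direct:  $keys_direct key(s)\n";
print "guarded: $keys_guarded key(s)\n";
```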

>    $data = {};
>    if( defined( $data->{'19981215'}->{'attr'} ) )
>       { print( "defined\n" ); }
>    else
>       { print( "undefined\n" ); }
> 
>    'undefined' gets printed and $data is now
>       $data->{'19981215'} = {}
> 
> Does anyone know of any way of circumventing this problem?

Test with exists() before accessing it.
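
For the @dates loop, that could look something like the sketch below.
The sample data is invented; only the loop shape comes from the question.

```perl
use strict;
use warnings;

my $data  = { '19981214' => { name => 'x' } };
my @dates = ('19981214', '19981215');    # second date has no entry

my @seen;
foreach my $date (@dates) {
    # Skip dates that are not in the hash instead of dereferencing them.
    next unless exists $data->{$date};
    foreach my $attr (keys %{ $data->{$date} }) {
        my $val = $data->{$date}{$attr};
        push @seen, "$date $attr=$val";
    }
}

print "$_\n" for @seen;
print "dates stored: ", join(',', sort keys %$data), "\n";
```

After the loop the hash still holds only the date it started with.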

Hope this helps,
Ala



------------------------------

Date: 18 Dec 1998 15:37:39 -0800
From: kirbyk@best.com (K. Krueger)
Subject: Writing many files efficiently
Message-Id: <75ep03$ont$1@shell2.ba.best.com>

I've got a script that gathers data, parses through it, and writes the data
out to many (~2000) small files.  Unsurprisingly, this is turning out to
be a beast - the CPU hovers around 80-90% iowait during the writing portion
of the code.  This is just bound to be trouble.

Right now, I'm essentially gathering the filenames and the contents into a
big hash, doing a foreach on the keys of that hash (the filenames), opening
up the file, printing the contents to it, and closing it.  All in a big
long loop.  (Code available if necessary, but it's pretty basic.)
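
Roughly, the loop has this shape. This is a sketch, not the real script:
the %files hash, the temporary directory and the five dummy entries are
stand-ins, and it is written for a newer Perl (three-argument open and
lexical filehandles need 5.6 or later).

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical stand-in for the real data: filename => contents.
my $dir = tempdir(CLEANUP => 1);
my %files;
$files{"$dir/file$_.txt"} = "contents $_\n" for 1 .. 5;

# One open/print/close per file, all in one long loop.
foreach my $name (keys %files) {
    open(my $fh, '>', $name) or die "Cannot open $name: $!";
    print $fh $files{$name};
    close($fh) or die "Cannot close $name: $!";
}

my $written = grep { -s $_ } keys %files;
print "wrote $written files\n";
```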

Is there a better way to do this?  Rearchitecting the data structures is
a big deal, and has some other consequences, but I'm willing to do it if
it'll dramatically help the problem.  I'd rather find some sort of module
that writes lots of files in a filesystem-efficient way.

I'm using Perl 5.004_03 on Solaris 2.5.1.  

Thanks.
-- 
Kirby Krueger      O-     kirbyk@best.com 
<*> "Most .sigs this small can't open their own jump gate."


------------------------------

Date: 12 Dec 98 21:33:47 GMT (Last modified)
From: Perl-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Special: Digest Administrivia (Last modified: 12 Dec 98)
Message-Id: <null>


Administrivia:

Well, after 6 months, here's the answer to the quiz: what do we do about
comp.lang.perl.moderated? Answer: nothing. 

]From: Russ Allbery <rra@stanford.edu>
]Date: 21 Sep 1998 19:53:43 -0700
]Subject: comp.lang.perl.moderated available via e-mail
]
]It is possible to subscribe to comp.lang.perl.moderated as a mailing list.
]To do so, send mail to majordomo@eyrie.org with "subscribe clpm" in the
]body.  Majordomo will then send you instructions on how to confirm your
]subscription.  This is provided as a general service for those people who
]cannot receive the newsgroup for whatever reason or who just prefer to
]receive messages via e-mail.

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.misc (and this Digest), send your
article to perl-users@ruby.oce.orst.edu.

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

The Meta-FAQ, an article containing information about the FAQ, is
available by requesting "send perl-users meta-faq". The real FAQ, as it
appeared last in the newsgroup, can be retrieved with the request "send
perl-users FAQ". Due to their sizes, neither the Meta-FAQ nor the FAQ
are included in the digest.

The "mini-FAQ", which is an updated version of the Meta-FAQ, is
available by requesting "send perl-users mini-faq". It appears twice
weekly in the group, but is not distributed in the digest.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending Perl questions to the -request address; I don't have time to
answer them, even if I did know the answer.


------------------------------
End of Perl-Users Digest V8 Issue 4457
**************************************
