[24750] in Perl-Users-Digest


Perl-Users Digest, Issue: 6905 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Aug 24 18:06:06 2004

Date: Tue, 24 Aug 2004 15:05:08 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Tue, 24 Aug 2004     Volume: 10 Number: 6905

Today's topics:
        Can this code be better? <>
    Re: Can this code be better? <nobull@mail.com>
    Re: Can this code be better? <>
    Re: Can this code be better? <dwall@fastmail.fm>
    Re: Can this code be better? <>
    Re: Can this code be better? <someone@example.com>
    Re: convertinga directory path into a hash <nobull@mail.com>
        Dereferencing Objects $_@_.%_
    Re: Dereferencing Objects <notvalid@email.com>
    Re: Dereferencing Objects $_@_.%_
    Re: Dereferencing Objects <notvalid@email.com>
        How to unzip backup files of RaQ4 server? (Andy Signer)
        Parsing FileName for upload (Tony McGuire)
        Performance Improvement of complex data structure (hash <sgilpin@gmail.com>
        perl 5.8.5 (The Doctor)
    Re: Perl Search & Replace Script For Website (krakle)
    Re: Perl Search & Replace Script For Website <noreply@gunnar.cc>
        RegExp Pattern Question <IDontLike@Spam.com>
    Re: RegExp Pattern Question $_@_.%_
    Re: RegExp Pattern Question <Joe.Smith@inwap.com>
    Re: RegExp Pattern Question <IDontLike@Spam.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Tue, 24 Aug 2004 11:50:45 -0400
From: Lou Moran <>
Subject: Can this code be better?
Message-Id: <65omi0lv0nkmov6vvefa17ft8unrh3n0be@4ax.com>

This code works as expected but there are things I think could be
better.  For instance I think using GetOSName might be the righter
thing to do.

Just attempting to improve this working code if I can.


use warnings ;
use strict ;
use diagnostics ;
use Win32 ;
my ($string, $major, $minor, $build, $id) = Win32::GetOSVersion() ; 
my $OS = ($minor) ; #Equals 1 for XP 0 for W2K

my $vpn = ("#		VPN Hosts File
127.0.0.1	localhost
xxx.xxx.xxx.xxx	gwdc.geller-wind.local		#primary dc
") ;
my $not = ("#		Standard Hosts File
127.0.0.1	localhost
") ;

#User interface

print "Geller Group LTD\n\n" ;
print "This program launches our VPN Client and mounts network
drives.\n" ; 
print "Are you using the VPN or NOT? Enter VPN or NOT:  " ; 
     my $loc = <STDIN>;
     chomp $loc;
     $loc =~ tr/A-Z/a-z/;	#needlessly sets lowercase
     # print $loc;

     if ($loc eq "vpn" or $loc eq "not") {
          } else {
          print "\n******\nNo changes made to hosts file.\n" ;
          print "VPN not activated!\n******\n" ;
          print "Press any key to exit.\n" ;
          print `pause` ;
          exit () ;		#exit () works better than die here
     }
     
     $loc =~ s/vpn/v/;		#needlessly sets STDIN to one letter
     $loc =~ s/not/n/;		#needlessly sets STDIN to one letter
     # print $loc;

if ($OS eq "0") {
	open (HOSTS, '+> C:/WINNT/system32/drivers/etc/hosts') ; #W2K
} elsif ($OS eq "1") {
	open (HOSTS, '+> C:/WINDOWS/system32/drivers/etc/hosts') #XP 
} else {
	die "There is no hosts file." ; 	#Overwrites the hosts file
}

if ($loc eq "n") { 
     print "Writing the non-VPN hosts file.\n" ;
     print HOSTS $not ; 			#write the file
     print "Wrote non-VPN hosts file.\n\n" ;
     print "Press any key to exit.\n" ;
     print `pause` ;
     exit () ;
     
     
     } else {
     
     print "Writing the VPN hosts file.\n" ; 
     print HOSTS $vpn ; 			#write the file
     
     #opens the VPN Client software 
     system start=>"C:/Progra~1/SONICWALL/SONICW~1/SWGVpnClient.exe" ;
     
     #pings Domain Controller 10 times 
     print `ping -n 10 gwdc` ;
   
     # Logon script
     print `C:/VPN/BAT/VPN.BAT` ;
     
     print "Wrote VPN hosts file and executed LOGON script.\n\n" ;
     print "Press any key to exit.\n" ;
     print `pause` ;
     exit () ;

}
close HOSTS
__END__


------------------------------

Date: Tue, 24 Aug 2004 18:51:22 +0100
From: Brian McCauley <nobull@mail.com>
Subject: Re: Can this code be better?
Message-Id: <cgfv6r$72k$1@slavica.ukpost.com>

Lou Moran wrote:
> This code works as expected but there are things I think could be
> better.  For instance I think using GetOSName might be the righter
> thing to do.

Which part of the Windows API is the best way to distinguish which 
dialect of Windows you are running is a question about Windows, not Perl. 
I cannot comment.

> 
> Just attempting to improve this working code if I can.
> 
> 
> use warnings ;
> use strict ;
> use diagnostics ;

> use Win32 ;
> my ($string, $major, $minor, $build, $id) = Win32::GetOSVersion() ; 
> my $OS = ($minor) ; #Equals 1 for XP 0 for W2K

There is no need to use 5 variables on the LHS.  You can use undef as a 
dummy or use a list slice.

my (undef, undef, $OS) = Win32::GetOSVersion() ;

or

my $OS = (Win32::GetOSVersion())[2]


> my $vpn = ("#		VPN Hosts File
> 127.0.0.1	localhost
> xxx.xxx.xxx.xxx	gwdc.geller-wind.local		#primary dc
> ") ;
> my $not = ("#		Standard Hosts File
> 127.0.0.1	localhost
> ") ;
> 
> #User interface
> 
> print "Geller Group LTD\n\n" ;
> print "This program launches our VPN Client and mounts network
> drives.\n" ; 
> print "Are you using the VPN or NOT? Enter VPN or NOT:  " ; 
>      my $loc = <STDIN>;

Use indentation to show the structure of your code.  Don't just indent 
  for no reason.

>      chomp $loc;
>      $loc =~ tr/A-Z/a-z/;	#needlessly sets lowercase

There's nothing wrong with that as three lines but the usual idiom is:

   chomp my $loc = lc <STDIN>;

Er - why do you have code that is commented as needless?

>      # print $loc;
> 
>      if ($loc eq "vpn" or $loc eq "not") {
>           } else {
>           print "\n******\nNo changes made to hosts file.\n" ;
>           print "VPN not activated!\n******\n" ;
>           print "Press any key to exit.\n" ;
>           print `pause` ;

While there can sometimes be a reason to do that, it's far more likely 
that you should simply say:

   system('pause');
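
To illustrate the difference, here is a minimal sketch that is not part of the original post. The child commands are stand-ins invented for the example, because `pause` only exists on Windows. Backticks run a command and capture its standard output; system() lets the command use the console directly and returns its exit status.

```perl
use strict;
use warnings;

# $^X is the path of the running perl; it stands in for 'pause'.
my $perl = $^X;

# Backticks capture the child's STDOUT into a string.
my $captured = `"$perl" -e "print 42"`;

# system() leaves the child's output alone and returns the
# wait status, which is 0 when the command succeeded.
my $status = system($perl, '-e', 'exit 0');

print "captured: $captured, status: $status\n";
```

For an interactive command such as `pause`, nothing useful is captured, so system() is the more natural fit.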

>           exit () ;		#exit () works better than die here
>      }
>      
>      $loc =~ s/vpn/v/;		#needlessly sets STDIN to one letter
>      $loc =~ s/not/n/;		#needlessly sets STDIN to one letter
>      # print $loc;

Er - why do you have code that is commented as needless?

> if ($OS eq "0") {
> 	open (HOSTS, '+> C:/WINNT/system32/drivers/etc/hosts') ; #W2K
> } elsif ($OS eq "1") {
> 	open (HOSTS, '+> C:/WINDOWS/system32/drivers/etc/hosts') #XP 

You should always, yes always, check the return value from open().

Why are you opening these files read/write?
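
A minimal sketch of a checked open, under the assumption that the goal is simply to overwrite the file. The sub name `write_hosts` and its parameters are made up for illustration and do not come from the original script.

```perl
use strict;
use warnings;

# Replace the file at $path with $contents, dying with the
# operating-system error ($!) if any step fails.
sub write_hosts {
    my ($path, $contents) = @_;

    # '>' alone truncates (clobbers) the file; '+>' additionally opens
    # it for reading, which is not needed just to overwrite it.
    open my $hosts, '>', $path
        or die "Cannot open $path for writing: $!";
    print {$hosts} $contents
        or die "Cannot write to $path: $!";
    close $hosts
        or die "Cannot close $path: $!";
    return 1;
}
```

Without the `or die`, a failed open() goes unnoticed and the later print statements quietly write nowhere.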

I also happen to know that you've potentially got the wrong directory names 
there (since the directory name is an option during manual install).  The 
correct directory can be found in an environment variable.  This of 
course has nothing to do with Perl.
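
A hedged sketch of that idea. The post does not name the variable; this assumes it is SystemRoot, which on NT-family Windows normally holds the install directory (for example C:\WINNT or C:\WINDOWS). The sub name `hosts_path` is hypothetical.

```perl
use strict;
use warnings;

# Build the hosts file path from the Windows install directory.
# An explicit $root may be passed in; otherwise the SystemRoot
# environment variable (an assumption, see above) is used.
sub hosts_path {
    my ($root) = @_;
    $root = $ENV{SystemRoot} unless defined $root;
    die "SystemRoot is not set\n" unless defined $root;
    return "$root/system32/drivers/etc/hosts";
}

print hosts_path('C:/WINNT'), "\n";
```

This would replace the test on the minor version number, since the same expression covers both W2K and XP.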

If $OS is a number it's more conventional to use a numeric comparison 
operator.

> } else {
> 	die "There is no hosts file." ; 	#Overwrites the hosts

Do not include comments that are totally unrelated to the code.


>      print `pause` ;

See above.

>      system start=>"C:/Progra~1/SONICWALL/SONICW~1/SWGVpnClient.exe" ;

Don't ya just hate Windows?

>      
>      #pings Domain Controller 10 times 
>      print `ping -n 10 gwdc` ;

See above.


>      # Logon script
>      print `C:/VPN/BAT/VPN.BAT` ;

See above.

>      print "Wrote VPN hosts file and executed LOGON script.\n\n" ;
>      print "Press any key to exit.\n" ;
>      print `pause` ;

See above.

>      exit () ;
> 
> }
> close HOSTS

This line is probably not reached - perhaps you should remove some of the 
spurious exit()s?



------------------------------

Date: Tue, 24 Aug 2004 14:48:15 -0400
From: Lou Moran <>
Subject: Re: Can this code be better?
Message-Id: <gi2ni099e15i7n0vgo94cqtgjo9l8ue269@4ax.com>

On Tue, 24 Aug 2004 18:51:22 +0100, Brian McCauley <nobull@mail.com>
wrote:

SNIP
>
>> use Win32 ;
>> my ($string, $major, $minor, $build, $id) = Win32::GetOSVersion() ; 
>> my $OS = ($minor) ; #Equals 1 for XP 0 for W2K
>
>There is no need to use 5 variables on the LHS.  You can use undef as a 
>dummy or use a list slice.
>
>my (undef, undef, $OS) = Win32::GetOSVersion() ;
>
>or
>
>my $OS = (Win32::GetOSVersion())[2]
>
>

I will try that... I was finding that if I only used 1 variable I was
not getting $minor to correctly report.  Apparently GetOSVersion gives
you 5 answers whether you need them or not.


SNIP
>Use indentation to show the structure of your code.  Don't just indent 
>  for no reason.
>

I think I may have started a loop there and then removed it and my
editor left the indentation and so did I...


>>      chomp $loc;
> >      $loc =~ tr/A-Z/a-z/;	#needlessly sets lowercase
>
>There's nothing wrong with that as three lines but the usual idiom is:
>
>   chomp my $loc = lc <STDIN>;
>
>Er - why do you have code that is commented as needless?

I will replace with your one line chomp.  I was undoubtedly "trying"
something and then commented it so I would remember what it did and
why it was there.  That's why I think I am seeing the #print code
every now and then... I must have been checking to see if it worked.
Now I rewrite the code and test it by itself. 

SNIP
>>           print `pause` ;
>
>While there can sometimes be a reason to do that, it's far more likely 
>that you should simply say:
>
>   system('pause');
>

Yes you are correct and this has been fixed.

>>      $loc =~ s/vpn/v/;		#needlessly sets STDIN to one letter
>>      $loc =~ s/not/n/;		#needlessly sets STDIN to one letter
>>      # print $loc;
>
>Er - why do you have code that is commented as needless?
>
If I don't comment it I will forget what it does in 6 mos when I open
this back up.

>> if ($OS eq "0") {
>> 	open (HOSTS, '+> C:/WINNT/system32/drivers/etc/hosts') ; #W2K
>> } elsif ($OS eq "1") {
>> 	open (HOSTS, '+> C:/WINDOWS/system32/drivers/etc/hosts') #XP 
>
>You should always, yes always, check the return value from open().
>
How do you mean?

>Why are you opening these files read/write?
>
To intentionally clobber them.  I want either one host file or the
other.

>I also happen to know that you've potentially got wrong directory names 
>there (since the directory name is an option during manual install).  The 
>correct directory can be found in an environment variable.  This of 
>course has nothing to do with Perl.

A possibility, and something I should look to automate.  Maybe a "Find
the Hosts file" type function.

>
>If $OS is a number it's more conventional to use a numeric comparison 
>operator.
>
>> } else {
>> 	die "There is no hosts file." ; 	#Overwrites the hosts
>
>Do not include comments that are totally unrelated to the code.

I definitely over comment.  I feel like I have to in order to be able
to reuse the code I write.  Some of it is cargo culted and some of it
is divined through perldocs/ORA/c.l.p.m. and it is likely I will not
retain its meaning when I revisit it months later.


>> close HOSTS
>
>This line is probably not reached - perhaps you should remove some of the 
>spurious exit()s?

Done.  


Thank you for your comments.


------------------------------

Date: Tue, 24 Aug 2004 19:46:58 -0000
From: "David K. Wall" <dwall@fastmail.fm>
Subject: Re: Can this code be better?
Message-Id: <Xns954FA08CE2B3Fdkwwashere@216.168.3.30>

Brian McCauley <nobull@mail.com> wrote in message
<news:cgfv6r$72k$1@slavica.ukpost.com>: 

> Lou Moran wrote:

[snip]

> Use indentation to show the structure of your code.  Don't just
> indent for no reason.

[snip]
>>      if ($loc eq "vpn" or $loc eq "not") {
>>           } else {
>>           print "\n******\nNo changes made to hosts file.\n" ;
>>           print "VPN not activated!\n******\n" ;
>>           print "Press any key to exit.\n" ;
>>           print `pause` ;

-- and especially don't make the indentation misleading. Someone used 
to reading properly-indented code might miss that cuddled 'else' in 
there. Perhaps the condition should be negated:

    if (not ($loc eq "vpn" or $loc eq "not")) {
        # print stuff and exit
    }

or rewrite it using de Morgan's laws, or use an 'unless' block, or 
something other than misleading indentation.
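
The three alternatives mentioned above can be sketched side by side. This is only an illustration; the sub names are invented, and each returns 1 when the input should be rejected.

```perl
use strict;
use warnings;

# Negated condition.
sub rejected_negated {
    my ($loc) = @_;
    if (not ($loc eq "vpn" or $loc eq "not")) {
        return 1;
    }
    return 0;
}

# De Morgan: not (A or B) is the same as (not A) and (not B).
sub rejected_de_morgan {
    my ($loc) = @_;
    return ($loc ne "vpn" and $loc ne "not") ? 1 : 0;
}

# 'unless' block.
sub rejected_unless {
    my ($loc) = @_;
    unless ($loc eq "vpn" or $loc eq "not") {
        return 1;
    }
    return 0;
}

for my $loc (qw(vpn not maybe)) {
    print "$loc: ", rejected_negated($loc), rejected_de_morgan($loc),
          rejected_unless($loc), "\n";
}
```

All three agree for every input, so the choice is purely about which one reads best.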


------------------------------

Date: Tue, 24 Aug 2004 16:16:50 -0400
From: Lou Moran <>
Subject: Re: Can this code be better?
Message-Id: <ng8ni05cs0atkgiqvflhvgh43anqksnmvs@4ax.com>

On Tue, 24 Aug 2004 19:46:58 -0000, "David K. Wall"
<dwall@fastmail.fm> wrote:
SNIP

>Perhaps the condition should be negated:
>
>    if (not ($loc eq "vpn" or $loc eq "not")) {
>        # print stuff and exit
>    }
>
That is nicer thank you.


------------------------------

Date: Tue, 24 Aug 2004 21:33:26 GMT
From: "John W. Krahn" <someone@example.com>
Subject: Re: Can this code be better?
Message-Id: <GuOWc.3924$A8.670@edtnps89>

Brian McCauley wrote:
> Lou Moran wrote:
>>
>>      chomp $loc;
> 
>  >      $loc =~ tr/A-Z/a-z/;    #needlessly sets lowercase
> 
> There's nothing wrong with that as three lines but the usual idiom is:
> 
>   chomp my $loc = lc <STDIN>;

Precedence Brian!  You need parentheses for chomp to do the right thing.

    chomp( my $loc = lc <STDIN> );
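
A small sketch of why the parentheses matter: chomp binds more tightly than the assignment, so without them it would be applied to `my $loc` before anything has been assigned. To keep the example self-contained it reads from an in-memory filehandle instead of STDIN; the sub name `read_choice` is made up for illustration.

```perl
use strict;
use warnings;

# Read one line from $fh, lowercase it and strip the newline.
sub read_choice {
    my ($fh) = @_;
    # The parentheses make chomp act on the result of the assignment.
    chomp( my $loc = lc <$fh> );
    return $loc;
}

my $input = "VPN\n";
open my $in, '<', \$input or die "Cannot open in-memory file: $!";
print read_choice($in), "\n";
```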



John
-- 
use Perl;
program
fulfillment


------------------------------

Date: Tue, 24 Aug 2004 19:13:13 +0100
From: Brian McCauley <nobull@mail.com>
Subject: Re: convertinga directory path into a hash
Message-Id: <cgg0fu$788$1@slavica.ukpost.com>

> I have a unix directory path, say /home/user/mail

Since you don't say where I'll assume it's in $_

I'll also assume you've got a valid absolute path.

> Not knowing how long it will be, that is, how many elements, how can I
> convert it into a sequence of hashes:
> 	$ref->{'home'}->{'user'}->{'mail'}

Assuming that $ref is initially undef and you want that node to be undef...

{
   my $r = \$ref;
   $r = \$$r->{$_} for /[^\/]+/g;
}
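
The same technique wrapped in a sub, as a sketch. The name `path_to_hash` is invented here, and the path is passed as an argument rather than taken from $_.

```perl
use strict;
use warnings;

# Build a nested hash from an absolute path, one level per component.
sub path_to_hash {
    my ($path) = @_;
    my $ref;
    my $r = \$ref;    # reference to the slot currently being filled

    # Taking a reference to $$r->{$_} autovivifies that element,
    # so each pass descends one level deeper.
    $r = \$$r->{$_} for $path =~ m{[^/]+}g;
    return $ref;
}

my $tree = path_to_hash('/home/user/mail');
print join(',', keys %$tree), "\n";               # home
print join(',', keys %{ $tree->{home} }), "\n";   # user
```

The innermost element ($tree->{home}{user}{mail}) exists but is left undef, matching the assumption stated above.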



------------------------------

Date: Tue, 24 Aug 2004 17:58:39 GMT
From: $_@_.%_
Subject: Dereferencing Objects
Message-Id: <jlLWc.7821$Nn2.2997@trndny05>


Hello,

Would someone please help me with this 'dereferencing' problem?
I've searched this group for solutions posted previously and
did find three threads that kept me busy trying different things;
but this unfortunately didn't solve my problem.  This seems like
it will require dereferencing an object that references another
object in another package.

Please reply if you have any helpful information that may solve
this problem.

Here is the example code (the screen output is in comments):

#!
use strict;
use warnings;
use IO::Socket;
use IO::LineBufferedSet;
use Fcntl qw(:DEFAULT :flock);

#Declarations#
my ($listen_socket, $session_set,);

#Main#
$listen_socket = IO::Socket::INET->new(LocalPort => 11111,
                                       Timeout   => 32,
                                       Listen    => 64,
                                       Reuse     => 1,
                                       Proto     => 'tcp',);
if ($listen_socket) {
    $session_set = IO::LineBufferedSet->new($listen_socket);
}else{
    die "\aCan't create a listening socket\n$@";
}

#Mainloop#
while (1) {
    my @ready = $session_set->wait;
    
    my @sessions = $session_set->sessions();
    print "$sessions[0]\n";

    #This is the output from the above command:
    #IO::LineBufferedSessionData=HASH(0x1c6b06c)
}




------------------------------

Date: Tue, 24 Aug 2004 18:58:45 GMT
From: Ala Qumsieh <notvalid@email.com>
Subject: Re: Dereferencing Objects
Message-Id: <FdMWc.11132$6b7.1656@newssvr27.news.prodigy.com>

$_@_.%_ wrote:

> #!
> use strict;
> use warnings;
> use IO::Socket;
> use IO::LineBufferedSet;
> use Fcntl qw(:DEFAULT :flock);
> 
> #Declarations#
> my ($listen_socket, $session_set,);
> 
> #Main#
> $listen_socket = IO::Socket::INET->new(LocalPort => 11111,
>                                        Timeout   => 32,
>                                        Listen    => 64,
>                                        Reuse     => 1,
>                                        Proto     => 'tcp',);
> if ($listen_socket) {
>     $session_set = IO::LineBufferedSet->new($listen_socket);
> }else{
>     die "\aCan't create a listening socket\n$@";
> }
> 
> #Mainloop#
> while (1) {
>     my @ready = $session_set->wait;
>     
>     my @sessions = $session_set->sessions();
>     print "$sessions[0]\n";
> 
>     #This is the output from the above command:
>     #IO::LineBufferedSessionData=HASH(0x1c6b06c)

Correct. That's just what the docs for IO::LineBufferedSet say:

<quote>
=item @sessions = $set->sessions

The sessions() method returns a list of IO::LineBufferedSessionData
objects, each one corresponding to a handle either added manually with
add(), or added automatically by wait().
</quote>

So your @sessions array will hold a list of IO::LineBufferedSessionData 
objects (which are just blessed hashes). Check the docs of 
IO::LineBufferedSessionData for how to extract information from it.

--Ala



------------------------------

Date: Tue, 24 Aug 2004 21:42:20 GMT
From: $_@_.%_
Subject: Re: Dereferencing Objects
Message-Id: <0DOWc.7940$Ff2.3050@trndny06>


Ala Qumsieh <notvalid@email.com> wrote in message-id:
<FdMWc.11132$6b7.1656@newssvr27.news.prodigy.com>
>
>$_@_.%_ wrote:
>
>> #!
>> use strict;
>> use warnings;
>> use IO::Socket;
>> use IO::LineBufferedSet;
>> use Fcntl qw(:DEFAULT :flock);
>>
>> #Declarations#
>> my ($listen_socket, $session_set,);
>>
>> #Main#
>> $listen_socket = IO::Socket::INET->new(LocalPort => 11111,
>>                                        Timeout   => 32,
>>                                        Listen    => 64,
>>                                        Reuse     => 1,
>>                                        Proto     => 'tcp',);
>> if ($listen_socket) {
>>     $session_set = IO::LineBufferedSet->new($listen_socket);
>> }else{
>>     die "\aCan't create a listening socket\n$@";
>> }
>>
>> #Mainloop#
>> while (1) {
>>     my @ready = $session_set->wait;
>>
>>     my @sessions = $session_set->sessions();
>>     print "$sessions[0]\n";
>>
>>     #This is the output from the above command:
>>     #IO::LineBufferedSessionData=HASH(0x1c6b06c)
>
>Correct. That's just what the docs for IO::LineBufferedSet say:
>
><quote>
>=item @sessions = $set->sessions
>
>The sessions() method returns a list of IO::LineBufferedSessionData
>objects, each one corresponding to a handle either added manually with
>add(), or added automatically by wait().
></quote>
>
>So your @sessions array will hold a list of IO::LineBufferedSessionData
>objects (which are just blessed hashes). Check the docs of
>IO::LineBufferedSessionData for how to extract information from it.
>
>--Ala

Thanks for the reply, appreciate it.

Read the docs and looked at the module, there apparently
is no method provided to access this information.
IO::LineBufferedSessionData appears to be some sort of wrapper for
another module named IO::SessionData.

I'm a bit fuzzy on this..  I've not started OO with the exception of
separating data stuff from function stuff.

Is there another way to dereference this data?




------------------------------

Date: Tue, 24 Aug 2004 22:00:43 GMT
From: Ala Qumsieh <notvalid@email.com>
Subject: Re: Dereferencing Objects
Message-Id: <fUOWc.7815$QJ3.3036@newssvr21.news.prodigy.com>

$_@_.%_ wrote:

> Read the docs and looked at the module, there apparently
> is no method provided to access this information.

What information are you talking about exactly? :)
What is it that you want to get to?

> IO::LineBufferedSessionData appears to be some sort of wrapper for
> another module named IO::SessionData.

It is a sub-class of IO::SessionData which means it inherits any methods 
and package vars defined in IO::SessionData.

> I'm a bit fuzzy on this..  I've not started OO with the exception of
> separating data stuff from function stuff.

	perldoc perltoot

is a good start.

> Is there another way to dereference this data?

It's simply a hash. You can use keys() to access its keys, and so on. 
But I don't advise this. It's best to stick with its API. I haven't used 
it before, but it seems to be encapsulating some sort of handle. The 
following methods are defined:

	read()
	getline()
	write()
	close()

among others. I presume you can do everything you want to do with a 
handle using those methods alone. Or are you looking for something else?
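
To make the "blessed hash" and sub-class points concrete, here is a minimal sketch. It uses two made-up classes (Demo::SessionData and Demo::LineBufferedSessionData) rather than the real modules, so nothing here should be read as their actual API.

```perl
use strict;
use warnings;

# Hypothetical base class: the object is just a blessed hash.
package Demo::SessionData;
sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}
sub handle { return $_[0]{handle} }

# Hypothetical sub-class: inherits new() and handle() through @ISA.
package Demo::LineBufferedSessionData;
our @ISA = ('Demo::SessionData');

package main;

my $session = Demo::LineBufferedSessionData->new(handle => 'sock');

print "$session\n";          # prints the Class=HASH(0x...) form seen earlier
print ref($session), "\n";   # the class the hash was blessed into
print "inherits handle()\n" if $session->can('handle');
print $session->handle, "\n";       # preferred: go through a method
print join(',', keys %$session), "\n";   # possible, but bypasses the API
```

Printing the object itself only ever shows its class and address; the data is reached through methods, or through the hash keys as a last resort.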

--Ala


------------------------------

Date: 24 Aug 2004 13:45:24 -0700
From: blackhole@diediedie.org (Andy Signer)
Subject: How to unzip backup files of RaQ4 server?
Message-Id: <d416735.0408241245.3adfc27d@posting.google.com>

Hi,
tonight I was searching for a script to extract *.raq files. I wasn't
able to get one on the inet. The only thing I found was the script
written by Jeff Bilicki for RaQ2. I customised it for RaQ 4. May it be
helpful to others.

Here it comes (quick 'n dirty)!

Have fun ;-)
Andy


#!/usr/bin/perl
# Andy Signer <blackhole at diediedie.org>
# removes the header out of a RaQ 4 backup file (*.raq) and write a
# normal tar.gz archive
#
# Original version by:
# Jeff Bilicki <jeffb at cobaltnet.com>
# removes the header out of a RaQ 2 backup file
###############################################################################
use strict;

my $infile;
my $outfile = "out.tar.gz";
my $header_start ="\%\%BACKUP_HEADER";
my $header_end = "\%\%END_XML";

if (@ARGV) {
        $infile = $ARGV[0];
} else {
        print "usage: stripheader.pl <file name>\n";
        exit 1;
}

open (INFILE, $infile) or die "Can't open: $!\n";
open (OUTFILE, ">$outfile") or die "Can't open $!\n";

while (<INFILE>) {
        if ( /^$header_start/ ... /^$header_end/ ) {
                next;
        }
        print OUTFILE $_;
}

close(INFILE);
close(OUTFILE);
exit 0;


------------------------------

Date: 24 Aug 2004 14:49:59 -0700
From: tony@paradoxcommunity.com (Tony McGuire)
Subject: Parsing FileName for upload
Message-Id: <f896a829.0408241349.46d52e77@posting.google.com>

If a user selects a file on a Windows box, with IE at least, the FULL
PATH of the file on the user's system is transmitted to the server.

If the user does the same thing using Opera on a Linux box, and
apparently Firebird, then only the specific filename gets transmitted
to the server.

I've been going batty trying to figure out a routine that will detect
when there is a full path sent and parse the file name from that path,
and when there is only a file name sent.

I would dearly appreciate anyone who can help with this.  I found many
references to parsing filenames, but nothing I could translate or make
work for this specific situation.  Although I only went thru a couple
dozen posts, it's true.
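
One possible approach, sketched under the assumption that the client sends either a bare file name or a path using "/" or "\" as separators. The sub name `upload_basename` is made up for illustration. Removing everything up to the last separator handles both cases, because a bare name contains no separator and is left untouched.

```perl
use strict;
use warnings;

# Return the last component of a client-supplied file name,
# whether it arrived as a full path or as a bare name.
sub upload_basename {
    my ($name) = @_;
    $name =~ s{^.*[\\/]}{}s;    # drop everything up to the last / or \
    return $name;
}

print upload_basename('C:\Documents and Settings\tony\report.txt'), "\n";
print upload_basename('/home/tony/report.txt'), "\n";
print upload_basename('report.txt'), "\n";
```

The limitation is that a Unix file name which legitimately contains a backslash would be cut short.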


------------------------------

Date: 24 Aug 2004 13:26:29 -0700
From: "Scott  Gilpin" <sgilpin@gmail.com>
Subject: Performance Improvement of complex data structure (hash of hashes of hashes)
Message-Id: <cgg89l$6fv@odah37.prod.google.com>

Hi everyone -

I'm trying to improve the performance (runtime) of a program that
processes large files.  The output of the processing is some fixed
number of matrices (that can vary between invocations of the program),
each of which has a different number of rows, and the same number of
columns.  However, the number of rows and columns may not be known
until the last row of the original file is read.  The original file
contains approximately 100 million rows.  Each individual matrix has
between 5 and 200 rows, and between 50 and 10000 columns.  The data
structure I'm using is a hash of hashes of hashes that stores this
info.   N is the total number of columns, M1 is the total number of
rows in matrix #1, M2 is the total number of rows in matrix 2, etc,
etc.  The total number of matrices is between 3 and 15.


matrix #1 => row name 1  => col name 1 => value of 1,1
                            col name 2 => value of 1,2
                            .....
                            col name N => value of 1,N
             row name 2  => col name 1 => value of 2,1
                            col name 2 => value of 2,2
                            .....
                            col name N => value of 2,N
             ....
             row name M1 => col name 1 => value of M1,1
                            col name 2 => value of M1,2
                            .....
                            col name N => value of M1,N

matrix #2 => row name 1  => col name 1 => value of 1,1
                            col name 2 => value of 1,2
                            .....
                            col name N => value of 1,N
             ....
             row name M2 => col name 1 => value of M2,1
                            col name 2 => value of M2,2
                            .....
                            col name N => value of M2,N

etc, etc...

Here is the code that I'm using to build up this data structure.  I'm
running perl version 5.8.3 on solaris 8 (sparc processor).  The system
is not memory bound or cpu bound - this program is really the only
thing that runs.  There are several gigabytes of memory, and this
program doesn't grow bigger than around 100 MB.  Right now the run time
for the following while loop with 100 million rows of data is about 6
hours.  Any small improvements would be great.

## loop to process each row of the original data
while(<INDATA>)
{
    chomp($_);

    ## Each row is delimited with |
    my @original_row = split(/\|/o,$_);

    ## The cell value and the column name are always in the same position
    my $cell_value = $original_row[24];
    my $col_name = $original_row[1];

    ## Add this column name to the list of ones we've seen
    $columns_seen{$col_name}=1;

    ## For each matrix, loop through and increment the row/column value
    foreach my $matrix (@matrixList)
    {
        ## positionHash tells the position of the value for
        ## this matrix in the original data row
        my $row_name = $original_row[$positionHash{$matrix}];
        $matrix_values{$matrix}{$row_name}{$col_name} += $cell_value;
    }

}   ## end while

I tried using DProf & dprofpp,  but that didn't reveal anything
interesting.  I also tried setting the initial size of each hash using
'keys', but this didn't show any improvement.  I could only initialize
the hash of hashes - and not the third level of hashes (since I don't
know the values in the second hash until they are read in from the
file).  I know that memory allocation in C is expensive, as is
re-hashing - I suspect that's what's taking up a lot of the time.

My specific questions are:

Is there a profiler for perl that will produce output with information
about the underlying C function calls?  (eg - malloc)   Or at least
more information than DProf?
Is there a more suitable data structure that I should use?
Is there a way to allocate all the memory I would need at the beginning
of the program, to eliminate subsequent memory allocation and
rehashing?  (My system has plenty of memory)
Anything else I'm missing?
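
One thing that might be worth trying, offered only as an unmeasured sketch: the per-matrix lookups `$positionHash{$matrix}` and `$matrix_values{$matrix}` give the same result on every row, so they can be done once before the loop. The sub name `accumulate` and its parameter list are invented here; the field indices 1 and 24 are taken from the loop shown earlier.

```perl
use strict;
use warnings;

# Same accumulation as the loop shown earlier, with the per-matrix
# hash lookups hoisted out of the per-row loop.
sub accumulate {
    my ($fh, $matrix_list, $position_of) = @_;
    my (%columns_seen, %matrix_values);

    # Done once: field position and top-level hash for each matrix.
    my @positions = @{$position_of}{@$matrix_list};
    my @tables    = map { $matrix_values{$_} ||= {} } @$matrix_list;

    while (my $line = <$fh>) {
        chomp $line;
        my @row = split /\|/, $line;
        my ($col_name, $cell_value) = @row[1, 24];
        $columns_seen{$col_name} = 1;

        for my $i (0 .. $#positions) {
            $tables[$i]{ $row[ $positions[$i] ] }{$col_name} += $cell_value;
        }
    }
    return (\%matrix_values, \%columns_seen);
}
```

Whether this helps noticeably would have to be measured on the real data; it saves two hash lookups per matrix per row but leaves the split and the nested-hash update untouched.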

Thanks in advance.
Scott



------------------------------

Date: Tue, 24 Aug 2004 13:46:38 +0000 (UTC)
From: doctor@doctor.nl2k.ab.ca (The Doctor)
Subject: perl 5.8.5
Message-Id: <cgfgru$m7f$6@gallifrey.nk.ca>

1)  On BSD/OS 4.3.1 I can compile a non-threaded perl with success but 
  a non-thread perl turns up

doctor.nl2k.ab.ca//usr/source/perl-5.8.5$ make -t
touch lib/auto/B/B.so
*** couldn't touch lib/auto/B/B.so: No such file or directory
touch lib/auto/Encode/Encode.so
*** couldn't touch lib/auto/Encode/Encode.so: No such file or directory
touch lib/auto/List/Util/Util.so
*** couldn't touch lib/auto/List/Util/Util.so: No such file or directory
touch lib/auto/SDBM_File/SDBM_File.so
*** couldn't touch lib/auto/SDBM_File/SDBM_File.so: No such file or directory
touch extras.make
touch all

Why am I having this problem on BSD/OS 4.3.1?

2)  Both neomail and openwebmail do not work well under perl 5.8.5 .

openwebmail:

doctor.nl2k.ab.ca//var/www/cgi-bin/openwebmail$ ^sb^b
/usr/bin/suidperl -T openwebail-prefs.pl --init
sperl needs fd script
You should not call sperl directly; do you need to change a #! line
from sperl to perl?

neomail from the error_log

[Tue Aug 24 07:45:40 2004] [error] [client 216.95.238.94] Can't access() script

Did someone not follow the new security trends in perl?
-- 
Member - Liberal International	
This is doctor@nl2k.ab.ca	Ici doctor@nl2k.ab.ca
God Queen and country! Beware Anti-Christ rising!
Microsoft is not the solution; it is the question; what is the answer?? NO!!


------------------------------

Date: 24 Aug 2004 11:11:19 -0700
From: krakle@visto.com (krakle)
Subject: Re: Perl Search & Replace Script For Website
Message-Id: <237aaff8.0408241011.5407b8d9@posting.google.com>

Michael4172@hotmail.com (Michael) wrote in message news:<9dcd9df6.0408231730.f586eb9@posting.google.com>...
> Anyone know of a script, where whenever a page is published, it's
> scanned and if a keyword I define in another area is spotted it
> automatically becomes a link?  Then any other occurrences of it
> afterwards in the same article are regular text?

I don't think you will find a script already written to do such a unique task.

>  
> If that didn't make sense, let me show an example.
>  
> "Johnny (<-- that would be a link since I predefined it, in a script
> or what not, upon clicking it would go to his profile)  did something
> good.  Johnny Estrada (<-- this would be plain text since its the 2nd
> occurrence in the same article) did something else."
>  
> Hope that gives you an idea of what I'm trying to do. Any help would
> be appreciated :)

Oh yea... It's very clear now...


------------------------------

Date: Tue, 24 Aug 2004 21:07:22 +0200
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: Perl Search & Replace Script For Website
Message-Id: <2p1hv2Ffns0fU1@uni-berlin.de>

Michael wrote:
> Anyone know of a script, where whenever a page is published, it's 
> scanned and if a keyword I define in another area is spotted it 
> automatically becomes a link?  Then any other occurrences of it 
> afterwards in the same article are regular text?
> 
> If that didn't make sense, let me show an example.
> 
> "Johnny (<-- that would be a link since I predefined it, in a
> script or what not, upon clicking it would go to his profile)  did
> something good.  Johnny Estrada (<-- this would be plain text since
> it's the 2nd occurrence in the same article) did something else."
> 
> Hope that gives you an idea of what I'm trying to do.

I have seen that functionality in the content management system
PostNuke (the "Autolinks" module). The substitution takes place
dynamically as a part of the page generating.

-- 
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl


------------------------------

Date: Tue, 24 Aug 2004 13:54:02 -0400
From: "GrindKore" <IDontLike@Spam.com>
Subject: RegExp Pattern Question
Message-Id: <YdLWc.1$U75.0@animal.ultrafeed.com>

Hello, I can't get my mind around a RegExp pattern that can match the file name in
the subject field of a binary post.

For example, some yEnc-encoded binary posts include the filename in "":
3dttv8 par2 - "3dttv8.vol025+22.PAR2" yEnc (20/24)
and some do not
As Req: EasyBoot Systems SoftDisc 2.51.rar (04/12)


How do other binary newsreaders like NewsBin extract filenames from post
subject?


Your comments are appreciated.




------------------------------

Date: Tue, 24 Aug 2004 18:02:57 GMT
From: $_@_.%_
Subject: Re: RegExp Pattern Question
Message-Id: <lpLWc.7825$Nn2.1424@trndny05>


"GrindKore" <IDontLike@Spam.com> wrote in message-id:
<YdLWc.1$U75.0@animal.ultrafeed.com>
>
>Hello, I can't get my mind around a RegExp pattern that can match the file name in
>the subject field of a binary post.
>
>For example, some yEnc-encoded binary posts include the filename in "":
>3dttv8 par2 - "3dttv8.vol025+22.PAR2" yEnc (20/24)
>and some do not
>As Req: EasyBoot Systems SoftDisc 2.51.rar (04/12)
>
>
>How do other binary newsreaders like NewsBin extract filenames from post
>subject?
>
>
>Your comments are appreciated.

I think my newsreader does something like this:
  # $subject is assumed to hold the Subject header of the post
  if ($subject =~ m/(.+)[(\[\{]+?(\d+)[\/\-]+?(\d+)[)\]\}]+?(.*)/) {
      #$1 = subj, $2 = part, $3 = total, $4 = more subj
      my $newsubj = $1.$4;
  }

I'm sorry this is so vague, I've not worked on that bit of NewsSurfer in
some time.
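
A simpler sketch for the two sample subjects in the question. It is a heuristic, not how any particular newsreader works. It takes the file name only when it is quoted and reads the part counter from a trailing "(n/m)"; the sub name `parse_subject` is invented for the example.

```perl
use strict;
use warnings;

# Return (filename, part, total) guessed from a Subject line.
# filename is undef when the subject has no quoted name.
sub parse_subject {
    my ($subject) = @_;
    my ($name)          = $subject =~ /"([^"]+)"/;
    my ($part, $total)  = $subject =~ m{\((\d+)/(\d+)\)\s*$};
    return ($name, $part, $total);
}

for my $subject (
    '3dttv8 par2 - "3dttv8.vol025+22.PAR2" yEnc (20/24)',
    'As Req: EasyBoot Systems SoftDisc 2.51.rar (04/12)',
) {
    my ($name, $part, $total) = parse_subject($subject);
    printf "%s part %d of %d\n",
        defined $name ? $name : '(no quoted name)', $part, $total;
}
```

For unquoted subjects there is no reliable way to tell where the file name starts, which is why the sketch returns undef rather than guessing.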




------------------------------

Date: Tue, 24 Aug 2004 18:52:22 GMT
From: Joe Smith <Joe.Smith@inwap.com>
Subject: Re: RegExp Pattern Question
Message-Id: <G7MWc.52165$Fg5.50512@attbi_s53>

GrindKore wrote:

> How do other binary newsreaders like NewsBin extract filenames from post
> subject?

I don't know about NewsBin, but other newsreader programs use the Subject
line only for the purpose of putting the article in the right order.
The actual file name can be found in the first few lines in the body
of the first post.
	-Joe


------------------------------

Date: Tue, 24 Aug 2004 15:20:36 -0400
From: "GrindKore" <IDontLike@Spam.com>
Subject: Re: RegExp Pattern Question
Message-Id: <6vMWc.25$U75.18@animal.ultrafeed.com>


"Joe Smith" <Joe.Smith@inwap.com> wrote in message
news:G7MWc.52165$Fg5.50512@attbi_s53...
> GrindKore wrote:
>
> > How do other binary newsreaders like NewsBin extract filenames from post
> > subject?
>
> I don't know about NewsBin, but other newsreader programs use the Subject
> line only for the purpose of putting the article in the right order.
> The actual file name can be found in the first few lines in the body
> of the first post.
> -Joe

Yeah I got the ordering part working pretty good, I hoped I could "guess"
the filename from the subject
because that way I only have to scan xover data for a given group, thus
saving a lot of bandwidth.




------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 6905
***************************************

