[19713] in Perl-Users-Digest


Perl-Users Digest, Issue: 1908 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Oct 10 21:05:31 2001

Date: Wed, 10 Oct 2001 18:05:08 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <1002762308-v10-i1908@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Wed, 10 Oct 2001     Volume: 10 Number: 1908

Today's topics:
        bitwise XOR and funny output (Marlon Jackson)
    Re: bitwise XOR and funny output (Martien Verbruggen)
        DOS Filename > 8.3 <LindaTuner@hotmail.com>
    Re: excel (Malcolm Dew-Jones)
    Re: Keeping an escape code in a line <iltzu@sci.invalid>
    Re: Linux vs DOS(Win98) behaviour <bart.lateur@skynet.be>
    Re: Linux vs DOS(Win98) behaviour <tim@vegeta.ath.cx>
    Re: Optimizing lookup in hash of regexps (Clinton A. Pierce)
    Re: Optimizing lookup in hash of regexps:  15 times fas <markus.cl@gmx.de>
    Re: Pattern Matching (shaz)
    Re: Pattern Matching <jurgenex@hotmail.com>
    Re: Pattern Matching <jeff@vpservices.com>
    Re: perlsec + taint <Juha.Laiho@iki.fi>
    Re: perlsec + taint (Garry Williams)
    Re: Reformat Chain <dtweed@acm.org>
    Re: Regular Expressions (Garry Williams)
    Re: sort of array containing strings doesn't work <djberg96@hotmail.com>
    Re: Sorting larg arrays <joe+usenet@sunstarsys.com>
    Re: Stop Transversal of a Directory with Tar and Unzip (Randal L. Schwartz)
    Re: YOU ARE ALL GAY! <RevolvingDoors@btopenworld.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 10 Oct 2001 17:11:43 -0700
From: marlon_jackson@yahoo.com (Marlon Jackson)
Subject: bitwise XOR and funny output
Message-Id: <a1899594.0110101611.39b1ba63@posting.google.com>

I'm a little confused about the output I'm getting from a perl script
using bitwise XOR.

          This:

#/usr/bin/perl

$mystring = "Decrypt this!";

$mykey1 = "asdfasdfasdfasdfasd";
$mykey2 = "jkl;";

$myencryptedstring = ($mystring ^ $mykey1) ^ $mykey2;

$mydecryptedstring = ($myencryptedstring ^ $mykey2) ^ $mykey1;

print "\nString is: $mystring\n";
print "Encrypted string is: $myencryptedstring\n";
print "Decrypted string is: $mydecryptedstring\n";

           Yields this:

String is: Decrypt this!
§@sdfasdd string is: O}k/↑♥►F§←
Decrypted string is: Decrypt this!

Or something similar, depending on the OS. Why is my static string in
the second output line getting overwritten? Is it because of those
unprintable chars? or because $mykey1 is a longer string than
$mystring?


------------------------------

Date: Thu, 11 Oct 2001 00:59:10 GMT
From: mgjv@tradingpost.com.au (Martien Verbruggen)
Subject: Re: bitwise XOR and funny output
Message-Id: <slrn9s9rmt.l88.mgjv@verbruggen.comdyn.com.au>

On 10 Oct 2001 17:11:43 -0700,
	Marlon Jackson <marlon_jackson@yahoo.com> wrote:
> I'm a little confused about the output I'm getting from a perl script
> using bitwise XOR.
> 
>           This:
> 
> #/usr/bin/perl

Are you sure you copied and pasted this? That first line doesn't look
like a valid she-bang to me.

> print "Encrypted string is: $myencryptedstring\n";

>            Yields this:

> §@sdfasdd string is: O}k/↑♥►F§←

> Or something similar, depending on the OS. Why is my static string in
> the second output line getting overwritten? Is it because of those
> unprintable chars? or because $mykey1 is a longer string than
> $mystring?

It is because one of the characters in what you're printing is a
carriage return (hex 0d), which on most terminals will position the
cursor at the start of the line. You could have had much more nasty
garbage in there, but you got lucky.

If you want to look at a string like this, maybe you should do
something like:

print "Encrypted string is: ";
printf "%02x ", ord $_ for split //, $myencryptedstring;
print "\n";

or if you don't like the multiple prints:

print "Encrypted string is: ",
	(map { sprintf "%02x ", ord $_ } split //, $myencryptedstring),
	"\n";

I'm not sure whether all this will work correctly under use locale,
and stuff like that.
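
To locate the carriage return directly, here is a minimal sketch that
repeats the XOR from the original script and dumps the result in hex.
The names $plain, $enc and $cr_at are made up for illustration, and the
sketch assumes plain string operands (no "use feature 'bitwise'"), so
that ^ works byte by byte.

```perl
use strict;
use warnings;

# Same strings as in the original post.
my $plain = "Decrypt this!";
my $enc   = ($plain ^ "asdfasdfasdfasdfasd") ^ "jkl;";

# Offset of the first carriage return (0x0d), or -1 if there is none.
my $cr_at = index($enc, "\r");

# One two-digit hex group per byte.
my $hex = join ' ', unpack('(H2)*', $enc);

print "length: ", length($enc), "\n";
print "carriage return at offset: $cr_at\n";
print "hex: $hex\n";
```

The byte at offset 10 comes from 'i' ^ 'd' (0x69 ^ 0x64 = 0x0d).  The
bytes from offset 13 on are the unmatched tail of $mykey1, because the
shorter operand of a string ^ is treated as if padded with NUL bytes.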

Martien
-- 
Martien Verbruggen              | 
Interactive Media Division      | The world is complex; sendmail.cf
Commercial Dynamics Pty. Ltd.   | reflects this.
NSW, Australia                  | 


------------------------------

Date: Wed, 10 Oct 2001 23:59:50 GMT
From: "Linda Turner" <LindaTuner@hotmail.com>
Subject: DOS Filename > 8.3
Message-Id: <Wh5x7.445$YC3.200529@typhoon.southeast.rr.com>

With a lot of help from this newsgroup, I have written a script to
sweep through a directory and update some modules in place.  However,
it won't recognize Win NT files with names longer than the
traditional 8.3 format.  Can someone tell me how I can get it to
recognize them?


#!/usr/local/bin/perl -w

use File::Find;
use strict;

$/ = undef;

my $start_dir = 'C\\SCRIPTS';

File::Find::find( \&each_file, $start_dir );

sub each_file {
  return unless /\.sql$/;
  if ( -T $_ ) {
    print "processing $File::Find::name : ";
    if ( open F, "< $_" ) {
      my $s = <F>;
      close F;
      my $match_count = 0;
      if ( open F, "> $_" ) {
        $match_count += ($s =~ s/COLUMN1/COLUMN2/ig);
        print F $s;
        close F;
        print "$match_count\n";
      }
      else {
        print "Error opening file for write: $!\n";
      }
    }
    else {
      print "Error opening file for read: $!\n";
    }
  }
}







------------------------------

Date: 10 Oct 2001 16:39:14 -0800
From: yf110@vtn1.victoria.tc.ca (Malcolm Dew-Jones)
Subject: Re: excel
Message-Id: <3bc4dc22@news.victoria.tc.ca>

Jessica Bull (jessica.bull@broadwing.com) wrote:
: Sorry for all of the newbie questions...thanks for helping.

: I am parsing through lines of data and pulling specific pieces to populate a
: spreadsheet.  I am on Win NT

: What is the best way to populate an excel spreadsheet?  it is really basic
: and should look something like this:

: DATE        FILENAME        PROCESS#        COMPLETION CODE

: I want it to append to the same file every time.  So I have 4 columns and an
: indefinite number of rows.  Any ideas?


By far the easiest is to store the spread sheet as a simple tab delimited
text file.

As long as no one uses excel to save the data in a different format then
you can add more data by simply appending it to the file.

You could/should read the first line to make sure it has the expected
format before adding more data, and if the format is wrong then exit with
a message "Please re-save the spread sheet first as a tab delimited text
file!". 


Of course, this does not allow any pretty formatting and so on.
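
A minimal sketch of that approach, under some assumptions: the sub name
append_row and the @columns list are made up for illustration (the
column names are taken from the question), and the header check simply
compares the first line against the expected tab-joined header.

```perl
use strict;
use warnings;

# Expected header, taken from the layout in the question.
my @columns = ('DATE', 'FILENAME', 'PROCESS#', 'COMPLETION CODE');

# Append one row to a tab delimited file, creating it with a header
# line if it does not exist yet (or is empty).
sub append_row {
    my ($path, @fields) = @_;
    my $header = join "\t", @columns;

    if (-s $path) {
        # Existing file: make sure the first line still looks right.
        open my $in, '<', $path or die "Cannot read $path: $!";
        my $first = <$in>;
        close $in;
        $first =~ s/\r?\n\z//;
        die "Please re-save the spread sheet first as a tab delimited text file!\n"
            unless $first eq $header;
    }
    else {
        open my $new, '>', $path or die "Cannot create $path: $!";
        print {$new} "$header\n";
        close $new or die "Cannot close $path: $!";
    }

    open my $out, '>>', $path or die "Cannot append to $path: $!";
    print {$out} join("\t", @fields), "\n";
    close $out or die "Cannot close $path: $!";
    return 1;
}
```

A call might look like append_row('report.txt', '2001-10-10',
'input.dat', 4711, 0), where the file name and values are placeholders.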


------------------------------

Date: 10 Oct 2001 22:18:35 GMT
From: Ilmari Karonen <iltzu@sci.invalid>
Subject: Re: Keeping an escape code in a line
Message-Id: <1002751760.8699@itz.pp.sci.fi>

In article <u93d4rl8kh.fsf@wcl-l.bham.ac.uk>, nobull@mail.com wrote:
>
>chop ( $_ = eval "<<__EndOfData__\n$_\n__EndOfData__\n" );
>
>This assumes that the string does not contain "\n__EndOfData__\n".
>This is usually a safe assumption.

If it's not, one can always do this:

  s/(\\*[\$\@\"])/length($1)%2 ? "\\$1" : $1/eg;
  $_ = eval qq("$_");
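
As a sketch of how those two lines behave, here they are wrapped in a
sub.  The name expand_escapes is made up for illustration.  The
substitution puts a backslash in front of any $, @ or " that is not
already escaped, so the string eval only expands backslash escapes and
does not interpolate variables.  It assumes the input does not end in
an unpaired backslash, which would break the generated string literal.

```perl
use strict;
use warnings;

# Expand backslash escapes such as \t and \n in $text, leaving
# literal $, @ and " characters untouched.
sub expand_escapes {
    my ($text) = @_;

    # $1 is a run of backslashes plus one special character.  An odd
    # total length means the character is not yet escaped.
    $text =~ s/(\\*[\$\@\"])/length($1) % 2 ? "\\$1" : $1/eg;

    my $expanded = eval qq("$text");
    die "Could not expand escapes: $@" unless defined $expanded;
    return $expanded;
}

print expand_escapes('cost: $5\tdone "ok"'), "\n";
```

As with any string eval, this should only be used on input whose
source you are prepared to trust.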

-- 
Ilmari Karonen -- http://www.sci.fi/~iltzu/
"Get real!  This is a discussion group, not a helpdesk.  You post something,
we discuss its implications.  If the discussion happens to answer a question
you've asked, that's incidental."           -- nobull in comp.lang.perl.misc



------------------------------

Date: Wed, 10 Oct 2001 22:33:15 GMT
From: Bart Lateur <bart.lateur@skynet.be>
Subject: Re: Linux vs DOS(Win98) behaviour
Message-Id: <jsi9st04ias02elab7pl89gts6p6diavem@4ax.com>

T. Alex Beamish wrote:

>I discovered that the following script worked fine under Linux but
>locked up my machine under DOS:

>#!/usr/bin/perl -w
>
>#  Test program -- This is bugging me, it used to work but now does 
>#  not.
>
>open ( CMD, "dir |" ) or die "Error opening pipe: $!";
>while ( <CMD> )
>{
>  s/</&lt;/g;
>  s/>/&gt;/g;
>
>  print;
>}
>close ( CMD ) or die "Error closing pipe: $!";

It works on my copy of Win98. I hadn't expected that.

I'm not sure why I get a blank line between the normal lines. If I undef
$/, I don't get them. I'd expect CR's to be in there, but deleting them
doesn't help at all. Setting $/ to "\015\012" and then deleting the
returns, does work.

-- 
	Bart.


------------------------------

Date: Thu, 11 Oct 2001 00:04:49 GMT
From: Tim Hammerquist <tim@vegeta.ath.cx>
Subject: Re: Linux vs DOS(Win98) behaviour
Message-Id: <slrn9s9pb2.n9j.tim@vegeta.ath.cx>

It seems to me that Malcolm Dew-Jones <yf110@vtn1.victoria.tc.ca> said:
>  T. Alex Beamish (talexb@tabsoft.on.ca) wrote:
[ snip ]
>  : open ( CMD, "dir |" ) or die "Error opening pipe: $!";
>  : while ( <CMD> )
[ snip ]
>  I guess you must have a "dir" command on linux.

I know my Mandrake distro does.

$ which dir
/usr/bin/dir
$ dir
Desktop  News  build  lib  mp3	public_html  tmp
Mail	 bin   doc    m3u  pix	src	     vault
$

It seems to be equivalent to 'ls -C', but takes all of the same
arguments.  Well, except for ls' behavior when STDOUT isn't a tty.

$ ls /
Desktop  News  build  lib  mp3	public_html  tmp
Mail	 bin   doc    m3u  pix	src	     vault
$ ls / | cat
bin
boot
dev
etc
home
lib
misc
mnt
mp3
opt
proc
root
sbin
tmp
usr
var
$ dir /
bin   dev  home  misc  mp3  proc  sbin	usr
boot  etc  lib	 mnt   opt  root  tmp	var
$ dir / | cat
bin   dev  home  misc  mp3  proc  sbin	usr
boot  etc  lib	 mnt   opt  root  tmp	var
$ ls -l /usr/bin/dir /bin/ls
-rwxr-xr-x    1 root     root        46736 Mar 31  2000 /bin/ls
-rwxr-xr-x    1 root     root        46736 Mar 31  2000 /usr/bin/dir
$ diff /usr/bin/dir /bin/ls
Binary files /usr/bin/dir and /bin/ls differ
$ cmp /usr/bin/dir /bin/ls
/usr/bin/dir /bin/ls differ: char 44905, line 154
$ wc -l < /usr/bin/dir
    158
$ uname -a
Linux vegeta 2.2.15-4mdk #1 Wed May 10 15:31:30 CEST 2000 i686 unknown
$

So, they are almost the same.

(Linux Mandrake 7.1, btw)

-- 
Nearly all men can stand adversity, but if you
want to test a man's character, give him power.
    -- Abraham Lincoln


------------------------------

Date: Wed, 10 Oct 2001 22:32:22 GMT
From: clintp@geeksalad.org (Clinton A. Pierce)
Subject: Re: Optimizing lookup in hash of regexps
Message-Id: <W%3x7.160660$K6.76730750@news2>

[Posted and mailed]

In article <87d73v9re1.fsf@powerhouse.boogie.cx>,
	Matt Christian <mattc@visi.com> writes:
> I have a Perl program that looks up data by matching against a hash of
> regexps (regular expressions) as follows:
> ============================ CUT & PASTE ============================
> [ HACK HACK HACK ]
> ============================ CUT & PASTE ============================

For starters, pre-compiling all those regexes saves a helluva lot of time.
Without changing your algorithm a lot:

my %lookupTable_precomp = (
  '1122'        => [ 'a value', qr/1122/ ],
  '2308'        => [ 'beta', qr/2308/ ],
  '[0-9]?99990' => [ 'read this', qr/[0-9]?99990/ ],
  '.*ERV'       => [ 'a value', qr/.*ERV/ ],
  '[a-z][0-9]+' => [ 'read this', qr/[a-z][0-9]+/],
  'ABCZ.+'      => [ 'alphabet', qr/ABCZ.+/ ],
  '.*random'    => [ 'beta', qr/.*random/ ],
);
sub lookup_pre {
  my $datum = shift;
  my $lookupTable = shift;
  my ($key, $result);
  foreach $key (keys %{ $lookupTable }) {
    if ($datum =~ /$lookupTable->{$key}->[1]/) {
      $result = $lookupTable->{$key}->[0];
      last;
    }
  }
  return $result;
}

PS: Style nit: I despise the re-use of %lookupTable and $lookupTable as
a variable name there.  Now, using this as a test wrapper:

use Benchmark;
my $result;
@data=map { @data } (0..100);  # NEED MORE DATA!!
timethese(10, {
        original => sub {
                        foreach $datum (@data) {
                                $result=lookup($datum, \%lookupTable);
                        }
                },
        precompiled => sub {
                        foreach $datum (@data) {
                                $result=lookup_pre($datum, \%lookupTable_precomp);
                        }
                }
})


This gets you:
  original:  3 wallclock secs ( 3.31 usr +  0.00 sys =  3.31 CPU) @  3.02/s (n=10)
precompiled:  1 wallclock secs ( 0.92 usr +  0.00 sys =  0.92 CPU) @ 10.87/s (n=10)

Which is a pretty snazzy improvement without really trying.

Turning the table inside out like this:

	use re 'eval';
	my @foo=();
	my @names=();
	my $r=0;
	for (keys %lookupTable) {
		push(@foo, "((?{\$a=$r})$_)");
		push(@names, $lookupTable{$_});
		$r++;
	}
	$r=join("|", @foo);  $r=qr/$r/;
	# Creates:
	# /((?{$a=0})foo)|((?{$a=1})bar)|...../

	sub lookup_long {
		my $datum=shift;
		if ($datum =~ /$r/) {
			return $names[$a];
		}
		return;
	}

Gave similar improvements and might perform even better depending on your
particular balance of data and regexes.



PS
Question: Is there ever a chance of having more than one key match a given
piece of data?  Since the keys are retrieved semi-randomly, is that considered
a bug?
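
If the order of the tests matters, one option (a swapped-in technique,
not the code from the thread) is to keep the rules in an array of
pairs instead of a hash, so the first matching rule is always the
same.  The names @rules and lookup_ordered are made up for
illustration; the patterns and values are the ones from the table
above.

```perl
use strict;
use warnings;

# Rules are tried top to bottom; the first match wins.
my @rules = (
    [ qr/1122/        => 'a value'   ],
    [ qr/2308/        => 'beta'      ],
    [ qr/[0-9]?99990/ => 'read this' ],
    [ qr/.*ERV/       => 'a value'   ],
    [ qr/[a-z][0-9]+/ => 'read this' ],
    [ qr/ABCZ.+/      => 'alphabet'  ],
    [ qr/.*random/    => 'beta'      ],
);

# Return the value of the first rule whose regex matches $datum,
# or nothing if no rule matches.
sub lookup_ordered {
    my ($datum) = @_;
    for my $rule (@rules) {
        return $rule->[1] if $datum =~ $rule->[0];
    }
    return;
}

print lookup_ordered('a1 random'), "\n";   # two rules could match here
```

With a hash, 'a1 random' could come back as either 'read this' or
'beta' depending on key order; with the array it is always the earlier
rule.
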
-- 
    Clinton A. Pierce            Teach Yourself Perl in 24 Hours  *and*
  clintp@geeksalad.org                Perl Developer's Dictionary
"If you rush a Miracle Man,     for details, see http://geeksalad.org     
	you get rotten Miracles." --Miracle Max, The Princess Bride


------------------------------

Date: Thu, 11 Oct 2001 02:04:23 +0200
From: "Markus Dehmann" <markus.cl@gmx.de>
Subject: Re: Optimizing lookup in hash of regexps:  15 times faster
Message-Id: <9q2nrc$le54h$1@ID-101658.news.dfncis.de>

--- Original Message -----
From: "Matt Christian" <mattc@visi.com>
Newsgroups: comp.lang.perl.misc
Sent: Wednesday, October 10, 2001 10:22 PM
Subject: Optimizing lookup in hash of regexps


> Hi,
>
> I have a Perl program that looks up data by matching against a hash of
> regexps (regular expressions) as follows:
> ...


Hi,

I have some good advice for you: don't store your data separately in a hash! I
did some experiments with Benchmark.pm.
I ran your code 10000 times and it took 29 secs. Without storing the
data in a hash it took only 2 secs!

If you want a fast script, leave out the lookup function and the lookup hash.
Just use primitive if/elsif statements. Below is the 2-second version. You
can generate such code automatically from your regexps if you want. That way
you can maintain your regexps in a separate file and automatically produce
fast if/elsif code from it.

Markus.

============================ CUT & PASTE ============================
my $datum;
foreach $datum (@data) {
  print "datum = '$datum'\t";

  my $result;
  # Try each regexp against the datum, in a fixed order

  if ($datum =~ /1122/) {
    $result = 'a value';
  }
  elsif ($datum =~ /2308/) {
    $result = 'beta';
  }
  elsif ($datum =~ /[0-9]?99990/) {
    $result = 'read this';
  }
  elsif ($datum =~ /.*ERV/) {
    $result = 'a value';
  }
  elsif ($datum =~ /[a-z][0-9]+/) {
    $result = 'read this';
  }
  elsif ($datum =~ /ABCZ.+/) {
    $result = 'alphabet';
  }
  elsif ($datum =~ /.*random/) {
    $result = 'beta';
  }

  if (defined($result)) {
    print "result = '$result'";
  } else {
    print "NO MATCH!";
  }
  print "\n";
}






------------------------------

Date: 10 Oct 2001 15:52:09 -0700
From: ssa1701@yahoo.co.uk (shaz)
Subject: Re: Pattern Matching
Message-Id: <23e71812.0110101452.7ce04da4@posting.google.com>

I want to test for the occurrence of a particular string within a hash.

The following was suggested:

$y_or_n = grep { index($_, $small) != -1 } keys %strings;

but I have only now discovered that this does not match the string exactly.

If $small was "the", grep would match "there" and "then" also.

Is there any way of matching the exact string only? (i.e. find only "the")


------------------------------

Date: Wed, 10 Oct 2001 15:59:49 -0700
From: "Jürgen Exner" <jurgenex@hotmail.com>
Subject: Re: Pattern Matching
Message-Id: <3bc4d2e5$1@news.microsoft.com>

"shaz" <ssa1701@yahoo.co.uk> wrote in message
news:23e71812.0110101452.7ce04da4@posting.google.com...
> I want to test for the occurrence of a particular string within a hash.
> The following was suggested:
> $y_or_n = grep { index($_, $small) != -1 } keys %strings;
> but I have only now discovered that this does not match the string
> exactly.
> If $small was "the", grep would match "there" and "then" also.
> Is there any way of matching the exact string only? (i.e. find only "the")

Well, yes, sort of. If you insist on using pattern matching you could anchor
the pattern
    m/^the$/
but I wonder why you would want to do that.
The easier solution is obviously to simply compare them:
    $_ eq 'the'
should be much faster.

jue




------------------------------

Date: Wed, 10 Oct 2001 15:58:55 -0700
From: Jeff Zucker <jeff@vpservices.com>
Subject: Re: Pattern Matching
Message-Id: <3BC4D2AF.13200960@vpservices.com>

shaz wrote:
> 
> I want to test for the occurrence of a particular string within a hash.
> 
> The following was suggested:
> 
> $y_or_n = grep { index($_, $small) != -1 } keys %strings;

Umm, what's wrong with

    $y_or_n = defined $strings{$small}; 
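
A small sketch of the difference, with made-up data and made-up helper
names (has_substring, has_key).  It uses exists rather than defined,
on the assumption that a key should count as present even when its
value is undef.

```perl
use strict;
use warnings;

# Sample hash; note that 'other' has an undefined value.
my %strings = (there => 1, then => 1, other => undef);

# Counts keys that merely contain $small somewhere inside them.
sub has_substring {
    my ($hash, $small) = @_;
    return scalar grep { index($_, $small) != -1 } keys %$hash;
}

# True only if $small is itself a key of the hash.
sub has_key {
    my ($hash, $small) = @_;
    return exists $hash->{$small} ? 1 : 0;
}

printf "substring 'the': %d, exact 'the': %d, exact 'other': %d\n",
    has_substring(\%strings, 'the'),
    has_key(\%strings, 'the'),
    has_key(\%strings, 'other');
```

A hash lookup also avoids scanning every key, which is what the grep
version has to do on each test.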

-- 
Jeff


------------------------------

Date: 10 Oct 2001 17:16:27 GMT
From: Juha Laiho <Juha.Laiho@iki.fi>
Subject: Re: perlsec + taint
Message-Id: <9q1vpb$404$1@ichaos.ichaos-int>

perlmisk@yahoo.co.uk (perl misk) said:
>sub get_hostname {
> 
>    $ENV{'PATH'} = '/bin:/usr/bin';  # taint checking :)
>    delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};
> 
>    if ($ENV{HOSTNAME}) { # first attempt
>        return;
>    }
>    else {
> 
>        if (`uname -a` =~m/^\w+\s+(\w+)\s/g) { # second attempt
>            return $1;
>        }
>        else { # failed :(
>            return 'UNKNOWN';
>        }
> 
>    }
>}
>
>my $hostname = get_host_name();

>Is $hostname once again "tainted" as all the changes to ENV happen
>within {}'s so they have expired.

Everything that comes from outside your program is tainted;
- environment variables
- contents of files
- output of external programs
- values of command-line arguments

 ...btw, is $0 tainted? How about value of current working directory?
Just had an idea that perhaps these should be.

But if you know for sure that something is not tainted (like you
might know for $ENV{HOSTNAME}), it is possible to untaint the data.
Even better, even though you might be rather certain, check the
environment variable with a regex. For example, only allow letters, digits,
minus-sign and dot. Additionally, don't allow minus-sign or dot as
the first character. This should already be rather good. See perlsec
for details for untainting data.
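
A minimal sketch of such a check, following the rule described above
(letters, digits, minus-sign and dot, with neither minus-sign nor dot
as the first character).  The sub name untaint_hostname is made up for
illustration.  Under -T, the captured $1 is what counts as untainted,
assuming neither "use locale" nor "use re 'taint'" is in effect.

```perl
use strict;
use warnings;

# Return a checked copy of $name, or undef if it contains anything
# other than letters, digits, '-' and '.', or starts with '-' or '.'.
sub untaint_hostname {
    my ($name) = @_;
    return undef unless defined $name;
    return $name =~ /\A([A-Za-z0-9][A-Za-z0-9.-]*)\z/ ? $1 : undef;
}

for my $candidate ('vegeta.ath.cx', '-rf', 'host;rm') {
    my $clean = untaint_hostname($candidate);
    print "$candidate => ", (defined $clean ? $clean : 'rejected'), "\n";
}
```

The anchors \A and \z are used instead of ^ and $ so that a trailing
newline cannot slip through.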

>Is it "bad practice" to use system or ``'s?

Not as such, but I'd avoid calling any external program for which
the same functionality is available from within Perl. In a busy
machine (or within a tight loop) the speed difference will be
noticeable, so this is a good habit.

A further comment to your code above; if your code will be part of
a CGI script, check whether you actually wish to use $ENV{HOSTNAME}
or $ENV{SERVER_NAME}.
-- 
Wolf  a.k.a.  Juha Laiho     Espoo, Finland
(GC 3.0) GIT d- s+: a C++ UH++++$ UL++++$ P++@ L+++ E(-) W+$@ N++ !K w !O
         !M V PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h--- r+++ y+++
"...cancel my subscription to the resurrection!" (Jim Morrison)


------------------------------

Date: Wed, 10 Oct 2001 22:15:09 GMT
From: garry@ifr.zvolve.net (Garry Williams)
Subject: Re: perlsec + taint
Message-Id: <slrn9s9i3c.g7n.garry@zfw.zvolve.net>

On 10 Oct 2001 01:33:13 -0700, perl misk <perlmisk@yahoo.co.uk> wrote:
> sub get_hostname {
>  
>     $ENV{'PATH'} = '/bin:/usr/bin';  # taint checking :)
>     delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};
>  
>     if ($ENV{HOSTNAME}) { # first attempt
>         return;
>     }
>     else {
>  
>         if (`uname -a` =~m/^\w+\s+(\w+)\s/g) { # second attempt
>             return $1;
>         }
>         else { # failed :(
>             return 'UNKNOWN';
>         }
>     }
> }
> 
> my $hostname = get_host_name();
> 
> Is $hostname once again "tainted" as all the changes to ENV happen
> within {}'s so they have expired.

%ENV is a package variable and it was not local()ized in the sub, so
the change to its value made in the sub affects the package variable.
The value has not "expired".  

But the value(s) in %ENV have nothing to do with the taintedness of
the $hostname variable.  

The value assigned to $hostname is either the value of an environment
variable or a substring of the output of an external command or a
constant.  The first possibility will be considered tainted; the last
two will be considered untainted.  (What's the /g doing in the
match?  It does nothing but obfuscate.)  

The first section of the perlsec manual page explains that the result
of `uname ...` will be considered tainted.  This is completely
independent of the value of *any* environment variables.  

But you probably already know that `uname ...` will result in an
insecure exception unless $ENV{PATH} has been set.  

By the way, the FAQ suggests a better way to determine the host name
(perlfaq9, "How do I find out my hostname/domainname/IP address?").  
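
As far as I recall, that FAQ entry points at the Sys::Hostname module
that ships with perl.  A minimal sketch, assuming that is the approach
meant here:

```perl
use strict;
use warnings;
use Sys::Hostname;    # core module, exports hostname()

# hostname() tries several methods and dies if none of them works.
my $hostname = hostname();
print "This host is called: $hostname\n";
```

This avoids both the environment variable and the external uname
command.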

> Is it "bad practice" to use system or ``'s?

No.  

-- 
Garry Williams


------------------------------

Date: Wed, 10 Oct 2001 23:03:05 GMT
From: Dave Tweed <dtweed@acm.org>
Subject: Re: Reformat Chain
Message-Id: <3BC4D250.E8414DA5@acm.org>

Houda Araj wrote:
> I used this script to reformat chain.  What Went Wrong?
> 
> Script
> #!/usr/bin/perl -w
> use strict;
> my @words = split /\s+/, <DATA>;
> my @notes = split /\s+/, <DATA>;
> print join( ' ', map { $_ . "_" . uc( splice @notes,0,1 ) } @words );

You forgot to include the data. The <DATA> filehandle refers to data
that is embedded in the script, following a line containing only the
string "__DATA__", as Thomas posted it.

-- Dave Tweed


------------------------------

Date: Wed, 10 Oct 2001 23:45:16 GMT
From: garry@ifr.zvolve.net (Garry Williams)
Subject: Re: Regular Expressions
Message-Id: <slrn9s9ncc.g7n.garry@zfw.zvolve.net>

On Wed, 10 Oct 2001 21:15:03 +0100, Ben <ben@benhaworth.com> wrote:
> Does any one know how to search a string from say four different letters but
> with them in any order?  e.g. I want to look for the letters e,l,t and n in
> the Unix words file.  They all need to be in the word but can be in any
> order.

  perl -ne '/(?=.*e)(?=.*l)(?=.*t)(?=.*n)/ && print' /usr/dict/words
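
The same pattern as a small script, run against a made-up word list
instead of /usr/dict/words.  Each (?=.*X) is a lookahead that succeeds
if the letter X occurs somewhere in the string, so all four must be
present but their order does not matter.

```perl
use strict;
use warnings;

# All of e, l, t and n must occur somewhere in the word.
my $all_four = qr/(?=.*e)(?=.*l)(?=.*t)(?=.*n)/;

# Made-up sample words; 'letter' lacks an n and 'panel' lacks a t.
my @words = qw(lantern letter talent tunnel nettle panel);

my @hits = grep { $_ =~ $all_four } @words;
print "@hits\n";
```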

-- 
Garry Williams


------------------------------

Date: Wed, 10 Oct 2001 23:46:30 GMT
From: "Daniel Berger" <djberg96@hotmail.com>
Subject: Re: sort of array containing strings doesn't work
Message-Id: <q55x7.8956$CN5.572805@typhoon.mn.mediaone.net>


"Abigail" <abigail@foad.org> wrote in message
news:slrn9s91vd.soe.abigail@alexandra.xs4all.nl...
> Ralph Jocham (rjocham72@netscape.net) wrote on MMCMLXI September MCMXCIII
> in <URL:news:ee8febb5.0110091426.1b3fa8f5@posting.google.com>:
> @@ Hello,
> @@ I am doing some kind of dependency analysis for java packages.
> @@ As a result I get a list of all imports from all classes. Duplicates
> @@ are already removed.
> @@ Then I sort the array with : sort @uniq;
> @@ But the result is not sorted. Is there a size limit? The array
> @@ contains about 2000 lines??
>
> The only size limit is your system. You can't sort a huge array if
> you don't have the RAM for it. But if Perl runs out of memory, it'll
> tell you.
>
> You probably have made a mistake.
>
>
>
> Abigail

On the other hand, Abigail, I've seen cases where values simply dropped off
the stack in very large arrays (actually, I think it was a hash in my case)
without any warning from Perl or the OS.  It took about 10 times through the
debugger before I would believe my own eyes, but in the end I had to accept
what was happening.  Fortunately, I was able to work around it.

This is rare, though I'm not the only one in my office that it's happened
to, so I'm pretty sure I'm not going crazy.  Well, maybe I am, but it would
be for other reasons. :)

Regards,

Mr. Sunblade






------------------------------

Date: 10 Oct 2001 19:00:56 -0400
From: Joe Schaefer <joe+usenet@sunstarsys.com>
Subject: Re: Sorting larg arrays
Message-Id: <m37ku3f6c7.fsf@mumonkan.sunstarsys.com>

"Mr. Mostrom" <mostromx32s@x32siname.com> writes:

> #!/usr/bin/perl
> #
> # Filename: example.pl
> #
> 
> @DIRS="/to/dir1 /to/dir2 /to/dir3";
> 
> foreach $DIR (@DIRS) {
>     @FILES = `find $DIR`;
>     @TOTAL = (@TOTAL, @FILES);
> }
> 
> #
> # Say, we now have 100.000 files in @TOTAL
> # Which is the quickest way to sort it, or
> # is it not doable, thinking about the time.
> #
> # I want this to be done in just a second or two.

If you want your whole program to run in under a 
second or two, your request is surely impossible.

First, this line (dropping the useless loop)

  @FILES = `find @DIRS`;

will take many more than two seconds to complete.
Second,

  @TOTAL = (@TOTAL, @FILES);

is about the silliest way imaginable to add entries 
to @TOTAL.  What you want is push():

  push @TOTAL, @FILES;

Were it really needed, that would save you a
lot of useless frees and mallocs, and probably
a second or two of CPU time.

Lastly, the function you are looking for 
to sort your files is ...drum roll please... 



"sort":

  my @files = sort `find @DIRS`;

which is an absolute speed demon compared to the 
original code you posted. If that's not fast enough, 
get better hardware.

In any case, get yourself a good book on Perl, 
learn how to use the documentation that comes with 
perl (i.e. perldoc), and consider subscribing to the
Perl beginners lists at

  http://learn.perl.org

And if I were you, I wouldn't post to clp.misc until 
I'd learned that (almost) all perl programs begin with

  #!/usr/bin/perl -wT
  use strict;

that (almost) all programmer-defined variables are 
declared with "my", and that it is conventional to 
use lower-case (rather than upper-case) names for 
them.

-- 
Joe Schaefer    Did I ever tell you that Mrs. McCave had twenty-three sons and
               she named them all Dave?  Well she did, and that wasn't a smart
                                         thing to do.
                                               -- Dr. Seuss



------------------------------

Date: 10 Oct 2001 15:15:02 -0700
From: merlyn@stonehenge.com (Randal L. Schwartz)
Subject: Re: Stop Transversal of a Directory with Tar and Unzip
Message-Id: <m13d4r5ehl.fsf@halfdome.holdit.com>

>>>>> "BUCK" == BUCK NAKED1 <dennis100@webtv.net> writes:

BUCK> I understand, here's a test that I ran for -x, and it excludes the files
BUCK> that match the patterns, if I type it like this:
 
BUCK> `unzip -qjnCL $tmpfile -x "*\.pl" "*readme*" "*\.ht" "*\.exe" -d $tmpdir 2>&1`

BUCK> The strange thing is that if I break the above into more readable lines
BUCK> (below), the -x doesn't work. Any idea why?

BUCK> `unzip -qjnCL $tmpfile 
BUCK>     -x "*\.pl" "*readme*" "*\.ht" "*\.exe" 
BUCK>     -d $tmpdir 2>&1`

Yes, newlines inside of qx// are given as newlines to the shell,
and that's a statement delimiter!
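
One way to keep the source readable without handing newlines to the
shell is to build the command from pieces and join them with spaces.
A minimal sketch; the values of $tmpfile and $tmpdir are placeholders,
and the backtick call is left commented out so the sketch runs without
unzip installed.

```perl
use strict;
use warnings;

my $tmpfile = 'upload.zip';      # placeholder value
my $tmpdir  = '/tmp/unpacked';   # placeholder value

# Each piece sits on its own source line, but the joined command
# contains only spaces, so the shell sees a single statement.
my $command = join ' ',
    'unzip -qjnCL', $tmpfile,
    '-x "*.pl" "*readme*" "*.ht" "*.exe"',
    '-d', $tmpdir,
    '2>&1';

print "$command\n";
# my $output = `$command`;   # uncomment to actually run unzip
```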

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!


------------------------------

Date: Thu, 11 Oct 2001 00:04:00 +0100
From: "Adrian Philpott" <RevolvingDoors@btopenworld.com>
Subject: Re: YOU ARE ALL GAY!
Message-Id: <9q2k92$2vd$1@plutonium.btinternet.com>

Yeh, reckon I had better go get in the closet so I can come out of it!




------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 1908
***************************************

