[1550] in Perl-Users-Digest
Perl-Users Digest #548
daemon@ATHENA.MIT.EDU (Digestifier)
Fri Sep 11 05:10:39 1992
From: Digestifier <Perl-Users-Request@fuggles.acc.Virginia.EDU>
To: Perl-Users@fuggles.acc.Virginia.EDU
Reply-To: Perl-Users@fuggles.acc.Virginia.EDU
Date: Thu, 10 Sep 92 20:12:58 EDT
Perl-Users Digest #548, Volume #2 Thu, 10 Sep 92 20:12:58 EDT
Contents:
Re: What's the difference between `command` and system('command')? (Michael Cook)
RE: Case dependent substitution (Kevin Burton)
Re: Case dependent substitution (Michael Cook)
inctree, p. 274 in Programming Perl (David Bultman)
Fast String Operations? (Randy garrett)
Re: stupid questions : deleted ing leading spaces (Jan D.)
Re: What's the Right Way to undump perl on a SPARC? (Jan D.)
----------------------------------------------------------------------------
From: mcook@melkur.dev.cdx.mot.com (Michael Cook)
Subject: Re: What's the difference between `command` and system('command')?
Date: Thu, 10 Sep 1992 17:39:11 GMT
sasswb@unx.sas.com (Scott Bass) writes:
>What the difference between saying `some Unix command` and system('some Unix
>command')? Are they synonymous? Are there situations where one is better
>than the other?
`foo` captures foo's output. system('foo') doesn't.
To see the difference, try this:
$foo = `echo foo`;
print "foo=($foo), exit=$?\n";
$bar = system('echo bar');
print "bar=($bar), exit=$?\n";
>Finally, what I need to do is export the DISPLAY variable from a PERL script
>that's submitted via cron. I was just going to give the "export
>DISPLAY=value" (ksh) command. If there is a better PERL way of doing it,
>please let me know.
See %ENV in the manual:
$ENV{'DISPLAY'} = 'foo:0.0';
Michael.
------------------------------
From: Kevin Burton <noran!iowa!kburton@uunet.uu.net>
Subject: RE: Case dependent substitution
Reply-To: noran!iowa!kburton@uunet.uu.net
Date: Thu, 10 Sep 1992 19:48:24 GMT
Tom Christiansen writes:
| I've often wanted a /I switch that behaved this way. Here's
| what I use:
|
| s/abc/&mapcase('xyz')/gie;
|
| where mapcase is as follows:
|
| sub mapcase {
| local($lhs, $rhs) = ($&, shift);
| for (local($i) = 0; $i < length($lhs); $i++) {
| substr($lhs, $i, 1) =~ tr/A-Z//
| ? substr($rhs, $i, 1) =~ tr/a-z/A-Z/
| : substr($rhs, $i, 1) =~ tr/A-Z/a-z/;
| }
| $rhs;
| }
| .
| .
| .
| But it's not entirely satisfactory. Consider
|
| $_ = "Green is the green apple who has GREEN WORMS\n";
| print;
| s/green/&mapcase3('red')/gie;
| print;
|
| print "\n";
|
| $_ = "Red is the red apple who has RED WORMS\n";
| print;
| s/red/&mapcase3('green')/gie;
| print;
|
|
| Yields:
|
| Green is the green apple who has GREEN WORMS
| Red is the red apple who has RED WORMS
|
| Red is the red apple who has RED WORMS
| Green is the green apple who has GREen WORMS
Thanks for the reply it gave me enough to generate the solution that
is as follows:
sub mapcase {
local($lhs, $rhs) = ($&, shift);
local($i,$j);
# the case of the target will match th case of the source
for ($i = 0; $i < length($lhs); $i++)
{
substr($lhs, $i, 1) =~ tr/A-Z//
? substr($rhs, $i, 1) =~ tr/a-z/A-Z/
: substr($rhs, $i, 1) =~ tr/A-Z/a-z/;
}
# Avoid getting GReen when substituting re5 with green
while(substr($lhs, $i-1, 1) !~ tr/A-Za-z//)
{
$i--;
}
# Avoid getting GREen when substituting red with green
for ($j = $i; $j < length($rhs); $j++)
{
substr($lhs, $i-1, 1) =~ tr/A-Z//
? substr($rhs, $j, 1) =~ tr/a-z/A-Z/
: substr($rhs, $j, 1) =~ tr/A-Z/a-z/;
}
$rhs;
}
Thank you!
-
This is not an official statement of Noran Instruments. No warranty is
expressed or implied. The information included herein is not to be construed
as a committment on Noran's part.
Kevin Burton
Noran Instruments voice: (608) 831-6511 x317
2551 West Beltline Highway, Room 532 FAX: (608) 836-7224
Middleton, WI 53562 email: kburton@noran.com
------------------------------
From: mcook@melkur.dev.cdx.mot.com (Michael Cook)
Subject: Re: Case dependent substitution
Date: Thu, 10 Sep 1992 17:44:18 GMT
Kevin Burton <noran!iowa!kburton@uunet.uu.net> writes:
>I would like to substitute an arbitrary pattern for another, but it all of
>the alphabetic characters in the source pattern are upper case I would like
>to case the replacement pattern to also force upper case substitution.
>For example:
> substitute "abc" in "ABCxyzabc" with "xyz" produces "XYZxyzxyz"
> substitute "abc" in "abcxyzABC" with "xyz" produces "xyzxyzXYZ"
I was thinking that you could do something like this:
s%abc% ($k = $&) =~ /a-z/ ? "\L$k" : "\U$k" %gie;
But it doesn't work. It seems inside match (/a-z/) always fails, and you
always get "\U$k". Hmm...
Michael.
------------------------------
From: bultman@sonny.unx.sas.com (David Bultman)
Subject: inctree, p. 274 in Programming Perl
Date: Thu, 10 Sep 1992 20:25:37 GMT
I typed in the inctree program in Programming Perl and something is awry.
(p. 274 in the First edition, I checked the latest edition and the program
is identical)
If anyone has any ideas or previous experience with this, your feedback
would be greatly appreciated.
When I try inctree main.c and main.c looks like,
-- here --
#include <stdio.h>
main()
{
}
-- to here --
the output is,
Start program
main.c
/usr/include/stdio.h
/usr/include/sys/stdsyms.h
/usr/include/stdio.h DUPLICATE
main.c DUPLICATE
-- cut here ---
#!/usr/local/bin/perl
# Usage: inctree [options] [files]
# Configuration parameters.
printf "Start program\n";
$CPP = 'cc -E';
# $CPP = "cc -P";
# $CPP = "/lib/cpp";
$shiftwidth = 4;
# Process switches.
while ($ARGV[0] =~ /^-/) {
$_ = shift;
if (/^-D(.*)/) {
$defines .= " -D" . ($1 ? $1 : shift);
}
elsif (/^-I(.*)/) {
$includes .= " -I" . ($1 ? $1 : shift);
}
elsif (/^-m(.*)/) {
push(@pats, $1 ? $1 : shift);
}
elsif (/^-l/) {
$lines++;
}
else {
die "Unrecognized switch: $_\n";
}
}
# Build a subroutine to scan for any specified patterns.
if (@pats) {
$sub = "sub pats {\n";
foreach $pat (@pats) {
$sub .= " print '>>>>>>> ',\$_ if m$pat;\n";
}
$sub .= "}\n";
eval $sub;
++$pats;
}
# Now process each file on the command line.
foreach $file (@ARGV) {
open(CPP,"$CPP $defines $includes $file|")
|| die "Can't run cpp: $!\n";
$line = 2;
while (<CPP>) {
++$line;
&pats if $pats; # Avoid expensive call if we can.
next unless /^#/;
next unless /^# \d/;
($junk,$newline,$filename) = split;
$filename =~ s/"//g;
# Now figure out if it's a push, a pop or neither.
if ($stack[$#stack] eq $filename) { # Same file.
$line = $newline-1;
next;
}
if ($stack[$#stack-1] eq $filename) { # Leaving file.
$indent -= $shiftwidth;
$line = pop(@lines)-1;
pop(@stack);
}
else {
printf "%6d ", $line-2 if $lines;
push(@lines,$line);
$line = $newline;
print "\t" x ($indent / 8), ' ' x ($indent % 8),
$filename;
print " DUPLICATE" if $seen{$filename}++;
print "\n";
$indent += $shiftwidth;
push(@stack,$filename);
}
}
close CPP;
$indent = 0;
%seen = ();
print "\n\n";
$line = 0;
}
--- end cut ---
--
David Bultman, bultman@unx.sas.com SAS Institute, Inc.
Core Research and Development SAS Campus Dr, Cary, NC
(919) 677-8000 x6875 27513-2414 USA
------------------------------
From: rlg@IDA.ORG (Randy garrett)
Subject: Fast String Operations?
Date: Thu, 10 Sep 92 22:04:38 GMT
Here at last is a long delayed summary.
The original question had to do with replacing all occurrences
of "||" in a string with either a ~ or a -1 depending on
the value of another array. This is basically a database
conversion process, where all fields that are of type int ought
to get a NULL value of -1 and all fields of type char should get
a ~ . For instance, given a string "abc|123|||def|" and a
corresponding array (1,2,1,2,2,1) produce the string
"abc|123|-1|~|def|-1", if we assume an array value of 1 means int
and an array value of 2 means ~ .
I got some great suggestions for improving the speed of my code.
Thanks very much for all the help!
The code I posted was a subset of the actual code, so a few
of the suggestions were difficult to merge into the real code.
My explanation of some of the fine points also left something
to be desired.
Finally, I just ran out of time to do testing on some of the
submissions. Sorry! If time permits, I will also test those.
Also, I munged the submitted code slightly to permit benchmarking.
If I lost the original intent in doing so, please correct me.
The system I tested on was a Sun 4/490 running SunOS 4.1.1 and
perl-4.019 compiled using the Sun bundled C compiler. A better
compiler might speed all these results up. I identified the
submissions by FS0 thru FS4. I did not finish benching
submissions FS5 thru FS 9. Code and their authors is at the
end of this message. I am extremely sorry to say that I lost
the author's names of FS3 and 4. Please forgive me, and if
you read this send me a reply.
For a 10,000 character length string, all NULLS: (extreme test case)
All times the sum of user and system CPU seconds. Also, it took
about 1.0 second just to init the string.
FS0: 21
FS1: 10.6
FS2: 13.5
FS3: 4.8
FS4: 10
As you can see all the submissions were about twice as fast as
my original, and FS3 was about 4x faster.
For a more realistic test, I ran a loop with 4000 iterations with
a string of 25 Nulls. Just running a 4000 iteration loop with
no ops inside took .3 seconds.
FS0: 63
FS1: 37
FS2: 39
FS3: 18
FS4: 35
Since FS3 looked like the winner, I did more tests on it.
# of loops / # of Nulls Time
1 / 10000 4.8
1 / 5000 2.0
1 / 1000 .5
1 / 100 .2
100 / 100 2.2
1000 / 100 19.4
4000 / 100 76.0
1000 / 25 4.5
4000 / 25 18.0
######### Actual Code ################
########### FS0: by me -- Randy Garrett #########
#!/usr/local/bin/perl
# Walk thru a string; if find "||" inset either a ~ or a -1
# depending on value of @name
$Num = 25;
#$string = "a|bcd||g||i||||||";
$str = "|" x $Num;
for ($j = 0; $j < $Num; ++$j)
{ push (@name,"1"); }
# print "$string\n";
($u, $s, $cs, $cu) = times;
print "Time = $u $s\n";
for ($j = 0; $j < 4000; ++$j)
{
$count = 0; # index into @name
$pos = 0; # index into $string
$string = $str;
while (($pos = index($string,'|',$pos)) >= 0) {
# print "Found at $pos $count = $name[$count]\n";
if (substr($string,$pos+1,1) eq "|" ) { # found a NULL
if ($name[$count] == 1) { # int ?
substr($string,$pos+1,0) = -1; }
elsif ($name[$count] == 2 ) { # char?
substr($string,$pos+1,0) = "~"; }
}
$pos++; $count++;
}
# print "$string\n";
}
print "$string\n";
($u, $s, $cs, $cu) = times;
print "Time = $u $s\n";
################### FS1 #########################
# Hal R. Pomeranz -- pomeranz@nas.nasa.gov
#!/usr/local/bin/perl
$Num = 25;
$string = "|" x $Num;
for ($j = 0; $j < $Num; ++$j)
{ push (@name,"1"); }
#$string = "a|bcd||g||i|||";
#@name = (1,2,1,1,2,1,2,1);
($u, $s, $cs, $cu) = times;
print "Time = $u $s\n";
for ($j = 0; $j < 4000; ++$j)
{
@cols = split(/\|/, $string);
for ($i = 0; $i < @name; $i++) {
$cols[$i] = ($name[$i] == 1 ? -1 : "~") unless ($cols[$i]);
}
# print join('|', @cols), "|\n";
join('|', @cols);
}
($u, $s, $cs, $cu) = times;
print "Time = $u $s\n";
########## FS 2 ###############
# Christopher Davis -- ckd@eff.org
#
#!/usr/local/bin/perl -- # -*-Perl-*-
# $string = "a|bcd||g||i|||";
# @name = (1,1,2,1,1,2,1,2,1); # yours was off-by-one, I added 1 at start
$Num = 25;
$string = "|" x $Num;
for ($j = 0; $j < $Num; ++$j)
{ push (@name,"1"); }
($u, $s, $cs, $cu) = times;
print "Time = $u $s\n";
for ($j = 0; $j < 4000; ++$j)
{
$nf = scalar(@name); # number of fields
undef @string;
foreach (@name) { # give us a nice boolean
push(@string,$_ - 1);
}
#@fields =split('\|',$string,$nf); # $nf will force it to keep nulls at end
@fields =split('\|',$string);
undef @newfields;
foreach (@string) { # i.e. for each field
$thisfield = shift(@fields); # take it off the old list
$thisfield = ($_?"~":-1) unless $thisfield; # if it's null, fix it
push(@newfields,$thisfield); # put it on the new list
}
# $newstring = join("|",@newfields,"|"); # add a pipe at the end
$newstring = join("|",@newfields); # add a pipe at the end
# print $newstring,"\n";
}
($u, $s, $cs, $cu) = times;
print "Time = $u $s\n";
################### FS3 ###########################
#!/usr/local/bin/perl
# Walk thru a string; if find "||" insert either a ~ or a -1
# depending on value of @name
$Num = 25;
$string = "|" x $Num;
for ($j = 0; $j < $Num; ++$j)
{ push (@name,"2"); }
($u, $s, $cs, $cu) = times;
print "Time = $u $s\n";
#$string = "a|bcd||g||i|||";
#@name = (1,2,1,1,2,1,2,1);
%defaults = (1, '-1', 2, "~"); # Specify defaults for each of your types
for ($j = 0; $j < 4000; ++$j)
{
@strings = split(/\|/, $string);
for $count ( 0.. $Num )
# foreach $count ( @strings )
{
$strings[$count] = $defaults{ $name[$count] } if $strings[$count] eq '';
}
$string = join("|", @strings);
# print "$string\n";
}
print "$string\n";
($u, $s, $cs, $cu) = times;
print "Time = $u $s\n";
##################### FS4 #########################
#!/usr/local/bin/perl
#$string = "a|bcd||g||i|||";
#@name = (1,2,1,1,2,1,2,1);
$Num = 25;
$string = "|" x $Num;
for ($j = 0; $j < $Num; ++$j)
{ push (@name,"1"); }
for ($j = 0; $j < 4000; ++$j)
{
# if ($string =~ /\|\|/)
{
# chop ($string);
@field = split (/\|/, $string, @name +1);
for ($index = 0; $index < @field; $index++)
{
$field[$index] = ('', '-1', '~')[$name[$index]] if $field[$index] eq '';
}
$newstring = join ('|', @field);
}
# print "$newstring\n";
}
($u, $s, $cs, $cu) = times;
print "Time for $j = $u $s\n";
####### Following code not benchmarked ############
######################### FS 5 ###################
# Raymond Chen -- rjc@math.Princeton.EDU
#! /usr/local/bin/perl
$string = "a|bcd||g||i|||";
@default = ("-1","~", "-1", "-1", "~", "-1", "~", "-1","-1");
@F = split(/\|/, $string);
for ($i = 0; $i < $#F; $i++) { # unroll this loop for speed
$F[$i] = $default[$i] unless $F[$i];
}
print join("|", @F), "\n";
######################## FS 6 ####################
# From Mike Flynn (flynn_mike@jpmorgan.com)
#!/usr/local/bin/perl
# Walk thru a string; if find "||" insert either a ~ or a -1
# depending on value of @name
#@name = (1,2,1,1,2,1,2,1);
$Num = 25;
$string = "|" x $Num;
for ($j = 0; $j < $Num; ++$j)
{ push (@name,"1"); }
($u, $s, $cs, $cu) = times;
print "Time = $u $s\n";
for ($j = 0; $j < 40; ++$j)
{
$count = 0;
foreach $element (split(/\|/, $string)) {
print "element = $element\n";
if ($element) {
print $element;
print "|" unless ($element eq "\n");
} else {
print "-1|" if ($name[$count] == 2);
print "~|" if ($name[$count] == 1);
}
$count++;
}
}
($u, $s, $cs, $cu) = times;
print "Time = $u $s\n";
######################### FS 7 ####################
# From Larry Wall!
#!/usr/local/bin/perl -- # -*-Perl-*-
# Call this with test.data
$/ = '|';
#$Num = 25;
#$string = "|" x $Num;
#for ($j = 0; $j < $Num; ++$j)
# { push (@name,"2"); }
#($u, $s, $cs, $cu) = times;
#print "Time = $u $s\n";
@default = (-1,'~',-1,-1,'~',-1,'~',-1,-1,-1,-1,-1,'~',-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,'~');
while (<>) {
$field = 0, print "\n" if s/^\n//;
next unless /^\|/;
# print $default[$field];
}
continue {
# print;
++$field;
++$k;
}
# print "$string\n";
($u, $s, $cs, $cu) = times;
print "Time = $u $s\n";
########### Finally a submission to do it in C #######
# From Larry Wall, of all people (:-^)
#
# Unfortunately this code does a lot of other things
# that I don't want to recode it in C and I'm too lazy
# to learn to add my own user subroutines to Perl right now
#
#include <stdio.h>
char *def[] = {"-1","~","-1","-1","~","-1","~","-1"};
main() {
int ch;
int lastch;
int field = 0;
while ((ch = getc(stdin)) != EOF) {
if (ch == '\n')
field = 0;
else if (ch == '|') {
if (lastch == '|')
fputs(def[field], stdout);
field++;
}
putc(ch, stdout);
lastch = ch;
}
}
############### FS8 ###########################
# John Stoffel -- <john@wpi.WPI.EDU>
$a = split(/\|/,$string,100);
foreach $x (0 .. ($a-1)) {
if ($_[$x] eq "") {
if ($name[$x] == 1) { $_[$x] = "~"}
else { $_[$x] = "-1"}
}
}
print join('|',@_), "\n";
############### FS 9 ##########################
# Unfortunately, I forgot to mention that the
# strings are variable length, so it's not fair
# to assume they are always length 8.
# Thanks for your help, though ...
# Chris Sherman -- sherman@unx.sas.com
#!/usr/local/bin/perl
$string = "a|bcd||g||i|||";
@name = (0,1,0,0,1,0,1,0);
@vars =
($string =~ /([^|]*)\|([^|]*)\|([^|]*)\|([^|]*)\|([^|]*)\|([^|]*)\|([^|]*)/);for ($i = 0; $i < 8; $i++)
{
if ($vars[$i] eq "")
{
if ($name[$i])
{
$vars[$i] = "-1";
} else {
$vars[$i] = "~";
}
}
}
print join("|",@vars),"|\n";
------------------------------
From: jhd@irfu.se (Jan D.)
Subject: Re: stupid questions : deleted ing leading spaces
Date: Thu, 10 Sep 1992 22:58:48 GMT
In article <ASHERMAN.92Sep9160424@laser.fmrco.com> asherman@fmrco.COM writes:
>
>>>>>> jhd@irfu.se (Jan D.) said:
>
>jhd> chare@unilabs.uucp (Chris Hare) writes:
>
>>As a matter of of fact, if you can suggest a way to remove the leading
>>spaces, and compact multiple spaces any where else on the line to be one
>>space I'd appreciate it.
>>
>
>jhd> s/^ //;
>jhd> s/ +/ /g;
>
>jhd> If you want to replace general white space (i.e. tabs and space)
>jhd> you should say:
>
>jhd> s/^\s//;
>jhd> s/\s+/ /g;
>
>s/^\s+//; # or s/^ +//; for just spaces.
>s/\s+/ /g; # or s/ +//g; for just spaces.
>
>Otherwise:
>
> " Hello, world!"
>
>turns into:
>
> " Hello, world!"
>
>Not:
>
> "Hello, world!"
>
Well, I was looking at Chris first sentence:
I am running 4.035 and I need to delete a single leading space at the
beginning of a line.
And while we are nit picking, your comment for the second
line is wrong :-) :-).
Jan D.
------------------------------
From: jhd@irfu.se (Jan D.)
Subject: Re: What's the Right Way to undump perl on a SPARC?
Date: Thu, 10 Sep 1992 23:16:33 GMT
In article <ASHERMAN.92Sep9170013@laser.fmrco.com> asherman@fmrco.COM writes:
>>>>>> cjross@testament.bbn.com (Chris Ross) said:
>
>cjross> I could dig deeper on my own, but I'm out of time.
>cjross> Can perl be undumped on a SPARC, or am I out of luck?
>
>I think that it would be marvelously useful for Larry to put undump
>into the standard distribution. I wanted to undump something last
>night, and then someone told me just how big the TeX distribution is!
>I can't go tying up a 9600bps modem with a giant transmition like
>that...
>
>Larry? 5.0? Please.
>
If something like that will be added I'd prefer to have the unexec stuff from
GNU emacs compiled in by default (if available for that machine).
Are there machines that can undump, but not unexec? Or the other
way around? Which set is bigger?
For instance, HP PA-RISC can unexec but there isn't any
undump available (to my knowledge).
Jan D.
------------------------------
** FOR YOUR REFERENCE **
The service addresses, to which questions about the list itself and requests
to be added to or deleted from it should be directed, are as follows:
Internet: Perl-Users-Request@fuggles.acc.Virginia.EDU
Perl-Users-Request@uvaarpa.Virginia.EDU
BITNET: Perl-Req@Virginia
UUCP: ...!uunet!virginia!perl-users-request
You can send mail to the entire list (and comp.lang.perl) via one of these
addresses:
Internet: Perl-Users@uvaarpa.Virginia.EDU
BITNET: Perl@Virginia
UUCP: ...!uunet!virginia!perl-users
End of Perl-Users Digest
******************************