[9196] in Perl-Users-Digest
Perl-Users Digest, Issue: 2816 Volume: 8
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Jun 5 11:07:28 1998
Date: Fri, 5 Jun 98 08:01:49 -0700
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Fri, 5 Jun 1998 Volume: 8 Number: 2816
Today's topics:
Re: Q: dispatch find/replace string <tchrist@mox.perl.com>
Re: Q: dispatch find/replace string <quednauf@nortel.co.uk>
Re: Regular Expressions (Scott Erickson)
Re: Regular Expressions <tchrist@mox.perl.com>
Re: shadowed password <tchrist@mox.perl.com>
Re: Spider programms in PERL <aqumsieh@matrox.com>
Re: Use of HTML, POD, etc in Usenet (was: Re: map in vo <boys@aspentech.com>
Re: Use of HTML, POD, etc in Usenet (was: Re: map in vo (Abigail)
Re: Why is there no "in" operator in Perl? (Ken Fox)
Re: Why is there no "in" operator in Perl? <jdporter@min.net>
Re: Why is there no "in" operator in Perl? <tchrist@mox.perl.com>
Re: Why is there no "in" operator in Perl? (I R A Aggie)
Win32::NetResource Module <Brian.Williams@uuplc.co.uk>
Digest Administrivia (Last modified: 8 Mar 97) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: 5 Jun 1998 11:01:12 GMT
From: Tom Christiansen <tchrist@mox.perl.com>
Subject: Re: Q: dispatch find/replace string
Message-Id: <6l8j5o$cs7$3@csnews.cs.colorado.edu>
[courtesy cc of this posting sent to cited author via email]
In comp.lang.perl.misc,
Xah Lee <xah@shell13.ba.best.com> writes:
:
:What's an efficient way of doing find and replace with a large "find/replace" lookup table?
:
:An example:
:
: my $str = '1992 feb 1 kim';
: my %largeFindandRelpaceTable = ('jan'=>1,'feb'=>2,...);
: my $desiredResult = '1992 2 1 kim';
Believe it or not, simple tokenization doesn't win for a long
time because of the splitting overhead.
Here's one way:
#!/usr/bin/perl -w
# fixstyle - switch first set of <DATA> strings to second set
# usage: $0 [-v] [files ...]
use strict;
my $verbose = (@ARGV && $ARGV[0] eq '-v' && shift);
if (@ARGV) {
$^I = ".orig"; # preserve old files
} else {
warn "$0: Reading from stdin\n" if -t STDIN;
}
my $code = "while (<>) {\n";
# read in config, build up code to eval
while (<DATA>) {
chomp;
my ($in, $out) = split /\s*=>\s*/;
next unless $in && $out;
$code .= "s{\\Q$in\\E}{$out}g";
$code .= "&& printf STDERR qq($in => $out at \$ARGV line \$.\\n)"
if $verbose;
$code .= ";\n";
}
$code .= "print;\n}\n";
eval "{ $code } 1" || die;
__END__
analysed => analyzed
built-in => builtin
chastized => chastised
commandline => command-line
de-allocate => deallocate
dropin => drop-in
hardcode => hard-code
meta-data => metadata
multicharacter => multi-character
multiway => multi-way
non-empty => nonempty
non-profit => nonprofit
non-trappable => nontrappable
pre-define => predefine
preextend => pre-extend
re-compiling => recompiling
reenter => re-enter
turnkey => turn-key
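The generate-then-eval idea used by fixstyle can be sketched in isolation. This is only an illustration, not part of the posted program: the helper name compile_fixer and its pair-list interface are assumptions. It builds one s{}{}g statement per pair as source text and compiles the lot once with a string eval. Like the original, it assumes the strings contain no braces or variable sigils.

```perl
use strict;
use warnings;

# Hypothetical helper: turn [from, to] pairs into one compiled closure.
sub compile_fixer {
    my @pairs = @_;
    my $code = "sub {\n    my (\$text) = \@_;\n";
    for my $pair (@pairs) {
        my ($in, $out) = @$pair;
        # \Q...\E makes the search string match literally.
        $code .= "    \$text =~ s{\\Q$in\\E}{$out}g;\n";
    }
    $code .= "    return \$text;\n}\n";
    # The string eval compiles the generated source exactly once.
    my $fixer = eval $code;
    die "generated code did not compile: $@" unless $fixer;
    return $fixer;
}

my $fix = compile_fixer(['built-in', 'builtin'], ['dropin', 'drop-in']);
print $fix->("a built-in dropin module\n");
```

Each input line still pays for one substitution per pair, which is why this approach slows down as the table grows.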
If you run the first program on a hundred times that many pairs, you'll
see it bog down. Here's a version that's slower for few changes, but
faster when there are a lot of them.
#!/usr/bin/perl -w
# fixstyle2 - like fixstyle but faster for many many matches
use strict;
my $verbose = (@ARGV && $ARGV[0] eq '-v' && shift);
my %change = ();
while (<DATA>) {
chomp;
my ($in, $out) = split /\s*=>\s*/;
next unless $in && $out;
$change{$in} = $out;
}
if (@ARGV) {
$^I = ".orig";
} else {
warn "$0: Reading from stdin\n" if -t STDIN;
}
while (<>) {
my $i = 0;
s/^(\s+)// && print $1; # emit leading whitespace
for (split /(\s+)/, $_, -1) { # preserve trailing whitespace
print( ($i++ & 1) ? $_ : ($change{$_} || $_));
}
}
__END__
analysed => analyzed
built-in => builtin
chastized => chastised
commandline => command-line
de-allocate => deallocate
dropin => drop-in
hardcode => hard-code
meta-data => metadata
multicharacter => multi-character
multiway => multi-way
non-empty => nonempty
non-profit => nonprofit
non-trappable => nontrappable
pre-define => predefine
preextend => pre-extend
re-compiling => recompiling
reenter => re-enter
turnkey => turn-key
This second version breaks up each line into chunks of whitespace and
words, which isn't a fast operation. But then it uses those words to look
up their replacements in a hash, which is much faster than a substitution.
So the first part is slower and the second faster. How much difference
this makes depends on the number of matches.
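The split-and-lookup step can be tried on a single string. The following is a minimal sketch, assuming a hypothetical helper named replace_words that returns the result instead of printing it. It uses exists rather than || for the lookup, which is a small deviation from the program above.

```perl
use strict;
use warnings;

# Hypothetical helper: replace whole whitespace-delimited words via a hash,
# copying every run of whitespace through unchanged.
sub replace_words {
    my ($line, $change) = @_;
    my $out = '';
    my $i   = 0;
    # The capturing parentheses make split return the separators as well,
    # so even-numbered fields are words and odd-numbered fields are
    # whitespace. A line that starts with whitespace yields an empty first
    # field, which keeps that alternation intact. The -1 limit keeps
    # trailing empty fields.
    for my $field (split /(\s+)/, $line, -1) {
        if ($i++ & 1) {
            $out .= $field;    # whitespace, copied verbatim
        } else {
            $out .= exists $change->{$field} ? $change->{$field} : $field;
        }
    }
    return $out;
}

my %change = ('built-in' => 'builtin', 'dropin' => 'drop-in');
print replace_words("  a built-in   dropin\n", \%change);
```

Note that only exact whitespace-delimited tokens are replaced, so "dropin," with a trailing comma would be left alone.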
If we didn't care so much about not changing the amount of whitespace
separating each word, the second version could run as fast as the first one
even for few changes. If you know a lot about your input, you can just
collapse white space into single blanks by plugging in this loop instead:
# very fast, but whitespace collapse
while (<>) {
for (split) {
print $change{$_} || $_, " ";
}
print "\n";
}
That leaves an extra blank at the end of each line. To strip it, place
the following code in front of the whitespace-collapsing while loop
above, so that the output is filtered through a forked child:
my $pid = open(STDOUT, "|-");
die "cannot fork: $!" unless defined $pid;
unless ($pid) { # child
while (<STDIN>) {
s/ $//;
print;
}
exit;
}
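If forking a filter is more machinery than wanted, the trailing blank can also be avoided by joining the words instead of printing a blank after each one. This is an alternative sketch, not the method posted above, and the helper name collapse_and_replace is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse whitespace to single blanks and replace
# words from the hash, without leaving a blank at the end of the line.
sub collapse_and_replace {
    my ($line, $change) = @_;
    # split ' ' discards leading whitespace and trailing empty fields.
    my @words = map { exists $change->{$_} ? $change->{$_} : $_ }
                split ' ', $line;
    return join(' ', @words) . "\n";
}

my %change = (hardcode => 'hard-code', dropin => 'drop-in');
print collapse_and_replace("  hardcode   the  dropin \n", \%change);
```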
--
"...this does not mean that some of us should not want, in a rather
dispassionate sort of way, to put a bullet through csh's head."
Larry Wall in <1992Aug6.221512.5963@netlabs.com>
------------------------------
Date: Fri, 05 Jun 1998 12:06:24 +0100
From: "F.Quednau" <quednauf@nortel.co.uk>
Subject: Re: Q: dispatch find/replace string
Message-Id: <3577D12F.F61EA77@nortel.co.uk>
Xah Lee wrote:
> What's an efficient way of doing find and replace with a large "find/replace" lookup table?
>
> An example:
>
> my $str = '1992 feb 1 kim';
> my %largeFindandRelpaceTable = ('jan'=>1,'feb'=>2,...);
> my $desiredResult = '1992 2 1 kim';
s/(jan|feb|mar|...)/$reptabl{$1}/g
Could be wrong (I don't have any reference around right now...)
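A runnable sketch of the alternation idea follows. It builds the pattern from the hash keys instead of typing them out. The three-entry table, the \b word-boundary anchors, and the helper name dispatch_replace are assumptions made for illustration.

```perl
use strict;
use warnings;

my %reptabl = (jan => 1, feb => 2, mar => 3);    # illustrative subset only

# Longest keys first, so a longer key is tried before any key that is a
# prefix of it; quotemeta guards against regex metacharacters in keys.
my $alt = join '|',
          map  { quotemeta($_) }
          sort { length($b) <=> length($a) || $a cmp $b }
          keys %reptabl;
my $re = qr/\b($alt)\b/;

sub dispatch_replace {
    my ($str) = @_;
    # $1 is the key that matched, so the hash supplies the replacement.
    $str =~ s/$re/$reptabl{$1}/g;
    return $str;
}

print dispatch_replace('1992 feb 1 kim'), "\n";    # prints "1992 2 1 kim"
```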
--
____________________________________________________________
Frank Quednau
http://www.surrey.ac.uk/~me51fq
________________________________________________
------------------------------
Date: Fri, 5 Jun 1998 09:21:25 -0500
From: Scott.L.Erickson@HealthPartners.com (Scott Erickson)
Subject: Re: Regular Expressions
Message-Id: <MPG.fe1bac4fa21a94b989680@news.mr.net>
In article <6l6mkb$1q48$1@news.gate.net>, dsiebert@gate.net says...
> I have the hardest time with these.
> I need to extract the text from between two tags
> <hw> and </hw>
> help please.
>
>
A couple of questions: Does the text between the two tags span multiple
lines? And are the tags always of that form, that is, no changes in case
and no white space between the angle brackets? Do the tags span blank
lines?
I would try the following:
$/ = ''; # turn on paragraph mode
# non-greedy matching, case-insensitive, global matching,
# ignore specialness of newlines
/<hw>(.*?)<\/hw>/igsm;
$extracted_text = $1;
Hope that helps.
Unfortunately, I am unable to test the above code because of system
problems. *sigh*
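For the simple case of flat, non-nested <hw> tags, the match can be sketched as a small function that returns every enclosed string. The helper name extract_hw is hypothetical, and a regex like this is not a general markup parser.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text inside each <hw>...</hw> pair.
sub extract_hw {
    my ($text) = @_;
    # /g in list context returns every capture, /i ignores tag case,
    # /s lets . match newlines, and .*? stops at the nearest close tag.
    my @found = $text =~ m{<hw>(.*?)</hw>}gis;
    return @found;
}

my @words = extract_hw("<hw>aardvark</hw> n. <HW>ant\neater</HW>");
print scalar(@words), " entries found\n";
```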
--
Scott Erickson
HealthPartners, Inc.
------------------------------
Date: 5 Jun 1998 14:29:22 GMT
From: Tom Christiansen <tchrist@mox.perl.com>
Subject: Re: Regular Expressions
Message-Id: <6l8vc2$oqa$4@csnews.cs.colorado.edu>
[courtesy cc of this posting sent to cited author via email]
In comp.lang.perl.misc,
dsiebert@gate.net (David Siebert) writes:
:I have the hardest time with these.
:I need to extract the text from between two tags
:<hw> and </hw>
:help please.
A regular expression is wrong in the general case. You can't do a fully
recursive parse with a regular expression. Use a module. Lift your eyes
to perlfaq9, wherein resides your answer. Is there some reason why you
declined to read the entries in that FAQ regarding "How do I remove HTML
from a string?" and "How do I extract URLs?"
I strongly suggest that you *STOP*PROGRAMMING* Perl until you've taken an
afternoon to read the full FAQ included with every perl distribution,
and on your very system. The time you spend will be repaid a hundred
times over in the long run.
--tom
--
I think I can sum up the difference between *BSD and Linux as follows:
"In Linux, new users get flamed for asking questions in the newsgroups
(or heaven forfend, the wrong newsgroup). In *BSD the principals
flame each other." --Warner Losh
------------------------------
Date: 5 Jun 1998 10:56:30 GMT
From: Tom Christiansen <tchrist@mox.perl.com>
Subject: Re: shadowed password
Message-Id: <6l8isu$cs7$2@csnews.cs.colorado.edu>
[courtesy cc of this posting sent to cited author via email]
In comp.lang.perl.misc,
Brandon George <brandon@gcn.ou.edu> writes:
:I bet this has come up before. I'm wanting to enter a password to a
:perl script and have it echo *'s as it's typed. Is there a way of doing
:this without externally messing with stty?
It's in the FAQ. There's a module, or there's POSIX.
Here's POSIX. You'll have to do your own star-echoing.
# HotKey.pm
package HotKey;
require Exporter;
@ISA = qw(Exporter);
@EXPORT = qw(cbreak cooked readkey);
use strict;
use POSIX qw(:termios_h);
my ($term, $oterm, $echo, $noecho, $fd_stdin);
$fd_stdin = fileno(STDIN);
$term = POSIX::Termios->new();
$term->getattr($fd_stdin);
$oterm = $term->getlflag();
$echo = ECHO | ECHOK | ICANON;
$noecho = $oterm & ~$echo;
sub cbreak {
$term->setlflag($noecho); # ok, so i don't want echo either
$term->setcc(VTIME, 1);
$term->setattr($fd_stdin, TCSANOW);
}
sub cooked {
$term->setlflag($oterm);
$term->setcc(VTIME, 0);
$term->setattr($fd_stdin, TCSANOW);
}
sub readkey {
my $key = '';
cbreak();
sysread(STDIN, $key, 1);
cooked();
return $key;
}
END { cooked() }
1;
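The star-echoing itself might look like the sketch below. It is written against any code reference that returns one character per call, so that it can be exercised without a terminal. The name read_password and its two-argument interface are assumptions. In real use one would pass \&HotKey::readkey and an unbuffered handle such as STDERR.

```perl
use strict;
use warnings;

# Hypothetical helper: $next_key is a code ref returning one character per
# call; $out is the handle on which the stars are echoed.
sub read_password {
    my ($next_key, $out) = @_;
    my $password = '';
    while (defined(my $key = $next_key->())) {
        # An empty read or a line terminator ends the password.
        last if $key eq '' || $key eq "\n" || $key eq "\r";
        if ($key eq "\b" || $key eq "\x7f") {    # backspace or delete
            if (length $password) {
                chop $password;
                print {$out} "\b \b";            # rub out one star
            }
            next;
        }
        $password .= $key;
        print {$out} '*';
    }
    print {$out} "\n";
    return $password;
}
```

A call such as read_password(\&HotKey::readkey, \*STDERR) would then return the typed string while showing only stars.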
--
Fungus doesn't take a vacation. --Rob Pike
------------------------------
Date: Fri, 05 Jun 1998 10:34:33 -0400
From: Ala Qumsieh <aqumsieh@matrox.com>
Subject: Re: Spider programms in PERL
Message-Id: <357801F8.616D613@matrox.com>
Tom Christiansen wrote:
> Your best hope is that someone will be happy to let you hire them as a
> consultant at about $150/hour or better to spend the severely nontrivial
> amount of time it would take to teach you all this, since your current
> posting on spidering and your other one about autoposting to USENET do
> not exactly inspire us to complete confidence in your ability to grasp
> what we mean when we politely but succinctly suggest that you consult
> the libnet and the LWP module suites on CPAN--which is likely the most
> information you're liable to get, and in fact, just have.
>
> --tom
> --
> "It's okay to be wrong temporarily." --Larry Wall
Boy .. that is the longest sentence I've ever read!
--
Ala Qumsieh | No .. not just another
ASIC Design Engineer | Perl Hacker!!!!!
Matrox Graphics Inc. |
Montreal, Quebec | (Not yet!)
------------------------------
Date: Fri, 05 Jun 1998 13:39:27 +0100
From: Ian Boys <boys@aspentech.com>
Subject: Re: Use of HTML, POD, etc in Usenet (was: Re: map in void context regarded as evil - suggestion)
Message-Id: <3577E6FE.FF6@aspentech.com>
Ronald J Kimball wrote:
>
>
> Is B<this> any less plain text than _this_ or *this*?
>
IMHO, Yes. When I am reading, I find _this_ and *this* to scan
naturally and smoothly, without any interruption of brain-flow.
But when I come across B<this> or C<this>, there is a momentary
interruption while I mentally trip over it, making it harder to read.
Plain text should mean plain, easy to read text.
Mark up text for humans in a human-friendly way.
Mark up text for computers in a computer-friendly way.
Remember that computers and programming languages have no aesthetic
values for us to offend, but people do.
Ian
------------------------------
Date: 5 Jun 1998 14:52:12 GMT
From: abigail@fnx.com (Abigail)
Subject: Re: Use of HTML, POD, etc in Usenet (was: Re: map in void context regarded as evil - suggestion)
Message-Id: <6l90ms$7of$1@client3.news.psi.net>
Rajappa Iyer (rsi@lucent.com) wrote on MDCCXXXVIII September MCMXCIII in
<URL: news:xny7m2xuk3a.fsf@placebo.ho.lucent.com>:
++ abigail@fnx.com (Abigail) writes:
++
++ > Rajappa Iyer (rsi@lucent.com) wrote on MDCCXXXVIII September MCMXCIII in
++ > <URL: news:xny1zt5wag5.fsf@placebo.ho.lucent.com>:
++ > ++ abigail@fnx.com (Abigail) writes:
++ > ++
++ > ++ > Zenin (zenin@bawdycaste.org) wrote on MDCCXXXVII September MCMXCIII in
++ > ++ > ++ What's wrong with using
++ > ++ > ++ ---------------------------------------------------------------
++ > ++ > ++ instead of <hr>?
++ > ++ >
++ > ++ > Standards.
++ > ++
++ > ++ Aah... so HTML is more standard than ASCII? Thanks for this nugget.
++ >
++ > You're comparing apples and salt water here. I've yet to see an
++ > ASCII table that defines symbols for emphasis or horizontal lines.
++
++ Excuse me, but we were talking about standards (as opposed to
++ conventions) weren't we?
Yes, and ASCII defines standards for translating bit sequences to characters,
which has nothing to do with the use of *foo*, _bar_ and ----, or with the
meaning of * *, _ _ and ---- when so used.
There's an RFC for HTML, there isn't an RFC for use of * *, AFAIK.
++ Yes, there are no adequate ways to express emphasis or emotions
++ (emoticons notwithstanding) in ASCII, but guess what? That's part of
++ the point. Tone is usually conveyed by the appropriate choice of
++ words. And where there's a compelling need for emphasis, Usenet has
++ had conventions for a long time (e.g. *emphasis*, _emphasis or
++ italics_, SHOUTING etc.)
Yes, but not standardized. Not necessarily meaningful, for instance, if
you're blind and aren't using a visual display to read Usenet.
++ Again, just in case, the point is not clear:
++ Usenet posts are ostensibly meant for human consumption; given that
++ these humans are expected only to have a reader which displays ASCII,
++ posts marked up with markup languages are rude, inconsiderate and
++ anti-social. It's right up there with sending M$ Word documents as
++ attachment.
I'm not saying one should. But it would be nice if one could.
Abigail
--
perl -MTime::JulianDay -lwe'@r=reverse(M=>(0)x99=>CM=>(0)x399=>D=>(0)x99=>CD=>(
0)x299=>C=>(0)x9=>XC=>(0)x39=>L=>(0)x9=>XL=>(0)x29=>X=>IX=>0=>0=>0=>V=>IV=>0=>0
=>I=>$r=-2449231+gm_julian_day+time);do{until($r<$#r){$_.=$r[$#r];$r-=$#r}for(;
!$r[--$#r];){}}while$r;$,="\x20";print+$_=>September=>MCMXCIII=>()'
------------------------------
Date: 5 Jun 1998 14:12:07 GMT
From: kfox@pt0204.pto.ford.com (Ken Fox)
Subject: Re: Why is there no "in" operator in Perl?
Message-Id: <6l8ubn$76q1@eccws1.dearborn.ford.com>
Tom Christiansen <tchrist@mox.perl.com> writes:
> Anyone who asks whether something is "in" a list has made
> a fundamental mistake. They've chosen a data structure
> that's O(N) to search. Bad!
Probably. There is the time/space trade-off to consider. Also,
an array has properties that the hash doesn't -- it's ordered for
one. It doesn't take many "sort keys %hash" to eliminate the
efficiency advantage of a hash.
- Ken
--
Ken Fox (kfox@ford.com) | My opinions or statements do
| not represent those of, nor are
Ford Motor Company, Powertrain | endorsed by, Ford Motor Company.
Analytical Powertrain Methods Department |
Software Development Section | "Is this some sort of trick
| question or what?" -- Calvin
------------------------------
Date: Fri, 05 Jun 1998 14:13:48 GMT
From: John Porter <jdporter@min.net>
Subject: Re: Why is there no "in" operator in Perl?
Message-Id: <3577FEBC.19CB@min.net>
Tom Christiansen wrote:
>
> Vector, List, Stack, Queue => @ARRAY
> Set, Record, Table, Structure => %HASH
>
> Anyone who asks whether something is "in" a list has made
> a fundamental mistake. They've chosen a data structure
> that's O(N) to search. Bad!
You are mighty quick to generalize, Tom.
Real-world examples refute your simple generalization.
Sometimes, the nature of the app demands a queue (say),
but a relatively infrequent search of the queue for
items matching some criteria.
Here's an example, off the top of my head:
I've got a message-processing system, that stores the
incoming msgs in a queue. The queue's length is
typically between 0 and 10, say, but occasionally much
higher. Every so often (say, every 1000 msgs, on avg)
I get a high-priority OOB message saying something like
"If you have any msgs from John Porter in your queue,
delete them!" It would be absurd to use a hash for
that search.
I.e. sometimes it's better to have a little O(N) when it is
a small part of your overall performance.
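The occasional purge described above is a one-line grep over the array. This is a minimal sketch, assuming each queued message is a hash reference with a 'from' field; that representation and the helper name purge_sender are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical helper: drop every message from $sender, keeping the order
# of the rest, and return how many were removed. This is O(N) per purge.
sub purge_sender {
    my ($queue, $sender) = @_;
    my $before = @$queue;
    @$queue = grep { $_->{from} ne $sender } @$queue;
    return $before - @$queue;
}

my @queue = (
    { from => 'John Porter', body => 'first'  },
    { from => 'Ken Fox',     body => 'second' },
    { from => 'John Porter', body => 'third'  },
);
my $removed = purge_sender(\@queue, 'John Porter');
print "removed $removed, ", scalar(@queue), " left\n";
```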
John Porter
------------------------------
Date: 5 Jun 1998 14:24:52 GMT
From: Tom Christiansen <tchrist@mox.perl.com>
Subject: Re: Why is there no "in" operator in Perl?
Message-Id: <6l8v3k$oqa$3@csnews.cs.colorado.edu>
[courtesy cc of this posting sent to cited author via email]
In comp.lang.perl.misc, jdporter@min.net writes:
:You are mighty quick to generalize, Tom.
Yes -- intentionally. That's because neophyte programmers who haven't
thought about these issues need very simple guidelines. In that spirit,
the question of "in a list" should be a red flag that someone chose the
wrong data structure. Someone who realizes why they really did need
that operation (which is seldom) would not have to ask how to do it.
Therefore, it is a perfectly valid generalization, and your pointing
out tiny holes in the theory that an experienced programmer
would care about does nothing to dispute this, since it is
not the experienced programmers who are asking these ridiculous
questions.
As for the ordering of hashes, or rather, the lack thereof, I have never
*once* found this to be a hardship. The workarounds are many, and easy,
and they all take more time and space than a sort in the real cases
(not the general computational complexity) that I have tested.
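The hash-as-set idiom under discussion can be sketched as follows, with the sort applied only when an ordered view is wanted. The sample data and the name is_member are made up for illustration.

```perl
use strict;
use warnings;

my @list = qw(pear apple fig);

# Build the set once; each membership test afterwards is a single hash
# lookup rather than a scan of the list.
my %in_list = map { $_ => 1 } @list;

sub is_member {
    my ($item) = @_;
    return exists $in_list{$item} ? 1 : 0;
}

print "fig is in the set\n" if is_member('fig');

# Ordering is recovered on demand with a sort.
print join(' ', sort keys %in_list), "\n";    # prints "apple fig pear"
```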
--tom
--
tmps_base = tmps_max; /* protect our mortal string */
--Larry Wall in stab.c from the perl source code
------------------------------
Date: Fri, 05 Jun 1998 10:40:43 -0500
From: fl_aggie@thepentagon.com (I R A Aggie)
Subject: Re: Why is there no "in" operator in Perl?
Message-Id: <fl_aggie-0506981040430001@aggie.coaps.fsu.edu>
In article <6l8ubn$76q1@eccws1.dearborn.ford.com>,
kfox@pt0204.pto.ford.com wrote:
+ Probably. There is the time/space trade-off to consider. Also,
+ an array has properties that the hash doesn't -- it's ordered for
+ one.
Really? I don't see it, so educate me.
I'm presuming, from your 'sort keys %hash' that you mean 'in sequence'. But
that can be guaranteed _only_ if one preallocates and fills the list
_a priori_.
I think that in a real-world example that presumption will fall flat.
For instance:
push @list,$value;
AFAIK, there is no way to know that @list will be ordered unless you
know for sure that the input itself is ordered.
+ It doesn't take many "sort keys %hash" to eliminate the
+ efficiency advantage of a hash.
True. But why would you want to? Because it's a hash, you don't need to know
_where_ something is, only that it exists. If it does exist, you already
know enough to access the value.
Besides, one could always:
@keys=sort keys %hash;
Gosh, an ordered list... :)
James
--
Consulting Minister for Consultants, DNRC
The Bill of Rights is paid in Responsibilities - Jean McGuire
To cure your perl CGI problems, please look at:
<url:http://www.perl.com/CPAN-local/doc/FAQs/cgi/idiots-guide.html>
------------------------------
Date: Fri, 5 Jun 1998 14:11:55 +0100
From: "Williams, Brian" <Brian.Williams@uuplc.co.uk>
Subject: Win32::NetResource Module
Message-Id: <96EACFA925CAD111871B00805F8B0E9D46E206@UUG-RES-MC00>
I am trying to use the Win32::NetResource module to "net use" drive
mappings, without much success. If anybody has any worked examples, or
the URL of a site with examples, please post.
Brian Williams
Brian.Williams@uuplc.co.uk
------------------------------
Date: 8 Mar 97 21:33:47 GMT (Last modified)
From: Perl-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 8 Mar 97)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.misc (and this Digest), send your
article to perl-users@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
The Meta-FAQ, an article containing information about the FAQ, is
available by requesting "send perl-users meta-faq". The real FAQ, as it
appeared last in the newsgroup, can be retrieved with the request "send
perl-users FAQ". Due to their sizes, neither the Meta-FAQ nor the FAQ
are included in the digest.
The "mini-FAQ", which is an updated version of the Meta-FAQ, is
available by requesting "send perl-users mini-faq". It appears twice
weekly in the group, but is not distributed in the digest.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address; I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V8 Issue 2816
**************************************