[17753] in Perl-Users-Digest

Perl-Users Digest, Issue: 5173 Volume: 9

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Dec 21 18:05:38 2000

Date: Thu, 21 Dec 2000 15:05:16 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <977439916-v9-i5173@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Thu, 21 Dec 2000     Volume: 9 Number: 5173

Today's topics:
    Re: assign to array of references nobull@mail.com
    Re: automatic FAQ answerer idea <smerr612@mailandnews.com>
    Re: automatic FAQ answerer idea eggrock@my-deja.com
    Re: automatic FAQ answerer idea <mischief@velma.motion.net>
    Re: bug in open() (Ilya Zakharevich)
        CGI module not deleting tmp files <gorgano@altavista.com>
    Re: Convert YM to YMD Where D Is Last D of M Using Date <Jerry.Wilcox@ucop.edu>
        Creating Union/Intersect arrays... stefan@borgia.com
    Re: Creating Union/Intersect arrays... <mothra@nowhereatall.com>
    Re: Creating Union/Intersect arrays... (John J. Trammell)
    Re: Creating Union/Intersect arrays... (Tad McClellan)
        h4x0r vNought (Tom Christiansen)
        Help Bidirectional IPC with IPC::Open2 <usenet@hank.org>
    Re: How could I get the time of a server? (Andrew N. McGuire)
    Re: HTML parse nodo70@my-deja.com
    Re: HTML parse (Jerome O'Neil)
    Re: HTML parse <jhelman@wsb.com>
    Re: HTML parse <brondsem@my-deja.com>
    Re: If I don't want to 'goto' <iltzu@sci.invalid>
        Digest Administrivia (Last modified: 16 Sep 99) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 21 Dec 2000 19:02:04 +0000
From: nobull@mail.com
Subject: Re: assign to array of references
Message-Id: <u9y9x9y4wj.fsf@wcl-l.bham.ac.uk>

"John Lin" <johnlin@chttl.com.tw> writes:

>     my @a = ('A'..'Z');
>     my @refs = \@a[7..10];
>     ${$refs[0]} = 'h';
>     ${$refs[1]} = 'i';
>     ${$refs[2]} = 'j';
>     ${$refs[3]} = 'k';
>     print @a;
> 
> __END__
> ABCDEFGhijkLMNOPQRSTUVWXYZ
> 
> Could I replace line 3~6 with one assignment?

No.  This has bugged me for years.  You need an explicit loop.  If all
you want is tidy looking code AFAIK the best you can do is something
like:

# If speed matters then write this in C not Perl
sub assign (\@@) { $$_ = shift for @{+shift} }

my @a = ('A'..'Z');
my @refs = \@a[7..10];
assign @refs => ('h'..'k');
print @a;
__END__
ABCDEFGhijkLMNOPQRSTUVWXYZ
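
For comparison, here is a minimal sketch of the explicit loop that the
helper above hides.  The names @values and $i are illustrative only.  The
second half assumes the original array is still in scope, in which case a
plain slice assignment does the job without any references.

```perl
use strict;
use warnings;

my @a      = ('A' .. 'Z');
my @refs   = \@a[7 .. 10];     # one reference per slice element
my @values = ('h' .. 'k');

# Explicit loop: write through each reference in turn.
for my $i (0 .. $#refs) {
    ${ $refs[$i] } = $values[$i];
}
my $via_refs = join '', @a;

# If the array itself is at hand, a slice assignment needs no references.
my @b = ('A' .. 'Z');
@b[7 .. 10] = ('h' .. 'k');
my $via_slice = join '', @b;

print "$via_refs\n$via_slice\n";
```

Both lines of output should be identical; the references only matter when
the code doing the assignment cannot see the array.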

-- 
     \\   ( )
  .  _\\__[oo
 .__/  \\ /\@
 .  l___\\
  # ll  l\\
 ###LL  LL\\


------------------------------

Date: Thu, 21 Dec 2000 19:34:58 GMT
From: Steven Merritt <smerr612@mailandnews.com>
Subject: Re: automatic FAQ answerer idea
Message-Id: <91tm11$l9a$1@nnrp1.deja.com>

In article <91te60$iuh$1@boomer.cs.utexas.edu>,
  logan@cs.utexas.edu (Logan Shaw) wrote:
> In article <slrn943tq7.4tk.tadmc@magna.metronet.com>,
> Tad McClellan <tadmc@metronet.com> wrote:
> >When Joe Gimme-Gimme gets pissed at the newsgroup, he starts
> >making response-triggering postings just to watch the
> >machine spew stuff...
>
> But the response machine is written in Perl and uses these
> magic things called "associative arrays" to keep track of Joe
> Gimme-Gimme's last posting.  If he has posted in the last N
> seconds (where N is something like 86400), then it doesn't
> respond to him or it sends the response only to his e-mail.

Problems with the "email them the answers" approach have been identified,
and they seem to be nontrivial.  Address munging is almost impossible to
deal with; ask the spammers.  I agree with limiting the number of replies
to any particular person per time period.
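
The per-poster limit could be sketched roughly as follows.  This is only an
illustration of the "associative array" idea quoted above: the names
may_reply, %last_reply and $WINDOW are made up here, and a real robot would
need persistent storage rather than an in-memory hash.

```perl
use strict;
use warnings;

my $WINDOW = 86_400;   # N seconds, the figure suggested in the quoted text
my %last_reply;        # poster address => epoch time of the last reply

# Returns true (and records the reply) if the poster may be answered now.
sub may_reply {
    my ($poster, $now) = @_;
    my $last = $last_reply{$poster};
    return 0 if defined $last && $now - $last < $WINDOW;
    $last_reply{$poster} = $now;
    return 1;
}

my $t0 = 977_439_916;   # arbitrary epoch time for the demonstration
for my $offset (0, 60, $WINDOW) {
    my $verdict = may_reply('joe@example.invalid', $t0 + $offset)
        ? 'reply' : 'skip';
    print "t0+$offset: $verdict\n";
}
```

Only the first and the last call should produce a reply; the one a minute
after the first falls inside the window.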

> Also, the robot limits itself to M postings per day as a failsafe.

Also a good idea.

> I'm perfectly happy if the prevailing opinion is that people don't
> want an FAQ robot, but I'm not too happy when people say "we don't
> want an FAQ robot because of <problem with trivial solution>".  I
> don't mean to be a pain, but that is what is happening so far.

Building the database is nontrivial.  Recognizing FAQs from postings
like "HELP!! ITS BROKEN!!!   I think the problem is <BLAH>..." is hard
and if you simply ignore this type of message, then you haven't really
helped either the regulars or the clueless.  We don't need to fend off
questions from the semi-clueful, we can just give them a RTFM Section X
kind of message.  We need a reliable way to clue in the clueless.

Steven
--
King of Casual Play
The One and Only Defender of Cards That Blow


Sent via Deja.com
http://www.deja.com/


------------------------------

Date: Thu, 21 Dec 2000 22:07:31 GMT
From: eggrock@my-deja.com
Subject: Re: automatic FAQ answerer idea
Message-Id: <91tuut$tfl$1@nnrp1.deja.com>

I might be way off base here but...

Is it possible to have some sort of prompt into the (proposed) system?
For instance: "My clueless question"--and I'm one of the
clueless--generates a response that "Are you looking for this?" with the
answer indexed to the FAQ and possibly to related newsgroup questions.
Rather than a post to this or any other newsgroup it could simply
generate some relevant information related to the request. If it's off
base, then the person could post to the newsgroup, after clicking on the
"Or try reading this FAQ" link of course.

Maybe O'Reilly could just index all their books electronically. ;)

That sounds like a ton of work though. I'd be happy to help; my Perl
knowledge is rather shady (compared with the 'gods', or at least people
who have had formal training in programming) but I'm willing to do grunt
work. Beats the hell out of seeing "We've already discussed this before"
responses to relevant discussions to brand newbies.

IMHO, automated posts to a newsgroup are not a good thing, but
information is.




------------------------------

Date: Thu, 21 Dec 2000 23:03:51 -0000
From: Chris Stith <mischief@velma.motion.net>
Subject: Re: automatic FAQ answerer idea
Message-Id: <t4532nfeg8o1c4@corp.supernews.com>

Logan Shaw <logan@cs.utexas.edu> wrote:
> In article <x766kedznc.fsf@home.sysarch.com>,
> Uri Guttman  <uri@sysarch.com> wrote:
>>it has been shot down for a variety of reasons.

I think it would be nice to have a test group for it,
so interested parties could see how it works and get in
on the development. It's obviously not a good idea to
test something in production on this scale anyway.

I propose the group alt.lang.perl.askthebot be created by
someone interested in the idea. Humans interested in the
idea could post there, as well as anyone who thinks that
their question may not be ready for posting to the
comp.lang.perl.* hierarchy. 

> Interesting.  I will give my answer to these reasons just for the sake
> of discussion.  It may be that it is not a good idea, but having not
> seen previous discussions about it, I haven't come to a conclusion
> myself.

>> some IIRC were it would be to hard to make accurate,

alt.lang.perl.askthebot ;)

I think having a group where people know going in that there's
AI development and testing happening across the group would
make accuracy less of a concern.

> This could be a problem.  It is easy, with a probabilistic (Bayesian)
> engine, to ignore those cases where the confidence is not high.  So,
> although perfect accuracy would be best, a nice compromise would be
> silence when the system isn't pretty sure it's accurate.  Further,
> there may be questions which it can't recognize accurately and ones it
> can.  It would be easy to use it only for the former kind.  (In fact,
> I'd start development by having it try to answer only one very
> frequently asked question.)

This would be a very good idea. We could also filter out, based upon the
number of posts by the same user per a period of time, flooding or
DOS-style attacks of the type mentioned by someone (I think Tad McClellan)
in which someone trolls for FAQ answers and gets a lot of them.

>> it would generate too much new volume,

alt.lang.perl.askthebot

Once again, not as much a problem in a group created for the purpose.
Also one of the reasons I suggest making the group part of the alt
tree instead of the comp tree. People expect lots of bogus posts in
the alt groups anyway. ;)

> Once a question is posted, it is likely to generate volume whether or
> not an autoresponder is in place.  *If* the system could be made to
> work as intended, then I'd argue it would actually decrease volume
> since it would reply very quickly, and since it could potentially
> prevent flamewars.

Mitigate flamewars, maybe. Flamewars will always happen, because
some people actually enjoy them.

>> many people don't want a robot in a newsgroup, etc.

alt.lang.perl.askthebot

> Don't we already have a robot in this newsgroup posting answers to FAQs
> randomly?  Would it be better (again, *if* it can be made to work as
> intended) to send responses in a more directed fashion?

Yes, a more directed fashion would be a plus.

>> it couldn't do the proper
>>thing by emailing the answers since so many use antispam addresses.

> I'd argue that the proper thing is to post, optionally copying on
> e-mail.  Mailing wouldn't be useful because other readers of the group
> wouldn't know when the question has already been answered and thus
> would spend their time answering an already-answered question.

I would agree with this, particularly if it's in its own group, so
people would expect a large percentage of the posts to come from the bot.

>>just watch this thread develop. :(

Hopefully this is an unexpected turn in the thread's development.
I think this could be of interest to Perl people, to AI people,
to library/research people, and to others as well. I think this sort
of project has a lot of potential merit, but I agree this is not the
place for it.

If it's corralled in its own group, this could be a good thing. 
I'd hate to see something like an IRC bot war or the old
CancelBot/RessurectorBot wars. I'm sure, though, if this worked
out well, you'd see request bots all over the alt.binaries tree.

> I will.  I hope it doesn't turn into a mess, and I'm sorry in advance
> if it does.

Chris
--
Christopher E. Stith
mischief@motion.net - if you are heesun1@mail.com or heesun3@mail.com,
                    - I have already reported you to your postmaster
- as a newsgroup email address harvester and spammer who doesn't even
- give someone a chance to unsubscribe.



------------------------------

Date: 21 Dec 2000 19:17:35 GMT
From: ilya@math.ohio-state.edu (Ilya Zakharevich)
Subject: Re: bug in open()
Message-Id: <91tl0f$mi3$1@charm.magnus.acs.ohio-state.edu>

[A complimentary Cc of this posting was NOT sent to Joe Schaefer 
<joe+usenet@sunstarsys.com>],
who wrote in article <m3r931hcjb.fsf@mumonkan.sunstarsys.com>:
> IMHO, it's because the 5.6 behavior is unpredictable.

?!  If start-process fails, it is reported to the caller.

>  The documentation I posted earlier says that open should succeed on
> a successful fork, and fail if the fork fails.

Then this is a buggy piece of documentation (quite typical, sigh).
This cannot be true, since start-a-process uses fork() on legacy
systems only.

> In 5.6, something funny is happening that passes the exec failure
> back to open.

Yes, I managed to implement sane error-reporting on legacy systems too.

Ilya


------------------------------

Date: Thu, 21 Dec 2000 21:52:00 GMT
From: Jason Hurst <gorgano@altavista.com>
Subject: CGI module not deleting tmp files
Message-Id: <91tu1t$sil$1@nnrp1.deja.com>

I'm using the CGI module to upload files to my web server.  However, I
can't seem to get the temp files that it creates to go away.  The
perldoc says those are automatically deleted when the script terminates,
but I don't see it happening.  Am I missing an option or something?  Or
is it maybe a bug on Windows?

I know you can get a reference to the file ($ref = $query->upload('foo')).
However, that doesn't work for unlinking, obviously, as you need the path
of the temp file as a string, and the string the module returns is the
path on the client's machine.

So my question is: is there any way to get the CGI module to delete
those files with an option?  Or, alternatively, is there any way to use
that reference to delete the files?  Thanks in advance for your help!

-jason JAPH





------------------------------

Date: Thu, 21 Dec 2000 11:19:54 -0800
From: Jerry Wilcox <Jerry.Wilcox@ucop.edu>
Subject: Re: Convert YM to YMD Where D Is Last D of M Using Date::Manip
Message-Id: <211220001119549983%Jerry.Wilcox@ucop.edu>

In article <91rlk0$3kc$1@nnrp2.phx.gblx.net>, Jim Monty
<monty@primenet.com> wrote:

>I want to use Date::Manip to convert date strings from YYYYMM to
>YYYYMMDD, where the DD is the last day of the month MM. How do I
>do this?
>

While I see that others have argued that Date::Manip would be overkill
(and that might be true), I would argue that the clarity of the
following solution is far greater than that of the suggested solutions,
and it does what you asked for the way you asked for it.

#!/usr/bin/perl -w
use Date::Manip;
my ($arg,$date1,$mydate);
while ($arg = shift) {
        $date1=&DateCalc($arg,"+ 1 month - 1 day");
        $mydate=&UnixDate($date1,"%Y%m%d");
        print "for input |$arg|, the desired value is |$mydate|\n";
}

Here's a sample execution:

$ testManip.pl 200012 199910 200002 200102 199602 198504
for input |200012|, the desired value is |20001231|
for input |199910|, the desired value is |19991031|
for input |200002|, the desired value is |20000229|
for input |200102|, the desired value is |20010228|
for input |199602|, the desired value is |19960229|
for input |198504|, the desired value is |19850430|

Once again, this might not be the method of choice if I were processing
millions of records on a recurring basis, but it would depend on the
need for efficiency.

Of course, you could eliminate one intermediate variable and one line
of code by combining as in

    $mydate=&UnixDate(&DateCalc($arg,"+ 1 month - 1 day"),"%Y%m%d");

and I'm sure there are other, perhaps better solutions. This is just
off the top of my head this morning.
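
For comparison with the "overkill" argument, here is a rough module-free
sketch.  The function name last_day_of_month is made up for this example,
and it assumes the input is always a six-digit YYYYMM string in the
Gregorian calendar.

```perl
use strict;
use warnings;

# Append the last day of the month to a YYYYMM string (Gregorian rules).
sub last_day_of_month {
    my ($ym) = @_;
    my ($y, $m) = $ym =~ /^(\d{4})(\d{2})$/
        or die "expected YYYYMM, got '$ym'\n";
    die "bad month in '$ym'\n" if $m < 1 || $m > 12;

    my @days = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
    my $d    = $days[$m - 1];

    # February gains a day in leap years.
    $d = 29 if $m == 2 && $y % 4 == 0 && ($y % 100 != 0 || $y % 400 == 0);

    return sprintf '%04d%02d%02d', $y, $m, $d;
}

print "$_ -> ", last_day_of_month($_), "\n" for qw(200012 200002 200102 190002);
```

Whether this is clearer than the Date::Manip version is a matter of taste;
it trades the module's flexible parsing for speed and no dependencies.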

Regards,
  Jerry


------------------------------

Date: Thu, 21 Dec 2000 21:23:13 GMT
From: stefan@borgia.com
Subject: Creating Union/Intersect arrays...
Message-Id: <91tsbp$r17$1@nnrp1.deja.com>

I can't seem to find any efficient way to determine the arrays that
should be created for a union and intersect operation.  Is there a
function for that or does anyone know a good way to do this?

@a=('abc','acb','bac','bca','cab','cba');
@b=('a','b','c','abc','bac');

# Those values that are in both
@intersect=('abc','bac')
# Those values that are not in both
@union=('a','b','c','acb','bca','cab','cba')

Thanks,
Stefan Adams




------------------------------

Date: Thu, 21 Dec 2000 13:31:54 -0800
From: mothra <mothra@nowhereatall.com>
Subject: Re: Creating Union/Intersect arrays...
Message-Id: <3A4276CA.A4138C47@nowhereatall.com>

stefan@borgia.com wrote:
> 
> I can't seem to find any effecient way to determine the arrays that
> should be created for a union and intersect operation.  Is there a
> function for that or does anyone know a good way to do this?
> 
[snipped]

try perldoc -q array and look for the entry

 How do I compute the difference of two arrays?  How do I compute the
intersection of two arrays?


------------------------------

Date: 21 Dec 2000 21:36:45 GMT
From: trammell@nitz.hep.umn.edu (John J. Trammell)
Subject: Re: Creating Union/Intersect arrays...
Message-Id: <slrn9443ji.mf9.trammell@nitz.hep.umn.edu>

On Thu, 21 Dec 2000 21:23:13 GMT, stefan@borgia.com <stefan@borgia.com> wrote:
>I can't seem to find any effecient way to determine the arrays that
>should be created for a union and intersect operation.  Is there a
>function for that or does anyone know a good way to do this?

Found in /usr/local/lib/perl5/5.6.0/pod/perlfaq4.pod
     How do I compute the difference of two arrays?  How do I
     compute the intersection of two arrays?

     Use a hash.  Here's code to do both and more.  It assumes
     that each element is unique in a given array:

         @union = @intersection = @difference = ();
         %count = ();
         foreach $element (@array1, @array2) { $count{$element}++ }
         foreach $element (keys %count) {
             push @union, $element;
             push @{ $count{$element} > 1 ? \@intersection : \@difference },
             $element;
         }

     Note that this is the symmetric difference, that is, all
     elements in either A or in B, but not in both.  Think of it
     as an xor operation.
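
Applied to the two arrays from the original question, the FAQ code can be
sketched as a strict-clean script.  The sort is an addition here so that
the output order is predictable; the FAQ version leaves the order up to
the hash.

```perl
use strict;
use warnings;

my @array1 = qw(abc acb bac bca cab cba);
my @array2 = qw(a b c abc bac);

my (@union, @intersection, @difference);
my %count;

# Count how many of the two arrays each element appears in.
$count{$_}++ for @array1, @array2;

for my $element (sort keys %count) {
    push @union, $element;
    push @{ $count{$element} > 1 ? \@intersection : \@difference }, $element;
}

print "intersection: @intersection\n";
print "difference:   @difference\n";
print "union:        @union\n";
```

Note that what the question called @union is what this code calls
@difference (the symmetric difference).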
-- 
John J. Trammell
johntrammell@yahoo.com


------------------------------

Date: Thu, 21 Dec 2000 15:53:00 -0500
From: tadmc@metronet.com (Tad McClellan)
Subject: Re: Creating Union/Intersect arrays...
Message-Id: <slrn944rdc.63g.tadmc@magna.metronet.com>

stefan@borgia.com <stefan@borgia.com> wrote:

>I can't seem to find 


Where have you been looking?

Did it include the Perl FAQs?

You are expected to check the Perl FAQ *before* posting
to the Perl newsgroup.


>any effecient way to determine the arrays that
>should be created for a union and intersect operation.  Is there a
                                   ^^^^^^^^^
>function for that or does anyone know a good way to do this?


   perldoc -q intersect

      "How do I compute the difference of two arrays?  
       How do I compute the intersection of two arrays?"


-- 
    Tad McClellan                          SGML consulting
    tadmc@metronet.com                     Perl programming
    Fort Worth, Texas


------------------------------

Date: 21 Dec 2000 13:27:41 -0700
From: tchrist@perl.com (Tom Christiansen)
Subject: h4x0r vNought
Message-Id: <3a4267bd@cs.colorado.edu>


Todo:

    Needs to be turned into subroutines.  
    Order actually matters.

#!/usr/bin/perl -p

sub pick_one { $_[rand @_] } 

# random words

s{ microsoft }{micr0\$oft}gix;
s{ Windows }{Winbloz}gx;

s{ Bill \s+ Gates }{
    pick_one (
	'3i11 Gate$',
	'The Prince of Evil',
	'Our Lord and Master, President $Bill',
	'Lord $t Bill',
    )
}gex; 

s{ool}{3w1}gix;
s{through}{thrU}gix;
s{queue}{Q}gix;
s{what}{wud}gix;
s{\beasy\b}{EZ}gix;
s{\beasier\b}{EZr}gix;
s{\beasiest\b}{EZst}gix;

s{\b hacker}{h4x0r}gix;
s{\b an \s+ elite}{a l33t}gix;
s{\b elite }{31337}gix;

s{ \b sure }{shur}gix;
s{ \b sugar }{shugger}gix;

s{ \b anyone }{NE1}gix;
s{ \b someone }{sum1}gix;
s{ \b some }{sum}gix;
s{ \B thing \b }{thun}gix;
s{ \b ounce (?= s?)  }{oz}gix;
s{ \b pound (?= s?) }{lb}gix;

s{ \b any }{NE}gix;

s{ \b about \b }{bout}gix;
s{ \b not \b }{!}gix;
s{ \b and }   {&}gix;
s{ \b or }    {|}gix;
s{ exor }    {^}gix;
s{ starr? }    {*}gix;
s{ power }    {**}gix;
s{ bang }    {!}gix;
s{ sharp }    {#}gix;
s{ slash }    {/}gix;
s{ question }    {?}gix;
s{ hook }    {?}gix;
s{ plus }    {+}gix;
s{ minus }    {-}gix;
s{ times }    {*}gix;
s/brace/{/gix;
s/bracket/[/gix;
s{ \b equal (?= s?) }    {=}gix;

s{ underline }     {_}gix;
s{ underlying }    {_}gix;
s{ underscore }    {_}gix;

s{  's  }{z}gix;

s{  issue  }{ishyu}gix;

s{ \B tion }{shun}gix;

s{ \b where \b }    {wayer}gix;

s{ \b there \b }    {thayer}gix;
s{ \b they're \b }  {thayer}gix;
s{ \b their \b }    {thayer}gix;

s{ \b use (?> [sd]?) \b }    {yooz}gix;

s{\b at \b}{\@}gix;
s{\b of \b}{uv}gix;
s{\b into \b}{N2}gix;

s{ ([bdglmnr]) s \b}{${1}z}gix;

s{\b a \b}{uh}gix;

s{ xed \b}{xt}gix;

s{\b does \b}{duz}gix;
s{ oft }{off}gix;

s{ ded \b }{did}gix;

s{ d? ge([ds]?) \b }{j$1}gix;

s{\b enough \b}{enuf}gix;
s{\b is \b}{iz}gix;
s{\b was \b}{w\@z}gix;
s{\b k? new }{nU}gix;

s{\b corr }{kr}gix;
s{ ice \b }{iz}gix;

s{ ares \b }{ayrz}gix;

s{ ows \b }{0z}gix;

s{ (?<= [aeiou] ) ct  }{k}gix;

s{ iles \b  }{ilz}gix;

s{ (?<=...) iced \b } {ist}gix;
s{ ussed \b } {ist}gix;

s{ ould \b}{ud}gix;
s{ (?<!e) ight \b}{ite}gix;

s{ ation \b}{ashun}gix;
s{ ention \b}{Nshun}gix;

s{ (?<= [^\Waeiouy] ) a (?= [^\Waeiouy] ) } { rand(100) < 30 ? '@' : "a" }egx;
s{ ack } { rand(100) < 30 ? "ax" : "ack" }egx;
s{ ble \b } {bul}gix;
s{ qu } {kw}gix;

s{ \b ch?r } {kr}gix;
s{ \b kn } {n}gix;
s{ \b wh } {w}gix;
s{ \b ps } {s}gix;
s{ \b gn } {n}gix;
s{ \b mn } {n}gix;
s{ \b wr } {r}gix;
s{ \b rh } {r}gix;
s{ \b dg } {j}gix;
s{ tch \b } {ch}gix;
s{ \b sch } {sk}gix;
s{ ck } {kk}gix;
s{ l } { rand(100) < 60 ? "1" : "l" }egx;
s{ E } { rand(100) < 30 ? "3" : "E" }egx;
#s{ a } { rand(100) < 30 ? '@' : "a" }egix;
#s{ o } { rand(100) < 30 ? '0' : "o" }egix;

s{(?:less|fewer) than}{<}g;
s{(?:more|greater) than}{>}g;

# first, the numbers

s{ \b ( alth | furl | thor | th ) ough \b }{${1}0}gix;

s{ \b one                       \b }{1}gix;
s{ \b once                      \b }{1s}gix;
s{ \b (?: two |too | to)        \b }{2}gix;

s{ \b (?: four | for | fore)       }{4}gix;
s{    (?: ate | eight )   (s?)  \b }{8$1}gix;

s{    tain (s?)  \b }{10$1}gix;
s{    ten[cs]e \b    }{10s}gix;

s{    ([iea])n[cs]e \b    }{${1}nz}gix;
s{    ens \b    }{enz}gix;
	
s{ \b (b)e (?= \b | [^\Waeiou] )   }{B}gix;

s{ eigh \b }{A}gix;
s{ ay \b }{A}gix;

s{ \b se[ea] \b }{A}gix;
s{ (?<! [gqaeio] ) ues \b }{uz}gix;

s{ \b bee? (?= [^\Waeiou] )	      }{B}gix;

s{ \b bea (?= [^\Waeiou] )	      }{B}gix;

s{ \B [cs]y \b }{C}gix;
s{ \B [cs]ie(s?) \b }{C$1}gix;
s{ \b (d)e (?= \b | [^\Waeiou] )   }{D}gix;
s{ \b eff                          }{F}gix;
s{ \b gee                       \b }{G}gix;
s{ \b (i) \b   }{ ("eye", "aye")[rand 2] }egix;
s{    elle? $                   \b }{L}gix;
s{ \b emm?                         }{M}gix;
s{ (?: \b enn? | en \b )           }{N}gix;

s{ \b in \b 			   }{N}gix;
s{ (?<= [nm] ) en }{N}gix;

s{ \b oh \b 			   }{0}gix;
s{ \b and \b 			   }{Nd}gix;

# all the -ow words into -0 words
s{arrow(?=s?\b)}{ar0}gix;
s{\barr(?=iv)}{R}gix;
s{bellow(?=s?\b)}{bel0}gix;
s{bestow(?=s?\b)}{best0}gix;
s{blow(?=s?\b)}{bl0}gix;
s{borrow(?=s?\b)}{bor0}gix;
s{crow(?=s?\b)}{cr0}gix;
s{elbow(?=s?\b)}{elb0}gix;
s{fallow(?=s?\b)}{fal0}gix;
s{fellow(?=s?\b)}{fel0}gix;
s{flow(?=s?\b)}{fl0}gix;
s{furrow(?=s?\b)}{fur0}gix;
s{glow(?=s?\b)}{gl0}gix;
s{grow(?=s?\b)}{gr0}gix;
s{hallow(?=s?\b)}{hal0}gix;
s{harrow(?=s?\b)}{har0}gix;
s{hollow(?=s?\b)}{hol0}gix;
s{know(?=s?\b)}{kn0}gix;
s{low(?=s?\b)}{l0}gix;
s{mallow(?=s?\b)}{mal0}gix;
s{marrow(?=s?\b)}{mar0}gix;
s{mellow(?=s?\b)}{mel0}gix;
s{minnow(?=s?\b)}{min0}gix;
s{narrow(?=s?\b)}{nar0}gix;
s{pillow(?=s?\b)}{pil0}gix;
s{rainbow(?=s?\b)}{rainb0}gix;
s{shadow(?=s?\b)}{shad0}gix;
s{shallow(?=s?\b)}{shal0}gix;
s{show(?=s?\b)}{sh0}gix;
s{slow(?=s?\b)}{sl0}gix;
s{snow(?=s?\b)}{sn0}gix;
s{sorrow(?=s?\b)}{sor0}gix;
s{sparrow(?=s?\b)}{spar0}gix;
s{stow(?=s?\b)}{st0}gix;
s{swallow(?=s?\b)}{swal0}gix;
s{tallow(?=s?\b)}{tal0}gix;
s{throw(?=s?\b)}{thr0}gix;
s{tomorrow(?=s?\b)}{tomor0}gix;
s{tow(?=s?\b)}{t0}gix;
s{wallow(?=s?\b)}{wal0}gix;
s{widow(?=s?\b)}{wid0}gix;
s{window(?=s?\b)}{wind0}gix;
s{yellow(?=s?\b)}{yel0}gix;

# now the P's
s{ \b pe[ea]	}{P}gix;
s{ \b peace	}{Ps}gix;
s{ \b peach	}{Pch}gix;
s{ \b peak	}{Pk}gix;
s{ \b peal	}{Pl}gix;
s{ \b peat	}{Pt}gix;
s{ \b peer	}{Pr}gix;
s{ \b peep	}{Pp}gix;
s{ \b peeve	}{Pv}gix;

# now the Q's
s{ cu ([blmnrt]) (?= [aeiou] ) }{Q$1}gix;

# now the R's
s{ \b are \b }{R}gix;
s{ \b ar (?= [^\Waeiou] ) }{R}gix;

s{ \b ( b | c | ch | g | h | j | m | p | t ) arr? (?: ed )? \b  }{${1}R}gix;
s{ \b arr ( aign | ange | ay | ear | est | ive ) }{R$1}gix;

# now the S's
s{ \b ess? }{S}gix;
s{ ess  }{S}gix;
s{ esce }{S}gix;

# now the T's
s{ \b te[ea] (?= s?) \b }{T}gix;
s{ ty \b }{T}gix;
s{ ties \b }{Tz}gix;

# now the U's
s{ \b you \b }{
    ( qw<eu ewe yoo> )[rand 3] 
}egix;

# now the V's
s{ vy \b }{V}gix;
s{ ( [ie] ) vv? ie ( [sd] )  \b }{${1}V${2}}gix;

# now the X's
s{ ex 	    }{X}gix;
s{ eck?s     }{X}gix;
s{ \b acc (?=e) }{X}gix;

# now the Y's
s{ \b why }{Y}gix;

# now the Z's
s{ zz? (?: ies? | y ) }{Z}gix;

s/ing\b/in/g;

s/'//g;
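
One possible direction for the "turn into subroutines" item in the Todo,
sketched with only four of the rules above.  The names @rules and translate
are invented for this example.  An ordered array is used rather than a hash
precisely because order matters.

```perl
use strict;
use warnings;

# Rules are applied top to bottom; each entry is [pattern, replacement].
my @rules = (
    [ qr/\bhacker/i,  'h4x0r' ],
    [ qr/\bsomeone/i, 'sum1'  ],   # must run before the shorter "some"
    [ qr/\bsome/i,    'sum'   ],
    [ qr/\bis\b/i,    'iz'    ],
);

# Apply every rule in the given order and return the rewritten text.
sub translate {
    my ($text, @rule_list) = @_;
    for my $rule (@rule_list) {
        my ($pattern, $replacement) = @$rule;
        $text =~ s/$pattern/$replacement/g;
    }
    return $text;
}

print translate('Someone is a hacker', @rules), "\n";
print translate('Someone is a hacker', reverse @rules), "\n";
```

Running the rules in reverse lets "some" fire before "someone", which is
the kind of ordering problem the Todo warns about.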


------------------------------

Date: Thu, 21 Dec 2000 11:58:46 -0800
From: Bill Moseley <usenet@hank.org>
Subject: Help Bidirectional IPC with IPC::Open2
Message-Id: <MPG.14ac00d293321e83989787@news.newsguy.com>

I've used IPC::Open2 before without problem.  But now I'm trying to 
create a client/server setup with Open2.  Basically, I want to emulate 
HTTP over a pipe, but with "Keep-Alives".

I want to open another process, then send it a set of headers, get a 
set of headers in response, and repeat the process for another request,
yet never close the pipe.  I will also need to send some content to the
server and receive some back (just like a POST).

Can someone help?


> cat client.pl

#!/usr/local/bin/perl -w
use strict;
use IPC::Open2;
use IO::Select;
use IO::Handle;

my ( $rh, $wh );

my $pid = open2($rh, $wh, './server.pl');
$pid || die "Failed to open";

my $read = IO::Select->new( $rh );

$rh->autoflush;
$wh->autoflush;

for (1..2) {
    print "\n>>$0: Sending Headers:$_\n";

    print $wh "Header-number: $_\n",
              "Content-type: perl/test\n",
              "Header: test\n\n";


    # Now read the response
    while ( 1 ) {

        my $fh;
        
        if ( ($fh) = $read->can_read(0) ) {
            print "Can read!\n";

            my $buffer = <$rh>;
            #$fh->read( $buffer, 1024 );

            last unless $buffer;

            print "<<$0: Read $buffer";
        } else {
            print "Can't read sleeping...\n";
            sleep 1;
        }
    }
    print "$0: All done!\n";
}
            
    

> cat server.pl 

#!/usr/local/bin/perl -w
use strict;

$|=1;

warn "In $0 pid=$$\n";

while (1) {
    my @headers = ();
    while ( <> ) {
        chomp;
        if ( $_ ) {
            warn "$0: Read '$_'\n";
            
            push @headers, $_;
        } else {
            for ( @headers ) {
                warn "$0: Sending $_\n";
                print $_,"\n";
            }
            print "\n";
            last;
        }
    }
}
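
A possible way to frame the exchange, offered as a sketch rather than a
diagnosis.  In the client above, "last unless $buffer" never fires for the
blank line, because the string "\n" is true, and mixing select-style
polling with buffered <$rh> reads can stall.  The version below instead
reads until the blank line that ends each header block.  The one-line echo
child is a stand-in for server.pl, and the names exchange and run_session
are invented for this example.

```perl
use strict;
use warnings;
use IPC::Open2;

# Stand-in for server.pl (an assumption): echo every line back, unbuffered.
my $SERVER = q{$| = 1; while (<STDIN>) { print }};

# Send one header block, then read lines back until the blank line.
sub exchange {
    my ($rh, $wh, @headers) = @_;
    print {$wh} "$_\n" for @headers;
    print {$wh} "\n";                 # blank line terminates the block
    my @reply;
    while (defined(my $line = <$rh>)) {
        chomp $line;
        last if $line eq '';
        push @reply, $line;
    }
    return @reply;
}

# Run several request/response rounds over one pair of pipes.
sub run_session {
    my (@requests) = @_;
    my $pid = open2(my $rh, my $wh, $^X, '-e', $SERVER);
    my @replies = map { [ exchange($rh, $wh, @$_) ] } @requests;
    close $wh;                        # EOF lets the child exit
    waitpid $pid, 0;
    return @replies;
}

my @replies = run_session(
    [ 'Header-number: 1', 'Content-type: perl/test' ],
    [ 'Header-number: 2', 'Content-type: perl/test' ],
);
print "reply: @$_\n" for @replies;
```

The pipes stay open between rounds; only the blank line marks the end of a
block, so a body (as in a POST) would need its own length header.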
    

-- 
Bill Moseley


------------------------------

Date: 21 Dec 2000 13:47:58 -0600
From: anmcguire@ce.mediaone.net (Andrew N. McGuire)
Subject: Re: How could I get the time of a server?
Message-Id: <86itodblox.fsf@hawk.ce.mediaone.net>

>>>>> "GW" == Glenn West <westxga@my-deja.com> writes:

GW> In article <90onm4$ci85@imsp212.netvigator.com>,
GW>   "Lucas" <wstsoi@hongkong.com> wrote:
>> Hi,
>> 
>> How could I get the time, date of a server by script?
>> I just checked some documents about telnet and Net::FTP,
>> but I couldn't find any clues.
>> 
>> Thanks very much again
>> 

GW> If the server is running Unix, you could connect to the port associated
GW> with the daytime service (usually port 13).

GW> HTH...

  That may not help much unless the UNIX boxen in question are his
own...  The reason being that most sysadmins comment out daytime in
/etc/inetd.conf.


anm

-- 
perl -wMstrict -e '
$a=[[qw[J u s t]],[qw[A n o t h e r]],[qw[P e r l]],[qw[H a c k e r]]];$.++
;$@=$#$a;$$=[reverse sort map$#$_=>@$a]->[$|];for$](--$...$$){for$}($|..$@)
{$$[$]][$}]=$a->[$}][$]]}}$,=$";$\=$/;print map defined()?$_:$,,@$_ for @$;
'


------------------------------

Date: Thu, 21 Dec 2000 19:32:36 GMT
From: nodo70@my-deja.com
Subject: Re: HTML parse
Message-Id: <91tlsl$l7h$1@nnrp1.deja.com>

I have read about HTML::LinkExtor, but it only extracts the links.  I also
need to store the text that goes with each link, so I wrote my own script
(below) to do that.  However, it doesn't handle some cases, so I am asking
if anyone can point out the problem.

/snip code/
#!/tools/opt/bin/perl -w
use strict;
open (INDEX, "index.html") || die "Cannot open index.html";
my @lines = <INDEX>;
close INDEX;
foreach my $line (sort @lines) {
   chomp $line;
   if ($line =~ /\<a/i) {
           my ($link,$text) = ($line =~ /<a href\=\"(.+)\"\>(.+)<\/a>/i);
           #$link =~ s/(.+)"(.*)/$1/;
           #$text =~ s/\<.+\>(.+)\<\/.+\>/$1/gsi if ($text =~ m/\>/);
           #$text =~ s/<\/.+\>//g;
           my ($linkText)   = sprintf ("%-60s %-s\n",$link,$text);
           print $linkText;
   }
}
/snip code/

The above script works on a line like:
<a href="http://www.yourname.com">Your Name</a>

print out:
http://www.yourname.com           Your Name


But it doesn't work with a line like this:
<a href="http://www.yourname.com"><p align="center">Your Name</a>

print out:
http://www.yourname.com"><p align="center      Your Name

Because it matches up to the last double quote, but I don't know how to
fix it.  Please help.

Thanks,
Nodo




------------------------------

Date: Thu, 21 Dec 2000 19:57:37 GMT
From: jerome@activeindexing.com (Jerome O'Neil)
Subject: Re: HTML parse
Message-Id: <Rgt06.1991$eK1.284554@news.uswest.net>

nodo70@my-deja.com elucidates:
> I have read HTML::LinkExtor but it only extract the links.  I also need
> to store the text which is for the link.  So I write my own script
> below would do that.  However, in some case it doesn't handle all so I
> ask if anyone can point it out.

If HTML::LinkExtor isn't exactly what you need, then you should use
HTML::Parser, or some descendant of it, to get exactly what you need.  

It's a whole lot easier than trying to use a regex.

Try this to get started.  Then get the documentation, and read it until
you know it.

#!/usr/local/bin/perl
use strict;
use HTML::TreeBuilder;
my $file = shift;

open(F, $file) or die "$!: $file";
my $s = read F, my $f, -s F, 0;
close F;

my $tree = HTML::TreeBuilder->new;
$tree->parse($f);

my $links = $tree->extract_links('a');

foreach(@{$links}){
    my($link, $element) = @$_;
    print $element->content_list, qq{\n};
}


> But it doesn't work with some kind like this:
> <a href="http://www.yourname.com"><p align="center">Your Name</a>

Of course it doesn't.  You are using a saw when you need a hammer.

If you are dealing with HTML files, use tools that will help you with
HTML files.  Regex's are nice, but they aren't a global panacea for
searching.

> Because it look the last double but I don't know how to fix it.  Please
> help.

perldoc HTML::Parser, HTML::TreeBuilder, and other HTML::* modules.

-- 
If men could learn from history, what lessons it might teach us!  But
passion and party blind our eyes, and the light which experience gives
is a lantern on the stern, which shines only on the waves behind us.
				--Samuel Taylor Coleridge, "Recollections"


------------------------------

Date: Thu, 21 Dec 2000 20:16:41 GMT
From: Jeff Helman <jhelman@wsb.com>
Subject: Re: HTML parse
Message-Id: <3A426584.B675DABF@wsb.com>

nodo70@my-deja.com wrote:
> 
> I have read HTML::LinkExtor but it only extract the links.  I also need
> to store the text which is for the link.  So I write my own script
> below would do that.  However, in some case it doesn't handle all so I
> ask if anyone can point it out.

Note that I agree with other posters that you should probably be using
the HTML::* family of packages, but if you really want to shoot yourself
in the foot, far be it from me to deny you the shotgun. :)

This is by no means perfect, but it should get just about everything. 
Note, however, that it doesn't provide much of the nice functionality
that HTML::LinkExtor does (like cleaning up relative links).  Should you
want that, just add it in.

First off, slurp your file into a scalar named $Body (I'll leave the how
of that to you).  Then run the following regexp against it.

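# (.+?)\1 stops at the matching close quote; inside [...] a \1 would be
# an octal escape, not a backreference.
while ($Body =~ m!<A[^>]+HREF=(['"])(.+?)\1[^>]*>(.+?)</A>!sig) {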
	my ($link, $text) = ($2, $3);
	my $linkText = sprintf ("%-60s %-s\n",$link,$text);
	print $linkText;
}

This will catch any link that may include other attributes besides HREF in
the <A...> tag, as well as links that break across lines and links
enclosed in single quotes.  But, as I said above, this is by no means
perfect (<CAVEAT>it's actually the product of about 30 seconds of
thought, so I'm sure that someone could poke a hole in it pretty
easily</CAVEAT>), but it should get you most of the way there.

There you go.  Fire away. :)
JH



------------------------------

Date: Thu, 21 Dec 2000 20:07:01 GMT
From: Dave Brondsema <brondsem@my-deja.com>
Subject: Re: HTML parse
Message-Id: <91tnsu$n6b$1@nnrp1.deja.com>

In article <91tlsl$l7h$1@nnrp1.deja.com>,
  nodo70@my-deja.com wrote:
> I have read HTML::LinkExtor but it only extract the links.  I also need
> to store the text which is for the link.  So I write my own script
> below would do that.  However, in some case it doesn't handle all so I
> ask if anyone can point it out.

By default, a regex quantifier is "greedy", meaning it will match as
much as it can.  To stop it from being greedy, put a ? directly after
the quantifier (.+? instead of .+).
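
A small sketch of the difference, using the problem line from the
original post as sample data.  The variable names are made up for
illustration, and the third pattern (a negated character class) is an
alternative I am adding, not something from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = '<a href="http://www.yourname.com"><p align="center">Your Name</a>';

# Greedy: (.+) grabs as much as it can, so it ends at the LAST "> on the line.
my ($greedy) = $line =~ /<a href="(.+)">/i;

# Non-greedy: (.+?) grabs as little as it can, so it ends at the FIRST ">.
my ($lazy) = $line =~ /<a href="(.+?)">/i;

# Alternative: never let the match cross a double quote in the first place.
my ($negated) = $line =~ /<a href="([^"]+)"/i;

print "greedy:  $greedy\n";
print "lazy:    $lazy\n";
print "negated: $negated\n";
```

The negated class tends to be the more robust of the two fixes, since it
cannot run past the closing quote no matter what follows it.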

>
> /snip code/
> #!/tools/opt/bin/perl -w
> use strict;
> open (INDEX, "index.html") || die "Cannot open index.html";
> my @lines = <INDEX>;
> close INDEX;
> foreach my $line (sort @lines) {
>    chomp $line;
>    if ($line =~ /\<a/i) {
>            my ($link,$text) = ($line =~ /<a href\=\"(.+)\"\>(.+)<\/a>/i);

>            my ($link,$text) = ($line =~ /<a href\=\"(.+?)\"\>(.+)<\/a>/i);

The added ? (inside the parentheses, right after the +) should fix your
problem (untested).

>            #$link =~ s/(.+)"(.*)/$1/;
>            #$text =~ s/\<.+\>(.+)\<\/.+\>/$1/gsi if ($text =~ m/\>/);
>            #$text =~ s/<\/.+\>//g;
>            my ($linkText)   = sprintf ("%-60s %-s\n",$link,$text);
>            print $linkText;
>    }
> }
> /snip code/
>
> The above script works a line like:
> <a href="http://www.yourname.com">Your Name</a>
>
> print out:
> http://www.yourname.com           Your Name
>
> But it doesn't work with some kind like this:
> <a href="http://www.yourname.com"><p align="center">Your Name</a>
>
> print out:
> http://www.yourname.com"><p align="center      Your Name
>
> Because it look the last double but I don't know how to fix it.  Please
> help.
>
> Thanks,
> Nodo
>
> Sent via Deja.com
> http://www.deja.com/
>

--
Dave Brondsema


Sent via Deja.com
http://www.deja.com/


------------------------------

Date: 21 Dec 2000 20:04:10 GMT
From: Ilmari Karonen <iltzu@sci.invalid>
Subject: Re: If I don't want to 'goto'
Message-Id: <977428144.2821@itz.pp.sci.fi>

In article <3A420BBD.26ECDF1F@schaffhausen.de>, Malte Ubl wrote:
>John Lin schrieb:
 [snip]
>> is not suitable because many "my" variables are involved,
>> they would either be invisible or need to be passed as arguments
>> which is ugly.
>
>I guess you have to give a better reason to why not use subs. You present
>a perfect solution to your problem and then you say its ugly. You cant expect
>us to give you something better than the best. In case you wouldn't have
>mentioned the subs, probably, everybody here would have told you to use
>them.

Well, he does have a point.  It is frequently useful for an error
handling routine to dump lots of internal state for debugging, or at
least to have access to all the relevant state data so that it can
pick the relevant parts to display based on the error type.

This is by definition the opposite of encapsulation, and therefore
mechanisms which enforce encapsulation, such as lexical variables,
suddenly become a problem.  The most practical solution is often to
keep the error handling code within the same scope as it is called
from, but this prevents refactoring it to an external subroutine.

However, a solution that does work, as Abigail pointed out, is to use
a closure defined in the same scope:

  while (condition) {
      my ($lots, $of, $internal, $lexical, $state, $variables);
  #   . . .
      my $fail = sub {
	  warn "I can see the lexicals here! (state = $state)\n";
      };
  #   . . .
      $fail->(), next if errorA;
  #   . . .
      $fail->(), next if errorB;
  }
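
For a version that actually runs, here is a minimal sketch of the same
pattern.  The input list, the @log array and the two error conditions
are all made up for illustration; only the closure-in-the-loop idea
comes from the outline above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @log;    # collects the messages so the effect is easy to inspect

for my $record ('10', 'abc', '-3', '7') {    # made-up input
    my $state = "record '$record'";

    # The closure is created inside the loop body, so it sees this
    # iteration's lexicals without having them passed as arguments.
    my $fail = sub {
        my ($why) = @_;
        push @log, "$why ($state)";
    };

    $fail->('not a number'), next unless $record =~ /^-?\d+$/;
    $fail->('negative'),     next if $record < 0;

    push @log, "ok ($state)";
}

print "$_\n" for @log;
```

A fresh closure is built on every pass through the loop, which costs a
little, but each one is bound to that iteration's own variables.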

-- 
Ilmari Karonen -- http://www.sci.fi/~iltzu/
"Get real!  This is a discussion group, not a helpdesk.  You post
 something, we discuss its implications.  If the discussion happens to
 answer a question you've asked, that's incidental." -- nobull in clpm





------------------------------

Date: 16 Sep 99 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 16 Sep 99)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

| NOTE: The mail to news gateway, and thus the ability to submit articles
| through this service to the newsgroup, has been removed. I do not have
| time to individually vet each article to make sure that someone isn't
| abusing the service, and I no longer have any desire to waste my time
| dealing with the campus admins when some fool complains to them about an
| article that has come through the gateway instead of complaining
| to the source.

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending Perl questions to the -request address; I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V9 Issue 5173
**************************************

