[18150] in Perl-Users-Digest


Perl-Users Digest, Issue: 318 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Feb 20 00:05:48 2001

Date: Mon, 19 Feb 2001 21:05:10 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <982645510-v10-i318@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Mon, 19 Feb 2001     Volume: 10 Number: 318

Today's topics:
    Re: (OFF TOPIC) Character read stopping at 200 characte <godzilla@stomp.stomp.tokyo>
    Re: Character read stopping at 200 characters <obiwan@mahood.com>
    Re: Character read stopping at 200 characters (Martien Verbruggen)
        chop and chomp (Another Way)
    Re: chop and chomp (Chris Fedde)
    Re: delete lock file, possible race condition or dead l <johnlin@chttl.com.tw>
    Re: Examples of Using HTML Parser <bart.lateur@skynet.be>
    Re: FAQ 4.48:   How do I select a random element from a <godzilla@stomp.stomp.tokyo>
    Re: kill unix process <krahnj@acm.org>
    Re: kill unix process (Gwyn Judd)
    Re: matching special characters? (typo correction) <chart@bestweb.net>
        matching special characters? <chart@bestweb.net>
    Re: matching special characters? (Martien Verbruggen)
    Re: matching special characters? <godzilla@stomp.stomp.tokyo>
    Re: perl irc channel (Gwyn Judd)
    Re: PROPOSAL: Graphics::ColorNames (Martien Verbruggen)
    Re: PROPOSAL: Graphics::ColorNames (Abigail)
    Re: PROPOSAL: Graphics::ColorNames <sam@illuminated.co.uk>
        Reg Ex Help Pls <grichards@endertechnology.com>
    Re: Reg Ex Help Pls <bwalton@rochester.rr.com>
    Re: Reg Ex Help Pls <joe+usenet@sunstarsys.com>
    Re: Reg Ex Help Pls <krahnj@acm.org>
    Re: Reg Ex Help Pls <godzilla@stomp.stomp.tokyo>
    Re: Reg Ex Help Pls ianb@ot.com.au
        sending cookies with LWP user agent <inaganami@email.uc.edu>
    Re: Specifying the length of regular expression ianb@ot.com.au
    Re: striping HTML <no@email.com>
    Re: This sending of mail works flawlesly... but... <me@noshadow.net>
    Re: use of => (Martien Verbruggen)
    Re: use of => <godzilla@stomp.stomp.tokyo>
        Digest Administrivia (Last modified: 16 Sep 99) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Mon, 19 Feb 2001 20:11:26 -0800
From: "Godzilla!" <godzilla@stomp.stomp.tokyo>
Subject: Re: (OFF TOPIC) Character read stopping at 200 characters
Message-Id: <3A91EE6E.E11C8F53@stomp.stomp.tokyo>

Martien Verbruggen wrote:
 
> [Removed comp.lang.perl. 
> Group no longer exists, please 
> inform your news admin]
 
(snipped)

Actually, comp.lang.perl and alt.perl do exist.
Randal, others and myself post in both groups
on a regular basis. You should correct yourself
by saying, "Most servers do not carry those groups."

Incidentally, both rogue groups are relatively free
of conflict, discontent and hatred which you and
others bring to this group, habitually.

Do USENET a favor. Don't you or your friends
here, move into those rogue groups. Please,
if you don't mind. There are ample trolls
like yourself to be found everywhere in USENET.
Help us to keep at least two groups peaceful
compared to the rest of USENET.


Godzilla!


------------------------------

Date: Mon, 19 Feb 2001 21:32:29 -0500
From: Obi Wan <obiwan@mahood.com>
To: MJS <sharty@ccbs.com>
Subject: Re: Character read stopping at 200 characters
Message-Id: <3A91D73D.6439CC27@mahood.com>

You say that the files get written properly, with all 1000 characters,
right?

But when you "recall the file for editing again" the lines are chopped at
200 characters?

What are you doing when you "recall the file for editing again"?

If you are opening it with perl, this tells me that when reading from a
file on an NT system with perl, 1000 character lines are truncated to 200
characters.  Is this what's happening to you?  If so (and this is a
guess), maybe NT has a line-length limit of 200 characters?  Who knows...
I'd say check the man pages, but.... alas, documentation in windows?

MJS wrote:

> Hello:
>
> I'm running active Perl on NT.
>
> I have several scripts that I "inherited" and am trying to work with.
>
> One reads an edit file (text), and builds a web page that the user can
> then input/edit data in.  It outputs it through a form post and a CGI
> script.
>
> It writes 2 files, a TEXT file and an HTML shell.
>
> The field in question is a free flow text box, that I want to have,
> say 1000 characters in.  When I run the script, the entire string is
> written out successfully to both files (TXT and HTML).
>
> When I recall the file for editing again, PERL is cutting off the 1000
> character input at 200.
>
> How do I extend the maximum default length of a variable, or define it
> to be a certain length??
>
> Thank you.
>
> Mike



------------------------------

Date: Tue, 20 Feb 2001 02:38:28 GMT
From: mgjv@tradingpost.com.au (Martien Verbruggen)
Subject: Re: Character read stopping at 200 characters
Message-Id: <slrn993m54.3kr.mgjv@verbruggen.comdyn.com.au>

[Removed comp.lang.perl. Group no longer exists, please inform your
news admin]

On Fri, 16 Feb 2001 08:29:41 GMT,
	MJS <sharty@ccbs.com> wrote:
> Hello:
> 
> I'm running active Perl on NT.
> 
> I have several scripts that I "inherited" and am trying to work with.
> 
> One reads an edit file (text), and builds a web page that the user can
> then input/edit data in.  It outputs it through a form post and a CGI
> script.
> 
> It writes 2 files, a TEXT file and an HTML shell.
> 
> The field in question is a free flow text box, that I want to have,
> say 1000 characters in.  When I run the script, the entire string is
> written out successfully to both files (TXT and HTML).

You checked this by having a look at the file, I presume? How did you
check? With what sort of tool?

> When I recall the file for editing again, PERL is cutting off the 1000
> character input at 200.

It's Perl or perl. Not PERL.

And Perl has no such limitations. Perl is not doing this. If you have,
by any chance, a ctrl-z character in the file, then the underlying
system libraries will truncate the file at that point, on DOS-based
systems. Maybe that's what's happening?

> How do I extend the maximum default length of a variable, or define it
> to be a certain length?? 

You can't. You don't have to. I've had much larger strings than 1000
characters in Perl variables. 

Something else is wrong. But we don't know what, because you haven't
given us any code to inspect.

Maybe you should make absolutely sure that the file you write _is_ a
text file, i.e. doesn't contain any stuff that shouldn't be in a text
file on the platform you're in.
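
A minimal sketch of how one might check a file for an embedded ctrl-z
(byte 0x1A), as suggested above. The sub name first_ctrl_z_offset is
hypothetical, not something from the original scripts. binmode is used
so the file is read as raw bytes rather than in text mode.

```perl
use strict;
use warnings;

# Hypothetical helper: return the byte offset of the first ctrl-z (0x1A)
# in the file, or -1 if the file contains none.
sub first_ctrl_z_offset {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;                       # raw bytes, no text-mode translation
    local $/;                          # slurp mode: read the whole file at once
    my $data = <$fh>;
    close $fh;
    $data = '' unless defined $data;   # guard against an empty read
    return index($data, "\x1A");
}
```

Calling first_ctrl_z_offset on the written text file and getting a value
near 200 would be consistent with the truncation described, though that
is only a guess until the actual code is posted.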

Martien
-- 
Martien Verbruggen              | 
Interactive Media Division      | I took an IQ test and the results
Commercial Dynamics Pty. Ltd.   | were negative.
NSW, Australia                  | 


------------------------------

Date: 20 Feb 2001 04:22:56 GMT
From: anotherway83@aol.com (Another Way)
Subject: chop and chomp
Message-Id: <20010219232256.04852.00000222@ng-ch1.aol.com>

hey
whats the diff. between chop and chomp (please don't say sumthing like "chomp
has an M, chop doesn't", lol)they seem to work the same way

thanks
peace


------------------------------

Date: Tue, 20 Feb 2001 04:30:54 GMT
From: cfedde@fedde.littleton.co.us (Chris Fedde)
Subject: Re: chop and chomp
Message-Id: <2qmk6.283$zN2.135468544@news.frii.net>

In article <20010219232256.04852.00000222@ng-ch1.aol.com>,
Another Way <anotherway83@aol.com> wrote:
>hey
>whats the diff. between chop and chomp (please don't say sumthing like "chomp
>has an M, chop doesn't", lol)they seem to work the same way
>
>thanks
>peace

Should be obvious from the perlfunc manual page:

       chomp   This safer version of the chop entry elsewhere in
               this document removes any trailing string that
               corresponds to the current value of `$/'

       chop    Chops off the last character of a string and
               returns the character chopped.
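
The difference described in the manual page can be seen in a short
sketch. The variable names are made up for illustration; the default
value of $/ (a newline) is assumed.

```perl
use strict;
use warnings;

my $with_newline    = "hello\n";
my $without_newline = "hello";

# chomp removes a trailing input record separator ($/, "\n" by default)
# and leaves the string alone if there is none.
chomp(my $chomped_1 = $with_newline);      # "hello"
chomp(my $chomped_2 = $without_newline);   # "hello"

# chop removes the last character, whatever it is.
chop(my $chopped_1 = $with_newline);       # "hello"
chop(my $chopped_2 = $without_newline);    # "hell"

print "chomp: [$chomped_1] [$chomped_2]\n";
print "chop:  [$chopped_1] [$chopped_2]\n";
```

The two only behave the same when the string really does end in a
newline, which is why they can seem to work the same way on lines read
from a file.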

Hope this helps
chris
-- 
    This space intentionally left blank


------------------------------

Date: Tue, 20 Feb 2001 10:02:54 +0800
From: "John Lin" <johnlin@chttl.com.tw>
Subject: Re: delete lock file, possible race condition or dead lock?
Message-Id: <96sjbo$2gv@netnews.hinet.net>

None of this really has anything whatever to do with Perl - the same
principles would apply whatever language we did this in.

<nobull@mail.com> wrote
> "John Lin" writes:
>
> > For some reason (to keep the directory clean) I have to delete
> > the lock file after locking.
> > Could this extra action (delete the lock file) cause race condition
> > or dead lock?

> There is a race because ... [snip]
> Note the -f check is redundant so you may as well remove it.

Yes, I understand now.

> If the filesystem and OS support it, you can use sysopen to atomically
> create a lock file.  In this case the flock() becomes redundant.

Hmm...  Here I picture out the model

atomic_create_and_lock($filename);
    critical_section_here();
atomic_unlock_and_delete($filename);

But no system has such  atomic_unlock_and_delete($filename)
(unless we use another lock file.  Oh, no.)

or

F = atomic_create_and_lock($filename);
    critical_section_here();
    delete($filename);    # inside lock-unlock section
unlock(F);

Delete an opened file?  Experiment shows the delete always fails.

Then, is this a mission impossible (to create then clean-up lock file)?  : (
Any other ideas?
Thank you very much.
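
A minimal sketch of the sysopen approach quoted above, assuming a
filesystem where O_CREAT with O_EXCL is atomic (this is not guaranteed
everywhere, for example on some network filesystems). The sub names
acquire_lock and release_lock are hypothetical. The handle is closed
before the file is deleted, so the delete never targets an open file.

```perl
use strict;
use warnings;
use Fcntl qw(O_CREAT O_EXCL O_WRONLY);

# Hypothetical helper: try to create the lock file atomically.
# Returns a filehandle on success, or undef if the file already exists
# (or cannot be created for some other reason).
sub acquire_lock {
    my ($path) = @_;
    sysopen(my $fh, $path, O_CREAT | O_EXCL | O_WRONLY) or return;
    return $fh;
}

# Hypothetical helper: close the handle first, then remove the file.
sub release_lock {
    my ($fh, $path) = @_;
    close $fh     or warn "close failed: $!";
    unlink $path  or warn "unlink failed: $!";
    return;
}
```

Here the existence of the file is the lock, so no flock() is involved.
One trade-off of this design is that a process that dies inside the
critical section leaves a stale lock file behind.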

John Lin





------------------------------

Date: Tue, 20 Feb 2001 00:03:16 GMT
From: Bart Lateur <bart.lateur@skynet.be>
Subject: Re: Examples of Using HTML Parser
Message-Id: <h2d39tkogvo8p97p9c1iict9trk4m4g82j@4ax.com>

Gabriel Richards wrote:

>"In order to make the parser do anything interesting, you must make a
>subclass where you override one or more of the following methods as
>appropriate:"
>
>A subclass? What's that? "The default implementation of these methods do
>nothing, i.e., the tokens are just ignored." So, how do I change the
>implementation? Can someone point me to some examples please?
	
	<http://www.gellyfish.com/htexamples/>

-- 
	Bart.


------------------------------

Date: Mon, 19 Feb 2001 18:34:27 -0800
From: "Godzilla!" <godzilla@stomp.stomp.tokyo>
Subject: Re: FAQ 4.48:   How do I select a random element from an array?
Message-Id: <3A91D7B3.4C8F1C84@stomp.stomp.tokyo>

PerlFAQ Server wrote:

(snippage)

>   How do I select a random element from an array?
 
>   Use the rand() function (see the rand entry in the perlfunc manpage):
 
>   # at the top of the program:

A call for srand does not need to be at the top
of script. It only needs to precede its first use.

A script should not call srand unless srand is
actually to be used. It is inefficient to call
a function and not use it. A conditional should
be used to determine if random is needed under
circumstances where it may or may not be needed.


> srand;  # not needed for 5.004 and later
 
> # then later on
> $index   = rand @array;
> $element = $array[$index];
 

There are both a safer way to do this and a more
efficient way to do this.

For safety, to assure better portability for older
versions of Perl, an if conditional can call srand
only if needed:

  $index   = rand @array;

  if (!($index))
   { srand; $index = rand @array }

  $element = $array[$index];


This following method is more efficient but does
not include a srand call for portability:

$element = $array[rand(@array)];

However, an if conditional could be easily added
to check $element as in my previous example.
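
For reference, the one-line form can be exercised with a short sketch.
The array contents are made up for illustration. Note that rand can
legitimately return a value that truncates to index 0.

```perl
use strict;
use warnings;

my @array = qw(red green blue);

# rand @array evaluates @array in scalar context (its length) and returns
# a fractional number >= 0 and < that length. Using the result as an
# array index truncates it to an integer, so any element may be chosen,
# including the one at index 0.
my $element = $array[rand @array];

print "$element\n";
```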

This FAQ needs to be updated to correct errors
and to display better methodology.


Godzilla!


------------------------------

Date: Tue, 20 Feb 2001 03:21:01 GMT
From: "John W. Krahn" <krahnj@acm.org>
Subject: Re: kill unix process
Message-Id: <3A91E3CC.6D054651@acm.org>

Ji Lee wrote:
> 
> here is my code:
> 
> exec "/usr/bin/ksh", "-c", <<EOF;
> ps -ef|grep slapd |grep -v grep |awk '{print $2}'|xargs kill -9
> sleep 2

system( q(ps -ef|grep slapd |grep -v grep |awk '{print $2}'|xargs kill -9) );

Or for a more Perlish way:

kill 9, map { (split " ")[1] } grep( /slapd/, `ps -ef` );

> I will to convert the above line to perl. can anyone help?

And of course if you have the command killall do:
killall -9 slapd
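
The PID extraction in the Perlish version can be tried safely on canned
input before pointing it at real processes. This is only a sketch: the
sub name pids_matching and the sample ps lines are made up, and the PID
is assumed to be the second whitespace-separated column, as in ps -ef.

```perl
use strict;
use warnings;

# Hypothetical helper: given lines in `ps -ef` format, return the PIDs
# (second whitespace-separated column) of lines mentioning $name.
sub pids_matching {
    my ($name, @ps_lines) = @_;
    return map { (split ' ', $_)[1] } grep { /\Q$name\E/ } @ps_lines;
}

# Made-up sample output, for illustration only.
my @sample = (
    "UID        PID  PPID  C STIME TTY          TIME CMD\n",
    "root       412     1  0 09:00 ?        00:00:02 /usr/sbin/slapd\n",
    "ji        1033   900  0 09:05 pts/1    00:00:00 vi notes.txt\n",
    "root       977     1  0 09:01 ?        00:00:00 /usr/sbin/slapd -d 1\n",
);

my @pids = pids_matching('slapd', @sample);
print "@pids\n";    # prints "412 977"

# On a live system one might then run (not executed here):
#   kill 'KILL', pids_matching('slapd', `ps -ef`);
```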


John


------------------------------

Date: Tue, 20 Feb 2001 04:27:24 GMT
From: tjla@guvfybir.qlaqaf.bet (Gwyn Judd)
Subject: Re: kill unix process
Message-Id: <slrn993sha.s3i.tjla@thislove.dyndns.org>

I was shocked! How could Ji Lee <jlee8@irix1.gl.umbc.edu>
say such a terrible thing:
>
>here is my code:
>
>exec "/usr/bin/ksh", "-c", <<EOF;
>ps -ef|grep slapd |grep -v grep |awk '{print $2}'|xargs kill -9
>sleep 2
>
>I will to convert the above line to perl. can anyone help?

Ok to translate the above line to Perl (not 'perl') you need to add the
line 'EOF' just after the 'sleep 2'.

-- 
Gwyn Judd (print `echo 'tjla@guvfybir.qlaqaf.bet' | rot13`)
Actors search for rejection. If they don't get it they reject themselves.
-Chevy Chase


------------------------------

Date: Tue, 20 Feb 2001 01:23:33 GMT
From: GWN <chart@bestweb.net>
Subject: Re: matching special characters? (typo correction)
Message-Id: <B6B7313C.E387%chart@bestweb.net>

in article B6B72FA9.E2C5%chart@bestweb.net, GWN at chart@bestweb.net wrote
on 2/19/01 8:16 PM:

> Hello perl experts,
> I'm a novice perl programmer and have a problem that has me stumped. My
> program  is looking for a match for words entered on a form to a field in a
> database. The words entered on the form are in @ckeywords. The values from
> the database are in $keywords. Here is the code I have for checking:
> 
> 
> foreach $search_term (@ckeywords)  {
> $_=$keywords;
> if (/$search_term/i)  {
> $marker=1;}
> }
> 
> I realize there is probably a better way to do this (how do I do the check
> for a match without putting $keywords into $_   ?) but it seems to work
> fine, unless a quotation mark is entered on the form. The program just stops
> at the "if ($search_term/i)  {"   line when the $search_term is ? .
> 
> 
> It also has problems with other special characters.
> 
> I would like to able to have the program match a question mark. Any insight
> into how I match special characters with this program would be greatly
> appreciated.
> 
> thanks :)
> 

Correcting typo...in the paragraph after the code it's a question mark, not
a quotation mark...the paragraph should read:

I realize there is probably a better way to do this (how do I do the check
for a match without putting $keywords into $_   ?) but it seems to work
fine, unless a question mark is entered on the form. The program just stops
at the "if ($search_term/i)  {"   line when the $search_term is ? .


thanks



------------------------------

Date: Tue, 20 Feb 2001 01:16:42 GMT
From: GWN <chart@bestweb.net>
Subject: matching special characters?
Message-Id: <B6B72FA9.E2C5%chart@bestweb.net>

Hello perl experts,
    I'm a novice perl programmer and have a problem that has me stumped. My
program  is looking for a match for words entered on a form to a field in a
database. The words entered on the form are in @ckeywords. The values from
the database are in $keywords. Here is the code I have for checking:


foreach $search_term (@ckeywords)  {
       $_=$keywords;
       if (/$search_term/i)  {
           $marker=1;}
}

I realize there is probably a better way to do this (how do I do the check
for a match without putting $keywords into $_   ?) but it seems to work
fine, unless a quotation mark is entered on the form. The program just stops
at the "if ($search_term/i)  {"   line when the $search_term is ? .


It also has problems with other special characters.

I would like to able to have the program match a question mark. Any insight
into how I match special characters with this program would be greatly
appreciated.

thanks :)



------------------------------

Date: Tue, 20 Feb 2001 01:31:45 GMT
From: mgjv@tradingpost.com.au (Martien Verbruggen)
Subject: Re: matching special characters?
Message-Id: <slrn993i81.3kr.mgjv@verbruggen.comdyn.com.au>

On Tue, 20 Feb 2001 01:16:42 GMT,
	GWN <chart@bestweb.net> wrote:

>        if (/$search_term/i)  {

> I would like to able to have the program match a question mark. Any insight
> into how I match special characters with this program would be greatly
> appreciated.

Look at the quotemeta function, described in perlfunc, or the \Q\E
modifiers described in the perlre documentation.

if (/\Q$search_term/i)

You don't need to assign $keywords to $_ BTW.

if ($keywords =~ /\Q$search_term/i)

will match $search_term directly against $keywords. This is described
in perlre and perlop.
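
A short sketch of what \Q changes, using made-up strings. An unquoted
"?" is treated as a regex quantifier, and a search term consisting of
only "?" is not a valid pattern at all, which would explain the program
stopping at that line.

```perl
use strict;
use warnings;

my $search_term = "perl?";
my $with_mark   = "Is this Perl? Yes.";
my $without     = "Perl is fun";

# Quoted: the term, including its "?", must appear literally.
my $quoted_hit  = ($with_mark =~ /\Q$search_term\E/i) ? 1 : 0;   # 1
my $quoted_miss = ($without   =~ /\Q$search_term\E/i) ? 1 : 0;   # 0

# Unquoted: "l?" means "an optional l", so this matches "Perl".
my $unquoted_hit = ($without =~ /$search_term/i) ? 1 : 0;        # 1

# A lone "?" is a fatal error as a pattern unless it is quoted.
my $lone       = "?";
my $lone_dies  = eval { $with_mark =~ /$lone/; 1 } ? 0 : 1;      # 1
my $lone_found = ($with_mark =~ /\Q$lone\E/) ? 1 : 0;            # 1

print "$quoted_hit $quoted_miss $unquoted_hit $lone_dies $lone_found\n";
```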

Martien

PS. To read the documentation, use the man command, or the perldoc
command, or Shuck, or the installed HTML documentation, all depending
on the platform you're on.
-- 
Martien Verbruggen              | 
Interactive Media Division      | Useful Statistic: 75% of the people
Commercial Dynamics Pty. Ltd.   | make up 3/4 of the population.
NSW, Australia                  | 


------------------------------

Date: Mon, 19 Feb 2001 18:10:42 -0800
From: "Godzilla!" <godzilla@stomp.stomp.tokyo>
Subject: Re: matching special characters?
Message-Id: <3A91D222.6E7B0CB9@stomp.stomp.tokyo>

GWN wrote:

> My program  is looking for a match for words entered on a form 
> to a field in a database. The words entered on the form are in
> @ckeywords. The values from the database are in $keywords.

> foreach $search_term (@ckeywords)  {
>        $_=$keywords;
>        if (/$search_term/i)  {
>            $marker=1;}
> }


A presumption is made your variable $keywords holds
single line data.

You will discover using index () to be significantly
faster, uses less memory, does not have 'bugs' with 
user metacharacters and, will improve the performance
and the efficiency of your script.

foreach $search_term (@ckeywords)
 {
  if (index ($keywords, $search_term) > -1)
   { $marker = 1; }
 }

You may add a counter if you wish to track
the number of matches:

$counter = 0;
foreach $search_term (@ckeywords)
 {
  if (index ($keywords, $search_term) > -1)
   { $marker = 1; $counter++; }
 }

You may exit your loop if you wish to
quit on your first match:

foreach $search_term (@ckeywords)
 {
  if (index ($keywords, $search_term) > -1)
   { $marker = 1; last; }
 }
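
One difference from the original code is that index () is case
sensitive, while the original match used the /i modifier. A minimal
sketch of one way to keep the case-insensitive behaviour, assuming
plain ASCII data, is to lowercase both sides first. The sample values
are made up.

```perl
use strict;
use warnings;

my $keywords  = "Camel Llama ALPACA";
my @ckeywords = ("what?", "llama");

my $marker   = 0;
my $haystack = lc $keywords;          # lowercase once, outside the loop

foreach my $search_term (@ckeywords) {
    # index treats "?" and other regex metacharacters as ordinary text.
    if (index($haystack, lc($search_term)) > -1) {
        $marker = 1;
        last;                         # quit on the first match
    }
}

print "marker = $marker\n";
```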


Godzilla!


------------------------------

Date: Tue, 20 Feb 2001 04:17:34 GMT
From: tjla@guvfybir.qlaqaf.bet (Gwyn Judd)
Subject: Re: perl irc channel
Message-Id: <slrn993rus.s3i.tjla@thislove.dyndns.org>

I was shocked! How could AvA <a.v.a@home.nl>
say such a terrible thing:
>lol randal, somehow i doubt that anyone would fire you, but okay if you change
>ur mind
>one day then then we would be honoured

He is being disingenuous I think. He can't fire himself. That said, I
should point out that long-standing usenet custom says that any replies
should come after (as I do) the text you are replying to, not before (as
you do).

-- 
Gwyn Judd (print `echo 'tjla@guvfybir.qlaqaf.bet' | rot13`)
If you had your life to live over again--you'd need more money.
-Construction Digest (contributed by Chris Johnston)


------------------------------

Date: Mon, 19 Feb 2001 23:08:32 GMT
From: mgjv@tradingpost.com.au (Martien Verbruggen)
Subject: Re: PROPOSAL: Graphics::ColorNames
Message-Id: <slrn9939rg.2th.mgjv@verbruggen.comdyn.com.au>

On Mon, 19 Feb 2001 16:50:20 -0500,
	Robert Rothenburg <wlkngowl@unix.asb.com> wrote:
> Abigail wrote:
>> 
>> Don't forget that you are programming in *Perl*. If such
>> microoptimalization is important to you, forget Perl exists. Use C.
> 
> Aye! A few hundred color names for X-windows multiplied by 3 elements in
> an array? That's a lot.

$ cat foo.pl
#!/usr/local/bin/perl -w
use strict;

my ($vsize_start, $rss_start) = split ' ', 
    `ps -ovsize,rss -p $$ | grep -v VSZ`;

my %cnames;
open XC, "showrgb |" or die $!;
while (<XC>)
{
    chomp;
    my ($r, $g, $b, $name) = split ' ', $_, 4;

    $cnames{$name} = @ARGV ? 
        sprintf "%02x%02x%02x", $r, $g, $b:
        [$r, $g, $b];
}
close XC or die $!;

my ($vsize_end, $rss_end) = split ' ', 
    `ps -ovsize,rss -p $$ | grep -v VSZ`;

printf "VSIZE: $vsize_end - $vsize_start = %d\n", $vsize_end - $vsize_start;
printf "RSS  : $rss_end - $rss_start = %d\n", $rss_end - $rss_start;

$ ./foo.pl
VSIZE: 2988 - 2784 = 204
RSS  : 1516 - 1264 = 252
$ ./foo.pl foo
VSIZE: 2868 - 2784 = 84
RSS  : 1408 - 1264 = 144

Looks like the difference between using array refs and a hex string
isn't that large. It's largish when compared to each other, but small
when compared to the total amount of memory allocation that needs to
be done. Apart from that, I haven't benchmarked the difference in
speed, but splitting and reassembling hex keys can be reasonably
expensive as well. pack() and unpack() are reasonably cheap, but all
the splitting, sprintf-ing and temporary arrays to hold the results
might actually impose a higher load than you want.

I'd say that the difference in the two methods would be trivial in any
case, unless you start putting access to this stuff in very tight
loops. But I can't really imagine an application doing that.

Of course, no one says that you need to load all colours on startup.
You could store them in several files, and load them on demand. One
file with all the most common ones. Several files with the less common
ones, broken up by name or so. Some logic could load them when needed.
However, I probably wouldn't even bother :)

Martien
-- 
Martien Verbruggen              | 
Interactive Media Division      | Can't say that it is, 'cause it
Commercial Dynamics Pty. Ltd.   | ain't.
NSW, Australia                  | 


------------------------------

Date: 19 Feb 2001 23:44:39 GMT
From: abigail@foad.org (Abigail)
Subject: Re: PROPOSAL: Graphics::ColorNames
Message-Id: <slrn993bv7.594.abigail@tsathoggua.rlyeh.net>

Martien Verbruggen (mgjv@tradingpost.com.au) wrote on MMDCCXXIX September
MCMXCIII in <URL:news:slrn9939rg.2th.mgjv@verbruggen.comdyn.com.au>:
$$ 
$$ Of course, no one says that you need to load all colours on startup.
$$ You could store them in several files, and load them on demand. One
$$ file with all the most common ones. Several files with the less common
$$ ones, broken up by name or so. Some logic could load them when needed.
$$ However, I probably wouldn't even bother :)


A dbm file comes to mind...


Abigail
-- 
map{${+chr}=chr}map{$_=>$_^ord$"}$=+$]..3*$=/2;        
print "$J$u$s$t $a$n$o$t$h$e$r $P$e$r$l $H$a$c$k$e$r\n";


------------------------------

Date: Tue, 20 Feb 2001 00:51:47 +0000
From: Sam Kington <sam@illuminated.co.uk>
Subject: Re: PROPOSAL: Graphics::ColorNames
Message-Id: <3A91BFA2.47AF8B2E@illuminated.co.uk>

Abigail wrote:
> 
> Martien Verbruggen (mgjv@tradingpost.com.au) wrote on MMDCCXXIX September
> MCMXCIII in <URL:news:slrn9939rg.2th.mgjv@verbruggen.comdyn.com.au>:
> $$
> $$ Of course, no one says that you need to load all colours on startup.
> $$ You could store them in several files, and load them on demand. One
> $$ file with all the most common ones. Several files with the less common
> $$ ones, broken up by name or so. Some logic could load them when needed.
> $$ However, I probably wouldn't even bother :)
> 
> A dbm file comes to mind...

Having torn my hair out regarding all the different types of
implementation of dbm files on different systems (just Unices; let's not
even talk about Windows systems; they're fairly harmless, inasmuch as
all the weird behaviours have already been claimed by random Unix
variants, and Windows is reduced to emulating one of the cruftiest) --
I'd say that dbm files are one of those things that seem like a good
idea until you've had to deal with them, on a cross-platform basis.

Bear in mind that you have to construct the dbm file at the Makefile.PL
level, unless you know something that I don't (in which case, *please*
tell me.)

Yes, dbm files are an obvious Good Idea. But good luck on your
supporting them. (Your ExtUtils::MakeMaker rules, in particular, are
going to be pretty ugly - unless you know something I don't. Mail me for
more details.)

> map{${+chr}=chr}map{$_=>$_^ord$"}$=+$]..3*$=/2;
> print "$J$u$s$t $a$n$o$t$h$e$r $P$e$r$l $H$a$c$k$e$r\n";

Damn, Abigail, I thought I'd escaped your evil sigs by unsubscribing from
the scary devil monastery. Damn, damn, damn. Have you ever considered
collating them into book-form, or at least some handy web page?

Sam
-- 
Home page: http://www.illuminated.co.uk/
INWO Homebrew: http://www.illuminated.co.uk/inwo/
I spilled spot remover on my dog. He's gone now.


------------------------------

Date: Tue, 20 Feb 2001 02:20:25 GMT
From: "Gabriel Richards" <grichards@endertechnology.com>
Subject: Reg Ex Help Pls
Message-Id: <Jvkk6.44913$_D.6675061@typhoon.we.rr.com>

Hi. I need more reg ex help please.

The string is a directory structure of arbitrary length like:

/<anything>/<anything>/anything/anything/

I want to extract the last "anything". So I tried:

$temp =~ /\/(.+?)\/$/;
$name = $1;

but that grabs everything except the opening and closing "/". I thought the
"$" told it to match at the end of the string (i.e. furthest to the right?).
I don't think I'm understanding its use correctly.

Gabe




------------------------------

Date: Tue, 20 Feb 2001 03:01:06 GMT
From: Bob Walton <bwalton@rochester.rr.com>
Subject: Re: Reg Ex Help Pls
Message-Id: <3A91DE6E.D6D393F0@rochester.rr.com>

Gabriel Richards wrote:
 ...
> The string is a directory structure of arbitrary length like:
> 
> /<anything>/<anything>/anything/anything/
> 
> I want to extract the last "anything". So I tried:
> 
> $temp =~ /\/(.+?)\/$/;
> $name = $1;
> 
> but that grabs everything except the opening and closing "/". I thought the
> "$" told it to match at the end of the string (i.e. furthest to the right?).
> I don't think I'm understanding its use correctly.
> 
> Gabe

Well, you need to put in a proper definition of "anything".  I assume
that means any string that does not contain a /, but is at least one
character long.  Is that right?  If so, try:

   $temp =~ m#/([^/]+)/$#;

Note that this will not work if the last / is followed by anything.  If
you want that to work, then maybe:

   $temp =~ m#/([^/]+)/(?:[^/]+)?$#;
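
The three patterns can be compared on a sample path. This is only a
sketch; the path is made up for illustration.

```perl
use strict;
use warnings;

my $temp = "/usr/local/lib/perl5/";

# Original pattern: the match starts at the first "/", and "." can match
# "/", so the capture runs all the way to the final slash.
my ($too_much) = $temp =~ m#/(.+?)/$#;            # "usr/local/lib/perl5"

# Excluding "/" from the capture keeps only the last component.
my ($name) = $temp =~ m#/([^/]+)/$#;              # "perl5"

# The second variant tolerates text after the last "/": it still captures
# the last component that is followed by a slash.
my ($parent) = "/usr/local/lib/perl5" =~ m#/([^/]+)/(?:[^/]+)?$#;   # "lib"

print "$too_much\n$name\n$parent\n";
```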
-- 
Bob Walton


------------------------------

Date: 19 Feb 2001 22:27:24 -0500
From: Joe Schaefer <joe+usenet@sunstarsys.com>
Subject: Re: Reg Ex Help Pls
Message-Id: <m3y9v2c9ur.fsf@mumonkan.sunstarsys.com>

"Gabriel Richards" <grichards@endertechnology.com> writes:

> Hi. I need more reg ex help please.
> 
> The string is a directory structure of arbitrary length like:
> 
> /<anything>/<anything>/anything/anything/
> 
> I want to extract the last "anything". 

Are you sure you always have a trailing "/"?

> So I tried:
> 
> $temp =~ /\/(.+?)\/$/;
> $name = $1;
> 
> but that grabs everything except the opening and closing "/". I thought the
> "$" told it to match at the end of the string (i.e. furthest to the right?).

Not quite.  The regexp will try to match as soon as possible, and *then*
as little as possible (since you modified the greediness). "Soon" here
means the very first slash, and your trailing $ means that no shorter 
match than the whole string will work. Hence the "?" has no meaningful 
effect in this case.  You need to replace .+ with something that won't
match a "/".  TMTOWTDI, though.

If you definitively know that there's a trailing "/", you could use a
rindex + substr approach.  However, if there might not be a final "/", 
then split() or a regexp match (or substitution) is certainly better:

    chomp $temp;  # drop potential trailing newline
    my $name = (split '/', $temp)[-1];
or
    my ($name) = $temp =~ m{ / ([^/]+?) /? \s* $ }x; # x ignores spaces
                                ^^^^^^
If the string is tainted user input, you would be better off matching 
on a positive expression like \w+? (that excludes "/") rather than a 
negative one like [^/]+?. For example, the regexp above will match 
strings with newlines and control characters in them. This way $name
is safely untainted; note the other approaches won't do this. (The "+?" 
in this pattern allows the \s* to match as early as possible, so trailing
newlines, tabs, and spaces are removed automatically).

See perlop and perlre for regexp info, and the documentation in perlfunc 
for details on rindex, substr, and split.
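
A short sketch of the split approach on made-up paths, with and without
the trailing "/". It relies on split dropping trailing empty fields, so
the final slash does not produce an empty last element.

```perl
use strict;
use warnings;

my @names;
for my $temp ("/usr/local/lib/perl5/", "/usr/local/lib/perl5") {
    # split discards trailing empty fields, so both forms end in "perl5".
    my $name = (split '/', $temp)[-1];
    push @names, $name;
    print "$temp -> $name\n";
}
```

A path consisting only of "/" yields no fields at all, so $name would be
undefined in that case and may need a separate check.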

HTH

Joe Schaefer
-- 
%ENV=(); $A="\rr jpeurls ht\ba \rcankotehe"x666;END{ system
"$^X -wT $0 $^S";print"r\n"}sub foo{$_=pop||exit;/$_/;print
eval 'BEGIN{$^H='. ($^H+=666) .'}$_[-(()=$A=~//g)+$[]';}@_=
reverse$A=~/./g;&foo while$ARGV[0]=~//g;#evil mess for *nix


------------------------------

Date: Tue, 20 Feb 2001 03:11:52 GMT
From: "John W. Krahn" <krahnj@acm.org>
Subject: Re: Reg Ex Help Pls
Message-Id: <3A91E1A8.CAF14CEE@acm.org>

Gabriel Richards wrote:
> 
> Hi. I need more reg ex help please.
> 
> The string is a directory structure of arbitrary length like:
> 
> /<anything>/<anything>/anything/anything/
> 
> I want to extract the last "anything". So I tried:
> 
> $temp =~ /\/(.+?)\/$/;
> $name = $1;
> 
> but that grabs everything except the opening and closing "/". I thought the
> "$" told it to match at the end of the string (i.e. furthest to the right?).
> I don't think I'm understanding its use correctly.

The problem is that '.' also matches '/' try:
$temp =~ m-/([^/]+?)/$-;


John


------------------------------

Date: Mon, 19 Feb 2001 19:56:59 -0800
From: "Godzilla!" <godzilla@stomp.stomp.tokyo>
Subject: Re: Reg Ex Help Pls
Message-Id: <3A91EB0B.CDD2BABF@stomp.stomp.tokyo>

Gabriel Richards wrote:

(snippage)

> The string is a directory structure of arbitrary length like:
 
> /<anything>/<anything>/anything/anything/
 
> I want to extract the last "anything". So I tried:
 
> $temp =~ /\/(.+?)\/$/;
> $name = $1;
 
> but that grabs everything except the opening and closing "/". 
> I thought the "$" told it to match at the end of the string 
> (i.e. furthest to the right?). I don't think I'm understanding
> its use correctly.


Use of substring is almost always significantly faster
and vastly more efficient than a regex for circumstances
such as yours.

Although my first test script below my signature uses twice as 
much coding for a substring than it does for a regex, my substring
method runs more than twice as fast and uses less memory to even
a greater degree. My substring could be shortened into two lines
of code but this would obfuscate my code and not change performance.

Do yourself a favor and use substring whenever possible. Use
of a regex is most often a very poor choice in programming
save for circumstances where you have no choice or, use
of a regex is your best option.

Below my signature, in my first test script, you will
find a somewhat easier regex to use, should you decide
to do this the most popular and least efficient way.
My regex can be swapped right into your match method
with minor syntax changes, using a " m " operator.
Research 'match' and ' m operator ' for details.
A direct swap can be made by replacing my " ¡ "
with " / " and escaping " \/ " your right slashes,
as you show in your original code.

My second test script reflects benchmark timings which
will certainly surprise both you and others here who
adamantly worship regex methods, as good Cargo Cultists
are prone, quite in the literal sense, to do.

You will also note my benchmark test script gives
this regex the greatest advantage by not using your
match and equal method. Even with an advantage, this
regex performs most miserably compared to a substring
method. Your match and equal method would perform even
worse, although only moderately worse.

Godzilla!
--

TEST SCRIPT ONE:
________________

#!perl

print "Content-type: text/plain\n\n";

$string = "/<anything>/<anything>/anything/GRAB THIS/";

$start = rindex ($string, "/", length ($string) - 2) + 1;

$stop = length ($string) - $start - 1;

$new = substr ($string, $start, $stop);

print "Fast Substring Method:\n  $new\n\n";


$string =~ s¡.*/(.*)/¡$1¡;

print "Slow Regex Method:\n  $string";

exit;

PRINTED RESULTS:
________________

Fast Substring Method:
  GRAB THIS

Slow Regex Method:
  GRAB THIS


######################


TEST SCRIPT TWO:
________________

#!perl

print "Content-type: text/plain\n\n";

use Benchmark;

timethese (1000000,
 {

 'name1' => '$string = "/<anything>/<anything>/anything/GRAB THIS/";
             $start = rindex ($string, "/", length ($string) - 2) + 1;
             $stop = length ($string) - $start - 1;
             $new = substr ($string, $start, $stop);',

 'name2' => '$string = "/<anything>/<anything>/anything/GRAB THIS/";
             $string =~ s¡.*/(.*)/¡$1¡;',

 });

exit;


PRINTED RESULTS:
________________


Benchmark: timing 1000000 iterations of name1, name2...
name1:  6 wallclock secs ( 6.15 usr +  0.00 sys =  6.15 CPU) @ 162601.63/s
(n=1000000)
name2: 17 wallclock secs (17.20 usr +  0.00 sys = 17.20 CPU) @ 58139.53/s
(n=1000000)
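
A variant of the same comparison, sketched with code references instead
of strings so that both candidates are compiled once and checked against
each other before timing. The sub names last_by_substr and last_by_regex
are made up for this illustration, and the regex here is a capture with
a negated class rather than the substitution used above. Timings depend
on the Perl version and the machine, so no numbers are claimed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $path = "/<anything>/<anything>/anything/GRAB THIS/";

# Substring approach: find the slash before the final component,
# then copy everything between it and the trailing slash.
sub last_by_substr {
    my ($s) = @_;
    my $start = rindex($s, "/", length($s) - 2) + 1;
    return substr($s, $start, length($s) - $start - 1);
}

# Regex approach (illustrative): capture the run of non-slash
# characters that sits just before the trailing slash.
sub last_by_regex {
    my ($s) = @_;
    return $s =~ m{([^/]+)/$} ? $1 : undef;
}

# Both must agree before a timing comparison means anything.
die "methods disagree\n"
    unless last_by_substr($path) eq last_by_regex($path);

# Run each candidate for at least one CPU second and print a table.
cmpthese(-1, {
    'substr' => sub { last_by_substr($path) },
    'regex'  => sub { last_by_regex($path) },
});
```

Checking that the two methods return the same string first guards
against timing something that silently does a different job.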


------------------------------

Date: 20 Feb 2001 04:40:13 GMT
From: ianb@ot.com.au
Subject: Re: Reg Ex Help Pls
Message-Id: <96ssfd$r7a$1@news.netmar.com>

In article <Jvkk6.44913$_D.6675061@typhoon.we.rr.com>, Gabriel Richards
<grichards@endertechnology.com> writes:
>Hi. I need more reg ex help please.
>
>The string is a directory structure of arbitrary length like:
>
>/<anything>/<anything>/anything/anything/
>
>I want to extract the last "anything". So I tried:
>
>$temp =~ /\/(.+?)\/$/;
>$name = $1;
>
>but that grabs everything except the opening and closing "/". I thought the
>"$" told it to match at the end of the string (i.e. furthest to the right?).
>I don't think I'm understanding its use correctly.

Well, the $ is anchoring to the end of the string, just like you wanted. The
problem lies elsewhere.

First, let me say that you should probably be using the File::Basename
module, but so you can learn about regexes, read on:

What's happened is that you've fallen into the non-greediness trap,
using ".+?" to try to match the shortest thing. What you really want
is [^/]+, which can't match a slash. Read up on non-greediness. It doesn't
guarantee the shortest possible match in the string; it only gives the
shortest match from the current starting position, and the regex engine
tries starting positions from left to right.
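
The difference can be seen side by side in a small sketch. The variable
names are illustrative only, and the patterns use m{} delimiters so the
slashes need no escaping:

```perl
use strict;
use warnings;

my $path = "/hello/there/fred/";

# Non-greedy: the match starts at the FIRST slash, then .+? grows
# until /$ can match, so it swallows the intermediate slashes too.
my ($lazy) = $path =~ m{/(.+?)/$};

# Negated class: [^/]+ cannot cross a slash, so only the last
# component can sit between a slash and the final /$.
my ($last) = $path =~ m{/([^/]+)/$};

print "non-greedy: $lazy\n";   # prints: non-greedy: hello/there/fred
print "negated:    $last\n";   # prints: negated:    fred
```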

You should also avoid using // as your match delimiters if the pattern
includes "/" (but don't forget the "m").

The trailing slash is also optional in a path.

Try:
> perl -e '$path="/hello/there/fred/"; $path =~ m%/([^/]+)/?$%; print "MATCHED: $1\n";'
MATCHED: fred
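
For comparison, a minimal sketch of the File::Basename route mentioned
above, assuming a reasonably recent Perl on a Unix-like system, where
basename() strips a trailing slash before taking the last component
(older releases of the module may behave differently):

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Try the path both with and without the optional trailing slash.
for my $path ("/hello/there/fred/", "/hello/there/fred") {
    print "$path -> ", basename($path), "\n";
}
```

The module's documentation recommends fileparse() when exact control
over how the path is split is needed.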

Regards,


Ian



 -----  Posted via NewsOne.Net: Free (anonymous) Usenet News via the Web  -----
  http://newsone.net/ -- Free reading and anonymous posting to 60,000+ groups
   NewsOne.Net prohibits users from posting spam.  If this or other posts
made through NewsOne.Net violate posting guidelines, email abuse@newsone.net


------------------------------

Date: Mon, 19 Feb 2001 21:18:49 -0500
From: Murali K Inaganti <inaganami@email.uc.edu>
Subject: sending cookies with LWP user agent
Message-Id: <3A91D409.61290686@email.uc.edu>

Hi all,

I am writing a simple automatic mail-check program, which gets mail from
a Yahoo ID. I was using LWP::UserAgent to send the login, password, etc.
It returns a 302 Found (redirection) and never logs me in.

But if I enter the generated URL in Netscape, it logs me in.

Can anyone tell me what's wrong, or give me some pointers to
such programs?

thanking you,

Murali






------------------------------

Date: 19 Feb 2001 23:35:10 GMT
From: ianb@ot.com.au
Subject: Re: Specifying the length of regular expression
Message-Id: <96saje$sne$1@news.netmar.com>

In article <t92nbv22nllh2e@corp.supernews.com>, Greg Bacon
<gbacon@HiWAAY.net> writes:
>In article <3A9107AA.4294F72F@ot.com.au>,
>    Ian Boreham  <ianb@ot.com.au> wrote:

OK, let's try this again.

>: Greg Bacon wrote:
>: 
>: > : Yes you do; the original spec was length greater than ten.
>: 
>: Sigh. You didn't try it, did you? The lookahead expression is not anchored
>: at the end, so it is sufficient for the lookahead to check that it matches
>: the minimum number. The body of the expression will actually match the
>: full string.

This interchange was with regard to the necessity of the comma in "{10,}",
not whether it should be 10 or 11. I put a comma in the original, then
realised it was unnecessary and posted to that effect, after confirming
with my original tests that it wasn't. You claimed the comma was necessary,
presumably by inspection. The comma is not needed. Try it. The reason why
it is not needed is quoted above.
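
A small sketch of the point, using hypothetical stand-in patterns (the
actual regex from earlier in the thread is not reproduced in this
digest). The body, \w+ anchored with ^ and $, consumes the whole string;
the lookahead only has to prove that enough characters are present, so
{11} and {11,} accept exactly the same strings:

```perl
use strict;
use warnings;

# Hypothetical patterns: "more than ten word characters".
my $without_comma = qr/^(?=.{11})\w+$/;
my $with_comma    = qr/^(?=.{11,})\w+$/;

for my $len (9 .. 13) {
    my $s  = "a" x $len;
    my $r1 = $s =~ $without_comma ? "match" : "no match";
    my $r2 = $s =~ $with_comma    ? "match" : "no match";
    printf "length %2d: without comma: %-8s  with comma: %s\n",
        $len, $r1, $r2;
}
```

Strings of length 9 and 10 fail both patterns, and 11 upwards pass both.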


>Umm...

-- snip irrelevant example which really refers to below comment, not above --

>: > ...and the 10 should be 11. :-)
>: 
>: Yes, you are right. I obviously didn't read the post carefully enough.
>
>Please make up your mind! :-)

I have. Obviously I'm not the only one who hasn't been reading carefully.


>When doesn't one have a choice?

I provided a justification in my original post, but I'll try
again. Sometimes you aren't writing the code, and you can only write
the regex. For example, you are using a third-party module which
requires you to specify a regex (or a file containing hundreds of regexes)
for validation purposes, or for matching "interesting" log messages. You
can't modify the module, so you just write an appropriate regex. These
cases are rare, but they do happen.

The other variation on this theme is where you have written the code
yourself, but you need to use many regexes (e.g. in a loop). On the
odd occasion, you want to do something fancier, but you can't
just change the logic since the extra code applies to only one case, and
you don't want the overhead, bug potential or security concerns of
allowing arbitrary expressions to be evaluated each time.


>I agree with you that these are all interesting questions from an
>academic standpoint, but clarity is king in real code.

Academic? Real code? I'm not even going to bite.

Trying to improve the clarity of code is something I deal with every
day. I also think the issue is interesting from an academic point of
view, but I was posting from a pragmatic point of view. Sometimes you
don't have much choice, and you have to get your hands dirty to solve
a problem.


>Please don't construct the "you want everyone to dumb down their code" straw
>man because it's not my point at all.

I hope you're just asking me not to do that in response, because I have not
done that in my posts, and I don't agree with such tactics or the argument
either.

Regards,


Ian




------------------------------

Date: Tue, 20 Feb 2001 04:22:02 GMT
From: "Frank Miller" <no@email.com>
Subject: Re: striping HTML
Message-Id: <Khmk6.445892$U46.13252262@news1.sttls1.wa.home.com>

"Scott R. Godin" <dontspamthewebmaster@webdragon.net> wrote in message
news:96rd4j$2ph$4@216.155.32.167...
>
> Gah, I definitely have not had enough coffee.. I meant to say "not you,
> Todd" as Frank was the original offender, but this point is now moot, as
> I see from another post that Frank has seen the light about sticking with
> the group's standards for certain things. ;) (Thanks, Frank :)

I may be slow, but I do learn :-)

FrankM






------------------------------

Date: Tue, 20 Feb 2001 04:17:35 GMT
From: Chicheng Zhang <me@noshadow.net>
Subject: Re: This sending of mail works flawlesly... but...
Message-Id: <3A91F0AE.428FBEC2@noshadow.net>

Me wrote:

>    use NET::SMTP;
>    $smtp = Net::SMTP->new('serverip');
>    $smtp->mail($ENV{'USER'});
>    $smtp->to("$mail");
>    $smtp->cc("$cc\@mydom.com");
>    $smtp->data();
>
> etc etc....
>
> The cc doesn't work.
> Why?
>
> Thanks,
>
> Me

cc is neither a method supported by Net::SMTP nor a command in the SMTP
protocol (RFC 821).

--chicheng



------------------------------

Date: Mon, 19 Feb 2001 23:24:04 GMT
From: mgjv@tradingpost.com.au (Martien Verbruggen)
Subject: Re: use of =>
Message-Id: <slrn993aok.2th.mgjv@verbruggen.comdyn.com.au>

On Mon, 19 Feb 2001 17:34:41 GMT,
	Bart Lateur <bart.lateur@skynet.be> wrote:
> Hpz wrote:
> 
>>I am still learning perl and my books never gave me a good explanation of
>>what "=>" does. my understanding is that it has the same effect as a comma,
>>but makes it easier to look at what is going where. am i right?
> 
> Close enough. It also has the effect that if the thing on the left side
> of it is a bareword, it automatically gets quoted (= used as literal
> string). A bareword is the same pattern as what you'd normally use for
> variable and sub names: starting with a letter or underscore, and
> continuing with zero or more of letter, digit or underscore.

Maybe it's worth noting that in 5.6.0 a bug crept in here. Any of the
automagic barewords starting with v (see perldata) are not correctly
quoted by =>. This bug, I believe, has been reported, and should be
fixed in the next version.

$ perl -wle '%a = (v1234 => 42); print $a{v1234}'
Use of uninitialized value in print at -e line 1.

$
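
For the ordinary case, a short sketch of the quoting behaviour described
above. The hash names and keys are invented for the example, and the
keys deliberately do not look like v-strings:

```perl
use strict;
use warnings;

# A plain identifier to the left of => is quoted automatically ...
my %with_arrow = (colour => "red", size => 42);

# ... so this is the same list written with ordinary commas and quotes.
my %with_comma = ("colour", "red", "size", 42);

for my $key (sort keys %with_arrow) {
    print "$key: $with_arrow{$key} / $with_comma{$key}\n";
}
```

Both hashes end up with the same keys and values; the arrow only saves
the quotes and makes the pairing easier to see.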

Martien
-- 
Martien Verbruggen              | 
Interactive Media Division      | Hi, John here, what's the root
Commercial Dynamics Pty. Ltd.   | password?
NSW, Australia                  | 


------------------------------

Date: Mon, 19 Feb 2001 16:32:10 -0800
From: "Godzilla!" <godzilla@stomp.stomp.tokyo>
Subject: Re: use of =>
Message-Id: <3A91BB0A.18140C15@stomp.stomp.tokyo>

Hpz wrote:
 
> I am still learning perl and my books never gave me a 
> good explanation of what "=>" does. my understanding 
> is that it has the same effect as a comma, but makes 
> it easier to look at what is going where. am i right?

Others have explained technical details of this naming
convention and have explained inherent bugs related
to it.

I never use 'fat arrows' as they are highly illogical.
Daily common convention dictates a,

=>

means "to the right" and a, 

<=

means "to the left", obviously logical.

Neither of these symbols logically represent what
they do pertaining to Perl programming. They are
amongst many oxymoronic misnomers of new Perl.
These symbols and others of a similar notion
represent an effort towards making Perl as
obfuscated as possible; Egyptian Hieroglyphics
of equal difficulty in deciphering.

Logically, those fat arrows should represent
mathematical statements. A => would be read
as "equal to or greater than" and a <= clearly
would be "less than or equal to" in translation.
This use of mathematical symbology to represent
something other than math, is added insult to
injured logic.

Use of a comma and use of a semicolon, is highly
logical in contrast. A comma used in Perl means
precisely what it represents in Plain English,
"something new and highly related" as should be.
Use of a semicolon in Perl also makes good sense.
Common usage of a semicolon is to symbolically
denote, "Something new and moderately related."

However, this does not explain the illogic of
using a period to represent "join together"
in Perl. Contrasting this, use of "join" does
well represent just that; join. Personally,
as with other Egyptian Hieroglyphics of Perl,
I never use a period to indicate a joining.
Quotes work well as does our word, join.

New Perl is riddled with illogical symbols.
Clearly use of Perl Hieroglyphics or use
of Plain English Perl, is a personal choice.
My choice is to use Plain English for writing
Perl scripts; this is significantly more logical
than using incomprehensible illogical symbols.

Godzilla!


------------------------------

Date: 16 Sep 99 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 16 Sep 99)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

| NOTE: The mail to news gateway, and thus the ability to submit articles
| through this service to the newsgroup, has been removed. I do not have
| time to individually vet each article to make sure that someone isn't
| abusing the service, and I no longer have any desire to waste my time
| dealing with the campus admins when some fool complains to them about an
| article that has come through the gateway instead of complaining
| to the source.

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 318
**************************************

