[15612] in Perl-Users-Digest


Perl-Users Digest, Issue: 3025 Volume: 9

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri May 12 06:11:24 2000

Date: Fri, 12 May 2000 03:10:13 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <958126213-v9-i3025@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Fri, 12 May 2000     Volume: 9 Number: 3025

Today's topics:
        Problem: hash introducing extra space on output? (Bill)
    Re: Regex Question (hopefully, an educated guess) (Tad McClellan)
    Re: Regular expression ? <godzilla@stomp.stomp.tokyo>
        regular expression needed <ii4533@fh-wedel.de>
    Re: Subroutine error stops module ?! <lr@hpl.hp.com>
    Re: Subroutine error stops module ?! <hans-jan@stack.nl>
    Re: Subroutine error stops module ?! poppln@my-deja.com
    Re: unpack c struct <blah@nospam.com>
        Windows or Unix, Perl or C hacktic@my-deja.com
    Re: Windows or Unix, Perl or C (DEDSRD)
    Re: Windows or Unix, Perl or C <damon_jebb@nai.com>
        Digest Administrivia (Last modified: 16 Sep 99) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Fri, 12 May 2000 09:03:21 GMT
From: wfeidt@cpcug.org (Bill)
Subject: Problem: hash introducing extra space on output?
Message-Id: <8F323BDC3wfeidthiscom@207.126.101.97>

I'm having a problem with the code segment below. The full program 
is available at: 

  http://www.agnic.org/temp/umdext/dcprj/c/test1.txt

The excerpted segment begins on line 100.

Excerpted code segment:


%filename =  ("employer.html",    "@employer",
              "position.html",    "@position",
              "animals.html",     "@animals",
              "business.html",    "@business",
              "environment.html", "@environment",
              "extension.html",   "@extension",
              "food.html",        "@food",
              "genag.html",       "@genag",
              "landscape.html",   "@landscape",
              "other.html",       "@other",
              "plants.html",      "@plants",
              "new.html",         "@new");


foreach (keys %filename)  {
     open OUTFIL, ">$_" or die "Cannot open $_ for write: $!";
     print $_;
     chmod 0644, $_;


###omitted 12 if/elsif lines for brevity###


     &HEAD;
     print OUTFIL "$filename{$_}";
     &FOOT;
     close OUTFIL;
}


The problem is that the output from the 'print OUTFIL "$filename{$_}";'
statement looks like this:


<li><a href="j20000412f05.html">...
 <li><a href="j20000412f03.html">...
 <li><a href="j20000414f01.html">...
 <li><a href="j20000501f01.html">...


rather than the desired:


<li><a href="j20000412f05.html">...
<li><a href="j20000412f03.html">...
<li><a href="j20000414f01.html">...
<li><a href="j20000501f01.html">...


I've checked the source arrays (e.g. @employer, @position, etc.)  and there
is no "extra space" present in them.  The -w flag is set and I am
using both "strict" and "diagnostics". This is my first attempt to work
with a hash, so any leads will be quite appreciated.


Thanks,

--

Bill
wfeidt@cpcug.org
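
A minimal sketch of one likely cause, assuming the arrays hold one
newline-terminated line per element (the @employer contents below are
made up for illustration). Putting "@employer" in double quotes joins
the elements with $" (a single space by default), which would explain a
leading space on every line but the first. Storing a reference instead
keeps the elements apart until print time:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for one of the source arrays (e.g. @employer).
my @employer = (
    qq{<li><a href="j20000412f05.html">...\n},
    qq{<li><a href="j20000412f03.html">...\n},
);

# "@employer" joins the elements with $" (a single space by default),
# so every element after the first gains a leading space.
my $interpolated = "@employer";

# Storing a reference keeps the elements separate until print time.
my %filename = ( "employer.html" => \@employer );
my $joined = join '', @{ $filename{"employer.html"} };

print $interpolated;   # second line starts with a space
print $joined;         # no extra space
```

Setting $" to the empty string with local before building the hash
would be another way to get the same output, at the cost of being less
obvious to the next reader.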


------------------------------

Date: Fri, 12 May 2000 00:49:14 -0400
From: tadmc@metronet.com (Tad McClellan)
Subject: Re: Regex Question (hopefully, an educated guess)
Message-Id: <slrn8hn3aa.mnq.tadmc@magna.metronet.com>

On 11 May 2000 23:14:21 GMT, mcnuttj@missouri.edu <mcnuttj@missouri.edu> wrote:

>@TEMP = snmpwalk($community, $ip, $mib);  # snmpwalk is in a sub I wrote
>SYSTEM: foreach ( @TEMP ) {


You know that you don't need @TEMP at all, don't you?

   SYSTEM: foreach ( snmpwalk($community, $ip, $mib) ) {


>	if ( /^sysDescr.+ : (.*)$/ ) {


>So here's the question:  Is that regular expression a good one?  


If it works it is "a good one".


>More
>specifically:
>
>1)  How do the ^ and $ requirements affect the speed?  Faster or slower?


Faster (generally). They reduce the number of alternatives 
for the regex engine to consider.

And since perl matches patterns left-to-right, anchoring to
the beginning of the string is (nearly?) always a big gain.


>2)  How do the specific strings affect the speed?  If the code around it
>is done properly, I could, for example, use / : (.*)$/ and it would work
>just fine, but is that slower than using the far-more-specific regex I
>have above?


I dunno.

Why don't you benchmark it and find out for yourself with
your real data (we can't do it for you 'cause we don't
have your data)?

   perldoc Benchmark
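
A rough sketch of such a benchmark, using the Benchmark module's
cmpthese. The sample lines are invented (real snmpwalk output should be
substituted), and the two regexes are the ones from the question:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up sample lines; real snmpwalk output should be used instead.
my @lines = (
    'sysDescr.0 : Hypothetical Router OS 1.0',
    'sysUpTime.0 : 12345',
    'sysName.0 : example-host',
) x 100;

# Each sub counts its matches so that it returns something checkable.
my %candidates = (
    anchored => sub {
        my $n = 0;
        for (@lines) { $n++ if /^sysDescr.+ : (.*)$/ }
        return $n;
    },
    loose => sub {
        my $n = 0;
        for (@lines) { $n++ if / : (.*)$/ }
        return $n;
    },
);

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, \%candidates);
```

Note that the loose pattern matches every line here while the anchored
one matches only the sysDescr lines, so the comparison is only fair if
the surrounding code filters the lines as the question describes.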


>3)  It won't work in all cases, 


Show us some of those.

Or even one of those.


>but someone mentioned before that 'split'
>might work better in a case like this.  I could split on /\s+:\s+/, since
>the colon will never appear on the "left" side of the string (it might
>appear on the "right", 


Have you considered the implications of using the 3rd argument
to split?

If you set it to 2, you get everything to the left of the first
colon as a list element, and all of the rest of the string in
the other list element:

   my($left, $right) = split /\s+:\s+/, $_, 2;
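
A small illustration of that limit, with an invented input line that
has a second colon on the right-hand side:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Invented sample line; the right-hand side contains another " : ".
my $line = 'sysDescr.0 : Hypothetical OS : build 42';

# The limit of 2 stops split after the first separator it finds.
my ($left, $right) = split /\s+:\s+/, $line, 2;

print "left:  $left\n";    # left:  sysDescr.0
print "right: $right\n";   # right: Hypothetical OS : build 42
```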


>but by then, the pattern is already matched).  As
>long as I don't use s///g, it'll only match once, and I'll be all right,
>right?


You never want to use s/// or s///g or even m//g with split().

It will not match only once. split() keeps applying it over
and over to generate its return list.


>4)  How much do (\w+), (.+), (\s+), (\d+), etc. slow things down?  


Ummm, compared to what?


>I avoid
>the use of things like (.*) (the code shown above is an exception), but
>what about the + quantifier and the parentheses?


+ is likely faster than *

 .* can match zero times, so if the engine doesn't match any
chars at that position, it has to keep trying to match.

But if it can't match any characters at the position corresponding 
to .+, it can stop right there and return false.

parens slow things down due to copying the matched chars to
memory, so don't use "memory parens":

   
     if ( /^sysDescr.+ : (?:.*)$/ ) {
                         ^^^  ^
                         ^^^  ^ does grouping without memory

Of course that isn't going to help if you later need the
matched chars...
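
To make the trade-off concrete, here is a short sketch with an invented
input line. The capturing form hands back the text; the non-capturing
form only says whether the line matched:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Invented sample line for illustration.
my $line = 'sysDescr.0 : Hypothetical OS';

# Capturing parens: the matched text is returned in list context.
my ($captured) = $line =~ /^sysDescr.+ : (.*)$/;

# Non-capturing parens: grouping only, nothing is saved, so all we
# learn is whether the pattern matched.
my $matched = ( $line =~ /^sysDescr.+ : (?:.*)$/ ) ? 1 : 0;

print "captured: $captured\n";   # captured: Hypothetical OS
print "matched:  $matched\n";    # matched:  1
```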


>Speed is the objective here.  


So you have already profiled your code then?

And you know for a fact that it is this part of your code
that is slow?


>Ideas and comments welcome.  


See above.


>Flames can be
>sent to /dev/null  


Oh no!

I gotta start reading the complete article before I decide
to answer.

If I had seen that earlier I would have killfiled you and moved
on. But since I have it typed in already, I'll only do one
of those things.

So long.



[ People say that when they know they are doing something
  wrong. I find apologizing for what you're about to do,
  and then doing it anyway to be most offensive.

  It seems strange to me that you included it, because you
  seemed to have followed netiquette as far as I can tell...
]

>Consider the various
>'perlfaq' man pages thumbed through, but not *combed* through.  :-)

That's all that netiquette requires, so you had nothing to fear.


-- 
    Tad McClellan                          SGML Consulting
    tadmc@metronet.com                     Perl programming
    Fort Worth, Texas


------------------------------

Date: Thu, 11 May 2000 22:59:34 -0700
From: "Godzilla!" <godzilla@stomp.stomp.tokyo>
Subject: Re: Regular expression ?
Message-Id: <391B9DC6.E7026811@stomp.stomp.tokyo>

tvn007@my-deja.com wrote:

> Could someone please help me with the following pattern matching ?
 
> Here is the data:
 
> 305   : 0_0  2.01  A
> 306   : 0_0  1.80  AR
> 305   : 0_0  2.50  AR
 
> I would like to extract number 2.01, 1.80 and 2.50
 
> Here is what I have, and it does not work...

(snipped, context retained)


I have taken note that your company, Quantum Effect Devices,
actively recruits at various University of California
campuses, although yesterday was your last scheduled
on campus event. Might try my alma mater, UC Riverside.
We have an excellent computer science department.


A notion I consistently promote is to organize
your data base well for ease in access and ease
in manipulation. If this were my data base,
I would organize it similar to this:

305¦0_0¦2.01¦A

Leaving out spaces and using a non-volatile
delimiter would reduce this data extraction
to a very simple split operation, then pulling
your third in position variable, 2.01 for this
example line above.
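
A minimal sketch of that split, assuming the data has already been
rewritten with a single-character delimiter. A plain '|' stands in for
the delimiter here, which is an assumption made for the example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical record in the delimited layout; '|' is a stand-in delimiter.
my $record = '305|0_0|2.01|A';

# The pipe is a regex metacharacter, so it is escaped in the pattern.
my @fields = split /\|/, $record;

# Fields are numbered from zero, so the third one is index 2.
my $value = $fields[2];

print "Value of interest: $value\n";   # Value of interest: 2.01
```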

However, there are circumstances when we cannot
effectively control a data base, such as your
data base is extracted from elsewhere, under
another's control or, perhaps is just a fantasy
example for whatever reasons.

Using your data base and adding one extra line
for a total of four, per your subsequent article,
I have written two codes combined into one. This
is a very simple basic code with an intent, as
always, to exemplify methodology and logic rather
than eloquence and shorter combined code, both
of which lead to incomprehensible code. This
code of mine is verbose and written in as
'Plain English' as possible for easier
reading and quicker understanding.

This code, in both examples, pre-conditions your
data base by reducing all white space to a single
space, for easier access using a very basic,
"can't be more simple" type matching operator.

Method One in my code reduces all white space
to a single space, then employs a very basic
catch all matching operator, without removing
a leading space in your data. Method Two is
very much the same but removes a leading
space in $value for a split. Method One
is a simple match and print method. My
Method Two is a more elaborate double
split to extract variables.

I have written this code to demonstrate
how to capture all variables, beyond the
single variable of interest to you. My
hopes are this very basic simple code
will provide enough examples with which
you can study, learn, modify and have a
bit of fun playing around. I've also
included a line counter in each example
if this is of interest to you. I have
left newline characters \n intact for
ease and better clarity in print.

For fun in playing around, I have designed
my code to exhibit what some would refer
to as a "bug". If you wish to play and
observe this bug, which shows in a print,
make these changes:


Under Method One:

  $test_line_1 =~ s/( )+/ /g;

Change this above line to:

  $test_line_1 =~ s/\s+/ /g;


Under Method Two:

  $test_line_2 =~ s/\s+/ /g;

Change this above line to:

  $test_line_2 =~ s/( )+/ /g;


You may also test these lines
by substituting them in as above:

  $test_line_1 =~ s/[ ]+/ /g;

  $test_line_2 =~ s/[ ]+/ /g;


There are other variations on this,
what I consider to be a bug, when
working with spaces in a match and
print type operation. Try some of
your own substitution codes and
discover what happens. Test each 
individually or change both, then 
observe changes in your print. A bug 
to some, others may argue if this is 
a bug or proper behavior.


In closing, this code's intent is to
display choices in methods, to display
a different way of approaching this and,
to provide you with more options in your
personal decision making regarding these
needs you have for a script. With this
code, those codes of others, you now
have a wide variety of choices.


Code is first, test.txt contents next
and printed results last. This is ready
to use on our internet after making
needed changes in your Perl locale and
any possible changes you need to make
in your path for test.txt as shown.
This is only a demonstration test script
and not intended for everyday usage.


Kiralynne Schilitubi


TEST CODE:
______________________


#!/usr/local/bin/perl

print "Content-Type: text/plain\n\n";

open (TEST, "test.txt") or die "Cannot open test.txt: $!";
@Test_Array_1 = <TEST>;
close (TEST);


print "\n\n Method One: single space, substitute: \n\n";

$counter_1 = 1;

foreach $test_line_1 (@Test_Array_1)
 {
  $test_line_1 =~ s/( )+/ /g;
  $test_line_1 =~ s/ (.*) : (.*) (.*) (.*)/$1$2$3$4/;
  print "
   Line $counter_1:
    Variable One is: $1
    Variable Two is: $2
    Variable Three is: $3
    Variable Four is: $4

    Your Variable Of Interest Is: $3 \n";

  $counter_1++;
 }


print "\n\n Method Two: single space, split: \n\n";

open (TEST, "test.txt") or die "Cannot open test.txt: $!";
@Test_Array_2 = <TEST>;
close (TEST);

$counter_2 = 1;

foreach $test_line_2 (@Test_Array_2)
 {
  $test_line_2 =~ s/\s+/ /g;
  local ($key, $value) = split (/:/, $test_line_2);
  $value =~ s/^ //;
  local ($var2, $var3, $var4) = split (/ /, $value);

  print "
   Line $counter_2:
    Variable One is: $key
    Variable Two is: $var2
    Variable Three is: $var3
    Variable Four is: $var4

    Your Variable Of Interest Is: $var3 \n";

  $counter_2++;
 }

print "\n\n 
  Schizoid And Very Effective!

  Godzilla!";

exit;



TEST TEXT CONTENTS:
______________________

         abc   : 0_0  2.02  A
         305   : 0_0  2.01  A
         306   : 0_0  1.80  AR
         305   : 0_0  2.50  AR


PRINTED RESULTS:
______________________


 Method One: single space, substitute: 


   Line 1:
    Variable One is: abc
    Variable Two is: 0_0
    Variable Three is: 2.02
    Variable Four is: A

    Your Variable Of Interest Is: 2.02 

   Line 2:
    Variable One is: 305
    Variable Two is: 0_0
    Variable Three is: 2.01
    Variable Four is: A

    Your Variable Of Interest Is: 2.01 

   Line 3:
    Variable One is: 306
    Variable Two is: 0_0
    Variable Three is: 1.80
    Variable Four is: AR

    Your Variable Of Interest Is: 1.80 

   Line 4:
    Variable One is: 305
    Variable Two is: 0_0
    Variable Three is: 2.50
    Variable Four is: AR

    Your Variable Of Interest Is: 2.50 


 Method Two: single space, split: 


   Line 1:
    Variable One is:  abc 
    Variable Two is: 0_0
    Variable Three is: 2.02
    Variable Four is: A

    Your Variable Of Interest Is: 2.02 

   Line 2:
    Variable One is:  305 
    Variable Two is: 0_0
    Variable Three is: 2.01
    Variable Four is: A

    Your Variable Of Interest Is: 2.01 

   Line 3:
    Variable One is:  306 
    Variable Two is: 0_0
    Variable Three is: 1.80
    Variable Four is: AR

    Your Variable Of Interest Is: 1.80 

   Line 4:
    Variable One is:  305 
    Variable Two is: 0_0
    Variable Three is: 2.50
    Variable Four is: AR

    Your Variable Of Interest Is: 2.50 


 
  Schizoid And Very Effective!

  Godzilla!


------------------------------

Date: Fri, 12 May 2000 11:01:17 +0200
From: Nils <ii4533@fh-wedel.de>
Subject: regular expression needed
Message-Id: <391BC85D.4C7F088F@fh-wedel.de>

Hi
I am looking for an expression which does a substitution something like this:
s/label(:(.*)?)/you wrote \2/
In the case that \2 is an empty string, 'nothing' should be inserted
instead of \2.
Does anyone have an idea how to do that?
Thanks, Nils
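
One possible approach, sketched under the assumption that the colon is
always present after "label": the /e modifier makes the replacement a
Perl expression, so it can test whether the captured text is empty.
The sample strings are invented:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @results;
for my $text ('label:hello', 'label:') {
    # /e evaluates the replacement as code; $1 is the text after the colon.
    # (Inside Perl replacements, $1 is preferred over \1.)
    (my $out = $text)
        =~ s/label:(.*)/'you wrote ' . (length($1) ? $1 : 'nothing')/e;
    push @results, $out;
    print "$out\n";
}
# prints "you wrote hello" and then "you wrote nothing"
```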




------------------------------

Date: Thu, 11 May 2000 22:32:57 -0700
From: Larry Rosler <lr@hpl.hp.com>
Subject: Re: Subroutine error stops module ?!
Message-Id: <MPG.13853764cffa704598aa62@nntp.hpl.hp.com>

In article <slrn8hmjlp.laa.tadmc@magna.metronet.com> on Thu, 11 May 2000 
20:22:17 -0400, Tad McClellan <tadmc@metronet.com> says...
> On 11 May 2000 19:04:33 GMT, Hans <hans-jan@stack.nl> wrote:

 ...

> >open (FLIST, "<list_2.txt") || die "Can't open list_2.txt";
>        ^^^^^
> 
> >close (LIST);
>         ^^^^
> 
> 
> You would have been warned about that typo if you had enabled warnings.
> 
> Take all of the help that you can get.

That help includes printing $! as part of the diagnostic on failure to 
open the file.
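
A short sketch of what that diagnostic might look like. The helper name
open_list and the missing file name are made up for the example, and
the three-argument open with a lexical filehandle is a substitution for
the bareword form in the quoted code:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: open a file for reading or die with the reason.
sub open_list {
    my ($file) = @_;
    # $! holds the operating system's explanation of why open failed.
    open(my $fh, '<', $file) or die "Can't open $file: $!\n";
    return $fh;
}

# Demonstrate the diagnostic with a file name that presumably does not exist.
my $missing = 'no_such_list_2.txt';
my $fh = eval { open_list($missing) };
print "Error was: $@" unless $fh;
```

With $! in the message, a "No such file or directory" can be told apart
from a "Permission denied" without guessing.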

-- 
(Just Another Larry) Rosler
Hewlett-Packard Laboratories
http://www.hpl.hp.com/personal/Larry_Rosler/
lr@hpl.hp.com


------------------------------

Date: 12 May 2000 07:19:34 GMT
From: Hans <hans-jan@stack.nl>
Subject: Re: Subroutine error stops module ?!
Message-Id: <8fgba6$55u$1@news.tue.nl>

Tad McClellan <tadmc@metronet.com> wrote:
> On 11 May 2000 19:04:33 GMT, Hans <hans-jan@stack.nl> wrote:

>>use strict;


> That part is Very Good though!


>>open (FLIST, "<list_2.txt") || die "Can't open list_2.txt";
>        ^^^^^

>>close (LIST);
>         ^^^^

This is a typo I made in this message.

I've narrowed it down to :
my $msg=MIME::Lite->new(. ........ );
&print_file_list;
$msg->send;

Apparently I forgot to pass $msg to sub print_file_list and back to the main
routine. I tried
&print_file_list ($msg);

and 

sub print_file_list {
my $msg_in = @_;
 .......
return $msg_in;

but it won't work.

Hans






------------------------------

Date: Fri, 12 May 2000 08:54:41 GMT
From: poppln@my-deja.com
Subject: Re: Subroutine error stops module ?!
Message-Id: <8fggsc$b2m$1@nnrp1.deja.com>

In article <8fgba6$55u$1@news.tue.nl>,
  Hans <hans-jan@stack.nl> wrote:

> sub print_file_list {
> my $msg_in = @_;
> .......
> return $msg_in;

An array in scalar context returns the number of elements in it.
You should use:

  my ($msg_in) = @_;

or

  my $msg_in = $_[0];
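
The difference can be seen in a small sketch. The two subs and the hash
reference standing in for the message object are invented for the
example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Scalar assignment: @_ is evaluated in scalar context, giving a count.
sub count_args {
    my $n = @_;
    return $n;
}

# List assignment: the parentheses copy the first element of @_.
sub first_arg {
    my ($first) = @_;
    return $first;
}

# Hypothetical stand-in for the MIME::Lite object.
my $msg = { subject => 'hypothetical message' };

my $count = count_args($msg);   # 1, the number of arguments
my $same  = first_arg($msg);    # the same reference that was passed in

print "count_args returned $count\n";
print "first_arg returned the original reference\n" if $same == $msg;
```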


Nadav


Sent via Deja.com http://www.deja.com/
Before you buy.


------------------------------

Date: Fri, 12 May 2000 10:03:29 +0200
From: Marco Natoni <blah@nospam.com>
Subject: Re: unpack c struct
Message-Id: <391BBAD1.EDD54570@nospam.com>

Hi Eric,

Eric Chen wrote:
> There's an API that returns some value which has the following 
> C struct format.
> Typedef struct{
> UCHAR unMsgsize;
> UCHAR unErrorcode;
> UCHAR unMsgbody;
> USHORT unHour;
> USHORT unMinute;
> ULONG ulOpen;
> ULONG ulClose;
> }ReturnValue;
> How can I parse the return value in perl?

  As your subject suggests: there is an unpack sub that could carry on
your task.

	perldoc -f unpack
	perldoc -f pack
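
  A rough sketch of how that might look for this struct. The template
assumes UCHAR is 8 bits, USHORT 16 bits and ULONG 32 bits, native byte
order, and one byte of compiler padding after the three UCHARs; the
real layout depends on the compiler and should be checked. The sample
values are invented, and pack is used only to fake the API's return
value:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# C3 = three unsigned chars, x = one assumed padding byte,
# S2 = two unsigned 16-bit shorts, L2 = two unsigned 32-bit longs.
my $template = 'C3 x S2 L2';

# Fake a return value; a real program would get these bytes from the API.
my $raw = pack $template, 16, 0, 7, 9, 30, 1000, 2000;

my ($msgsize, $errorcode, $msgbody, $hour, $minute, $open, $close)
    = unpack $template, $raw;

printf "size=%d error=%d body=%d time=%02d:%02d open=%d close=%d\n",
    $msgsize, $errorcode, $msgbody, $hour, $minute, $open, $close;
```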


	Best regards,
		Marco


------------------------------

Date: Fri, 12 May 2000 04:51:19 GMT
From: hacktic@my-deja.com
Subject: Windows or Unix, Perl or C
Message-Id: <8fg2k3$rh8$1@nnrp1.deja.com>

Hi all,

I need to develop a webserver that hosts websites. It needs to make user
accounts and with some menu based CGI-tool that allows the user to
create their web-sites. It's something like xoom.com and geocities.com .

This is my first big project, but don't know where to start.
I know C but I'm not familiar with Perl. I heard that Perl is a lot
easier to work with than C making CGI. Any truth in this?

I need to consider some things:
- add user accounts
- send confirmation mail
- track login sessions
- use commandline image editing tools
- upload files (mainly images)
- edit HTML files by user
- maintain user area quotas
- security

Would this be easiest done on UNIX or on Windows NT?
E.g. to add user account on UNIX, it's some command like 'adduser' in
the shell, but is it that easy on Windows NT?

I really would appreciate some advice and pointers on how to get
started.

Regards;
-Mark-





------------------------------

Date: 12 May 2000 05:26:20 GMT
From: dedsrd@aol.com (DEDSRD)
Subject: Re: Windows or Unix, Perl or C
Message-Id: <20000512012620.19178.00002170@ng-ci1.aol.com>

I would look at Mac OS X; the PDF doc made it sound fairly automated. I also read
at webmonkey.com that Mac made BSD Unix into a McDonald's register.
As for Perl, everything I've read says it's the #1 choice for CGI scripting.
Darrell


------------------------------

Date: Fri, 12 May 2000 08:52:04 +0100
From: "Damon Jebb" <damon_jebb@nai.com>
Subject: Re: Windows or Unix, Perl or C
Message-Id: <8fgd7c$lm3$1@new-news.na.nai.com>

The answer to these questions is, as ever, it depends...

I doubt that C would be the best language for any web development, but you
should consider JavaScript and VBScript as well as Perl.  If you have good
C++ skills you might prefer JavaScript.  Personally I like Perl: it's
powerful, and does some things that I want to do more readily than the
other languages (like pattern matching, file handling, etc.).

From the platform point of view you need to consider what other computers
are on the network that you are connecting to, if any.  If you definitely
have other computers that are either Unix or NT then I would think that the
safest approach would be to use the same platform.  One of the reasons for
this is that I have experience with trying to set up security and user
accounts on an NT box running NFS services and Linux using Samba - there are
compromises to be made that you may find inappropriate.

Not answers, just my two-pence worth.  If you haven't already, make a
complete list of what you think your current requirements are and then see
which combination of platform and language most closely matches the list.
Be sure that the choice will always be a compromise :-)

HTH

Damon
<hacktic@my-deja.com> wrote in message news:8fg2k3$rh8$1@nnrp1.deja.com...
> Hi all,
>
> I need to develop a webserver that hosts websites. It needs to make user
> accounts and with some menu based CGI-tool that allows the user to
> create their web-sites. It's something like xoom.com and geocities.com .
>
> This is my first big project, but don't know where to start.
> I know C but I'm not familiar with Perl. I heard that Perl is a lot
> easier to work with than C making CGI. Any truth in this?
>
> I need to consider some things:
> - add user accounts
> - send confirmation mail
> - track login sessions
> - use commandline image editing tools
> - upload files (mainly images)
> - edit HTML files by user
> - maintain user area quotas
> - security
>
> Would this be easiest done on UNIX or on Windows NT?
> E.g. to add user account on UNIX, it's some command like 'adduser' in
> the shell, but is it that easy on Windows NT?
>
> I really would appreciate some advice and pointers on how to get
> started.
>
> Regards;
> -Mark-
>
>
>
> Sent via Deja.com http://www.deja.com/
> Before you buy.




------------------------------

Date: 16 Sep 99 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 16 Sep 99)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

| NOTE: The mail to news gateway, and thus the ability to submit articles
| through this service to the newsgroup, has been removed. I do not have
| time to individually vet each article to make sure that someone isn't
| abusing the service, and I no longer have any desire to waste my time
| dealing with the campus admins when some fool complains to them about an
| article that has come through the gateway instead of complaining
| to the source.

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V9 Issue 3025
**************************************

