[24686] in Perl-Users-Digest


Perl-Users Digest, Issue: 6847 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Aug 9 09:06:03 2004

Date: Mon, 9 Aug 2004 06:05:08 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Mon, 9 Aug 2004     Volume: 10 Number: 6847

Today's topics:
    Re: creating shell scripts using #!/usr/local/env perl (Peter J. Acklam)
        DDD initialisation (Roman Kaganovich)
        First position two strings differ <abodeman@yahoo.com>
    Re: First position two strings differ <abodeman@yahoo.com>
    Re: First position two strings differ <tassilo.von.parseval@rwth-aachen.de>
    Re: First position two strings differ (Anno Siegel)
    Re: First position two strings differ <tassilo.von.parseval@rwth-aachen.de>
    Re: First position two strings differ (Anno Siegel)
    Re: Hash reference question <ceo@nospam.on.net>
        Help with Date / Time <Primster7@yahoo.com>
    Re: Help with Date / Time <ebohlman@omsdev.com>
    Re: Newbie problem with perl and rsh (zenshade)
    Re: Newbie problem with perl and rsh <sholden@flexal.cs.usyd.edu.au>
    Re: Newbie problem with perl and rsh (Anno Siegel)
    Re: Newbie problem with perl and rsh <matrix_calling@yahoo.dot.com>
    Re: Parse log files for data <Joe.Smith@inwap.com>
        Printing only a portion of a matched regex -- newbie qu <dot@dot.dot>
    Re: Printing only a portion of a matched regex -- newbi <dot@dot.dot>
    Re: Printing only a portion of a matched regex -- newbi <abodeman@yahoo.com>
    Re: Printing only a portion of a matched regex -- newbi <dot@dot.dot>
    Re: Printing only a portion of a matched regex -- newbi <Joe.Smith@inwap.com>
    Re: Printing only a portion of a matched regex -- newbi <gnari@simnet.is>
    Re: Problem with accentued characters <someone@example.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 9 Aug 2004 04:39:44 -0700
From: pjacklam@online.no (Peter J. Acklam)
Subject: Re: creating shell scripts using #!/usr/local/env perl
Message-Id: <fe4612a5.0408090339.2c48222a@posting.google.com>

Abigail <abigail@abigail.nl> wrote:
>
> The point is that assuming that "env" is present, and has the
> same location on all systems doesn't differ from assuming that
> "perl" is present and has the same location on all machines.

Well, yes, but this is only true for Perl programs that are
supposed to work absolutely everywhere, but I'm being pragmatic
and I'm happy as long as the programs run on all systems that I
have.  And for me, the least common multiple is

    #!/usr/bin/env perl
    ...

with the second best solution being

    #!/bin/sh
    PATH=$PATH:/usr/local/bin
    exec perl -Sx $0 ${1+"$@"}
    #!perl
    ...

Peter


------------------------------

Date: 8 Aug 2004 23:54:34 -0700
From: rkaganov@ort.org.il (Roman Kaganovich)
Subject: DDD initialisation
Message-Id: <a3cd74f7.0408082254.5f220a51@posting.google.com>

Hello,

How can I automatically initialise DDD to display a set of my variables
when I start a debugging session?


------------------------------

Date: Sun, 08 Aug 2004 22:27:44 -0500
From: "Brian Kell" <abodeman@yahoo.com>
Subject: First position two strings differ
Message-Id: <opscf3winjz772u5@ultramarine.unl.edu>

What's the most efficient way to compare two strings and return the first  
position they differ? The corresponding C code would look something like  
this:

int strcpos(const char *s, const char *t) {
     int i;
     for (i = 0; s[i] == t[i]; ++i)
         if (s[i] == '\0')
             return -1; /* s==t */
     return i;
}

You could do it like this:

sub strcpos {
     my @chars = split //, shift;
     my @chart = split //, shift;
     my $i = 0;
     while ((my $c = shift @chars) == shift @chart) {
         return -1 unless defined $c;
         ++$i;
     }
     return $i;
}

But that just feels un-Perlish and inefficient. Is there some other clever  
way to do it?

Brian


------------------------------

Date: Sun, 08 Aug 2004 22:29:27 -0500
From: "Brian Kell" <abodeman@yahoo.com>
Subject: Re: First position two strings differ
Message-Id: <opscf3zdndz772u5@ultramarine.unl.edu>

Oops. In my Perl idea, I am comparing strings with ==, when clearly I  
should be using eq instead. Please overlook that little glitch. I must  
have been copying too much from my C. :)

Brian
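
A minimal sketch of the character-by-character idea with the comparison
done by eq, as the correction above calls for. It indexes with substr
rather than splitting both strings into arrays; the name strcpos_loop is
made up for illustration and is not from the original post.

```perl
use strict;
use warnings;

# Return the first index at which $s and $t differ, or -1 if they are equal.
# strcpos_loop is a hypothetical name used only for this sketch.
sub strcpos_loop {
    my ($s, $t) = @_;
    my $longest = length($s) > length($t) ? length($s) : length($t);
    for my $i (0 .. $longest - 1) {
        # substr returns '' at the exact end of the shorter string, so a
        # length mismatch shows up as a difference at that index.
        return $i if substr($s, $i, 1) ne substr($t, $i, 1);
    }
    return -1;
}

print strcpos_loop("foobar", "fooBar"), "\n";   # prints 3
```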


------------------------------

Date: Mon, 9 Aug 2004 07:02:20 +0200
From: "Tassilo v. Parseval" <tassilo.von.parseval@rwth-aachen.de>
Subject: Re: First position two strings differ
Message-Id: <2noequF2ra5gU1@uni-berlin.de>

Also sprach Brian Kell:

> What's the most efficient way to compare two strings and return the first  
> position they differ? The corresponding C code would look something like  
> this:
> 
> int strcpos(const char *s, const char *t) {
>      int i;
>      for (i = 0; s[i] == t[i]; ++i)
>          if (s[i] == '\0')
>              return -1; /* s==t */
>      return i;
> }
> 
> You could do it like this:
> 
> sub strcpos {
>      my @chars = split //, shift;
>      my @chart = split //, shift;
>      my $i = 0;
>      while ((my $c = shift @chars) == shift @chart) {
>          return -1 unless defined $c;
>          ++$i;
>      }
>      return $i;
> }
> 
> But that just feels un-Perlish and inefficient. Is there some other clever  
> way to do it?

You can use the xor-trick: Xor the two strings and look for the first
character which is different from '\0':
    
    sub strcpos {
	my ($s, $t) = @_;
	return ($s ^ $t) =~ /(\000+)[^\000]/ ? length($1) : -1;
    }
   
    print strcpos("foobar", "fooBar");
    __END__
    3

Tassilo
-- 
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval


------------------------------

Date: 9 Aug 2004 08:30:36 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: First position two strings differ
Message-Id: <cf7cnc$n6o$1@mamenchi.zrz.TU-Berlin.DE>

Tassilo v. Parseval <tassilo.parseval@post.rwth-aachen.de> wrote in comp.lang.perl.misc:
> Also sprach Brian Kell:
> 
> > What's the most efficient way to compare two strings and return the first  
> > position they differ? The corresponding C code would look something like  
> > this:
> > 
> > int strcpos(const char *s, const char *t) {
> >      int i;
> >      for (i = 0; s[i] == t[i]; ++i)
> >          if (s[i] == '\0')
> >              return -1; /* s==t */
> >      return i;
> > }
> > 
> > You could do it like this:
> > 
> > sub strcpos {
> >      my @chars = split //, shift;
> >      my @chart = split //, shift;
> >      my $i = 0;
> >      while ((my $c = shift @chars) == shift @chart) {
> >          return -1 unless defined $c;
> >          ++$i;
> >      }
> >      return $i;
> > }
> > 
> > But that just feels un-Perlish and inefficient. Is there some other clever  
> > way to do it?
> 
> You can use the xor-trick: Xor the two strings and look for the first
> character which is different from '\0':
>     
>     sub strcpos {
> 	my ($s, $t) = @_;
> 	return ($s ^ $t) =~ /(\000+)[^\000]/ ? length($1) : -1;

The regex must be anchored to the beginning of the string.  OTOH,
there's no need to check for a non-null character.  Also, the return
value should be 0 if the strings differ from the beginning.  I'd say

        ($s ^ $t) =~ /^\000*/;
        $+[ 0];


>     }
>    
>     print strcpos("foobar", "fooBar");
>     __END__
>     3

Anno
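
A small sketch of the @+ variant wrapped in a sub, to show what it
returns. The name common_prefix_length is an assumption for
illustration. Because the anchored pattern can match with zero width,
the match always succeeds and $+[0] is the offset at which the leading
run of NUL characters ends; for equal strings that is the full length
rather than -1.

```perl
use strict;
use warnings;

# Length of the common prefix of $s and $t, read from the end offset of
# the match. common_prefix_length is a hypothetical name.
sub common_prefix_length {
    my ($s, $t) = @_;
    ($s ^ $t) =~ /^\000*/;   # always matches, possibly with zero width
    return $+[0];            # offset where the leading NUL run ends
}

print common_prefix_length("foobar", "fooBar"), "\n";   # prints 3
print common_prefix_length("same", "same"), "\n";       # prints 4
```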


------------------------------

Date: Mon, 9 Aug 2004 13:14:05 +0200
From: "Tassilo v. Parseval" <tassilo.von.parseval@rwth-aachen.de>
Subject: Re: First position two strings differ
Message-Id: <2np4k1F34b4mU1@uni-berlin.de>

Also sprach Anno Siegel:

> Tassilo v. Parseval <tassilo.parseval@post.rwth-aachen.de> wrote in comp.lang.perl.misc:
>> Also sprach Brian Kell:
>> 
>> > What's the most efficient way to compare two strings and return the first  
>> > position they differ? The corresponding C code would look something like  
>> > this:
>> > 
>> > int strcpos(const char *s, const char *t) {
>> >      int i;
>> >      for (i = 0; s[i] == t[i]; ++i)
>> >          if (s[i] == '\0')
>> >              return -1; /* s==t */
>> >      return i;
>> > }

[...]

>> You can use the xor-trick: Xor the two strings and look for the first
>> character which is different from '\0':
>>     
>>     sub strcpos {
>> 	my ($s, $t) = @_;
>> 	return ($s ^ $t) =~ /(\000+)[^\000]/ ? length($1) : -1;
> 
> The regex must be anchored to the beginning of the string.  OTOH,
> there's no need to check for a non-null character.  Also, the return
> value should be 0 if the strings differ from the beginning.  I'd say
> 
>         ($s ^ $t) =~ /^\000*/;
>         $+[ 0];

It should be anchored, yes. I did the test for non-null in order to
be able to return -1 when the match fails (that is, the two strings are
equal).  So your @+ approach won't return -1 on equal strings.

To handle strings that differ at the first character, I'd now rewrite
the whole thing into

    return ($s ^ $t) =~ /^(\000*)[^\000]/ ? length($1) : -1;
    
Tassilo
-- 
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval


------------------------------

Date: 9 Aug 2004 12:44:10 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: First position two strings differ
Message-Id: <cf7riq$4hl$1@mamenchi.zrz.TU-Berlin.DE>

Tassilo v. Parseval <tassilo.parseval@post.rwth-aachen.de> wrote in comp.lang.perl.misc:
> Also sprach Anno Siegel:
> 
> > Tassilo v. Parseval <tassilo.parseval@post.rwth-aachen.de> wrote in comp.lang.perl.misc:
> >> Also sprach Brian Kell:
> >> 
> >> > What's the most efficient way to compare two strings and return the first  
> >> > position they differ? The corresponding C code would look something like  
> >> > this:
> >> > 
> >> > int strcpos(const char *s, const char *t) {
> >> >      int i;
> >> >      for (i = 0; s[i] == t[i]; ++i)
> >> >          if (s[i] == '\0')
> >> >              return -1; /* s==t */
> >> >      return i;
> >> > }
> 
> [...]
> 
> >> You can use the xor-trick: Xor the two strings and look for the first
> >> character which is different from '\0':
> >>     
> >>     sub strcpos {
> >> 	my ($s, $t) = @_;
> >> 	return ($s ^ $t) =~ /(\000+)[^\000]/ ? length($1) : -1;
> > 
> > The regex must be anchored to the beginning of the string.  OTOH,
> > there's no need to check for a non-null character.  Also, the return
> > value should be 0 if the strings differ from the beginning.  I'd say
> > 
> >         ($s ^ $t) =~ /^\000*/;
> >         $+[ 0];
> 
> It should be anchored, yes. I did the test for non-null in order to
> be able to return -1 when the match fails (that is, the two strings are
> equal).  So your @+ approach won't return -1 on equal strings.

I was happy with it returning an index outside of the strings, but -1
may be a better indicator.

> To handle strings that differ at the first character, I'd now rewrite
> the whole thing into
> 
>     return ($s ^ $t) =~ /^(\000*)[^\000]/ ? length($1) : -1;

Yup.

Anno
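
For reference, a sketch that puts the agreed one-liner into a complete
sub and runs it on the cases discussed in the thread. It assumes both
arguments are plain byte strings: the ^ operator only does a string xor
when both operands are strings, and xor on strings containing code
points above 255 is not supported in recent Perls.

```perl
use strict;
use warnings;

# First position at which $s and $t differ, or -1 if they are equal.
sub strcpos {
    my ($s, $t) = @_;
    return ($s ^ $t) =~ /^(\000*)[^\000]/ ? length($1) : -1;
}

for my $pair (["foobar", "fooBar"], ["abc", "xbc"],
              ["same", "same"],     ["foo", "foobar"]) {
    printf "%-8s %-8s %d\n", @$pair, strcpos(@$pair);
}
# The four results are 3, 0, -1 and 3.
```

The last case works because the shorter operand is treated as if it
were padded with NUL bytes, so the xor result is as long as the longer
string.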


------------------------------

Date: Mon, 09 Aug 2004 03:10:13 GMT
From: ChrisO <ceo@nospam.on.net>
Subject: Re: Hash reference question
Message-Id: <pWBRc.1280$Gk.446@newssvr31.news.prodigy.com>

Steve May wrote:

> ChrisO wrote:
> 
>> Kristofer Pettijohn wrote:
>>
>>> I'm defining a hash similar to what follows:
>>>
>>> my $HASH = (
>>>   'key1' => (
>>>     'sub1' => 'key1value1',
>>>     'sub2' => 'key1value2'
>>>   ),
>>>   'key2' => (
>>>     'sub1' => 'key2value1',
>>>     'sub2' => 'key2value2')
>>> )
>>>
>>> and I would like to pass the reference of $HASH{'key1'} to a sub.  How
>>> do I go about doing this?
>>>
>>
>> First of all, define the hash properly using matching braces and not 
>> parens and end with a semicolon:
>>
>> my $hash = {
>>    key1 => {
>>       sub1 => 'key1value1',
>>       sub2 => 'key1value2',
>>    },
>>    key2 => {
>>       sub1 => 'key2value1',
>>       sub2 => 'key2value2',
>>    },
>> };
>>
>> The quoting of the keys is not required unless you are going to 
>> include whitespace in your key values.
>>
> 
> Ah.... that is not totally correct. When warnings are on and using 
> strict, a key like this-is-a-key will *not* compile.
> 

I'm not surprised.  When I said "whitespace", I had some faint 
recollection of dashes or underscores not working as you pointed out. 
But since I never code without "use warnings qw( all );" and "use 
strict;" I don't find much occasion to trip over those kinds of things. 
  I see these two pragmas pretty religiously recommended here in c.l.p.m 
and for good reason.

-ceo
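
A short sketch of the part of the question that the thread leaves
implicit: with the structure defined as above, $hash->{key1} is already
a reference to the inner hash and can be passed to a sub directly. The
sub name describe and the extra 'this-is-a-key' entry are made up for
illustration; the latter shows a key that has to be quoted under strict.

```perl
use strict;
use warnings;

my $hash = {
    key1 => { sub1 => 'key1value1', sub2 => 'key1value2' },
    key2 => { sub1 => 'key2value1', sub2 => 'key2value2' },
    # Hyphenated keys must be quoted; unquoted, this would be parsed as
    # a subtraction of barewords and rejected under strict.
    'this-is-a-key' => { sub1 => 'quoted' },
};

# describe is a hypothetical sub that receives a reference to an inner hash.
sub describe {
    my ($inner) = @_;
    return join ', ', map { "$_=$inner->{$_}" } sort keys %$inner;
}

print describe($hash->{key1}), "\n";   # prints sub1=key1value1, sub2=key1value2
```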


------------------------------

Date: Sun, 08 Aug 2004 23:07:28 -0700
From: Hoops <Primster7@yahoo.com>
Subject: Help with Date / Time
Message-Id: <d25eh0lqqd2u3l39riq3piortqgsuov4q9@4ax.com>

Hello,


I'm brand new with Perl and I'm just trying to fix the date and time
of a program that someone else wrote.



The line shows this:


$date = date("l dS of F Y h:i:s A");



I need to change it to read the correct date/time for U.S. PST.



Any help would be appreciated. 


------------------------------

Date: 9 Aug 2004 08:09:51 GMT
From: Eric Bohlman <ebohlman@omsdev.com>
Subject: Re: Help with Date / Time
Message-Id: <Xns954021027A196ebohlmanomsdevcom@130.133.1.4>

Hoops <Primster7@yahoo.com> wrote in 
news:d25eh0lqqd2u3l39riq3piortqgsuov4q9@4ax.com:

> I'm brand new with Perl and I'm just trying to fix the date and time
> of a program that someone else wrote.
> 
> 
> 
> The line shows this:
> 
> 
> $date = date("l dS of F Y h:i:s A");
> 
> 
> 
> I need to change it to read the correct date/time for U.S. PST.

Perl doesn't have a built-in date() function, so the program must either be 
defining one itself or using a module that contains a date() function.  If 
the latter, you'll need to determine what module it is (look for lines 
starting with 'use' toward the beginning of the program) and then look at 
that module's documentation to find out what the function is expecting (I 
don't know offhand which of the several date-related modules it's from).  
If the former, good luck! (that would mean that the original author defined 
his own function, and when people re-invent the wheel like that, they only 
rarely document it adequately).

Scary thought: that format did remind me of something, and I just realized 
what it was: the built-in date() function in PHP!  Are you *sure* you're 
dealing with *Perl* code here? (if there's a 'use PHP::DateTime;' toward 
the beginning of the program, then you really are)
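
If the program does turn out to be Perl, one way to get a string in the
same layout without any date() function is sketched below, using only
localtime. Everything here is an assumption for illustration: the
helper names php_style_date and ordinal_suffix are hypothetical, and
the time zone handling assumes a platform that honours the TZ
environment variable, implements POSIX::tzset and knows the zone name
America/Los_Angeles.

```perl
use strict;
use warnings;
use POSIX qw(tzset);

my @DAYS   = qw(Sunday Monday Tuesday Wednesday Thursday Friday Saturday);
my @MONTHS = qw(January February March April May June July
                August September October November December);

# English ordinal suffix for a day of the month (1st, 2nd, 3rd, 11th, ...).
sub ordinal_suffix {
    my ($day) = @_;
    return 'th' if $day % 100 >= 11 && $day % 100 <= 13;
    my @by_last_digit = qw(th st nd rd th th th th th th);
    return $by_last_digit[$day % 10];
}

# Takes the list returned by localtime() or gmtime() and mimics the
# layout of the format string "l dS of F Y h:i:s A".
sub php_style_date {
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) = @_;
    my $hour12 = $hour % 12 || 12;
    return sprintf '%s %02d%s of %s %d %02d:%02d:%02d %s',
        $DAYS[$wday], $mday, ordinal_suffix($mday), $MONTHS[$mon],
        $year + 1900, $hour12, $min, $sec, $hour < 12 ? 'AM' : 'PM';
}

# Assumption: TZ is honoured and this zone name is known to the system.
$ENV{TZ} = 'America/Los_Angeles';
tzset();
print php_style_date(localtime), "\n";
```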



------------------------------

Date: 8 Aug 2004 22:45:05 -0700
From: zenshade@wowway.com (zenshade)
Subject: Re: Newbie problem with perl and rsh
Message-Id: <76a99d19.0408082145.e8b7b90@posting.google.com>

Brian McCauley <nobull@mail.com> wrote in message news:<cf5rk1$lt9$1@slavica.ukpost.com>...

> I counsel you not to use the word "newbie" in subject lines - it tends 
> to predispose people against you...

> Your problem has nothing to do with rsh.  Had you tried replacing 'rsh' 
> with 'echo' you'd have found the problem persisted.  This process is 
> known as "problem partitioning".  It is an absolutely vital skill in 
> programming.  If you think your problem lies elsewhere than it does 
> then you are unlikely to find a solution.

Let N be the number of lines in your response.  Subtract one.  N - 1
is the number of unnecessary lines in your post.

> At a guess I'd say remove the linefeed from the end of $co_id.

This would have been quite sufficient.  Your points are well taken and
the line above much appreciated, but is it really necessary to strike
such a condescending, prejudicial, and one might even say arrogant
attitude?

> Clear and concise 
> subject lines are very important to the co-operative nature  of Usenet. 
> Wasting space in them will be perceived as an uncooperative act and so 
> will also serve to predispose people against you.

While I'm certainly in favor of giving a nod in the direction of
clarity and conciseness for the sake of general principles, there
comes a point where I'd much sooner just tell a great lot of people to
piss off than put up with their snobbery and finicky, nit-picky
notions of what and how Usenet is to be used. Not that you've actually
crossed that line or anything :). (Anyone feeling compelled to point
out the grammar errors in the previous sentences has most definitely
crossed that line. So piss off ;)

Seriously, though, thanks for mentioning the linefeed.  That's exactly
the type of response I was looking for.
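
A minimal sketch of the linefeed fix combined with the "replace rsh
with echo" test suggested in the quoted advice. The original script is
not shown in this digest, so the helper build_command, the host name
and the command layout are all hypothetical.

```perl
use strict;
use warnings;

# build_command is a hypothetical helper; the real command line is not
# shown in the thread.
sub build_command {
    my ($program, $co_id) = @_;
    chomp $co_id;    # drop a trailing newline, if there is one
    return "$program remotehost lookup $co_id --verbose";
}

# A value read from <STDIN> or from backticks usually ends in "\n".
my $co_id = "12345\n";

# With 'echo' in place of 'rsh' the assembled command can be inspected
# safely; without the chomp it would be split across two lines.
print build_command('echo', $co_id), "\n";
```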


------------------------------

Date: 9 Aug 2004 06:09:05 GMT
From: Sam Holden <sholden@flexal.cs.usyd.edu.au>
Subject: Re: Newbie problem with perl and rsh
Message-Id: <slrnche581.li1.sholden@flexal.cs.usyd.edu.au>

On 8 Aug 2004 22:45:05 -0700, zenshade <zenshade@wowway.com> wrote:
> Brian McCauley <nobull@mail.com> wrote in message news:<cf5rk1$lt9$1@slavica.ukpost.com>...
>
>> I counsel you not to use the word "newbie" in subject lines - it tends 
>> to predispose people against you...
>
>> Your problem has nothing to do with rsh.  Had you tried replacing 'rsh' 
>> with 'echo' you'd have found the problem persisted.  This process is 
>> known as "problem partitioning".  It is an absolutely vital skill in 
>> programming.  If you think your problem lies elsewhere than it does 
>> then you are unlikely to find a solution.
>
> Let N be the number of lines in your response.  Subtract one.  N - 1
> is the number of unnecessary lines in your post.

Because good general advice isn't useful.

>
>> At a guess I'd say remove the linefeed from the end of $co_id.
>
> This would have been quite sufficient.  Your points are well taken and
> the line above much appreciated, but is it really necessary to strike
> such a condescending, prejudicial, and one might even say arrogant
> attitude?

You decided not to follow the posting guidelines and so on, and just
expect everyone to bow to your whims. How arrogant of us.

Anyway, you might want to find somewhere you haven't been killfiled
by most of the helpful people for your next demand.

-- 
Sam Holden


------------------------------

Date: 9 Aug 2004 09:55:22 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: Newbie problem with perl and rsh
Message-Id: <cf7hma$qks$1@mamenchi.zrz.TU-Berlin.DE>

zenshade <zenshade@wowway.com> wrote in comp.lang.perl.misc:
> Brian McCauley <nobull@mail.com> wrote in message
> news:<cf5rk1$lt9$1@slavica.ukpost.com>...
> 
> > I counsel you not to use the word "newbie" in subject lines - it tends 
> > to predispose people against you...
> 
> > Your problem has nothing to do with rsh.  Had you tried replacing 'rsh' 
> > with 'echo' you'd have found the problem persisted.  This process is 
> > known as "problem partitioning".  It is an absolutely vital skill in 
> > programming.  If you think your problem lies elsewhere than it does 
> > then you are unlikely to find a solution.
> 
> Let N be the number of lines in your response.  Subtract one.  N - 1
> is the number of unnecessary lines in your post.

So you define as "unnecessary" what you don't want to hear?  How arrogant
is that?

Anno


------------------------------

Date: Mon, 09 Aug 2004 17:10:38 +0530
From: Abhinav <matrix_calling@yahoo.dot.com>
Subject: Re: Newbie problem with perl and rsh
Message-Id: <ZvJRc.15$Ah.64@news.oracle.com>

zenshade wrote:
> Brian McCauley <nobull@mail.com> wrote in message news:<cf5rk1$lt9$1@slavica.ukpost.com>...
> 
> 
>>I counsel you not to use the word "newbie" in subject lines - it tends 
>>to predispose people against you...
> 
> 
>>Your problem has nothing to do with rsh.  Had you tried replacing 'rsh' 
>>with 'echo' you'd have found the problem persisted.  This process is 
>>known as "problem partitioning".  It is an absolutely vital skill in 
>>programming.  If you think your problem lies elsewhere than it does 
>>then you are unlikely to find a solution.
> 
> 
> Let N be the number of lines in your response.  Subtract one.  N - 1
> is the number of unnecessary lines in your post.
> 

If you have lurked around this group for even a day, you will find that the 
person who replied to your post is one of the most helpful out here and, 
as far as I am concerned, one of the veterans.

Hope that puts your view in the proper perspective.


Regards

--

Abhinav


------------------------------

Date: Mon, 09 Aug 2004 02:19:42 GMT
From: Joe Smith <Joe.Smith@inwap.com>
Subject: Re: Parse log files for data
Message-Id: <1bBRc.224457$a24.74831@attbi_s03>

Joe Smith wrote:
> Dan wrote:
> 
>> ls -1 *.log
>> gives me an alphabetical listing of my log files.
>>
>> ls -1 *.log | grep /.*/
>> returns nothing.  No matter what I put into the regular expression,
> 
>   ls -1 *.log | grep '/.*/'

Sorry about that.  Grep does not use //.  But the argument needs quoting.

   ls -1 *.log | grep '.*'
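
For comparison, a sketch of the same filtering done inside Perl, where
the pattern never passes through the shell and so needs no shell
quoting. The helper name matching_logs is an assumption for
illustration.

```perl
use strict;
use warnings;

# Return the names that match $pattern. matching_logs is a hypothetical name.
sub matching_logs {
    my ($pattern, @names) = @_;
    return grep { /$pattern/ } @names;
}

# glob expands *.log much as the shell does; sort gives a stable order.
my @logs = sort glob('*.log');
print "$_\n" for matching_logs(qr/.*/, @logs);
```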


------------------------------

Date: Mon, 9 Aug 2004 12:49:14 +1000
From: "DIAMOND Mark R." <dot@dot.dot>
Subject: Printing only a portion of a matched regex -- newbie quesiton
Message-Id: <cf6om0$b85$1@perki.connect.com.au>

My apologies to begin with. I am a relatively new, and infrequent user of
perl.

I have a series of html files with contact information for doctors. The
files have enormous amounts of other stuff in them including script, image
links and so on.
But the names all appear between a particular <span ...> tag and a </b> tag,
with the words like "level7Name" or "level2Contact" (the quotes are in the
tag) marking the particular spans.
Line breaks don't seem to follow any particular pattern. The two structures
<span ... level.Name> .... nametoprint</b> and the equivalent for the
contact address are quite distinct without any strange embedding of the two.

What I'd like to do is print out the names, and the contact information, but
I've obviously gone wrong somewhere. I couldn't work out whether I should or
should not have a global at the end of the s///, but in either case, I still
have a problem. Any help would be very much appreciated.

$/ = ".\n";
$doctorlistfile = "c:\\tmp\\doctors.tmp";
open(DOCTORLISTFILE, "> $doctorlistfile" ) || die "Can't open $doctorlistfile \n";
while(<>) {
    s/<span +class=\"level[0-9]Name\"><b>([^<]*)<\/b>/ $1 /;
    print DOCTORLISTFILE $1;
    s/<span +class=\"level[0-9]Contact\"><b>([^<]*)<\/b>/ $1 /;
    print DOCTORLISTFILE $1;
}

--
Mark R. Diamond






------------------------------

Date: Mon, 9 Aug 2004 13:33:02 +1000
From: "DIAMOND Mark R." <dot@dot.dot>
Subject: Re: Printing only a portion of a matched regex -- newbie quesiton
Message-Id: <cf6r84$c68$1@perki.connect.com.au>

I should have added that I have searched the NG on Google groups, but part
of the problem is that I'm not quite sure what I should be searching for.
"print only match OR matching" pointed me to solutions which printed only
*lines* with an appropriate match.

mark




------------------------------

Date: Sun, 08 Aug 2004 23:02:35 -0500
From: "Brian Kell" <abodeman@yahoo.com>
Subject: Re: Printing only a portion of a matched regex -- newbie quesiton
Message-Id: <opscf5ilasz772u5@ultramarine.unl.edu>

$/ = ".\n";
$doctorlistfile = "c:\\tmp\\doctors.tmp";
open(DOCTORLISTFILE, "> $doctorlistfile" ) || die "Can't open $doctorlistfile \n";
while(<>) {
     s/<span +class=\"level[0-9]Name\"><b>([^<]*)<\/b>/ $1 /;
     print DOCTORLISTFILE $1;
     s/<span +class=\"level[0-9]Contact\"><b>([^<]*)<\/b>/ $1 /;
     print DOCTORLISTFILE $1;
}

It looks like you're close. You probably just want to use m// instead of  
s///, though, since you're only trying to match, not actually do a  
substitution, right? (And you probably want to print a newline after each  
of those, right?)

So something like this might work:

     m/<span +class="level[0-9]Name"><b>([^<]*)<\/b>/;
     print DOCTORLISTFILE "$1\n";

If that doesn't work, what does it print instead?

Brian


------------------------------

Date: Mon, 9 Aug 2004 15:05:41 +1000
From: "DIAMOND Mark R." <dot@dot.dot>
Subject: Re: Printing only a portion of a matched regex -- newbie quesiton
Message-Id: <cf70lr$ea2$1@perki.connect.com.au>

Thanks, Brian. You are quite right. I just want to match, not change. And I
do want those newlines. But it only prints the first instance of a name. I
have made two slight changes: the first so that the print is conditional,
the second because I realised that the tag that marks the end of the name or
contact is not always the same, so I have checked for the beginning of the
tag only in the following.

$/ = ".\n";
$doctorlistfile = "c:\\tmp\\doctors.tmp";
open(DOCTORLISTFILE, "> $doctorlistfile" ) || die "Can't open $doctorlistfile \n";
while(<>) {
     print DOCTORLISTFILE "$1\n" if m/<span +class="level[0-9]Name"><b>([^<]*)</;
     print DOCTORLISTFILE "$1\n" if m/<span +class="level[0-9]Contact"><b>([^<]*)</;
}

but as I say, only a single name (the first correct match) is extracted from
the file.

Another question to which I am unsure of the answer is whether the second
appearance of $1 is correct, or whether the indices of the $ increase
throughout the loop rather than just within each regex; i.e. is the first
match in the second regex actually called $2 ?

Cheers.

--
Mark R. Diamond


"Brian Kell" <abodeman@yahoo.com> wrote in message
news:opscf5ilasz772u5@ultramarine.unl.edu...
> $/ = ".\n";
> $doctorlistfile = "c:\\tmp\\doctors.tmp";
> open(DOCTORLISTFILE, "> $doctorlistfile" ) || die "Can't open $doctorlistfile \n";
> while(<>) {
>      s/<span +class=\"level[0-9]Name\"><b>([^<]*)<\/b>/ $1 /;
>      print DOCTORLISTFILE $1;
>      s/<span +class=\"level[0-9]Contact\"><b>([^<]*)<\/b>/ $1 /;
>      print DOCTORLISTFILE $1;
> }
>
> It looks like you're close. You probably just want to use m// instead of
> s///, though, since you're only trying to match, not actually do a
> substitution, right? (And you probably want to print a newline after each
> of those, right?)
>
> So something like this might work:
>
>      m/<span +class="level[0-9]Name"><b>([^<]*)<\/b>/;
>      print DOCTORLISTFILE "$1\n";
>
> If that doesn't work, what does it print instead?
>
> Brian




------------------------------

Date: Mon, 09 Aug 2004 08:07:45 GMT
From: Joe Smith <Joe.Smith@inwap.com>
Subject: Re: Printing only a portion of a matched regex -- newbie quesiton
Message-Id: <lhGRc.225605$a24.22213@attbi_s03>

DIAMOND Mark R. wrote:

> $/ = ".\n";
> while(<>) {

If your file does not have any lines that end with a period, then
the entire file will be read in by <>, and the code inside the while{}
block will be executed only once.  Try
   print "$. = '$_'\n";
as a debugging aid.

>      print DOCTORLISTFILE "$1\n" if m/<span +class="level[0-9]Name"><b>([^<]*)</;
>      print DOCTORLISTFILE "$1\n" if m/<span +class="level[0-9]Contact"><b>([^<]*)</;

> Another question to which I am unsure of the answer is whether the second
> appearance of $1 is correct

In each regex, $1 corresponds to the first set of capturing parentheses in
that regex.  The presence of any other regex in the file does not change this.
	-Joe
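
A small sketch of the effect described above, using an in-memory
filehandle so it runs without any input file. The helper name
count_records and the sample text are assumptions for illustration.

```perl
use strict;
use warnings;

# count_records is a hypothetical helper: it reads $text through an
# in-memory filehandle and counts how many records the readline
# operator returns for a given input record separator.
sub count_records {
    my ($text, $separator) = @_;
    local $/ = $separator;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    my $count = 0;
    while (defined(my $record = <$fh>)) {
        $count++;
    }
    close $fh;
    return $count;
}

my $html = "<p>first line\n<p>second line\n<p>third line\n";
print count_records($html, "\n"),  "\n";   # prints 3: one record per line
print count_records($html, ".\n"), "\n";   # prints 1: no line ends with a period
```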


------------------------------

Date: Mon, 9 Aug 2004 08:52:52 -0000
From: "gnari" <gnari@simnet.is>
Subject: Re: Printing only a portion of a matched regex -- newbie quesiton
Message-Id: <cf7dt2$2bn$1@news.simnet.is>

"DIAMOND Mark R." <dot@dot.dot> wrote in message
news:cf70lr$ea2$1@perki.connect.com.au...
> Thanks, Brian. You are quite right. I just want to match, not change. And I
> do want those newlines.. But it only prints the first instance of a name. I
> have made two slight changes . The first so that the print is conditional,
> the second because I realised that the tag that marks the end of the name or
> contact is not always the same, so I have checked for the beginning of the
> tag only in the following.
>
> $/ = ".\n";
this looks a bit tentative in light of your first post.
skip it

> $doctorlistfile = "c:\\tmp\\doctors.tmp";
> open(DOCTORLISTFILE, "> $doctorlistfile" ) || die "Can't open $doctorlistfile \n";
> while(<>) {
>      print DOCTORLISTFILE "$1\n" if m/<span +class="level[0-9]Name"><b>([^<]*)</;

you were almost there.
change the if to a while and add a /g:
    print DOCTORLISTFILE "$1\n"
      while m/<span +class="level[0-9]Name"><b>([^<]*)</g;

>
> but as I say, only a single name (the first correct match) is extracted from
> the file.

consistent with your $/ , probably

>
> Another question to which I am unsure of the answer is whether the second
> appearance of $1 is correct, or whether the indices of the $ increase
> throughout the loop rather than just within each regex; i.e. is the first
> match in the second regex actually called $2 ?

each regex resets the $n variables

gnari
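
A self-contained sketch of the while-plus-/g idiom applied to both
patterns, run against a made-up HTML fragment. The helper name
extract_fields and the sample data are assumptions for illustration. It
also shows that $1 belongs to whichever match ran last, so both loops
can use $1.

```perl
use strict;
use warnings;

# extract_fields is a hypothetical helper. It returns two array
# references: all names and all contacts found in $html.
sub extract_fields {
    my ($html) = @_;
    my (@names, @contacts);
    # In scalar context, /g continues from where the previous match
    # ended, so each pass of the loop picks up the next occurrence.
    push @names, $1
        while $html =~ /<span +class="level[0-9]Name"><b>([^<]*)</g;
    push @contacts, $1
        while $html =~ /<span +class="level[0-9]Contact"><b>([^<]*)</g;
    return (\@names, \@contacts);
}

my $sample = join "\n",
    '<span class="level7Name"><b>Dr A. Example</b></span>',
    '<span  class="level2Contact"><b>1 Sample Street</b></span>',
    '<span class="level3Name"><b>Dr B. Sample</i>';

my ($names, $contacts) = extract_fields($sample);
print "Name: $_\n"    for @$names;
print "Contact: $_\n" for @$contacts;
```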






------------------------------

Date: Mon, 09 Aug 2004 12:38:48 GMT
From: "John W. Krahn" <someone@example.com>
Subject: Re: Problem with accentued characters
Message-Id: <sfKRc.74827$T_6.70531@edtnps89>

Pierre Thibault wrote:
> 
> I am having problems with a perl script.  The following script is 
> working well except for directory or file names using accented 
> characters. The last line (!system...) is giving me errors when I'm 
> using the result from the 'find' command if the file or the directory 
> contains accented characters. The error is "No such file or directory".

While Perl is a good "glue" language it is better to use Perl's built-in
features and modules for more efficiency and control.


> I'm using Perl 5.0 version 8 subversion 1 RC3 on Mac OS 10.3.4.
> 
> I really don't know how to solve this problem. Any help would be 
> appreciated.
> 
> #!/usr/bin/perl
> #
> # Create a list of files with their MD5 sums
> 
> use strict;
> 
> `mkdir -p /Users/pierreth/M5`;
> 
> my @path_list = ((split /:/, $ENV{"PATH"}), "/Applications");
> for(@path_list) {
>    my $file_name_for_path = $_;
>    $file_name_for_path =~ s#/#-#g;
>    $file_name_for_path =~ s#^-##;
>    my $save_path = "/Users/pierreth/M5/$file_name_for_path";
>    `rm $save_path -f > /dev/null 2>&1`;
>    print "Now processing $file_name_for_path\n";
>    for(`find -L $_ -type f`) {
>       chomp;
>       s#\\#\\\\#g; # Paying attention to special character like \ 
>       s#'#\\'#g; # or '
>       !system ("md5sum '".$_."' >> '".$save_path."'") or die "Error $_ $save_path";
>    }
> }

Untested!

#!/usr/bin/perl
#
# Create a list of files with their MD5 sums

use warnings;
use strict;
use File::Find;
use File::Path;
use Env '@PATH';
use Digest::MD5 'md5_hex';

my $dir = '/Users/pierreth/M5';

-d $dir or eval { mkpath( $dir ) };
die "Cannot create $dir: $@" if $@;

for my $path ( @PATH, '/Applications' ) {
     ( my $file_name_for_path = $path ) =~ s!^/!!;
     $file_name_for_path =~ tr!/!-!;
     my $save_path = "$dir/$file_name_for_path";

     open my $MD5, '>', $save_path or die "Cannot open $save_path: $!";
     print "Now processing $file_name_for_path\n";

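     find( sub {
         return unless -f;
         local $/;    # slurp mode: read each file in one go
         open my $FILE, '<', $_ or die "Cannot open $File::Find::name: $!";
         binmode $FILE;
         # Record the full path, as the md5sum-based original did
         print $MD5 md5_hex( <$FILE> ), "  $File::Find::name\n";
         }, $path );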
     }

__END__



John
-- 
use Perl;
program
fulfillment


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 6847
***************************************
