[17884] in Perl-Users-Digest


Perl-Users Digest, Issue: 44 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Jan 11 21:10:38 2001

Date: Thu, 11 Jan 2001 18:10:18 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <979265418-v10-i44@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Thu, 11 Jan 2001     Volume: 10 Number: 44

Today's topics:
    Re: Perl idioms for converting string into list of char <mr_joesixpack@yahoo.com>
    Re: Perl idioms for converting string into list of char <ren.maddox@tivoli.com>
    Re: Perl idioms for converting string into list of char (Weston Cann)
    Re: Perl idioms for converting string into list of char <wyzelli@yahoo.com>
        reading binary files littlesputnik@my-deja.com
    Re: reading binary files <iboreham@my-deja.com>
    Re: regex help, please (Stan Brown)
    Re: regex help, please (Stan Brown)
    Re: regex help, please (Greg Bacon)
    Re: regex help, please (Jerome O'Neil)
    Re: Regexp for balanced parens <spug@halcyon.com>
    Re: replacing spaces with %20 (Damian James)
    Re: variable (Martien Verbruggen)
    Re: variable (Chris Fedde)
    Re: variable (Martien Verbruggen)
    Re: would a regex be helpful here? (Craig Berry)
        Digest Administrivia (Last modified: 16 Sep 99) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Thu, 11 Jan 2001 23:36:33 GMT
From: Paul Laub <mr_joesixpack@yahoo.com>
Subject: Re: Perl idioms for converting string into list of characters...
Message-Id: <93lg21$2n3$1@nnrp1.deja.com>

I can't believe no one has mentioned using unpack. Here are two
ways, though they may be too obscure to count as an "idiom".

map { chr $_ } unpack 'C*', $string

or

unpack 'a' x length($string), $string
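A minimal sketch of the unpack approaches side by side, assuming a short sample string. The unsigned 'C' format is used so that bytes above 127 do not come back as negative numbers, and the '(a)*' group form is an extra variant that needs perl 5.8 or later; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $string = "hello";   # sample input, chosen only for illustration

# 'C*' yields unsigned byte values; chr turns each back into a character.
my @via_chr = map { chr } unpack 'C*', $string;

# One 'a' per character: the template is built to match the string length.
my @via_a = unpack 'a' x length($string), $string;

# Group syntax (perl 5.8 and later) repeats 'a' without counting.
my @via_group = unpack '(a)*', $string;

print join('|', @via_chr), "\n";   # prints h|e|l|l|o
```

All three lists should be identical for plain single-byte text; strings containing wide characters would need more care.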

Paul


In article
<iowa88_song88.remove_eights-1101011149120001@58.salt-lake-city-03-04rs.ut.dial-access.att.net>,
  iowa88_song88.remove_eights@hotmail.com (Weston Cann) wrote:
> I can figure out a way to convert a string into a list of characters via
> repeated use of the substr function.
>
> It also looks as if you can do it using split(//,$str).
>
> Are there other perl idioms for doing this?
>
> =================================================================
> "The best laid plans of mice and men are about equal."
> iowa_so8ng@hot8mail.com
> Address is spam repellent. Remove eights to reach me.
>


Sent via Deja.com
http://www.deja.com/


------------------------------

Date: 11 Jan 2001 17:37:02 -0600
From: Ren Maddox <ren.maddox@tivoli.com>
Subject: Re: Perl idioms for converting string into list of characters...
Message-Id: <m3d7dtacf5.fsf@dhcp11-177.support.tivoli.com>

Ansel Sermersheim <ansel@babylon.dyndns.org> writes:

> One that I've used a variant of in the JAPH in my signature, but never
> seen anywhere else is '$string =~ /./g'.  I had assumed it only to be
> good for obfuscated code, but I just tested it and it seems to be
> significantly faster than 'split //, $string' on my machine:

Running your test on my system still shows MATCH to be faster, but
only by a very small margin:

     MATCH: 32 wallclock secs (31.49 usr +  0.00 sys = 31.49 CPU) @ 417.50/s (n=13147)
     SPLIT: 31 wallclock secs (31.62 usr +  0.00 sys = 31.62 CPU) @ 414.20/s (n=13097)

$ perl -v

This is perl, v5.6.0 built for i686-linux-thread-multi

> Anybody know why?  This seems odd to me.

Well, they both have to use the REGEX engine, so it isn't too
surprising.  FWIW, I previously had done some other benchmarking of
this that included just manually building the array, and it was less
than twice as fast as these two methods, so there probably isn't a
whole lot of room for improvement (short of having an internal
representation that allowed for list access to strings... ouch!).
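A rough sketch of how such a comparison could be set up with the core Benchmark module. The sample string, the iteration count, and the SUBSTR variant (standing in for "manually building the array") are assumptions, not the test that was actually run in this thread, and timings will differ from machine to machine.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

my $string = 'asdf' x 250;   # 1000-character sample; an arbitrary choice

# Each candidate builds the character list and returns its length.
my %idiom = (
    MATCH  => sub { my @c = $string =~ /./g;  scalar @c },
    SPLIT  => sub { my @c = split //, $string; scalar @c },
    SUBSTR => sub {
        my @c = map { substr $string, $_, 1 } 0 .. length($string) - 1;
        scalar @c;
    },
);

# Run every candidate 2000 times and print the timings.
timethese(2000, \%idiom);
```

Note that /./g skips newlines unless the /s modifier is added, so MATCH and SPLIT only agree on strings without embedded newlines.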

-- 
Ren Maddox
ren@tivoli.com


------------------------------

Date: Fri, 12 Jan 2001 01:37:05 GMT
From: iowa88_song88.remove_eights@hotmail.com (Weston Cann)
Subject: Re: Perl idioms for converting string into list of characters...
Message-Id: <iowa88_song88.remove_eights-1101011844310001@15.salt-lake-city-03-04rs.ut.dial-access.att.net>

In article <m3puhtam0e.fsf@dhcp11-177.support.tivoli.com>, Ren Maddox
<ren.maddox@tivoli.com> wrote:

> > I can figure out a way to convert a string into a list of characters via
> > repeated use of the substr function. It also looks as if you can do it 
> > using split(//,$str). Are there other perl idioms for doing this?
> 
> You could also do something with chop like:
> 
> unshift @chars, chop $string while length $string;

[snip]

> Of course, the real question is probably, Why do you want to do this?
> Treating a string as a list of characters is seldom the right way to
> go about things in Perl.  What is the underlying problem that you are
> trying to solve?

Well, partly, I'm just satisfying curiosity, since split is an acceptable
way to do things for me. But I'm beginning to think that one good way to 
learn more about Perl is by asking about The Other Ways To Do It.

But I'm happy to satisfy your curiosity too and maybe learn something
else in the process. The application is a typing test. Up until now, I've 
been comparing texts and answer keys (using Algorithm::Diff) word by word 
-- splitting on whitespace and some punctuation. This has been a fairly
satisfactory way to do things, until some people asked for the 
script to grade exercises like:

asdffasdsdafdsasdffasfsdfsadfaadsfafdsdsdasddfafsdfsdssaasdfsdddafsdfddsss

Suddenly, word by word doesn't work.

Now, since Algorithm::Diff takes two lists to do its work, the easiest
thing to do, in terms of not having to think very much, is to take the
text and the answer key and chop them up into char-by-char lists, and 
modify my typing score formulas slightly.

So... are there other ways to do it?

Weston

=================================================================
"The best laid plans of mice and men are about equal."
iowa_so8ng@hot8mail.com 
Address is spam repellent. Remove eights to reach me.


------------------------------

Date: Fri, 12 Jan 2001 11:35:24 +0930
From: "Wyzelli" <wyzelli@yahoo.com>
Subject: Re: Perl idioms for converting string into list of characters...
Message-Id: <cvt76.12$zL3.4888@vic.nntp.telstra.net>

"Weston Cann" <iowa88_song88.remove_eights@hotmail.com> wrote in message
news:iowa88_song88.remove_eights-1101011149120001@58.salt-lake-city-03-04rs.ut.dial-access.att.net...
> I can figure out a way to convert a string into a list of characters via
> repeated use of the substr function.
>
> It also looks as if you can do it using split(//,$str).
>
> Are there other perl idioms for doing this?
>

@chars = split /|/,$string;

Wyzelli
--
($a,$b,$w,$t)=(' bottle',' of beer',' on the wall','Take one down, pass
it around');
for(reverse(1..100)){$s=($_!=1)?'s':'';$c.="$_$a$s$b$w\n$_$a$s$b\n$t\n";
$_--;$s=($_!=1)?'s':'';$c.="$_$a$s$b$w\n\n";}print"$c*hic*";






------------------------------

Date: Fri, 12 Jan 2001 00:14:51 GMT
From: littlesputnik@my-deja.com
Subject: reading binary files
Message-Id: <93li9k$4mu$1@nnrp1.deja.com>

Hello,

I am trying to use a perl script to strip "^M" characters from a file,
but they are transparent to the perl interpreter.  Is there some other
way to specify "^M" so that the perl interpreter will see it and

$line=~s/^M//;

will work?
Thanks.

Heather


Sent via Deja.com
http://www.deja.com/


------------------------------

Date: Fri, 12 Jan 2001 01:40:28 GMT
From: Ian Boreham <iboreham@my-deja.com>
Subject: Re: reading binary files
Message-Id: <93lnac$91t$1@nnrp1.deja.com>

In article <93li9k$4mu$1@nnrp1.deja.com>,
  littlesputnik@my-deja.com wrote:
> I am trying to use a perl script to strip "^M" characters from a file,
> but they are transparent to the perl interpreter.

On certain platforms, they are converted or merged into the logical
newline '\n' when you read a line from a text file. That is not the same
as transparent to the interpreter. If they are in the string, perl can
see them.

>Is there some other
> way to specify "^M" so that the perl interpreter will see it and
>
> $line=~s/^M//;

^M is a regex that matches a capital M at the start of a line.

This same problem was discussed recently in the thread "strange
end-of-line", so check that thread for the solution.
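For readers without access to that thread, a minimal sketch of one common approach (not necessarily the solution given there): name the carriage return as "\r" rather than typing a caret and an M. The sample line is an assumption. On platforms that translate line endings in text mode, binmode on the filehandle may be needed before the carriage return shows up in the string at all.

```perl
use strict;
use warnings;

# Sample line with a DOS-style ending; "\r" is the carriage return (Ctrl-M).
my $line = "some text\r\n";

# "\r" (equivalently "\cM" or "\015") names the control character itself,
# whereas the two characters ^ and M mean "capital M at the start".
(my $clean = $line) =~ s/\r//g;

# tr with the d modifier deletes every carriage return as well.
(my $also_clean = $line) =~ tr/\r//d;

print $clean eq "some text\n" ? "stripped\n" : "still there\n";
```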


Ian


Sent via Deja.com
http://www.deja.com/


------------------------------

Date: 11 Jan 2001 19:01:02 -0500
From: stanb@panix.com (Stan Brown)
Subject: Re: regex help, please
Message-Id: <93lhfu$fpb$1@panix2.panix.com>

In <slrn95rhv8.2q0.bernard.el-hagin@gdndev25.lido-tech> bernard.el-hagin@lido-tech.net (Bernard El-Hagin) writes:

>On 11 Jan 2001 09:44:06 -0500, Stan Brown <stanb@panix.com> wrote:
>>Could some kind soul point out the error of my ways?
>>
>>I have a regex that looks like this:
>>
>>'[a-z\-A-Z_0-9 .#/%\(\)]{1,32} '
>                  ^
>What delimiter did you use for the m// operator? If you used slashes
>then the pointed out slash breaks the regex and has to be escaped:

>$_ = '150/35# LETD';
>print "Yipee" if /[a-z\-A-Z_0-9 .#\/%\(\)]{1,32}/;

>output:
>Yipee
That was it. The actual usage was buried in the Parse::Lex module, so I was
not thinking about this.

Thanks, very much.
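A small sketch of another way around the slash problem, assuming the pattern is written directly in Perl code rather than passed through a module: pick a different delimiter so the slash needs no escaping. The anchors and variable names are additions for illustration.

```perl
use strict;
use warnings;

my $input = '150/35# LETD';   # sample value from the post above

# With m{...} the slash is an ordinary character and needs no backslash.
# A hyphen placed last in the class is literal, so it needs none either.
if ($input =~ m{^[a-zA-Z_0-9 .#/%()-]{1,32}$}) {
    print "Yipee\n";
}

# qr{} compiles the same pattern once so it can be reused or passed around.
my $token = qr{[a-zA-Z_0-9 .#/%()-]{1,32}};
```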


------------------------------

Date: 11 Jan 2001 19:02:49 -0500
From: stanb@panix.com (Stan Brown)
Subject: Re: regex help, please
Message-Id: <93lhj9$ft1$1@panix2.panix.com>

In <t5rj26jq5dmq18@corp.supernews.com> gbacon@HiWAAY.net (Greg Bacon) writes:

>In article <93kgrm$er7$1@panix3.panix.com>,
>    Stan Brown <stanb@panix.com> wrote:

>: I have a regex that looks like this:
>: 
>: '[a-z\-A-Z_0-9 .#/%\(\)]{1,32} '
>: 
>: I am using this as the pattern for Parse::Lex to scan for. I intend for it
>: to accept ALL of the following:
>: Any alphabetic character (upper or lower case),
>: Any number,
>: The following non-alphanumeric characters .#/()%

>Your pattern matches runs of length n, where n is between 1 and 32,
>where each character in the run is one of

>  - abcdefghijklmnopqrstuvwxyz
>  - ABCDEFGHIJKLMNOPQRSTUVWXYZ
>  - 0123456789
>  - '_', '-', ' ', '#', '/', '%', '.', '(', ')' 

>Note that there are alphabetic characters that aren't present in the
>above list.

	?? when I was in school the alphabet consisted of a through z, what am
	I missing here?



------------------------------

Date: Fri, 12 Jan 2001 01:13:19 -0000
From: gbacon@HiWAAY.net (Greg Bacon)
Subject: Re: regex help, please
Message-Id: <t5smhfobr765f8@corp.supernews.com>

In article <93lhj9$ft1$1@panix2.panix.com>,
    Stan Brown <stanb@panix.com> wrote:

: In <t5rj26jq5dmq18@corp.supernews.com> gbacon@HiWAAY.net (Greg Bacon) writes:
:
: >Note that there are alphabetic characters that aren't present in the
: >above list.
: 
: 	?? when I was in school the alphabet consisted of a through z,
:       what am I missing here?

Is omega an alphabetic character?  It depends on whom you ask. :-)
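A minimal sketch of the difference, assuming a perl with Unicode property support: the explicit a-z/A-Z ranges reject omega, while the Unicode Letter property accepts it. The variable names are illustrative.

```perl
use strict;
use warnings;

my $omega = "\x{3C9}";   # GREEK SMALL LETTER OMEGA

# The hand-written ranges only cover the 52 unaccented ASCII letters.
my $ascii_only = $omega =~ /^[a-zA-Z]$/ ? 1 : 0;

# \p{L} matches any character in the Unicode general category Letter.
my $any_letter = $omega =~ /^\p{L}$/ ? 1 : 0;

binmode STDOUT, ':encoding(UTF-8)';
print "${omega}: ranges=$ascii_only, Letter property=$any_letter\n";
```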

Greg
-- 
Disclaimer: I have not benchmarked anything because I do not know what
your data looks like.
    -- mjd displaying uncommon wisdom


------------------------------

Date: Fri, 12 Jan 2001 01:24:59 GMT
From: jerome.oneil@360.com (Jerome O'Neil)
Subject: Re: regex help, please
Message-Id: <L1t76.1236$464.422416@news.uswest.net>

stanb@panix.com (Stan Brown) elucidates:

>>Note that there are alphabetic characters that aren't present in the
>>above list.
> 
> 	?? when I was in school the alphabet consisted of a through z, what am
> 	I missing here?

Russian, Greek, Chinese, Esperanto(!)....  In short, you're missing the 
3.75 billion people that didn't go to school in the US.  

-- 
If men could learn from history, what lessons it might teach us!  But
passion and party blind our eyes, and the light which experience gives
is a lantern on the stern, which shines only on the waves behind us.
				--Samuel Taylor Coleridge, "Recollections"


------------------------------

Date: 12 Jan 2001 00:08:30 GMT
From: Seattle PERL Users Group <spug@halcyon.com>
Subject: Re: Regexp for balanced parens
Message-Id: <93lhtu$r85$1@brokaw.wa.com>


Damian Conway's Text::Balanced module, available from CPAN, does
this and lots more.
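A short sketch of what using the module can look like, assuming the text begins with the opening parenthesis. The sample string and variable names are illustrative.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

my $text = '(a (nested) group) and the rest';   # sample input

# In list context extract_bracketed returns the balanced substring,
# then the remainder of the text, then any skipped prefix.
my ($extracted, $remainder) = extract_bracketed($text, '()');

print "extracted: $extracted\n";
print "remainder: $remainder\n";
```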

-Tim
*========================================================================*
| Dr. Tim Maher, CEO, Consultix       (206) 781-UNIX/8649;  ask for FAX# | 
| Email: tim@consultix-inc.com        Web: http://www.consultix-inc.com  |
| TIM MAHER: Unix/Perl  DAMIAN CONWAY: Adv. Perl   COLIN MEYER: Perl/DBI |
|Feb 7 Perl/DBI; 12 Int Perl; 15 OO-Perl; 20 Data Munging; 22 Adv OO-Perl|
*========================================================================*

: Well, I typed this on my W95 box (Activestate build 520):
: D:\>perldoc -q balanced
: Found in C:\perl\lib\pod\perlfaq6.pod
:   Can I use Perl regular expressions to match balanced text?

:             Although Perl regular expressions are more powerful than
:             "mathematical" regular expressions, because they feature
:             conveniences like backreferences (`\1' and its ilk),
:             they still aren't powerful enough. You still need to use
:             non-regexp techniques to parse balanced text, such as
:             the text enclosed between matching parentheses or
:             braces, for example.

:             An elaborate subroutine (for 7-bit ASCII only) to pull
:             out balanced and possibly nested single chars, like ``'
:             and `'', `{' and `}', or `(' and `)' can be found in
:             http://www.perl.com/CPAN/authors/id/TOMC/scripts/pull_quotes.gz .

:             The C::Scan module from CPAN contains such subs for
:             internal usage, but they are undocumented.

: Craig Kelly <cekelly@dvol.com>
: -------------------------------
: "Hey, I'm just this guy, see?"
:             -Zaphod Beeblebrox
: -------------------------------


------------------------------

Date: 12 Jan 2001 01:03:03 GMT
From: damian@puma.qimr.edu.au (Damian James)
Subject: Re: replacing spaces with %20
Message-Id: <slrn95sm04.mvl.damian@puma.qimr.edu.au>

In article <93ksin$fak$1@nnrp1.deja.com>, ^Jerry wrote:
>Dave Brondsema posted this... works great, too.. *Thanks, Dave*
>$image_name =~ s/ /\%20/g;

Well, this ought to work for spaces - but what about other characters?

>...I've
>installed Active Perl on my computer, too..  helps me figure out what
>I've messed up...  (I do that a lot.. hehe)
>

In that case, bring up the ActiveState documentation in your browser, scroll 
the left hand pane till you come to 'Module Docs', then under 'Root Libraries'
click on CGI. What you see in the right hand pane should be enough information
to work out a generic solution to the above (with LESS code than you would 
otherwise need :-).
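To illustrate what a generic solution has to do, here is a hand-rolled sketch. It is not the CGI module's own routine, and percent_encode is a hypothetical name. It escapes every character outside the RFC 3986 "unreserved" set and assumes single-byte input; wide characters would need to be encoded to UTF-8 bytes first.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode everything except letters, digits,
# hyphen, underscore, dot and tilde.
sub percent_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~])/sprintf '%%%02X', ord $1/ge;
    return $text;
}

print percent_encode('my image (1).gif'), "\n";   # my%20image%20%281%29.gif
```

In real code a maintained module is usually the better choice, which is the point of the advice above.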

Cheers,
Damian


------------------------------

Date: Thu, 11 Jan 2001 23:30:06 GMT
From: mgjv@tradingpost.com.au (Martien Verbruggen)
Subject: Re: variable
Message-Id: <slrn95sgdr.jgc.mgjv@verbruggen.comdyn.com.au>

On Thu, 11 Jan 2001 05:11:07 GMT,
	Chris Fedde <cfedde@fedde.littleton.co.us> wrote:
> In article <x7n1cy4sln.fsf@home.sysarch.com>,
> Uri Guttman  <uri@sysarch.com> wrote:
>>>>>>> "CF" == Chris Fedde <cfedde@fedde.littleton.co.us> writes:
>>
>>  CF> $/ = '';
>>
>>why the paragraph mode? it doesn't gain anything when you mung single
>>chars.
> 
> avoid an explicit loop.  

$/ = '' puts input in paragraph mode, which will read paragraphs at a
time. I think you wanted to set it to undef, which will return the
whole file in a single call to <>. Correct?

Martien
-- 
Martien Verbruggen              | 
Interactive Media Division      | 
Commercial Dynamics Pty. Ltd.   | Curiouser and curiouser, said Alice.
NSW, Australia                  | 


------------------------------

Date: Thu, 11 Jan 2001 23:38:03 GMT
From: cfedde@fedde.littleton.co.us (Chris Fedde)
Subject: Re: variable
Message-Id: <vtr76.880$B9.190392832@news.frii.net>

In article <slrn95sgdr.jgc.mgjv@verbruggen.comdyn.com.au>,
Martien Verbruggen <mgjv@tradingpost.com.au> wrote:
>On Thu, 11 Jan 2001 05:11:07 GMT,
>	Chris Fedde <cfedde@fedde.littleton.co.us> wrote:
>> In article <x7n1cy4sln.fsf@home.sysarch.com>,
>> Uri Guttman  <uri@sysarch.com> wrote:
>>>>>>>> "CF" == Chris Fedde <cfedde@fedde.littleton.co.us> writes:
>>>
>>>  CF> $/ = '';
>>>
>>>why the paragraph mode? it doesn't gain anything when you mung single
>>>chars.
>> 
>> avoid an explicit loop.  
>
>$/ = '' puts input in paragraph mode, which will read paragraphs at a
>time. I think you wanted to set it to undef, which will return the
>whole file in a single call to <>. Correct?
>

There was only one paragraph after the __END__.
-- 
    This space intentionally left blank


------------------------------

Date: Fri, 12 Jan 2001 01:26:55 GMT
From: mgjv@tradingpost.com.au (Martien Verbruggen)
Subject: Re: variable
Message-Id: <slrn95sn8t.jgc.mgjv@verbruggen.comdyn.com.au>

On Thu, 11 Jan 2001 23:38:03 GMT,
	Chris Fedde <cfedde@fedde.littleton.co.us> wrote:
> In article <slrn95sgdr.jgc.mgjv@verbruggen.comdyn.com.au>,
> Martien Verbruggen <mgjv@tradingpost.com.au> wrote:
>>On Thu, 11 Jan 2001 05:11:07 GMT,
>>	Chris Fedde <cfedde@fedde.littleton.co.us> wrote:
>>> In article <x7n1cy4sln.fsf@home.sysarch.com>,
>>> Uri Guttman  <uri@sysarch.com> wrote:
>>>>>>>>> "CF" == Chris Fedde <cfedde@fedde.littleton.co.us> writes:
>>>>
>>>>  CF> $/ = '';
>>>>
>>>>why the paragraph mode? it doesn't gain anything when you mung single
>>>>chars.
>>> 
>>> avoid an explicit loop.  
>>
>>$/ = '' puts input in paragraph mode, which will read paragraphs at a
>>time. I think you wanted to set it to undef, which will return the
>>whole file in a single call to <>. Correct?
>>
> 
> There was only one paragraph after the __END__.

Making assumptions about input without documenting them leads to
bugs.  If you mean to read a whole file, you should make that clear.
If you mean to read paragraphs, you should make that intention clear.

My comment was induced by the fact that you told Perl to read
paragraphs, but you implied that you meant it to read the whole file.
That for this particular input the two are identical isn't the issue.
The issue is that, even if you don't care about maintainability in
your own code, this was a post to Usenet. If you, yourself, aren't
clear in what you mean, then you will get followups like these that
try to make it clear.

People will read your code, and copy and paste it, change the input,
and not realise why it suddenly doesn't work as they think it should
work. After all, Mr. Fedde said that this would avoid the explicit
loop. Don't see the followups as criticism, but as extra explanations
of what's going on, if not for your benefit, then for others who are
reading this thread, and who didn't know about how $/ modifies the
reading of input.
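A minimal sketch of the two settings, assuming a small in-memory "file" with two paragraphs; read_records is a hypothetical helper introduced only for this illustration.

```perl
use strict;
use warnings;

my $data = "first paragraph\nstill first\n\nsecond paragraph\n";

# Read every record from an in-memory filehandle under a given separator.
sub read_records {
    my ($separator) = @_;
    local $/ = $separator;               # restored when the sub returns
    open my $fh, '<', \$data or die "cannot open in-memory file: $!";
    my @records = <$fh>;
    close $fh;
    return @records;
}

my @paragraphs = read_records('');      # paragraph mode: split on blank lines
my @whole      = read_records(undef);   # slurp mode: the whole file at once

printf "paragraph mode: %d records\n", scalar @paragraphs;   # 2
printf "slurp mode:     %d records\n", scalar @whole;        # 1
```

With only one paragraph in the input the two modes return the same single record, which is why the difference is easy to miss.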

Martien
-- 
Martien Verbruggen              | 
Interactive Media Division      | Hi, John here, what's the root
Commercial Dynamics Pty. Ltd.   | password?
NSW, Australia                  | 


------------------------------

Date: Fri, 12 Jan 2001 00:22:03 -0000
From: cberry@cinenet.net (Craig Berry)
Subject: Re: would a regex be helpful here?
Message-Id: <t5sjhbrenphda9@corp.supernews.com>

webqueen wrote:
: Say I have an input like:
: 
:     1,2,4,6-9,12,15-16,20
: 
: which I want to explode into:
:     1,2,4,6,7,8,9,12,15,16,20
: 
: in other words, convert each n-m into n,n+1,n+2..,m

Interesting, almost the inverse of another problem submitted here
recently. 

: I can write a script to do this manually, but I wonder is there any
: regex functionality that might help me?

  $_ = '1,2,4,6-9,12,15-16,20';
  s/(\d+)-(\d+)/join ',', $1..$2/eg;
  print;
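The same substitution wrapped in a function, as a sketch. expand_ranges is a hypothetical name, and the guard for reversed ranges is an added assumption: without it, "9-6" would expand to an empty string because 9..6 is an empty list.

```perl
use strict;
use warnings;

# Hypothetical wrapper: expand each "n-m" into "n,n+1,...,m", leaving a
# reversed range such as "9-6" untouched.
sub expand_ranges {
    my ($list) = @_;
    $list =~ s/(\d+)-(\d+)/$1 <= $2 ? join(',', $1 .. $2) : "$1-$2"/eg;
    return $list;
}

print expand_ranges('1,2,4,6-9,12,15-16,20'), "\n";
# prints 1,2,4,6,7,8,9,12,15,16,20
```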

-- 
   |   Craig Berry - http://www.cinenet.net/~cberry/
 --*--  "The hills are burning, and the wind is raging; and the clock
   |   strikes midnight in the Garden of Allah." - Don Henley


------------------------------

Date: 16 Sep 99 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 16 Sep 99)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

| NOTE: The mail to news gateway, and thus the ability to submit articles
| through this service to the newsgroup, has been removed. I do not have
| time to individually vet each article to make sure that someone isn't
| abusing the service, and I no longer have any desire to waste my time
| dealing with the campus admins when some fool complains to them about an
| article that has come through the gateway instead of complaining
| to the source.

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 44
*************************************

