[29628] in Perl-Users-Digest
Perl-Users Digest, Issue: 872 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Sep 21 03:09:37 2007
Date: Fri, 21 Sep 2007 00:09:04 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Fri, 21 Sep 2007 Volume: 11 Number: 872
Today's topics:
Re: Concatenation: $i.$j different from "$i$j" <dummy@example.com>
Re: FAQ 4.8 How do I perform an operation on a series o sln@netherlands.co
Re: FAQ 4.8 How do I perform an operation on a series o <dummy@example.com>
Re: List Variable becomes undefined inexplicably <tadmc@seesig.invalid>
Re: looking at parsing procedures <zaxfuuq@invalid.net>
new CPAN modules on Fri Sep 21 2007 (Randal Schwartz)
Re: perl and unix command <tadmc@seesig.invalid>
Re: Using (?{}) code blocks and $^R <nobull67@gmail.com>
Re: utf8 and HTML Entities sln@netherlands.co
Re: utf8 and HTML Entities <helmut@wollmersdorfer.at>
Re: utf8 and HTML Entities <paduille.4061.mumia.w+nospam@earthlink.net>
Why no warning when redclaring a variable in same scope <tony@skelding.co.uk>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Fri, 21 Sep 2007 03:11:28 GMT
From: "John W. Krahn" <dummy@example.com>
Subject: Re: Concatenation: $i.$j different from "$i$j"
Message-Id: <ADGIi.5531$nO3.2192@edtnps90>
Occidental wrote:
> {
> my $i = "TT";
> my $j = "AT";
>
> if ("$i$j" =~ /X/)
> {
> print "$i$j matched\n";
> }
> else
> {
> print "$i$j not matched\n";
> }
> }
>
> {
> my $i = "TT";
> my $j = "AT";
>
> if ($i.$j =~ /X/)
> {
> print $i.$j . " matched\n";
> }
> else
> {
> print $i.$j . " not matched\n";
> }
> }
>
> gives
>
> TTAT not matched
> TTAT matched
>
> Can anyone explain?
Yes, perl can:
$ perl -MO=Deparse,-p -e'
{
my $i = "TT";
my $j = "AT";
if ("$i$j" =~ /X/)
{
print "$i$j matched\n";
}
else
{
print "$i$j not matched\n";
}
}
{
my $i = "TT";
my $j = "AT";
if ($i.$j =~ /X/)
{
print $i.$j . " matched\n";
}
else
{
print $i.$j . " not matched\n";
}
}
'
{
(my $i = 'TT');
(my $j = 'AT');
if (("$i$j" =~ /X/)) {
print("$i$j matched\n");
}
else {
print("$i$j not matched\n");
}
}
{
(my $i = 'TT');
(my $j = 'AT');
if (($i . ($j =~ /X/))) {
print((($i . $j) . " matched\n"));
}
else {
print((($i . $j) . " not matched\n"));
}
}
-e syntax OK
if (("$i$j" =~ /X/)) {
Is false because there is no 'X' in 'TTAT'.
if (($i . ($j =~ /X/))) {
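Is true because =~ binds more tightly than '.': the match ($j =~ /X/)
fails and returns the empty string, so the expression
($i . ($j =~ /X/)) yields 'TT', which is a true value.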
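The precedence point above can be checked directly. The following is a minimal sketch, not part of the original post; the variable names $unparenthesized and $parenthesized are made up for illustration.

```perl
use strict;
use warnings;

my ($i, $j) = ('TT', 'AT');

# =~ binds tighter than '.', so this parses as $i . ($j =~ /X/).
# The failed match yields the empty string, and 'TT' . '' is 'TT'.
my $unparenthesized = $i . $j =~ /X/;

# Parentheses force the concatenation first, so the match sees 'TTAT'.
my $parenthesized = ($i . $j) =~ /X/;

print "without parentheses: '$unparenthesized'\n";
print "with parentheses: ", ($parenthesized ? 'matched' : 'not matched'), "\n";
```

Writing ($i . $j) =~ /X/ gives the same result as the interpolated form "$i$j" =~ /X/.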
John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall
------------------------------
Date: Thu, 20 Sep 2007 18:47:52 -0700
From: sln@netherlands.co
Subject: Re: FAQ 4.8 How do I perform an operation on a series of integers?
Message-Id: <1f86f35s4ksje80c15oml375milcuhvn36@4ax.com>
On Thu, 20 Sep 2007 18:08:00 -0700, Bill H <bill@ts1000.us> wrote:
>On Sep 20, 9:03 pm, PerlFAQ Server <br...@stonehenge.com> wrote:
>> This is an excerpt from the latest version perlfaq4.pod, which
>> comes with the standard Perl distribution. These postings aim to
>> reduce the number of repeated questions as well as allow the community
>> to review and update the answers. The latest version of the complete
>> perlfaq is at http://faq.perl.org.
>>
>> --------------------------------------------------------------------
>>
>> 4.8: How do I perform an operation on a series of integers?
>>
>> To call a function on each element in an array, and collect the results,
>> use:
>>
>> @results = map { my_func($_) } @array;
>>
>> For example:
>>
>> @triple = map { 3 * $_ } @single;
>>
>> To call a function on each element of an array, but ignore the results:
>>
>> foreach $iterator (@array) {
>> some_func($iterator);
>> }
>>
>> To call a function on each integer in a (small) range, you can use:
>>
>> @results = map { some_func($_) } (5 .. 25);
>>
>> but you should be aware that the ".." operator creates an array of all
>> integers in the range. This can take a lot of memory for large ranges.
>> Instead use:
>>
>> @results = ();
>> for ($i=5; $i < 500_005; $i++) {
>> push(@results, some_func($i));
>> }
>
>I have not seen this before. What is the purpose of the "_" in 500_005
>or is it just used to replace 500,005?
>
>Bill H
Yeah, that's interesting
"for ($i=5; $i < 500_005; $i++)"
between 500 o' 5
or between 5-500,005, that's interesting
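For reference, the idioms from the quoted FAQ entry can be sketched as below. This is only an illustration: some_func is a stand-in (here it squares its argument), and the range is shortened so the output stays small.

```perl
use strict;
use warnings;

# Stand-in for whatever per-element work is needed (an assumption).
sub some_func { my ($n) = @_; return $n * $n }

# Call a function on each element and collect the results.
my @single = (1, 2, 3);
my @triple = map { 3 * $_ } @single;

# Counter loop: no list of all integers is built up front.
my @results;
for (my $i = 5; $i < 10; $i++) {
    push @results, some_func($i);
}

print "@triple\n";
print "@results\n";
```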
------------------------------
Date: Fri, 21 Sep 2007 03:15:18 GMT
From: "John W. Krahn" <dummy@example.com>
Subject: Re: FAQ 4.8 How do I perform an operation on a series of integers?
Message-Id: <aHGIi.65287$Pd4.28564@edtnps82>
Bill H wrote:
> On Sep 20, 9:03 pm, PerlFAQ Server <br...@stonehenge.com> wrote:
>>
>> @results = ();
>> for ($i=5; $i < 500_005; $i++) {
>> push(@results, some_func($i));
>> }
>
> I have not seen this before. What is the purpose of the "_" in 500_005
> or is it just used to replace 500,005?
perldoc perldata
[ SNIP ]
Scalar value constructors
Numeric literals are specified in any of the following floating point or
integer formats:
12345
12345.67
.23E-10 # a very small number
3.14_15_92 # a very important number
4_294_967_296 # underscore for legibility
0xff # hex
0xdead_beef # more hex
0377 # octal (only numbers, begins with 0)
0b011011 # binary
You are allowed to use underscores (underbars) in numeric literals between
digits for legibility. You could, for example, group binary digits by
threes (as for a Unix-style mode argument such as 0b110_100_100) or by
fours (to represent nibbles, as in 0b1010_0110) or in other groups.
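A short sketch of the rule quoted above, with made-up variable names. Note that the underscore is only ignored in numeric literals in source code; a string such as "500_005" converted to a number stops at the underscore.

```perl
use strict;
use warnings;

# In a numeric literal the underscore is ignored by the parser.
my $literal = 500_005;

# In a string it is not: numification stops at the first non-digit.
# The 'numeric' warning is silenced here only to keep the output clean.
my $from_string = do { no warnings 'numeric'; "500_005" + 0 };

print "$literal\n";
print "$from_string\n";

# Binary digits grouped by threes, printed as an octal Unix mode.
printf "%o\n", 0b110_100_100;
```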
John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall
------------------------------
Date: Fri, 21 Sep 2007 01:32:07 GMT
From: Tad McClellan <tadmc@seesig.invalid>
Subject: Re: List Variable becomes undefined inexplicably
Message-Id: <slrnff66ml.irj.tadmc@tadmc30.sbcglobal.net>
mattbreedlove@yahoo.com <mattbreedlove@yahoo.com> wrote:
> I believe there is a place for strict and warnings.
That's nice but it is not relevant to getting someone to examine
your problem.
What is relevant to getting the interest of others is what the
_others_ think.
I've been here for quite some time, and I know how programs without
warnings and strict are received here (the most common reaction
is to simply move on to the next question).
I was trying to help you avoid being ignored. Disregard my
comments if being ignored does not concern you.
> That place is not
> necessarily in a quick test script to illustrate a completely
> unrelated problem.
warnings and strictures very often find bugs in programs. Many
people will find it demeaning to be asked to do the work of a machine.
It is inconsiderate of your audience to post without warnings and strict.
> Thanks
Off to the killfile you go then.
> PERL should be able to keep track of the frame of reference
There is no PERL.
There is a Perl and there is a perl.
> This should be the same:
> while(<CMD>) {
> chomp;
> $LINE="$_";
> }
>
> as this:
>
> while ( $LINE = <CMD> ) {
> chomp $LINE;
>
> Such a "waste" of a precious line when it could be done with one line
> less,
The number of lines is not the point.
The point is maintainability.
Being direct is easier to read and understand than being circuitous.
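The direct form can be sketched as follows. This is an illustration only: an in-memory filehandle stands in for the CMD handle of the quoted code, and the names $cmd, $line and @lines are assumptions.

```perl
use strict;
use warnings;

# An in-memory filehandle stands in for the command output.
my $text = "first\nsecond\n";
open my $cmd, '<', \$text or die "Cannot open in-memory handle: $!";

my @lines;
while (my $line = <$cmd>) {   # Perl adds an implicit defined() test here
    chomp $line;
    push @lines, $line;
}
close $cmd;

print "$_\n" for @lines;
```

Reading straight into a named lexical avoids the extra copy out of $_ and keeps the variable scoped to the loop.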
--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
------------------------------
Date: Fri, 21 Sep 2007 00:06:08 -0700
From: "Wade Ward" <zaxfuuq@invalid.net>
Subject: Re: looking at parsing procedures
Message-Id: <pPydneX7iur5wm7bnZ2dnUVZ_gadnZ2d@comcast.com>
"Wade Ward" <zaxfuuq@invalid.net> wrote in message
news:O4-dncwwILFfL23bnZ2dnUVZ_rSinZ2d@comcast.com...
>
>
>
> "Michele Dondi" <bik.mido@tiscalinet.it> wrote in message
> news:v96ve3190qo083jkl2trbp4v9q3ob6kh5s@4ax.com...
>> On Mon, 17 Sep 2007 23:02:34 -0700, "Wade Ward" <zaxfuuq@invalid.net>
>> wrote:
>>
>>>I've got backlash syndrom. After looking at linux and windows for twenty
>>
>> Please try to quote properly. It's getting increasingly difficult to
>> reply to your posts. I had written:
>>
>> : >How do I test the subject line to see whether 'Solaris' occurs?
>> :
>> : /Solaris/
>>
>> Then you go on:
>>
>>>years my chance of getting one or the other is fifty fifty, when the
>>>difference matters.
>>>
>>
>>
>> I just meant that if $str is your string then to test whether it
>> contains "Solaris" you can do
>>
>> if ($str =~ /Solaris/) { ... }
>>
>> If the string is in $_ you can just do
>>
>> if (/Solaris/) { ... }
>>
>> Or else you can use the specialized index() function about which you
>> can read in
>>
>> perldoc -f index
>>
>> But *IIRC* the regex engine optimizes the above match to index()
>> anyway.
>>
>>>So one indicates a string literal by bracketing with backslash?
>>
>> Huh?!? No, one specifies a literal string with the q() and qq()
>> operators, commonly disguised as '' and "" respectively. Instead you
>> can use the m() match operator to check for a pattern which need not
>> be a literal string, but if it is (i.e. it contains no metacharacters
>> having a special meaning in regexen) then it is treated as a pattern
>> as well. If you use forward slashes as delimiters, then you can omit
>> the "m".
>>
>> At this point I strongly recommend you to carefully read the "Quote
>> and Quote-like Operators" section in
>>
>> perldoc perlop
>>
>>
> I'll take a look tomorrow. I can't wait to get a printer. I'm gonna
> spend a hundred bucks on the best perl reference. Ques que say?
http://www.zaxfuuq.net/perl2.htm
A little bit of technical art here. I had a big day that didn't involve me
doing what I said, as it regards perl. Instead, I'm using perl to provide
verification of Keith <Jabba the Hut's>Thompson's c.l.c. troll nature.
One of the hardest things to teach people about picking up perl can use this
screendump as a reference. You could use text to describe it to a windows
user who doesn't know its significance, but a picture says a thousand words.
Of course, Keith can beat that with either 2000/today or 2000/every
time he opens his fucking mouth.
--
Wade Ward
wade@zaxfuuq.net
"I ain't got time to bleed."
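The quoted advice about testing a subject line for 'Solaris' can be sketched as below. The subject string is made up for illustration; both tests are meant to give the same answer for a plain literal.

```perl
use strict;
use warnings;

my $subject = 'Re: Solaris 10 patch cluster';   # made-up subject line

# Pattern match with the m// operator (the "m" is optional with slashes).
my $by_regex = $subject =~ /Solaris/ ? 1 : 0;

# index() returns the position of the substring, or -1 if it is absent.
my $by_index = index($subject, 'Solaris') >= 0 ? 1 : 0;

# With the string in $_ the binding operator can be left out.
my $by_topic = 0;
for ($subject) {
    $by_topic = 1 if /Solaris/;
}

print "regex: $by_regex, index: $by_index, topic: $by_topic\n";
```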
------------------------------
Date: Fri, 21 Sep 2007 04:42:16 GMT
From: merlyn@stonehenge.com (Randal Schwartz)
Subject: new CPAN modules on Fri Sep 21 2007
Message-Id: <JopBqG.2tz@zorch.sf-bay.org>
The following modules have recently been added to or updated in the
Comprehensive Perl Archive Network (CPAN). You can install them using the
instructions in the 'perlmodinstall' page included with your Perl
distribution.
Algorithm-Dependency-Objects-0.04
http://search.cpan.org/~nuffin/Algorithm-Dependency-Objects-0.04/
An implementation of an Object Dependency Algorithm
----
Alien-SVN-1.4.5.1
http://search.cpan.org/~mschwern/Alien-SVN-1.4.5.1/
A wrapper for installing the SVN Perl bindings
----
Apache2-Filter-Minifier-CSS-1.00
http://search.cpan.org/~gtermars/Apache2-Filter-Minifier-CSS-1.00/
CSS minifying output filter
----
Apache2-Filter-Minifier-JavaScript-1.00
http://search.cpan.org/~gtermars/Apache2-Filter-Minifier-JavaScript-1.00/
JS minifying output filter
----
Archive-Extract-0.24
http://search.cpan.org/~kane/Archive-Extract-0.24/
A generic archive extracting mechanism
----
Astro-NED-Query-0.12
http://search.cpan.org/~djerius/Astro-NED-Query-0.12/
base class for NED queries
----
Carp-REPL-0.11
http://search.cpan.org/~sartak/Carp-REPL-0.11/
read-eval-print-loop on die and/or warn
----
Catalyst-Plugin-Cache-FileCache-0.7
http://search.cpan.org/~mramberg/Catalyst-Plugin-Cache-FileCache-0.7/
(DEPRECATED) File cache
----
Class-Dot-1.0.4
http://search.cpan.org/~asksh/Class-Dot-1.0.4/
Simple way of creating accessor methods.
----
Class-Dot-Model-0.1.3
http://search.cpan.org/~asksh/Class-Dot-Model-0.1.3/
Simple way of defining models for DBIx::Class.
----
Class-Workflow-0.06
http://search.cpan.org/~nuffin/Class-Workflow-0.06/
Light weight workflow system.
----
Config-XPath-0.10
http://search.cpan.org/~pevans/Config-XPath-0.10/
a module for retrieving configuration data from XML files by using XPath queries
----
DBM-Deep-1.0002
http://search.cpan.org/~rkinyon/DBM-Deep-1.0002/
A pure perl multi-level hash/array DBM that supports transactions
----
Data-Validate-XSD-1.03
http://search.cpan.org/~doctormo/Data-Validate-XSD-1.03/
Validate complex structures by definition
----
Date-PeriodParser-0.06
http://search.cpan.org/~mcmahon/Date-PeriodParser-0.06/
Turns English descriptions into time periods
----
Devel-Events-Filter-Size-0.03
http://search.cpan.org/~nuffin/Devel-Events-Filter-Size-0.03/
Add Devel::Size info to event data.
----
Devel-Events-Objects-0.03
http://search.cpan.org/~nuffin/Devel-Events-Objects-0.03/
Object tracking support for Devel::Events
----
Devel-Leak-Module-0.01_01
http://search.cpan.org/~adamk/Devel-Leak-Module-0.01_01/
Track loaded modules and namespaces
----
Devel-Leak-Module-0.01_02
http://search.cpan.org/~adamk/Devel-Leak-Module-0.01_02/
Track loaded modules and namespaces
----
Devel-PerlySense-0.01_22
http://search.cpan.org/~johanl/Devel-PerlySense-0.01_22/
IntelliSense for Perl
----
Encode-Arabic-1.5
http://search.cpan.org/~smrz/Encode-Arabic-1.5/
Encodings of Arabic
----
Encode-Mapper-1.5
http://search.cpan.org/~smrz/Encode-Mapper-1.5/
Intuitive, yet efficient construction of mappings for Encode
----
Eval-Context-0.01
http://search.cpan.org/~nkh/Eval-Context-0.01/
Evaluate perl code in context wrapper
----
File-Stat-Moose-0.01
http://search.cpan.org/~dexter/File-Stat-Moose-0.01/
Status info for a file - Moose-based
----
Games-Pentominos-1.0
http://search.cpan.org/~dami/Games-Pentominos-1.0/
solving the pentominos paving puzzle
----
HTML-Formulate-0.08
http://search.cpan.org/~gavinc/HTML-Formulate-0.08/
module for producing/rendering HTML forms
----
HTML-TableParser-0.36
http://search.cpan.org/~djerius/HTML-TableParser-0.36/
Extract data from an HTML table
----
HTML-WidgetValidator-0.0.3
http://search.cpan.org/~nanzou/HTML-WidgetValidator-0.0.3/
Perl framework for validating various widget HTML snippets
----
Hook-Modular-0.02
http://search.cpan.org/~marcel/Hook-Modular-0.02/
making pluggable applications easy
----
LaTeX-Table-v0.1.0
http://search.cpan.org/~limaone/LaTeX-Table-v0.1.0/
Perl extension for the automatic generation of LaTeX tables.
----
LaTeX-Table-v0.1.1
http://search.cpan.org/~limaone/LaTeX-Table-v0.1.1/
Perl extension for the automatic generation of LaTeX tables.
----
MIME-tools-5.421
http://search.cpan.org/~doneill/MIME-tools-5.421/
----
Module-Build-PM_Filter-v1.1
http://search.cpan.org/~vmoral/Module-Build-PM_Filter-v1.1/
Add a PM_Filter feature to Module::Build
----
Module-ScanDeps-0.77
http://search.cpan.org/~smueller/Module-ScanDeps-0.77/
Recursively scan Perl code for dependencies
----
Net-DNS-Resolver-Programmable-v0.003.1
http://search.cpan.org/~jmehnle/Net-DNS-Resolver-Programmable-v0.003.1/
programmable DNS resolver class for offline emulation of DNS
----
Net-DNS-ToolKit-0.32
http://search.cpan.org/~miker/Net-DNS-ToolKit-0.32/
tools for working with DNS packets
----
Net-DNS-ToolKit-0.33
http://search.cpan.org/~miker/Net-DNS-ToolKit-0.33/
tools for working with DNS packets
----
Net-FullAuto-0.09
http://search.cpan.org/~reedfish/Net-FullAuto-0.09/
Perl Based Secure Distributed Computing Network Process Automation Utility
----
Net-FullAuto-0.10
http://search.cpan.org/~reedfish/Net-FullAuto-0.10/
Perl Based Secure Distributed Computing Network Process Automation Utility
----
Net-eBay-0.43
http://search.cpan.org/~ichudov/Net-eBay-0.43/
Perl Interface to XML based eBay API.
----
POE-Component-Server-IRC-1.20
http://search.cpan.org/~bingos/POE-Component-Server-IRC-1.20/
A fully event-driven networkable IRC server daemon module.
----
Parallel-Iterator-0.2.0
http://search.cpan.org/~andya/Parallel-Iterator-0.2.0/
Simple parallel execution
----
Parallel-Workers-0.1.0
http://search.cpan.org/~andya/Parallel-Workers-0.1.0/
Simple parallel execution
----
Prima-prigraph-win32-1.05
http://search.cpan.org/~karasik/Prima-prigraph-win32-1.05/
binary prigraph.dll distribution for win32
----
Puppet-LogBody-1.003
http://search.cpan.org/~ddumont/Puppet-LogBody-1.003/
Log facility
----
SVG-Template-Graph-0.13
http://search.cpan.org/~ronan/SVG-Template-Graph-0.13/
Perl extension for generating template-driven graphs with SVG
----
Shell-GetEnv-0.03_1
http://search.cpan.org/~djerius/Shell-GetEnv-0.03_1/
extract the environment from a shell after executing commands
----
Slay-Maker-0.07
http://search.cpan.org/~nodine/Slay-Maker-0.07/
A perl make engine using perl code for rules
----
Socket-Class-1.1.1
http://search.cpan.org/~chrmue/Socket-Class-1.1.1/
A class to communicate with sockets
----
Teamspeak-0.5
http://search.cpan.org/~maletin/Teamspeak-0.5/
Interface to administrate Teamspeak-Voice-Server
----
Template-Multipass-0.01
http://search.cpan.org/~nuffin/Template-Multipass-0.01/
Add a meta template pass to TT
----
Template-Plugin-deJSON-0.03
http://search.cpan.org/~strytoast/Template-Plugin-deJSON-0.03/
----
Text-Aspell-0.09
http://search.cpan.org/~hank/Text-Aspell-0.09/
Perl interface to the GNU Aspell library
----
Text-CSV-Separator-0.16
http://search.cpan.org/~enell/Text-CSV-Separator-0.16/
Determine the field separator of a CSV file
----
Tie-JCR-0.02
http://search.cpan.org/~hanenkamp/Tie-JCR-0.02/
A tied hash interface for Java::JCR::Node
----
Tie-RefHash-Weak-0.07
http://search.cpan.org/~nuffin/Tie-RefHash-Weak-0.07/
A Tie::RefHash subclass with weakened references in the keys.
----
TinyAuth-0.90
http://search.cpan.org/~adamk/TinyAuth-0.90/
Extremely light-weight web-based authentication manager
----
TinyAuth-0.91
http://search.cpan.org/~adamk/TinyAuth-0.91/
Extremely light-weight web-based authentication manager
----
Tk-IDElayout-0.31
http://search.cpan.org/~cerney/Tk-IDElayout-0.31/
Tk Widget for Layout of Frames Similar to an IDE.
----
Tk-ObjScanner-2.011
http://search.cpan.org/~ddumont/Tk-ObjScanner-2.011/
Tk data scanner
----
WWW-Mechanize-Pluggable-1.04
http://search.cpan.org/~mcmahon/WWW-Mechanize-Pluggable-1.04/
customizable via plugins
----
WWW-Ofoto-1.20
http://search.cpan.org/~mgrimes/WWW-Ofoto-1.20/
A module to interact with the Ofoto (now Kodakgallery) website
----
WWW-Translate-Apertium-0.03
http://search.cpan.org/~enell/WWW-Translate-Apertium-0.03/
Open source machine translation
----
WWW-Translate-interNOSTRUM-0.11
http://search.cpan.org/~enell/WWW-Translate-interNOSTRUM-0.11/
Catalan < > Spanish machine translation
----
Web-Scraper-0.18
http://search.cpan.org/~miyagawa/Web-Scraper-0.18/
Web Scraping Toolkit inspired by Scrapi
----
Win32-0.32
http://search.cpan.org/~jdb/Win32-0.32/
Interfaces to some Win32 API Functions
----
Win32-GUIRobot-0.04
http://search.cpan.org/~karasik/Win32-GUIRobot-0.04/
send keyboard and mouse input to win32, analyze graphical output
----
pler-0.30
http://search.cpan.org/~adamk/pler-0.30/
The DWIM Perl Debugger
----
threads-shared-1.14
http://search.cpan.org/~jdhedden/threads-shared-1.14/
Perl extension for sharing data structures between threads
----
version-0.73
http://search.cpan.org/~jpeacock/version-0.73/
Perl extension for Version Objects
If you're an author of one of these modules, please submit a detailed
announcement to comp.lang.perl.announce, and we'll pass it along.
This message was generated by a Perl program described in my Linux
Magazine column, which can be found on-line (along with more than
200 other freely available past column articles) at
http://www.stonehenge.com/merlyn/LinuxMag/col82.html
print "Just another Perl hacker," # the original
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
------------------------------
Date: Fri, 21 Sep 2007 01:32:08 GMT
From: Tad McClellan <tadmc@seesig.invalid>
Subject: Re: perl and unix command
Message-Id: <slrnff672b.irj.tadmc@tadmc30.sbcglobal.net>
Paul Lalli <mritty@gmail.com> wrote:
> On Sep 20, 9:39 am, lerameur <leram...@yahoo.com> wrote:
>
>> > > I am writing a Perl script on Unix and I will be invoking some
>> > > Unix commands in my script. If the first command takes a few
>> > > minutes to process, how do I make sure the second command does
>> > > not start until the first command is finished?
>>
>> > perldoc -f system
>>
>> ..
>>
>> />perldoc -f system
>> ksh: perldoc: not found
>
> Your installation of Perl is broken.
Or you don't have the right directories in your PATH.
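As for the original question: system() does not return until the command it started has finished, so two calls in a row run one after the other. A minimal sketch, in which the two one-liners run through $^X (the running perl) are stand-ins for the real Unix commands:

```perl
use strict;
use warnings;

my @order;

# system() blocks until the child exits; 0 means success.
system($^X, '-e', 'sleep 1') == 0
    or die "first command failed: $?";
push @order, 'first done';

# This only starts once the first call has returned.
system($^X, '-e', 'exit 0') == 0
    or die "second command failed: $?";
push @order, 'second done';

print join(', ', @order), "\n";
```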
--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
------------------------------
Date: Fri, 21 Sep 2007 07:00:50 -0000
From: Brian McCauley <nobull67@gmail.com>
Subject: Re: Using (?{}) code blocks and $^R
Message-Id: <1190358050.753246.23110@n39g2000hsh.googlegroups.com>
On Sep 20, 12:06 am, Clint Olsen <clint.ol...@gmail.com> wrote:
> I've been writing some behemoth regular expressions and having some good
> luck relying on $^R to pass results from the RE. However, I have a
> non-intuitive result coming from Perl:
Wow, this is a "Gem"!
> #!/usr/bin/perl
>
> use strict;
> use warnings;
> #use re 'debug';
>
> my $multiline_comment = qr@/\*(?{ print "starting multi-line\n"; [ 0, 2 ] })
> (?:(.)+?
> (?{ [ $^R->[0], $^R->[1] + length $^N ] })
> | (?: (\n+) (?{ print "found newline in multi\n"; [ length $^N, 1 ] }))
> )*?
> \*/ (?{ print "finished comment\n"; [1, 1]; })
> @x;
>
> my $foo = "/* foo
> bar */";
>
> while ($foo =~ m/$multiline_comment/g) {
> print "@{$^R}\n";
>
> }
>
> When I run this example, I get:
>
> # ./test
> starting multi-line
> found newline in multi
> finished comment
> 0 2
>
> I expected to see $^R contain the results of the final code block, not the
> block above. I assume there's some weird scoping that I didn't anticipate,
> but I'm not sure why.
IMNSO $^R was a special variable too far. Code in (?{}) is already
able to set dynamically scoped (package) variables, and that really
should be enough to propagate information from one block to another.
It's certainly much clearer what's going on with explicit variables.
What's happening is that all but the first (?{}) do, in effect, an
implicit local($^R). What is, IMHO, confusing here is the fact that
the first (?{}) does not local($^R).
Consider:
#!/usr/bin/perl
use strict;
use warnings;
$_='X';
/ (?{1}) (?{2}) X (?{ print $^R }) /x and print "$^R\n";         #1 '21'
/ (?{1}) (?: (?{2}) X )? (?{ print $^R }) /x and print "$^R\n";  #2 '21'
/ (?{1}) (?: (?{2}) Y )? (?{ print $^R }) /x and print "$^R\n";  #3 '11'
our $r;
/ (?{$r=1}) (?{local $r=2}) X (?{ print $r }) /x and print "$r\n";   #1a '21'
/ (?{$r=1})((?{local $r=2}) X)?(?{ print $r }) /x and print "$r\n";  #2a '21'
/ (?{$r=1})((?{local $r=2}) Y)?(?{ print $r }) /x and print "$r\n";  #3a '11'
/ (?{$r=1})((?{ $r=2}) Y)?(?{ print $r }) /x and print "$r\n";       #3b '22'
__END__
All the above matches succeed. Examples 'a' show what's going on in
terms of an ordinary variable. (I only changed (?:) to () to avoid
line-wrap in the example - the value of $1 is not in discussion
here).
In example 1 the local is kinda redundant, as it was in your code, but
in examples 2 and 3 it is needed so that $^R can be popped when the
character does not match. Example 3 shows that when the pattern after
the second (?{}) fails the stack is popped back so that
local()izations from that block are undone.
Example 3b shows what would go wrong if there were not an implicit
local($^R). Even though the $r=2 happens in a branch of the pattern
match that is subsequently backtracked, its effect is not undone.
All the above may, in fact, be a simplification of the truth but in
conclusion I think that perlre's description of $^R should say that
the state of $^R should be considered indeterminate after completion
of the match.
------------------------------
Date: Thu, 20 Sep 2007 18:31:44 -0700
From: sln@netherlands.co
Subject: Re: utf8 and HTML Entities
Message-Id: <j576f3d703q8gfo42j4rudt1ps35ocg7ua@4ax.com>
On Wed, 19 Sep 2007 14:59:02 +0200, Nick Gerber <ng7067@gmx.com> wrote:
>Hi
>
>I'm lost :-(
>
>I have a string encoded in utf8 with part HTML Entities and part
>characters in utf-8.
>
>How do I translate the HTML Entities into proper utf-8?
>
>Thanks
Should be enough here to get you going:
sub convertEntities
{
my ($self, $str_ref, $opts) = @_;
my $alt_str = '';
my $res = 0;
my ($entchr);
# Usage info:
# Option bitmask: 1=char reference, 2=general reference, 4=parameter reference
# Default option is char and general references (&)
# Ignore Parameter references (%) in Attvalue and Content
# Process PE's in DTD and Entity decls
$opts = 3 unless defined $opts;
while ($$str_ref =~ /$self->{'RxEntConv'}/gc)
{
# Unicode character reference
if (defined $4) {
# decimal
if (($opts & 1) && defined ($entchr = getEntityUchar($self, $4))) {
$alt_str .= "$1$entchr";
$res = 1;
} else {
$alt_str .= "$1$2#$4;";
}
} elsif (defined $5) {
# hex
if (($opts & 1) && length($5) < 9 && defined ($entchr = getEntityUchar($self, hex($5)))) {
$alt_str .= "$1$entchr";
$res = 1;
} else {
$alt_str .= "$1$2#$5;";
}
}
else {
# General reference
if ($2 eq '&') {
if (($opts & 2) && exists $self->{'general_ent_subst'}->{$3}) {
$alt_str .= $1;
# expand general references,
# bypass if seen in the recursion ring
# ----
if (defined $self->{'ring_ent_subst'}->{$3}) {
$alt_str .= "$1$2$3;";
} else {
# recurse expansion
# ----
my ($entname, $alt_entval) = ($3, undef);
my $entval = $self->{'general_ent_subst'}->{$entname};
$self->{'ring_ent_subst'}->{$entname} = 1;
if (defined ($alt_entval = convertEntities ($self, \$entval, 2))) {
$alt_str .= $$alt_entval;
} else {
$alt_str .= $self->{'general_ent_subst'}->{$entname};
}
$self->{'ring_ent_subst'}->{$entname} = undef;
$res = 1;
}
} else {
$alt_str .= "$1$2$3;";
}
} else {
# Parameter reference
if (($opts & 4) && exists $self->{'parameter_ent_subst'}->{$3}) {
$alt_str .= "$1$self->{'parameter_ent_subst'}->{$3}";
$res = 1;
} else {
$alt_str .= "$1$2$3;";
}
}
}
}
if ($res) {
$alt_str .= substr $$str_ref, pos($$str_ref);
return \$alt_str;
}
return undef;
}
sub getEntityUchar
{
my ($self, $code) = @_;
if (($code >= 0x01 && $code <= 0xD7FF) ||
($code >= 0xE000 && $code <= 0xFFFD) ||
($code >= 0x10000 && $code <= 0x10FFFF)) {
return chr($code);
}
return undef;
}
sub addEntity
{
my ($self, $peflag, $entname, $entval) = @_;
# Non-normalized, internal entities only
# (no external defs yet, ie:SYSTEM/PUBLIC/NDATA)
return undef unless
($entval =~ s/^\s*'([^']*?)'\s*$/$1/s || $entval =~ s/^\s*"([^"]*?)"\s*$/$1/s);
# Replacement text: convert parameter and character references only
my ($alt_entval);
if (defined ($alt_entval = convertEntities ($self, \$entval, 5))) {
$entval = $$alt_entval;
}
my $enttype = 'general_ent_subst';
$enttype = 'parameter_ent_subst' if ($peflag);
if (exists $self->{$enttype}->{$entname}) {
# warn, pre-existing ent name
return undef;
}
$self->{$enttype}->{$entname} = $entval;
$self->{'Entities'} .= "|(?:$entname)";
# recompile regexp
$self->{'RxEntConv'} = qr/(.*?)(&|%)($self->{'Entities'});/s;
return \$entval;
}
@UC_Nstart = (
"\\x{C0}-\\x{D6}",
"\\x{D8}-\\x{F6}",
"\\x{F8}-\\x{2FF}",
"\\x{370}-\\x{37D}",
"\\x{37F}-\\x{1FFF}",
"\\x{200C}-\\x{200D}",
"\\x{2070}-\\x{218F}",
"\\x{2C00}-\\x{2FEF}",
"\\x{3001}-\\x{D7FF}",
"\\x{F900}-\\x{FDCF}",
"\\x{FDF0}-\\x{FFFD}",
"\\x{10000}-\\x{EFFFF}",
);
@UC_Nchar = (
"\\x{B7}",
"\\x{0300}-\\x{036F}",
"\\x{203F}-\\x{2040}",
);
$Nstrt = "[A-Za-z_:".join ('',@UC_Nstart)."]";
$Nchar = "[-\\w:\\.".join ('',@UC_Nchar).join ('',@UC_Nstart)."]";
$Name = "(?:$Nstrt$Nchar*?)";
$RxENTITY = qr/^\s+(?:($Name)|(?:%\s+($Name)))\s+(.*?)$/s;
------------------------------
Date: Fri, 21 Sep 2007 07:27:16 +0200
From: Helmut Wollmersdorfer <helmut@wollmersdorfer.at>
Subject: Re: utf8 and HTML Entities
Message-Id: <fcvknl$10pg$1@geiz-ist-geil.priv.at>
Nick Gerber wrote:
> I tried HTML/Entities.pm, but it didn't do the trick for me. But, it was
> me that could not make it to do the conversion for me. I'll try again.
That's my way which works for millions of HTML (or XML) files:
use HTML::Entities;
my $ENCODING = 'utf8'; # or iso-8859-7, CP1250 etc.
open (HTML, "<:encoding($ENCODING)", "$DIR/$file")
or die "Can't open: $!";
my $data = <HTML>;
my $content = decode_entities($data);
binmode(STDOUT, ":utf8");
print "$content\n";
It is also safe (in most cases) to use
my $content = decode_entities(decode_entities($data));
which also decodes doubly encoded entities, something like
&amp;amp;
| $ perl -version
| This is perl, v5.8.8 built for i486-linux-gnu-thread-multi
Helmut Wollmersdorfer
------------------------------
Date: Fri, 21 Sep 2007 01:36:05 -0500
From: "Mumia W." <paduille.4061.mumia.w+nospam@earthlink.net>
Subject: Re: utf8 and HTML Entities
Message-Id: <13f6q4mg930uo89@corp.supernews.com>
On 09/20/2007 08:31 PM, sln@netherlands.co wrote:
> On Wed, 19 Sep 2007 14:59:02 +0200, Nick Gerber <ng7067@gmx.com> wrote:
>
>> Hi
>>
>> I'm lost :-(
>>
>> I have a string encoded in utf8 with part HTML Entities and part
>> characters in utf-8.
>>
>> How do I translate the HTML Entities into proper utf-8?
>>
>> Thanks
>
> Should be enough here to get you going:
>
> [ long program snipped ]
No, that's too much.
Mr. Gerber didn't post any code or data, and so he didn't get many
responses because no one knew exactly what he was talking about.
As Mr. Bullock said, HTML::Entities should do it. Here is an example:
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Entities;
binmode(STDOUT, ':utf8');
local $/;
my $data = <DATA>;
$data = decode_entities($data);
print $data, "\n";
__DATA__
膄 膅 膆
á é í ó ú
ä ë ï ö ü
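To show what the decoding step amounts to, here is a minimal sketch that uses only core Perl and handles numeric character references alone. The helper decode_numeric_entities is hypothetical and is no substitute for HTML::Entities: it knows no named entities such as &eacute; and does not validate code points.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: numeric character references only.
sub decode_numeric_entities {
    my ($text) = @_;
    $text =~ s/&#x([0-9A-Fa-f]+);/chr(hex $1)/ge;   # hexadecimal, e.g. &#xEF;
    $text =~ s/&#([0-9]+);/chr($1)/ge;              # decimal, e.g. &#233;
    return $text;
}

# The result is a string of characters, not yet of UTF-8 octets.
my $decoded = decode_numeric_entities('caf&#233; and na&#xEF;ve');

# Encoding to UTF-8 happens on output, here made explicit with Encode.
my $bytes = encode('UTF-8', $decoded);

print length($decoded), " characters, ", length($bytes), " bytes\n";
```

In a real program the binmode(STDOUT, ':utf8') call shown above does the final encoding step when the string is printed.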
------------------------------
Date: Thu, 20 Sep 2007 21:25:52 -0700
From: Mintcake <tony@skelding.co.uk>
Subject: Why no warning when redclaring a variable in same scope
Message-Id: <1190348752.846369.237370@i38g2000prf.googlegroups.com>
#!/usr/local/bin/perl
use strict;
use warnings;
for my $i (1..10)
{
my $i = 0; # Why no warning
}
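A likely explanation, offered as an assumption rather than a definitive answer: the "masks earlier declaration" warning only fires for a redeclaration in the same scope, and the loop variable belongs to the loop while the second my $i belongs to the nested block of the body. The sketch below compares the two cases; the helper warnings_for is made up for illustration and compiles each snippet with a string eval so that compile-time warnings can be collected.

```perl
use strict;
use warnings;

# Collect the warnings raised while a snippet is compiled and run.
sub warnings_for {
    my ($code) = @_;
    my @seen;
    local $SIG{__WARN__} = sub { push @seen, @_ };
    eval "use warnings; $code; 1" or die $@;
    return @seen;
}

# Two declarations in one scope: perl warns about masking.
my @same_scope = warnings_for('my $x = 1; my $x = 2');

# Loop variable and body variable live in different scopes.
my @loop_body = warnings_for('for my $i (1 .. 3) { my $i = 0 }');

print scalar(@same_scope), " warning(s) for the same-scope redeclaration\n";
print scalar(@loop_body),  " warning(s) for the loop-body redeclaration\n";
```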
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 872
**************************************