[31106] in Perl-Users-Digest


Perl-Users Digest, Issue: 2351 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat Apr 18 18:09:45 2009

Date: Sat, 18 Apr 2009 15:09:11 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sat, 18 Apr 2009     Volume: 11 Number: 2351

Today's topics:
        dmake.exe:  Error: -- 'C:\Perl\libConfig.pm' not found, tuser1@gmail.com
    Re: effienct way to select random value from the hash <jurgenex@hotmail.com>
    Re: generic variable name possible? <tadmc@seesig.invalid>
    Re: I'm looking for a Perl Book... I think. <someone@somewhere.nb.ca>
    Re: I'm looking for a Perl Book... I think. <nat.k@gm.ml>
    Re: I'm looking for a Perl Book... I think. <jurgenex@hotmail.com>
    Re: I'm looking for a Perl Book... I think. <nat.k@gm.ml>
    Re: I'm looking for a Perl Book... I think. <jurgenex@hotmail.com>
    Re: I'm looking for a Perl Book... I think. <nat.k@gm.ml>
    Re: I'm looking for a Perl Book... I think. <someone@somewhere.nb.ca>
    Re: I'm looking for a Perl Book... I think. <ben@morrow.me.uk>
    Re: I'm looking for a Perl Book... I think. <jurgenex@hotmail.com>
    Re: What does `my' do?! <nospam-abuse@ilyaz.org>
    Re: What does `my' do?! <nat.k@gm.ml>
    Re: What does `my' do?! derykus@gmail.com
    Re: What does `my' do?! <ben@morrow.me.uk>
    Re: What does `my' do?! <xhoster@gmail.com>
    Re: What's wrong with the following regular expression? <haoniukun@gmail.com>
    Re: What's wrong with the following regular expression? <jurgenex@hotmail.com>
    Re: What's wrong with the following regular expression? <usenet@larseighner.com>
    Re: What's wrong with the following regular expression? <tadmc@seesig.invalid>
    Re: What's wrong with the following regular expression? <ben@morrow.me.uk>
    Re: What's wrong with the following regular expression? <jurgenex@hotmail.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Sat, 18 Apr 2009 13:30:26 -0700 (PDT)
From: tuser1@gmail.com
Subject: dmake.exe:  Error: -- 'C:\Perl\libConfig.pm' not found, and can't be  made
Message-Id: <b9e0fdd8-ec28-40af-92af-9351b18ce2b8@w9g2000yqa.googlegroups.com>

I am trying to build Scalar::Util::Refcount from CPAN on Windows XP
under Activestate 5.10

C:\>perl -v
This is perl, v5.10.0 built for MSWin32-x86-multi-thread
(with 5 registered patches, see perl -V for more detail)

Copyright 1987-2007, Larry Wall

Binary build 1004 [287188] provided by ActiveState http://www.ActiveState.com
Built Sep  3 2008 13:16:37
===========================

I have downloaded and extracted Scalar::Util::Refcount from CPAN and
successfully installed MinGW.ppm from http://ppm4.activestate.com/MSWin32-x86/5.10/1004/,

but when I run dmake to build it, it fails:

C:\Scalar-Util-Refcount-1.0.2>dmake
dmake.exe:  Error: -- `C:\Perl\libConfig.pm' not found, and can't be
made


------------------------------

Date: Sat, 18 Apr 2009 11:29:54 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: effienct way to select random value from the hash
Message-Id: <uq6ku4h63d58q4hqrm4mvgiomj08s5hr7c@4ax.com>

Eric Pozharski <whynot@pozharski.name> wrote:
>On 2009-04-16, Jürgen Exner <jurgenex@hotmail.com> wrote:
>*SKIP*
>> I don't know for sure, but treating $hashref as a reference to an array
>> might do the trick already.
>
>No, it doesn't
>
>	{2565:11} [0:0]$ perl -Mstrict -wle 'my $x= { a => "b" }; print scalar @$x'
>	Not an ARRAY reference at -e line 1.
>
>However,
>
>	{2565:11} [0:0]$ perl -Mstrict -wle 'my $x= { a => "b" }; print scalar(() = %$x)'
>	2

Thanks!

>And one more point of concern: serializing a hash into an array is subject
>to Perl's own key reordering.  Maybe just do it once?

Hmmm, when would perl reorder the keys? I would assume only when the
hash itself changes, i.e. when elements are added/removed. And in that
case probably you want to restart the random sequence anyway.

But the key question still remains unanswered: is
serialization/flattening of the hash faster than a keys()?

jue
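
The "do it once" suggestion above can be sketched roughly as follows. This
is only an illustration, not code from the thread: make_random_picker and
the %colour data are made-up names. The sketch copies the keys into an array
a single time and then indexes that array with rand, assuming the hash is
not modified while the picker is in use. It does not settle the open
question of whether flattening is faster than keys().

```perl
use strict;
use warnings;

# Hypothetical helper: build the key list once and reuse it for every pick.
# The list is only valid as long as no keys are added to or removed from
# the hash.
sub make_random_picker {
    my ($hashref) = @_;
    my @keys = keys %$hashref;
    return sub {
        return unless @keys;    # empty hash: nothing to pick
        return $hashref->{ $keys[ rand @keys ] };
    };
}

my %colour = ( apple => 'red', banana => 'yellow', plum => 'purple' );
my $pick   = make_random_picker( \%colour );

print $pick->(), "\n" for 1 .. 3;
```

If the hash changes, the simplest option is probably to call
make_random_picker again and start over with a fresh key list.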


------------------------------

Date: Sat, 18 Apr 2009 15:21:52 -0500
From: Tad J McClellan <tadmc@seesig.invalid>
Subject: Re: generic variable name possible?
Message-Id: <slrngukdj0.7dq.tadmc@tadmc30.sbcglobal.net>

ela <ela@yantai.org> wrote:
>
> While I succeeded in producing generic file names, I found that I cannot 
> name the file pointer 


Using the proper terminology can go a long way toward finding a 
solution, because you have a useful search term to use.

It is not a "file pointer", it is a "filehandle".


> ($PROBLEMFP in the following code) generically. I
> have to keep an unknown number of file pointers (depending on the number of
> lines in modellist, not known beforehand) open and therefore I have to
> create "n" $PROBLEMFP's. Is it possible to achieve this in Perl?


    perldoc -q filehandle

        How can I make a filehandle local to a subroutine?  How 
        do I pass filehandles between subroutines?  How do I 
        make an array of filehandles?
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^


> #!/usr/bin/perl
> my ( $iteration, $modellist ) = @ARGV;


You should enable warnings and strictures.

    use warnings;
    use strict;


> open( my $FPM, '<', "$modellist") or die "could not open '$modellist' $!";
                      ^          ^
                      ^          ^

    perldoc -q vars

        What's wrong with always quoting "$vars"?


> $lineM = <$FPM>;

    chomp $lineM;

> @models = split(/\t/, $lineM);
> foreach (@models) {


You do not need the @models temporary variable:

    foreach ( split(/\t/, $lineM) ) {
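
A minimal sketch of the "array of filehandles" idea from the FAQ entry
above, adapted here to a hash keyed by model name. The sub open_handles, the
model names and the scratch directory are assumptions invented for this
example; they are not part of the original script.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: open one lexical filehandle per name and keep them
# all in a hash, so no handle needs a variable name of its own.
sub open_handles {
    my ( $dir, @names ) = @_;
    my %fh_for;
    for my $name (@names) {
        open( $fh_for{$name}, '>', "$dir/$name.txt" )
            or die "could not open '$dir/$name.txt': $!";
    }
    return \%fh_for;
}

my $dir  = tempdir( CLEANUP => 1 );        # scratch directory for the demo
my $line = "modelA\tmodelB\tmodelC\n";     # stands in for a line of modellist
chomp $line;

my $fh_for = open_handles( $dir, split /\t/, $line );

# A filehandle stored in a data structure needs the block form of print.
print { $fh_for->{$_} } "output for $_\n" for sort keys %$fh_for;
close $fh_for->{$_} or die "close failed for $_: $!" for keys %$fh_for;
```

An array works the same way (push the handles instead of keying them); the
hash just makes it easier to find the handle that belongs to a given model.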


-- 
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"


------------------------------

Date: Sat, 18 Apr 2009 15:17:53 -0300
From: "Guy" <someone@somewhere.nb.ca>
Subject: Re: I'm looking for a Perl Book... I think.
Message-Id: <49ea194c$0$5499$9a566e8b@news.aliant.net>

> "Learning Perl" is the 1st in a series of 3 Perl tutorial books. It
> does not cover OO programming.
>
> "Intermediate Perl" is the 2nd, and it does cover OO.
>
> "Mastering Perl" is the 3rd in the series.

OK, I went on O'Reilly and ordered the three following books, buy 2 get 3.
- Intermediate Perl, 1Ed
- Mastering Perl, 1Ed
- CGI Programming with Perl, 2Ed

I hope to get them soon and hope they won't be bad choices.
Cheers,
Guy 




------------------------------

Date: Sat, 18 Apr 2009 11:24:19 -0700
From: Nathan Keel <nat.k@gm.ml>
Subject: Re: I'm looking for a Perl Book... I think.
Message-Id: <nVoGl.55510$_R4.9194@newsfe11.iad>

Guy wrote:

>> "Learning Perl" is the 1st in a series of 3 Perl tutorial books. It
>> does not cover OO programming.
>>
>> "Intermediate Perl" is the 2nd, and it does cover OO.
>>
>> "Mastering Perl" is the 3rd in the series.
> 
> OK, I went on O'Reilly and ordered the three following books, buy 2
> get 3. - Intermediate Perl, 1Ed
> - Mastering Perl, 1Ed
> - CGI Programming with Perl, 2Ed
> 
> I hope to get them soon and hope they won't be bad choices.
> Cheers,
> Guy

You might want to add Learning Perl to that list.  Unless you know a
reasonable amount of Perl now, you're probably going to have some
trouble jumping into Intermediate Perl.  There's not a whole lot to
using Perl for CGI (including the CGI module, which you don't have to
use if you don't want to), so it's best to ensure you know Perl first; then
things like the CGI module and writing CGI scripts will be
incredibly easy, and you'll understand more about why you do
certain things.


------------------------------

Date: Sat, 18 Apr 2009 11:25:20 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: I'm looking for a Perl Book... I think.
Message-Id: <gm6ku4pipogfnir4bignla65hq7e41lgej@4ax.com>

"Guy" <someone@somewhere.nb.ca> wrote:
>> Have you tried reading the documentation for CGI.pm that comes with 
>> CGI.pm?
>>
>> That was not a rhetorical question either.
>
>I bought some web space and asked that I could use Perl. I was just given a 
>username and a password and a domain name.  I didn't install the webserver.

And this comment relates to Tad's question how?

jue


------------------------------

Date: Sat, 18 Apr 2009 11:29:03 -0700
From: Nathan Keel <nat.k@gm.ml>
Subject: Re: I'm looking for a Perl Book... I think.
Message-Id: <PZoGl.55512$_R4.55179@newsfe11.iad>

Jürgen Exner wrote:

> "Guy" <someone@somewhere.nb.ca> wrote:
>>> Have you tried reading the documentation for CGI.pm that comes with
>>> CGI.pm?
>>>
>>> That was not a rhetorical question either.
>>
>>I bought some web space and asked that I could use Perl. I was just
>>given a
>>username and a password and a domain name.  I didn't install the
>>webserver.
> 
> And this comment relates to Tad's question how?
> 
> jue

In that the OP clearly is new to this, so people should understand that
when responding, else they'll just get confused by all of the arrogant
and sarcastic remarks about the OP's response.  Give them a break,
they're new to it.  They don't seem to be making an effort to be off
topic or "bother anyone" by wasting their time.


------------------------------

Date: Sat, 18 Apr 2009 11:48:40 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: I'm looking for a Perl Book... I think.
Message-Id: <sj7ku4h2118glgl53mg9kocf4cu5fudka3@4ax.com>

Nathan Keel <nat.k@gm.ml> wrote:
>Jürgen Exner wrote:
>> "Guy" <someone@somewhere.nb.ca> wrote:
>>>> Have you tried reading the documentation for CGI.pm that comes with
>>>> CGI.pm?
>>>>
>>>> That was not a rhetorical question either.
>>>
>>>I bought some web space and asked that I could use Perl. I was just
>>>given a
>>>username and a password and a domain name.  I didn't install the
>>>webserver.
>> 
>> And this comment relates to Tad's question how?
>
>In that the OP clearly is new to this, so people should understand that
>when responding, else they'll just get confused by all of the arrogant
>and sarcastic remarks about the OP's response.  Give them a break,
>they're new to it.  They don't seem to be making an effort to be off
>topic or "bother anyone" by wasting their time.

Well, ok, fine. In that case to the OP:

You don't need a web space (nor username, password,  or domain name for
such) to access the documentation of CGI.pm or any Perl module for that
matter.  
Just install it locally and then a simple 'perldoc CGI' will bring up
the documentation. You want to make sure that you are using the same
version of CGI.pm locally as on your web server, otherwise the
documentation might not match what's on the server and also your code
may behave differently on the server than when you tested it on your
local installation.

If you just want a sneak preview of the documentation of a module
without installing it, then go to CPAN and search for that module. All
the documentation is there, too.

jue


------------------------------

Date: Sat, 18 Apr 2009 13:09:24 -0700
From: Nathan Keel <nat.k@gm.ml>
Subject: Re: I'm looking for a Perl Book... I think.
Message-Id: <VrqGl.62558$qO1.22614@newsfe13.iad>

Jürgen Exner wrote:

> Well, ok, fine. In that case to the OP:

I appreciate that effort and I'm sure the OP does even more.  Cheers!


------------------------------

Date: Sat, 18 Apr 2009 17:23:50 -0300
From: "Guy" <someone@somewhere.nb.ca>
Subject: Re: I'm looking for a Perl Book... I think.
Message-Id: <49ea36d1$0$5503$9a566e8b@news.aliant.net>

> You don't need a web space (nor username, password,  or domain name for
> such) to access the documentation of CGI.pm or any Perl module for that
> matter.
> Just install it locally and then a simple 'perldoc CGI' will bring up
> the documentation. You want to make sure that you are using the same
> version of CGI.pm locally as on your web server, otherwise the
> documentation might not match what's on the server and also your code
> may behave differently on the server than when you tested it on your
> local installation.


I'm obviously in the dark ages here.  I hope I don't upset anyone when I say 
that I don't have a local server here at home. I was just planning to write 
a relatively short script and testing it online, and then link to it when 
it's working. I do appreciate all the info coming in.
Thanks again to all.
Guy 




------------------------------

Date: Sat, 18 Apr 2009 21:42:15 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: I'm looking for a Perl Book... I think.
Message-Id: <7iqpb6-vc7.ln1@osiris.mauzo.dyndns.org>


Quoth "Guy" <someone@somewhere.nb.ca>:
> > You don't need a web space (nor username, password,  or domain name for
> > such) to access the documentation of CGI.pm or any Perl module for that
> > matter.
> > Just install it locally and then a simple 'perldoc CGI' will bring up
> > the documentation. You want to make sure that you are using the same
> > version of CGI.pm locally as on your web server, otherwise the
> > documentation might not match what's on the server and also your code
> > may behave differently on the server than when you tested it on your
> > local installation.
> 
> 
> I'm obviously in the dark ages here.  I hope I don't upset anyone when I say 
> that I don't have a local server here at home. I was just planning to write 
> a relatively short script and testing it online, and then link to it when 
> it's working. I do appreciate all the info coming in.

You don't need a 'server'. You just need a computer: the one you're
posting from will probably do fine. If it's running Win32, you can get
perl from strawberryperl.com; if it's a Mac, you've almost certainly got
perl installed already. If it's running something else, let us know and
we'll see if anyone knows how to get perl installed on it.

While installing a local webserver can be useful for testing websites,
CGI.pm has a useful 'testing' mode when a program using it is run from
the command line. See the docs, when you've got them.

Note that you can also find the docs from http://search.cpan.org, which
will let you see the docs for old versions as well so you can use the
docs from the version your ISP has installed.

Ben



------------------------------

Date: Sat, 18 Apr 2009 13:51:48 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: I'm looking for a Perl Book... I think.
Message-Id: <hteku4tce9hivh139vm6fbftvrro5pp48h@4ax.com>

"Guy" <someone@somewhere.nb.ca> wrote:
>I'm obviously in the dark ages here.  I hope I don't upset anyone when I say 
>that I don't have a local server here at home. 

You do have a computer though, don't you? Unless you have an _extremely_
exotic OS you can install Perl and all its documentation including
CGI.pm on that computer. 

>I was just planning to write 
>a relatively short script and testing it online, and then link to it when 
>it's working. 

Well, that's a very cumbersome workflow, in particular because the site
will be unusable while you are developing and testing the scripts.
Typically that is a big no-go for any commercial application.
Standard procedure is to develop and test in a private test environment
before publishing to the official web server.

And there are any number of web servers that can be installed locally
and allow you to develop and test locally before deploying.

jue


------------------------------

Date: Sat, 18 Apr 2009 18:02:29 GMT
From: Ilya Zakharevich <nospam-abuse@ilyaz.org>
Subject: Re: What does `my' do?!
Message-Id: <slrnguk5q0.6ug.nospam-abuse@chorin.math.berkeley.edu>

On 2009-04-17, sln@netherlands.com <sln@netherlands.com> wrote:
>>I still manage to run OS/2 on all boxes in the house...

> WOOS2 - Windows on OS/2

No, I never got to installing and patching Windows3 [so that it would
become a subsystem of OS/2].  (And, IIRC, this would be called WinOS2.)

Yours,
Ilya


------------------------------

Date: Sat, 18 Apr 2009 11:27:40 -0700
From: Nathan Keel <nat.k@gm.ml>
Subject: Re: What does `my' do?!
Message-Id: <xYoGl.55511$_R4.9099@newsfe11.iad>

Ilya Zakharevich wrote:

> Another sigh...  *If* the things were this simple...
> 
> perl -wle "my $x = 12, $x = 13; print $x"
> Name "main::x" used only once: possible typo at -e line 1.
> 12

Things are *nearly* that simple.  While your points are valid in that
there's more to it, who in the world would write code like that?  You
write weird or bad code, you get a weird or bad result.  I get the
topic of interest and that you wouldn't write code like that, but I
wouldn't approach the topic like you would.  All of the results are
predictable here, I don't see the problem.


------------------------------

Date: Sat, 18 Apr 2009 13:17:15 -0700 (PDT)
From: derykus@gmail.com
Subject: Re: What does `my' do?!
Message-Id: <e65bdf37-8c65-4779-9737-bd8c2a6967f0@d7g2000prl.googlegroups.com>

On Apr 18, 10:06 am, Ilya Zakharevich <nospam-ab...@ilyaz.org> wrote:
> On 2009-04-17, Ben Morrow <b...@morrow.me.uk> wrote:
> ...
> >> Your code example represents a closure wherein the value of a lexical
>
> Wrong - already discussed.
>
> >> variable declared outside of its scope gets captured. As usual, $x is
> >> cleared at the end of the scope of $ARGV[0]
>
> As my example shows, its value is 12.  So it is not "cleared".
>

Just to be clear in this rather complicated thread, Ilya is responding
to Ferry B's post rather than Ben's.



------------------------------

Date: Sat, 18 Apr 2009 21:32:07 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: What does `my' do?!
Message-Id: <7vppb6-vc7.ln1@osiris.mauzo.dyndns.org>


Quoth Ilya Zakharevich <nospam-abuse@ilyaz.org>:
> 
> Another sigh...  *If* the things were this simple...
> 
>   perl -wle "my $x = 12, $x = 13; print $x"
>   Name "main::x" used only once: possible typo at -e line 1.
>   12
> 
> So `my $x' has a runtime effect too: it puts the "new" $x on
> stack - as different from plain `$x', which puts the "current" $x on
> stack.  And to muddy things yet more, the switch of the "current" $x
> to the new value happens at time of `;' - which I consider a very
> brain-damaged decision...

Even weirder:

    ~% perl -le'$x = ($x = 5) + 1; print $x'
    6
    ~% perl -le'my $x = (my $x = 5) + 1; print $x'
    5
    ~%

:)

Ben
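
A small sketch of the behaviour discussed above, written as a script rather
than a one-liner. The sub name demo is hypothetical, and the `our $x`
declaration is an addition so that the code runs under strict. The point it
tries to show: inside the statement that declares it, the new lexical is not
yet visible, so the second `$x` still names the package variable.

```perl
use strict;
use warnings;

our $x = 'package';    # added so the second $x below is legal under strict

sub demo {
    # "my $x" only becomes the current $x once this statement has ended,
    # so "$x = 13" assigns to the package variable $main::x.
    my $x = 12, $x = 13;
    return ( $x, $main::x );
}

my ( $lexical, $package ) = demo();
print "lexical=$lexical package=$package\n";    # lexical=12 package=13
```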



------------------------------

Date: Sat, 18 Apr 2009 14:31:14 -0700
From: Xho Jingleheimerschmidt <xhoster@gmail.com>
Subject: Re: What does `my' do?!
Message-Id: <49ea4754$0$4983$ed362ca5@nr5-q3a.newsreader.com>

Ben Morrow wrote:
> Quoth Xho Jingleheimerschmidt <xhoster@gmail.com>:
>> Ben Morrow wrote:
>>> No. People seem to keep making this mistake. Named subs are *not*
>>> closures in Perl 5, they simply keep a ref to all the variables they
>>> reference.
>> Isn't that what a closure is?
> 
> Not quite. Named subs are compiled once at compile time, and only keep a
> ref to the outer lexicals as they existed then. This means that code
> like
> 
>     use warnings;
> 
>     sub mkfoo {
>         my ($x) = @_;
> 
>         sub foo { $x }
> 
>         return \&foo;
>     }

You could change the named sub def to:

           eval q{sub foo {$x}};

>     say mkfoo($_)->() for 1..2;
> 
> gives
> 
>     Variable "$x" will not stay shared at clos line 6.
>     1
>     1
> 
> whereas if you use an anon sub you get a fresh clone of the sub with
> refs to the *current* set of outer lexicals, so code like
> 
>     sub mkfoo {
>         my ($x) = @_;
>         return sub { $x };
>     }

You could put the anonymous sub def in a BEGIN block, to prevent it from
being compiled.  It can't be done as elegantly as the string eval
above, because you need somewhere to stash the value, but it can be done.
> 
>     say mkfoo($_)->() for 1..2;
> 
> gives
> 
>     1
>     2
> 
> as expected. This is obviously important when using closures as
> callbacks and such.

Sure, anonymous subs are more useful in such situations, but I think 
that what triggers recompilation is an accident of the language, and not 
the essence of a closure.  I think the essence of a closure is what it 
does with lexical variables at the time it is compiled.  So I think a 
named sub is still a closure, it is just one that has an implicit BEGIN 
block around it, and so only compiles once unless you take steps to 
change that.

Xho
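
For readers who want to try the two cases from this exchange side by side,
here is a minimal sketch. The names mk_named, named_inner and mk_anon are
made up for the example, and the `no warnings 'closure'` line is an addition
that should silence the "will not stay shared" warning quoted above.

```perl
use strict;
use warnings;

# Named inner sub: compiled once, so it keeps a reference to the $x that
# belonged to the first call of mk_named.
sub mk_named {
    my ($x) = @_;
    no warnings 'closure';    # the named-sub case normally warns here
    sub named_inner { $x }
    return \&named_inner;
}

# Anonymous sub: each call to mk_anon returns a fresh clone bound to the
# current $x.
sub mk_anon {
    my ($x) = @_;
    return sub { $x };
}

my @named = map { mk_named($_)->() } 1 .. 2;
my @anon  = map { mk_anon($_)->() } 1 .. 2;

print "named: @named\n";    # named: 1 1
print "anon:  @anon\n";     # anon:  1 2
```

Whether the named case counts as a closure is exactly the terminology
question being argued in this thread; the sketch only shows the observable
difference.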


------------------------------

Date: Sat, 18 Apr 2009 11:10:40 -0700 (PDT)
From: kun niu <haoniukun@gmail.com>
Subject: Re: What's wrong with the following regular expression?
Message-Id: <671aab8b-f4df-4502-8d64-8d8b3b509563@c18g2000prh.googlegroups.com>

On April 19, 1:10 am, Lars Eighner <use...@larseighner.com> wrote:
> In our last episode,
> <45beda24-7f78-4891-8119-e879c505d...@z23g2000prd.googlegroups.com>, the
> lovely and talented kun niu broadcast on comp.lang.perl.misc:
>
>
>
> > On April 19, 12:01, kun niu <haoniu...@gmail.com> wrote:
> >> Dear all,
>
> >> I'm trying to help to extract email from company's website.
> >> Here's part of my test script.
>
> >> $content = "<a href=\"mailto:t...@google.com\"><a class=\"hello\" href=
> >> \"mailto:t...@google.com?title=hello\">";
> >> @emails = ($content =~ /<a.*href="mailto:(.*)>"/cgim);
> >> foreach my $email (@emails)
> >> {
> >>     print "email:" .  $email . "\n";}
>
> >> But to my surprise, no result is printed.
> >> I'm working on Debian squeeze.
> >> My perl version is 5.10.0.
> >> Would anyone here please help me out?
> >> Thanks for any hints or advice in advance.
> > I'm sorry, the script should be the following one:
> > $content = "<a href=\"mailto:test\@google.com\"><a class=\"hello\"
> > href=\"mailto:test\@google.com?title=hello\">";
> > @emails = ($content =~ /<a.*href="mailto:(.*)>"/cgim);
> > foreach my $email (@emails)
> > {
> >     print "email:" .  $email . "\n";
> > }
>
> First, parsing html with regular expressions is an extremely bad idea.
>
> Second, regular expressions are greedy, so this provides at most one match.
>
> Third, you have set up () to capture something.  What did you do with it?
> Answer: you did nothing with it.
>
> Fourth, ($content =~ /<a.*href="mailto:(.*)>"/cgim) --- why do you think this
> could ever return anything besides TRUE or FALSE?
>
> --
>         Lars Eighner <http://larseighner.com/> use...@larseighner.com
>             88 days since Rick Warren prayed over Bush's third term.
>    Obama: No hope, no change, more of the same. Yes, he can, but no, he won't.

Really appreciate your attention.
Here's part of my code that's been modified:
$content = "<a class=\"hello\" href=\"http://www.vapee.com/new\">
\nvapee</a>\n<a href=\"http://www.google.com\">google</a><a href=
\"mailto:niu\@google.com\">thank you</a>";

@matches = ($content =~ /<a.*?href="(.*?)">/gim);

foreach my $value (@matches)
{
    print $value . "\n";
    if($value =~ /^mailto:/)
    {
        print "i'm an email $value.\n";
    }
}
In case you have a perl interpreter, you can try it.
It gives the right value at least for me.:)
And do you have any other recommendation for html parsing except
regex?
Would html::parser be faster?
Thanks again for your reply.


------------------------------

Date: Sat, 18 Apr 2009 11:35:25 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: What's wrong with the following regular expression?
Message-Id: <137ku49crk0j8jriut53m4vuc3g1l86cdf@4ax.com>

kun niu <haoniukun@gmail.com> wrote:
>And do you have any other recommendation for html parsing except
>regex?
>Would html::parser be faster?

No, but it would do it correctly.

This has been discussed many, many times over. And there is only this
'sln' guy, who still believes a non-regular language (HTML) could be
parsed by regular expressions. Of course he still owes the proof that
Chomsky was wrong.

jue


------------------------------

Date: Sat, 18 Apr 2009 19:34:03 +0000 (UTC)
From: Lars Eighner <usenet@larseighner.com>
Subject: Re: What's wrong with the following regular expression?
Message-Id: <slrngukal9.2bon.usenet@debranded.larseighner.com>

In our last episode,
<671aab8b-f4df-4502-8d64-8d8b3b509563@c18g2000prh.googlegroups.com>, the
lovely and talented kun niu broadcast on comp.lang.perl.misc:

> In case you have a perl interpreter, you can try it.
> It gives the right value at least for me.:)
> And do you have any other recommendation for html parsing except
> regex?
> Would html::parser be faster?

No, but it would be robust.  You are scraping e-mail addresses.  It is
difficult to imagine a morally acceptable reason for doing this, but more to
the point, you don't have control of the original documents.  You don't know
that the documents were valid to begin with, you don't know what variations
may be in the scraped A tags, and so forth.  Without thinking about it, I
can see three ways your regex can fail with real-world data.  You could
patch those up, and then you could deal with the ways it fails that I don't
see immediately.  Eventually the code is not readable, not maintainable, and
still not right.

Yes, yes, yes.  I have used regex to write a quick and dirty one-liner to
make a small change in a stack of documents that I had created and that I
knew no one had mucked with.  It's still nuts to use regex for html in a
production script.

> Thanks again for your reply.

-- 
        Lars Eighner <http://larseighner.com/> usenet@larseighner.com
            88 days since Rick Warren prayed over Bush's third term.
   Obama: No hope, no change, more of the same. Yes, he can, but no, he won't.


------------------------------

Date: Sat, 18 Apr 2009 15:09:35 -0500
From: Tad J McClellan <tadmc@seesig.invalid>
Subject: Re: What's wrong with the following regular expression?
Message-Id: <slrngukcrv.7dq.tadmc@tadmc30.sbcglobal.net>

kun niu <haoniukun@gmail.com> wrote:


> $content = "<a href=\"mailto:test@google.com\"><a class=\"hello\" href=
                                   ^^^^^^^
> \"mailto:test@google.com?title=hello\">";


You should always enable warnings when developing Perl code.


> @emails = ($content =~ /<a.*href="mailto:(.*)>"/cgim);
                                               ^^
                                               ^^ these are transposed...
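
Putting the two points above together, a corrected version might look like
the sketch below. It is an illustration only: extract_mailto is a made-up
name, the string is single-quoted so that @google is not interpolated, and
the `.*` parts of the original pattern are swapped for negated character
classes so that each tag yields its own match. For real-world pages the
HTML::Parser advice elsewhere in this thread still applies.

```perl
use strict;
use warnings;

# Single quotes: no interpolation, so "@google" stays literal text.
my $content = '<a href="mailto:test@google.com">'
            . '<a class="hello" href="mailto:test@google.com?title=hello">';

# Hypothetical helper: return the address part of every mailto: href.
# [^>]* keeps the match inside one tag; [^"?]* stops at the closing quote
# or at the start of a query string.
sub extract_mailto {
    my ($html) = @_;
    my @found = $html =~ /<a\b[^>]*\bhref="mailto:([^"?]*)/gi;
    return @found;
}

print "email: $_\n" for extract_mailto($content);
```

In list context with /g the match returns all captured groups, which is why
the result can be assigned straight to an array.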

-- 
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"


------------------------------

Date: Sat, 18 Apr 2009 21:36:29 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: What's wrong with the following regular expression?
Message-Id: <d7qpb6-vc7.ln1@osiris.mauzo.dyndns.org>


Quoth Jürgen Exner <jurgenex@hotmail.com>:
> kun niu <haoniukun@gmail.com> wrote:
> >And do you have any other recommendation for html parsing except
> >regex?
> >Would html::parser be faster?
> 
> No, but it would do it correctly.
> 
> This has been discussed many, many times over. And there is only this
> 'sln' guy, who still believes a non-regular language (HTML) could be
> parsed by regular expressions. Of course he still owes the proof that
> Chomsky was wrong.

Chomsky has nothing to do with it. Perl's regexes are far from regular,
especially with the new recursion features in 5.10. It may or may not be
possible to correctly parse HTML with a Perl regex; the important points
are that such a regex would likely be incomprehensible, and that
HTML::Parser is already written and works extremely well.

Ben



------------------------------

Date: Sat, 18 Apr 2009 14:02:24 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: What's wrong with the following regular expression?
Message-Id: <ijfku49dscj9b7cgtlqqer8c22lqf98thm@4ax.com>

Ben Morrow <ben@morrow.me.uk> wrote:
>
>Quoth Jürgen Exner <jurgenex@hotmail.com>:
>> kun niu <haoniukun@gmail.com> wrote:
>> >And do you have any other recommendation for html parsing except
>> >regex?
>> >Would html::parser be faster?
>> 
>> No, but it would do it correctly.
>> 
>> This has been discussed many, many times over. And there is only this
>> 'sln' guy, who still believes a non-regular language (HTML) could be
>> parsed by regular expressions. Of course he still owes the proof that
>> Chomsky was wrong.
>
>Chomsky has nothing to do with it. Perl's regexes are far from regular,

Agreed. However we had this discussion before and that sln guy actually
does believe non-regular languages can be parsed with regular
automatons. And I'm sure he will chime in again, because it's his
favourite subject.

>especially with the new recursion features in 5.10. It may or may not be
>possible to correctly parse HTML with a Perl regex; 

Agreed. As I have mentioned many, many times before. I just tried to
keep it short and simple this time.

>the important points
>are that such a regex would likely be incomprehensible, and that
>HTML::Parser is already written and works extremely well.

200% agreed.

jue


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 2351
***************************************

