[28285] in Perl-Users-Digest
Perl-Users Digest, Issue: 9649 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat Aug 26 14:05:57 2006
Date: Sat, 26 Aug 2006 11:05:07 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Sat, 26 Aug 2006 Volume: 10 Number: 9649
Today's topics:
Re: Improved with a hatchet? <benmorrow@tiscali.co.uk>
Re: Match regular expression from LEFT to right <fritz-bayer@web.de>
Re: Match regular expression from LEFT to right <fritz-bayer@web.de>
Re: Match regular expression from LEFT to right <tadmc@augustmail.com>
Re: Match regular expression from LEFT to right anno4000@radom.zrz.tu-berlin.de
Re: page reload question <mgarrish@gmail.com>
Perl's GUI <zhushenli@gmail.com>
Re: Perl's GUI <sigzero@gmail.com>
Re: Perl's GUI <zentara@highstream.net>
Re: Question about UNIVERSAL <benmorrow@tiscali.co.uk>
Re: regular expression variables under debugger <rvtol+news@isolution.nl>
Re: regular expression variables under debugger <1usa@llenroc.ude.invalid>
Re: regular expression variables under debugger <tadmc@augustmail.com>
Re: regular expression variables under debugger <wlcna@nospam.com>
Re: regular expression variables under debugger <hjp-usenet2@hjp.at>
Re: warnings (was Re: Most useful standard module?) <not-for-replies@zombie.org.uk>
Re: warnings (was Re: Most useful standard module?) (Chris Richmond - MD6-FDC ~)
Re: warnings (was Re: Most useful standard module?) <benmorrow@tiscali.co.uk>
Re: warnings <mritty@gmail.com>
Re: warnings (Chris Richmond - MD6-FDC ~)
Re: Win32::GUI and Scrolling Text <zentara@highstream.net>
Re: Win32::GUI and Scrolling Text <zentara@highstream.net>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Fri, 25 Aug 2006 23:09:45 +0100
From: Ben Morrow <benmorrow@tiscali.co.uk>
Subject: Re: Improved with a hatchet?
Message-Id: <9294s3-ll7.ln1@osiris.mauzo.dyndns.org>
Quoth xhoster@gmail.com:
> xx087+news@ncf.ca wrote:
> > At 2006-08-25 11:38AM, "Brian McCauley" wrote:
> > > Glenn Jackman wrote:
> > >
> > > > return splice(@list, $index+1), @list
> > >
> > > Whilst that's likely to be fast and efficient and give the right
> > > result it still gives me an uneasy feeling that its behaviour is,
> > > strictly speaking, undefined and could, at least in principle, change
> > > in a future release.
> >
> > How is it undefined? The splice docs say:
[snip]
>
> Is the second occurrence of @list in the return statement evaluated before
> or after the splice is done to it? In other words, is the comma between
> splice(...),@list a sequence point or not?
As perl is currently implemented, it is. The scalar and list comma
operators are actually the same underneath (a generic
evaluate-and-build-a-list op), and the main use of the comma op in
scalar context is as a sequence point. Now, there's obviously no
official spec for Perl that I can wave at you :), but I seriously doubt
p5p would change something like that without a major amount of fuss.
All of this only goes for Perl5, of course. Perl6 may well change
things, but, well, that's kinda the point :).
Ben
--
The cosmos, at best, is like a rubbish heap scattered at random.
Heraclitus
benmorrow@tiscali.co.uk
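
The construct under discussion can be tried directly. The following is a minimal sketch; the sub names and the sample list are made up for illustration. `rotate_terse` relies on the left-to-right evaluation described above for current perl 5, while `rotate_explicit` splits the work into two statements so that nothing depends on evaluation order.

```perl
use strict;
use warnings;

# Terse form from the thread: relies on splice() running before the
# second @list is evaluated (the behaviour described for current perl 5).
sub rotate_terse {
    my ($index, @list) = @_;
    return splice(@list, $index + 1), @list;
}

# Explicit form: two statements, so the order is unambiguous.
sub rotate_explicit {
    my ($index, @list) = @_;
    my @tail = splice(@list, $index + 1);   # removes and returns the tail
    return (@tail, @list);                  # @list now holds only the head
}

my @terse    = rotate_terse(1, qw(a b c d e));
my @explicit = rotate_explicit(1, qw(a b c d e));
print "terse:    @terse\n";
print "explicit: @explicit\n";   # c d e a b
```

If the uneasy feeling about evaluation order matters for a given code base, the explicit form costs one extra line and removes the question.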
------------------------------
Date: 26 Aug 2006 00:50:22 -0700
From: "fritz-bayer@web.de" <fritz-bayer@web.de>
Subject: Re: Match regular expression from LEFT to right
Message-Id: <1156578622.659821.197320@p79g2000cwp.googlegroups.com>
robic0 wrote:
> On 25 Aug 2006 06:49:33 -0700, "fritz-bayer@web.de" <fritz-bayer@web.de> wrote:
>
> >Hi,
> >
> >lets say I have the following string:
> >
> ><tr> dfsdfre <tr>fsdsfd35gd <tr>khf758 <tr>afdga654jhuotj <input
> >type="text"> 67kfbs356</tr>sh tu65 </tr> hbrubs</tr>
> >
> >and I would like to capture the text before the <input...> until the
> >first <tr> and the text after until the first <tr> so that I get
> >
> ><tr>afdga654jhuotj <input type="text> 67kfbs356</tr>
> >
> >How would I do this?
> >
> >=~ m!<tr>.*?<input type="text">.*?<tr>!
> >
> >will only work capturing the first <tr> after the <input..>. The
> >problem is that I have to find a expression, which starts looking from
> >the right to the left of <input...>.
> >
> >Fritz
>
> Finding text from the phrase back to the keyword is indeed hard since
> searches proceed left to right in general.
That is actually getting to the heart of my question. If somebody could
tell me how to do this, then my problem would be solved.
I have looked in the O'Reilly regular expressions book but could not find a
topic on it, though I only skimmed through the chapters.
I also had a feeling that lookaheads could be helpful, because they are
just used to mark a position. But I guess this still leaves me with
defining where that position is, so I figured they aren't the answer to
my question.
> Html/Xml is indeed easier to parse because of its mark-up, and indeed
> one of the hardest things to do correctly.
>
Actually the real example consists of HTML. So there are all kinds of
different tags and they of course can vary. I want to grab a group of
radio buttons, which are contained inside a table. But I only want to
grab the first and last row of the table and everything within.
The back of the expression is easy, but searching from the first radio
button to the left is difficult, because up to this point the document
contains all kinds of tags, words and so on, so that you always catch
something in the front. That's why I would like to look from the right
to the left. Then I could ignore all the noise before it.
> Some other alternatives:
>
> - Method 1 is a negative character class with one character '<'.
> <> are very powerful delimiters.
This would fail of course, because it will capture many other tags in
front of my radio button group.
>
> - Method 2 is an alternative to a negative assertion construct (?!...)
> i believe was mentioned by another poster. I believe the method below to
> be a close proximity to negative assertions.
> I'm not at all comfortable with negative assertions, however, logically it is the only way.
>
> I made all the tags start tags, and narrowed down the regex to the range
> of interest, the start/end text;
>
This could work, if I let it run through until the very end. However,
I'm not sure; I have to try.
>
> use strict;
> use warnings;
>
> my $string =
> '<tr> dfsdfre <tr>fsdsfd35gd <tr>khf758 <tr>afdga654jhuotj <input type="text"> 67kfbs356<tr>sh tu65 <tr> hbrubs<tr>';
>
> # -- method 1 --
>
> my ($capt) = $string =~ m!(<tr>[^<]*<input type="text">)!;
> print "found: $capt\n";
>
>
> # -- method 2 --
>
> while ($string =~ /<tr>(.*?)(?:(<tr>)|<input type="text">)/g)
> # 1 1( 2 2| )
> {
> if (defined $2)
> {
> pos($string) = pos($string) - 4;
> next;
> }
> print "found: $1\n";
> }
>
> __END__
>
> found: <tr>afdga654jhuotj <input type="text">
> found: afdga654jhuotj
------------------------------
Date: 26 Aug 2006 00:55:38 -0700
From: "fritz-bayer@web.de" <fritz-bayer@web.de>
Subject: Re: Match regular expression from LEFT to right
Message-Id: <1156578938.745760.135980@i3g2000cwc.googlegroups.com>
anno4000@radom.zrz.tu-berlin.de wrote:
> fritz-bayer@web.de <fritz-bayer@web.de> wrote in comp.lang.perl.misc:
> >
> > anno4000@radom.zrz.tu-berlin.de wrote:
> > > fritz-bayer@web.de <fritz-bayer@web.de> wrote in comp.lang.perl.misc:
> > > > Hi,
> > > >
> > > > lets say I have the following string:
> > > >
> > > > <tr> dfsdfre <tr>fsdsfd35gd <tr>khf758 <tr>afdga654jhuotj <input
> > > > type="text"> 67kfbs356</tr>sh tu65 </tr> hbrubs</tr>
> > > >
> > > > and I would like to capture the text before the <input...> until the
> > > > first <tr> and the text after until the first <tr> so that I get
> > > >
> > > > <tr>afdga654jhuotj <input type="text> 67kfbs356</tr>
> > > >
> > > > How would I do this?
> > > >
> > > > =~ m!<tr>.*?<input type="text">.*?<tr>!
> > > >
> > > > will only work capturing the first <tr> after the <input..>. The
> > > > problem is that I have to find a expression, which starts looking from
> > > > the right to the left of <input...>.
> > >
> > > Your explanation is confused about whether the closing part should
> > > be <tr> or </tr>. Please clear that up.
> > >
> > > Anno
> >
> > Hi Anno,
> >
> > this is just an example. Actually I'm not looking for a concrete
> > solution for this. It's just an example to illustrate my problem.
>
> Right. Because it is meant to illustrate the problem it is important
> that you make it consistent.
Hi Anno, sorry, you are right. The thing is my real example contains so
much text that I did not want to post it here. But of course, if I
don't, I'm likely to get the right answer to the wrong question.
So let me explain this in words, as I did below. I'm trying to capture a
group of radio buttons which resides inside a table in the middle of an
HTML document, which contains lots of tags and text.
Capturing the back of the HTML table after the radio buttons is easy, as
a ".*?</table>" will do the job. However, capturing the first <table>
tag before the radio button group is more difficult, because there are
plenty of table tags before it.
Actually I only want to get the rows in the table which contain the
radio buttons, but I guess once I get the table I can just strip the
table tags off.
>
> Concrete solutions is all I have to offer. I don't think there is
> a general solution to your backwards-matching problem. There may
> well be individual solutions to special cases of it.
>
> > Suppose I have a text which contains a lot of text, for example, and in the
> > middle somewhere I have the phrase "this is the center of the text".
> > How can I capture this sentence plus the 10 words preceding and
> > following the sentence?
>
> I'll use a somewhat simplistic definition of "word": any sequence of
> non-spaces, optionally followed by a space, /\S+ ?/ in regex. This
> will match $n words around "this is the center ":
>
> my $text = "stop this and stop that and " .
> "this is the center " .
> "and stop stop again";
> my $n = 3;
> my ( $extr) = $text =~
> /((?:\S+ ?){$n}this is the center (?:\S+ ?){$n})/;
> print $extr || '-failed-', "\n";
>
> That prints three words on both sides of "this is the center "
>
> stop that and this is the center and stop stop
>
> > Or how could I capture this sentence plus any text which precedes this
> > sentence UNTIL the word "stopword" is matched. A "stopword.*?this is
> > the center of the text" can fail, if the word "stopword" is a common
> > word, which appears several times.
>
> Using "stop" for "stopword" and the same text from above:
>
> ( $extr) = $text =~ /.*(stop.*this is the center.*?stop)/;
> print $extr || '-failed-', "\n";
>
> prints the nearest pair of "stop"s surrounding "this is the center"
> plus intervening text:
>
> stop that and this is the center and stop
>
> These are rough solutions which may be good enough for some
> applications but not for others. Refining them while sticking
> to the principle that one regex must do it is usually *not*
> worth the while. If a robust, flexible solution is needed,
> it is better to do the work in several steps.
>
> Anno
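
Doing the work in several steps, as suggested above, might look like the following sketch. It uses index() and rindex() rather than a single regex; rindex() with a position argument searches backwards from that position, which is the right-to-left search being asked about. The sub name `nearest_span` is hypothetical; the sample text is the one from the post.

```perl
use strict;
use warnings;

# Return the text from the nearest $stop before $center up to and
# including the nearest $stop after it, or nothing if a piece is missing.
sub nearest_span {
    my ($text, $center, $stop) = @_;

    my $c = index($text, $center);                        # step 1: find the center
    return if $c < 0;

    my $start = rindex($text, $stop, $c);                 # step 2: search backwards from it
    return if $start < 0;

    my $end = index($text, $stop, $c + length($center));  # step 3: search forwards
    return if $end < 0;

    return substr($text, $start, $end + length($stop) - $start);
}

my $text = "stop this and stop that and " .
           "this is the center " .
           "and stop stop again";
my $span = nearest_span($text, "this is the center", "stop");
print defined $span ? $span : '-failed-', "\n";
# prints: stop that and this is the center and stop
```

Each step can fail on its own, so it is easy to report which part of the text was missing.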
------------------------------
Date: Sat, 26 Aug 2006 08:33:09 -0500
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: Match regular expression from LEFT to right
Message-Id: <slrnef0jcl.r6e.tadmc@magna.augustmail.com>
fritz-bayer@web.de <fritz-bayer@web.de> wrote:
> Actually the real example consists of HTML.
Then you probably should not be trying to process it with
regular expressions.
You should use a module that understands HTML for processing HTML data.
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
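
As an illustration of the module route, here is a rough sketch using HTML::TreeBuilder from CPAN (one possible choice; the post does not name a module). The sample document, the attribute values and the variable names are assumptions made up for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);
use HTML::TreeBuilder;   # CPAN module (HTML-Tree distribution), not core

# Hypothetical sample document: a layout table before the one we want.
my $html = <<'HTML';
<html><body>
<table><tr><td>navigation</td></tr></table>
<table>
<tr><td>Pick a colour</td></tr>
<tr><td><input type="radio" name="colour" value="red"> Red</td></tr>
<tr><td><input type="radio" name="colour" value="blue"> Blue</td></tr>
</table>
</body></html>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# Find every radio button, then walk up to the table row that holds it.
my (@rows, %seen);
for my $radio ($tree->look_down(_tag => 'input', type => 'radio')) {
    my $row = $radio->look_up(_tag => 'tr');
    next unless defined $row;
    push @rows, $row unless $seen{ refaddr $row }++;   # one entry per row
}

my @row_html = map { $_->as_HTML } @rows;
print "$_\n" for @row_html;

$tree->delete;   # release the parse tree
```

Starting from the radio buttons and walking up with look_up() sidesteps the "which preceding tag is mine" problem, because the parser has already paired the tags.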
------------------------------
Date: 26 Aug 2006 14:51:05 GMT
From: anno4000@radom.zrz.tu-berlin.de
Subject: Re: Match regular expression from LEFT to right
Message-Id: <4lb5epF14632U1@news.dfncis.de>
fritz-bayer@web.de <fritz-bayer@web.de> wrote in comp.lang.perl.misc:
> anno4000@radom.zrz.tu-berlin.de wrote:
> > fritz-bayer@web.de <fritz-bayer@web.de> wrote in comp.lang.perl.misc:
> > > anno4000@radom.zrz.tu-berlin.de wrote:
> > > > fritz-bayer@web.de <fritz-bayer@web.de> wrote in comp.lang.perl.misc:
[...]
> So let me explain this in words, as I did below. I'm trying to capture a
> group of radio buttons which resides inside a table in the middle of an
> HTML document, which contains lots of tags and text.
>
> Capturing the back of the HTML table after the radio buttons is easy, as
> a ".*?</table>" will do the job. However, capturing the first <table>
> tag before the radio button group is more difficult, because there are
> plenty of table tags before it.
It isn't so hard. Did you look at the "stop" example I gave?
> > my $text = "stop this and stop that and " .
> > "this is the center " .
> > "and stop stop again";
> > ( $extr) = $text =~ /.*(stop.*this is the center.*?stop)/;
> > print $extr || '-failed-', "\n";
Allowing an arbitrary greedy match before capturing the leading "stop"
eats as much text as possible while still allowing the match. So it
finds the "stop" nearest to the center text with no more intervening
"stop"s. That's what you want, isn't it?
Anno
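
Applied to the <tr> string from the start of the thread, the same greedy-prefix idea might look like the sketch below. It assumes the closing tags are </tr> and that the <input ...> tag sits on one line; the second pattern is an alternative that uses a negative lookahead and is not taken from the posts.

```perl
use strict;
use warnings;

my $string = '<tr> dfsdfre <tr>fsdsfd35gd <tr>khf758 <tr>afdga654jhuotj '
           . '<input type="text"> 67kfbs356</tr>sh tu65 </tr> hbrubs</tr>';

# The leading greedy .* swallows as much as it can, so the captured <tr>
# is the last one that can still be followed by the <input ...> tag.
my ($greedy) = $string =~ m{.*(<tr>.*?<input type="text">.*?</tr>)}s;

# Alternative: forbid another <tr> between the opening tag and the <input>.
my ($tempered) = $string =~ m{(<tr>(?:(?!<tr>).)*?<input type="text">.*?</tr>)}s;

print "greedy:   $greedy\n";
print "tempered: $tempered\n";
# both print: <tr>afdga654jhuotj <input type="text"> 67kfbs356</tr>
```

Note that with several <input ...> tags in the text the greedy prefix would settle on the last one, while the lookahead version finds the first.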
------------------------------
Date: 26 Aug 2006 05:01:37 -0700
From: "Matt Garrish" <mgarrish@gmail.com>
Subject: Re: page reload question
Message-Id: <1156593697.587886.91380@i3g2000cwc.googlegroups.com>
pleaseexplaintome@yahoo.com wrote:
> Hi, I have a perl/cgi script that includes dynamically created
> checkboxes and file names. When a given checkbox is checked I move the
> related file. how do I redisplay the page without the checkbox and
> file name. In other words, I want to reload the page just as if the
> user has entered for the 1st time.
>
> i've tried variations of:
>
> <form action="/cgi-bin/page.cgi" method="post" onsubmit="doRefresh()">
>
> function doRefresh(){
> location.replace("/cgi-bin/page.cgi");
> //location.reload("/cgi-bin/page.cgi");
> //location.reload("");
> }
>
This doesn't make any sense at all. If you're posting back the form to
move the files, why don't you remove the checkboxes from the page you
return in response? You're trying to intercept the post action and
instead reload the page, which means the form the user is submitting
will never get processed.
Matt
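
A rough sketch of that flow, without any JavaScript: the script that receives the POST moves the checked files first and only then builds the form from whatever is left. The sub names, directories and file names below are hypothetical, the CGI parameter handling is left out, and the file names are assumed to need no HTML escaping.

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Spec;
use File::Temp qw(tempdir);

# List the plain files currently in $dir, sorted by name.
sub list_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "cannot open $dir: $!";
    my @names = sort grep { -f File::Spec->catfile($dir, $_) } readdir $dh;
    closedir $dh;
    return @names;
}

# Move the selected files, then build the form from what is left.
sub handle_submission {
    my ($dir, $done_dir, @selected) = @_;
    my %present = map { $_ => 1 } list_files($dir);
    for my $name (grep { $present{$_} } @selected) {   # ignore unknown names
        move(File::Spec->catfile($dir, $name),
             File::Spec->catfile($done_dir, $name))
            or die "cannot move $name: $!";
    }
    my $rows = '';
    $rows .= qq{<label><input type="checkbox" name="file" value="$_"> $_</label><br>\n}
        for list_files($dir);
    return qq{<form action="/cgi-bin/page.cgi" method="post">\n$rows<input type="submit">\n</form>\n};
}

# Demo with throwaway directories and made-up file names.
my $dir      = tempdir(CLEANUP => 1);
my $done_dir = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.txt c.txt)) {
    open(my $fh, '>', File::Spec->catfile($dir, $name)) or die "cannot create $name: $!";
    close $fh;
}
my $page = handle_submission($dir, $done_dir, 'b.txt');
print $page;   # the form no longer lists b.txt
```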
------------------------------
Date: 26 Aug 2006 04:20:04 -0700
From: "Davy" <zhushenli@gmail.com>
Subject: Perl's GUI
Message-Id: <1156591204.790797.279940@b28g2000cwb.googlegroups.com>
Hi all,
I want to choose a GUI toolkit for my Perl program. Someone told me that
Perl/Tk exists and is easy to use. But I found that the latest version of
Perl/Tk was released on 11 Apr 2004. Does that mean Perl/Tk is mature and
simply has not needed an update for a long time, or something else?
And is there any other good choice for Perl's GUI?
Thanks!
Davy
------------------------------
Date: 26 Aug 2006 04:40:30 -0700
From: "Robert Hicks" <sigzero@gmail.com>
Subject: Re: Perl's GUI
Message-Id: <1156592430.870135.315790@m73g2000cwd.googlegroups.com>
Davy wrote:
> Hi all,
>
> I want to choose a GUI toolkit for my Perl program. Someone told me that
> Perl/Tk exists and is easy to use. But I found that the latest version of
> Perl/Tk was released on 11 Apr 2004. Does that mean Perl/Tk is mature and
> simply has not needed an update for a long time, or something else?
>
> And is there any other good choice for Perl's GUI?
>
> Thanks!
> Davy
wxPerl... Google for it.
There is also Tkx, a newer interface to Tk from ActiveState, which is
great. It is on CPAN.
Robert
------------------------------
Date: Sat, 26 Aug 2006 13:00:39 GMT
From: zentara <zentara@highstream.net>
Subject: Re: Perl's GUI
Message-Id: <s1g0f25413ghhvvm793lnf3kcqkpjebkb1@4ax.com>
On 26 Aug 2006 04:20:04 -0700, "Davy" <zhushenli@gmail.com> wrote:
>Hi all,
>
>I want to choose a GUI toolkit for my Perl program. Someone told me that
>Perl/Tk exists and is easy to use. But I found that the latest version of
>Perl/Tk was released on 11 Apr 2004. Does that mean Perl/Tk is mature and
>simply has not needed an update for a long time, or something else?
Tk will be the easiest for you to get going. The basic concepts of
writing a Tk program have not changed since 2004. There have been
bug fixes in Tk since then, but Nick Ing-Simmons has left applying
the bug-fix patches to the end users. He may be waiting to see how
they all work out before issuing a new Tk version level.
Many are disappointed by this, but it is not a "show-stopper". The only
bug which I absolutely had to patch was one where Gtk2 apps were
crashing Tk apps. You can check the Tk buglist for details.
>And is there any other good choice for Perl's GUI?
Perl/Tk is based on Tcl/Tk, and the 804.27 version level represents
the latest port. You might want to look at http://www.tcl.tk/ as an
alternative. Perl/Tk is for people who like Perl as their basic
language, but you might like Tcl.
The other real competitor is Perl/Gtk2. Perl/Gtk2 has had the
advantage of being built from the ground up with a consistent
object model and widget design policies. wxPerl is built on
wxWidgets, which on Linux typically sits on top of Gtk.
The biggest complaint about Tk is that it is a big hodgepodge
of independent widgets, whereas Gtk2 is all based on Glib objects.
This doesn't sound too important yet, but it becomes so when you
try to sub-class objects and control signalling between them.
Now for the bad news. Perl/Gtk2 can make it very difficult to do
what is simple in Tk. The most obvious thing is colorization of
widgets. Gtk2 has a powerful "theme" system, so you can make all your
apps look like your other Gtk2 apps, like Mozilla. BUT that same theme
system makes it quite difficult to give custom colors and fonts to
individual widgets.
Tk doesn't have that problem.
Use Tk. It will be easier for you to get your app up and running.
Afterwards, you can port it to Gtk2 or WxWidgets and compare
how easy it was to write.
--
I'm not really a human, but I play one on earth.
http://zentara.net/japh.html
------------------------------
Date: Fri, 25 Aug 2006 22:56:52 +0100
From: Ben Morrow <benmorrow@tiscali.co.uk>
Subject: Re: Question about UNIVERSAL
Message-Id: <4a84s3-ll7.ln1@osiris.mauzo.dyndns.org>
Quoth "Ferry Bolhar" <bol@adv.magwien.gv.at>:
> Ben Morrow:
>
> >> Is this what "Regexp::DESTROY" is for?
> >
> > WTF??? *NO*.
> >
> > Regexp::DESTROY does nothing. It's defined in perl/universal.c .
>
> If it does nothing, why does it exist?
I don't know. I suspect the memory leak you mention below has something
to do with it, though I can't offhand come up with a situation where not
having a DESTROY can cause leaks.
> > What on earth made you think it had *anything* to do with cycles in the
> > @ISA hierarchy?
>
> Well, it was just a guess - some long time ago, I heard about an error
> with a special kind of objects which did not get destroyed correctly,
> resulting in memory leaks - and that's what Regexp::DESTROY was
> created for. And since @ISA has to do with objects and you wrote
> about an error in @ISA handling without further details, I guessed it
> could be this one...
Sorry, I must have been unclear. What I meant was:
When perl finds a loop in the @ISA hierarchy (that is, when it hits
a class for the second time while searching for a given method), it
will croak with 'Recursive inheritance detected...'. UNIVERSAL must
be special-cased in this test, such that if perl hits UNIVERSAL
twice while looking for a given method, it simply concludes the
method cannot be found without throwing an error.
I did not mean there was an error in perl, but that perl already detects
cycles in @ISA as an error in your program.
Ben
--
If you put all the prophets, | You'd have so much more reason
Mystics and saints | Than ever was born
In one room together, | Out of all of the conflicts of time.
benmorrow@tiscali.co.uk The Levellers, 'Believers'
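
The cycle detection is easy to provoke on purpose. In this small sketch the package names are hypothetical, and the exact point at which perl complains (on the @ISA assignment or on the later method search) is assumed to depend on the perl version, so the whole thing sits inside one eval.

```perl
use strict;
use warnings;

my $ok = eval {
    no strict 'refs';                       # building @ISA by package name
    @{'Loop::A::ISA'} = ('Loop::B');        # hypothetical packages
    @{'Loop::B::ISA'} = ('Loop::A');        # closes the cycle
    'Loop::A'->no_such_method;              # forces a method search
    1;
};
my $error = $ok ? '' : $@;
print "perl said: $error";
```

The message caught in `$error` should start with "Recursive inheritance detected", which is the error described above.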
------------------------------
Date: Sat, 26 Aug 2006 12:54:22 +0200
From: "Dr.Ruud" <rvtol+news@isolution.nl>
Subject: Re: regular expression variables under debugger
Message-Id: <ecpgdg.15o.1@news.isolution.nl>
wlcna schreef:
> Dr.Ruud:
>> For me it works (with -d) as you expect.
>> Did you also try to add watch expressions with -w for $str and $1?
>
> So, I thought putting a watch on $1 was a useful suggestion
How about the watch on $str?
--
Affijn, Ruud
"Gewoon is een tijger."
------------------------------
Date: Sat, 26 Aug 2006 11:38:52 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude.invalid>
Subject: Re: regular expression variables under debugger
Message-Id: <Xns982B4DE685C01asu1cornelledu@127.0.0.1>
"wlcna" <wlcna@nospam.com> wrote in news:J1PHg.18$tU.4
@newssvr21.news.prodigy.com:
> "Henry Law" <news@lawshouse.org> wrote in message
> news:1156517403.19827.0@damia.uk.clara.net...
>> wlcna wrote:
>>> "Tad McClellan" <tadmc@augustmail.com> wrote in message
>>> news:slrneesp5j.mcs.tadmc@magna.augustmail.com...
>>>> Then the problem is probably in the part that made you use
>>>> the "essentially" qualifier... :-)
>>>
>>> :) Of course, but not really in this case. The offending code is
>>> essentially IDENTICAL to what I posted. But the "inessential" that
>>> is the key is *I think* WHERE THE INPUT IS COMING FROM, see my other
>>> post.
>>
>> Oh for heaven's sake, man: post some runnable code that, when _that_
>> _exact_ _code_ is run, shows your problem in your environment.
...
> It's there, see my other post to your neighbor in this thread.
I saw that one.
> That's complete runnable code that reproduces the problem, for me of
> course, every time, and two sets of debugger executions showing the
> problem of the regex failing for no apparent reason.
>
> That run includes all the anomalies I've previously mentioned,
> including that the first time through the identical regex does not
> work, second time through it does work even though all inputs are the
> same.
That is not necessarily true: The URL you are requesting might be the
same, but there is no guarantee that the content you are getting back is
the same.
You have to check if the returned content is the same. You have to check
that the match succeeded before accessing the match variables. It is as
simple as that.
By the way, you do realize that
$str =~ /([0-9]*)$/;
will match even if there are no digits at the end of the string, right?
#!/usr/bin/perl
use strict;
use warnings;
my $str = q{this is a string with no digits};
if ( $str =~ /([0-9]*)$/ ) {
print "Matched: $1\n";
}
__END__
D:\Src> stupid
Matched:
Anyway, I have read enough of your remarks to come to the inevitable
conclusion: *PLONK*
Sinan
--
A. Sinan Unur <1usa@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)
comp.lang.perl.misc guidelines on the WWW:
http://augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
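
Both points, testing that the match succeeded before touching $1 and requiring at least one digit, fit in a few lines. This is a minimal sketch; the sub name `trailing_digits` and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Return the trailing run of digits, or undef when there is none.
sub trailing_digits {
    my ($str) = @_;
    # + instead of * : at least one digit is required for a match.
    # $1 is only read when the match succeeded.
    return $str =~ /([0-9]+)$/ ? $1 : undef;
}

for my $str ('issue 9649', 'this is a string with no digits') {
    my $digits = trailing_digits($str);
    print defined $digits ? "Matched: $digits\n" : "No match for '$str'\n";
}
```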
------------------------------
Date: Sat, 26 Aug 2006 08:25:59 -0500
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: regular expression variables under debugger
Message-Id: <slrnef0iv7.r6e.tadmc@magna.augustmail.com>
wlcna <wlcna@nospam.com> wrote:
> I must say a good number of you perl programmers seem like wusses.
You say this to people that you want something from?
> I'm
> not primarily a perl programmer
It will be a lot harder to get help from Perl programmers after
you call them names.
> but just gotta say that.
Cutting off your nose to spite your face will not advance you
towards a solution to your problem.
> Not all of
> you, but a whole bunch.
We love you too.
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: Sat, 26 Aug 2006 15:40:00 GMT
From: "wlcna" <wlcna@nospam.com>
Subject: Re: regular expression variables under debugger
Message-Id: <kXZHg.12227$1f6.11747@newssvr27.news.prodigy.net>
"Dr.Ruud" <rvtol+news@isolution.nl> wrote in message
news:ecpgdg.15o.1@news.isolution.nl...
> wlcna schreef:
>> Dr.Ruud:
>
>>> For me it works (with -d) as you expect.
>>> Did you also try to add watch expressions with -w for $str and $1?
>>
>> So, I thought putting a watch on $1 was a useful suggestion
>
> How about the watch on $str?
>
I didn't see the compelling reason for that one - $str wasn't changing
while this problem was happening. But the watch on $1 produced the
interesting weird stuff happening with that utf8 thing I mentioned.
It's very clear to me that that is where the problem lies or at least
that is where the real problem is happening. There's more sample code
below, but if you want to know without running it, $str in both sets is
"Yahoo! News: U.S. News."
Another update: I modified the code just previously posted to try to
reproduce the problem using HTML::TreeBuilder instead of XML:: since
that's generally available without an install and easier for wusses to
try. Unfortunately, with HTML::TreeBuilder, this problem didn't seem to
occur (this surprised me)...
Here's the code I used to test HTML::, and I also cut out a few lines
that could be shortened since each single additional line seems to tax
SO MUCH the brains of some out here.
I'd note I picked the string to retrieve in the code below b/c it could
be retrieved via a parse using *either* the HTML or XML library (which
is not necessarily an easy thing to find), and in both cases via the
"as_text" method. This same string when retrieved using the XML library
did again produce the error, but when retrieved using HTML
Another update: I now am sometimes getting "Out of memory" bombs while
running these simple regexes using the XML library. Again, for those
who don't listen: same inputs, regex results different when using XML
library! You can't reproduce w/$str = "Yahoo! News: U.S. News".
I don't say for sure this is a Perl bug since God only knows it could be
some kind of data corruption in my computer, lightning striking, some
documented but to me unknown incompatibility, who knows. But something
is definitely wrong here and analyzing my two lines of regex code is not
really the point (for those doing that, not talking about you).
I'm going to try it on a second machine if possible....
---------------------
# HTML version
use LWP::Simple;
use HTML::TreeBuilder;
my $strUrl = 'http://rss.news.yahoo.com/rss/us';
my $strHtml = get( $strUrl );
my $t = new HTML::TreeBuilder;
$t->parse( $strHtml );
$t->eof;
my $str = $t->content->[0]->content->[0]->as_text;
$str =~ /(.*)news/i;
my $testPart = $1;
my $testWhole = $&;
my $breakpoint = 3;
print "testPart: <$testPart>, testWhole: <$testWhole>\n";
---------------------
And I may as well show the related XML version which still produces the
error. This code pulls out the exact same string using the same final
access method "as_text". And this one does show the problem (again, for
me of course).
---------------------
#!/usr/bin/perl
use strict;
# *SECOND* XML and RSS VERSION
use LWP::Simple;
use XML::TreeBuilder;
my $strUrl = 'http://rss.news.yahoo.com/rss/us';
# retrieve
my $strHtml = get( $strUrl );
# parse the data retrieved.
my $t = new XML::TreeBuilder;
$t->parse( $strHtml );
$t->eof;
my $str = $t->content->[1]->content->[1]->as_text;
$str =~ /(.*)news/i;
my $testPart = $1;
my $testWhole = $&;
my $breakpoint = 3;
print "testPart: $testPart, testWhole: $testWhole\n";
------------------------------
Date: Sat, 26 Aug 2006 17:38:46 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: regular expression variables under debugger
Message-Id: <slrnef0qo8.63p.hjp-usenet2@yoyo.hjp.at>
On 2006-08-25 12:19, wlcna <wlcna@nospam.com> wrote:
> "Ilya Zakharevich" <nospam-abuse@ilyaz.org> wrote in message
> news:ecmc9f$1cof$1@agate.berkeley.edu...
>> the most crucial info is your perl -V. Why did not you post it yet?
>>
>
> 5.8.2 is the version.
>
> Re: getting the latest, is there a way to update perl without losing
> all the libraries that are installed? Can you give me a tip on
> dealing with this issue? I compile my own perl (under Linux)...
Yes. The Configure script asks for extra directories to add to @INC.
IIRC it even tries to detect an already installed perl and asks if it
should add these directories.
hp
--
_ | Peter J. Holzer | > Wieso sollte man etwas erfinden was nicht
|_|_) | Sysadmin WSR | > ist?
| | | hjp@hjp.at | Was sonst wäre der Sinn des Erfindens?
__/ | http://www.hjp.at/ | -- P. Einstein u. V. Gringmuth in desd
------------------------------
Date: Sat, 26 Aug 2006 10:24:44 GMT
From: Brian Greenfield <not-for-replies@zombie.org.uk>
Subject: Re: warnings (was Re: Most useful standard module?)
Message-Id: <hv70f21o3b6l3bnkvhkljg162lmu4t12ps@4ax.com>
On Fri, 25 Aug 2006 21:17:38 +0000 (UTC),
crichmon@filc9283.fm.intel.com (Chris Richmond - MD6-FDC ~) wrote:
>We treat warnings as fatal
>errors. They *have* to be fixed or the code doesn't get released.
Another thing you can do with the warnings pragma that you can't with
perl -w (AFAIK) is
use warnings FATAL => 'all';
Now, you don't have any option to ignore warnings.
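
A small sketch of what that looks like in practice; the sub name `add_strict` is hypothetical. The pragma is lexically scoped, so it can be confined to one block or file, and the resulting die can be caught with eval like any other exception.

```perl
use strict;
use warnings;

sub add_strict {
    use warnings FATAL => 'all';   # lexically scoped to this sub body
    my ($x, $y) = @_;
    return $x + $y;                # an undef operand now dies here
}

my $sum   = eval { add_strict(1, undef) };
my $error = $@;
print defined $sum ? "sum: $sum\n" : "died: $error";
```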
------------------------------
Date: Sat, 26 Aug 2006 15:04:42 +0000 (UTC)
From: crichmon@filc9283.fm.intel.com (Chris Richmond - MD6-FDC ~)
Subject: Re: warnings (was Re: Most useful standard module?)
Message-Id: <ecpnua$v6c$2@news01.intel.com>
In article <hv70f21o3b6l3bnkvhkljg162lmu4t12ps@4ax.com>,
Brian Greenfield <not-for-replies@zombie.org.uk> writes:
>On Fri, 25 Aug 2006 21:17:38 +0000 (UTC),
>crichmon@filc9283.fm.intel.com (Chris Richmond - MD6-FDC ~) wrote:
>Another thing you can do with the warnings pragma that you can't with
>perl -w (AFAIK) is
> use warnings FATAL => 'all';
We did something similar to that in the bootup code, but
it isn't "on" after bootup.
--
Chris Richmond | I don't speak for Intel & vise versa
------------------------------
Date: Fri, 25 Aug 2006 23:15:41 +0100
From: Ben Morrow <benmorrow@tiscali.co.uk>
Subject: Re: warnings (was Re: Most useful standard module?)
Message-Id: <dd94s3-ll7.ln1@osiris.mauzo.dyndns.org>
Quoth crichmon@eng.fm.intel.com:
> In article <slrneeukhq.oqt.tadmc@magna.augustmail.com>,
> Tad McClellan <tadmc@augustmail.com> writes:
> >If you are forced to use a non-warnings-clean module, the -w spews
> >a bunch of stuff that you have no control over, which "trains" you
> >to ignore warnings even if there might be a "real" one in amongst
> >all of the noise.
>
> Oh, that sort of explains things. We treat warnings as fatal
> errors. They *have* to be fixed or the code doesn't get released.
In that case you may be interested in the -W switch, which turns on all
warnings everywhere regardless of 'no warnings' or $^W. You should
probably read perllexwarn, as the new (well, since 5.6, which isn't
exactly new) architecture is worth working with rather than against.
Ben
--
Every twenty-four hours about 34k children die from the effects of poverty.
Meanwhile, the latest estimate is that 2800 people died on 9/11, so it's like
that image, that ghastly, grey-billowing, double-barrelled fall, repeated
twelve times every day. Full of children. [Iain Banks] benmorrow@tiscali.co.uk
------------------------------
Date: 26 Aug 2006 04:48:48 -0700
From: "Paul Lalli" <mritty@gmail.com>
Subject: Re: warnings
Message-Id: <1156592927.999062.103430@b28g2000cwb.googlegroups.com>
Dr.Ruud wrote:
> Chris Richmond - MD6-FDC ~ schreef:
>
> > We treat warnings as fatal
> > errors. They *have* to be fixed or the code doesn't get released.
>
> Have you grepped all source for blocks with
>
> local $^W
>
> or
>
> no warnings
>
> in them? Or are those accepted fixes?
No reason for the grep. Provide the -W option on the command line,
which overrides both 'no warnings' and $^W.
Paul Lalli
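
The difference between -w and -W can be seen by running the same one-liner under each switch. The sketch below starts a child perl with a list-form pipe open, which is assumed to be available on the platform; the sub name `warnings_from` and the one-liner itself are made up for the demonstration.

```perl
use strict;
use warnings;

# Run a tiny program that contains 'no warnings' under the given
# switches and return whatever it prints (stderr is folded into stdout).
sub warnings_from {
    my (@switches) = @_;
    my $code = 'open STDERR, ">&STDOUT" or die; no warnings; my $x; my $y = $x + 1;';
    open(my $child, '-|', $^X, @switches, '-e', $code)
        or die "cannot run perl: $!";
    local $/;                       # slurp all output at once
    my $out = <$child>;
    close $child;
    return defined $out ? $out : '';
}

my $with_w = warnings_from('-w');   # 'no warnings' wins: silent
my $with_W = warnings_from('-W');   # -W overrides 'no warnings'

print "-w output: [$with_w]\n";
print "-W output: [$with_W]\n";
```

Under -W the child should report the use of an uninitialized value even though the code says `no warnings`.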
------------------------------
Date: Sat, 26 Aug 2006 15:03:23 +0000 (UTC)
From: crichmon@filc9283.fm.intel.com (Chris Richmond - MD6-FDC ~)
Subject: Re: warnings
Message-Id: <ecpnrr$v6c$1@news01.intel.com>
In article <eco74s.e8.1@news.isolution.nl>,
"Dr.Ruud" <rvtol+news@isolution.nl> writes:
>Chris Richmond - MD6-FDC ~ schreef:
>> We treat warnings as fatal
>> errors. They *have* to be fixed or the code doesn't get released.
>
>Have you grepped all source for blocks with
> local $^W or no warnings
>in them? Or are those accepted fixes?
No and no. It's a community of developers in which those sorts of
things would bring a lot of bad peer pressure & public humiliation.
--
Chris Richmond | I don't speak for Intel & vise versa
------------------------------
Date: Sat, 26 Aug 2006 16:05:39 GMT
From: zentara <zentara@highstream.net>
Subject: Re: Win32::GUI and Scrolling Text
Message-Id: <ouq0f25t3f5f0eklt52fuvv7admnr1r1il@4ax.com>
On 25 Aug 2006 10:39:39 -0700, "jackbarnett@gmail.com"
<jackbarnett@gmail.com> wrote:
>TK stuff is working.... this is what I'm trying to do (can't use tail
>and that ulgy filehandler thing you got there):
>
>Here is my code. basically I need the "dostuff" to run the entire
>time, but looks like it's also fighting with MainLoop() [keeps locking
>up application, not updating, etc]
>
>Thoughts?
Yeah, gui apps run a thing called an "event loop", which must
be your first concern. If you stall it, you get a condition called
"blocking the gui", where the gui becomes unresponsive. You
should not use sleep() or while(1) loops unless you are really sure
what you are doing. Those will block the gui event loop.
That "ugly filehandler" (as you refer to it) is what gui programs use
to watch filehandles, without blocking the gui. It is sort of like
IO::Select, but designed to work within an event-loop system.
It watches filehandles with blocking the gui.
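
To make the IO::Select comparison concrete, here is a minimal sketch of polling a handle without waiting on it, outside of any gui. The pipe is only a stand-in for whatever handle you would watch, and it assumes a Unix-like system (on win32, select works only on sockets). Tk's fileevent does the equivalent job from inside the event loop.

```perl
use strict;
use warnings;
use IO::Select;

# A pipe stands in for the handle being watched (assumption for the demo).
pipe(my $reader, my $writer) or die "pipe failed: $!\n";
my $sel = IO::Select->new($reader);

# A timeout of 0 polls and returns at once instead of waiting for data.
my @ready = $sel->can_read(0);
print "handles ready before write: ", scalar(@ready), "\n";

syswrite($writer, "hello\n") or die "write failed: $!\n";

@ready = $sel->can_read(0);
print "handles ready after write: ", scalar(@ready), "\n";

# Only read once select reports waiting data, so the read cannot stall.
my $buf = '';
sysread($ready[0], $buf, 1024) if @ready;
print "got: $buf";
```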
The preferred method is to use fileevent, but since you say you
can't use it (possible problems on win32?), I've changed your
example to use a timer instead of sleep. It is probably very
inefficient to do it this way, but here it is. It probably wouldn't
be too bad if your timer was set to 10 seconds (10000 milliseconds).
And you probably would want to clear your text box every 10th run
(or whatever), since you are rereading the whole file each time.
#!/usr/bin/perl
use warnings;
use strict;
use Tk;

my $main = MainWindow->new;
my $t    = $main->Scrolled('Text', -wrap => 'none')->pack(-expand => 1);
my $file = "z.txt";
my $timer;

my $hello = $main->Button(
    -text    => 'Start It',
    -command => \&start_it,
);
$hello->pack;

MainLoop();

sub start_it {
    $hello->configure(-state => 'disabled');    # prevent double starts
    $timer = $main->repeat(1000, \&dostuff);    # every 1000 milliseconds
}

# Timer callback: rereads the whole file and appends it to the text box.
sub dostuff {
    open(my $fh, '<', $file)
        or die("Can't open file: $file: $!\n");
    while (my $line = <$fh>) {
        chomp($line);
        print("$line\n");
        $t->insert('end', "$line\n");
        $t->yview('end');
        $t->see('end');    # same thing
        $t->update();
    }
    close($fh)
        or warn("Can't close file: $file: $!\n");
}
__END__
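
One way around rereading the whole file on every tick is to remember how far you got last time. This is only a sketch under that assumption: read_new_lines is a hypothetical helper, not part of Tk, and it does not handle a line that is still being written when the timer fires.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Hypothetical helper: returns only the lines added to $file since the
# previous call, tracking the byte offset through the reference $pos_ref.
sub read_new_lines {
    my ($file, $pos_ref) = @_;
    open(my $fh, '<', $file)
        or die("Can't open file: $file: $!\n");
    # If the file shrank (truncated or replaced), start again from the top.
    my $size = (stat $fh)[7];
    $$pos_ref = 0 if $size < $$pos_ref;
    seek($fh, $$pos_ref, SEEK_SET)
        or die("Can't seek in file: $file: $!\n");
    my @lines = <$fh>;
    $$pos_ref = tell($fh);    # remember where this read stopped
    close($fh)
        or warn("Can't close file: $file: $!\n");
    chomp(@lines);
    return @lines;
}
```

Inside the timer callback you would keep a file-scoped "my $pos = 0;" and insert only what read_new_lines($file, \$pos) returns, so the text box never needs clearing.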
--
I'm not really a human, but I play one on earth.
http://zentara.net/japh.html
------------------------------
Date: Sat, 26 Aug 2006 16:11:47 GMT
From: zentara <zentara@highstream.net>
Subject: Re: Win32::GUI and Scrolling Text
Message-Id: <sfs0f250dsk5md54bfbe6caekjjk0fo94f@4ax.com>
On Sat, 26 Aug 2006 16:05:39 GMT, zentara <zentara@highstream.net>
wrote:
Oops, I made a small typo, and I didn't want it to confuse you.
>That "ugly filehandler" (as you refer to it) is what gui programs use
>to watch filehandles, without blocking the gui. It is sort of like
>IO::Select, but designed to work within an event-loop system.
>It watches filehandles with blocking the gui.
^^^^^^^^^^^^^
It watches filehandles withOUT blocking the gui.
( referring to Tk's fileevent method )
--
I'm not really a human, but I play one on earth.
http://zentara.net/japh.html
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 9649
***************************************