[24840] in Perl-Users-Digest
Perl-Users Digest, Issue: 6991 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat Sep 11 03:06:27 2004
Date: Sat, 11 Sep 2004 00:05:07 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Sat, 11 Sep 2004 Volume: 10 Number: 6991
Today's topics:
Re: "RFC": re [un]pack() (Anno Siegel)
Re: another try (Darius)
better way to parse html <noemail@#$&&!.net>
Re: better way to parse html <noreply@gunnar.cc>
Re: better way to parse html <tadmc@augustmail.com>
How to expand escape sequence (e.g. \n)? <wojtekdz@att.net>
Re: How to expand escape sequence (e.g. \n)? <gifford@umich.edu>
Re: Lwp Post Problem (Charles DeRykus)
Re: Perl 6 and OOP (J. Romano)
Re: Repeatedly parsing a file to "clean" it. (Graeme Stewart)
Re: Socket holding pattern (Anno Siegel)
Re: Socket holding pattern <uri@stemsystems.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: 10 Sep 2004 22:36:31 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: "RFC": re [un]pack()
Message-Id: <chta9f$bio$1@mamenchi.zrz.TU-Berlin.DE>
Michele Dondi <bik.mido@tiscalinet.it> wrote in comp.lang.perl.misc:
> On Thu, 09 Sep 2004 23:15:25 +0200, Michele Dondi
> <bik.mido@tiscalinet.it> wrote:
>
> >Coming to the point, it often happens to resort to "cascaded"
> >[un]pack()s. In my specific case I have
> [snip]
>
> Any cmt on this?!?
Well, if I must...
I have occasionally used a sequence of pack and unpack in a single
statement (can't remember triple ones), so I don't see much need for
your syntax extension.
pack and unpack suffer from obscurity, not from a lack of flexibility.
If anything is wanting it's a means to identify which parts of one
template interact with which parts of the other. I don't see your
extension helping much, in fact it makes it natural to put them all
on one line. Some kind of column formatting (do-able with explicit
pack/unpack) is a better approach.
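
One possible reading of the column layout mentioned above is to write a cascaded pack/unpack so that each template sits directly beside the data it describes. The sketch below is only an illustration: the helper name bin2dec and the 32-bit width are assumptions chosen for the example, not anything from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a string of '0'/'1' characters to an integer.
sub bin2dec {
    my ($bits) = @_;

    # Left-pad to exactly 32 bits so 'B32' and 'N' describe the same 4 bytes.
    my $padded = substr( ( '0' x 32 ) . $bits, -32 );

    # The inner pack builds 4 bytes from the bit string; the outer unpack
    # reads them back as one big-endian unsigned 32-bit integer.
    return unpack( 'N',
             pack( 'B32', $padded ) );
}

print bin2dec('1010'), "\n";    # prints 10
```

With the two templates stacked in one column, it is easier to check that 'B32' and 'N' cover the same number of bytes.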
Anno
------------------------------
Date: 10 Sep 2004 20:26:43 -0700
From: dmedhora@yahoo.com (Darius)
Subject: Re: another try
Message-Id: <26a5971.0409101926.1bf85f76@posting.google.com>
Mark Clements <mark.clements@kcl.ac.uk> wrote in message news:<41415c47$1@news.kcl.ac.uk>...
>
> Is this merely a learning exercise or do you in fact have a corrupted
> xml file that you are trying to fix?
This is merely a learning exercise. I have solved the problem with
arrays, but I just couldn't with regexps.
>
> If the former then you need to read the other postings: there are
> many(!) xml tools available to make your life easier. Messing around
> with xml is not a good way of teaching yourself about regular expressions.
>
I guess...
> If the latter, is the correct film title always "Somethings gotta give"?
> If the film title varies then how are you expecting to tell in each case
> with what text the "e or whatever needs to be replaced? Have you
> considered restoring from backup?
>
Yes, it is always that, and it occurs just once in a line. I added it
twice because it's merely a learning exercise for me now. When it was
urgent, I used arrays to solve it shamelessly :) but then, looking at
this example, and also as per Anno, I don't think it's possible to use
a regex anyway.
I tried this out using some junk xml type lines:
first line has 2 occurrences, second line has 1.
line 1:<word_word1 string="start" date="2004-09-02 07:33:22"
id="2033878" word_id="2000589" get_id="8647" ><word name="MOVIE"><film
title="S"things Gotta Give" the_number="531780"
/></word></word_word1><film title="S'"e Gotta Give"
the_number="531780" />
line 2:<one name="S'things Gotta Give" something="true" type="demand
xyz" system_number="531780"/>
my $line;
while(chomp($line=<>)){
    $line=~/(.*?)(="S)(.*?)(Gotta)/gc;
    $line=~s/$3/omethings /g;
    $line=~/\G(.*?)(="S)(.*?)(Gotta)/gc;
    $line=~s/$3/omethings /;
    print "FINAL:$line\n";
}
gives me:
FINAL:<word_word1 string="start" date="2004-09-02 07:33:22"
id="2033878" word_id="2000589" get_id="8647" ><word name="MOVIE"><film
title="Somethings Gotta Give" the_number="531780"
/></word></word_word1><film title="S'"e Gotta Give"
the_number="531780" />
FINAL:<one name="Somethings Gotta Give" something="true" type="demand
xyz" system_number="531780"/>
The second occurrence on line 1 didn't get replaced. I have decided to
give up and read more now :)
Thanks
- Darius
> Mark
Thanks Mark..
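
For the sample lines above, a single global substitution appears to be enough. The two-step attempt most likely stalls because a successful s/// modifies $line and so resets its match position; the second \G match then starts from the beginning and finds the title that was already repaired. In addition, $3 is interpolated as a pattern, so any regex metacharacters in the captured text would be interpreted rather than matched literally. The sketch below assumes the damaged text always sits between ="S and Gotta inside a single attribute; fix_title is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: repair every damaged title in one line of input.
sub fix_title {
    my ($line) = @_;

    # Keep the opening  ="S  (captured as $1), drop the damaged run after it,
    # and stop just before "Gotta". [^<>=]*? keeps the match from running
    # past the end of the attribute.
    $line =~ s/(="S)[^<>=]*?(?=Gotta)/${1}omethings /g;
    return $line;
}

my @samples = (
    q{<film title="S"things Gotta Give" /><film title="S'"e Gotta Give" />},
    q{<one name="S'things Gotta Give" something="true"/>},
);

print "FINAL:", fix_title($_), "\n" for @samples;
```

Because of the /g flag, both occurrences on the first sample line are rewritten in one pass, with no need for \G or pos().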
------------------------------
Date: Fri, 10 Sep 2004 20:49:54 -0500
From: Fred <noemail@#$&&!.net>
Subject: better way to parse html
Message-Id: <pan.2004.09.11.01.49.49.374663@#$&&!.net>
Greetings,
I am using this code to impersonate a browser, grab a page and return only
the data elements I want, namely the ozone level. I would like to
entertain any comments about style, and in particular what the heck this
line does:
$list[2] =~ s/<(?:[^>'"]*|(['"]).*?\1)*>//gs;
because it is really past my understanding, I got it from the FAQ's
Thank you.
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;
use HTTP::Request::Common;
use LWP::Simple;
my $ua = LWP::UserAgent->new;
$ua->cookie_jar(HTTP::Cookies->new(file => 'cookie_jar', autosave => 1));
my $html_page = get 'http://www.tnrcc.state.tx.us/cgi-bin/monops/daily_summary?82';
my @array = $html_page =~ /(.*\n)/g;
my @list = undef;
my @ozone = undef;
my $i;
my( $found, $index ) = ( undef, -1 );
for( $i = 0; $i < @array; $i++ )
{
    if( $array[$i] =~ /44201/ )
    {
        $found = $array[$i];
        $index = $i;
        push @list, $found ;
        $found = $array[$i + 1];
        push @list, $found;
        $list[2] =~ s/<(?:[^>'"]*|(['"]).*?\1)*>//gs;
        @ozone = split(' ',$list[2]);
        @ozone = sort { $b <=> $a } @ozone;
        last;
    }
}
print "The current ozone level is: $ozone[0]";
------------------------------
Date: Sat, 11 Sep 2004 04:37:41 +0200
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: better way to parse html
Message-Id: <2qf6naFspst2U1@uni-berlin.de>
Fred wrote:
> I am using this code to impersonate a browser, grab a page and
> return only the data elements I want, namely the ozone level. I
> would like to entertain any comments about style, and in particular
> what the heck this line does:
> $list[2] =~ s/<(?:[^>'"]*|(['"]).*?\1)*>//gs;
> because it is really past my understanding, I got it from the FAQ's
It removes HTML tags. :) It's a more complicated way of doing:
$list[2] =~ s/<[^>]*>//g;
but unlike the latter, the FAQ regex is not broken because of
attributes such as:
value="<me@example.com>"
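
To make the difference concrete, here is a small sketch that runs both substitutions on the same input. The input string and the names strip_simple and strip_faq are made up for the illustration.

```perl
use strict;
use warnings;

# Simple version: everything from '<' to the next '>' is treated as a tag.
sub strip_simple {
    my ($html) = @_;
    $html =~ s/<[^>]*>//g;
    return $html;
}

# FAQ version: a quoted attribute value is skipped as a unit,
# so a '>' inside the quotes does not end the tag.
sub strip_faq {
    my ($html) = @_;
    $html =~ s/<(?:[^>'"]*|(['"]).*?\1)*>//gs;
    return $html;
}

# Made-up input with a '>' inside a quoted attribute value.
my $html = q{<input name="email" value="<me@example.com>">Ozone: 42};

print "simple: ", strip_simple($html), "\n";    # simple: ">Ozone: 42
print "faq:    ", strip_faq($html),    "\n";    # faq:    Ozone: 42
```

The simple version stops at the first '>' it sees, which here is the one inside the attribute value, so the tail of the tag leaks into the output.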
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
------------------------------
Date: Fri, 10 Sep 2004 23:07:29 -0500
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: better way to parse html
Message-Id: <slrnck4ug1.al0.tadmc@magna.augustmail.com>
Fred <noemail@#$&&!.net> wrote:
> I would like to
> entertain any comments about style, and in particular what the heck this
> line does:
> $list[2] =~ s/<(?:[^>'"]*|(['"]).*?\1)*>//gs;
Let's comment it:
$list[2] =~ s/<              # opening angle bracket
              (?:            # grouping withOUT memory
                  [^>'"]*    # any char except quotes and close angle
                |            # or
                  (['"])     # a quote and
                  .*?        # as few chars as possible and
                  \1         # the SAME quote that was used to start it
              )*             # zero or more of that group
              >              # closing angle bracket
             //gsx;
> my @list = undef;
@list will contain one element here. Is that what you wanted?
If you want it to contain zero elements then use:
my @list; # defaults to empty list
or
my @list = ();
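
A quick way to see the difference is to count the elements. It also suggests why the original code reads $list[2]: the leading undef occupies index 0, so the two pushed lines land at indices 1 and 2. A minimal sketch, with placeholder strings standing in for the page lines:

```perl
use strict;
use warnings;

my @with_undef = undef;    # one element, whose value is undef
my @empty      = ();       # no elements

print scalar(@with_undef), "\n";    # prints 1
print scalar(@empty),      "\n";    # prints 0

push @with_undef, 'matching line', 'following line';
push @empty,      'matching line', 'following line';

print "$with_undef[2]\n";    # prints: following line
print "$empty[1]\n";         # prints: following line
```

So after switching to an empty @list, the index used later in the loop would need to change from 2 to 1.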
> for( $i = 0; $i < @array; $i++ )
foreach my $i ( 0 .. $#array ) # much easier to understand
> $found = $array[$i + 1];
> push @list, $found;
You do not need the temporary variable:
push @list, $array[$i + 1];
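
Putting these suggestions together, the loop might be reduced to something like the sketch below. The sample lines are invented, since the real page layout is not shown in the thread, and highest_reading is a hypothetical name.

```perl
use strict;
use warnings;

# Invented stand-in for the fetched page, one element per line.
my @lines = (
    "<tr><td>Site 82</td></tr>\n",
    "<tr><td>Ozone (44201)</td></tr>\n",
    "<tr><td>31</td> <td>47</td> <td>39</td></tr>\n",
);

# Return the largest number on the line that follows the 44201 marker.
sub highest_reading {
    my @array = @_;

    # Stop one short of the end so that $array[$i + 1] always exists.
    foreach my $i ( 0 .. $#array - 1 ) {
        next unless $array[$i] =~ /44201/;

        my $row = $array[ $i + 1 ];
        $row =~ s/<(?:[^>'"]*|(['"]).*?\1)*>//gs;    # strip tags, as before

        my @ozone = sort { $b <=> $a } split ' ', $row;
        return $ozone[0];
    }
    return;    # marker line not found
}

print "The current ozone level is: ", highest_reading(@lines), "\n";    # 47
```

No @list, $found or $index is needed here, because the only line that matters is the one after the match.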
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: Sat, 11 Sep 2004 03:24:51 GMT
From: "Wojtek Dziegielewski" <wojtekdz@att.net>
Subject: How to expand escape sequence (e.g. \n)?
Message-Id: <7eu0d.2075$z_3.1757@trndny07>
How to expand escape sequences contained in a variable?
Example, if I want the input record separator ($/) to be 2 consecutive new
line characters, I can use literal string like this:
$/ = "/n/n";
However, my sequence for the input record separator is inside a variable,
like this:
my $a = '/n/n';
I need to assign the contents of $a to $/ in such a way that /n's will get
translated into newline characters. How can I have it accomplished in Perl?
------------------------------
Date: Sat, 11 Sep 2004 01:43:22 -0400
From: Scott W Gifford <gifford@umich.edu>
Subject: Re: How to expand escape sequence (e.g. \n)?
Message-Id: <qsz4qm5tl6t.fsf@mspacman.gpcc.itd.umich.edu>
"Wojtek Dziegielewski" <wojtekdz@att.net> writes:
[...]
> my sequence for the input record separator is inside a variable,
> like this:
>
> my $a = '/n/n';
>
> I need to assign the contents of $a to $/ in such a way that /n's will get
> translated into newline characters. How can I have it accomplished in Perl?
It works like you'd expect, except there are two bugs in your code
above. First, you use a backslash, not a forward slash, to write \n.
Second, \n isn't interpreted inside of single quotes, but only inside
of double quotes. So if you do:
my $a = "\n\n";
$/ = $a;
you'll get what you expect.
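
If the string really does arrive at run time containing a literal backslash followed by n (read from a config file or the command line, say), double quotes in the source cannot help and the escapes have to be translated explicitly. The following is a minimal sketch under that assumption; expand_escapes and the small set of supported escapes are illustrative choices, not a standard function.

```perl
use strict;
use warnings;

# Escapes this sketch understands; anything else is left untouched.
my %escape = ( n => "\n", t => "\t", r => "\r", '\\' => '\\' );

sub expand_escapes {
    my ($text) = @_;

    # Replace backslash + known letter with the character it stands for.
    $text =~ s/\\([ntr\\])/$escape{$1}/g;
    return $text;
}

my $from_user = '\n\n';                        # four characters: \ n \ n
my $separator = expand_escapes($from_user);    # two newline characters

print length($from_user), " -> ", length($separator), "\n";    # prints 4 -> 2

# $/ = $separator;    # would then read blank-line-separated records
```

A lookup table is used rather than a string eval so that arbitrary input is never executed as code.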
----ScottG.
------------------------------
Date: Sat, 11 Sep 2004 06:19:44 GMT
From: ced@bcstec.ca.boeing.com (Charles DeRykus)
Subject: Re: Lwp Post Problem
Message-Id: <I3v5Kw.49s@news.boeing.com>
In article <abb15b21.0409010209.71b849db@posting.google.com>,
Sure <csuresh01@yahoo.com> wrote:
>Hello All,
>I want to update a form using the LWP & HTTP method. It was
>working fine when I am updating the values like this
>
>$ua = LWP::UserAgent->new;
>$url ='http://xxx.be/cgi-bin/viewauth/Tracking/TestProjectAgainInitialDevStory#edittable2';
>use HTTP::Request::Common;
>
>my $res = $ua->request(POST $url,
> Content_Type =>'form-data',
> Content => [
> ettablenr => '2',
> etcell2x1 =>'Task',
> etcell2x2 =>'2',
> etcell2x3 =>'3',
> etcell2x4 =>'4',
> etcell2x5 =>'High',
> etcell2x6 =>'SureshC',
> etcell2x7 =>'CSuresh',
> etcell2x8 =>'Twiki Data Updation',
> etrows => '2',
> etsave =>'Save table']);
>
>
>It was not working when I store the value into a Variable. Like This.
>
>$postStr = ettablenr => '2', etcell2x1 =>'Task', etcell2x2 =>'2',
>etcell2x3 =>'3', etcell2x4 =>'4', etcell2x5 =>'High', etcell2x6
>=>'SureshC', etcell2x7 =>'CSuresh', etcell2x8 =>'Twiki Data
>Updation', etrows => '2', etsave =>'Save table']);
>
The above statement won't compile...
>$ua = LWP::UserAgent->new;
>my $res = $ua->request(POST $url, Content_Type=>'form-data', Content
>=>[$postStr]);
>
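
One way to keep the form fields in a variable is to store them in an array and pass a reference to it, since a scalar cannot hold a list of key/value pairs. A minimal sketch, assuming libwww-perl is installed; only some of the fields from the post are repeated, and the request is built but not sent:

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

my $url = 'http://xxx.be/cgi-bin/viewauth/Tracking/TestProjectAgainInitialDevStory#edittable2';

# An array (not a scalar) holds the key/value pairs, in order.
my @post_fields = (
    ettablenr => '2',
    etcell2x1 => 'Task',
    etcell2x8 => 'Twiki Data Updation',
    etrows    => '2',
    etsave    => 'Save table',
);

# Content expects an array reference, so pass \@post_fields.
my $request = POST $url,
    Content_Type => 'form-data',
    Content      => \@post_fields;

print $request->method, " ", $request->uri, "\n";

# To send it:  my $res = LWP::UserAgent->new->request($request);
```

Writing Content => [@post_fields] would work as well; what does not work is a single string inside the brackets, because POST then sees one field name with no value.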
--
Charles DeRykus
------------------------------
Date: 10 Sep 2004 22:41:53 -0700
From: jl_post@hotmail.com (J. Romano)
Subject: Re: Perl 6 and OOP
Message-Id: <b893f5d4.0409102141.879370d@posting.google.com>
> On 10 Sep 2004 07:04:12 -0700, jl_post@hotmail.com (J. Romano) wrote:
>
> > Now to my question: Knowing that Perl 6 will have better OO
> >handling, does that mean that Data::Dumper will lose its usefulness on
> >Perl 6 objects? Or will it still be able to let a programmer peer
> >into them and let him/her tinker with their data using the debugger?
Michele Dondi <bik.mido@tiscalinet.it> replied in message
news:<i984k01jrmmf20atq3v9serems23eh7dmf@4ax.com>...
>
> There will be a predefined method for this. Only, the definitive name
> has not been chosen yet. Proposals include, not surprisingly C<.code>,
> C<.dump>, etc. Larry has been thinking about C<.perl>, envisioning a
> possible future for, say, C<.python>, C<.ruby>, etc.
Interesting.
But I'm confused as to what the .perl() method would do. Are you
saying that "$a.perl()" would do the same thing that "Dumper $a" does
now? And if that's so, what would "$a.python()" and "$a.ruby()"
generate? (Code that can be eval()'ed in those respective languages,
maybe? That would be very interesting, I would think. Molto
interessante...)
-- Jean-Luc
------------------------------
Date: 10 Sep 2004 16:18:24 -0700
From: g_stewart@hotmail.com (Graeme Stewart)
Subject: Re: Repeatedly parsing a file to "clean" it.
Message-Id: <61d476af.0409101518.4eac813b@posting.google.com>
Thank you, thank you, thank you.
Both the utilization of subfunctions and the implementation of seek()
will make this code significantly more pleasing to the eye!
Wolfgang Hommel <wolf@code-wizards.com> wrote in message news:<chstc2$krg$06$1@news.t-online.com>...
> Hi Graeme,
>
> > I've got a perl script that repeatedly opens and closes a couple of
> > files then executes a while loop against those open files (example
> > below). Rather than repeatedly open and close the files, can I do this
> > in a better / more efficient way?
>
> Not sure what exactly you consider as "efficient", but it's rather
> unlikely that the performance of your program suffers from opening and
> closing files. Instead, regarding the code you posted, I'd recommend
>
> a) using subfunctions instead of repeating code :-)
>
> b) "slurp mode" for reading each of those textfiles and splitting the
> whole thing by line breaks instead of reading them in line by line.
>
> If you really want to just avoid closing and re-opening a file, indeed
> seek() is your friend.
>
>
> Regards,
> Wolfgang
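
The suggestions quoted above can be sketched roughly as follows. Graeme's actual script is not shown in this digest, so the in-memory file, the patterns and the names count_matching and slurp_lines are all assumptions made for the illustration.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# a) One subroutine instead of repeated open/while/close blocks.
#    It rewinds the handle first, so the same open file can be scanned again.
sub count_matching {
    my ( $fh, $pattern ) = @_;
    seek( $fh, 0, SEEK_SET ) or die "Cannot rewind: $!";
    my $count = 0;
    while ( my $line = <$fh> ) {
        $count++ if $line =~ $pattern;
    }
    return $count;
}

# b) Slurp mode: read everything at once, then split on line breaks.
sub slurp_lines {
    my ($fh) = @_;
    seek( $fh, 0, SEEK_SET ) or die "Cannot rewind: $!";
    local $/;    # undef separator: the next read returns the whole file
    my $text = <$fh>;
    return split /\n/, defined $text ? $text : '';
}

# In-memory file as a stand-in for the real data files.
my $data = "alpha\nbeta\nalpha beta\n";
open( my $fh, '<', \$data ) or die "Cannot open: $!";

print count_matching( $fh, qr/alpha/ ), "\n";    # prints 2
print count_matching( $fh, qr/beta/ ),  "\n";    # prints 2

my @lines = slurp_lines($fh);
print scalar(@lines), "\n";                      # prints 3

close $fh;
```

The file is opened once and closed once; each pass simply seeks back to the start before reading.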
------------------------------
Date: 10 Sep 2004 22:13:11 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: Socket holding pattern
Message-Id: <cht8tn$ab1$1@mamenchi.zrz.TU-Berlin.DE>
Uri Guttman <uri@stemsystems.com> wrote in comp.lang.perl.misc:
> >>>>> "AS" == Anno Siegel <anno4000@lublin.zrz.tu-berlin.de> writes:
[...]
> and i agree, since this is only for a mud, who cares? kick all the
> players and let them reconnect.
"Only" a mud? What are you talking about? Break a player in action
and you have made an enemy for life! :)
Anno
------------------------------
Date: Sat, 11 Sep 2004 01:28:34 GMT
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: Socket holding pattern
Message-Id: <x73c1p8uic.fsf@mail.sysarch.com>
>>>>> "AS" == Anno Siegel <anno4000@lublin.zrz.tu-berlin.de> writes:
AS> Uri Guttman <uri@stemsystems.com> wrote in comp.lang.perl.misc:
>> >>>>> "AS" == Anno Siegel <anno4000@lublin.zrz.tu-berlin.de> writes:
AS> [...]
>> and i agree, since this is only for a mud, who cares? kick all the
>> players and let them reconnect.
AS> "Only" a mud? What are you talking about? Break a player in action
AS> and you have made an enemy for life! :)
it won't be the first one i have ever made, nor the last! :)
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 6991
***************************************