[30376] in Perl-Users-Digest
Perl-Users Digest, Issue: 1619 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat Jun 7 21:09:44 2008
Date: Sat, 7 Jun 2008 18:09:09 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Sat, 7 Jun 2008 Volume: 11 Number: 1619
Today's topics:
Re: Counting lines in big number of files - in parallel <hjp-usenet2@hjp.at>
Re: FAQ 4.67 Why does passing a subroutine an undefined <hjp-usenet2@hjp.at>
Re: FAQ 5.38 How do I select a random line from a file? <hjp-usenet2@hjp.at>
Re: FAQ 5.38 How do I select a random line from a file? <danrumney@warpmail.new>
Re: FAQ 5.38 How do I select a random line from a file? <brian.d.foy@gmail.com>
Re: FAQ 7.8 How do I declare/create a structure? <hjp-usenet2@hjp.at>
Few questions about arguments and subroutines/modules <telemach@go2.pl>
File Locked After Close? xmp333@yahoo.com
Re: File Locked After Close? <danrumney@warpmail.new>
Performance on Windows: Cygwin is much faster. Why? <dutch@example.com>
Re: Performance on Windows: Cygwin is much faster. Why? <ben@morrow.me.uk>
Re: Performance on Windows: Cygwin is much faster. Why? <dutch@example.com>
Re: Performance on Windows: Cygwin is much faster. Why? <hjp-usenet2@hjp.at>
Re: Performance on Windows: Cygwin is much faster. Why? <ben@morrow.me.uk>
Re: Performance on Windows: Cygwin is much faster. Why? <ben@morrow.me.uk>
Re: Perl CGI Issue <1usa@llenroc.ude.invalid>
Perl FAQ and Perl 5.10: Find the answers that can show <brian.d.foy@gmail.com>
Re: select and filehandle <3abd05ad-0d59-400f-9882-f700 <hjp-usenet2@hjp.at>
Re: Set breakpoint at a file/class in debugger <hjp-usenet2@hjp.at>
Re: XML::Parser Tree Style <hjp-usenet2@hjp.at>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Sat, 7 Jun 2008 18:49:54 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: Counting lines in big number of files - in parallel.
Message-Id: <slrng4lf1j.bkp.hjp-usenet2@hrunkner.hjp.at>
On 2008-06-03 15:28, Jürgen Exner <jurgenex@hotmail.com> wrote:
> hadzio@gmail.com wrote:
>>But the above sequential counting is very slow (2-3 hours). My server
>>is quite powerful (72 CPU and fast filesystems)
>
> I don't know what a 'fast filesystem' is, but whatever method you are
> going to use it will be I/O bound unless you have something like a RAID5
> or similar system with strongly distributed data over numerous physical
> HDDs.
Well, if he has 72 CPUs, it is very likely that he also has several
disks (and probably a lot of memory, too, so that his 25000 files may
even be already in the cache if he counts them often enough).
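If the data really is spread over several disks or already cached, the counting can be split across processes. The following is only a minimal sketch under that assumption: the names count_lines and count_lines_parallel are made up for illustration, the files are dealt out round-robin to forked children, and file names are assumed to contain neither tabs nor newlines.

```perl
use strict;
use warnings;
use POSIX ();

# Count newline characters in one file by reading it in 1 MB blocks.
sub count_lines {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my ($count, $buf) = (0, '');
    while (read($fh, $buf, 1 << 20)) {
        $count += ($buf =~ tr/\n//);
    }
    close $fh;
    return $count;
}

# Deal @$files out to $workers child processes. Each child writes
# "count<TAB>path" lines to its own pipe; the parent collects them.
sub count_lines_parallel {
    my ($files, $workers) = @_;
    my @readers;
    for my $w (0 .. $workers - 1) {
        my @share = @{$files}[ grep { $_ % $workers == $w } 0 .. $#$files ];
        next unless @share;
        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork;
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                       # child process
            close $reader;
            print {$writer} count_lines($_), "\t$_\n" for @share;
            close $writer;
            POSIX::_exit(0);                   # skip END blocks and parent buffers
        }
        close $writer;                         # parent keeps only the read end
        push @readers, $reader;
    }
    my %count;
    for my $reader (@readers) {
        while (my $line = <$reader>) {
            chomp $line;
            my ($n, $path) = split /\t/, $line, 2;
            $count{$path} = $n;
        }
        close $reader;
    }
    1 while wait() != -1;                      # reap all children
    return \%count;
}
```

Whether this beats the sequential loop depends entirely on where the bottleneck is; on a single spindle it may well be slower.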
hp
------------------------------
Date: Sat, 7 Jun 2008 18:23:39 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: FAQ 4.67 Why does passing a subroutine an undefined element in a hash create it?
Message-Id: <slrng4ldgc.bkp.hjp-usenet2@hrunkner.hjp.at>
On 2008-06-05 19:03, PerlFAQ Server <brian@stonehenge.com> wrote:
> 4.67: Why does passing a subroutine an undefined element in a hash create it?
>
> If you say something like:
>
> somefunc($hash{"nonesuch key here"});
>
> Then that element "autovivifies"; that is, it springs into existence
> whether you store something there or not. That's because functions get
> scalars passed in by reference. If somefunc() modifies $_[0], it has to
> be ready to write it back into the caller's version.
>
> This has been fixed as of Perl5.004.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Then the answer to the question should be: "Because you are using a very
old version of perl."
I think that FAQs referring to behaviour of old perl versions should be
dropped after some time. I'm not sure what that time should be but 11
years ought to be enough.
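For reference, a small check of what a current perl does. This is a sketch; the subs peek and assign are made up for the demonstration.

```perl
use strict;
use warnings;

my %hash;

# Only reads its argument.
sub peek   { return defined $_[0] ? 1 : 0 }

# Writes through the alias in @_, so the caller's element must exist afterwards.
sub assign { $_[0] = 'set by callee'; return }

peek($hash{"nonesuch key here"});
print exists $hash{"nonesuch key here"} ? "created\n" : "not created\n";

assign($hash{"another key"});
print exists $hash{"another key"} ? "created\n" : "not created\n";
```

On any perl from 5.004 on, the first call leaves the hash untouched and only the second, which actually assigns to $_[0], creates the element.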
hp
------------------------------
Date: Sat, 7 Jun 2008 19:44:44 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: FAQ 5.38 How do I select a random line from a file?
Message-Id: <slrng4li8c.bkp.hjp-usenet2@hrunkner.hjp.at>
On 2008-06-04 19:03, PerlFAQ Server <brian@stonehenge.com> wrote:
> 5.38: How do I select a random line from a file?
>
>
> Here's an algorithm from the Camel Book:
>
> srand;
> rand($.) < 1 && ($line = $_) while <>;
That has the disadvantage of having to read the whole file sequentially.
If your file is large, and the lines are roughly of equal length, it is
probably preferable to just do a random seek into the file and read the
line you've hit (by searching backwards and forwards for $/).
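A rough sketch of the seek approach, assuming that $/ is "\n" and that the file is opened in raw mode so that byte offsets are meaningful. The sub name random_line_by_seek is made up. Note that a line is chosen with probability proportional to its length, so this only approximates a uniform choice when the lines are of similar length.

```perl
use strict;
use warnings;

sub random_line_by_seek {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $size = -s $fh or return;          # empty file: nothing to pick
    my $pos  = int rand $size;            # random byte offset

    # Walk backwards until the byte before $pos is a newline
    # (or we reach the start of the file).
    while ($pos > 0) {
        seek($fh, $pos - 1, 0) or die "seek failed: $!";
        read($fh, my $byte, 1) or die "read failed: $!";
        last if $byte eq "\n";
        $pos--;
    }

    # $pos is now the start of the line we landed in; read it forwards.
    seek($fh, $pos, 0) or die "seek failed: $!";
    my $line = <$fh>;
    close $fh;
    return $line;
}
```

Reading backwards one byte at a time is fine for a sketch; for very long lines one would read backwards in blocks instead.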
hp
------------------------------
Date: Sat, 07 Jun 2008 14:59:54 -0400
From: Dan Rumney <danrumney@warpmail.new>
Subject: Re: FAQ 5.38 How do I select a random line from a file?
Message-Id: <484adab2$0$30196$4c368faf@roadrunner.com>
Peter J. Holzer wrote:
> On 2008-06-04 19:03, PerlFAQ Server <brian@stonehenge.com> wrote:
>> 5.38: How do I select a random line from a file?
>>
>>
>> Here's an algorithm from the Camel Book:
>>
>> srand;
>> rand($.) < 1 && ($line = $_) while <>;
>
> That has the disadvantage of having to read the whole file sequentially.
Agreed
> If your file is large, and the lines are roughly of equal length, it is
> probably preferable to just do a random seek into the file and read the
> line you've hit (by searching backwards and forwards for $/).
The problem with that is that the probability of selecting a line
becomes a function of that line's length.
The algorithm above ensures that the probability of selecting a line is
precisely 1/N where N is the total number of lines in the file.
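The reasoning is short: line k replaces the current choice with probability 1/k and then has to survive lines k+1 .. N, which happens with probability (k/(k+1)) * ((k+1)/(k+2)) * ... * ((N-1)/N) = k/N, giving 1/k * k/N = 1/N for every line. Written out as a sub (the name random_line is made up; the logic is the same as the one-liner):

```perl
use strict;
use warnings;

# Return one line of the filehandle, each with probability 1/N,
# without knowing N in advance and without keeping the file in memory.
sub random_line {
    my ($fh) = @_;
    my ($chosen, $n) = (undef, 0);
    while (defined(my $candidate = <$fh>)) {
        $n++;
        # rand($n) < 1 is true with probability 1/$n,
        # so the first line is always taken initially.
        $chosen = $candidate if rand($n) < 1;
    }
    return $chosen;
}
```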
------------------------------
Date: Sun, 08 Jun 2008 01:34:57 +0100
From: brian d foy <brian.d.foy@gmail.com>
Subject: Re: FAQ 5.38 How do I select a random line from a file?
Message-Id: <080620080134579524%brian.d.foy@gmail.com>
In article <484adab2$0$30196$4c368faf@roadrunner.com>, Dan Rumney
<danrumney@warpmail.new> wrote:
> Peter J. Holzer wrote:
> > On 2008-06-04 19:03, PerlFAQ Server <brian@stonehenge.com> wrote:
> >> 5.38: How do I select a random line from a file?
> > If your file is large, and the lines are roughly of equal length, it is
> > probably preferable to just do a random seek into the file and read the
> > line you've hit (by searching backwards and forwards for $/).
> The problem with that is that the probability of selecting a line
> becomes a function of that line's length.
>
> The algorithm above ensures that the probability of selecting a line is
> precisely 1/N where N is the total number of lines in the file.
From time to time, I think about how I would solve this problem if I
actually needed it for something important. That is, not as some
thought experiment or quote-for-the-sig thing.
Has anyone solved this for something non-trivial? Say, for something
with huge numbers of lines or large file sizes, or where you want to
choose several random lines?
The solution that I think about (but have never implemented), is some
sort of pre-indexing of line endings so you have a list of file
positions where all the lines end (or start, or whatever). When you
want a random line, you choose a random element from that list. Open
the file, seek, and read a line.
I guess if it really mattered, you'd put every line in some sort of
persistence thingy and choose a random record without having to
read a file at all.
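A minimal sketch of the pre-indexing idea, with made-up sub names (build_line_index, random_indexed_line). It assumes the file does not change between building the index and using it; the index here lives in memory, and storing it on disk is left out.

```perl
use strict;
use warnings;

# Record the byte offset at which every line starts.
sub build_line_index {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my @offsets;
    my $pos = 0;
    while (defined(my $line = <$fh>)) {
        push @offsets, $pos;
        $pos = tell $fh;                  # start of the next line
    }
    close $fh;
    return \@offsets;
}

# Pick a random entry from the index, seek there and read one line.
sub random_indexed_line {
    my ($path, $offsets) = @_;
    return unless @$offsets;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    seek($fh, $offsets->[ int rand @$offsets ], 0) or die "seek failed: $!";
    my $line = <$fh>;
    close $fh;
    return $line;
}
```

Every line is equally likely regardless of its length, and choosing several random lines only costs one seek each once the index exists.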
------------------------------
Date: Sat, 7 Jun 2008 20:04:23 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: FAQ 7.8 How do I declare/create a structure?
Message-Id: <slrng4ljd7.bkp.hjp-usenet2@hrunkner.hjp.at>
On 2008-06-06 01:03, PerlFAQ Server <brian@stonehenge.com> wrote:
> 7.8: How do I declare/create a structure?
>
> In general, you don't "declare" a structure. Just use a (probably
> anonymous) hash reference. See perlref and perldsc for details. Here's
> an example:
>
> $person = {}; # new anonymous hash
> $person->{AGE} = 24; # set field AGE to 24
> $person->{NAME} = "Nat"; # set field NAME to "Nat"
>
> If you're looking for something a bit more rigorous, try perltoot.
Or perldoc fields.
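A small sketch of what that looks like, reusing the NAME and AGE fields from the FAQ example. The class name Person and its constructor are made up for illustration.

```perl
package Person;
use strict;
use warnings;
use fields qw(NAME AGE);          # declare the allowed fields

sub new {
    my ($class, %args) = @_;
    my Person $self = fields::new($class);
    $self->{NAME} = $args{NAME};
    $self->{AGE}  = $args{AGE};
    return $self;
}

package main;

my $person = Person->new(NAME => 'Nat', AGE => 24);
print "$person->{NAME} is $person->{AGE}\n";

# A misspelled field is rejected instead of silently creating a new key.
my $ok = eval { $person->{AGEE} = 25; 1 };
print "typo rejected: $@" unless $ok;
```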
hp
------------------------------
Date: Sat, 7 Jun 2008 15:50:31 -0700 (PDT)
From: Telemach <telemach@go2.pl>
Subject: Few questions about arguments and subroutines/modules
Message-Id: <bcedc05a-dca5-441d-9739-6baf58d49f78@y38g2000hsy.googlegroups.com>
I'm a newbie looking for an online tutorial about advanced use
of arguments for subroutines and modules.
Let's say I have a script that would take 3 arguments
first - single word
second - sentence
third - path
plus an additional one that would display help
How do I:
- declare a default value for the first argument if the user does not
provide one
- accept a full sentence; right now the script takes only the
first word of the sentence
- create a condition that would, for example, download a certain file
if the third argument (which is a path) is provided, and skip the
download if it is not
- display help when run as, for example, example.pl -h
I was browsing through different tutorials but haven't yet found answers
to the above.
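One possible approach, sketched with the core module Getopt::Long. The option names --word, --sentence and --path and the subs parse_args, usage and run are assumptions made for this example, and the download is only reported, not performed. A sentence arrives as a single argument only if it is quoted on the command line, for example: example.pl --sentence "a full sentence".

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parse a list of arguments into a hash; 'word' gets a default value.
sub parse_args {
    my @argv = @_;
    my %opt = (word => 'default');
    GetOptionsFromArray(
        \@argv, \%opt,
        'word=s', 'sentence=s', 'path=s', 'help|h',
    ) or die "Bad options; try -h\n";
    return \%opt;
}

sub usage {
    return "Usage: example.pl [--word WORD] [--sentence \"SOME TEXT\"] "
         . "[--path PATH] [-h]\n";
}

# Describe what the script would do for the given arguments.
sub run {
    my $opt = parse_args(@_);
    return usage() if $opt->{help};
    my @report = ("word=$opt->{word}");
    push @report, "sentence=$opt->{sentence}" if defined $opt->{sentence};
    # Only a supplied path triggers the (hypothetical) download.
    push @report, "would download $opt->{path}" if defined $opt->{path};
    return join('; ', @report) . "\n";
}

print run('--sentence', 'a full sentence');
```

In a real script the last line would be print run(@ARGV).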
- Telemach -
------------------------------
Date: Sat, 7 Jun 2008 12:37:17 -0700 (PDT)
From: xmp333@yahoo.com
Subject: File Locked After Close?
Message-Id: <5ad1b4cb-926e-4f38-b469-30c825a1c349@s50g2000hsb.googlegroups.com>
Hi,
The last line of the following code snippet fails:
open FILE, '<file.dat' || die '...';
process(FILE);
close FILE;
system('command file.dat');
It looks like FILE is still locked because if I break out the last
line into a separate script, it works. In fact, if I open FILE on
another file, the last line also works. I'd like to fix the issue
with something more elegant than re-opening the handle on a dummy
file.
Any suggestions? I'm using ActiveState Perl on Windows running in a
DOS shell.
Thanks.
------------------------------
Date: Sat, 07 Jun 2008 16:10:12 -0400
From: Dan Rumney <danrumney@warpmail.new>
Subject: Re: File Locked After Close?
Message-Id: <484aeb2c$0$30171$4c368faf@roadrunner.com>
xmp333@yahoo.com wrote:
> Hi,
>
> The last line of the following code snippet fails:
>
> open FILE, '<file.dat' || die '...';
> process(FILE);
> close FILE;
> system('command file.dat');
>
[snip]
> Any suggestions? I'm using ActiveState Perl on Windows running in a
> DOS shell.
Try checking the return code of the close function to see if there's
some kind of problem preventing the file from closing.
If that doesn't show anything, can you provide the output from your program?
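One thing worth ruling out while checking return codes: in the quoted snippet the die can never run, because || binds more tightly than the comma, so the die belongs to the filename string rather than to open. That may or may not be related to the failure. A sketch of the difference, using a made-up file name and a made-up sub process_then_run that stands in for the open/process/close/system sequence:

```perl
use strict;
use warnings;

my $missing = 'no-such-file.dat';

# Parsed as open($unchecked, ("<$missing" || die ...)): the string is
# always true, so a failed open goes unnoticed.
my $opened = open my $unchecked, "<$missing" || die "never reached";
print "open failed silently\n" unless $opened;

# With 'or' the result of open itself is tested.
my $reported = eval {
    open my $fh, '<', $missing or die "Cannot open $missing: $!\n";
    1;
};
print "caught: $@" unless $reported;

# Checked version of the whole sequence.
sub process_then_run {
    my ($file, @command) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my $lines = 0;
    $lines++ while <$fh>;                 # stand-in for process()
    close $fh or die "Cannot close $file: $!";
    my $status = system(@command, $file);
    die "command failed with status $status" if $status != 0;
    return $lines;
}
```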
------------------------------
Date: Sat, 7 Jun 2008 14:26:58 +0200
From: "Dutch" <dutch@example.com>
Subject: Performance on Windows: Cygwin is much faster. Why?
Message-Id: <484a7e93$0$14352$e4fe514c@news.xs4all.nl>
Below is a small program to calculate some values using Lehmer's algorithm.
Running it on Windows XP, I found some rather strange performance facts.
Using ActiveState 5.10, it runs in 1117 seconds.
With Strawberry 5.10, it takes 945 seconds.
But with Cygwin (using version Perl 5.8), on the same machine, it finishes
in 488 seconds, it is twice as fast!
These results are reproducible.
Why would Cygwin be faster by such a large margin? Is it because of the
underlying libraries being more efficient or am I overlooking something?
Following, the program:
# Given: G(n) = ( 7**5 * G(n-1) ) % ( 2**31 - 1 )
# Find : X = G( 943683858) = 133481
# Y = G(1657543960) = 447352
$| = 1;
my $starttime = time();
my $n = 0;
my $g = 555;
my $multiplier = 7**5;
my $mask = 2**31 - 1;
for (; $n < 943_683_858; $n++) { $g = ($multiplier * $g) % $mask; }
my $seconds = time() - $starttime;
print "Found X=$g in $seconds seconds, ";
for (; $n < 1_657_543_960; $n++) { $g = ($multiplier * $g) % $mask; }
$seconds = time() - $starttime;
print "Y=$g in $seconds seconds.\n";
# END OF PROGRAM
------------------------------
Date: Sat, 7 Jun 2008 21:45:33 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Performance on Windows: Cygwin is much faster. Why?
Message-Id: <dk8rh5-oke1.ln1@osiris.mauzo.dyndns.org>
Quoth "Dutch" <dutch@example.com>:
> Below is a small program to calculate some values using Lehmer's algorithm.
> Running it on Windows XP, I found some rather strange performance facts.
>
> Using ActiveState 5.10, it runs in 1117 seconds.
>
> With Strawberry 5.10, it takes 945 seconds.
>
> But with Cygwin (using version Perl 5.8), on the same machine, it finishes
> in 488 seconds, it is twice as fast!
Can you test this with Cygwin 5.10 and Strawberry 5.8? I suspect the
difference is more likely to be a 5.10 slowdown than a Cygwin speedup;
if this is the case, it should probably be reported to p5p.
Also, can you post the results of
perl -MDevel::Peek -e"Dump 1_657_543_960 * 7**5"
with each perl? If some of your perls have 32bit and some 64bit
integers, that is likely to make a difference to performance.
Ben
--
Razors pain you / Rivers are damp
Acids stain you / And drugs cause cramp. [Dorothy Parker]
Guns aren't lawful / Nooses give
Gas smells awful / You might as well live. ben@morrow.me.uk
------------------------------
Date: Sat, 7 Jun 2008 23:20:42 +0200
From: "Dutch" <dutch@example.com>
Subject: Re: Performance on Windows: Cygwin is much faster. Why?
Message-Id: <484afbac$0$14345$e4fe514c@news.xs4all.nl>
"Ben Morrow" <ben@morrow.me.uk> wrote:
> Can you test this with Cygwin 5.10 and Strawberry 5.8?
Sure, I'll try that tomorrow.
> Also, can you post the results of
> perl -MDevel::Peek -e"Dump 1_657_543_960 * 7**5"
Can't do it for ActiveState right now, but:
==========
Strawberry
==========
perl -v says this:
This is perl, v5.10.0 built for MSWin32-x86-multi-thread
The output you requested:
SV = NV(0x9d59fc) at 0x9b98b4
REFCNT = 1
FLAGS = (PADTMP,NOK,READONLY,pNOK)
NV = 27858341335720
======
Cygwin
======
perl -v says this:
This is perl, v5.8.8 built for cygwin-thread-multi-64int
The output you requested:
SV = IV(0x10029318) at 0x10010fc0
REFCNT = 1
FLAGS = (PADBUSY,PADTMP,IOK,READONLY,pIOK)
IV = 27858341335720
I'm not sure what the output means, but I see that the version under Cygwin
talks about 64int. Perhaps this (as you mentioned) points to using 64-bit
integers? I can see why that would speed things up. However, I'm guessing
here.
This all runs on an Intel Core Duo processor, btw, not sure if it's
relevant.
Dutch
------------------------------
Date: Sun, 8 Jun 2008 00:13:22 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: Performance on Windows: Cygwin is much faster. Why?
Message-Id: <slrng4m202.moo.hjp-usenet2@hrunkner.hjp.at>
On 2008-06-07 20:45, Ben Morrow <ben@morrow.me.uk> wrote:
> Quoth "Dutch" <dutch@example.com>:
>> Below is a small program to calculate some values using Lehmer's algorithm.
>> Running it on Windows XP, I found some rather strange performance facts.
>>
>> Using ActiveState 5.10, it runs in 1117 seconds.
>>
>> With Strawberry 5.10, it takes 945 seconds.
>>
>> But with Cygwin (using version Perl 5.8), on the same machine, it finishes
>> in 488 seconds, it is twice as fast!
>
> Can you test this with Cygwin 5.10 and Strawberry 5.8? I suspect the
> difference is more likely to be a 5.10 slowdown than a Cygwin speedup;
> if this is the case, it should probably be reported to p5p.
5.10 seems to be a bit slower, but not much:
5.8.8 as included in Debian Etch:
Found X=133481 in 325 seconds, Y=447352 in 573 seconds.
5.10.0 (compiled from source with default options):
Found X=133481 in 343 seconds, Y=447352 in 597 seconds.
(both run on the same system:
Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz)
> with each perl? If some of your perls have 32bit and some 64bit
> integers, that is likely to make a difference to performance.
Found X=133481 in 139 seconds, Y=447352 in 245 seconds.
(Intel(R) Xeon(R) CPU X5355 @ 2.66GHz)
So 64 bit at 43% higher clock rate seems to be 134% faster than 32 bit -
quite impressive (but it's a different CPU, too - so that might be
deceptive)
hp
------------------------------
Date: Sun, 8 Jun 2008 00:00:34 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Performance on Windows: Cygwin is much faster. Why?
Message-Id: <ihgrh5-0ak1.ln1@osiris.mauzo.dyndns.org>
Quoth "Dutch" <dutch@example.com>:
>
> perl -v says this:
> This is perl, v5.10.0 built for MSWin32-x86-multi-thread
>
> The output you requested:
>
> SV = NV(0x9d59fc) at 0x9b98b4
> REFCNT = 1
> FLAGS = (PADTMP,NOK,READONLY,pNOK)
> NV = 27858341335720
This means some of your intermediate results are being stored as NVs,
that is, as C doubles, because they are too big to fit in an IV, which
in this case is a 32bit C long.
> perl -v says this:
> This is perl, v5.8.8 built for cygwin-thread-multi-64int
>
> The output you requested:
>
> SV = IV(0x10029318) at 0x10010fc0
> REFCNT = 1
> FLAGS = (PADBUSY,PADTMP,IOK,READONLY,pIOK)
> IV = 27858341335720
This means that all intermediate results are IVs, which in this case
are 64bit C long longs. Integer arithmetic is much faster than floating-
point arithmetic (and, probably, arithmetic that has to keep
converting from one to the other is slower than either), so that's why
cygwin is faster.
AFAICT, it is not possible to build a perl for Win32 with the
equivalent of use64bitint, the way the cygwin perl is built, so if you
need speed, either get a 64bit processor and OS and build a 64bit perl
or stick to cygwin.
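A quick way to see which kind of perl you have without Devel::Peek is the core Config module. This is just a sketch that prints the relevant build settings and compares them with the largest product the loop can produce.

```perl
use strict;
use warnings;
use Config;

my $iv_bits = 8 * $Config{ivsize};
printf "IV is %d bits, use64bitint=%s\n",
    $iv_bits, $Config{use64bitint} || 'undef';

# Largest product in the Lehmer loop: multiplier * (modulus - 1),
# a little under 2**46.
my $worst_case = 7**5 * (2**31 - 2);
print "worst-case product: $worst_case\n";
print $iv_bits >= 64
    ? "products fit in an IV\n"
    : "products do not fit in an IV and are held as NVs\n";
```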
Ben
--
#!/bin/sh
quine="echo 'eval \$quine' >> \$0; echo quined"
eval $quine
# [ben@morrow.me.uk]
------------------------------
Date: Sun, 8 Jun 2008 00:04:25 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Performance on Windows: Cygwin is much faster. Why?
Message-Id: <pogrh5-0ak1.ln1@osiris.mauzo.dyndns.org>
Quoth "Peter J. Holzer" <hjp-usenet2@hjp.at>:
>
> 5.10 seems to be a bit slower, but not much:
>
> 5.8.8 as included in Debian Etch:
> Found X=133481 in 325 seconds, Y=447352 in 573 seconds.
>
> 5.10.0 (compiled from source with default options):
> Found X=133481 in 343 seconds, Y=447352 in 597 seconds.
Both of these were built with 32bit IVs (why on earth do Debian do that?
The stock FreeBSD perl is built with 64bit IVs, even on a 32bit
system...)
> (both run on the same system:
> Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz)
>
<snip>
> Found X=133481 in 139 seconds, Y=447352 in 245 seconds.
>
> (Intel(R) Xeon(R) CPU X5355 @ 2.66GHz)
This perl was presumably built with 64bit IVs...
> So 64 bit at 43% higher clock rate seems to be 134% faster than 32 bit -
> quite impressive (but it's a different CPU, too - so that might be
> deceptive)
...so, again, you're comparing FP to integer arithmetic.
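For a perl with 32bit IVs, one way to keep this particular calculation in integers is Schrage's method, which nobody in the thread has proposed; it is shown here only as an alternative technique. It rewrites the multiplication so that no intermediate value exceeds 2**31 - 1. The sub name lehmer_next is made up, and whether this is actually faster than the floating-point version would need measuring.

```perl
use strict;
use warnings;

my $A = 16807;          # multiplier, 7**5
my $M = 2147483647;     # modulus, 2**31 - 1
my $Q = 127773;         # int($M / $A)
my $R = 2836;           # $M % $A

# Compute ($A * $g) % $M for 0 < $g < $M using only values that
# fit in a signed 32-bit integer (Schrage's method).
sub lehmer_next {
    my ($g) = @_;
    use integer;
    my $t = $A * ($g % $Q) - $R * ($g / $Q);
    return $t < 0 ? $t + $M : $t;
}

my $g = 555;
$g = lehmer_next($g) for 1 .. 3;
print "G(3) = $g\n";
```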
Ben
--
Razors pain you / Rivers are damp
Acids stain you / And drugs cause cramp. [Dorothy Parker]
Guns aren't lawful / Nooses give
Gas smells awful / You might as well live. ben@morrow.me.uk
------------------------------
Date: Sat, 07 Jun 2008 12:21:21 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude.invalid>
Subject: Re: Perl CGI Issue
Message-Id: <Xns9AB654FE33848asu1cornelledu@127.0.0.1>
Ben Morrow <ben@morrow.me.uk> wrote in
news:hn0ph5-epf.ln1@osiris.mauzo.dyndns.org:
>
> Quoth xhoster@gmail.com:
>> Eric <venner@gmail.com> wrote:
>> >
>> > system("cmd & perl backend.pl $pdbid $sum"); # takes about 4
>> > minutes to run
>>
>> "cmd &" looks like something meant for Windows.
...
>
> My suspicion is the OP has misinterpreted the line
>
> You could also use
>
> system("cmd &")
>
> in perldoc -q background. I have no idea why it 'worked' at all,
> unless the OP *is* on windows, where the command shell treats '&' as a
> statement separator, much like ';' in sh.
It does. Good catch.
> To the OP: that statement means you should run
>
> system("perl backend.pl $pdbid $sum &");
system( 'start', 'perl', 'backend.pl', $pdbid, $sum );
might be the equivalent.
Sinan
--
A. Sinan Unur <1usa@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)
comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/
------------------------------
Date: Sun, 08 Jun 2008 01:19:50 +0100
From: brian d foy <brian.d.foy@gmail.com>
Subject: Perl FAQ and Perl 5.10: Find the answers that can show off the new features
Message-Id: <080620080119501291%brian.d.foy@gmail.com>
Hi All,
I just updated a few perlfaq answers in perlfaq4 for the new Perl 5.10
features. For those of you who check the answers from the auto-poster,
I'd appreciate it if you think about how each answer might be improved
for Perl 5.10 (but still work for people using an older version). I
know some of you are doing this already and it's already brought
several answers up-to-date, but I think a lot of answers could be a lot
simpler now :)
Thanks,
------------------------------
Date: Sat, 7 Jun 2008 16:15:25 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: select and filehandle <3abd05ad-0d59-400f-9882-f70061e8f851@m45g2000hsb.googlegroups.com> <16757be3-6f65-4d5e-9042-87e3deeb6441@w7g2000hsa.googlegroups.com>
Message-Id: <slrng4l5vu.bkp.hjp-usenet2@hrunkner.hjp.at>
On 2008-06-06 01:18, John W. Krahn <someone@example.com> wrote:
> Jim Cochrane wrote:
>> From: Jim Cochrane <allergic-to-spam@no-spam-allowed.org>
>> Subject: Re: select and filehandle
>> <3abd05ad-0d59-400f-9882-f70061e8f851@m45g2000hsb.googlegroups.com>
>> <16757be3-6f65-4d5e-9042-87e3deeb6441@w7g2000hsa.googlegroups.com>
>> User-Agent: slrn/0.9.8.1pl2 (Linux)
>
>
> Jim,
>
> Your "References:" header is missing and your references are showing up
> in the "Subject:" line.
Known problem with Google, slrn, and some news servers:
* Google creates extremely long message-ids.
* Some versions of slrn insert a line break before the first message-id
if the first message-id + "References: " is longer than 78 characters.
* Some news servers "repair" the broken header by removing the
"References:" line, which appends the references to the preceding
header field (usually the Subject).
Current versions of slrn don't have that problem.
hp
------------------------------
Date: Sat, 7 Jun 2008 20:08:09 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: Set breakpoint at a file/class in debugger
Message-Id: <slrng4ljk9.bkp.hjp-usenet2@hrunkner.hjp.at>
On 2008-06-07 00:57, Hongyu <me@hongyu.org> wrote:
> Does anyone know how to set a breakpoint at a file or class that is
> not in the main program file when using debugger?
>
> For example, I have a Perl class named MyClass.pm, and my main program
> is named main.pl which use MyClass module. When I launch the debugger
> by typing "perl -d main.pl", I want to set a breakpoint at line 100 of
> MyClass.pm.
Don't know about line numbers, but you can set break points at any sub
by using the full name. So if line 100 is in MyClass::mysub, you can
first set a breakpoint at the start of mysub:
b MyClass::mysub
and when you hit that breakpoint, you can set another one at line 100.
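For line numbers, the debugger's f command (documented in perldoc perldebug) switches the current file, after which b takes a plain line number. A sketch of such a session, assuming MyClass.pm has already been loaded (which is the case for a module pulled in with use) and that line 100 holds a breakable statement:

```
  DB<1> f MyClass.pm
  DB<2> b 100
  DB<3> c
```

For a file that is only loaded later with require, perldebug also describes b load and b postpone.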
hp
------------------------------
Date: Sat, 7 Jun 2008 19:19:23 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: XML::Parser Tree Style
Message-Id: <slrng4lgos.bkp.hjp-usenet2@hrunkner.hjp.at>
On 2008-06-03 13:59, Ben Bullock <benkasminbullock@gmail.com> wrote:
> On Tue, 03 Jun 2008 05:56:16 -0700, NiallBCarter wrote:
>
>> Well, many thanks but you did not answer a single question of mine.
>
> That's the Way of the World here, you have to get everything right first time.
That's wrong in both ways: Getting everything right the first time is
neither necessary nor sufficient to get your questions answered. But it
certainly increases the probability of the thread staying on-topic at
least for some time before branching wildly into random directions.
[...]
>> So as I originally stated, could anyone help me to try to use the Tree
>> Style to parse out the value contained in the firstname element?
>
> That's easy:
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
>
> use XML::Parser;
> use Data::Dumper;
> my $xml=<<EOF;
><?xml version='1.0' encoding='UTF-8'?>
><list name="Nialls list">
> <person>
> <firstname>Niall</firstname>
> <lastname>Carter</lastname>
> <age>24</age>
> </person>
> <person>
> <firstname>Ruth</firstname>
> <lastname>Brewster</lastname>
> <age>22</age>
> </person>
> <person>
> <firstname>Cas</firstname>
> <lastname>Creer</lastname>
> <age>23</age>
> </person>
></list>
> EOF
> my $p = new XML::Parser( Style => 'Tree' );
> my $inputfile = "testxml.xml";
> my $tree = $p->parse($xml);
>
> my $stuff = Dumper( $tree );
> $stuff =~ s/\s//g;
> while ($stuff =~/firstname.*?(\w+)'\]/g) {
> print "$1\n";
> }
Ouch!
First parsing one text format into a tree, then serializing the tree
into a different text format, and finally parsing that text format using
regexes is really evil. Besides, it doesn't work. Consider
<firstname>Mary-Ann</firstname>, for which your code prints "Ann\n".
Several good ways of doing it have already been proposed, but all of
them avoided using the tree structure provided by XML::Parser. So here's
one which uses it:
#!/usr/bin/perl
use XML::Parser;
use Data::Dumper;
use strict;
use warnings;

my $parser = XML::Parser->new( Style => 'Tree' );
my $tree = $parser->parsefile( $ARGV[0] );
die "root element is not <list>" unless $tree->[0] eq 'list';
my $list = $tree->[1];
# $list->[0] holds the attributes; after that come (tag, content)
# pairs, so step through the tags two elements at a time.
for (my $i = 1; $i < $#{ $list }; $i += 2) {
    if ($list->[$i] eq 'person') {
        my $person = $list->[$i+1];
        # assume that firstname is always the second component of
        # person:
        die "firstname is not where expected"
            unless $person->[3] eq 'firstname';
        print $person->[4][2], "\n";
    }
}
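The fixed positions can be avoided with two small helpers that walk the (tag, content) pairs. This is a sketch with made-up names (children_named, text_of); it works on the data structure that the Tree style returns, and the hand-built $tree below merely stands in for the result of parsefile so that the example runs without an input file.

```perl
use strict;
use warnings;

# Return the content arrays of all direct child elements named $name.
# $content is [ \%attributes, tag, content, tag, content, ... ].
sub children_named {
    my ($content, $name) = @_;
    my @found;
    for (my $i = 1; $i < $#{$content}; $i += 2) {
        my ($tag, $value) = @{$content}[ $i, $i + 1 ];
        push @found, $value if $tag eq $name;
    }
    return @found;
}

# Concatenate the direct text children (pseudo-tag 0) of an element.
sub text_of {
    my ($content) = @_;
    my $text = '';
    for (my $i = 1; $i < $#{$content}; $i += 2) {
        $text .= $content->[ $i + 1 ] if $content->[$i] eq '0';
    }
    return $text;
}

# Same shape as the Tree style output for a one-person list.
my $tree = [
    'list',
    [   { name => 'Nialls list' },
        0, "\n  ",
        'person',
        [ {}, 0, "\n    ", 'firstname', [ {}, 0, 'Mary-Ann' ], 0, "\n  " ],
        0, "\n",
    ],
];

for my $person (children_named($tree->[1], 'person')) {
    print text_of($_), "\n" for children_named($person, 'firstname');
}
```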
hp
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 1619
***************************************