[28879] in Perl-Users-Digest
Perl-Users Digest, Issue: 123 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Feb 9 23:41:26 2007
Date: Fri, 9 Feb 2007 20:40:52 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Fri, 9 Feb 2007 Volume: 11 Number: 123
Today's topics:
Sending data via ftp <mdudley@king-cart.com>
Re: Sending data via ftp xhoster@gmail.com
Re: Sending data via ftp <tony_curtis32@yahoo.com>
Re: Sending data via ftp anno4000@radom.zrz.tu-berlin.de
Re: Sending data via ftp anno4000@radom.zrz.tu-berlin.de
Re: separating attribution, quoted text, and sigs from <a24061@yahoo.com>
Simple REGEX <gabyr@yahoo.co.uk>
Re: Simple REGEX <abigail@abigail.be>
Re: Simple REGEX <wahab-mail@gmx.de>
Re: Simple REGEX <lambik@kieffer.nl>
Simple XML question ... <stephen.odonnell@gmail.com>
Re: Simple XML question ... <bik.mido@tiscalinet.it>
Re: Simple XML question ... <stephen.odonnell@gmail.com>
Re: Simple XML question ... <mirod@xmltwig.com>
Re: Simple XML question ... <bik.mido@tiscalinet.it>
Re: Simple XML question ... <stephen.odonnell@gmail.com>
Re: Simple XML question ... <zentara@highstream.net>
Someone can tell me how integrate VC++ with perl <rafael.avaria@gmail.com>
Re: Someone can tell me how integrate VC++ with perl <wahab-mail@gmx.de>
Re: Someone can tell me how integrate VC++ with perl <bik.mido@tiscalinet.it>
Re: Someone can tell me how integrate VC++ with perl <sisyphus1@nomail.afraid.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Thu, 08 Feb 2007 19:57:23 GMT
From: Marshall Dudley <mdudley@king-cart.com>
Subject: Sending data via ftp
Message-Id: <DgLyh.2439$G23.1253@newsreading01.news.tds.net>
Is there any way to send a variable via ftp with perl? I am trying to
use NET::FTP, but I cannot find any way to tell it to send the data to
the other end. All I can find is how to send a file from the hard drive to
the remote, but the data is not in a file; it is created by the script
that is attempting to send it and is held in a simple text variable.
Thanks,
Marshall
------------------------------
Date: 08 Feb 2007 20:18:50 GMT
From: xhoster@gmail.com
Subject: Re: Sending data via ftp
Message-Id: <20070208151921.480$1p@newsreader.com>
Marshall Dudley <mdudley@king-cart.com> wrote:
> Is there any way to send a variable via ftp with perl? I am trying to
> use NET::FTP, but I cannot find any way to tell it to send the data to
> the other end. All I can find is to send a file from the hard drive to
> the remote, but the data is not in a file, but created by the script
> that is attempting to send it and in a simply text variable
Net::FTP (case matters) can send either files, or the contents read from a
file handle.
Either save the variable to a temp file and then send the file, or
make the variable look like a filehandle. For doing the latter, see
IO::Scalar.
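A minimal sketch of the filehandle approach, assuming Perl 5.8 or later, where open can read from a scalar reference directly (IO::Scalar provides a similar handle on older Perls). The send_scalar helper and the FakeFTP stand-in are invented for illustration and are not part of Net::FTP; with a real connection you would pass a logged-in Net::FTP object instead.

```perl
use strict;
use warnings;

# Hypothetical helper: upload the contents of a scalar through any
# object with a Net::FTP-style put( HANDLE, REMOTE_FILE ) method.
sub send_scalar {
    my ($ftp, $data, $remote) = @_;
    # Open a read-only filehandle on the scalar itself (no temp file).
    open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
    my $ok = $ftp->put($fh, $remote);
    close $fh;
    return $ok;
}

# Stand-in for a real connection so the sketch runs without a server.
# It only records what put() was given; it is NOT part of Net::FTP.
package FakeFTP;
sub new { my $class = shift; return bless { sent => {} }, $class }
sub put {
    my ($self, $fh, $remote) = @_;
    local $/;                          # slurp the whole handle
    $self->{sent}{$remote} = <$fh>;
    return $remote;
}

package main;

my $ftp = FakeFTP->new;
send_scalar($ftp, "hihi\nhaha\nhoho\n", '/remote/file') or die 'put';
print $ftp->{sent}{'/remote/file'};
```

With a real server the call would presumably look the same after Net::FTP->new and ->login; as the put documentation says, REMOTE_FILE must be given when a filehandle is passed.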
Xho
------------------------------
Date: Thu, 08 Feb 2007 15:17:43 -0500
From: Tony Curtis <tony_curtis32@yahoo.com>
Subject: Re: Sending data via ftp
Message-Id: <eqg0h7$f3r$1@knot.queensu.ca>
Marshall Dudley wrote:
> Is there any way to send a variable via ftp with perl? I am trying to
> use NET::FTP, but I cannot find any way to tell it to send the data to
> the other end. All I can find is to send a file from the hard drive to
> the remote, but the data is not in a file, but created by the script
> that is attempting to send it and in a simply text variable
Have you ever pondered what the "F" in "FTP" stands for? :-)
Write the data into a temporary file and send that, e.g. through File::Temp.
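A rough sketch of the temporary-file route, using File::Temp's tempfile function. The send_via_tempfile helper and the FakeFileFTP stand-in are hypothetical names made up for this example; a real script would pass a logged-in Net::FTP object in place of the stand-in.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write $data to a temporary file, then hand the
# file name to a Net::FTP-style put( LOCAL_FILE, REMOTE_FILE ) method.
sub send_via_tempfile {
    my ($ftp, $data, $remote) = @_;
    # UNLINK => 1 asks File::Temp to delete the file at program exit.
    my ($fh, $filename) = tempfile(UNLINK => 1);
    print {$fh} $data or die "write failed: $!";
    close $fh or die "close failed: $!";   # flush before the upload
    return $ftp->put($filename, $remote);
}

# Stand-in for a real connection so the sketch runs without a server.
# It reads the local file back and records it; NOT part of Net::FTP.
package FakeFileFTP;
sub new { my $class = shift; return bless { sent => {} }, $class }
sub put {
    my ($self, $local, $remote) = @_;
    open my $in, '<', $local or die "cannot read $local: $!";
    local $/;                              # slurp the whole file
    $self->{sent}{$remote} = <$in>;
    close $in;
    return $remote;
}

package main;

my $ftp = FakeFileFTP->new;
send_via_tempfile($ftp, "some text built by the script\n", 'remote.txt')
    or die 'put';
print $ftp->{sent}{'remote.txt'};
```

The remote name is passed explicitly here because the temporary file's own name is random and would otherwise become the name on the server.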
hth
t
------------------------------
Date: 8 Feb 2007 21:11:40 GMT
From: anno4000@radom.zrz.tu-berlin.de
Subject: Re: Sending data via ftp
Message-Id: <531i0cF1qk718U1@mid.dfncis.de>
Marshall Dudley <mdudley@king-cart.com> wrote in comp.lang.perl.misc:
> Is there any way to send a variable via ftp with perl? I am trying to
> use NET::FTP, but I cannot find any way to tell it to send the data to
> the other end. All I can find is to send a file from the hard drive to
> the remote, but the data is not in a file, but created by the script
> that is attempting to send it and in a simply text variable
Did you read this paragraph?
put ( LOCAL_FILE [, REMOTE_FILE ] )
Put a file on the remote server. "LOCAL_FILE" may be a name or a
filehandle. If "LOCAL_FILE" is a filehandle then "REMOTE_FILE"
must be specified. If "REMOTE_FILE" is not specified then the file
will be stored in the current directory with the same leafname as
"LOCAL_FILE".
The local file doesn't have to be on the hard drive, you can specify
a file handle. Then the data could come from anywhere, for instance
from print statements in your code.
I don't think that FTP is the right tool for what you want to do,
but just for fun, here is how it *could* be done:
You have one problem: The call to ->put only returns after the
transfer is finished, so you don't get a chance to print the data
in the same process that runs ->put. You'll need to fork another
process for that and use a pipe to communicate between the two.
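use Net::FTP;
my $ftp = Net::FTP->new( 'host') or die 'new';
$ftp->login( 'user', 'password') or die 'login';
pipe( my ( $r, $w)) or die 'pipe';
my $pid = fork;
defined( $pid) or die 'fork';    # fork returns undef on failure
if ( $pid ) {
    # parent: write the data into the pipe
    close $r;
    print $w "$_\n" for qw( hihi haha hoho);
    close $w;                    # signal end of data to the child
    waitpid( $pid, 0);           # wait for the transfer to finish
} else {
    # child: ->put reads from the pipe until end of file
    close $w;
    $ftp->put( $r, '/remote/file') or die 'put';
    exit 0;
}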
Anno
------------------------------
Date: 8 Feb 2007 21:15:14 GMT
From: anno4000@radom.zrz.tu-berlin.de
Subject: Re: Sending data via ftp
Message-Id: <531i72F1qk718U2@mid.dfncis.de>
<xhoster@gmail.com> wrote in comp.lang.perl.misc:
> Marshall Dudley <mdudley@king-cart.com> wrote:
> > Is there any way to send a variable via ftp with perl? I am trying to
> > use NET::FTP, but I cannot find any way to tell it to send the data to
> > the other end. All I can find is to send a file from the hard drive to
> > the remote, but the data is not in a file, but created by the script
> > that is attempting to send it and in a simply text variable
>
> Net::FTP (case matters) can send either files, or the contents read from a
> file handle.
>
> Either save the variable to a temp file and then send the file, or
> make the variable look like a filehandle. For doing the latter, see
> IO::Scalar.
Ah... very good idea. Much more compact than the rather low-level
pipe/fork solution I posted.
Anno
------------------------------
Date: Tue, 6 Feb 2007 19:05:01 +0000
From: Adam Funk <a24061@yahoo.com>
Subject: Re: separating attribution, quoted text, and sigs from the body of a post
Message-Id: <t30n94-jqf.ln1@news.ducksburg.com>
On 2007-01-17, usenet@DavidFilmer.com wrote:
> You won't be able to do this 100% of the time because the behavior of
> replies is different (and can be customized) in different newsreaders.
> Usenet posts are plain text, and lack the context tagging of XML, etc.
> But you can probably get pretty close to what you want.
>
> You can probably exclude 90%+ of attribution lines by excluding
> /wrote:$/ (but it won't work for Dr.Ruud's posts, etc). Of course,
> that assumes English-language newsgroups. Some folks try to be cute
> with attribution lines like:
> When Art Merkel finally sobered up, he blundered:
> Nuthin you can do about attribution lines like that, unless you
> hard-code distinctive strings for prolific posters.
Here's something I've tinkered with, which assumes that either the
body is all original (no m/^>/ lines) or that all unquoted lines
before the first quoted one are attribution lines (I think this is
almost always the case for inline/bottom-posting).
Comments, suggestions?
Of course it doesn't handle top-posting!
##################################################
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Std;
use News::Article;
my ($filename, $in_art, $out_art, $out_filename);
while (@ARGV) {
$filename = shift(@ARGV);
$in_art = News::Article->new($filename);
print("*****\n$filename\n");
process_body($in_art->body());
}
sub process_body {
my @input = @_;
my @output = ();
my $op = 1;
my $line;
my $not_sig = 1;
# $op true IFF this is an original post (with no quoting)
foreach $line (@input) {
if ($line =~ /^>/) {
$op = 0;
last;
}
elsif ($line =~ /^-- /) {
last;
}
}
if ($op) {
print("original\n");
}
else {
print("quoting\n");
}
# copy the attribution lines (everything before the first quoted line)
if (! $op) {
    while (@input && $input[0] !~ /^>/) {
        $line = shift(@input);
        print(" a $line\n"); # attribution
    }
}
while (@input && $not_sig) {
$line = shift(@input);
if ($line =~ /^-- /) {
$not_sig = 0;
print(" - "); # sig separator
}
elsif ($line !~ /^>/) {
print(" n "); # new content
}
else {
print(" q "); # quoted
}
print($line, "\n");
}
while (@input) {
$line = shift(@input);
print(" s $line\n"); # sig
}
}
------------------------------
Date: Sat, 03 Feb 2007 12:40:08 +0100
From: Gabriel <gabyr@yahoo.co.uk>
Subject: Simple REGEX
Message-Id: <eq1sap$an$1@cormoran.emeteo.local>
Hello.
I have a file with this kind of data:
field1;field2;field3;field4\n
For example:
Andrew;Dimmu;Shagrath big;This is correct; /* CORRECT LINE */
Jeny; Hello Jen; Spaces;Black; /* INCORRECT LINE */
How can I delete those first spaces (Jeny;Hello Jen;Spaces;Black) ?
$line =~ /;[ ]+//g; ?
------------------------------
Date: 03 Feb 2007 11:47:51 GMT
From: Abigail <abigail@abigail.be>
Subject: Re: Simple REGEX
Message-Id: <slrnes8tiq.ekq.abigail@alexandra.abigail.be>
Gabriel (gabyr@yahoo.co.uk) wrote on MMMMCMIV September MCMXCIII in
<URL:news:eq1sap$an$1@cormoran.emeteo.local>:
)) Hello.
)) I have a file with this kind of data:
))
)) field1;field2;field3;field4\n
))
)) For example:
))
)) Andrew;Dimmu;Shagrath big;This is correct; /* CORRECT LINE */
)) Jeny; Hello Jen; Spaces;Black; /* INCORRECT LINE */
))
)) How can I delete those first spaces (Jeny;Hello Jen;Spaces;Black) ?
)) $line =~ /;[ ]+//g; ?
Almost, but that deletes the semi-colon as well, so you have to put
that one back in:
$line =~ s/; +/;/g;
Note that there's no need to put the space in a character class
(if you do, it's slower).
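As a quick check of the substitution above, here is a small sketch applied to the two sample lines from the question. The strip_after_semicolon name is just a wrapper invented for this example.

```perl
use strict;
use warnings;

# Hypothetical wrapper; the substitution is the one given above.
sub strip_after_semicolon {
    my ($line) = @_;
    $line =~ s/; +/;/g;    # drop spaces that directly follow a semicolon
    return $line;
}

print strip_after_semicolon('Jeny; Hello Jen; Spaces;Black;'), "\n";
# prints: Jeny;Hello Jen;Spaces;Black;

print strip_after_semicolon('Andrew;Dimmu;Shagrath big;This is correct;'), "\n";
# prints the line unchanged; spaces inside a field are kept
```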
Abigail
--
$_ = "\nrekcaH lreP rehtona tsuJ"; my $chop; $chop = sub {print chop; $chop};
$chop -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> ()
-> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> ()
------------------------------
Date: Sat, 03 Feb 2007 12:58:33 +0100
From: Mirco Wahab <wahab-mail@gmx.de>
Subject: Re: Simple REGEX
Message-Id: <eq1tl2$ifv$1@mlucom4.urz.uni-halle.de>
Gabriel wrote:
> I have a file with this kind of data:
> field1;field2;field3;field4\n
> Andrew;Dimmu;Shagrath big;This is correct; /* CORRECT LINE */
> Jeny; Hello Jen; Spaces;Black; /* INCORRECT LINE */
>
> How can I delete those first spaces (Jeny;Hello Jen;Spaces;Black) ?
> $line =~ /;[ ]+//g; ?
perl -lpe 's/(?<=;)\s+|^\s+|\s+(?=;)//g' file.txt
This works in these cases:
|
Jeny; Hello Jen; Spaces;Black;
Jenny Space Front; Hello Jen ; Spaces;Black;
Andrew;Dimmu;Shagrath big;This is correct;
|
Regards
M.
------------------------------
Date: Sat, 3 Feb 2007 15:35:48 +0100
From: "Lambik" <lambik@kieffer.nl>
Subject: Re: Simple REGEX
Message-Id: <45c49d04$0$747$5fc3050@dreader2.news.tiscali.nl>
"Abigail" <abigail@abigail.be> wrote in message
news:slrnes8tiq.ekq.abigail@alexandra.abigail.be...
> Gabriel (gabyr@yahoo.co.uk) wrote on MMMMCMIV September MCMXCIII in
> <URL:news:eq1sap$an$1@cormoran.emeteo.local>:
> )) Hello.
> )) I have a file with this kind of data:
> ))
> )) field1;field2;field3;field4\n
> ))
> )) For example:
> ))
> )) Andrew;Dimmu;Shagrath big;This is correct; /* CORRECT LINE */
> )) Jeny; Hello Jen; Spaces;Black; /* INCORRECT LINE */
> ))
> )) How can I delete those first spaces (Jeny;Hello Jen;Spaces;Black) ?
> )) $line =~ /;[ ]+//g; ?
>
>
> Almost, but that deletes the semi-colon as well, so you have to put
> that one back in:
>
> $line =~ s/; +/;/g;
>
Just an amendment. This will not work with a space in the first field (read:
beginning of the line).
s/(;) +|^ +/$1/g;
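One possible variation, offered as a sketch rather than a correction: in the form above, $1 is undefined whenever the ^ alternative is the one that matches, which can produce an "uninitialized value" warning under use warnings. Capturing the start-of-line case in the same group avoids that. The helper name is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: remove spaces at the start of every field,
# including the first one.
sub strip_field_leading_spaces {
    my ($line) = @_;
    # (^|;) captures either the empty string at the start of the line
    # or a semicolon, so $1 is always defined in the replacement.
    $line =~ s/(^|;) +/$1/g;
    return $line;
}

print strip_field_leading_spaces('  Jeny; Hello Jen; Spaces;Black;'), "\n";
# prints: Jeny;Hello Jen;Spaces;Black;
```

Spaces inside a field ("Hello Jen") are left alone, and trailing spaces before a semicolon are not touched by this pattern.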
------------------------------
Date: 5 Feb 2007 05:40:25 -0800
From: "Stephen O'D" <stephen.odonnell@gmail.com>
Subject: Simple XML question ...
Message-Id: <1170682824.989421.141500@l53g2000cwa.googlegroups.com>
I have a big file that looks similar to this:
<file>
<item>
<itemtags>foo</itemtags>
</item>
<item>
<itemtags>foo</itemtags>
</item>
... 1000's of repetitions
<item>
<itemtags>foo</itemtags>
</item>
</file>
I want to ensure the file is valid and grab each item (without the
item wrapper - ie all child tags of each item).
I was hoping to do this using XML::Parser, but I just cannot work out
how to get the actual markup text contained in a tag. I know I can use
it in Subs mode, and set a handler for item. How can I use that to
extract the parts I care about?
Thanks,
Stephen.
------------------------------
Date: Mon, 05 Feb 2007 15:43:37 +0100
From: Michele Dondi <bik.mido@tiscalinet.it>
Subject: Re: Simple XML question ...
Message-Id: <pvfes2d7e9nfcjrt0pcsl2sc3tiikdvh79@4ax.com>
On 5 Feb 2007 05:40:25 -0800, "Stephen O'D"
<stephen.odonnell@gmail.com> wrote:
>Subject: Simple XML question ...
If it's Simple and XML, how 'bout XML::Simple?
>I have a big file that looks similar to this:
>
><file>
> <item>
> <itemtags>foo</itemtags>
> </item>
[snip]
Seems simple enough...
>I want to ensure the file is valid and grab each item (without the
>item wrapper - ie all child tags of each item).
>
>I was hoping to do this using XML::Parser, but I just cannot work out
>how to get the actual markup text contained in a tag. I know can use
>it in Subs mode, and set a handler for item. How can I use that to
>extract the parts I care about?
Well, let's see if X::S can actually do that:
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;
die "D'Oh!\n" unless @ARGV;
print $_->{itemtags}, "\n"
for @{ (XMLin shift)->{item} };
__END__
Yes, it seems to do the job. Provided that I understood the job
correctly. Error checking is left as an exercise to the reader: here I
am assuming everything will go fine in any case.
Michele
--
{$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
(($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
.'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,
------------------------------
Date: 5 Feb 2007 07:37:10 -0800
From: "Stephen O'D" <stephen.odonnell@gmail.com>
Subject: Re: Simple XML question ...
Message-Id: <1170689830.132389.133870@l53g2000cwa.googlegroups.com>
On Feb 5, 2:43 pm, Michele Dondi <bik.m...@tiscalinet.it> wrote:
> On 5 Feb 2007 05:40:25 -0800, "Stephen O'D"
>
> <stephen.odonn...@gmail.com> wrote:
> >Subject: Simple XML question ...
>
> If it's Simple and XML, how 'bout XML::Simple?
>
> >I have a big file that looks similar to this:
>
> ><file>
> > <item>
> > <itemtags>foo</itemtags>
> > </item>
>
> [snip]
>
> Seems simple enough...
>
> >I want to ensure the file is valid and grab each item (without the
> >item wrapper - ie all child tags of each item).
>
> >I was hoping to do this using XML::Parser, but I just cannot work out
> >how to get the actual markup text contained in a tag. I know can use
> >it in Subs mode, and set a handler for item. How can I use that to
> >extract the parts I care about?
>
> Well, let's see if X::S can actually do that:
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
> use XML::Simple;
>
> die "D'Oh!\n" unless @ARGV;
>
> print $_->{itemtags}, "\n"
> for @{ (XMLin shift)->{item} };
>
> __END__
>
> Yes, it seems to do the job. Provided that I understood the job
> correctly. Error checking is left as an exercise to the reader: here I
> am assuming everything will go fine in any case.
>
> Michele
That's not exactly what I want. My file is more like:
<file>
<item>
<itemtags>
<tag1>foo</tag1>
<tag2>bar</tag2>
</itemtags>
</item>
<item>
<itemtags>
<tag1>foo</tag1>
<tag2>bar</tag2>
</itemtags>
</item>
... 1000's of repetitions
</file>
So I need a series of XML chunks like the following as my output (they
will be passed to another process for processing one at a time):
<itemtags>
<tag1>foo</tag1>
<tag2>bar</tag2>
</itemtags>
Also, the files I am dealing with are going to be large, and each
itemtags section is about 32K in size.
I am struggling to find some way to get me just the output. I have
something working with XML::Twig:
[sodonnel@millhouse]$ more twig.pl
use XML::Twig;
sub print_it {
my ($t, $elt) = @_;
$elt->set_asis;
print $elt->sprint($elt,1), "\n";
$t->purge;
}
my $t= XML::Twig->new( twig_handlers =>
{ 'item' => \&print_it }
);
$t->parsefile( 'data.xml');
[sodonnel@millhouse]$ perl twig.pl
<itemtags>foo</itemtags>
<itemtags>foo</itemtags>
<itemtags>foo</itemtags>
I have no real experience parsing big xml files in Perl (or
anything). My file has 10 items at a total size of ~400K and it takes
~ 5.2 CPU seconds to parse it and print each chunk. That seems slow
to me - can I expect to parse the file faster than that?
Stephen.
------------------------------
Date: Mon, 05 Feb 2007 17:25:37 +0100
From: mirod <mirod@xmltwig.com>
Subject: Re: Simple XML question ...
Message-Id: <45c7591f$0$20819$5fc30a8@news.tiscali.it>
Stephen O'D wrote:
> I am struggling to find some way to get me just the output. I have
> something working with XML::Twig:
>
> [sodonnel@millhouse]$ more twig.pl
> use XML::Twig;
>
> sub print_it {
> my ($t, $elt) = @_;
> $elt->set_asis;
> print $elt->sprint($elt,1), "\n";
> $t->purge;
> }
>
> my $t= XML::Twig->new( twig_handlers =>
> { 'item' => \&print_it }
> );
> $t->parsefile( 'data.xml');
>
> [sodonnel@millhouse]$ perl twig.pl
> <itemtags>foo</itemtags>
> <itemtags>foo</itemtags>
> <itemtags>foo</itemtags>
>
> I have no real experience parsing big xml files in Perl (or
> anything). My file has 10 items at a total size of ~400K and it takes
> ~ 5.2 CPU seconds to parse it and print each chunk. That seems slow
> to me - can I expect to parse the file faster than that?
Hi,
Your code looks about right. I believe the $elt->set_asis is useless
(and actually dangerous); you should probably get rid of it.
As far as speed goes, it depends on your system. You can have a look at
the various benchmarks in the Ways to Rome series:
http://xmltwig.com/article/index_wtr.html (basically XML::LibXML is very
fast, most other modules are slower).
------------------------------
Date: Mon, 05 Feb 2007 17:29:07 +0100
From: Michele Dondi <bik.mido@tiscalinet.it>
Subject: Re: Simple XML question ...
Message-Id: <qhmes2dfkbfkrk0gflv94k1dcdog9emki9@4ax.com>
On 5 Feb 2007 07:37:10 -0800, "Stephen O'D"
<stephen.odonnell@gmail.com> wrote:
>Thats not exactly what I want. My file is more like:
>
><file>
> <item>
> <itemtags>
> <tag1>foo</tag1>
> <tag2>bar</tag2>
[snip]
>So I need a series of xmlchunks like the following as my output (they
>will be passed to another process for processing one at a time):
>
> <itemtags>
> <tag1>foo</tag1>
> <tag2>bar</tag2>
> </itemtags>
Well, then it would be even easier, but...
>Also, the files I am dealing with are going to be large, and each
>itemtags section is about 32K in size.
[snip]
>I have no real experience parsing big xml files in Perl (or
>anything). My file has 10 items at a total size of ~400K and it takes
>~ 5.2 CPU seconds to parse it and print each chunk. That seems slow
>to me - can I expect to parse the file faster than that?
...indeed X::S builds a Perl data structure out of your XML and this
doesn't really scream extreme speed or memory friendliness. It is
actually simple and that's why I have used it for my XML parsing
needs: because I hardly know anything about XML at all, let alone big
files. So I hope that someone more knowledgeable than I am will give
you more specific help.
Michele
------------------------------
Date: 6 Feb 2007 03:53:34 -0800
From: "Stephen O'D" <stephen.odonnell@gmail.com>
Subject: Re: Simple XML question ...
Message-Id: <1170762814.855195.41430@v45g2000cwv.googlegroups.com>
> <stephen.odonn...@gmail.com> wrote:
> >Thats not exactly what I want. My file is more like:
>
> ><file>
> > <item>
> > <itemtags>
> > <tag1>foo</tag1>
> > <tag2>bar</tag2>
> [snip]
> >So I need a series of xmlchunks like the following as my output (they
> >will be passed to another process for processing one at a time):
>
> > <itemtags>
> > <tag1>foo</tag1>
> > <tag2>bar</tag2>
> > </itemtags>
>
> Well, then it would be even easier, but...
>
>
>
> >Also, the files I am dealing with are going to be large, and each
> >itemtags section is about 32K in size.
> [snip]
> >I have no real experience parsing big xml files in Perl (or
> >anything). My file has 10 items at a total size of ~400K and it takes
> >~ 5.2 CPU seconds to parse it and print each chunk. That seems slow
> >to me - can I expect to parse the file faster than that?
OK, for those who are interested, I now have two ways of doing this,
using XML::Twig or XML::Parser.
XML::Twig code:
[sodonnel@millhouse]$ more twig.pl
use XML::Twig;
use Benchmark;
my $item;
sub print_it {
my ($t, $elt) = @_;
$elt->set_asis;
# putting this into $item and then clearing it is stupid,
# but it's to make it a fair test against what I am doing with
# XML::Parser.
$item = $elt->sprint($elt,1) . "\n";
$item = '';
$t->purge;
}
my $t= XML::Twig->new( twig_handlers =>
{ 'cloudItem' => \&print_it }
);
my $bstart = new Benchmark;
$t->parsefile( 'cloud.xml');
my $bend = new Benchmark;
print timestr(timediff($bend,$bstart)), "\n";
XML::Parser code:
[sodonnel@millhouse]$ more xml_parser.pl
use XML::Parser;
use Benchmark;
my( $in_item, $item_text);
my $bstart = new Benchmark;
my $parser = XML::Parser->new(Handlers => { Start => \&tag_start,
End => \&tag_end,
Char => \&characters,
});
$parser->parsefile('cloud.xml');
my $bend = new Benchmark;
print timestr(timediff($bend,$bstart)), "\n";
exit(0);
sub tag_start {
my ($xp, $el) = @_;
# this will copy all but the first occurrence into item text
if ($in_item >= 1) { $item_text .= $xp->recognized_string }
if ($el eq 'cloudItem') { $in_item += 1 }
}
sub tag_end {
my ($xp, $el) = @_;
if ($el eq 'cloudItem') { $in_item -= 1 }
if ($in_item == 0) {
#print $item_text;
$item_text = '';
} else {
# copies everything but the closing cloudItem tag
$item_text .= $xp->recognized_string;
}
}
sub characters {
my ($xp, $txt) = @_;
if ($in_item) { $item_text .= $txt }
}
[sodonnel@millhouse]$ perl xml_parser.pl
1 wallclock secs ( 1.59 usr + 0.00 sys = 1.59 CPU)
[sodonnel@millhouse]$ perl twig.pl
5 wallclock secs ( 5.14 usr + 0.02 sys = 5.16 CPU)
So XML::Parser wins by quite a way, probably because it doesn't build an
in-memory structure of the tags. Goodness only knows if this is the best
way, but it's good enough for now.
Cheers,
Stephen.
------------------------------
Date: Tue, 06 Feb 2007 16:12:08 GMT
From: zentara <zentara@highstream.net>
Subject: Re: Simple XML question ...
Message-Id: <lo9hs2t08jucj3r9rdbjjgptla04bksi39@4ax.com>
On 5 Feb 2007 07:37:10 -0800, "Stephen O'D" <stephen.odonnell@gmail.com>
wrote:
>I have no real experience parsing big xml files in Perl (or
>anything). My file has 10 items at a total size of ~400K and it takes
>~ 5.2 CPU seconds to parse it and print each chunk. That seems slow
>to me - can I expect to parse the file faster than that?
>
>Stephen.
This might be a good bet.
See:
http://www.xml.com/pub/a/2001/02/14/perlsax.html
Feed your xml to this, and watch the output.
#!/usr/bin/perl
use warnings;
use strict;
use XML::Parser::PerlSAX;
my $parser = new XML::Parser::PerlSAX( Handler => new SampleHandler );
$parser->parse( Source => { SystemId => shift } );
package SampleHandler;
sub new {
my $self = {};
return bless( $self );
}
sub start_document { print "start_document\n"; }
sub end_document { print "end_document\n"; }
sub start_element {
my ( $self, $element ) = @_;
my $name = $element->{ Name };
print "start_element: '$name'\n";
while ( my ( $k, $v ) = each( %{ $element->{ Attributes } } ) ) {
print " attribute: $k = $v\n";
}
}
sub end_element {
my ( $self, $element ) = @_;
my $name = $element->{ Name };
print "end_element: '$name'\n";
}
sub characters {
my ( $self, $text ) = @_;
my $data = $text->{ Data };
print "characters: '$data'\n";
}
__END__
zentara
--
I'm not really a human, but I play one on earth.
http://zentara.net/japh.html
------------------------------
Date: 6 Feb 2007 07:49:47 -0800
From: "lala4life" <rafael.avaria@gmail.com>
Subject: Someone can tell me how integrate VC++ with perl
Message-Id: <1170776987.601921.214830@j27g2000cwj.googlegroups.com>
I looked for some documentation on the net but didn't find anything
useful. I have experience programming with Perl but not with VC++, at
least not on this topic.
If someone can tell me where I can get examples, that would be really nice.
------------------------------
Date: Tue, 06 Feb 2007 17:09:17 +0100
From: Mirco Wahab <wahab-mail@gmx.de>
Subject: Re: Someone can tell me how integrate VC++ with perl
Message-Id: <eqa9hj$7qk$1@mlucom4.urz.uni-halle.de>
lala4life wrote:
> I look for some doc in the network, but didn't find any usefull, i
> have experience programing with perl but no with VC++, at least with
> this topic.
To do what?
> If someone can tell where can get examples it really nice.
For what?
"integrating Perl-Code-Modules into a VC++-Programm"
or
"integrating VC++-Code-Modules into a Perl-Programm"?
or something else?
Regards
Mirco
------------------------------
Date: Tue, 06 Feb 2007 20:37:53 +0100
From: Michele Dondi <bik.mido@tiscalinet.it>
Subject: Re: Someone can tell me how integrate VC++ with perl
Message-Id: <p4mhs2htlc38pth8ot2166ioaeoht9tnp8@4ax.com>
On 6 Feb 2007 07:49:47 -0800, "lala4life" <rafael.avaria@gmail.com>
wrote:
>I look for some doc in the network, but didn't find any usefull, i
>have experience programing with perl but no with VC++, at least with
>this topic.
>
>If someone can tell where can get examples it really nice.
I'm not really sure about what you're asking about but someone wrote a
tutorial about a somewhat related topic at Perl Monks:
http://perlmonks.org/index.pl?node_id=583586
Michele
------------------------------
Date: Wed, 7 Feb 2007 10:30:17 +1100
From: "Sisyphus" <sisyphus1@nomail.afraid.com>
Subject: Re: Someone can tell me how integrate VC++ with perl
Message-Id: <45c91008$0$5745$afc38c87@news.optusnet.com.au>
"lala4life" <rafael.avaria@gmail.com> wrote in message
news:1170776987.601921.214830@j27g2000cwj.googlegroups.com...
>I look for some doc in the network, but didn't find any usefull, i
> have experience programing with perl but no with VC++, at least with
> this topic.
>
> If someone can tell where can get examples it really nice.
>
See 'perldoc perlembed'.
Cheers,
Rob
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 123
**************************************