[28459] in Perl-Users-Digest
Perl-Users Digest, Issue: 9823 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Oct 9 11:05:53 2006
Date: Mon, 9 Oct 2006 08:05:07 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Mon, 9 Oct 2006 Volume: 10 Number: 9823
Today's topics:
Re: [META] Usenet and charsets <rvtol+news@isolution.nl>
Re: [META] Usenet and charsets <hjp-usenet2@hjp.at>
Re: [META] Usenet and charsets <rvtol+news@isolution.nl>
Re: [META] Usenet and charsets <hjp-usenet2@hjp.at>
Re: about FAQ4.5 and substr <cmic@caramail.com>
about matching in a list context sharma__r@hotmail.com
Re: about matching in a list context <eric-amick@comcast.net>
Re: about matching in a list context <mritty@gmail.com>
Re: Command substitution in perl <kiran.r.pillai@gmail.com>
Re: Data Extraction Hierarchial Report <someone@example.com>
Re: Data Extraction Hierarchial Report <tadmc@augustmail.com>
Re: Data Extraction Hierarchial Report <1usa@llenroc.ude.invalid>
Re: Data Extraction Hierarchial Report (reading news)
Re: Distributed multitasking, POE, communication schwarzenschafe@gmail.com
Re: How to parse a new computer language in Perl? <bol@adv.magwien.gv.at>
Re: LWP and Unicode <dale.gerdemann@googlemail.com>
new CPAN modules on Mon Oct 9 2006 (Randal Schwartz)
Re: Output of Concise <bol@adv.magwien.gv.at>
Re: Parse tree like data like XML by Perl? <zhushenli@gmail.com>
Parsing and preserving comments <woland99@gmail.com>
Re: Posting Guidelines for comp.lang.perl.misc ($Revisi <rvtol+news@isolution.nl>
Re: Posting Guidelines for comp.lang.perl.misc ($Revisi <tadmc@augustmail.com>
Re: Syntax for getting web page links <tadmc@augustmail.com>
Re: Usenet and charsets (was: Re: LWP and Unicode) <dale.gerdemann@googlemail.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Mon, 9 Oct 2006 11:48:33 +0200
From: "Dr.Ruud" <rvtol+news@isolution.nl>
Subject: Re: [META] Usenet and charsets
Message-Id: <egdd2c.jo.1@news.isolution.nl>
Dale wrote:
> Isn't UTF8 the most conservative choice nowadays? Look at Wikipedia or
> Wiktionary. Massive international websites all in UTF8. And look at
> the Russian Wikipedia, for example. It doesn't use a "custom charset"
> at all.
See Subject, this is about "Usenet and charsets", not about HTML.
Your newsclient doesn't remove the
/[[:blank:]]+[(]was: Re: .*[)]$/
part from the Subject header field,
so you need to do it by hand.
Your broken newsclient does remove the [anything] prefix
from the Subject header field, which is real bad.
My broken newsclient (OE6) does a lot of real bad things too, but used
together with OE-QuoteFix and Hamster it is almost OK.
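For anyone who would rather script the clean-up than do it by hand, the pattern above drops straight into a substitution. This is only a sketch: the sample subject is the one used in this thread, and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Sample Subject header value taken from this thread.
my $subject = 'Re: Usenet and charsets (was: Re: LWP and Unicode)';

# Copy the subject, then strip a trailing " (was: Re: ...)" part from the copy.
( my $cleaned = $subject ) =~ s/[[:blank:]]+[(]was: Re: .*[)]$//;

print "$cleaned\n";
```

The original string is left untouched, so both forms stay available if the old subject needs to be logged.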
--
Affijn, Ruud
"Gewoon is een tijger."
------------------------------
Date: Mon, 9 Oct 2006 15:01:19 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: [META] Usenet and charsets
Message-Id: <slrneiki0v.ac5.hjp-usenet2@yoyo.hjp.at>
On 2006-10-09 09:48, Dr.Ruud <rvtol+news@isolution.nl> wrote:
> Dale wrote:
>> Isn't UTF8 the most conservative choice nowadays? Look at Wikipedia or
>> Wiktionary. Massive international websites all in UTF8. And look at
>> the Russian Wikipedia, for example. It doesn't use a "custom charset"
>> at all.
>
> See Subject, this is about "Usenet and charsets", not about HTML.
Yup. Usenet is more conservative than the WWW. UTF-8 is only about 14
years old, so you can't expect all newsreaders to support it. Still, I
think that properly declared UTF-8 should be acceptable in international
newsgroups, and since nobody has complained about my postings yet, I
take it as evidence that my newsreader's inability to use ISO-8859-1
where sufficient is only a minor bug.
> Your newsclient doesn't remove the
> /[[:blank:]]+[(]was: Re: .*[)]$/
> part from the Subject header field,
> so you need to do it by hand.
>
> Your broken newsclient does remove the [anything] prefix
> from the Subject header field, which is real bad.
Weird. Dale seems to be using Mozilla 1.7.8 from Debian. I just
installed that (although a slightly newer version), and can't reproduce
this: [META] is preserved and (was: ...) is automatically removed.
hp
--
_ | Peter J. Holzer | > Wieso sollte man etwas erfinden was nicht
|_|_) | Sysadmin WSR | > ist?
| | | hjp@hjp.at | Was sonst wäre der Sinn des Erfindens?
__/ | http://www.hjp.at/ | -- P. Einstein u. V. Gringmuth in desd
------------------------------
Date: Mon, 9 Oct 2006 15:21:31 +0200
From: "Dr.Ruud" <rvtol+news@isolution.nl>
Subject: Re: [META] Usenet and charsets
Message-Id: <egdpvj.1es.1@news.isolution.nl>
Peter J. Holzer wrote:
> Dr.Ruud:
>> [to Dale]
>> Your newsclient doesn't remove the
>> /[[:blank:]]+[(]was: Re: .*[)]$/
>> part from the Subject header field,
>> so you need to do it by hand.
>>
>> Your broken newsclient does remove the [anything] prefix
>> from the Subject header field, which is real bad.
>
> Weird. Dale seems to be using Mozilla 1.7.8 from Debian. I just
> installed that (although a slightly newer version), and can't
> reproduce this: [META] is preserved and (was: ...) is automatically
> removed.
I assumed that Dale had used the googlegroups-interface for a
newsclient. The [META] was already removed with Bart's reply.
--
Affijn, Ruud
"Gewoon is een tijger."
------------------------------
Date: Mon, 9 Oct 2006 16:37:37 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: [META] Usenet and charsets
Message-Id: <slrneiknlh.bvl.hjp-usenet2@yoyo.hjp.at>
On 2006-10-09 13:21, Dr.Ruud <rvtol+news@isolution.nl> wrote:
> Peter J. Holzer wrote:
>> Dr.Ruud:
>>> Your broken newsclient does
[weird things to the subject]
>> Weird. Dale seems to be using Mozilla 1.7.8 from Debian. I just
>> installed that (although a slightly newer version), and can't
>> reproduce this: [META] is preserved and (was: ...) is automatically
>> removed.
>
> I assumed that Dale had used the googlegroups-interface for a
> newsclient.
You are right. I saw Mozilla 1.7.8 in what looked like a useragent header, and
didn't notice that it was really an 'X-HTTP-Useragent' header. Sorry for
the confusion.
hp
--
_ | Peter J. Holzer | > Wieso sollte man etwas erfinden was nicht
|_|_) | Sysadmin WSR | > ist?
| | | hjp@hjp.at | Was sonst wäre der Sinn des Erfindens?
__/ | http://www.hjp.at/ | -- P. Einstein u. V. Gringmuth in desd
------------------------------
Date: 8 Oct 2006 22:29:12 -0700
From: "cmic" <cmic@caramail.com>
Subject: Re: about FAQ4.5 and substr
Message-Id: <1160371752.661984.314730@m7g2000cwm.googlegroups.com>
John W. Krahn wrote:
> cmic wrote:
> > Hello
> > The FAQ 4.54, about converting from hex to dec uses the following
> > method :
> > $dec = unpack("N", pack("H8", substr("0" x 8 . "DEADBEEF", -8)));
> >
> > I don't understand the usefulness of the substr thing, because
> >
> > $dec = unpack("N", pack("H8", "DEADBEEF"));
> >
> > works OK too.
>
> If you replace the literal string "DEADBEEF" with a variable that may have
> zero or more hexadecimal digits then '"0" x 8 . ' prepends eight zeros to
> ensure that the string is at least eight characters long and substr() ensures
> that the resulting string is exactly eight characters long.
OK. I get it. Of course, in the "real life" I don't have "DEADBEEF" but
a variable of unknown length.
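To make the point concrete, here is a small sketch of the FAQ idiom applied to hex strings of different lengths. The helper name hex_to_dec is invented for this example; the built-in hex() is printed alongside as a cross-check.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the FAQ idiom for up to eight hex digits.
sub hex_to_dec {
    my ($hex) = @_;
    # Prepend eight zeros, then keep exactly the last eight characters,
    # so short inputs are left-padded before pack() sees them.
    my $padded = substr( '0' x 8 . $hex, -8 );
    return unpack 'N', pack( 'H8', $padded );
}

for my $hex (qw(DEADBEEF FF 1A2B)) {
    printf "%-8s => %u (hex() gives %u)\n", $hex, hex_to_dec($hex), hex($hex);
}
```

Without the padding, a short string such as "FF" would fill the leading digits of the 32-bit value rather than the trailing ones, which is why the literal "DEADBEEF" happens to work but a variable of unknown length may not.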
Thanks
--
cmic
>
>
>
> John
> --
> Perl isn't a toolbox, but a small machine shop where you can special-order
> certain sorts of tools at low cost and in short order. -- Larry Wall
------------------------------
Date: 9 Oct 2006 05:14:01 -0700
From: sharma__r@hotmail.com
Subject: about matching in a list context
Message-Id: <1160396041.497076.130570@k70g2000cwa.googlegroups.com>
Hello,
I don't understand this behavior of perl:
perl -lne 'print for /^(\d+\.\d+E-\d+)$/;' << EOF
***********
4.62972E-13
4.63098E-13
4.62983E-13
***********
EOF
returns correctly --->
4.62972E-13
4.63098E-13
4.62983E-13
Whereas if I do this :
perl -lne 'print for /^\d+\.\d+E-\d+$/;'
I get (from the exact same input as above) :
1
1
1
Now I want to know what's going on?
Thanks
Rakesh
------------------------------
Date: Mon, 09 Oct 2006 08:31:25 -0400
From: Eric Amick <eric-amick@comcast.net>
Subject: Re: about matching in a list context
Message-Id: <4sfki2td7fvc57paim14mt6qjuagn3b7bt@4ax.com>
On 9 Oct 2006 05:14:01 -0700, sharma__r@hotmail.com wrote:
>Hello,
>
>I don't understand this behavior of perl:
>
>
>perl -lne 'print for /^(\d+\.\d+E-\d+)$/;' << EOF
>***********
>4.62972E-13
>4.63098E-13
>4.62983E-13
>***********
>EOF
>
>returns correctly --->
>
>4.62972E-13
>4.63098E-13
>4.62983E-13
>
>
>Whereas if I do this :
>
>perl -lne 'print for /^\d+\.\d+E-\d+$/;'
>
>I get (from the exact same input as above) :
>1
>1
>1
>
>Now I want to know what's going on?
Here's the relevant passage from "Regexp Quote-Like Operators" in
perlop:
If the /g option is not used, m// in list context returns a list
consisting of the subexpressions matched by the parentheses in the
pattern, i.e., ($1, $2, $3...). (Note that here $1 etc. are also set,
and that this differs from Perl 4's behavior.) When there are no
parentheses in the pattern, the return value is the list (1) for
success. With or without parentheses, an empty list is returned upon
failure.
(Remember that the foreach modifier, like the foreach loop, imposes list
context.) If you want the second example to work the same as the first,
add the /g option.
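The four cases described in that passage can be seen side by side in a short sketch. The variable names are only for illustration; the sample values come from the original question.

```perl
use strict;
use warnings;

my $number = '4.62972E-13';
my $stars  = '***********';

# List context, with capturing parentheses: returns the captured text.
my @captured = $number =~ /^(\d+\.\d+E-\d+)$/;

# List context, no parentheses, no /g: returns the list (1) on success.
my @no_parens = $number =~ /^\d+\.\d+E-\d+$/;

# List context, no parentheses, with /g: returns the matched text itself.
my @with_g = $number =~ /^\d+\.\d+E-\d+$/g;

# A failed match returns the empty list, parentheses or not.
my @failed = $stars =~ /^(\d+\.\d+E-\d+)$/;

print "captured:  @captured\n";
print "no parens: @no_parens\n";
print "with /g:   @with_g\n";
print "failed:    ", scalar(@failed), " element(s)\n";
```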
--
Eric Amick
Columbia, MD
------------------------------
Date: 9 Oct 2006 05:37:40 -0700
From: "Paul Lalli" <mritty@gmail.com>
Subject: Re: about matching in a list context
Message-Id: <1160397460.551538.211930@m7g2000cwm.googlegroups.com>
sharma__r@hotmail.com wrote:
> I don't understand this behavior of perl:
>
>
> perl -lne 'print for /^(\d+\.\d+E-\d+)$/;' << EOF
> ***********
> 4.62972E-13
> 4.63098E-13
> 4.62983E-13
> ***********
> EOF
>
> returns correctly --->
>
> 4.62972E-13
> 4.63098E-13
> 4.62983E-13
>
>
> Whereas if I do this :
>
> perl -lne 'print for /^\d+\.\d+E-\d+$/;'
>
> I get (from the exact same input as above) :
> 1
> 1
> 1
>
> Now I want to know what's going on?
In a list context, a pattern match returns a list of all the captured
sub-matches contained within parentheses. If there are no parentheses
within the pattern, the pattern match returns the list containing 1.
In your first pattern, you've captured the entire pattern - whatever
matches (\d+\.\d+E-\d+) will be returned. So for each line of input
that matches the pattern, that captured sub-pattern is returned. In
the second pattern, nothing is captured, and so the pattern returns 1
for each line of text that matches the pattern.
This behavior is documented in `perldoc perlop`. Search for
"m/PATTERN/".
Paul Lalli
------------------------------
Date: 8 Oct 2006 23:42:13 -0700
From: "kp" <kiran.r.pillai@gmail.com>
Subject: Re: Command substitution in perl
Message-Id: <1160376133.084183.199520@c28g2000cwb.googlegroups.com>
Hi All,
Thanks for all your posts.
The problem here was that `$HOSTNAME` output had a newline character.
I did the following for fixing this:
$hostname = `$HOSTNAME`;
chomp ($hostname);
Then used the "$hostname" variable in the "if" statement.
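A minimal sketch of why the comparison failed before the chomp. The command `echo rsd2` is only a stand-in for `$HOSTNAME` so that the example runs on any machine; it is an assumption, not the command from the original script.

```perl
use strict;
use warnings;

# Backticks return the command's output including its trailing newline.
# "echo rsd2" stands in for the real hostname command here.
my $raw = `echo rsd2`;

# Copy the output and remove the trailing newline from the copy.
chomp( my $hostname = $raw );

print "before chomp: ", ( $raw      eq 'rsd2' ? 'equal' : 'not equal' ), "\n";
print "after chomp:  ", ( $hostname eq 'rsd2' ? 'equal' : 'not equal' ), "\n";
```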
Thanks,
Kiran
kp wrote:
> My perl module file includes this line:
> $HOSTNAME = "/usr/bin/hostname";
>
> I have sourced this perl module file in my perl script.
>
> In my perl script I have an if loop :
> if ( `$HOSTNAME` eq "rsd2" ) {
> print "NFS server is fsd2";
> }
>
>
> However, this command substitution does not seem to work.
>
> Any clues???
>
> -kp
------------------------------
Date: Mon, 09 Oct 2006 04:26:27 GMT
From: "John W. Krahn" <someone@example.com>
Subject: Re: Data Extraction Hierarchial Report
Message-Id: <TbkWg.7480$H7.235@edtnps82>
banker123 wrote:
> The following code will extract the header record, the challenge I am
> having is appending this to the detail records.
>
> open(data,'C:\data.txt');
> @array=<data>;
>
> foreach $line(@array){
> if ($line =~ /B./){
> print "$line";
> }
> }
You want something like (UNTESTED):
#!/usr/bin/perl
use warnings;
use strict;
my $file = 'C:/data.txt';
open my $fh, '<', $file or die "Cannot open $file: $!";
my $account;
while ( <$fh> ) {
    if ( /\A\d/ ) {
        ( $account = $_ ) =~ s/\s+\z//;
    }
    elsif ( s/\A\s+// ) {
        print "$account $_";
    }
}
__END__
John
--
Perl isn't a toolbox, but a small machine shop where you can special-order
certain sorts of tools at low cost and in short order. -- Larry Wall
------------------------------
Date: Mon, 9 Oct 2006 07:06:42 -0500
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: Data Extraction Hierarchial Report
Message-Id: <slrneikeqi.i70.tadmc@magna.augustmail.com>
banker123 <bradbrockman@yahoo.com> wrote:
> Should be
No it shouldn't.
It contains a whole boatload of bad practices.
> open(data,'C:\data.txt');
You should use UPPERCASE for bareword filehandles. Even better,
you should use a lexical file handle instead.
The 3-arg form of open() is much much safer.
You should always, yes *always*, check the return value from open().
You can use sensible slashes in paths that are not destined for
a Windows "shell".
> @array=<data>;
You should enable strictures in all of your Perl programs. That
requires that you declare each variable that you use.
> foreach $line(@array){
There is no need to read the entire file if you are going to
process it line-by-line anyway.
Just read it line-by-line, then it will continue to work even
on huge files.
> if ($line =~ /Account/){
Nothing wrong with that one, but some whitespace would make it
easier to read and understand.
> print "$line";
You should never quote a lone scalar variable. See:
perldoc -q vars
What's wrong with always quoting "$vars"?
So then, we end up with (untested):
use warnings;
use strict;
open( my $ACCOUNTS, '<', 'C:/data.txt' ) or
    die "could not open the 'C:/data.txt' file $!";
while ( my $line = <$ACCOUNTS> ) {
    if ( $line =~ /Account/ ) {
        print $line;
    }
}
close $ACCOUNTS;
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: Mon, 09 Oct 2006 14:26:13 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude.invalid>
Subject: Re: Data Extraction Hierarchial Report
Message-Id: <Xns98576A3A39CC2asu1cornelledu@127.0.0.1>
Tad McClellan <tadmc@augustmail.com> wrote in
news:slrneikeqi.i70.tadmc@magna.augustmail.com:
> banker123 <bradbrockman@yahoo.com> wrote:
>
>> Should be
>
>
> No it shouldn't.
>
> It contains a whole boatload of bad practices.
>
>
>> open(data,'C:\data.txt');
>
>
> You should use UPPERCASE for bareword filehandles.
One should probably note that the DATA filehandle has a special meaning
in Perl (allows one to read the data included in the source file
following __DATA__) so it should not be used for other purposes if only
to make it easier for others to understand what's going on.
> Even better, you should use a lexical file handle instead.
While $data/$DATA would be perfectly safe if used properly, it is easy
to accidentally type DATA and read from a very unexpected place. The
resulting problems may be hard to track down. So, I would recommend
avoiding variations of data as a filehandle.
>
> The 3-arg form of open() is much much safer.
>
> You should always, yes *always*, check the return value from open().
Of course, I agree with these and all your other comments. I just wanted
to point out the potential for confusion if the OP changed data to DATA.
Sinan
--
A. Sinan Unur <1usa@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)
comp.lang.perl.misc guidelines on the WWW:
http://augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
------------------------------
Date: Mon, 09 Oct 2006 14:35:04 GMT
From: "Mumia W. (reading news)" <paduille.4059.mumia.w@earthlink.net>
Subject: Re: Data Extraction Hierarchial Report
Message-Id: <s6tWg.9178$o71.5360@newsread3.news.pas.earthlink.net>
On 10/08/2006 10:30 PM, banker123 wrote:
> The following code will extract the header record, the challenge I am
> having is appending this to the detail records.
>
> open(data,'C:\data.txt');
> @array=<data>;
>
> foreach $line(@array){
> if ($line =~ /B./){
> print "$line";
> }
> }
Please bottom-post in this newsgroup.
That's good. Your data seems to have indented and unindented records.
How might you extract (and store) data from an unindented record?
--
Mumia W.
paduille.4059.mumia.w@earthlink.net
This is a temporary e-mail to help me catch some s-p*á/m.
------------------------------
Date: 9 Oct 2006 00:57:10 -0700
From: schwarzenschafe@gmail.com
Subject: Re: Distributed multitasking, POE, communication
Message-Id: <1160380630.826561.49250@b28g2000cwb.googlegroups.com>
Ben Morrow wrote:
> The way I would structure this would be to have each slave process (your
> 'POE management process') connect to a listening socket on the master
> process (your 'POE central server'), and then communicate with it by
> some protocol you define. It is perfectly possible to define the
> protocol such that either end can initiate transactions: you just have
> to be careful to make sure both ends don't try to speak at the same time
> so the transactions get mixed up. It's worth putting some thought into
> the design of the protocol before you start programming: IMO HTTP/1.1 is
> a good model to follow, but your situation is somewhat complicated by
> the fact that either end can speak first.
I've been talking with Uri Guttman and he has me sold on Stem
(http://www.stemsystems.com), which will do this for me. I'm also
rethinking my strategy for managing the different servers; I still have
a lot of planning to do. Thanks for the response!
SS
------------------------------
Date: Mon, 9 Oct 2006 12:38:02 +0200
From: "Ferry Bolhar" <bol@adv.magwien.gv.at>
Subject: Re: How to parse a new computer language in Perl?
Message-Id: <1160390283.200831@proxy.dienste.wien.at>
Anno:
>> Please recommend a simple computer language parser module in Perl.
>
> You want a parser generator. Search CPAN (http://search.cpan.org/)
> for "parser". You'll find at least Parse::RecDescent, probably more.
If you already have experience with lex & yacc (the preferred U*X
tools for this kind of work), I'd suggest Parse::Lex and Parse::Yapp.
In particular, Parse::Yapp creates the same kind of bottom-up parser as
yacc does (Parse::RecDescent creates top-down parsers) and lets you
reuse (almost) unchanged yacc grammar (rule) descriptions!
Code examples for both parsers can be found in "Advanced Perl
Programming, 2nd edition" (O' Reilly 2005, ISBN 0-596-00456-7).
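To show what "top-down" means in practice, here is a hand-written recursive-descent sketch for arithmetic expressions. It deliberately uses none of the modules named above; the grammar, the sub names and the error messages are all made up for illustration, and a real project would more likely let one of those modules generate the parser.

```perl
use strict;
use warnings;

# Illustrative grammar, one sub per rule (that is the top-down part):
#   expr   := term   ( ('+' | '-') term   )*
#   term   := factor ( ('*' | '/') factor )*
#   factor := NUMBER | '(' expr ')'

# Split the source into number, operator and parenthesis tokens.
sub tokenize {
    my ($src) = @_;
    my @tokens;
    while ( $src =~ m{\G\s*(\d+(?:\.\d+)?|[-+*/()])}gc ) {
        push @tokens, $1;
    }
    # Anything left other than trailing whitespace is not part of the language.
    $src =~ m{\G\s*\z}gc or die "Unexpected character in input\n";
    return \@tokens;
}

sub parse_expr {
    my ($t) = @_;
    my $value = parse_term($t);
    while ( @$t && $t->[0] =~ /^[-+]$/ ) {
        my $op  = shift @$t;
        my $rhs = parse_term($t);
        $value = $op eq '+' ? $value + $rhs : $value - $rhs;
    }
    return $value;
}

sub parse_term {
    my ($t) = @_;
    my $value = parse_factor($t);
    while ( @$t && $t->[0] =~ m{^[*/]$} ) {
        my $op  = shift @$t;
        my $rhs = parse_factor($t);
        $value = $op eq '*' ? $value * $rhs : $value / $rhs;
    }
    return $value;
}

sub parse_factor {
    my ($t) = @_;
    my $tok = shift @$t;
    die "Unexpected end of input\n" unless defined $tok;
    return $tok if $tok =~ /^\d/;
    if ( $tok eq '(' ) {
        my $value = parse_expr($t);
        my $close = shift @$t;
        die "Expected ')'\n" unless defined $close && $close eq ')';
        return $value;
    }
    die "Unexpected token '$tok'\n";
}

# Parse and evaluate a whole expression, rejecting leftover tokens.
sub evaluate {
    my ($src) = @_;
    my $t     = tokenize($src);
    my $value = parse_expr($t);
    die "Trailing tokens: @$t\n" if @$t;
    return $value;
}

print evaluate('2 + 3 * (4 - 1)'), "\n";
```

A bottom-up parser of the yacc kind works the other way round: it shifts tokens onto a stack and reduces them to rules, driven by generated tables instead of one sub per rule.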
Greetings, Ferry
--
Ing Ferry Bolhar
Magistrat der Stadt Wien - MA 14
A-1010 Wien
E-Mail: bol@adv.magwien.gv.at
------------------------------
Date: 9 Oct 2006 00:03:29 -0700
From: "Dale" <dale.gerdemann@googlemail.com>
Subject: Re: LWP and Unicode
Message-Id: <1160377409.025656.192110@b28g2000cwb.googlegroups.com>
Mumia W. (reading news) wrote:
> On 10/06/2006 03:55 AM, Dale wrote:
> > Mumia W. (reading news) wrote:
> Your data seems to be UTF8, but you advertise it as iso-8859-1. Don't
> you think that will confuse user agents such as LWP::UserAgent?
>
>
Yes, I know. It's not configured properly for serving UTF8. That's why
I at first put it at a different URL where UTF8 is handled correctly.
But I forgot that this site is only local.
------------------------------
Date: Mon, 9 Oct 2006 04:42:09 GMT
From: merlyn@stonehenge.com (Randal Schwartz)
Subject: new CPAN modules on Mon Oct 9 2006
Message-Id: <J6uqE9.zKB@zorch.sf-bay.org>
The following modules have recently been added to or updated in the
Comprehensive Perl Archive Network (CPAN). You can install them using the
instructions in the 'perlmodinstall' page included with your Perl
distribution.
BSD-Sysctl-0.04
http://search.cpan.org/~dland/BSD-Sysctl-0.04/
Fetch sysctl values from BSD-like systems
----
CPAN-Inject-0.01
http://search.cpan.org/~adamk/CPAN-Inject-0.01/
Base class for injecting distributions into CPAN sources
----
CPAN-Inject-0.02
http://search.cpan.org/~adamk/CPAN-Inject-0.02/
Base class for injecting distributions into CPAN sources
----
CPAN-Inject-0.03
http://search.cpan.org/~adamk/CPAN-Inject-0.03/
Base class for injecting distributions into CPAN sources
----
CPAN-Inject-0.04
http://search.cpan.org/~adamk/CPAN-Inject-0.04/
Base class for injecting distributions into CPAN sources
----
CPAN-Reporter-0.28
http://search.cpan.org/~dagolden/CPAN-Reporter-0.28/
Provides Test::Reporter support for CPAN.pm
----
Catalyst-View-MicroMason-0.04_01
http://search.cpan.org/~jrockway/Catalyst-View-MicroMason-0.04_01/
MicroMason View Class
----
Clone-0.21
http://search.cpan.org/~rdf/Clone-0.21/
recursively copy Perl datatypes
----
Clone-0.22
http://search.cpan.org/~rdf/Clone-0.22/
recursively copy Perl datatypes
----
DBIx-Class-FormTools-0.000005
http://search.cpan.org/~djo/DBIx-Class-FormTools-0.000005/
Helper module for building forms with multiple related DBIx::Class objects.
----
Devel-LineTrace-0.1.6
http://search.cpan.org/~shlomif/Devel-LineTrace-0.1.6/
Apply traces to individual lines.
----
Devel-Symdump-2.0604
http://search.cpan.org/~andk/Devel-Symdump-2.0604/
dump symbol names or the symbol table
----
File-Next-OO-0.03
http://search.cpan.org/~borisz/File-Next-OO-0.03/
File-finding iterator Wrapper for File::Next::files function
----
Mail-Mailer-smtp_auth-0.01
http://search.cpan.org/~fayland/Mail-Mailer-smtp_auth-0.01/
a Net::SMTP_auth wrapper for Mail::Mailer
----
Net-Bluetooth-0.38
http://search.cpan.org/~iguthrie/Net-Bluetooth-0.38/
Perl Bluetooth Interface
----
POE-API-Peek-1.08
http://search.cpan.org/~sungo/POE-API-Peek-1.08/
Peek into the internals of a running POE environment
----
POE-API-Peek-1.0801
http://search.cpan.org/~sungo/POE-API-Peek-1.0801/
Peek into the internals of a running POE environment
----
PPM-Make-0.87
http://search.cpan.org/~rkobes/PPM-Make-0.87/
Make a ppm package from a CPAN distribution
----
Search-Tools-0.03
http://search.cpan.org/~karman/Search-Tools-0.03/
tools for building search applications
----
Term-Clui-1.37
http://search.cpan.org/~pjb/Term-Clui-1.37/
Perl module offering a Command-Line User Interface
----
Text-Bastardize-0.07
http://search.cpan.org/~ayrnieu/Text-Bastardize-0.07/
----
Text-Bastardize-0.08
http://search.cpan.org/~ayrnieu/Text-Bastardize-0.08/
----
WWW-Dict-0.0.1
http://search.cpan.org/~gugod/WWW-Dict-0.0.1/
Base class for WWW::Dict::* modules.
----
WWW-Dict-TWMOE-Phrase-0.04
http://search.cpan.org/~gugod/WWW-Dict-TWMOE-Phrase-0.04/
TWMOE Chinese Phrase Dictionary interface.
----
WWW-Dict-Zdic-0.0.3
http://search.cpan.org/~gugod/WWW-Dict-Zdic-0.0.3/
Zdic Chinese Dictionary interface
----
WWW-Dict-Zdic-0.0.4
http://search.cpan.org/~gugod/WWW-Dict-Zdic-0.0.4/
Zdic Chinese Dictionary interface
----
WWW-Dict-Zdic-v0.0.2
http://search.cpan.org/~gugod/WWW-Dict-Zdic-v0.0.2/
Zdic Chinese Dictionary interface
----
X11-Protocol-0.56
http://search.cpan.org/~smccam/X11-Protocol-0.56/
Perl module for the X Window System Protocol, version 11
----
XML-Genx-0.22
http://search.cpan.org/~hdm/XML-Genx-0.22/
A simple, correct XML writer
----
makepatch-2.03
http://search.cpan.org/~jv/makepatch-2.03/
create script to update a source tree
----
version-0.67_03
http://search.cpan.org/~jpeacock/version-0.67_03/
Perl extension for Version Objects
If you're an author of one of these modules, please submit a detailed
announcement to comp.lang.perl.announce, and we'll pass it along.
This message was generated by a Perl program described in my Linux
Magazine column, which can be found on-line (along with more than
200 other freely available past column articles) at
http://www.stonehenge.com/merlyn/LinuxMag/col82.html
print "Just another Perl hacker," # the original
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
------------------------------
Date: Mon, 9 Oct 2006 10:00:12 +0200
From: "Ferry Bolhar" <bol@adv.magwien.gv.at>
Subject: Re: Output of Concise
Message-Id: <1160380813.46952@proxy.dienste.wien.at>
J. Gleixner:
> The documentation for B::Concise seems to explain the output pretty well.
No, it doesn't. It just mentions something about "targets in brackets".
But it doesn't explain what these targets are and what the numbers
indicate.
Information is useless if one doesn't know what it means.
Greetings, Ferry
--
Ing Ferry Bolhar
Magistrat der Stadt Wien - MA 14
A-1010 Wien
E-Mail: bol@adv.magwien.gv.at
------------------------------
Date: 8 Oct 2006 21:30:56 -0700
From: "Davy" <zhushenli@gmail.com>
Subject: Re: Parse tree like data like XML by Perl?
Message-Id: <1160368256.901345.132120@m7g2000cwm.googlegroups.com>
Michael Goerz wrote:
> Davy wrote:
> > Hi all,
> >
> > I used to embed data in program use something like "case" or
> > "if..else". But my friend advice me to separate program and data. So I
> > want to use a tree-like data file like XML.
> Might be a good idea.
> >
> > The question is:
> > 1. Is there any better or easier standardized tree-like data structure
> > than XML?
> Probably not.
> > 2. If XML is better, I am new to it, what shall I learn to use it by
> > Perl?
> XML is pretty good. The best way to learn it is to do some reading and
> then jump in the water by coding a little bit. Some good resources to
> get you started:
>
> The absolute prime resource:
> http://perl-xml.sourceforge.net/faq/
>
> For a complete beginner maybe this is easiest:
> http://search.cpan.org/~grantm/XML-Simple-2.15/lib/XML/Simple.pm
>
> The full deal is something like this:
> http://search.cpan.org/~kmacleod/libxml-perl-0.08/
> Lot's of other modules on CPAN, too!
>
> Good Tutorial:
> http://www.xml.com/pub/a/2001/02/14/perlsax.html
> Check out other stuff at xml.com, too.
>
> Just do a little bit of reading, and then start coding, which is the
> best way to understand XML. There's somewhat of a learning curve,
> depending on how much you know already about perl and/or XML, but it
> should be worth it.
>
[snip]
Hi Michael,
Thanks a lot! I have written a lot of Perl code for a hardware simulation
environment, but I know only a little about XML (I just know <>).
I will try to understand the prime resource you mention.
Best regards,
Davy
> Michael
------------------------------
Date: 9 Oct 2006 04:19:46 -0700
From: "Woland99" <woland99@gmail.com>
Subject: Parsing and preserving comments
Message-Id: <1160392786.352621.243280@i42g2000cwa.googlegroups.com>
Hi - I need to parse a proprietary language (very Java-like syntax) in order
to automate population of data. It would not be a problem if those source
files were entirely generated by a script converting a spreadsheet into
source code. The problem is that when they maintained them by hand they
introduced a lot of comments inside the source code. And they want to
keep making changes by hand if necessary. So I need to parse it but
preserve comments and their position before the script adds any new
data or modifies current attributes of objects. Does anybody have any
suggestions beyond piling up a lot of pattern matching and hoping to
cover all the corner cases?
JT
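One possible starting point, offered as a sketch and not as an answer from the thread: tokenize the source so that comments, strings and whitespace are kept as tokens instead of being thrown away. Edits are then made to individual tokens, and joining all tokens back together reproduces everything else byte for byte. The sub name, the token type names and the sample source are all invented here; the real language may need more token types.

```perl
use strict;
use warnings;

# Split Java-like source into [type, text] tokens without discarding anything.
sub tokenize_source {
    my ($src) = @_;
    my @tokens;
    pos($src) = 0;
    while ( pos($src) < length $src ) {
        if    ( $src =~ m{\G(/\*.*?\*/)}gcs )        { push @tokens, [ comment => $1 ] }
        elsif ( $src =~ m{\G(//[^\n]*)}gc )          { push @tokens, [ comment => $1 ] }
        elsif ( $src =~ m{\G("(?:\\.|[^"\\])*")}gc ) { push @tokens, [ string  => $1 ] }
        elsif ( $src =~ m{\G(\s+)}gc )               { push @tokens, [ space   => $1 ] }
        elsif ( $src =~ m{\G(\w+)}gc )               { push @tokens, [ word    => $1 ] }
        elsif ( $src =~ m{\G(.)}gcs )                { push @tokens, [ punct   => $1 ] }
    }
    return @tokens;
}

# Made-up sample input with a block comment, a line comment and a string
# that merely looks like it contains a comment.
my $source = <<'END';
/* widget defaults */
widget.size = 10; // tuned by hand
widget.label = "a // not a comment";
END

my @tokens = tokenize_source($source);

# Change one value; every comment and all spacing stay where they were.
for my $tok (@tokens) {
    $tok->[1] = '12' if $tok->[0] eq 'word' && $tok->[1] eq '10';
}

print join '', map { $_->[1] } @tokens;
```

Because strings are matched as their own token type, comment markers inside a string literal are not mistaken for comments, which is one of the corner cases that plain pattern matching tends to miss.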
------------------------------
Date: Mon, 9 Oct 2006 12:05:45 +0200
From: "Dr.Ruud" <rvtol+news@isolution.nl>
Subject: Re: Posting Guidelines for comp.lang.perl.misc ($Revision: 1.6 $)
Message-Id: <egde80.i8.1@news.isolution.nl>
Ilya Zakharevich wrote:
> [PG-changes]
> One should not post rude replies even if you consider the message
> you reply to as violating these guidelines, as rude, as inbalanced,
> or as insane. If one can't post a polite informative up-to-a-point
> reply, one should not post at all...
Followups that are short and to-the-point and just informative to me,
are sometimes taken as rude by the OP. One is what one reads.
--
Affijn, Ruud
"Gewoon is een tijger."
------------------------------
Date: Mon, 9 Oct 2006 07:09:34 -0500
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: Posting Guidelines for comp.lang.perl.misc ($Revision: 1.6 $)
Message-Id: <slrneikevu.i70.tadmc@magna.augustmail.com>
Ilya Zakharevich <nospam-abuse@ilyaz.org> wrote:
> [A complimentary Cc of this posting was sent to
> Tad McClellan
><tadmc@augustmail.com>], who wrote in article <slrneij37s.f2g.tadmc@magna.augustmail.com>:
>> > I was thinking about changing the tone of "Posting Guidelines".
>
>> If you post proposed changes, then they can be discussed.
> I think we should pay equal attention to the *other guys*; their pas
> are faux too. I won't be able to quickly invent somethign socially
> acceptable, but it may go along the lines
>
> One should not post rude replies even if you consider the message
> you reply to as violating these guidelines, as rude, as inbalanced,
> or as insane. If one can't post a polite informative up-to-a-point
> reply, one should not post at all...
I don't see that there is much difference in that from what we
already have:
Do not use these guidelines as a "license to flame" or other
meanness. It is possible that a poster is unaware of things
discussed here. Give them the benefit of the doubt, and just
help them learn how to post, rather than assume that they do
know and are being the "bad kind" of Lazy.
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: Mon, 9 Oct 2006 08:29:39 -0500
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: Syntax for getting web page links
Message-Id: <slrneikjm3.i70.tadmc@magna.augustmail.com>
dysgraphia <ldolan@bigpond.net.au> wrote:
> This is my first attempt at a perl script.
If you intend to learn Perl programming, then you should not
put code into your programs unless you understand why you need
to put that code in your program.
It is also a good idea to check the Perl FAQ for questions related
to what you are trying to accomplish. For instance, if you had seen
this FAQ answer:
perldoc -q HTML
How do I fetch an HTML file?
Then you could replace 5 of your "use" statements with a single one.
> What I hope to do is have my script collect the links on the page
> listed under Events 2006
> So far I have only got this.
Thank you for including your code!
But you have included a bunch of stuff that you do not use.
If it is not used, then it should not be included.
The site you want to scrape does not require cookies, so don't
use cookies.
Your program does not use the DBI nor IO::Dir modules, so don't
include those modules.
> Any help to push me along
> would be appreciated!
Scraping a web page requires an intimate knowledge of the page's
structure and format.
The best and most robust way to process HTML data is with one of
the many HTML::* modules on the CPAN.
But for a dirty hack that prints URLs for the 2006 Events,
this should get you started:
---------------------------------------
#!/usr/bin/perl
use warnings;
use strict;
use LWP::Simple;
use URI::Escape;
my $html = get 'http://www.chessbase.com/events/index.asp';
$html =~ s/Events\s+/Events /g; # fix silly-formatted data
$html =~ s/.*Events 2006//s; # delete unwanted prefix
$html =~ s/Events 2005.*//s; # delete unwanted suffix
foreach my $line ( split /\n/, $html ) {
    if ( $line =~ /eventname=([^"]+)/ ) {
        my $eventname = uri_escape( $1 );
        print "http://www.chessbase.com/eventlist.asp?eventname=$eventname\n"
    }
}
---------------------------------------
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: 9 Oct 2006 01:07:46 -0700
From: "Dale" <dale.gerdemann@googlemail.com>
Subject: Re: Usenet and charsets (was: Re: LWP and Unicode)
Message-Id: <1160381266.579581.32990@h48g2000cwc.googlegroups.com>
Bart Van der Donck wrote:
> (1) Default to ISO-8859-1 when possible, yes even with plain ASCII.
> (2) Use custom charset if the offered characters can unambiguously be
> represented in that charset and if ISO-8859-1 is too narrow; perhaps
> also considering browser settings/preferences.
> (3) Use UTF-8 if the above fails; I suppose mostly in charset
> combinations, 'tricky' replies or really exotic stuff.
and Alan J Flavell wrote:
> ... successful communication depends on a certain
> conservatism in what one sends - not relying on the generosity of the
> recipient to interpret it liberally.
Isn't UTF8 the most conservative choice nowadays? Look at Wikipedia or
Wiktionary. Massive international websites all in UTF8. And look at the
Russian Wikipedia, for example. It doesn't use a "custom charset" at
all.
The idea that UTF8 should be reserved for "really exotic stuff" seems
very weird. Look at any Wikipedia page dealing with mathematics, and
you're bound to find UTF8 used for quite normal things. Here, for
example, is the rule for the associativity of function composition:
f ∘ (g ∘ h) = (f ∘ g) ∘ h
Try to say that in ASCII or ISO-8859-1!
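The claim is easy to check from Perl with the core Encode module. This sketch assumes the composition symbol meant above is the ring operator, U+2218; that is a guess about the original posting, not something the digest confirms.

```perl
use strict;
use warnings;
use Encode qw(encode);

# RING OPERATOR; assumed to be the composition symbol meant in the rule above.
my $ring = "\x{2218}";

# UTF-8 can represent it, as a three-byte sequence.
my $utf8 = encode( 'UTF-8', $ring );
printf "UTF-8 bytes: %s\n", join ' ', map { sprintf '%02X', ord($_) } split //, $utf8;

# With FB_CROAK, encode() dies on characters the target charset lacks.
# A copy is passed because encode() may modify its input when CHECK is set.
my $copy   = $ring;
my $latin1 = eval { encode( 'ISO-8859-1', $copy, Encode::FB_CROAK ) };
print defined $latin1
    ? "ISO-8859-1 can encode it\n"
    : "ISO-8859-1 cannot encode it\n";
```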
Dale
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 9823
***************************************