[30892] in Perl-Users-Digest


Perl-Users Digest, Issue: 2137 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sun Jan 18 06:09:40 2009

Date: Sun, 18 Jan 2009 03:09:07 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sun, 18 Jan 2009     Volume: 11 Number: 2137

Today's topics:
    Re: Checking return values (was: Re: opening a file) <tim@burlyhost.com>
    Re: Checking return values (was: Re: opening a file) <whynot@pozharski.name>
    Re: Checking return values (was: Re: opening a file) <tadmc@seesig.invalid>
    Re: Checking return values (was: Re: opening a file) <hjp-usenet2@hjp.at>
    Re: Checking return values (was: Re: opening a file) <hjp-usenet2@hjp.at>
    Re: Circular lists <xhoster@gmail.com>
    Re: Circular lists <gamo@telecable.es>
    Re: fastest way to allocate memory ? <nospam-abuse@ilyaz.org>
        new CPAN modules on Sun Jan 18 2009 (Randal Schwartz)
        Parsing out text from in between HTML tags tgwaltz@googlemail.com
    Re: Parsing out text from in between HTML tags <thepoet_nospam@arcor.de>
    Re: The Seven Stages of a Perl Programmer (was: Re: Wha <tadmc@seesig.invalid>
    Re: The Seven Stages of a Perl Programmer (was: Re: Wha <whynot@pozharski.name>
    Re: The Seven Stages of a Perl Programmer (was: Re: Wha <hjp-usenet2@hjp.at>
    Re: What do you need to have to be considered a Master  (Tim McDaniel)
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Sat, 17 Jan 2009 17:14:06 -0800
From: Tim Greer <tim@burlyhost.com>
Subject: Re: Checking return values (was: Re: opening a file)
Message-Id: <ynvcl.78410$3_4.14465@newsfe10.iad>

Tad J McClellan wrote:

>> I would personally never intentionally fail
>> to check a return on a call
> 
> 
> Just to show that my "pendulum of inflexibility" can also swing
> the other way, I *never* check the return value from print().
> 
> print() has a return value to indicate success, as do many functions
> in Perl.

Okay, you got me.  Good point.  I didn't mean that all of them should be.  I
can't say I've ever checked the return value of print either.  I guess I
meant any function whose result you actually need to check. :-)
-- 
Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
and Custom Hosting.  24/7 support, 30 day guarantee, secure servers.
Industry's most experienced staff! -- Web Hosting With Muscle!


------------------------------

Date: Sun, 18 Jan 2009 00:09:33 +0200
From: Eric Pozharski <whynot@pozharski.name>
Subject: Re: Checking return values (was: Re: opening a file)
Message-Id: <slrngn4lqs.j84.whynot@orphan.zombinet>

On 2009-01-17, Tad J McClellan <tadmc@seesig.invalid> wrote:
> Tim Greer <tim@burlyhost.com> wrote:
>
> [ recap: I wrote:
>
>     You should always, yes *always*, check the return value from open()
> ]
>
>
>> I would personally never intentionally fail
>> to check a return on a call
>
>
> Just to show that my "pendulum of inflexibility" can also swing
> the other way, I *never* check the return value from print().
>
> print() has a return value to indicate success, as do many functions in Perl.
>
> But once you have a successfully opened write filehandle, about the only
> thing that can go wrong with a print() is "filesystem full".
>
> (assuming the filehandle is connected to a real file rather than a 
>  socket or something.)
>
> If the filesystem is full, I won't need my little Perl program to tell
> me that something is wrong, because just about everything will fail to work.

Some time ago I concluded that checking the return value of B<close> on
a read-only filehandle is useless, since that syscall would fail only if
the whole system had crashed.  While reading this braindead thread I've
come to think that I was somewhat wrong.  Am I?

*CUT*

-- 
Torvalds' goal for Linux is very simple: World Domination
Stallman's goal for GNU is even simpler: Freedom


------------------------------

Date: Sat, 17 Jan 2009 21:10:53 -0600
From: Tad J McClellan <tadmc@seesig.invalid>
Subject: Re: Checking return values (was: Re: opening a file)
Message-Id: <slrngn57dt.t8o.tadmc@tadmc30.sbcglobal.net>

Eric Pozharski <whynot@pozharski.name> wrote:
> On 2009-01-17, Tad J McClellan <tadmc@seesig.invalid> wrote:
>> Tim Greer <tim@burlyhost.com> wrote:
>>
>> [ recap: I wrote:
>>
>>     You should always, yes *always*, check the return value from open()
>> ]
>>
>>
>>> I would personally never intentionally fail
>>> to check a return on a call
>>
>>
>> Just to show that my "pendulum of inflexibility" can also swing
>> the other way, I *never* check the return value from print().
>>
>> print() has a return value to indicate success, as do many functions in Perl.
>>
>> But once you have a successfully opened write filehandle, about the only
>> thing that can go wrong with a print() is "filesystem full".
>>
>> (assuming the filehandle is connected to a real file rather than a 
>>  socket or something.)
>>
>> If the filesystem is full, I won't need my little Perl program to tell
>> me that something is wrong, because just about everything will fail to work.
>
> Some time ago I concluded that checking the return value of B<close> on
> a read-only filehandle is useless, since that syscall would fail only if
> the whole system had crashed.  While reading this braindead thread I've
> come to think that I was somewhat wrong.  Am I?


If it was a pipe open, then yes, you should have checked.

See the 3rd paragraph in:

   perldoc -f close
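
For the pipe case, a minimal sketch of such a check might look like the
following.  The helper name run_and_check is hypothetical, the list form
of a piped open is assumed to be available (Unix-like systems), and the
$! / $? handling follows the description in perldoc -f close.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command through a read pipe and return the
# child's raw wait status (0 means the command exited successfully).
sub run_and_check {
    my (@cmd) = @_;
    open(my $pipe, '-|', @cmd) or die "cannot start @cmd: $!";
    my @output = <$pipe>;          # drain the pipe before closing it
    return 0 if close($pipe);
    # close() returned false.  A non-zero $! means a syscall failed;
    # if $! is 0, the only problem was the child's exit status in $?.
    die "error closing pipe from @cmd: $!" if $!;
    return $?;
}
```

The exit code is then $status >> 8 and the signal number, if any, is
$status & 127.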


-- 
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"


------------------------------

Date: Sun, 18 Jan 2009 10:17:31 +0100
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: Checking return values (was: Re: opening a file)
Message-Id: <slrngn5stb.37g.hjp-usenet2@hrunkner.hjp.at>

On 2009-01-18 01:14, Tim Greer <tim@burlyhost.com> wrote:
> Tad J McClellan wrote:
>>> I would personally never intentionally fail
>>> to check a return on a call
>> Just to show that my "pendulum of inflexibility" can also swing
>> the other way, I *never* check the return value from print().
>> 
>> print() has a return value to indicate success, as do many functions
>> in Perl.
>
> Okay, you got me.  Good point.  I didn't mean that all of them should be.  I
> can't say I've ever checked the return value of print either.  I guess I
> meant any function whose result you actually need to check. :-)

Errors on a file handle accumulate, so you can often defer checking until
close:

    open(my $fh, '>', $filename) or die ...;
    while (...) {
	my $a_record = compute_next_record();
	print $fh $a_record;
    }
    close($fh) or die "cannot close $filename: $!";

will die with the message "cannot close foo: No space left on device" if
the filesystem gets full during the loop.

That's fine for programs which run only a short time. But if your
program writes 100 million records and the disk gets full after 1
million, you don't want to waste time computing another 99 million
records which are never going to be written anyway. 

Similarly for filehandles which are never or infrequently closed, e.g.
logfiles of daemon processes. If you care about the data you write (or
about the rest of the system) you have to check the return value of
print.

You may not have to check each print, because - as mentioned above -
errors are "sticky". So if you have a loop like

    while (...) {
	print $fd $record_header;
	print $fd $first_part;
	some_sub_which_prints_more_parts();
	print $fd $record_trailer or die "...";
    }

it's probably sufficient to check the last print in the loop.
Or you can use $io->error:

    use IO::Handle;
    ...
    while (...) {
	print $fd $record_header;
	print $fd $first_part;
	some_sub_which_prints_more_parts();
	print $fd $record_trailer;
	die "..." if $fd->error;
    }



------------------------------

Date: Sun, 18 Jan 2009 10:29:41 +0100
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: Checking return values (was: Re: opening a file)
Message-Id: <slrngn5tk5.37g.hjp-usenet2@hrunkner.hjp.at>

On 2009-01-17 22:09, Eric Pozharski <whynot@pozharski.name> wrote:
> On 2009-01-17, Tad J McClellan <tadmc@seesig.invalid> wrote:
>> Tim Greer <tim@burlyhost.com> wrote:
>>
>> [ recap: I wrote:
>>
>>     You should always, yes *always*, check the return value from open()
>> ]
>>
>>
>>> I would personally never intentionally fail
>>> to check a return on a call
>>
>> Just to show that my "pendulum of inflexibility" can also swing
>> the other way, I *never* check the return value from print().
>>
>> print() has a return value to indicate success, as do many functions in Perl.
>>
>> But once you have a successfully opened write filehandle, about the only
>> thing that can go wrong with a print() is "filesystem full".
>>
>> (assuming the filehandle is connected to a real file rather than a 
>>  socket or something.)
>>
>> If the filesystem is full, I won't need my little Perl program to tell
>> me that something is wrong, because just about everything will fail to work.

You may still do something sensible, like removing the partial file(s)
you have just written. Especially if they are several Gigs long.
And a message "no space left on device" is much more user-friendly than
causing "everything to fail" in mysterious ways.
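
A rough sketch of that kind of cleanup, under the assumption that the
output goes to a regular file which is safe to delete; write_records is
a made-up name, not something from this thread:

```perl
use strict;
use warnings;

# Hypothetical helper: write all records to $filename.  On any I/O error
# the partial file is removed and the system's error message is reported.
sub write_records {
    my ($filename, @records) = @_;
    open(my $fh, '>', $filename) or die "cannot open $filename: $!";
    my $error;
    for my $record (@records) {
        next if print {$fh} $record;
        $error = "$!";             # remember the first failure and stop
        last;
    }
    # close() flushes buffered data, so it can fail even when every
    # print appeared to succeed.
    close($fh) or $error ||= "$!";
    if (defined $error) {
        unlink $filename;          # do not leave a truncated file behind
        die "cannot write $filename: $error\n";
    }
    return scalar @records;
}
```

Only use the unlink step on paths the program created itself.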

Besides, on a properly administered multi-user system, one program
should not be able to cause "everything" to fail. There are such things
as partitions, quotas, etc. which can be used to limit the damage.


> Some time ago I concluded that checking the return value of B<close> on
> a read-only filehandle is useless, since that syscall would fail only if
> the whole system had crashed.  While reading this braindead thread I've
> come to think that I was somewhat wrong.  Am I?

Yes. Errors on a file handle are sticky, so close will also fail if any
I/O operation on that handle has failed. So checking the return value of
close is often a handy short-cut for checking every single print,
printf, etc. But see my other posting on this matter.

	hp


------------------------------

Date: Sat, 17 Jan 2009 16:04:39 -0800
From: Xho Jingleheimerschmidt <xhoster@gmail.com>
Subject: Re: Circular lists
Message-Id: <4972884c$0$25703$ed362ca5@nr5c.newsreader.com>

gamo wrote:
> On Thu, 15 Jan 2009, Xho Jingleheimerschmidt wrote:
> 
>> gamo wrote:
>>
>>> The thing could change radically if there is a method to canonicalize all
>>> the rotations of a list in a compact string. Did you say that is possible?
>> Of course.  From a previous post:
>>
>>     my $s = join '',@set;
>>     my $two = $s . $s;
>>     next if ($two =~ /gg/);
>>     my ($canon) = sort map {substr $two,$_,length $s} 0..length($s) -1;
>>
> 
> That's fine, but it doesn't work for handling the circular permutations.

Of course it does.  That is the only thing it works for.  That is the 
whole point of the code.  When I originally posted the code, it was 
tested and verified to work, converging on the same answer as the other 
implementation did.


> It detects distinct permutations

Distinct permutations need no canonicalization.  They do not have 
multiple different representations of the same underlying conceptual 
thing, so canonicalization is neither necessary nor possible.  (Unless 
you actually did append a digit to each letter to indicate the original 
order, as someone proposed as a pedagogical exercise.  Then stripping 
that digit off would be a form of canonicalization.)


> but not the circular ones, as far as
> I tested. Probably the problem comes from taking the first element of the 
> sorted list (only).

Taking just the first element of the sorted list (of strings, which are
themselves implicit lists of letters) is the whole point.  If, for some 
reason, you later need all the rotations, not just the canonical one, 
you can recompute them on the fly given the canonical one.  I can't 
comment on what your tests were showing you, unless you
post the test code.

 ...

> In the case of 20 letters it has 20*20 rotations per item. 

There are only 20 rotations, not 20*20.  And you only save one of them, if that.

> About testing candidates on the fly I still can't imagine how it could be 
> done without using hashes of the previous ones. 

Rather than using memory and hashes to explicitly compare to the past, 
you use logic and rules to implicitly compare to the past and future.
To pull it off, you need to combine both the distinct permutation 
generator and the circular canonicalization together, otherwise it won't 
work.

1) Has this linear permutation been generated in the past, or will it be 
generated again in the future (of this particular execution of the 
code)?  No!

1a) How do you know that?  Because I successfully implemented a distinct 
permutation generator designed explicitly to have that behavior.

2) Will the underlying circular concept represented by this linear 
permutation be represented by some other linear permutation, past or 
future?  Yes!

2a) So how can I efficiently communicate with those past and future 
representations so that one and only one of us knows whether we are the 
"it" one or not?  Do it by rule, not by explicit communication.  Each of 
us tests whether our representation canonicalizes to itself.  If it 
does, I know I am "it".  If it doesn't,  I know that one of the other 
representations is (or will be) "it", so I bow out gracefully.
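
The "canonicalizes to itself" rule in 2a can be sketched roughly as
below.  The canonical form is the one from the earlier snippet (the
smallest rotation, taken from the doubled string); the sub names
canonical_rotation and is_canonical are invented for illustration, and
the distinct permutation generator is assumed to exist elsewhere.

```perl
use strict;
use warnings;

# Smallest rotation of $s in string order: the canonical form.
sub canonical_rotation {
    my ($s) = @_;
    return $s if length($s) == 0;
    my $two = $s . $s;             # every rotation is a substring of $s.$s
    my ($canon) = sort map { substr $two, $_, length $s } 0 .. length($s) - 1;
    return $canon;
}

# A linear permutation is "it" when it is its own canonical form.
sub is_canonical {
    my ($s) = @_;
    return $s eq canonical_rotation($s);
}
```

Given a generator that emits each distinct linear permutation exactly
once, counting only those for which is_canonical() is true counts each
circular arrangement once, with no hash of previously seen results.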

Cheers,

Xho


------------------------------

Date: Sun, 18 Jan 2009 10:15:38 +0100
From: gamo <gamo@telecable.es>
Subject: Re: Circular lists
Message-Id: <alpine.LNX.2.00.0901180958590.8148@jvz.es>

On Sat, 17 Jan 2009, Xho Jingleheimerschmidt wrote:

> > That's fine, but it doesn't work for handling the circular permutations.
>
> Of course it does.  That is the only thing it works for.  That is the whole
> point of the code.  When I originally posted the code, it was tested and
> verified to work, converging on the same answer as the other implementation
> did.
>
>
> > It detects distinct permutations
>
> Distinct permutations need no canonicalization.  They do not have multiple
> different representations of the same underlying conceptual thing, so
> canonicalization is neither necessary nor possible.  (Unless you actually did
> append a digit to each letter to indicate the original order, as someone
> proposed as a pedagogical exercise.  Then stripping that digit off would be a
> form of canonicalization.)
>
>
> > but not the circular ones, as far as
> > I tested. Probably the problem comes from taking the first element of the
> > sorted list (only).
>
> Taking just the first element of the sorted list (of strings, which are
> themselves implicit lists of letters) is the whole point.  If, for some
> reason, you later need all the rotations, not just the canonical one, you can
> recompute them on the fly given the canonical one.  I can't comment on what
> your tests were showing you, unless you
> post the test code.
>
> ...

You are right, again.

#!/usr/local/bin/perl -w
use List::Util qw(shuffle);
@a = qw(a a a a a r r g g n);
# @a = qw(a a b b b c c c c d d d d d e e e e e e);
$n = @a;
for (1..10_000_000){
    @set = shuffle(@a);
    $s = join '',@set;
    $two = $s . $s;
    # canonical form: the smallest of the $n rotations of $s
    my ($canon) = sort map {substr $two,$_,$n} 0..$n-1;
    if (!defined $clist{$s} && !defined $hash{$canon}){
        $clist{$s}++;
#       print "$s -> length canon $canon ", length $canon, "\n";
        $hash{$canon}++;
        $exito++;
        print "$0 $exito\n";
        $kount=0;
    }else{
        $kount++;
        # give up after 100_000 consecutive shuffles that find nothing new
        last if $kount>=100_000;
    }
}
print "The number of circular permutations is $exito\n";

__END__
756

>
> > In the case of 20 letters it has 20*20 rotations per item.
>
> There are only 20 rotations, not 20*20.  And you only save one of them, if that.
>
> > About testing candidates on the fly I still can't imagine how it could be done
> > without using hashes of the previous ones.
>
> Rather than using memory and hashes to explicitly compare to the past, you use
> logic and rules to implicitly compare to the past and future.
> To pull it off, you need to combine both the distinct permutation generator
> and the circular canonicalization together, otherwise it won't work.
>
> 1) Has this linear permutation been generated in the past, or will it be generated
> again in the future (of this particular execution of the code)?  No!
>
> 1a) How do you know that?  Because I successfully implemented a distinct
> permutation generator designed explicitly to have that behavior.
>
> 2) Will the underlying circular concept represented by this linear permutation
> be represented by some other linear permutation, past or future?  Yes!
>
> 2a) So how can I efficiently communicate with those past and future
> representations so that one and only one of us knows whether we are the "it"
> one or not?  Do it by rule, not by explicit communication.  Each of us tests
> whether our representation canonicalizes to itself.  If it does, I know I am
> "it".  If it doesn't,  I know that one of the other representations is (or
> will be) "it", so I bow out gracefully.
>
Good idea.

> Cheers,
>
> Xho
>

Best regards,

-- 
http://www.telecable.es/personales/gamo/
"Was it a car or a cat I saw?"
perl -E 'say 111_111_111**2;'


------------------------------

Date: Sun, 18 Jan 2009 08:34:08 +0000 (UTC)
From:  Ilya Zakharevich <nospam-abuse@ilyaz.org>
Subject: Re: fastest way to allocate memory ?
Message-Id: <gkupi0$m2m$1@agate.berkeley.edu>

[A complimentary Cc of this posting was sent to
georg.heiss@gmx.de
<georg.heiss@gmx.de>], who wrote in article <bbace9ea-54e6-4b7e-8b89-4e81e89d6a60@b38g2000prf.googlegroups.com>:
> Hi, i try to allocate 10GB of memory on my box and it takes about 27
> seconds.

You are allocating 30000MiB, not 10GB (and not 20GB, as somebody
mentioned) + malloc() overhead.

> my $gras = "A" x (1024 * 1024 * 10000);

RHS is constant, thus computed at compile time ==> 10000MiB.
LHS takes another 10000MiB.

> my $needle = "B";
> print "\nAllocated " . length($gras) . " byte buffer\n";
> $gras = $gras.$needle;

RHS takes another 10000MiB.

> Is there a faster way to do this?

 my $gras = "A";
 $gras x= 1024 * 1024 * 10000;
 $gras .= $needle;

would have a 3x smaller memory footprint.

But if the *allocation* is a bottleneck, make sure your perl is
compiled with "my" malloc.  It takes major care to make allocation as
quick as possible...

  perl -V:".*malloc.*"

Hope this helps,
ILya


------------------------------

Date: Sun, 18 Jan 2009 05:42:25 GMT
From: merlyn@stonehenge.com (Randal Schwartz)
Subject: new CPAN modules on Sun Jan 18 2009
Message-Id: <KDnJup.wIH@zorch.sf-bay.org>

The following modules have recently been added to or updated in the
Comprehensive Perl Archive Network (CPAN).  You can install them using the
instructions in the 'perlmodinstall' page included with your Perl
distribution.

AI-Genetic-Pro-0.31
http://search.cpan.org/~strzelec/AI-Genetic-Pro-0.31/
Efficient genetic algorithms for professional purpose. 
----
AI-Genetic-Pro-0.32
http://search.cpan.org/~strzelec/AI-Genetic-Pro-0.32/
Efficient genetic algorithms for professional purpose. 
----
Acme-Boolean-0.2
http://search.cpan.org/~gugod/Acme-Boolean-0.2/
There is more then one way to be true. 
----
Apache2-EmbedFLV-0.1
http://search.cpan.org/~damog/Apache2-EmbedFLV-0.1/
Embed FLV videos into a templated web interface using Flowplayer. 
----
Best-0.12
http://search.cpan.org/~gaal/Best-0.12/
Fallbackable module loader 
----
Binding-0.02
http://search.cpan.org/~gugod/Binding-0.02/
eval with variable binding of caller stacks. 
----
CPANPLUS-Dist-RPM-0.0.8
http://search.cpan.org/~rsrchboy/CPANPLUS-Dist-RPM-0.0.8/
a CPANPLUS backend to build RPM 
----
Cisco-Abbrev-0.02
http://search.cpan.org/~kbrint/Cisco-Abbrev-0.02/
Translate to/from Cisco Interface Abbreviations 
----
Config-IniFiles-2.46
http://search.cpan.org/~shlomif/Config-IniFiles-2.46/
A module for reading .ini-style configuration files. 
----
Config-IniHash-3.00.03
http://search.cpan.org/~jenda/Config-IniHash-3.00.03/
Perl extension for reading and writing INI files 
----
Finance-Quote-Sberbank-0.02
http://search.cpan.org/~kilork/Finance-Quote-Sberbank-0.02/
Obtain quotes from Sberbank (Savings Bank of the Russian Federation) 
----
GD-3DBarGrapher-0.9.6
http://search.cpan.org/~swarhurst/GD-3DBarGrapher-0.9.6/
Create 3D bar graphs using GD 
----
KiokuDB-0.22
http://search.cpan.org/~nuffin/KiokuDB-0.22/
Object Graph storage engine 
----
KiokuDB-Backend-BDB-0.12
http://search.cpan.org/~nuffin/KiokuDB-Backend-BDB-0.12/
BerkeleyDB backend for KiokuDB. 
----
KiokuDB-Backend-CouchDB-0.03
http://search.cpan.org/~nuffin/KiokuDB-Backend-CouchDB-0.03/
CouchDB backend for KiokuDB 
----
Language-Befunge-4.09
http://search.cpan.org/~jquelin/Language-Befunge-4.09/
a generic funge interpreter 
----
Markaya-0.40
http://search.cpan.org/~gugod/Markaya-0.40/
Markup As YAML 
----
Math-BaseCalc-1.013
http://search.cpan.org/~kwilliams/Math-BaseCalc-1.013/
Convert numbers between various bases 
----
MooseX-Traits-Attribute-CascadeClear-0.03
http://search.cpan.org/~rsrchboy/MooseX-Traits-Attribute-CascadeClear-0.03/
Attribute trait to cascade clearer actions 
----
Net-FSP-0.13
http://search.cpan.org/~leont/Net-FSP-0.13/
A client implementation of the File Service Protocol 
----
Net-FSP-0.14
http://search.cpan.org/~leont/Net-FSP-0.14/
A client implementation of the File Service Protocol 
----
Net-INET6Glue-0.3
http://search.cpan.org/~sullr/Net-INET6Glue-0.3/
Make common modules IPv6 ready by hotpatching 
----
Net-SMS-ASPSMS-0.1.3
http://search.cpan.org/~supcik/Net-SMS-ASPSMS-0.1.3/
Interface to ASMSMS services 
----
POE-Component-SmokeBox-Recent-1.01_01
http://search.cpan.org/~bingos/POE-Component-SmokeBox-Recent-1.01_01/
A POE component to retrieve recent CPAN uploads. 
----
Perl-Dist-1.11
http://search.cpan.org/~adamk/Perl-Dist-1.11/
Perl Distribution Creation Toolkit 
----
Perl-Dist-Strawberry-1.08
http://search.cpan.org/~adamk/Perl-Dist-Strawberry-1.08/
Strawberry Perl for win32 
----
Perl6ish-0.01
http://search.cpan.org/~gugod/Perl6ish-0.01/
Some Perl6 programming in Perl5 code. 
----
Perl6ish-0.02
http://search.cpan.org/~gugod/Perl6ish-0.02/
Some Perl6 programming in Perl5 code. 
----
Pod-ToDocBook-0.1
http://search.cpan.org/~zag/Pod-ToDocBook-0.1/
Pluggable converter POD data to DocBook. 
----
Provision-Unix-0.37
http://search.cpan.org/~msimerson/Provision-Unix-0.37/
provision accounts on unix systems 
----
RDF-RDFa-Parser-Trine-0.01
http://search.cpan.org/~kjetilk/RDF-RDFa-Parser-Trine-0.01/
Use a RDF::Trine::Model for the returned RDF graph 
----
Set-Scalar-1.23
http://search.cpan.org/~jhi/Set-Scalar-1.23/
basic set operations 
----
Syntax-Highlight-Perl6-0.034
http://search.cpan.org/~azawawi/Syntax-Highlight-Perl6-0.034/
Perl 6 Syntax Highlighter 
----
Sys-Sendfile-0.01
http://search.cpan.org/~leont/Sys-Sendfile-0.01/
Zero-copy data transfer 
----
Template-Plugin-Num2Word-0.30
http://search.cpan.org/~gugod/Template-Plugin-Num2Word-0.30/
Convert numbers to words in Template. 
----
Test-Continuous-0.61
http://search.cpan.org/~gugod/Test-Continuous-0.61/
Run your tests suite continusouly when developing. 
----
Test-Continuous-0.62
http://search.cpan.org/~gugod/Test-Continuous-0.62/
Run your tests suite continusouly when developing. 
----
Test-POE-Server-TCP-0.16
http://search.cpan.org/~bingos/Test-POE-Server-TCP-0.16/
A POE Component providing TCP server services for test cases 
----
Test-WWW-Mechanize-1.24
http://search.cpan.org/~petdance/Test-WWW-Mechanize-1.24/
Testing-specific WWW::Mechanize subclass 
----
Tkx-1.06
http://search.cpan.org/~gaas/Tkx-1.06/
Yet another Tk interface 
----
Video-FourCC-Info-1.0
http://search.cpan.org/~frequency/Video-FourCC-Info-1.0/
Find information about codecs specified as Four Character Code 
----
WWW-Amazon-Wishlist-1.602
http://search.cpan.org/~mthurn/WWW-Amazon-Wishlist-1.602/
grab all the details from your Amazon wishlist 
----
WWW-Bleep-0.9.1
http://search.cpan.org/~snevine/WWW-Bleep-0.9.1/
Perl interface to Bleep.com 
----
WWW-Bleep-0.91
http://search.cpan.org/~snevine/WWW-Bleep-0.91/
Perl interface to Bleep.com 
----
WWW-FreshMeat-API-0.01
http://search.cpan.org/~draegtun/WWW-FreshMeat-API-0.01/
inspect & update your freshmeat.net projects 
----
WWW-Pastebin-NoPasteCom-Create-0.001
http://search.cpan.org/~zoffix/WWW-Pastebin-NoPasteCom-Create-0.001/
create new pastes on http://nopaste.com/ pastebin site 
----
WWW-Pastebin-NoPasteCom-Create-0.0102
http://search.cpan.org/~zoffix/WWW-Pastebin-NoPasteCom-Create-0.0102/
create new pastes on http://nopaste.com/ pastebin site 
----
WWW-Pastebin-PastebinCom-Create-0.002
http://search.cpan.org/~zoffix/WWW-Pastebin-PastebinCom-Create-0.002/
paste to <http://pastebin.com> from Perl. 
----
Win32-API-0.58
http://search.cpan.org/~cosimo/Win32-API-0.58/
Perl Win32 API Import Facility 
----
XML-ExtOn-0.07
http://search.cpan.org/~zag/XML-ExtOn-0.07/
The handler for expansion of Perl SAX by objects. 
----
XML-FeedPP-0.37
http://search.cpan.org/~kawasaki/XML-FeedPP-0.37/
Parse/write/merge/edit RSS/RDF/Atom syndication feeds 
----
XML-TreePP-0.37
http://search.cpan.org/~kawasaki/XML-TreePP-0.37/
Pure Perl implementation for parsing/writing XML documents 
----
indirect-0.10
http://search.cpan.org/~vpit/indirect-0.10/
Lexically warn about using the indirect object syntax. 


If you're an author of one of these modules, please submit a detailed
announcement to comp.lang.perl.announce, and we'll pass it along.

This message was generated by a Perl program described in my Linux
Magazine column, which can be found on-line (along with more than
200 other freely available past column articles) at
  http://www.stonehenge.com/merlyn/LinuxMag/col82.html

print "Just another Perl hacker," # the original

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion


------------------------------

Date: Sat, 17 Jan 2009 19:59:43 -0800 (PST)
From: tgwaltz@googlemail.com
Subject: Parsing out text from in between HTML tags
Message-Id: <949a69c8-3651-4b35-9ca3-f631355f004f@z28g2000prd.googlegroups.com>

Hello -

I'm new to perl and am having a tough time trying to complete a
theoretically simple statement.  What I'm trying to do is write a very
simple search engine that searches an html file for a given
searchQuery.  The way it's set up now is that if the searchQuery is
something like "java," every single page is a hit because the word
"javascript" is in the code in the form of the "<script
language="javascript">" etc.  I want to specify that $searchQuery
should be surrounded like so:

">(anything)searchQuery(anything)<"

In other words, the searchQuery has to be in between two HTML tags.
Here's what I have at this point (the wrong way):

return unless ($fileName =~ /\Q$searchQuery\E/i);

Any help would be greatly appreciated!

Thanks,
TW


------------------------------

Date: Sun, 18 Jan 2009 08:06:41 +0100
From: Christian Winter <thepoet_nospam@arcor.de>
Subject: Re: Parsing out text from in between HTML tags
Message-Id: <4972d4dc$0$31877$9b4e6d93@newsspool3.arcor-online.net>

tgwaltz@googlemail.com schrieb:
> I'm new to perl and am having a tough time trying to complete a
> theoretically simple statement.  What I'm trying to do is write a very
> simple search engine that searches an html file for a given
> searchQuery.  The way it's set up now is that if the searchQuery is
> something like "java," every single page is a hit because the word
> "javascript" is in the code in the form of the "<script
> language="javascript">" etc.  I want to specify that $searchQuery
> should be surrounded like so:
> 
> ">(anything)searchQuery(anything)<"
> 
> In other words, the searchQuery has to be in between two HTML tags.
> Here's what I have at this point (the wrong way):
> 
> return unless ($fileName =~ /\Q$searchQuery\E/i);
> 
> Any help would be greatly appreciated!

Most likely you'll only get a partly working solution if you approach
this problem with a regular expression. There are all kinds of
things that can go wrong. I'd leave the parsing of the HTML to
a module that knows what it's doing, like HTML::TreeBuilder.

use HTML::TreeBuilder;

my $t = HTML::TreeBuilder->new_from_file( "input.html" );
# or my $t = HTML::TreeBuilder->new_from_content( $html );
if( index( $t->as_text, $searchQuery ) >= 0 ) {
   # ... found ...
}

Using this module, you could also search for your query in
different attributes, e.g. link titles:

my $foundlinks = $t->look_down(
   '_tag',  'a',
   'title', qr/\Q$searchQuery\E/i   # \Q...\E: treat the query as literal text
);
if( $foundlinks ) {
   # ... had a hit ...
}


-Chris


------------------------------

Date: Sat, 17 Jan 2009 21:12:17 -0600
From: Tad J McClellan <tadmc@seesig.invalid>
Subject: Re: The Seven Stages of a Perl Programmer (was: Re: What do you need to have to be considered a Master at Perl?)
Message-Id: <slrngn57gh.t8o.tadmc@tadmc30.sbcglobal.net>

Eric Pozharski <whynot@pozharski.name> wrote:
> On 2009-01-17, Tad J McClellan <tadmc@seesig.invalid> wrote:

>>    The Seven Stages of a Perl Programmer


>>     * Understands why regexes can't match nested data.
>
> If I got that right, it's not the case any more.


The seven stages is circa 1997.


-- 
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"


------------------------------

Date: Sun, 18 Jan 2009 00:04:17 +0200
From: Eric Pozharski <whynot@pozharski.name>
Subject: Re: The Seven Stages of a Perl Programmer (was: Re: What do you need to have to be considered a Master at Perl?)
Message-Id: <slrngn4lh1.j84.whynot@orphan.zombinet>

On 2009-01-17, Tad J McClellan <tadmc@seesig.invalid> wrote:
> sln@netherlands.com <sln@netherlands.com> wrote:
>
>> Subject: What do you need to have to be considered a Master at Perl?
*SKIP*
> I can't find Tom Christiansen's original, so I'll go with Nat's 
> reconstruction of:
>
>    The Seven Stages of a Perl Programmer
>
>
> (From http://prometheus.frii.com/~gnat/yapc/2000-stages/)
>
*SKIP*
>     * Understands why regexes can't match nested data.

If I got that right, it's not the case any more.

>     * Rewrites minor utilities in Perl.

Hmm...

*CUT*

-- 
Torvalds' goal for Linux is very simple: World Domination
Stallman's goal for GNU is even simpler: Freedom


------------------------------

Date: Sun, 18 Jan 2009 10:38:41 +0100
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: The Seven Stages of a Perl Programmer (was: Re: What do you need to have to be considered a Master at Perl?)
Message-Id: <slrngn5u51.37g.hjp-usenet2@hrunkner.hjp.at>

On 2009-01-17 22:04, Eric Pozharski <whynot@pozharski.name> wrote:
> On 2009-01-17, Tad J McClellan <tadmc@seesig.invalid> wrote:
>> sln@netherlands.com <sln@netherlands.com> wrote:
>>> Subject: What do you need to have to be considered a Master at Perl?
> *SKIP*
>> I can't find Tom Christiansen's original, so I'll go with Nat's 
>> reconstruction of:
>>
>>    The Seven Stages of a Perl Programmer
>>
>> (From http://prometheus.frii.com/~gnat/yapc/2000-stages/)
>>
> *SKIP*
>>     * Understands why regexes can't match nested data.
>
> If I got that right, it's not the case any more.

Right. I expected a point "has figured out how to match nested data with
regexps" in one of the later stages, but it wasn't there ;-).
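
For illustration, one way to match nested parentheses with the recursive
patterns introduced in Perl 5.10 is sketched below.  The sub name
is_balanced and the group name "group" are arbitrary choices, and this
is only one of several possible formulations.

```perl
use 5.010;
use strict;
use warnings;

# True if $text is exactly one balanced, possibly nested, parenthesised
# group.  (?&group) recurses into the named group, which is what lets
# the pattern follow nesting of arbitrary depth.
sub is_balanced {
    my ($text) = @_;
    return $text =~ /\A(?<group>\((?:[^()]++|(?&group))*\))\z/ ? 1 : 0;
}
```

So is_balanced('(a(b)c)') is true, while is_balanced('(a(b)c') is not.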

	hp


------------------------------

Date: Sat, 17 Jan 2009 21:03:49 -0600
From: tmcd@panix.com (Tim McDaniel)
Subject: Re: What do you need to have to be considered a Master at Perl?
Message-Id: <lpia46-qfv.ln1@tmcd.austin.tx.us>

In article <Zo2dnQ4xtqrsxu_UnZ2dnUVZ_tTinZ2d@giganews.com>,
~greg <g_m@remove-comcast.net> wrote:
>The perlre section on (??{code}) warns that
>
>   Recursing deeper than 50 times without consuming
>   any input string will result in a fatal error. The maximum
>   depth is compiled into perl, so changing it requires a
>   custom build.
 ...
>If the warning had said "recursing deeper than 50 times...will result
>in a fatal error" then I would have felt I understood it.

If I read that, I would assume that the coder had allocated a
fixed-size stack of 50 elements.  (I'd say that the design was
inferior to one without a fixed small limit, of course.)

>What I do not understand is how "consuming ... input string" could
>change the situation.

I strongly suspect it's a heuristic.  It's not that the code could not
recurse deeper in that case or others.  It's that the designer chose
to forbid this particular case.  "Without consuming any input string"
means that for 50 calls, the recursion made absolutely no progress.
They figured that, if it's made no progress in 50 recursions, it's
never going to make progress.  It's a rule-of-thumb intended to stop
an infinite recursion that's going nowhere, before it sucks down all
of available memory, and sucks down oodles of time filling it.

As an analogy, it's like the pseudocode

    ping -c 1 -w 1 $MY_ISP
    if exit code was not 0,
        die with an error saying that we can't reach Internet host $MY_ISP

Maybe the machine's network connection was flaky or just coming up.
Maybe if I'd instead had it ping 2 times (-c 2) or take 2 seconds to
check (-w 2), it would have gotten a response.  But I chose a
rule-of-thumb: if we can't get one ping packet back in one second,
the network is deemed to be unusable.

-- 
Tim McDaniel; Reply-To: tmcd@panix.com


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 2137
***************************************

