[22676] in Perl-Users-Digest

Perl-Users Digest, Issue: 4897 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat Apr 26 14:05:48 2003

Date: Sat, 26 Apr 2003 11:05:06 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sat, 26 Apr 2003     Volume: 10 Number: 4897

Today's topics:
    Re: "make tardist" does not include subdirectories <pilsl_usenet@goldfisch.at>
    Re: "make tardist" does not include subdirectories <pilsl_usenet@goldfisch.at>
    Re: analyising haskell source for <<loop>> dependencies (Jason Smith)
    Re: check if it's an English word (Walter Roberson)
    Re: Curses and perl5.8 <tassilo.parseval@rwth-aachen.de>
    Re: Curses and perl5.8 <REMOVEsdnCAPS@comcast.net>
    Re: Dumping hash after sort <tassilo.parseval@rwth-aachen.de>
    Re: Errors running Randal and Damian's Parse::RecDescen (Randal L. Schwartz)
    Re: Getting FULL path+filename from a filehandle <krahnj@acm.org>
    Re: How to find out installed packages on Unix <bigj@kamelfreund.de>
    Re: How to send and receive on IP PORT? <sammie@greatergreen.com>
    Re: Is there an array for ($1, $2, $3, ...) <krahnj@acm.org>
    Re: Is there an array for ($1, $2, $3, ...) <bigj@kamelfreund.de>
    Re: Is there an array for ($1, $2, $3, ...) <ptlen@ceti.pl>
    Re: Is there an array for ($1, $2, $3, ...) <uri@stemsystems.com>
    Re: Is there an array for ($1, $2, $3, ...) <ptlen@ceti.pl>
    Re: Is there an array for ($1, $2, $3, ...) <uri@stemsystems.com>
    Re: Just curous about this- are REGEXes rigorously dete <private@claudio.ch>
        Parsing HTML pages... best way out of the dozens of opt <nerdy1@snet.net>
    Re: Parsing HTML pages... best way out of the dozens of <bongie@gmx.net>
    Re: Regex greediness question <nerdy1@snet.net>
    Re: Tough question for the guru's; Grep Once, Awk Twice <tassilo.parseval@rwth-aachen.de>
    Re: Tough question for the guru's; Grep Once, Awk Twice <bart.lateur@pandora.be>
    Re: XS or SWIG <peter_wilson@mail.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Sat, 26 Apr 2003 14:21:06 +0200
From: peter pilsl <pilsl_usenet@goldfisch.at>
Subject: Re: "make tardist" does not include subdirectories
Message-Id: <3eaa79f3$2@e-post.inode.at>

Janek Schleicher wrote:

> 
> Have you tried to write the files of the submodules into the MANIFEST
> file?
> 

so easy if one knows ... :)

thnx,
peter

-- 
peter pilsl
pilsl_usenet@goldfisch.at
http://www.goldfisch.at



------------------------------

Date: Sat, 26 Apr 2003 18:18:08 +0200
From: peter pilsl <pilsl_usenet@goldfisch.at>
Subject: Re: "make tardist" does not include subdirectories
Message-Id: <3eaab183$1@e-post.inode.at>

Anno Siegel wrote:

> 
> Have you re-run "perl Makefile.PL" after adding the additional modules?
> That should fix it.
> 

yep - I tried this several times. It didn't fix the issue. I had to change
the MANIFEST file as Janek suggested.

thnx,
peter

-- 
peter pilsl
pilsl_usenet@goldfisch.at
http://www.goldfisch.at



------------------------------

Date: 26 Apr 2003 01:07:37 -0700
From: chastel_pelerin@hotmail.com (Jason Smith)
Subject: Re: analyising haskell source for <<loop>> dependencies.
Message-Id: <623cd325.0304260007.41cc0724@posting.google.com>

Hmm just reading the "Tough question" thread, let me rephrase, just to
make sure,

All I want is a couple of pointers on how to pull out the variable
names from either side of the '=' operator and compare the two lists.

No writing the solution for me required.

Thanks
Jason.

> Hi All, I have little/no Perl experience and was wondering if anyone
> could quickly hash this requirement out for me...
> 
> basically given say a haskell source, snippet below.
> 
> fvars   = TPX.listDiffBy cmpVar nl' pdl
> pdl'    = fvars `union` pdl
> 
> It can identify where this may occur
> 
> pdl'    = fvars `union` pdl'
> 
> i.e. create a <<loop>> during runtime?
> 
> so basically I envision a script that will check either side of a '='
> operator and determine if we are using a variable in an assignment to
> that variable.
> 
> Thanks
> J.
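
A rough sketch of the check described above, assuming every binding sits
on a single line of the form "name = expression" and that identifiers look
like [A-Za-z_][A-Za-z0-9_']*. It ignores guards, where-clauses, function
arguments, string literals and shadowing, so a hit is only a hint and not
proof of a <<loop>>. The sub name self_referencing is made up for this
illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the names of single-line bindings whose
# right-hand side mentions the name being defined.
sub self_referencing {
    my @suspects;
    for my $line (@_) {
        # Left side: one identifier (primes allowed); right side: the rest.
        my ($lhs, $rhs) = $line =~ /^\s*([A-Za-z_][\w']*)\s*=(?!=)\s*(.*)$/
            or next;
        # Every identifier on the right side, e.g. fvars, union, pdl'.
        my %used = map { $_ => 1 } $rhs =~ /[A-Za-z_][\w']*/g;
        push @suspects, $lhs if $used{$lhs};
    }
    return @suspects;
}

my @source = (
    q{fvars   = TPX.listDiffBy cmpVar nl' pdl},
    q{pdl'    = fvars `union` pdl},
    q{pdl'    = fvars `union` pdl'},
);
print "possible loop: $_\n" for self_referencing(@source);
```

Only the third sample line is reported, because pdl' appears on both
sides of its own '='.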


------------------------------

Date: 26 Apr 2003 04:48:58 GMT
From: roberson@ibd.nrc-cnrc.gc.ca (Walter Roberson)
Subject: Re: check if it's an English word
Message-Id: <b8d33q$rhn$1@canopus.cc.umanitoba.ca>

In article <776e0325.0304251939.53141b73@posting.google.com>,
Sara <genericax@hotmail.com> wrote:
|roberson@ibd.nrc-cnrc.gc.ca (Walter Roberson) wrote in message news:<b8bl3i$7f3$1@canopus.cc.umanitoba.ca>...
|> In article <3EA95376.363F6B8C@tamu.edu>,
|> Bing Du Test  <bing-du@tamu.edu> wrote:
|> :Is there any Perl module that can verify if part of a string is an
|> :English word?

|> There is no reliable way to do this in any computer language.
|> New English words are invented pretty much every day, and there is
|> no central registry of valid English words.

|Perhaps you should take this up with the makers of "Scrabble"? Your
|argument makes their product useless, yet many enjoy it each day.

The rules of Scrabble are NOT written to permit any "English word":
the rules of Scrabble are written to only permit words that are in the
Official Scrabble(R) Player's Dictionary.

If Bing had wanted a module to verify whether part of a string is in
the Official Scrabble(R) Player's Dictionary, then Bing could have said
so.  It's not a particularly big list of words, as word lists go.

|Perhaps to 99.99% he can be assured it's an English word? Let's try to
|maintain some notion of realism in our programming pursuits?

I guess he could try writing a module that attempted to reverse map
out prefixes and suffixes to get a reasonable stem, and then submit the
stem for lookup to www.oed.com (which is not free.) The Second Edition
had 291,500 entries (that's words without all the variants due to suffixing
unless the suffix noticeably changes the meaning.)

The Official Scrabble(R) Player's Dictionary has "over 100,000"
2-8 letter words. "You cannot use proper names, foreign words
that are not in common use, and words that contain hyphens are not allowed."
Note that the word 'dictionary' has more than 8 letters and so would not
be listed...

If one is willing to accept OED as being somehow authoritative as
to whether a word is English or not, and if one had a perfect
word-stemming algorithm, then 99.99% accuracy over 291,500 main entries
would allow only 29 misses. Using the Official Scrabble(R) Player's
Dictionary as the source would, though, miss out on about 180,000
words... and that's without even considering the extra words in
the third edition that OED is about half way through producing.

By way of comparison: the standard Unix /usr/lib/dict/words
has fewer than 24,000 entries, and thus would miss more than 90%
of English words as determined by OED.


Sooo... I'd say that if, as you say, you want some notion of
realism in this program, you should be prepared to accept
a completeness rate of no better than 1 in 3 "English words".


That having been said, 99.99% accuracy on randomly chosen pages
of English text might be obtainable, as English word distribution
is not at all uniformly random, and the combined frequency of the 90%
least frequent English words might perhaps be less than 0.01%.
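
The figures above can be checked with a few lines of Perl. This is only
a sketch of the arithmetic: the three counts are the ones quoted in this
post, the OSPD size is given only as "over 100,000" so 100_000 is a
lower bound, and the variable names are made up.

```perl
use strict;
use warnings;

my $oed   = 291_500;   # OED Second Edition main entries
my $ospd  = 100_000;   # Official Scrabble Player's Dictionary (lower bound)
my $words =  24_000;   # upper bound for /usr/lib/dict/words

# Misses allowed at 99.99% accuracy over the OED entries (1 in 10,000).
my $allowed_misses = int( $oed / 10_000 );

# Share of the OED entries each smaller list could cover at best.
my $ospd_coverage  = 100 * $ospd  / $oed;
my $words_coverage = 100 * $words / $oed;

printf "misses allowed at 99.99%%: %d\n", $allowed_misses;
printf "OSPD coverage:  %.1f%%\n", $ospd_coverage;
printf "words coverage: %.1f%%\n", $words_coverage;
```

That gives 29 allowed misses, roughly one word in three for the Scrabble
list, and under 10% for the Unix word list.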
-- 
Are we *there* yet??


------------------------------

Date: 26 Apr 2003 08:06:01 GMT
From: "Tassilo v. Parseval" <tassilo.parseval@rwth-aachen.de>
Subject: Re: Curses and perl5.8
Message-Id: <b8del9$d0b$1@nets3.rz.RWTH-Aachen.DE>

Thus spoke Eric J. Roode:

> "David Formosa (aka ? the Platypus)" <dformosa@dformosa.zeta.org.au>
> wrote in news:slrnbahp8p.phe.dformosa@dformosa.zeta.org.au:
> 
>> I'm trying to install Curses.pm under perl5.8 however when I try it
>> fails to compile.  I can't see anything on deja-google about it. Is
>> this a known problem or something I should submit a bug report on?
>  
> Let me guess:  Perl_sv_isa?
> 
> This is a known problem, although I don't know why there isn't more
> Net noise about it.
> 
> You need to change the above symbol to "sv_isa".  The compiler
> message tells you where the error is -- some .c file, line 250 or 275
> (I forget exactly).  Just change that one line and re-run make. 
> It'll work.

Are you sure that this is the cause? Compilation worked fine for me when
I tried it yesterday on Perl5.8.0, despite the Perl_sv_isa thing. Also
looking at the perl-headers, sv_isa is eventually #defined as
Perl_sv_isa so it should be the same thing.

Tassilo
-- 
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval


------------------------------

Date: Sat, 26 Apr 2003 10:39:46 -0500
From: "Eric J. Roode" <REMOVEsdnCAPS@comcast.net>
Subject: Re: Curses and perl5.8
Message-Id: <Xns936976A3C61A0sdn.comcast@216.166.71.239>

-----BEGIN xxx SIGNED MESSAGE-----
Hash: SHA1

"Tassilo v. Parseval" <tassilo.parseval@rwth-aachen.de> wrote in
news:b8del9$d0b$1@nets3.rz.RWTH-Aachen.DE:

> Are you sure that this is the cause? Compilation worked fine for me when
> I tried it yesterday on Perl5.8.0, despite the Perl_sv_isa thing. Also
> looking at the perl-headers, sv_isa is eventually #defined as
> Perl_sv_isa so it should be the same thing.

I suspect that there is some sort of conditional compilation going
on, so that under some circumstances (I don't know what), one of the
#defines doesn't happen.

- -- 
Eric
print scalar reverse sort qw p ekca lre reh 
ts uJ p, $/.r, map $_.$", qw e p h tona e;
-----BEGIN xxx SIGNATURE-----
Version: GnuPG v1.2.1 (MingW32) - WinPT 0.5.13

iD8DBQE+qqg/Y96i4h5M0egRAggAAJsFSc+XJWgfSTUTe4Zoy1xxOsdMQwCfZzWb
KggF5GhzhD/q45CBwE42Ymc=
=IMmV
-----END PGP SIGNATURE-----


------------------------------

Date: 26 Apr 2003 08:02:09 GMT
From: "Tassilo v. Parseval" <tassilo.parseval@rwth-aachen.de>
Subject: Re: Dumping hash after sort
Message-Id: <b8dee1$csl$1@nets3.rz.RWTH-Aachen.DE>

Please don't top-post. Before you do anything, halt and read
<http://www.xs4all.nl/~wijnands/nnq/nquote.html>, Q7 in particular.

Thus spoke TruthXayer:

> Thanks Tassilo, but somehow I can't get it to work even
> for a simple sort by value. I could work around by sorting
> the hash in normal way but just trying to get the elegant 
> solution to work. Any help is appreciated...
> 
> my %baz = (
> 			"A" => 34,
> 			"B" => 75,
> 			"C" => 21,
> 		);
> 
> $Data::Dumper::Sortkeys = \&my_filter2;
> 
> sub my_filter2 {
>              my ($hash) = shift;
>              return [    sort { $$hash{b} <=> $$hash{$a} }
                                         ^             ^^
>               (keys %$hash) ];
>              
>          }
> 
> print Dumper ( \%baz );

If I run the above with warnings enabled, I get

Use of uninitialized value in numeric comparison (<=>) at dump.pl line 12.
Use of uninitialized value in numeric comparison (<=>) at dump.pl line 12.
Use of uninitialized value in numeric comparison (<=>) at dump.pl line 12.

And indeed, there's again this 'b' instead of '$b' typo in a
hash-subscript. Didn't we just recently have this in another thread?
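
For reference, a minimal sketch of the same filter with the subscript
written as $b, assuming the goal is to dump the keys ordered by
descending value. It uses an anonymous sub instead of the named
my_filter2, which is a stylistic choice and not part of the original.

```perl
use strict;
use warnings;
use Data::Dumper;

my %baz = ( A => 34, B => 75, C => 21 );

# Sortkeys may be a code ref that receives the hash ref and returns an
# array ref holding the keys in the order they should be dumped.
$Data::Dumper::Sortkeys = sub {
    my ($hash) = @_;
    return [ sort { $hash->{$b} <=> $hash->{$a} } keys %$hash ];
};

my $dump = Dumper( \%baz );
print $dump;
```

With both $a and $b spelled out, the comparison sees real values and the
"uninitialized value" warnings go away.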

Tassilo
-- 
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval


------------------------------

Date: Sat, 26 Apr 2003 04:43:12 GMT
From: merlyn@stonehenge.com (Randal L. Schwartz)
To: loopy1@ureach.com (J. H.)
Subject: Re: Errors running Randal and Damian's Parse::RecDescent examples
Message-Id: <4cd147891cfe9c9d87811f6aac531347@TeraNews>

>>>>> "J" == J H <loopy1@ureach.com> writes:

J> But, I'm having difficulty getting them to work, either. When I run
J> Randal Schwartz's example from
J> http://www.stonehenge.com/merlyn/UnixReview/col40.html,
J> using the following code (against my computer's win.ini file):

J> #! /usr/bin/perl -w

J> use Parse::RecDescent;

J> my $grammar = q{

J>     file:
J>     sections /\z/
J>     { my %return;
J>       my $sections = $item{sections};
J>       for my $section (@$sections) {
J>         my ($section_marker, $definitions) = @$section;
J>         for my $definition (@$definitions) {
J>           my ($key, $value) = @$definition;
J>           for ($return{$section_marker}{$key}) {
J>             if (not defined $_) {
J>               $_ = $value;
J>             } elsif (not ref $_) {
J>               $_ = [$_, $value];
J>             } else {
J>               push @$_, $value;
J>             }
J>           }
J>         }
J>       }
J>       \%return;
J>     }

J> };

J> my $parser = Parse::RecDescent->new($grammar);

J> I get the following error:

J> Warning: Undefined (sub)rule "sections" used in a production.

Uh, that's not the whole grammar.  You have to read the whole article!

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!


------------------------------

Date: Sat, 26 Apr 2003 09:13:36 GMT
From: "John W. Krahn" <krahnj@acm.org>
Subject: Re: Getting FULL path+filename from a filehandle
Message-Id: <3EAA4DA2.96A2B490@acm.org>

"Michael P. Broida" wrote:
> 
> "John W. Krahn" wrote:
> >
> > "Michael P. Broida" wrote:
> > >
> > > I've been digging through the Camel book and haven't been
> > > able to find out how to get the FULL path+filename from a
> > > filehandle.
> >
> > Assuming that you are running Linux:
> >
> > my $fullpath = readlink "/proc/$$/fd/" . fileno filehandle;
> 
>         Any ideas how to do that on Win2K?  I'm not lucky enough
>         to be using any kind of Unix here.  <grin>

Sorry, I haven't used any form of Windows for many years now.

>         What is the "fileno" in your example??

perldoc -f fileno

>         Oh, I think I see
>         it (now that I see what the $$ part was all about).  I don't
>         think there's any true parallel to that in Windows, though.

From past experience writing programs for DOS and Windows, no there
isn't.

>         It seems (from the Camel book) that "readlink" requires a
>         symbolic link:
>                 EXPR should evaluate to a filename, the last
>                 component of which is a symbolic link.
>         It also only wants one argument, not two as in your example.

The example has one argument.  The two strings are joined together via
the concatenation operator (it's a period, not a comma.)
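
A small sketch that makes the single argument visible by building it in
its own variable first. It assumes a Linux-style /proc filesystem; the
temporary file from File::Temp only stands in for "some open filehandle"
and is not part of the original example.

```perl
use strict;
use warnings;
use File::Temp;

my $fh = File::Temp->new;    # any open filehandle will do

# The dot joins the two pieces into ONE string before readlink sees it.
my $link = "/proc/$$/fd/" . fileno $fh;

# On Linux this resolves to the path the handle was opened on; elsewhere
# /proc may not exist and readlink returns undef.
my $fullpath = readlink $link;
print defined $fullpath
    ? "$link -> $fullpath\n"
    : "$link cannot be resolved on this system\n";
```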


John
-- 
use Perl;
program
fulfillment


------------------------------

Date: Sat, 26 Apr 2003 14:55:58 +0200
From: "Janek Schleicher" <bigj@kamelfreund.de>
Subject: Re: How to find out installed packages on Unix
Message-Id: <pan.2003.04.26.12.55.55.926398@kamelfreund.de>

Wang, Vincent wrote at Fri, 25 Apr 2003 14:28:19 -0700:

> Do you know how to find out the Perl packages that are already installed on
> my UNIX or Linux, like “ppm query” does on win32?

You might also have a look at the CPAN module
ExtUtils::Installed

e.g.
perl -MExtUtils::Installed -e 'print join "\n", ExtUtils::Installed->new->modules'

Greetings,
Janek


------------------------------

Date: Sat, 26 Apr 2003 06:57:52 GMT
From: "Brad Walton" <sammie@greatergreen.com>
Subject: Re: How to send and receive on IP PORT?
Message-Id: <Q5qqa.626681$L1.178343@sccrnsc02>

Thank you Mina, I have been searching all day, looking at Socket() and
IO::Socket::INET->new(), trying to find a fairly simple solution. Let me go
into a little more detail, and see if you have any ideas for an easier
solution, or maybe an already existing script.

I have a program sitting on a server that already has the data formatted and
ready to send.
On the other machine, I simply need to sit there and listen for information
coming from the server. Once that info is received, it grabs it, and puts it
in an array or some sort of results param.

It should be real basic, and dumb... meaning neither side should care if the
other side is there. I guess UDP is the way to go for this. Does that bring
any other ideas to mind?
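
A minimal sketch of the fire-and-forget setup described above, assuming
UDP through the core IO::Socket::INET module. Both ends run in one
process over the loopback address so that the example finishes on its
own; the address, the payload "reading=42" and the two-second timeout
are made up for illustration.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;

# Listening side: bind a UDP socket. Port 0 lets the OS pick a free port;
# a real listener would bind the port both machines agreed on.
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'udp',
) or die "cannot bind UDP socket: $!";
my $port = $listener->sockport;

# Sending side: no connection handshake, the datagram is simply sent.
my $sender = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $port,
    Proto    => 'udp',
) or die "cannot create sender: $!";
$sender->send("reading=42") or die "send failed: $!";

# Collect whatever arrives into an array, waiting at most two seconds.
my @received;
if ( IO::Select->new($listener)->can_read(2) ) {
    my $datagram;
    defined $listener->recv( $datagram, 1024 ) or die "recv failed: $!";
    push @received, $datagram;
}
print "got: $_\n" for @received;
```

On two machines the listener half would sit in a loop around recv, and
the sender half would point PeerAddr at the listener's address. UDP gives
no delivery guarantee, which matches the "neither side should care"
requirement but means lost datagrams go unnoticed.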

Thanks again, I appreciate the time, and I will look through the links,
Sammie (Brad)


"Mina Naguib" <spam@thecouch.homeip.net> wrote in message
news:IUlqa.14305$_w.273717@wagner.videotron.net...
> -----BEGIN xxx SIGNED MESSAGE-----
> Hash: SHA1
>
> Brad Walton wrote:
> > I am looking for information on how to send information and receive
> > (listen) for information on a port. For example, I want to have a perl
> > program running on one PC, while another sits on a remote machine and
> > listens for incoming data on a specified port. What would this process
> > be called? And are there any examples or tutorials of how this is
> > accomplished?
> >
> > Thanks,
> > Sammie
> >
> >
>
> Hi Sammie, or Brad...
>
> This is simple IP traffic (Internet Protocol).  As long as both machines
> are on the same IP network (for example, the internet) then you can
> easily make them talk to each other.
>
> Start your quest by lots of reading. Here are some resources to get you
> started:
> http://www.perldoc.com/perl5.8.0/pod/perlipc.html
> http://search.cpan.org/author/JHI/perl-5.8.0/ext/IO/lib/IO/Socket/INET.pm
> http://search.cpan.org/author/JHI/perl-5.8.0/ext/IO/lib/IO/Socket.pm
> http://search.cpan.org/author/JHI/perl-5.8.0/ext/Socket/Socket.pm
> http://search.cpan.org/author/JWIED/Net-Daemon-0.37/lib/Net/Daemon.pm
>
> And my own:
> http://search.cpan.org/author/MNAGUIB/EasyTCP-0.19/EasyTCP.pm
>
> The above is just a small list of MANY available modules to help you do
> what you mentioned.
>
> Best of luck.
> -----BEGIN xxx SIGNATURE-----
> Version: GnuPG v1.2.1 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iD8DBQE+qeqoeS99pGMif6wRAtD0AKCHd3drtOK7hZup5YUEXCKjl0a2hQCfVXx2
> WpIVeIBC0QoOLIGjSOWb4ns=
> =0dSQ
> -----END PGP SIGNATURE-----
>




------------------------------

Date: Sat, 26 Apr 2003 06:53:18 GMT
From: "John W. Krahn" <krahnj@acm.org>
Subject: Re: Is there an array for ($1, $2, $3, ...)
Message-Id: <3EAA2CAD.35806D6@acm.org>

Sara wrote:
> 
> Yes much more stylish. These constructs come to mind when I'm
> "mapping" like
> 
>     my %h = map /..()..()../, @a;
> 
> but you're correct, it looks like it would have broader application.
> That's something I like about Perl - just like watching Spinal Tap, no
> matter how many times you think you've seen it all something else pops
> up you never noticed!
> 
> By the way, in like 500,000 lines of Perlcode now I've never used m//,
> only //. I never did understand that "m"!

The "m" is optional if the delimiters are //.  The "m" is required
however if you use any delimiters other than //, e.g. m<>, m##, m||,
m!!, m{}, etc.
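
A short sketch of where the alternative delimiters pay off, assuming a
pattern that itself contains slashes. The sample path is made up.

```perl
use strict;
use warnings;

my $path = '/usr/local/bin/perl';

# With slashes as delimiters the "m" is optional, but every literal slash
# in the pattern needs a backslash.
my $with_slashes = $path =~ /\/local\/bin\// ? 1 : 0;

# With any other delimiter the "m" is required, and the slashes can stay
# unescaped.
my $with_braces = $path =~ m{/local/bin/} ? 1 : 0;
my $with_hashes = $path =~ m#/local/bin/# ? 1 : 0;

print "slashes=$with_slashes braces=$with_braces hashes=$with_hashes\n";
```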


John
-- 
use Perl;
program
fulfillment


------------------------------

Date: Sat, 26 Apr 2003 15:13:18 +0200
From: "Janek Schleicher" <bigj@kamelfreund.de>
Subject: Re: Is there an array for ($1, $2, $3, ...)
Message-Id: <pan.2003.04.26.13.13.15.154053@kamelfreund.de>

Sara wrote at Fri, 25 Apr 2003 17:00:24 -0700:

> but you're correct, it looks like it would have broader application.
> That's something I like about Perl - just like watching Spinal Tap, no
> matter how many times you think you've seen it all something else pops
> up you never noticed!
> 
> By the way, in like 500,000 lines of Perlcode now I've never used m//,
> only //. I never did understand that "m"!

As John said,
it's mainly useful if you need or want other delimiters.

However, I have also used the explicit m/.../ style.
One main reason for it
is that editors can get confused in some situations
(regexps over more than one line,
 with #comments inside,
 ending on $ -- so it might look like the $/ variable,
 regexps as a little part of a longer statement,
 ...
)
where they need an explicit help to detect the /.../ part as a regexp for
the syntax highlighting.

Sometimes it also helps to increase the readability (even without syntax
highlighting), especially if you work with many different /.../ parts,
e.g. with
qw/..../
s/.../.../;
tr/.../.../;
qq/.../;
q/.../;
qr/.../;
then I find it often useful also to write the regexp as m/.../
(Of course you could argue that using the same delimiters for many
 different purposes is a bad programming habit
 - in fact I prefer q{...} and qq{...} to distinguish if possible -
 but you'll need it at least when changing a script that wasn't written by
 yourself
)

But most important is TIMTOWTDI.


Greetings,
Janek


------------------------------

Date: Sat, 26 Apr 2003 15:09:37 +0000 (UTC)
From: Filip G. <ptlen@ceti.pl>
Subject: Re: Is there an array for ($1, $2, $3, ...)
Message-Id: <slrnbal4pi.th.ptlen@localhost.localdomain>

Sara wrote:
> Of course we can use @ARGV for ($ARGV[0], $ARGV[1], ...)
> and @_ for ($_[0], $_[1],...)
> 
> Is there an array defined for ($1, $2, $3,...)  after a regex defines
> them? I did a perldoc -q on regex and some other keywords but I didn't
> discover anything related.

No, there is no array, but you can use symbolic references:

/your-regexp-goes-here/
for$i(1..9)
{
	print "${$i}\n";
}




------------------------------

Date: Sat, 26 Apr 2003 17:00:14 GMT
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: Is there an array for ($1, $2, $3, ...)
Message-Id: <x7he8lfb35.fsf@mail.sysarch.com>

>>>>> "FG" == Filip G <ptlen@ceti.pl> writes:

  FG> Sara wrote:
  >> Of course we can use @ARGV for ($ARGV[0], $ARGV[1], ...)
  >> and @_ for ($_[0], $_[1],...)
  >> 
  >> Is there an array defined for ($1, $2, $3,...)  after a regex defines
  >> them? I did a perldoc -q on regex and some other keywords but I didn't
  >> discover anything related.

  FG> No, there is no array, but you can use symbolic references:

  FG> /your-regexp-goes-here/
  FG> for$i(1..9)
  FG> {
  FG> 	print "${$i}\n";
  FG> }

GACK!!

did you even read any of the other fine answers posted in this thread?

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org


------------------------------

Date: Sat, 26 Apr 2003 17:35:39 +0000 (UTC)
From: Filip G. <ptlen@ceti.pl>
Subject: Re: Is there an array for ($1, $2, $3, ...)
Message-Id: <slrnbaldbc.th.ptlen@localhost.localdomain>

Uri Guttman wrote:
>>>>>> "FG" == Filip G <ptlen@ceti.pl> writes:
[...]
>  FG> No, there is no array, but you can use symbolic references:
>  FG> 	print "${$i}\n";
>
> GACK!!
> 
> did you even read any of the other fine answers posted in this thread?

Sure. It was just my $0.03, just a curiosity. It might come in handy
from time to time, but of course using @x = $y=~/.../ is usually much
more reasonable.

Filip G.



------------------------------

Date: Sat, 26 Apr 2003 17:53:37 GMT
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: Is there an array for ($1, $2, $3, ...)
Message-Id: <x765p1f8m6.fsf@mail.sysarch.com>

>>>>> "FG" == Filip G <ptlen@ceti.pl> writes:

  FG> Uri Guttman wrote:
  >>>>>>> "FG" == Filip G <ptlen@ceti.pl> writes:
  FG> [...]
  FG> No, there is no array, but you can use symbolic references:
  FG> print "${$i}\n";
  >> 
  >> GACK!!
  >> 
  >> did you even read any of the other fine answers posted in this thread?

  FG> Sure. It was just my $0.03, just a curiosity. It might come in handy
  FG> from time to time, but of course using @x = $y=~/.../ is usually much
  FG> more reasonable.

i would like you to show any case where it might be a good solution
and a simple array assignment (or non-symref solution) wouldn't
suffice?

you can see my general stand on symrefs by searching google. i can't
supply a fresh rant each time i see it used here. but the rule is:

	don't use symrefs unless you are munging the symbol
	table. symrefs are not for general data structures.

simple to learn and obey. and if you used strict all the time, you can't
break it without disabling it.
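
a minimal sketch of the non-symref route, assuming the goal is simply to
have the captures in an array. it runs under strict, and the sample
string and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $stamp = 'Sat, 26 Apr 2003 17:53:37 GMT';

# A match in list context returns the captures ($1, $2, $3, ...) in order,
# or an empty list when the pattern does not match.
my @hms = $stamp =~ /(\d\d):(\d\d):(\d\d)/;

if (@hms) {
    print "capture $_: $hms[$_ - 1]\n" for 1 .. @hms;
}
```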

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org


------------------------------

Date: 26 Apr 2003 12:53:54 +0200
From: Claudio Nieder <private@claudio.ch>
Subject: Re: Just curous about this- are REGEXes rigorously deterministic
Message-Id: <3eaa6542$1@news.swissonline.ch>

>:mind as I compose regexes that there may be sets of inputs & regexes
>:that have 2 or more valid solutions? I doubt that the trivial ones I
> 
> There is more than one valid representation for any regex in perl.

That several regexes can have the same result is clear. The question is
whether a given regex always has the same result when presented with a
certain input. Could it be that two implementations of the perl regular
expression specification, though both fully conform to the specification
of perlre(1) and have no bugs, produce a different result for the same
regex and the same input to match or substitute?

			claudio
-- 
Claudio Nieder, Kanalweg 1, CH-8610 Uster, Tel +41 79 357 6743
yahoo messenger: claudionieder aim: claudionieder icq:42315212
mailto:private@claudio.ch                http://www.claudio.ch



------------------------------

Date: Sat, 26 Apr 2003 12:11:16 GMT
From: "Tman" <nerdy1@snet.net>
Subject: Parsing HTML pages... best way out of the dozens of options?
Message-Id: <EHuqa.1843$Yb2.400780353@newssvr10.news.prodigy.com>

I have an HTML page with a number of tables on it, and I need to locate a
certain table (by looking through all the tables until I find one that
matches a certain regex), and then iterate through the rows and columns of
that table.

What is the best way to do this?.... :)

So I see that HTML::TableExtract will parse out a table into rows and
columns.  Very nice.

But it doesn't have any way to find a table in a web page with content that
matches a certain regex.

So my first attempt was to find the table in the HTML by trying to match
<TABLE.... (my content) ... /TABLE> with a regex.  But as another thread
here discusses ("Regex greediness question"), writing such a regex is not
that easy.  To summarize, I would need to write a certain regex, and then
iterate through all the tables in the HTML page until I happened upon one
with my content.  And that would only work if there were no nested tables.

Posters there had suggestions to use HTML::Parser and its ilk for parsing
the HTML instead of toying with regexes.  But I have a basic question with
using that parser and its friends.  It seems that it parses the HTML and
delivers a stream of events to your handler functions, one for each tag
start and end, etc.  Wouldn't it be much easier to grok the HTML using a
parser which instead of delivering tag events to your code, exposed a data
structure containing the already parsed HTML in convenient object types?

I know it's possible to do what I am trying to do dozens of different ways,
and I am trying to find the easiest, most straightforward approach.  I have
a lot of HTML "screen scraping" to do, and I would like to have the code
which does the scraping be as light and straightforward as possible.... just
encoding the logic and knowledge to grok the HTML pages, all the other stuff
in a library and out of the code.

Several years ago I did a _lot_ of HTML screen scraping and there was an
excellent Java-based tool called "webl" for this purpose.  It parsed the
HTML, then allowed the user to extract a set of "piece" objects (which
mapped to HTML elements) based upon certain queries.  Then, you could
perform set algebra operations on these sets of pieces.  For example, for my
original problem, my code would call into webl to:
- Fetch and Parse the HTML
- a =  the set of all "table" pieces in the HTML.
- b = the set of all pieces in the HTML whose text matches the certain
string that I am looking for within this table.
- c = the set of all items in a which enclose b.  E.g. all the tables with
that certain string in them.
- Assert that c is a set of size 1.  The single item in set c is the table
that I am looking for.

And so on; this hardly expresses the full power of what webl could do.  It
allowed one to grok through HTML at a very high level, and not worry too
much about stuff not directly related to the structure of the page.  As you
can see, the code to grok the HTML is also quite independent of structural
changes made to the source page... as long as there is exactly one table
with a certain string in it we are all set.

Is there anything like this out on CPAN for Perl?  If not... I may be about
to take matters into my own hands and build one based upon the raw
HTML::Parser and friends.

If you want to see more of what webl can do, check this link:
http://www.research.compaq.com/SRC/WebL/htmldocu/WebL.html

(The relevant stuff is in Chapter 4)

Thanks,
Tman.

PS: Why don't I just use webl?  Well, it has not been maintained for years
due to certain legal wranglings I think, and also it defines its own
scripting language, and of course there is not very good support for that
scripting language re: debuggers, etc.  We don't need another scripting
language.  But the webl concepts of pieces, sets, and the algebra on them
are great, and I would love to see the equivalent exposed to Perl in the
form of an object-based set of modules.






------------------------------

Date: Sat, 26 Apr 2003 15:34:00 +0200
From: "Harald H.-J. Bongartz" <bongie@gmx.net>
Subject: Re: Parsing HTML pages... best way out of the dozens of options?
Message-Id: <1057118.2cj2UPpubv@nyoga.dubu.de>

Tman wrote:
> Wouldn't it be much easier to grok the HTML using
> a parser which instead of delivering tag events to your code, exposed
> a data structure containing the already parsed HTML in convenient
> object types?

You may want to take a look at HTML::TreeBuilder. 
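
A rough sketch of how the "find the one table containing a string, then
walk its rows" task might look with it. It assumes HTML::TreeBuilder is
installed from CPAN; the helper name tables_matching and the sample HTML
are made up for illustration, and picking the innermost matching table
is just one way to cope with nested tables.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: innermost <table> elements whose text matches $re.
sub tables_matching {
    my ( $tree, $re ) = @_;
    # All tables whose combined text contains the string.
    my @hits = grep { $_->as_text =~ $re } $tree->look_down( _tag => 'table' );
    # Drop outer tables that only match because a nested table does.
    return grep {
        my $t = $_;
        !grep { $_->tag eq 'table' && $_->as_text =~ $re } $t->descendants;
    } @hits;
}

my $html = <<'HTML';
<html><body>
<table><tr><td>Menu</td></tr></table>
<table>
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
</table>
</body></html>
HTML

my $tree  = HTML::TreeBuilder->new_from_content($html);
my @found = tables_matching( $tree, qr/Price/ );
die "expected exactly one matching table" unless @found == 1;

# Walk the rows and cells of the table that was found.
my @rows;
for my $row ( $found[0]->look_down( _tag => 'tr' ) ) {
    push @rows, [ map { $_->as_text } $row->look_down( _tag => qr/^t[dh]$/ ) ];
}
print join( ' | ', @$_ ), "\n" for @rows;

$tree->delete;    # free the tree when done
```

The scraping logic stays independent of where the table sits on the
page, as long as exactly one table contains the string.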


Ciao,
        Harald
-- 
Harald H.-J. Bongartz <bongie@gmx.net>
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Since computers are no longer operated with punch cards and a purely
textual terminal, appearance has unfortunately often become much more
important than substance.               -- Markus Kohm in TEX-D-L


------------------------------

Date: Sat, 26 Apr 2003 12:13:53 GMT
From: "Tman" <nerdy1@snet.net>
Subject: Re: Regex greediness question
Message-Id: <5Kuqa.1844$ia2.400675965@newssvr10.news.prodigy.com>



>
> Yes, it occurred to me, too, that we might have misunderstood the OP
> in different ways :)
>

OP here.  Thanks to all those who have replied and set me straight on the
regex quantifier stuff.  In the replies here, I have gotten several
suggestions re: HTML parsing in the terms of my original problem, which I
guess I didn't make too clear in the orig post.  If you are interested, I
have posted another thread in this group more to the point of my original
problem in terms of HTML parsing..."Parsing HTML pages... best way out of
the dozens of options?"




------------------------------

Date: 26 Apr 2003 08:11:58 GMT
From: "Tassilo v. Parseval" <tassilo.parseval@rwth-aachen.de>
Subject: Re: Tough question for the guru's; Grep Once, Awk Twice (or more)
Message-Id: <b8df0e$d8o$1@nets3.rz.RWTH-Aachen.DE>

Thus spoke Agrapha:

> "Tassilo v. Parseval" <tassilo.parseval@rwth-aachen.de> wrote in message news:<b86v2g$dbh$1@nets3.rz.RWTH-Aachen.DE>...

>> Most problems require to look at each line of a file only once. 
>> In such a case you should not slurp a whole file into an
>> array. Instead, iterate over the file line-wise. Perl makes that quite
>> easy so it's a common idiom:
>> 
>>     open F, "file" or die $!;
>>     while (<F>) {
>>         # each line now in $_ including terminating newline
>>         ...
>>     }
> 
> I love the elegance of looking at a file only once. This script needs
> to look at three different files. Is it possible to slurp 3 files into
> F?

A filehandle always refers to one file (socket, pipe or whatever). If
you need to run through different files, consider nesting the above
while-loop inside a for-loop:

    for my $file (@files) {
        # three-argument open: characters in $file are never taken as a mode
        open F, '<', $file or die "Cannot open $file: $!";
        while (<F>) { ... }
        close F;
    }

That way, you never have more than one open handle.
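
For a version you can run as-is, here is a minimal sketch of the same nested loop using lexical filehandles. The helper name count_lines and the idea of counting lines are only assumptions for illustration; any per-line processing could go in the inner loop.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): count the lines in each file,
# keeping only one handle open at a time.
sub count_lines {
    my @files = @_;
    my %count;

    for my $file (@files) {
        # Lexical handle plus three-argument open; die names the failing file.
        open my $fh, '<', $file or die "Cannot open $file: $!";

        $count{$file} = 0;
        while (my $line = <$fh>) {
            $count{$file}++;        # per-line work goes here
        }

        close $fh;                  # closed before the next file is opened
    }

    return \%count;
}
```

A call such as count_lines('a.txt', 'b.txt') (file names made up) returns a hash reference mapping each name to its line count.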

Tassilo
-- 
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval


------------------------------

Date: Sat, 26 Apr 2003 09:45:01 GMT
From: Bart Lateur <bart.lateur@pandora.be>
Subject: Re: Tough question for the guru's; Grep Once, Awk Twice (or more)
Message-Id: <i6lkavcq0tird5trk1e2undhghdggpskun@4ax.com>

Agrapha wrote:

>>     open F, "file" or die $!;
>>     while (<F>) {
>>         # each line now in $_ including terminating newline
>>         ...
>>     }
>
>I love the elegance of looking at a file only once. This script needs
>to look at three different files. Is it possible to slurp 3 files into
>F?

If you actually don't care which file your data comes from, you can put
your files into @ARGV (the parameter array) and let the magic of <> do
its work:

     @ARGV = ("file1", "file2", "file3");
     while (<>) {
         # each line now in $_ including terminating newline
         ...
     }
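
If you later find that you do care where a line came from, <> still tells you: $ARGV holds the name of the file currently being read, and closing ARGV at each end-of-file resets the line counter $. for the next file. A minimal sketch, where the helper name tag_lines is made up for this example:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): read all named files through <>
# and tag every line with its source file and per-file line number.
sub tag_lines {
    local @ARGV = @_;               # <> reads the files listed in @ARGV, in order
    my @tagged;

    while (my $line = <>) {
        chomp $line;
        # $ARGV is the current file name, $. the current line number
        push @tagged, join(':', $ARGV, $., $line);
    }
    continue {
        close ARGV if eof;          # reset $. at the end of each file
    }

    return @tagged;
}
```

Note the bare eof without parentheses: it tests the end of the current file, whereas eof() would only be true at the end of the very last file.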

-- 
	Bart.


------------------------------

Date: Sat, 26 Apr 2003 15:21:31 +0000 (UTC)
From: "Peter Wilson" <peter_wilson@mail.com>
Subject: Re: XS or SWIG
Message-Id: <b8e85r$dns$1@titan.btinternet.com>

"Eric Wilhelm" <ericw@nospam.ku.edu> wrote in message
news:pan.2003.04.25.17.23.41.640785.5659@nospam.ku.edu...
> On Fri, 25 Apr 2003 13:07:43 -0500, Peter Wilson wrote:
>
> >Does anyone know of a
> > book / web site / set of examples of how to write XS or SWIG or have any
> > advice on which is best to use. I have a header file (.h) and the
> > library (.dll) and no source files.
<snipped>
> I have used SWIG to write a perl wrapper for just such a toolkit.  It has
> support for "shadow classes" which (I think) would give you read-write
> interface to your variable.
>
> You would probably be okay without writing extra C code, provided that
> you aren't trying to get write-access to arrays (at which point you have
> to get into perlguts and figure out how to translate from one to the
> other).
>
> I too am not a C programmer.  I looked at both systems and SWIG seemed to
> be the one with the shorter learning curve and smaller interface code.  I
> have found a couple of issues with it (not declaring variables, etc), but
> have been able to work around them using short intermediate scripts and
> includes between generating and compiling.
>
> The manual is HUGE and very informative (www.swig.org).
>
> --Eric

Many thanks, Eric. I guess SWIG it is, then.

Peter




------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending Perl questions to the -request address; I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 4897
***************************************

