[22677] in Perl-Users-Digest
Perl-Users Digest, Issue: 4898 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat Apr 26 21:05:41 2003
Date: Sat, 26 Apr 2003 18:05:07 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Sat, 26 Apr 2003 Volume: 10 Number: 4898
Today's topics:
Re: analyising haskell source for <<loop>> dependencies <jkeen@concentric.net>
Authentication with Unix username and password (SlimClity)
Re: Authentication with Unix username and password <tony_curtis32@yahoo.com>
Function to get directory separator char? (D. Alvarado)
Re: Function to get directory separator char? <tony_curtis32@yahoo.com>
help with use lib <mpapec@yahoo.com>
Re: help with use lib <tassilo.parseval@rwth-aachen.de>
Re: How to send and receive on IP PORT? (Walter Roberson)
Re: Is there an array for ($1, $2, $3, ...) <julian@avbrief.com>
Isolating Sentences (Not Lines) With Regex <sskinner@cloud9.net>
Re: Isolating Sentences (Not Lines) With Regex <mpapec@yahoo.com>
Re: Isolating Sentences (Not Lines) With Regex <nobody@dev.null>
Re: Just curious about this- are REGEXes rigorously det (Sara)
Re: Just curous about this- are REGEXes rigorously dete (Walter Roberson)
Re: Just curous about this- are REGEXes rigorously dete <nobody@dev.null>
Re: Just curous about this- are REGEXes rigorously dete <tassilo.parseval@rwth-aachen.de>
Re: Just curous about this- are REGEXes rigorously dete (Sara)
Re: Parsing HTML pages... best way out of the dozens of <greg@hassan.com>
regex for word whitespace word <johngros@bigpond.net.au>
socket buffer flush problem (Christoff Pale)
Re: Tough question for the guru's; Grep Once, Awk Twice <dover@nortelnetworks.com>
Re: What do these files have in common? <goldbb2@earthlink.net>
Re: XS or SWIG <julian@avbrief.com>
Re: {newbie} sorting of files <dha@panix.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: 26 Apr 2003 18:44:13 GMT
From: "James E Keenan" <jkeen@concentric.net>
Subject: Re: analyising haskell source for <<loop>> dependencies.
Message-Id: <b8ek1t$h56@dispatch.concentric.net>
"Jason Smith" <chastel_pelerin@hotmail.com> wrote in message
news:623cd325.0304251647.468d04c7@posting.google.com...
> Hi All, I have little/no Perl experience and was wondering if anyone
> could quickly hash this requirement out for me...
>
> basically given say a haskell source, snippet below.
>
> fvars = TPX.listDiffBy cmpVar nl' pdl
> pdl' = fvars `union` pdl
>
> It can identify where this may occur
>
> pdl' = fvars `union` pdl'
>
> i.e. create a <<loop>> during runtime?
>
> so basically I envision a script that will check either side of a '='
> operator and determine if we are using a variable in an assignment to
> that variable.
>
Since I don't know haskell, I'm guessing at what is proper syntax. But
here's a start:
foreach (<DATA>) {
    print "'$1' was seen on each side of the assignment operator.\n"
        if (/^(.*?)'?\s+=\s+.*\1.*/);
}
__DATA__
fvars = TPX.listDiffBy cmpVar nl' pdl
pdl' = fvars `union` pdl
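A stricter variation on the same idea, as a minimal sketch only. It assumes one simple binding per line of the form "name = expression", and it assumes identifiers are word characters plus trailing primes, so that pdl and pdl' count as different names. The helper name self_reference is made up for illustration, and nothing here parses real Haskell (guards, where clauses and multi-line definitions are ignored).

```perl
use strict;
use warnings;

# Hypothetical helper: returns the left-hand name if that exact name
# also appears on the right-hand side of a one-line binding, else undef.
sub self_reference {
    my ($line) = @_;
    # Assumed identifier shape: letter or underscore, then word
    # characters and primes (so pdl and pdl' are distinct names).
    my ($lhs, $rhs) = $line =~ /^\s*([A-Za-z_][\w']*)\s*=\s*(.*)$/
        or return undef;
    # The lookarounds stop pdl from matching inside pdl' or xpdl.
    return $rhs =~ /(?<![\w'])\Q$lhs\E(?![\w'])/ ? $lhs : undef;
}

my @sample = (
    q{fvars = TPX.listDiffBy cmpVar nl' pdl},
    q{pdl' = fvars `union` pdl},
    q{pdl' = fvars `union` pdl'},
);

for my $line (@sample) {
    my $name = self_reference($line);
    print "'$name' is used in its own definition.\n" if defined $name;
}
```

Only the third sample line is reported, because only there does pdl' appear on both sides.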
------------------------------
Date: 26 Apr 2003 15:18:45 -0700
From: slimclity@hotmail.com (SlimClity)
Subject: Authentication with Unix username and password
Message-Id: <3c209c5f.0304261418.3771219c@posting.google.com>
Is it possible to use the BSD username and password as authentication
method?
I've tried to use the encrypt command and verify this with
/etc/master.passwd but the encrypted string changes while using the
same password.
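A likely reason the string changes is that each hashing run picks a fresh salt, so two hashes of the same password are not expected to be equal. The usual check is to hand the stored hash itself to Perl's built-in crypt as the salt argument and compare the result with the stored hash. The following is a sketch under assumptions: check_password is a made-up name, and the stored hash is assumed to come from getpwnam or /etc/master.passwd, which normally requires root privileges.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $plaintext hashes to $stored_hash.
sub check_password {
    my ($plaintext, $stored_hash) = @_;
    return 0 unless defined $stored_hash && length $stored_hash;
    # crypt() reads the algorithm and salt from the front of $stored_hash,
    # so the same password reproduces the same string.
    my $computed = crypt($plaintext, $stored_hash);
    return (defined $computed && $computed eq $stored_hash) ? 1 : 0;
}

# Usage sketch (needs enough privilege to read the hashed password):
# my $stored = (getpwnam($user))[1];
# print check_password($typed, $stored) ? "ok\n" : "denied\n";
```

Which hash formats crypt understands depends on the platform's C library, so this only works for accounts whose hashes the local crypt supports.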
------------------------------
Date: Sat, 26 Apr 2003 17:24:16 -0500
From: Tony Curtis <tony_curtis32@yahoo.com>
Subject: Re: Authentication with Unix username and password
Message-Id: <87r87o7v8v.fsf@limey.hpcc.uh.edu>
>> On 26 Apr 2003 15:18:45 -0700,
>> slimclity@hotmail.com (SlimClity) said:
> Is it possible to use the BSD username and password as
> authentication method?
to authenticate for what?
hth
t
------------------------------
Date: 26 Apr 2003 12:24:19 -0700
From: laredotornado@zipmail.com (D. Alvarado)
Subject: Function to get directory separator char?
Message-Id: <9fe1f2ad.0304261124.2568df7@posting.google.com>
Hi,
Does anyone have a quick one-liner for getting the directory
separator character? Thanks - Dave
------------------------------
Date: Sat, 26 Apr 2003 14:25:44 -0500
From: Tony Curtis <tony_curtis32@yahoo.com>
Subject: Re: Function to get directory separator char?
Message-Id: <873ck583if.fsf@limey.hpcc.uh.edu>
>> On 26 Apr 2003 12:24:19 -0700,
>> laredotornado@zipmail.com (D. Alvarado) said:
> Hi, Does anyone have a quick one-liner for getting the
> directory separator character?
perldoc Config
path_sep
hth
t
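Note that path_sep in Config is the separator between entries of a PATH-style list (":" on Unix), which is not the character between directory components. One portable route, as a sketch, is the core File::Spec module. The helper name dir_separator is made up here, and the trick assumes a platform where paths are plain strings joined by a separator.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: derive the separator from what catdir() produces.
sub dir_separator {
    my $joined = File::Spec->catdir('a', 'b');   # "a/b" on Unix
    my ($sep) = $joined =~ /^a(.+)b$/;
    return $sep;
}

print "separator: ", dir_separator(), "\n";

# Often the separator is not needed at all; let File::Spec build the path.
print File::Spec->catfile('lib', 'CGI', 'Session.pm'), "\n";
```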
------------------------------
Date: Sat, 26 Apr 2003 22:59:08 +0200
From: Matija Papec <mpapec@yahoo.com>
Subject: help with use lib
Message-Id: <ferlavko5q1sa567pf6cemk560kmjv7f9o@4ax.com>
I did perl Makefile.PL with PREFIX and LIB, make && make install, everything
went ok but when invoking a script there is an error:
Can't locate object method "new" via package "CGI::Session" at test.pl line
4
and I don't know where to look anymore..
===============
use lib qw'/home/mpapec/web/lib';
my $session = CGI::Session->new("driver:File", undef, {Directory=>'./'});
===============
[mpapec@localhost lib]$ pwd
/home/mpapec/web/lib
[mpapec@localhost lib]$ du -h
4.0k ./i386-linux/auto/CGI/Session
5.0k ./i386-linux/auto/CGI
6.0k ./i386-linux/auto
8.0k ./i386-linux
161k ./man/man3
162k ./man
27k ./auto/CGI/Session
28k ./auto/CGI
29k ./auto
10k ./CGI/Session/Serialize
9.0k ./CGI/Session/ID
104k ./CGI/Session
138k ./CGI
338k .
--
Matija
------------------------------
Date: 26 Apr 2003 21:36:40 GMT
From: "Tassilo v. Parseval" <tassilo.parseval@rwth-aachen.de>
Subject: Re: help with use lib
Message-Id: <b8eu58$qtd$1@nets3.rz.RWTH-Aachen.DE>
Also sprach Matija Papec:
> I did perl Makefile.PL with PREFIX and LIB, make && make install, everything
> went ok but when invoking a script there is an error:
>
> Can't locate object method "new" via package "CGI::Session" at test.pl line
> 4
>
> and I don't know where to look anymore..
>
>
>===============
> use lib qw'/home/mpapec/web/lib';
> my $session = CGI::Session->new("driver:File", undef, {Directory=>'./'});
You have to use() the module as well:
use lib '...';
use CGI::Session;
my $session = CGI::Session...;
The error-message you got actually states this quite clearly
... (perhaps you forgot to load "CGI::Session"?) at ...
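The effect can be reproduced without CGI::Session. The sketch below uses the core module Math::BigInt purely as a stand-in, and try_new is a made-up helper: calling a method on a package that has not been loaded fails, and the same call succeeds once the module is loaded.

```perl
use strict;
use warnings;

# Call CLASS->new(ARGS) and return the error text, or '' on success.
sub try_new {
    my ($class, @args) = @_;
    my $ok = eval { $class->new(@args); 1 };
    return $ok ? '' : $@;
}

my $before = try_new('Math::BigInt', 42);   # module not loaded yet
require Math::BigInt;                       # the loading step that "use" performs
my $after  = try_new('Math::BigInt', 42);

print "before: $before";
print "after:  ", ($after eq '' ? "ok\n" : $after);
```

"use lib" only extends the search path in @INC; it is the separate "use Module" (or require) that actually loads the code.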
Tassilo
--
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval
------------------------------
Date: 26 Apr 2003 18:13:12 GMT
From: roberson@ibd.nrc-cnrc.gc.ca (Walter Roberson)
Subject: Re: How to send and receive on IP PORT?
Message-Id: <b8ei7o$ipn$1@canopus.cc.umanitoba.ca>
In article <b8ca7c$gnj$1@canopus.cc.umanitoba.ca>,
Walter Roberson <roberson@ibd.nrc-cnrc.gc.ca> wrote:
|In article <qKhqa.379953$Zo.88046@sccrnsc03>,
|Brad Walton <sammie@greatergreen.com> wrote:
|:I am looking for information on how to send information and receive (listen)
|:for information on a port. For example, I want to have a perl program
|:running on one PC, while another sits on a remote machine and listens for
|:incoming data on a specified port.
|I'm not overly familiar with PCs, but you can probably treat the
|IR port as a serial port with a slightly different device name.
Ah, sorry, it appears that I misread the original posting. I
read "IR PORT" (Infrared Port) instead of "IP PORT" (Internet Protocol).
Infrared ports are serial ports, with there being common libraries
available that allow you to use them as limited IP devices. But
that turns out not to be what you were asking about.
--
When your posts are all alone / and a user's on the phone/
there's one place to check -- / Upstream!
When you're in a hurry / and propagation is a worry/
there's a place you can post -- / Upstream!
------------------------------
Date: Sat, 26 Apr 2003 21:52:40 +0100
From: "Julian Scarfe" <julian@avbrief.com>
Subject: Re: Is there an array for ($1, $2, $3, ...)
Message-Id: <lkCqa.1857$ZS4.56577@newsfep4-glfd.server.ntli.net>
"Malcolm Dew-Jones" <yf110@vtn1.victoria.tc.ca> wrote in message
news:3ea9d104@news.victoria.tc.ca...
> *but* be careful of the gotcha of assigning a list in conjunction with
> looping using /g because that is just a little too clever
>
> while ( my($one,$two) = m/(some).*?(thing)/g ) # note /g
> { # doesn't work
> }
>
> you need to use
>
> while ( m/(some).*?(thing)/g ) # note /g
> { my($one,$two) = ($1,$2);
> }
I find omitting the parentheses a nice gotcha too, particularly for a single
match:
$a = "1 and 2 and 3" =~ /and\s(\d)\sand/;
# $a is 1: in scalar context the match returns true, not the captured digit
when I really meant:
($a) = "1 and 2 and 3" =~ /and\s(\d)\sand/;
# $a is 2, the matched digit.
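The context rules discussed in this thread can be put side by side in a short sketch. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $text = "1 and 2 and 3";

my $flag    = $text =~ /and\s(\d)\sand/;   # scalar context: true (1) on success
my ($digit) = $text =~ /and\s(\d)\sand/;   # list context: the captured text
my $count   = () = $text =~ /\d/g;         # number of matches, via list assignment

# With /g in scalar context, each iteration resumes after the previous
# match, so $1 and $2 can be copied inside the loop body.
my @pairs;
while ($text =~ /(\d) and (\d)/g) {
    push @pairs, "$1-$2";
}

print "flag=$flag digit=$digit count=$count pairs=@pairs\n";
```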
Julian Scarfe
------------------------------
Date: Sat, 26 Apr 2003 21:04:25 GMT
From: Scott Edward Skinner <sskinner@cloud9.net>
Subject: Isolating Sentences (Not Lines) With Regex
Message-Id: <260420031704294767%sskinner@cloud9.net>
Although I'm doing this project in Java, I need help from regex gurus,
and that means Perl users...
Given a text file of literature, I'm trying to isolate individual
sentences. What is a sentence? Well, I'm not sure, but I know it when I
see it.
A sentence can begin with quotation marks, as in:
"What is a sentence?" I pondered.
But a sentence may not contain quotes, as in:
What is a sentence?--I pondered.
A sentence may simply contain quotes, as in:
What is a "sentence"?--I pondered.
The period is no help, either. Consider:
What is a sentence?!
Periods can also appear within a sentence, as in:
Mr. Sentence, I ponder what you are...
Capital letters can also appear with a sentence, as in:
I wonder, Mr. Sentence, if you know yourself.
Given the following paragraph, then, what regex will isolate the
individual sentences?
"What is a sentence?" I pondered. What is a sentence?--I pondered. What
is a "sentence"?--I pondered. What is a sentence?! Mr. Sentence, I
ponder what you are... I wonder, Mr. Sentence, if you know yourself.
Yep, I'm stumped. Any ideas?
-S
------------------------------
Date: Sat, 26 Apr 2003 23:14:42 +0200
From: Matija Papec <mpapec@yahoo.com>
Subject: Re: Isolating Sentences (Not Lines) With Regex
Message-Id: <uatlav80gaodlstj3ujco47jlrvkhqimm7@4ax.com>
X-Ftn-To: Scott Edward Skinner
Scott Edward Skinner <sskinner@cloud9.net> wrote:
>Given the following paragraph, then, what regex will isolate the
>individual sentences?
>
>"What is a sentence?" I pondered. What is a sentence?--I pondered. What
>is a "sentence"?--I pondered. What is a sentence?! Mr. Sentence, I
>ponder what you are... I wonder, Mr. Sentence, if you know yourself.
>
>Yep, I'm stumped. Any ideas?
For a start, you could say that a sentence begins with \w and ends with
[.!?]+, but I guess there are cases which don't obey this rule. Hm, I just
wrote such a sentence.
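As a baseline, that rule can be sketched as follows, with one addition that is not in the post: a small, assumed list of abbreviations (Mr, Mrs, Ms, Dr) whose trailing period does not end a sentence. The name split_sentences is made up. It is deliberately naive: for instance, it still treats a quoted question such as "What is a sentence?" as a sentence of its own, even when the narration continues after the closing quote.

```perl
use strict;
use warnings;

# Assumed abbreviation list; extend as needed.
my %abbrev = map { $_ => 1 } qw(Mr Mrs Ms Dr);

# Hypothetical helper: split text into sentences at [.!?]+ followed by
# whitespace or end of text, except after a known abbreviation.
sub split_sentences {
    my ($text) = @_;
    my (@sentences, $current);
    $current = '';
    for my $chunk ($text =~ /\S+\s*/g) {        # one word plus trailing space
        $current .= $chunk;
        next unless $chunk =~ /[.!?]+["']?\s*$/; # does this word end a sentence?
        my ($word) = $chunk =~ /^(\w+)\.\s*$/;
        next if defined $word && $abbrev{$word}; # "Mr." and friends do not
        $current =~ s/\s+$//;
        push @sentences, $current;
        $current = '';
    }
    $current =~ s/\s+$//;
    push @sentences, $current if length $current;
    return @sentences;
}

print "$_\n" for split_sentences(
    "What is a sentence?! Mr. Sentence, I ponder what you are... "
  . "I wonder, Mr. Sentence, if you know yourself."
);
```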
--
Matija
------------------------------
Date: Sat, 26 Apr 2003 21:43:41 GMT
From: Andras Malatinszky <nobody@dev.null>
Subject: Re: Isolating Sentences (Not Lines) With Regex
Message-Id: <3EAAFD58.7030601@dev.null>
Scott Edward Skinner wrote:
> Although I'm doing this project in Java, I need help from regex gurus,
> and that means Perl users...
>
> Given a text file of literature, I'm trying to isolate individual
> sentences. What is a sentence? Well, I'm not sure, but I know it when I
> see it.
>
There was a pretty good discussion about this in 2001. Take a look at
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&threadm=3AF4066B.4CD64D06%40mortgagestats.com&rnum=3&prev=/groups%3Fq%3Dsentence%2Bgroup:comp.lang.perl.misc%2Bgroup:comp.lang.perl.misc%26hl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3DUTF-8%26group%3Dcomp.lang.perl.misc%26selm%3D3AF4066B.4CD64D06%2540mortgagestats.com%26rnum%3D3
with particular attention to Abigail's suggestion about TeX's
sentence-recognition algorithm and Damian Conway's suggestion about
trying Text::Autoformat.
------------------------------
Date: 26 Apr 2003 17:48:49 -0700
From: genericax@hotmail.com (Sara)
Subject: Re: Just curious about this- are REGEXes rigorously deterministic?
Message-Id: <776e0325.0304261648.dc1e6e7@posting.google.com>
"Tassilo v. Parseval" <tassilo.parseval@rwth-aachen.de> wrote in message news:<b8euch$r0v$1@nets3.rz.RWTH-Aachen.DE>...
> Also sprach Andras Malatinszky:
>
> > Sara wrote:
> >
> >> OK, probably just the "scientist" in me, but with such an enormously
> >> large set of possibilities, I wonder if regex results
> >> deterministically map into a 1:1 into set of correct solutions (1
> >> input, 1 regex = 1 result?).
> >>
> >> I know *I* sure couldn't prove it.
> >
> >
> > I should hope not. Try running this a couple of times:
> >
> > my $number=666;
> > $number=~s/6/int(rand(10))/ge;
> > print $number;
>
> The regex part is still deterministic here. Remember that in s/// only
> the left part is a regex. And the pattern is just '6' here. The
> replacement part is a string (optionally taken to be evaluated).
>
> Perl regexes get indeterministic once you use some of the more advanced
> features. "(??{ code })" or "(?(condition)yes-pattern|no-pattern)" come
> to mind. Naturally, in such a case no one would expect them to be
> deterministic either.
>
> Tassilo
Tassilo, I guess you mean that given an input string, and a regex,
there are other EXTERNAL states that affect the outcome, such as in
your example, (condition)? Yes I can see that. OK. Good point.
Let's say only the input string and the engine itself can affect the
outcome. I realize that's a subset of the domain of regexes, but let's
look at that since you make a good point that the domain of
*everything outside the regex* is like saying "is Perl
deterministic?". Perhaps an interesting question as well but not what
I'm after.
Are we all using the *THEORY of regexes* or the *LAW of regexes*?
Now in my little mind, I think of it THIS WAY. The regex ENGINE is a
state-machine, like a Turing machine, but ever-so-much more complex.
Each time the crank turns the state changes. Can it be proven that (1)
each turn of the crank in state A inalterably produces state B, and
(2) each implementation begins in the same state? I guess if those
proposals could be proven then they are a LAW. Otherwise not.
You gentlemen already came up with some very interesting ways that (1)
and (2) are not satisfied, and I'm inclined to think that we're using
the *theory of regexes* in our daily work. Surely a model that will
satisfy 99.999% and more of our challenges we throw at it, but much
like the monkey typing, how long does he have to type before he writes
War & Peace? How many inputs do we have to throw at a simple regex to
break it?
I suppose in a sense this is all moot since as any seasoned programmer
knows, even arithmetic operations are at best approximations. How many
of us have been bit by the machine thinking 1 + 2 = 2.99999999999? I
know I have.
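For what it is worth, small integers are represented exactly in floating point, so 1 + 2 itself is not at risk; the surprises come from decimal fractions that have no exact binary form. A minimal sketch, assuming a Perl built with ordinary double-precision numbers:

```perl
use strict;
use warnings;

my $sum = 0.1 + 0.2;

my $ints_exact = (1 + 2 == 3)            ? 1 : 0;   # integers: exact
my $frac_exact = ($sum == 0.3)           ? 1 : 0;   # usually 0 with doubles
my $frac_close = abs($sum - 0.3) < 1e-9  ? 1 : 0;   # compare with a tolerance

printf "1+2==3: %d  0.1+0.2==0.3: %d  within 1e-9: %d\n",
    $ints_exact, $frac_exact, $frac_close;
printf "0.1 + 0.2 = %.17g\n", $sum;
```

The tolerance 1e-9 is an arbitrary choice for illustration.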
Anyhow- guess I just felt like digging a little bit past the practical
aspects of regexes and into the theory side. As always, some
interesting views pop up immediately here.
Did anyone else ever notice how much better the discussions are on
CLPM than other groups :)
Cheers,
Gx
------------------------------
Date: 26 Apr 2003 18:24:52 GMT
From: roberson@ibd.nrc-cnrc.gc.ca (Walter Roberson)
Subject: Re: Just curous about this- are REGEXes rigorously deterministic
Message-Id: <b8eitk$j2d$1@canopus.cc.umanitoba.ca>
In article <3eaa6542$1@news.swissonline.ch>,
Claudio Nieder <private@claudio.ch> wrote:
:That several regex can have the same result is clear. The question is,
:if a given regex has always the same result, when presented with a
:certain input. Could it be that two implementations of the perl regular
:expression specification, though both fully conform to the specification
:of perlre(1) and have no bugs, produce a different result for the same
:regex and the same input to match or substitute?
In the case of perlre (which are NOT the same as classical
Regular Expressions), the answer is Yes, two different implementations
could produce different results that both conform to specifications.
The reason this can happen for perlre is that perlre can include
function calls, and Perl does not specify the order of function
call processing if one has multiple function calls in the same
statement. As I recall, Perl also does not completely specify the
order of operations of argument evaluation.
Classical Regular Expressions (which only have alternation,
concatenation, and indefinite repetition) *are* deterministic.
perlre support matching a lot of patterns that cannot be matched
with classical Regular Expressions. Some of those extensions
preserve determinism and some of them don't.
--
Will you ask your master if he wants to join my court at Camelot?!
------------------------------
Date: Sat, 26 Apr 2003 21:29:10 GMT
From: Andras Malatinszky <nobody@dev.null>
Subject: Re: Just curous about this- are REGEXes rigorously deterministic
Message-Id: <3EAAF9F1.2030601@dev.null>
Sara wrote:
> OK, probably just the "scientist" in me, but with such an enormously
> large set of possibilities, I wonder if regex results
> deterministically map into a 1:1 into set of correct solutions (1
> input, 1 regex = 1 result?).
>
> I know *I* sure couldn't prove it.
I should hope not. Try running this a couple of times:
my $number=666;
$number=~s/6/int(rand(10))/ge;
print $number;
------------------------------
Date: 26 Apr 2003 21:40:33 GMT
From: "Tassilo v. Parseval" <tassilo.parseval@rwth-aachen.de>
Subject: Re: Just curous about this- are REGEXes rigorously deterministic
Message-Id: <b8euch$r0v$1@nets3.rz.RWTH-Aachen.DE>
Also sprach Andras Malatinszky:
> Sara wrote:
>
>> OK, probably just the "scientist" in me, but with such an enormously
>> large set of possibilities, I wonder if regex results
>> deterministically map into a 1:1 into set of correct solutions (1
>> input, 1 regex = 1 result?).
>>
>> I know *I* sure couldn't prove it.
>
>
> I should hope not. Try running this a couple of times:
>
> my $number=666;
> $number=~s/6/int(rand(10))/ge;
> print $number;
The regex part is still deterministic here. Remember that in s/// only
the left part is a regex. And the pattern is just '6' here. The
replacement part is a string (optionally taken to be evaluated).
Perl regexes get indeterministic once you use some of the more advanced
features. "(??{ code })" or "(?(condition)yes-pattern|no-pattern)" come
to mind. Naturally, in such a case no one would expect them to be
deterministic either.
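A small sketch of the "(??{ code })" case: the same compiled pattern and the same input string give different answers once state outside the pattern changes. The variable $width and the pattern are invented for illustration.

```perl
use strict;
use warnings;

our $width = 2;                          # state that lives outside the pattern

# The code block runs at match time and returns a sub-pattern,
# here "b{2}" or "b{3}" depending on the current $width.
my $re = qr/^a(??{ "b{$width}" })$/;

my $first  = "abb" =~ $re ? 1 : 0;       # $width is 2
$width = 3;
my $second = "abb" =~ $re ? 1 : 0;       # same regex, same input, $width is 3

print "first=$first second=$second\n";
```

Whether one calls this non-deterministic or simply "a function of more inputs than the string" is the question the thread is circling.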
Tassilo
--
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval
------------------------------
Date: 26 Apr 2003 17:25:34 -0700
From: genericax@hotmail.com (Sara)
Subject: Re: Just curous about this- are REGEXes rigorously deterministic
Message-Id: <776e0325.0304261625.6ab4ddfd@posting.google.com>
Andras Malatinszky <nobody@dev.null> wrote in message news:<3EAAF9F1.2030601@dev.null>...
> Sara wrote:
>
> > OK, probably just the "scientist" in me, but with such an enormously
> > large set of possibilities, I wonder if regex results
> > deterministically map into a 1:1 into set of correct solutions (1
> > input, 1 regex = 1 result?).
> >
> > I know *I* sure couldn't prove it.
>
>
> I should hope not. Try running this a couple of times:
>
> my $number=666;
> $number=~s/6/int(rand(10))/ge;
> print $number;
666? Andras you are *bad*!
OK I hadn't thought of that. Actually, however, given the same
pseudo-random number generator in the same state, isn't that in fact
still deterministic? I mean in this case, the PRN-generator and state
are sort of *inputs* aren't they?
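That point can be checked directly, as a sketch: seeding the generator with srand before the substitution makes the result repeatable. The function name scramble and the seed values are arbitrary.

```perl
use strict;
use warnings;

# Run the substitution from the thread with a fixed generator state.
sub scramble {
    my ($seed) = @_;
    srand($seed);                        # put the PRNG into a known state
    my $number = 666;
    $number =~ s/6/int(rand(10))/ge;     # each 6 becomes a pseudo-random digit
    return $number;
}

my $once  = scramble(42);
my $again = scramble(42);
print "same seed:  $once, $again\n";     # the two values are equal
print "other seed: ", scramble(7), "\n";
```

The actual digits depend on the Perl version and platform, which is why none are shown here.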
Very interesting proposal however!
-Gx
------------------------------
Date: Sat, 26 Apr 2003 18:50:06 GMT
From: Greg <greg@hassan.com>
Subject: Re: Parsing HTML pages... best way out of the dozens of options?
Message-Id: <3EAAD4EA.6010701@hassan.com>
Tman wrote:
> I have an HTML page with a number of tables on it, and I need to locate a
> certain table (by looking through all the tables until I find one that
> matches a certain regex), and then iterate through the rows and columns of
> that table.
>
Seems like a simple function to split the page on table and
/table will find your table. Then you can split on </td> and
</tr> and return the data in an array if you like. Not sure
why you need a library. Instead of writing such a long
post, you could have created said function with much
fewer lines.
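A rough sketch of what such a function might look like, using regexes rather than split. The name find_table_rows and the sample HTML are made up. This is a naive approach: it breaks on nested tables, on missing closing tags, and on markup inside comments, which is where a real parser (the HTML::Parser family, or HTML::TableExtract) earns its keep.

```perl
use strict;
use warnings;

# Hypothetical helper: return the rows (array of arrays of cell text) of
# the first table whose content matches $pattern, or nothing if none does.
sub find_table_rows {
    my ($html, $pattern) = @_;
    for my $table ($html =~ m{<table\b[^>]*>(.*?)</table>}gis) {
        next unless $table =~ $pattern;
        my @rows;
        for my $row ($table =~ m{<tr\b[^>]*>(.*?)</tr>}gis) {
            my @cells = $row =~ m{<t[dh]\b[^>]*>(.*?)</t[dh]>}gis;
            push @rows, \@cells;
        }
        return \@rows;
    }
    return;
}

my $html = join '',
    '<table><tr><td>skip me</td></tr></table>',
    '<table><tr><th>Name</th><th>Qty</th></tr>',
    '<tr><td>apple</td><td>3</td></tr></table>';

my $rows = find_table_rows($html, qr/Qty/);
print join(' | ', @$_), "\n" for @{ $rows || [] };
```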
-Greg
http://www.supercgis.com/
------------------------------
Date: Sun, 27 Apr 2003 00:46:11 GMT
From: "John Gros" <johngros@bigpond.net.au>
Subject: regex for word whitespace word
Message-Id: <nLFqa.1860$lD4.12990@news-server.bigpond.net.au>
I have been trying to get a regex to pick up both single words and two word
descriptions of an a href tag but it refuses to pickup two word
descriptions. I have been going over perlre looking for ways to do this and
have tried many variations. I often believe I have a regex that logically
should work yet while $1 is set $2 is not. $1 being the link and $2 being
the description, it does get set for a single word description but not set
for two word descriptions.
Regexs I have tried.
/([A-Z]{2}.HTM)(>.*?<)/;
/([A-Z]{2}.HTM)>(.*?)</;
/([A-Z]{2}.HTM)>(\w*\s\w*)</;
/([A-Z]{2}.HTM)>(\w*\s?\w{0,20})</;
/([A-Z]{2}.HTM)>(\w*.?\w{0,20})</;
/([A-Z]{2}.HTM)>(\w*.+\w+)</;
/([A-Z]{2}.HTM)>(\w*.+)</;
/([A-Z]{2}.HTM)>(\w*.?\w+)</;
I have run out of ideas.
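Without the actual HTML it is hard to say why $2 stays empty. Two things in the patterns above are worth noting, though: the dot before HTM is unescaped, and a description matched with \w and \s cannot contain punctuation or entities, while "." does not match a newline unless /s is used. A sketch with a negated character class, run against made-up sample links (the function name and the input are assumptions, not the poster's data):

```perl
use strict;
use warnings;

# Hypothetical helper: collect [link, description] pairs.
sub links_and_text {
    my ($html) = @_;
    my @found;
    # [^<]* takes everything up to the next tag, including spaces,
    # punctuation and line breaks; "? allows a quoted href value.
    while ($html =~ /([A-Z]{2}\.HTM)"?\s*>([^<]*)</g) {
        my ($link, $text) = ($1, $2);
        $text =~ s/\s+/ /g;              # collapse runs of whitespace
        $text =~ s/^ | $//g;             # trim the ends
        push @found, [$link, $text];
    }
    return @found;
}

my $sample = qq{<a href=AB.HTM>Home</a>\n<a href="CD.HTM">Contact\nUs</a>};
print "$_->[0] => $_->[1]\n" for links_and_text($sample);
```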
------------------------------
Date: 26 Apr 2003 15:34:08 -0700
From: christoff_pale@yahoo.com (Christoff Pale)
Subject: socket buffer flush problem
Message-Id: <73718d8a.0304261434.567a35d@posting.google.com>
Hi,
i am having a problem with getting response from the server and
vice versa , the code is below.
Output should be:
1) for server:
tao $ ./server_eg1.pl
SERVER started on port 7890
2) for client
tao $ ./client_eq1.pl
Smile from the server
but the client output I get is nothing. Can someone please help
me figure this out. Thanks.
p.s. note that I have set $|=1 as well.
client:
=======
use strict;
use Socket;
$|=1;
# initialize host and port
my $host = shift || 'localhost';
my $port = shift || 7890;
my $proto = getprotobyname('tcp');
# get the port address
my $iaddr = inet_aton($host);
my $paddr = sockaddr_in($port, $iaddr);
# create the socket, connect to the port
socket(SOCKET, PF_INET, SOCK_STREAM, $proto)
    or die "socket: $!";
connect(SOCKET, $paddr) or die "connect: $!";
my $line;
while ($line = <SOCKET>) {
    print $line;
}
close SOCKET or die "close: $!";
server:
=======
use strict;
use Socket;
$|=1;
# use port 7890 as default
my $port = shift || 7890;
my $proto = getprotobyname('tcp');
# create a socket, make it reusable
socket(SERVER, PF_INET, SOCK_STREAM, $proto) or die "socket: $!";
setsockopt(SERVER, SOL_SOCKET, SO_REUSEADDR, 1) or die "setsock: $!";
# grab a port on this machine
my $paddr = sockaddr_in($port, INADDR_ANY);
# bind to a port, then listen
bind(SERVER, $paddr) or die "bind: $!";
listen(SERVER, SOMAXCONN) or die "listen: $!";
print "SERVER started on port $port\n";
# accepting a connection
my $client_addr;
while ($client_addr = accept(CLIENT, SERVER)) {
# find out who connected
my ($client_port, $client_ip) = sockaddr_in($client_addr);
my $client_ipnum = inet_ntoa($client_ip);
my $client_host = gethostbyaddr($client_ip, AF_INET);
print "got a connection from: $client_host","[$client_ipnum]\n";
# print who has connected
# send them a message, close connection
#print CLIENT "Smile from the server";
print CLIENT "Smile from the server";
close CLIENT;
}
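One detail about the $|=1 lines: $| only affects the currently selected output handle, normally STDOUT, so it does nothing for SOCKET or CLIENT. In the code above the close on CLIENT flushes the pending data anyway, so buffering may well not be the cause here. For reference, a minimal sketch of per-handle autoflush, using socketpair so that both ends live in one process; AF_UNIX assumes a Unix-like system, and the handle names are invented.

```perl
use strict;
use warnings;
use Socket;
use IO::Handle;

# Two connected sockets in a single process, standing in for the
# server and client ends of a TCP connection.
socketpair(my $server_end, my $client_end, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

# Unbuffer the socket handle itself; $| = 1 alone would not do this.
$server_end->autoflush(1);

# A trailing newline lets the reader's <...> return without waiting for EOF.
print {$server_end} "Smile from the server\n";

my $line = <$client_end>;
print $line;

close $server_end or die "close: $!";
close $client_end or die "close: $!";
```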
------------------------------
Date: Sat, 26 Apr 2003 13:40:31 -0500
From: "Bob Dover" <dover@nortelnetworks.com>
Subject: Re: Tough question for the guru's; Grep Once, Awk Twice (or more)
Message-Id: <b8ejpo$ogq$1@zcars0v6.ca.nortel.com>
"Agrapha" wrote...
> Agrapha wrote...
> > "Tassilo v. Parseval" wrote...
> > > Could we please put an end to all this noise? I think it has been
> > Agreed. A few less flame throwers will be much appreciated. I will do
>
> for those who have forgotten what my original breach of etiquette was...
So much for putting an end to the noise. 8^(
-BD
------------------------------
Date: Sat, 26 Apr 2003 16:37:43 -0400
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: What do these files have in common?
Message-Id: <3EAAEE17.79783F04@earthlink.net>
Malcolm Dew-Jones wrote:
[snip]
> obviously untested, make all the "3 or more" word phrases for both files,
> put in a hash, and compare.
>
> my @words1 = split into words $file1;
>
> my %phrases1 = ();
>
> for (my $i=0; $i<@words1-3; $i++)
> {
> for (my $j=$i+3; $j<@words1; $j++)
> { my $phrase = join ' ', @words1[$i..$j];
> $phrases1{ $phrase }= $phrase;
> }
> }
>
> # now do the same for the 2nd file, then compare the keys of the
> # two hashes
>
> my @words2 = split into words $file2;
> my %phrases2 = ();
> for (my $i=0; $i<@words2-3; $i++)
> { for (my $j=$i+3; $j<@words2; $j++)
> { my $phrase = join ' ', @words2[$i..$j];
> $phrases2{ $phrase }= $phrase;
> }
> }
>
> # compare
> foreach my $phrase1 (keys %phrases1)
> {
> if ( $phrases2{$phrase1} )
> {
> print "$phrase1 is in both files\n";
> }
> }
The problem with your code is that it doesn't scale -- you're going to
build one hash for every file, and the OP implies that he's got at least
three files. Just because the OP *says* that resources aren't a problem,
is no reason to waste memory like that.
Also, each file needs to be examined line-by-line, since phrases cannot
extend across lines.
And, you've got an off-by-one error -- you've got $i+3, when it should
be $i+2. (Your code can't find phrases shorter than 4 words).
I believe that the OP's problem is doable using only one hash.
my $last_pass;
# @files holds the names of the files to compare.
for my $file ( @files ) {
    open( my($fh), "<", $file )
        or die "Couldn't open $file: $!";
    my %this_pass;
    while( defined( my $line = <$fh> ) ) {
        # Placeholder for a filter on the lines of interest:
        # next unless $line matches some criteria;
        $line = lc $line;
        my @words = $line =~ /\w+/g;
        for my $i ( 0 .. $#words - 2 ) {
            for my $j ( $i + 2 .. $#words ) {
                my $phrase = "@words[$i..$j]";
                # if this is the first pass (!$last_pass),
                # or if the phrase was found on the last pass,
                # then we keep it.
                $this_pass{$phrase} = 1
                    if !$last_pass or $last_pass->{$phrase};
            }
        }
    }
    $last_pass = \%this_pass;
    last if not %this_pass;
}
# Remove those phrases which are sub-phrases of
# phrases which are common. That is, if we found
# "The quick brown fox", then we don't want to see
# either "The quick brown" or "quick brown fox".
for( keys %$last_pass ) {
    # Only phrases of four or more words (three or more spaces)
    # have sub-phrases that are themselves three words or longer.
    next unless tr/ // >= 3 and exists $last_pass->{$_};
    my @words = split;
    # Remove sub-phrases of this phrase.
    for my $i ( 0 .. $#words - 2 ) {
        for my $j ( $i + 2 .. $#words ) {
            delete $last_pass->{"@words[$i..$j]"};
        }
    }
    # Since that removed this phrase itself,
    # we afterwards have to add it back in.
    $last_pass->{$_} = 1;
}
# Now, print them out.
for my $phrase ( keys %$last_pass ) {
    print ucfirst( $phrase ), "\n";
}
__END__
[untested]
Ok, there're actually two hashes being used here -- but that's *only*
two, never more than that, no matter how many files are examined.
I *could* do it with one hash, but there would need to be an extra pass
through the hash.
--
$a=24;split//,240513;s/\B/ => /for@@=qw(ac ab bc ba cb ca
);{push(@b,$a),($a-=6)^=1 for 2..$a/6x--$|;print "$@[$a%6
]\n";((6<=($a-=6))?$a+=$_[$a%6]-$a%6:($a=pop @b))&&redo;}
------------------------------
Date: Sat, 26 Apr 2003 21:40:59 +0100
From: "Julian Scarfe" <julian@avbrief.com>
Subject: Re: XS or SWIG
Message-Id: <o9Cqa.1852$ZS4.54562@newsfep4-glfd.server.ntli.net>
"Peter Wilson" <peter_wilson@mail.com> wrote in message
news:b8bthf$oog$1@sparta.btinternet.com...
> I'm trying to call a dll with lots of different functions in it (some of
> them return data structures (records of data I want to read)). Up until now
> I have been avoiding all my issues using Win32::API but it would appear that
> I am now forced into needing to use a more complex route to get at the
> returned data structures (or at least I think I do). From what I read I
> think that SWIG or XS are needed. Does anyone know of a book / web site /
> set of examples of how to write XS or SWIG or have any advice on which is
> best to use. I have a header file (.h) and the library (.dll) and no source
> files.
As Eric suggests, SWIG is what you need.
Good documentation at
http://www.swig.org/
and I particularly recommend
http://www.swig.org/papers/Perl98/swigperl.htm
Julian Scarfe
------------------------------
Date: Sat, 26 Apr 2003 22:07:41 +0000 (UTC)
From: "David H. Adler" <dha@panix.com>
Subject: Re: {newbie} sorting of files
Message-Id: <slrnbam0pd.d8p.dha@panix2.panix.com>
In article <slrnbaiutq.sj.galenmenzel@localhost.localdomain>, Galen
Menzel wrote:
>
> That's why he should have used a pseudohash. That would have solved
> all his problems!
>
> Oh, wait...
No, no, that might be true. Granted, he'd have an even greater number
of *new* problems... :-)
dha
--
David H. Adler - <dha@panix.com> - http://www.panix.com/~dha/
Six course banquet of nothing, with a scoop of sod-all for a palate
cleanser - Rupert Giles
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 4898
***************************************