[19526] in Perl-Users-Digest
Perl-Users Digest, Issue: 1721 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Sep 10 00:06:06 2001
Date: Sun, 9 Sep 2001 21:05:07 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <1000094707-v10-i1721@ruby.oce.orst.edu>
Content-Type: text
Perl-Users Digest Sun, 9 Sep 2001 Volume: 10 Number: 1721
Today's topics:
Re: @INC on VMS <Patrick_member@newsguy.com>
Re: Arrays, References and indexes. (Martien Verbruggen)
Re: Confused (again) over complex data structures. <goldbb2@earthlink.net>
Re: Difference between .pl, .cgi, and .pm File Extensio (Damian James)
Happy Rollover <revjack@revjack.net>
Re: inter process communication (Chris Fedde)
Perl docs <e@nospam:[arix.com]>
Re: Perl docs <mjcarman@home.com>
Re: Perl docs <bwalton@rochester.rr.com>
Re: reading lines from a file (TuNNe|ing)
Re: reading lines from a file (TuNNe|ing)
Re: Regular Expression puzzle... <goldbb2@earthlink.net>
Slight regexp problem (lar3ry gensch)
Re: Slight regexp problem <davidhilseenews@yahoo.com>
Re: Slight regexp problem (Tad McClellan)
Re: Slight regexp problem (lar3ry gensch)
Re: Slight regexp problem (lar3ry gensch)
Re: Slight regexp problem <davidhilseenews@yahoo.com>
Re: Slight regexp problem <bart.lateur@skynet.be>
Re: Slight regexp problem (lar3ry gensch)
Re: Slight regexp problem <davidhilseenews@yahoo.com>
Stand-alone Perl programs in Win32 (Tucker McLean)
Re: Where can I find a Perl SSH Telnet Client? <whataman@home.com>
Re: Where can I find a Perl SSH Telnet Client? <tintin@snowy.calculus>
Which Version............please? <dabhar@dabhar.org>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: 9 Sep 2001 14:56:53 -0700
From: Patrick Flaherty <Patrick_member@newsguy.com>
Subject: Re: @INC on VMS
Message-Id: <9ngoj501l23@drn.newsguy.com>
In article <6Hem7.316511$v5.31719849@news1.rdc1.ct.home.com>, Dan says...
>
>Patrick Flaherty <Patrick_member@newsguy.com> wrote:
>
>>How do I define @INC on VMS? (Presumably with a logical name). Looked through
>> the documentation and it's not yet apparent to me.
>
>Just like you do everywhere else. :)
Well I of course tried that but it didn't work (to the best of my
understanding):
RM763A>defi "@INC" U8:[UTIL.PERL5.LIB]
%DCL-I-SUPERSEDE, previous value of @INC has been superseded
RM763A>perl devel_pf_perl:dtrees.pl
Can't locate File/Find.pm in @INC at devel_pf_perl:dtrees.pl line 7.
BEGIN failed--compilation aborted at devel_pf_perl:dtrees.pl line 7.
%RMS-F-SYN, file specification syntax error
RM763A>
>
>The -I flag works fine (quote it so it doesn't get downcased), as does the
>use lib pragma.
>
> Dan
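A minimal sketch of the "use lib" approach Dan mentions. The directory name is hypothetical and written in Unix syntax; substitute the real library location (on VMS, the directory holding File/Find.pm). It only shows that the pragma places the directory at the front of @INC at compile time:

```perl
use strict;
use warnings;

# Hypothetical library directory, for illustration only.
use lib '/hypothetical/perl5/lib';

# "use lib" prepends its arguments to @INC at compile time, so any
# later "use Some::Module" statement searches that directory first.
print "$_\n" for @INC;
```

The -I command-line switch has the same effect for a single run without editing the script.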
------------------------------
Date: Mon, 10 Sep 2001 03:32:15 GMT
From: mgjv@tradingpost.com.au (Martien Verbruggen)
Subject: Re: Arrays, References and indexes.
Message-Id: <slrn9pod1u.6cq.mgjv@verbruggen.comdyn.com.au>
[Please, in the future, if you Cc someone on a post, say so in the
message body. Also, put your reply _after_ the suitably trimmed text
you reply to. It's the generally accepted quoting convention on this
newsgroup, and Usenet in general.]
On Sun, 09 Sep 2001 20:06:43 GMT,
Sudhir Krishnan <sudhir@newmail.net> wrote:
> Actually I simplified the problem a bit, when I posted.
> The way the data is defined, the data structure is 4D.
Ok.
> The way to read the data files. p1 and p2
> first 2 lines: result header
> next 3 lines: cgi header
> next 3 lines: system wide stanza (another header)
>
><main body>
>
> 3rd and 4th last lines: physical system trailer
> 2nd last and last lines: file trailer.
> ####################################################3
>
> The body consists of 'stanzas'
>
> Each stanza starts with "*FC ????????"
>
> So all text, starting with *FC until the next *FC is a stanza.
>
> I process each file and convert each stanza to line format for simplicity,
>
> i.e. into something like:
>
> *FC=????????&data1=value&data2=value ....
>
The following is the output format? You are not keeping any
information about which file this stuff came from?
> Stanzas in the file are grouped together as follows:
>
> System Stanza
> Subsystem Stanza
> Subsystem Stanza
> .
> .
> .
> System Stanza
> subsystem stanza
> subsystem stanza
> .
> .
>
> So each stanza with its subsystem stanzas is what I called
> a 'section'
> The problem is to come up with one output file, that'll contain
> all the system stanzas from all files.
You don't specify whether there's any necessary order between the
stanzas in the different files, or even whether the ordering needs to
be maintained.
> Also all subsystem stanzas should be present under them.
> Corresponding system stanzas can have duplicate sub-system stanza
> entries, duplicates should be removed
This implies that maybe you should use hashes, or a combination of
hashes and arrays if order needs to be preserved, or a hash tie that
preserves order.
If all you're interested in is the 'system stanzas' and each of their
sub stanzas, why do you keep more information in your structured
array?
> Do you think this problem is worthy of perl?
Yep. I'm certain that this can be done perfectly fine in Perl. It's
just a bit too complicated and underspecified for me to work on. I
don't really have the time to try to work all this out from the file
data. My first ideas, however, would be to write a subroutine that
parses a file, returning the header stanza, and a reference to an
array with all the stanzas in the file (I'd probably write a little
subroutine to parse a single stanza from a file handle as well). I'd
collect those in some array, or so, and then postprocess that
structure to flatten it.
You imply that each file contains the same set of system stanzas.
Maybe a structure could be:
Use the system stanzas as a hash key, and the value of this hash key
could be a reference to an array with all the sub stanzas. Or, it
could be a reference to a hash, with all the stanzas. In the latter
case, all the duplication will automatically be taken care of,
although you will lose the order in which the stuff appears. If the
order is important, you can either use a hash that preserves order
(there are some Tie:: modules available from CPAN), or you also keep
an array which you can use for the order.
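A rough sketch of that last variant (a hash for duplicate removal plus arrays for order). The stanza strings and variable names are made up for illustration, and the input is assumed to be already parsed into (system stanza, subsystem stanza) pairs:

```perl
use strict;
use warnings;

# Hypothetical, already-parsed input: [ system stanza, subsystem stanza ].
my @pairs = (
    [ '*FC=SYS1', '*FC=SUB1&data1=a' ],
    [ '*FC=SYS1', '*FC=SUB2&data1=b' ],
    [ '*FC=SYS2', '*FC=SUB3&data1=c' ],
    [ '*FC=SYS1', '*FC=SUB1&data1=a' ],    # duplicate from a second file
);

my %seen;     # $seen{system}{subsystem} counts occurrences (finds duplicates)
my %subs;     # $subs{system} = [ subsystem stanzas in first-seen order ]
my @order;    # system stanzas in first-seen order

for my $pair (@pairs) {
    my ($system, $sub) = @$pair;
    push @order, $system unless exists $seen{$system};
    next if $seen{$system}{$sub}++;        # skip duplicates
    push @{ $subs{$system} }, $sub;
}

# Emit each system stanza followed by its unique subsystem stanzas.
for my $system (@order) {
    print "$system\n";
    print "  $_\n" for @{ $subs{$system} };
}
```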
[Jeopardectomy performed]
Martien
--
Martien Verbruggen |
Interactive Media Division | 42.6% of statistics is made up on the
Commercial Dynamics Pty. Ltd. | spot.
NSW, Australia |
------------------------------
Date: Mon, 10 Sep 2001 00:05:09 -0400
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: Confused (again) over complex data structures.
Message-Id: <3B9C3BF5.F10A6B8F@earthlink.net>
Stan Brown wrote:
>
> I have a reference to an array of references to arrays, as returned by
> DBI::fetchrow_arrayref. I am able to access individual scalars from
> this structure using something like:
>
> $records_array_ref->[($pos - 1)]->[$i]
>
> Now, I'm trying to pass each of the individual arrays to a DBI insert
> statement. The statement is defined like this:
>
> INSERT INTO stan VALUES ( ?, ?, ?, ?, ?, ?, ? )
>
> And I'm currently doing something like:
>
> foreach (@$records_array_ref)
> {
> $stho->execute (@$records_array_ref->[$i++]);
> }
Foreach iterates over the given array and stores each element in turn
either in the variable named between "foreach" and "(", or in $_ if
there isn't one.
What I think you want is:
foreach (@$records_array_ref) {
    $stho->execute( @$_ );
}
> But what is getting put in the DB are the references, not the data.
There is a bug in perl which makes @foo->[x] act like $foo[x]. So when
you do @$r_a_r->[i], it acts like ${$r_a_r}[i], which in turn is
equivalent to $r_a_r->[i]. You've already said that @$r_a_r is a list
of array references, so $r_a_r->[i] is an arrayref. That is why what is
getting put into the DB are references.
Note that if that particular bug were fixed, then @foo->[x] would result
in something like "3"->[x], which means $3[x] (or some other number
instead of 3, depending on the length of @foo).
>
> What am I doing wrong?
You should be using the iterator value which foreach provides, and be
doing:
foreach (@$records_array_ref) {
    $stho->execute( @$_ );
}
Or better yet:
while( my $arr = $sth_a->fetchrow_arrayref ) {
    $sth_b->execute( @$arr );
}
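To see the difference between passing the row reference and passing the dereferenced row without needing a database, here is a small sketch. The sub fake_execute is a made-up stand-in for $sth->execute that just records its arguments, and the row data is invented:

```perl
use strict;
use warnings;

# Stand-in for fetched data: a reference to an array of array references.
my $records_array_ref = [
    [ 1, 'alice', 'x' ],
    [ 2, 'bob',   'y' ],
];

# Hypothetical stand-in for $sth->execute: records the arguments it got.
my @calls;
sub fake_execute { push @calls, [@_] }

for my $row (@$records_array_ref) {
    fake_execute($row);     # one argument: the array reference itself
    fake_execute(@$row);    # three arguments: the values in the row
}

printf "%d argument(s)\n", scalar @$_ for @calls;
```

The first call in each iteration hands over a single reference, which is what ended up in the database; the second hands over the actual column values.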
--
"I think not," said Descartes, and promptly disappeared.
------------------------------
Date: 9 Sep 2001 22:24:12 GMT
From: damian@qimr.edu.au (Damian James)
Subject: Re: Difference between .pl, .cgi, and .pm File Extensions.
Message-Id: <slrn9pnqsr.o41.damian@puma.qimr.edu.au>
Trewth Seeker chose 6 Sep 2001 18:44:55 -0700 to say this:
>"Jürgen Exner" <jurgenex@hotmail.com> wrote in message news:<3b8fcb68$1@news.microsoft.com>...
>> "Trewth Seeker" <trewth_seeker@yahoo.com> wrote in message
>> news:d690a633.0108302015.60293f45@posting.google.com...
>> > .pl and .cgi files also must be "installed onto a server" -- where *else*
>> > do you expect them to be?
>>
>> Hmm, then I guess my computer is broken.
>> If I type "foobar.cgi" at the command line my computer will run the program
>> "foobar.cgi". No server of whatever kind involved here.
>
>Your PC is a server of a kind.
Dude, you are confusing the notion of a machine with web-server software
running (which would be necessary to provide a CGI environment) with a
machine that merely has perl installed -- which need not be serving
anything at all. This suggests that you are making an implicit assumption
that Perl is only for CGI. Many people here would find that not merely
wrong, but also vaguely offensive. Myself, I wouldn't even go so far as to
assume that Juergen is using a peecee...
I suggest you refrain from making blanket statements involving such
generalisations until you have these details cleared up at least.
:-)
Cheers,
Damian
--
@:=grep!(m!$/|#!..$|),split//,<DATA>;@;=0..$#:;while($:=@;){$;=rand
$:--,@;[$;,$:]=@;[$:,$;]while$:;push@|,shift@;if$;[0]==@|;select$,,
$,,$,,1/80;print qq x\bxx((@;+@|)*$|++),@:[@|,@;],!@;&&$/} __END__
Just another Perl Hacker, ### rev 3.3 -- stupidectomy performed :-)
------------------------------
Date: 10 Sep 2001 03:07:28 GMT
From: revjack <revjack@revjack.net>
Subject: Happy Rollover
Message-Id: <9nhapg$njh$1@news1.Radix.Net>
Keywords: Hexapodia as the key insight
~$ perl -wle 'print length time'
10
Happy rollover. Is the world supposed to end now?
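For reference, the rollover is the moment the epoch time reached 1,000,000,000 seconds, the first 10-digit value. A short sketch that renders that instant in UTC (the variable names are just for illustration):

```perl
use strict;
use warnings;

# The first 10-digit epoch value.
my $rollover = 1_000_000_000;

# gmtime in scalar context returns a ctime-style string in UTC.
my $stamp = gmtime($rollover);

print "$stamp\n";                 # Sun Sep  9 01:46:40 2001
print length($rollover), "\n";    # 10
```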
--
___________________
revjack@revjack.net
------------------------------
Date: Sun, 09 Sep 2001 22:06:48 GMT
From: cfedde@fedde.littleton.co.us (Chris Fedde)
Subject: Re: inter process communication
Message-Id: <YJRm7.258$Owe.255086080@news.frii.net>
In article <3B9B7C45.E2C7E1B9@cs.man.ac.uk>,
Andrew Paul Gorton <gortona@cs.man.ac.uk> wrote:
>Hi,
>
>I have a client/server where the server receives requests from the
>client and forks a number of processes which will continue to run until
>the client sends a stop.
>
>The client has the ability to send reconfiguration data to the server
>which passes this to the relevant processes. The server also needs the
>ability to get information from the processes when asked by the client.
>
>Therefore the server needs bi-directional communication with the
>processes it has forked. At the moment the server writes parameters to
>a file, which the processes reads. The processes then write to a file,
>which the server reads to send back data to the client. However, with a
>large number of processes there will be a concurrence issue.
>
>Is there a neater way to achieve this communication
>
>Any help appreciated
The standard perl manual page called perlipc has several examples
of doing this kind of thing. And depending on how sophisticated
you need your server to be you might consider looking at the IO::Select,
Event, or POE modules.
Good Luck.
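As a rough illustration of one technique perlipc covers, here is a sketch of bidirectional parent/child communication over a socketpair instead of shared files. It assumes a Unix-like system with fork and AF_UNIX sockets; the handle names and the message text are invented:

```perl
use strict;
use warnings;
use Socket;
use IO::Handle;

# One connected pair of sockets: the parent keeps $to_child,
# the forked worker keeps $to_parent.
socketpair(my $to_child, my $to_parent, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
$to_child->autoflush(1);
$to_parent->autoflush(1);

my $pid = fork();
die "fork: $!" unless defined $pid;

if ($pid == 0) {
    # Worker: read one request line, send one reply line, then exit.
    close $to_child;
    chomp(my $request = <$to_parent>);
    print {$to_parent} "worker got: $request\n";
    close $to_parent;
    exit 0;
}

# Server: send reconfiguration data and read the worker's answer.
close $to_parent;
print {$to_child} "reconfigure interval=5\n";
chomp(my $reply = <$to_child>);
close $to_child;
waitpid($pid, 0);

print "$reply\n";
```

With many workers the server would keep one such handle per child and use something like IO::Select to wait on all of them at once.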
--
This space intentionally left blank
------------------------------
Date: Sun, 9 Sep 2001 17:28:05 -0700
From: "ekkis" <e@nospam:[arix.com]>
Subject: Perl docs
Message-Id: <eZTm7.454$884.141247@news.pacbell.net>
does anyone know where I could download the Perl HTML documentation? I
don't want to keep using perldoc.com because it's sometimes unavailable so
I'd like to install this on my web server... but I can't find it!
- e
------------------------------
Date: Mon, 10 Sep 2001 01:14:44 GMT
From: Michael Carman <mjcarman@home.com>
Subject: Re: Perl docs
Message-Id: <3B9C138A.9080704@home.com>
ekkis wrote:
> does anyone know where I could download the Perl HTML documentation? I
> don't want to keep using perldoc.com because it's sometimes unavailable so
> I'd like to install this on my web server... but I can't find it!
You should already have the pods; you can use pod2html to convert them
into HTML.
-mjc
------------------------------
Date: Mon, 10 Sep 2001 01:21:55 GMT
From: Bob Walton <bwalton@rochester.rr.com>
Subject: Re: Perl docs
Message-Id: <3B9C15AC.1F9CCF24@rochester.rr.com>
ekkis wrote:
>
> does anyone know where I could download the Perl HTML documentation? I
> don't want to keep using perldoc.com because it's sometimes unavailable so
> I'd like to install this on my web server... but I can't find it!
>
> - e
Well, you could check out pod2html, which is a program that should
already be on your hard drive. That will generate HTML which matches
your POD, module for module and version for version. Or, if you have a
Windoze machine available, you could install ActiveState Perl and grab
the HTML from that.
--
Bob Walton
------------------------------
Date: Mon, 10 Sep 2001 02:09:34 GMT
From: troll@gimptroll.com (TuNNe|ing)
Subject: Re: reading lines from a file
Message-Id: <3b9c1ece.35992990@news.coserv.net>
On 6 Sep 2001 14:02:55 -0700, neil.elder@teradyne.com (Neil Elder)
wrote:
[snip]
>$fail = open(SFILE, ">D:/inetpub/wwwroot/directlink/resumetrack/dir.pass");
[/snip]
You're opening the file for output. Try:
$fail = open(SFILE,
">D:/inetpub/wwwroot/directlink/resumetrack/dir.pass");
TuNNe|ing
------------------------------
Date: Mon, 10 Sep 2001 02:10:42 GMT
From: troll@gimptroll.com (TuNNe|ing)
Subject: Re: reading lines from a file
Message-Id: <3b9c210c.36567268@news.coserv.net>
On Mon, 10 Sep 2001 02:09:34 GMT, troll@gimptroll.com (TuNNe|ing)
wrote:
bah
$fail = open(SFILE,
"<D:/inetpub/wwwroot/directlink/resumetrack/dir.pass");
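A small sketch of the same idea using a lexical filehandle, the three-argument form of open, and an explicit error check. A temporary file stands in for dir.pass so the example can run anywhere; the file contents are invented:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for dir.pass so the example is self-contained.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "first\nsecond\n";
close $tmp or die "close: $!";

# '<' opens for reading; '>' would have truncated the file instead.
open(my $in, '<', $path) or die "Can't open $path for reading: $!";
chomp(my @lines = <$in>);
close $in;

print scalar(@lines), " lines read\n";
```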
------------------------------
Date: Sun, 09 Sep 2001 23:50:40 -0400
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: Regular Expression puzzle...
Message-Id: <3B9C3890.DA5B86AC@earthlink.net>
Fred wrote:
>
> for me, anyway...I think I'm almost there, but...
>
> I have data in the form of
>
> XXX XXXXX XXXXXX
> X X XXXXX XXX
> X X X X XXXXXX XXXX
> X X & X XXXXX
>
> where X can be either a number or letter, all caps.
>
> I need to collapse the single letters/single spaces.
>
> ie,
>
> X X X X YYYYY YYYY
>
> to
>
> XXXX YYYYY YYYY
>
> and
>
> X X & X XXXXX
>
> to
>
> XX&X XXXXX
$test =~ s/((?<=\s\S)|(?<=^\S)) (?=\S\s|\S$)//g;
Remove any space which is both preceded by just a single letter and
followed by just a single letter.
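A sketch of that substitution wrapped in a hypothetical helper sub named collapse, using a non-capturing group and applied to one line at a time (so ^ and $ mean the start and end of the line):

```perl
use strict;
use warnings;

# Remove a space only when a lone character sits on both sides of it.
sub collapse {
    my ($text) = @_;
    $text =~ s/(?:(?<=\s\S)|(?<=^\S)) (?=\S\s|\S$)//g;
    return $text;
}

print collapse($_), "\n" for 'X X X X YYYYY YYYY', 'X X & X XXXXX';
```

For the two sample lines this prints "XXXX YYYYY YYYY" and "XX&X XXXXX".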
--
"I think not," said Descartes, and promptly disappeared.
------------------------------
Date: Sun, 09 Sep 2001 22:19:36 GMT
From: lar3ry@mediaone.net (lar3ry gensch)
Subject: Slight regexp problem
Message-Id: <YVRm7.7683$e55.1050801@typhoon.ne.mediaone.net>
I have written a text reformatter. This basically takes text and
ensures it follows some simple grammatical rules (e.g., two spaces after
periods, exclams, question marks; one space after ellipses and colons,
etc.). It also does automatic indentation and limits line lengths to 62
characters, etc. The program does this by collecting all "lines" in a
file, separating them into paragraph strings. It then runs
substitutions on the strings to emit the paragraphs properly.
The script is easy to write and maintain in perl, but I have come up
with a slight problem.
Some of the text that I'm reformatting has some illegal constructs (in
a grammatical sense).
Like:
Now, I just have my doctorate to worry about........
The problem with the above line is that there are too many periods on
the ellipsis. I have had a perl line to fix things like that:
s/\s*\.\.+([ !?"])?/...$1/g;
(Remember, these substitutions happen on a per-paragraph level, with
all newlines having previously been removed.)
The problem is that on the input line above (about the doctorate), the
ellipsis is the final entry in the paragraph. This perl code
generates a run-time error, since the grouping operator is followed by
a question mark (zero or one matches), and the $1 is referring to a
capture buffer that doesn't exist in this case:
Use of uninitialized value in concatenation (.) at
/home/lar3ry/scripts/fmt.pl line 115, <> line 2.
(However, the actual replacement works as desired.)
The problem is that I don't want the error message emitted at all.
I *KNOW* that there will be some cases where the replacement will not
be found, and I want it to be null.
When I use my perl script in an editor buffer, the editor picks up all
standard output and standard error, and perl's message is being dumped
into my buffer. And bad ellipses like the above usually
occur at the end of paragraphs.
I'd hate to have to run the program with standard error
redirected to /dev/null, because if there IS a problem with my
program, I'd like to be able to know about it as soon as is possible.
But having to search for these error messages is tiresome.
The function of the code I need is to do the following, specifically:
Look for something that looks like an ellipsis (two or more
periods). Remove any white space immediately prior to the
ellipsis, and make sure it is replaced by exactly three periods.
Make sure it is followed by an end of paragraph, an exclam, a
question mark, a quotation mark, or a space.
(I know, technically "What...?" is invalid grammar, but I prefer to
allow this particular violation.)
If anybody has a (hopefully short) replacement for my line of perl
code, I'd be grateful!
--
lar3ry gensch lar3ry@mediaone.net
"As God is my witness, I thought turkeys could fly" - Arthur Carlson
------------------------------
Date: Sun, 09 Sep 2001 22:47:23 GMT
From: "David Hilsee" <davidhilseenews@yahoo.com>
Subject: Re: Slight regexp problem
Message-Id: <%jSm7.85293$hT4.22066878@news1.rdc1.md.home.com>
<snip>
> Some of the text that I'm reformatting has some illegal constructs (in
> a grammatical sense).
>
> Like:
>
> Now, I just have my doctorate to worry about........
>
> The problem with the above line is that there are too many periods on
> the ellipsis. I have had a perl line to fix things like that:
>
> s/\s*\.\.+([ !?"])?/...$1/g;
>
<snip>
> The problem is that on the input line above (about the doctorate), the
> ellipsis is the final entry in the paragraph. This perl code
> generates a run-time error, since the grouping operator is followed by
> a question mark (zero or one matches), and the $1 is referring to a
> capture buffer that doesn't exist in this case:
>
> Use of uninitialized value in concatenation (.) at
> /home/lar3ry/scripts/fmt.pl line 115, <> line 2.
>
> (However, the actual replacement works as desired.)
>
> The problem is that I don't want the error message emitted at all.
<snip>
Why are you bothering to match the character afterwards? It sounds like you
want to just eliminate all the excess periods in any ellipsis. If that is
the case, then why won't something like this be suitable?
$ perl -we '$_="There is.....um ..excuse me .....?"; s|\s*\.{2,}|...|g;
print;'
There is...um...excuse me...?
If there is something that is missed by this regex, please explain.
Also, be sure to look at the text-parsing modules on CPAN.
David Hilsee
------------------------------
Date: Sun, 09 Sep 2001 22:56:25 GMT
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: Slight regexp problem
Message-Id: <slrn9pnq2j.r3n.tadmc@tadmc26.august.net>
lar3ry gensch <lar3ry@mediaone.net> wrote:
>Some of the text that I'm reformatting has some illegal constructs (in
>a grammatical sense).
>
>Like:
>
> Now, I just have my doctorate to worry about........
>
>The problem with the above line is that there are too many periods on
>the ellipsis. I have had a perl line to fix things like that:
>
> s/\s*\.\.+([ !?"])?/...$1/g;
>
>The problem is that on the input line above (about the doctorate), the
>ellipsis is the final entry in the paragraph. This perl code
>generates a run-time error,
^^^^^^^^^^^^^^
No it doesn't.
>since the grouping operator is followed by
>a question mark (zero or one matches), and the $1 is referring to a
>capture buffer that doesn't exist in this case:
>
> Use of uninitialized value in concatenation (.) at
> /home/lar3ry/scripts/fmt.pl line 115, <> line 2.
That is a "warning", not an "error", message.
>If anybody has a (hopefully short) replacement for my line of perl
>code, I'd be grateful!
If you use lookahead, then you won't need to use $1 at all:
s/\s*\.\.+(?=[ !?"]|$)/.../g;
I don't think I'd go with the "2 or more dots is ellipsis" rule though.
If they could leave 1 dot out of an ellipsis, then they could add
1 extra dot to a period. I wouldn't want to change (what was supposed to be)
a period into an ellipsis. I'd go with "3 or more":
s/\s*\.{3,}(?=[ !?"]|$)/.../g;
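A quick sketch that runs the "3 or more" version against the sample sentence and counts warnings. The __WARN__ handler and variable names are only there for the demonstration:

```perl
use strict;
use warnings;

# Collect any warnings instead of printing them, so they can be counted.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $para = 'Now, I just have my doctorate to worry about........';

# The lookahead checks what follows the dots without capturing it,
# so the replacement never refers to an undefined $1.
(my $fixed = $para) =~ s/\s*\.{3,}(?=[ !?"]|$)/.../g;

print "$fixed\n";
print scalar(@warnings), " warnings\n";
```

This prints the sentence ending in exactly three periods, followed by "0 warnings".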
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: Sun, 09 Sep 2001 22:59:07 GMT
From: lar3ry@mediaone.net (lar3ry gensch)
Subject: Re: Slight regexp problem
Message-Id: <%uSm7.7687$e55.1051573@typhoon.ne.mediaone.net>
In article <%jSm7.85293$hT4.22066878@news1.rdc1.md.home.com>,
"David Hilsee" <davidhilseenews@yahoo.com> writes:
> <snip>
>
> Why are you bothering to match the character afterwards? It sounds like you
> want to just eliminate all the excess periods in any ellipsis. If that is
> the case, then why won't something like this be suitable?
>
> $ perl -we '$_="There is.....um ..excuse me .....?"; s|\s*\.{2,}|...|g;
> print;'
> There is...um...excuse me...?
>
> If there is something that is missed by this regex, please explain.
>
> Also, be sure to look at the text-parsing modules on CPAN.
The result doesn't follow the ruleset that I outlined. There should
be one space after each ellipsis, except if it is followed by a
sentence-ending punctuation character (question mark, exclamation
point, quotation mark).
--
lar3ry gensch lar3ry@mediaone.net
"As God is my witness, I thought turkeys could fly" - Arthur Carlson
------------------------------
Date: Sun, 09 Sep 2001 23:14:58 GMT
From: lar3ry@mediaone.net (lar3ry gensch)
Subject: Re: Slight regexp problem
Message-Id: <SJSm7.7689$e55.1052030@typhoon.ne.mediaone.net>
In article <slrn9pnq2j.r3n.tadmc@tadmc26.august.net>,
tadmc@augustmail.com (Tad McClellan) writes:
> lar3ry gensch <lar3ry@mediaone.net> wrote:
>
>> Use of uninitialized value in concatenation (.) at
>> /home/lar3ry/scripts/fmt.pl line 115, <> line 2.
>
> That is a "warning", not an "error", message.
Correct. Poor choice of words on my part.
> If you use lookahead, then you won't need to use $1 at all:
>
> s/\s*\.\.+(?=[ !?"]|$)/.../g;
Actually, to get what I wanted, I used your pattern, and added a second
to ensure that a single space follows an ellipsis where appropriate:
s/\s*\.\.+(?=[ !?"]|$)/.../g;
s/\.\.\.([^ !?])/... $1/g;
> I don't think I'd go with the "2 or more dots is ellipsis" rule though.
> If they could leave 1 dot out of an ellipsis, then they could add
> 1 extra dot to a period. I wouldn't want to change (what was supposed to be)
> a period into an ellipsis. I'd go with "3 or more":
In most of the places where I've found two periods, I've found that an
ellipsis was intended. I tend to proofread after running the
formatter (the formatter being the first step in my proofreading
cycle), and I can pick up the places where ellipsis was inserted but
not intended.
The formatter was originally written about half a year ago, and it has
evolved since then. It now handles most of the stuff I throw at it
very nicely. It has to be pseudo-intelligent in order to get spacing
correct. For example, an abbreviated word like "Mr." should have a
single space following it, so I had to enter a list of common
initials into the formatter (St. Ave. Co. Dr. Jr. Ln. Lt. Mr.
Mrs. etc.) so that it handled the spacing just right.
Anyway, thanks for the quick answer!
--
lar3ry gensch lar3ry@mediaone.net
"As God is my witness, I thought turkeys could fly" - Arthur Carlson
------------------------------
Date: Sun, 09 Sep 2001 23:26:52 GMT
From: "David Hilsee" <davidhilseenews@yahoo.com>
Subject: Re: Slight regexp problem
Message-Id: <0VSm7.85713$hT4.22101186@news1.rdc1.md.home.com>
"lar3ry gensch" <lar3ry@mediaone.net> wrote in message
news:%uSm7.7687$e55.1051573@typhoon.ne.mediaone.net...
> In article <%jSm7.85293$hT4.22066878@news1.rdc1.md.home.com>,
> "David Hilsee" <davidhilseenews@yahoo.com> writes:
> > <snip>
> >
> > Why are you bothering to match the character afterwards? It sounds like you
> > want to just eliminate all the excess periods in any ellipsis. If that is
> > the case, then why won't something like this be suitable?
> >
> > $ perl -we '$_="There is.....um ..excuse me .....?"; s|\s*\.{2,}|...|g;
> > print;'
> > There is...um...excuse me...?
> >
> > If there is something that is missed by this regex, please explain.
> >
> > Also, be sure to look at the text-parsing modules on CPAN.
>
> The result doesn't follow the ruleset that I outlined. There should
> be one space after each ellipsis, except if it is followed by a
> sentence-ending punctuation character (question mark, exclamation
> point, quotation mark).
Your explanation of the problem was
Look for something that looks like an ellipsis (two or more
periods). Remove any white space immediately prior to the
ellipsis, and make sure it is replaced by exactly three periods.
Make sure it is followed by an end of paragraph, an exclam, a
question mark, a quotation mark, or a space.
The last line was what caused the communication problem, as it was fuzzy.
Your code was suggesting that whatever was after the ellipsis didn't matter
(optional match - ?), so I used that as an explanation instead. If it's not
optional, but you don't want to match it, then a lookahead is something to
look at. But I believe Tad already supplied that solution.
David Hilsee
------------------------------
Date: Mon, 10 Sep 2001 00:18:31 GMT
From: Bart Lateur <bart.lateur@skynet.be>
Subject: Re: Slight regexp problem
Message-Id: <8l1opt8f8sv1qnka6t3ndelcchamsc8di2@4ax.com>
lar3ry gensch wrote:
>There should
>be one space after each ellipsis, except if it is followed by a
>sentence-ending punctuation character (question mark, exclamation
>point, quotation mark).
So a normal sentence, ordinarily terminated by a dot, can't end in an
ellipsis?
--
Bart.
------------------------------
Date: Mon, 10 Sep 2001 01:57:54 GMT
From: lar3ry@mediaone.net (lar3ry gensch)
Subject: Re: Slight regexp problem
Message-Id: <C6Vm7.7699$e55.1060439@typhoon.ne.mediaone.net>
In article <8l1opt8f8sv1qnka6t3ndelcchamsc8di2@4ax.com>,
Bart Lateur <bart.lateur@skynet.be> writes:
> lar3ry gensch wrote:
>>There should
>>be one space after each ellipsis, except if it is followed by a
>>sentence-ending punctuation character (question mark, exclamation
>>point, quotation mark).
> So a normal sentence, ordinarily terminated by a dot, can't end in an
> ellipsis?
That's where any automated solution fails. In order to get things
perfect, one would have to have a perl script that can actually
understand English. You cannot even "look ahead" to the next word to
see if it starts with a capital letter, as the next word might be a
proper noun. Writing a perl script that understands conversational
English was beyond the scope of my formatter.
Another place where spacing after a period can go wrong is when the
last word of a sentence is an abbreviation. I tend to avoid that
construct when I'm writing (as in "Thanks again, Mister.") But if
somebody abbreviated the last word, the formatter will only add a
single space.
This is why my formatter is only the first step (albeit a very useful
one) in the process of editing.
Perl is ideally suitable for finding and fixing the most common
grammatical problems. To take another example outside what I've
discussed before, the rules of grammar require that no space appear
between two words that are joined with an "em dash." In US-ASCII text,
an em-dash is represented by two (and only two) consecutive hyphens.
The perl formatter I wrote can handle the cases where there is white
space before or after two dashes.
I found a web site called "Guide to Grammar and Writing" which
itemized the various "rules" for sentence layout:
http://ccc.commnet.edu/grammar/
It was a simple matter to codify those rules in perl to make my
formatter, which I am still tweaking as I find odd bits that get
through the formatter.
Anyway, I'd like to once again thank the people (David and Tad) that
gave me enough information in order to get my script to work
properly. I'm very surprised to have gotten an answer so quickly on
USENET... I'm not used to such quick turnarounds!
--
lar3ry gensch lar3ry@mediaone.net
"As God is my witness, I thought turkeys could fly" - Arthur Carlson
------------------------------
Date: Mon, 10 Sep 2001 02:15:37 GMT
From: "David Hilsee" <davidhilseenews@yahoo.com>
Subject: Re: Slight regexp problem
Message-Id: <dnVm7.86533$hT4.22252190@news1.rdc1.md.home.com>
"lar3ry gensch" <lar3ry@mediaone.net> wrote in message
news:SJSm7.7689$e55.1052030@typhoon.ne.mediaone.net...
> In article <slrn9pnq2j.r3n.tadmc@tadmc26.august.net>,
> tadmc@augustmail.com (Tad McClellan) writes:
> > lar3ry gensch <lar3ry@mediaone.net> wrote:
> >
> >> Use of uninitialized value in concatenation (.) at
> >> /home/lar3ry/scripts/fmt.pl line 115, <> line 2.
> >
> > That is a "warning", not an "error", message.
>
> Correct. Poor choice of words on my part.
>
> > If you use lookahead, then you won't need to use $1 at all:
> >
> > s/\s*\.\.+(?=[ !?"]|$)/.../g;
>
> Actually, to get what I wanted, I used your pattern, and added a second
> to ensure that a single space follows an ellipsis where appropriate:
>
> s/\s*\.\.+(?=[ !?"]|$)/.../g;
> s/\.\.\.([^ !?])/... $1/g;
>
I don't quite understand this solution, and I'm curious if that's what you
want. The first regular expression targets ellipses that are followed by a
space, an exclamation mark, a question mark, a double quote, or the end of
the string. What about other ellipses?
For example, I had a string that I used in my code:
$ perl -wle '$_="There is.....um ..excuse me ..... ?";
s/\s*\.\.+(?=[ !?"]|$)/.../g; print;
s/\.\.\.([^ !?])/... $1/g; print;'
There is.....um ..excuse me... ?
There is... ..um ..excuse me... ?
Note the two print statements to show the two passes. I would think that
this is more in line with what you want:
$ perl -wle '$_="There is.....um ..excuse me ..... ?";
s|\s*\.{2,}\s*|...|g;
print;s/\.\.\.([^!?"])/... $1/g;print;'
There is...um...excuse me...?
There is... um... excuse me...?
That is, replacing all of the incorrect or correct ellipses with three
periods and simultaneously removing all of the whitespace surrounding them,
then going back and putting spaces in where they are required according to
your punctuation scheme.
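The same two passes as a small script, wrapped in a hypothetical helper sub named fix_ellipses so the behaviour at the end of a paragraph can be checked as well:

```perl
use strict;
use warnings;

sub fix_ellipses {
    my ($text) = @_;
    # Pass 1: normalise every run of two or more periods to exactly
    # three, dropping any whitespace on either side.
    $text =~ s/\s*\.{2,}\s*/.../g;
    # Pass 2: put one space back unless the ellipsis is followed by
    # '!', '?', a double quote, or the end of the string.
    $text =~ s/\.\.\.([^!?"])/... $1/g;
    return $text;
}

print fix_ellipses('There is.....um ..excuse me ..... ?'), "\n";
print fix_ellipses('Now, I just have my doctorate to worry about........'), "\n";
```

Because the second pass only matches when a character follows the ellipsis, a paragraph-final ellipsis is left alone and $1 is never undefined.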
David Hilsee
------------------------------
Date: 9 Sep 2001 20:36:43 -0700
From: tucker@noodleroni.com (Tucker McLean)
Subject: Stand-alone Perl programs in Win32
Message-Id: <7a18cc04.0109091936.7079d49e@posting.google.com>
Hi,
I was wondering what I could do to make my Perl program run by itself
without Perl having to be around (such as if I distributed it). If
possible, maybe without having to install Cygwin and using the
compiler. I know there are several ways to do this with Tcl/Tk. Any
help is appreciated.
Thanks,
Tucker
_____________________
Esse quam vederi, Rather to be than to seem.
------------------------------
Date: Mon, 10 Sep 2001 01:42:46 GMT
From: "What A Man !" <whataman@home.com>
Subject: Re: Where can I find a Perl SSH Telnet Client?
Message-Id: <3B9C1AA2.9A9A1AE2@home.com>
peter pilsl wrote:
>
> What A Man ! wrote:
>
> > Does anyone know where I can find an SSH Client that's written in Perl
> > that will allow me to input telnet commands thru SSH Telnet?
> >
>
> You can't be such a man, when you can't look at www.cpan.org, where my search
> for ssh returned two modules. Got your powers in the wrong place?
>
Perhaps, but being stupid is not analogous to being a man.
> If these ssh-modules at cpan don't do what you need (I didn't take a look at
> them) you can still use any command-line-based ssh-client and let perl
> communicate with it.
Thanks, I really doubted that Perl had any SSH modules so I didn't
check, but admit that I should have. I did check www.openssh.org and it
said there was a unix client, but the only URL was in Finland, and it
would never respond. I've now looked at the SSH modules that the other
poster recommended, and still don't quite understand how to go about
putting it together in a script. I was really hoping there was a "perl
SSH telnet client script" already written by someone, so I wouldn't have
to re-invent the wheel. Seems like I saw a perl telnet client in one of
Randal Schwartz's articles, if I can find it.
Regards,
Dennis
------------------------------
Date: Mon, 10 Sep 2001 11:56:09 +1000
From: "Tintin" <tintin@snowy.calculus>
Subject: Re: Where can I find a Perl SSH Telnet Client?
Message-Id: <06Vm7.12$044.236667@news.interact.net.au>
"What A Man !" <whataman@home.com> wrote in message
news:3B9C1AA2.9A9A1AE2@home.com...
> peter pilsl wrote:
> Thanks, I really doubted that Perl had any SSH modules so I didn't
> check, but admit that I should have. I did check www.openssh.org and it
> said there was a unix client, but the only URL was in Finland, and it
> would never respond. I've now looked at the SSH modules that the other
> poster recommended, and still don't quite understand how to go about
> putting it together in a script. I was really hoping there was a "perl
> SSH telnet client script" already written by someone, so I wouldn't have
> to re-invent the wheel. Seems like I saw a perl telnet client in one of
> Randal Schwartz's articles, if I can find it.
I'm really confused. Why would you want a SSH "telnet" client?
------------------------------
Date: Mon, 10 Sep 2001 03:03:57 GMT
From: "Dabhar" <dabhar@dabhar.org>
Subject: Which Version............please?
Message-Id: <x4Wm7.156400$B37.3497955@news1.rdc1.bc.home.com>
Thanks for reading my post. I'm trying to figure out which distribution of
Perl to start with. I'm working on a windows machine, and there seems to be
a choice between a few of them, like ActiveState. I would really appreciate
some advice on which to choose.
Thank you so much.
Dabhar
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 1721
***************************************