[9189] in Perl-Users-Digest
Perl-Users Digest, Issue: 2806 Volume: 8
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Jun 4 12:17:29 1998
Date: Thu, 4 Jun 98 09:00:23 -0700
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Thu, 4 Jun 1998 Volume: 8 Number: 2806
Today's topics:
Re: -d ? 3 : 7 is ambiguous, how comes? <jdporter@min.net>
Re: anybody installed perl in VMS? <Thomas.Kratz@lrp.de.nospam>
Re: Member Registration/Login CGI Script Help <rootbeer@teleport.com>
Re: order of execution of print statements? (Mike Stok)
Re: order of execution of print statements? <Dave.Cross@gb.swissbank.com>
Re: pattern match (Mike Stok)
Re: pattern match <jdporter@min.net>
Re: Please Help: How do you make a vote/polling script? <jefpin@bergen.org>
Problem with namespaces colliding? <cdm2@formalsys.ca>
Re: Problem with namespaces colliding? <rootbeer@teleport.com>
Re: searching <jenkinsrd@cf.ac.uk>
Re: searching (John Klassa)
Re: Sendmail script <rootbeer@teleport.com>
splitting via LEADING whitespace (Scott DiNitto)
Re: splitting via LEADING whitespace <Tony.Curtis+usenet@vcpc.univie.ac.at>
Re: splitting via LEADING whitespace (Honza Pazdziora)
Re: splitting via LEADING whitespace (Earl Hood)
Re: splitting via LEADING whitespace <quednauf@nortel.co.uk>
Re: Taking data out of strings <aqumsieh@matrox.com>
Re: Use of HTML, POD, etc in Usenet (was: Re: map in vo <jdporter@min.net>
Re: Use of HTML, POD, etc in Usenet (was: Re: map in vo (Earl Hood)
Digest Administrivia (Last modified: 8 Mar 97) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Thu, 04 Jun 1998 14:02:25 GMT
From: John Porter <jdporter@min.net>
Subject: Re: -d ? 3 : 7 is ambiguous, how comes?
Message-Id: <3576AA94.5544@min.net>
François Pinard wrote:
>
> Shouldn't it read it as written first, before hypothesizing various errors?
> In the expression "-d ? 3 : 7", as written, I still do not see a possible
> ambiguous interpretation. Is there one?
You forget (if you ever knew, that is) that perl, like most computer
language parsers, scans from left to right. Imagine:
$a = -d ? 3 : 7; $b = -d ? 5 : 9;
This time, perl gives a different warning:
Number found where operator expected at - line 2,
near "? 3 : 7; $b = -d ? 5"
(Missing operator before 5?)
Perl thinks you wrote a regex (a ?PATTERN? match) followed immediately by a 5.
How can it tell that's not what you meant?
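One way to sidestep the ambiguity, sketched here on the assumption that the test is meant to apply to a path you already hold in a variable, is to give -d an explicit operand. The parser then expects an operator next, so the ? can only be the conditional. The helper name dir_score is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: 3 for a directory, 7 for anything else.
# With an explicit operand after -d, the "?" that follows can only
# be the conditional operator, not the start of a ?PATTERN? match.
sub dir_score {
    my ($path) = @_;
    return -d $path ? 3 : 7;
}

print dir_score('.'), "\n";    # '.' is a directory
```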
> I'm already used to writing "@x = (1) x @x", indeed, and was just trying
> to simplify this. I also quite agree that "@x = (1)" will only
> set the first element to 1.
But it's true for any value.
@x = $n;
is always the same as
@x = ($n);
> So, you say that a scalar in list context is interpreted as a list
> containing only that scalar.
No, he didn't say that. Only clueless newbies say that.
It isn't true. In fact, "there is no such thing as a scalar in a
list context." This has been belabored many MANY times in this
newsgroup. In a nutshell:
Just because you know how a lexical expression will be
evaluated in a list context, doesn't mean you know how
it will be evaluated in a scalar context. It might
not have any meaningful evaluation in a scalar context.
But you were inappropriately generalizing. Variables, including
array variables, do their own special thing when assigned to or
from. Just as
$n = @x;
is special, so is
@x = $n;
It can't be generalized.
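The two special cases, plus the "(1) x @x" idiom quoted earlier, can be put side by side in a small sketch. The variable names and values are arbitrary.

```perl
use strict;
use warnings;

my @x = (10, 20, 30);

my $n    = @x;          # array assigned to a scalar: the element count
my @y    = $n;          # scalar assigned to an array: same as @y = ($n)
my @ones = (1) x @x;    # list repetition: one 1 for each element of @x

print "n=$n y=@y ones=@ones\n";
```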
John Porter
------------------------------
Date: Thu, 04 Jun 1998 16:04:23 +0200
From: Thomas Kratz <Thomas.Kratz@lrp.de.nospam>
To: GEMINI <dennis@info4.csie.nctu.edu.tw>
Subject: Re: anybody installed perl in VMS?
Message-Id: <3576A967.6367@lrp.de.nospam>
GEMINI wrote:
>
> hi all,
> I'd like to install perl under VMS. however,
> I am not familiar with VMS, so I didn't make it.
> I followed the readme.vms in perl source,
> got MMK(for make), and run MMK/descrip=[.vms]DESCRIP.MMS
> but... the following messages appear. so what should I do with it?
> or anybody has binary code that I can install directly? (the machine
> is DEC alpha)
> thanks in advance.
>
> %DCL-W-IVVERB, unrecognized command verb - check validity and spelling
> \\
> %DCL-W-UNDSYM, undefined symbol - check validity and spelling
> \\
> %DCL-W-IVVERB, unrecognized command verb - check validity and spelling
> \\
> %DCL-W-NULFIL, missing or invalid file specification - respecify
> %DCL-W-IVVERB, unrecognized command verb - check validity and spelling
> \\
> %DCL-W-IVVERB, unrecognized command verb - check validity and spelling
> \\
> %DCL-W-UNDFIL, file has not been opened by DCL - check logical name
> %DCL-W-UNDSYM, undefined symbol - check validity and spelling
> \LINE\
> %DCL-E-INVIFNEST, invalid IF-THEN-ELSE nesting structure or data
> %inconsistency
> %MMK-F-ERRUPD, error status %X1003894A occurred when updating target
> %CONFIG.H
Did you detar the sources on the VMS machine?
If you didn't, your files are all messed up.
There is a VMSTAR command available from DIGITAL.
After detarring on VMS there should be no major problems. The only one I
got was that DECC cancelled the compilation on a few unimportant
warnings. That can be changed by setting the /NOWARNING flag in
DESCRIP.MMS.
hope this helps
Thomas
--
Thomas Kratz Landesbank Rheinland-Pfalz, Germany
Thomas.Kratz@lrp.de (remove 'nospam' from address for reply)
------------------------------
Date: Thu, 04 Jun 1998 15:14:35 GMT
From: Tom Phoenix <rootbeer@teleport.com>
Subject: Re: Member Registration/Login CGI Script Help
Message-Id: <Pine.GSO.3.96.980604081235.13600b-100000@user2.teleport.com>
On Wed, 3 Jun 1998, Steven Falk wrote:
> I am looking for a CGI script that for registering new users and logging in
> registered users to a secure site.
If you're looking for software (as opposed to wanting to write it) you
should probably use your favorite search engine to find software archives.
There are plenty of those around. Hope this helps!
--
Tom Phoenix Perl Training and Hacking Esperanto
Randal Schwartz Case: http://www.rahul.net/jeffrey/ovs/
------------------------------
Date: 4 Jun 1998 15:15:54 GMT
From: mike@stok.co.uk (Mike Stok)
Subject: Re: order of execution of print statements?
Message-Id: <6l6dna$11q@news-central.tiac.net>
You're being caught by stdio buffering. One way to get around this is to
put
$| = 1;
at the top of the script which makes the currently selected file handle be
flushed after each print.
This is covered in the perlvar man pages.
If stdout is a terminal then it'll usually get flushed each time a \n is
printed, so adding \n to the end of the "delayed" prints might help too.
warn and die usually output stuff to the standard error stream.
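A minimal sketch of the $| approach follows. Note that $| applies only to the currently selected handle, so unbuffering a different handle means selecting it first. STDERR appears below purely as an example of "some other handle".

```perl
use strict;
use warnings;

$| = 1;    # autoflush the currently selected handle (STDOUT by default)

# To unbuffer another handle, select it briefly, set $|, then
# select the previous handle again.
my $previous = select(STDERR);
$| = 1;
select($previous);

print "processing...";    # flushed right away, even without a newline
print "done\n";
```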
Hope this helps,
Mike
In article <3576A131.191ED2C4@buffalo.edu>,
Andrew S Gianni <agianni@acsu.buffalo.edu> wrote:
>This is something that I've had trouble with before, and not only in
>PERL (maybe in C too?) in the following code:
>
>print "opening blah_blah_blah...\n";
>open (sesame, "blah_blah_blah") or die "couldn't open blah_blah_blah\n";
>while (<sesame>){
>...
>
>If it can't open the file, it'll print out the die statement before the
>"opening blah_blah_blah..." statement. Is there anyway to force order of
>output? This also happens to me with combinations of just print
>statements like:
>
>print "processing...";
>while(<stuff>){
>...
>}
>print "done\n";
--
mike@stok.co.uk | The "`Stok' disclaimers" apply.
http://www.stok.co.uk/~mike/ | PGP fingerprint FE 56 4D 7D 42 1A 4A 9C
http://www.tiac.net/users/stok/ | 65 F3 3F 1D 27 22 B7 41
stok@colltech.com | Collective Technologies (work)
------------------------------
Date: Thu, 4 Jun 1998 14:45:35 GMT
From: David Cross-cmt <Dave.Cross@gb.swissbank.com>
Subject: Re: order of execution of print statements?
Message-Id: <eq0d8cp9qeo.fsf@gb.swissbank.com>
Andrew S Gianni <agianni@acsu.buffalo.edu> writes:
> print "opening blah_blah_blah...\n";
> open (sesame, "blah_blah_blah") or die "couldn't open blah_blah_blah\n";
> while (<sesame>){
> ...
>
> If it can't open the file, it'll print out the die statement before the
> "opening blah_blah_blah..." statement. Is there anyway to force order of
> output?
Sounds like an output buffering issue to me. Try setting $| to a
non-zero value and trying again.
> --
[seven line sig!]
Far too long. Polite posters stick to four lines.
hth,
Dave...
--
If I wasn't so busy writing status reports,
my status report might just become a progress report.
Dave.Cross@gb.swissbank.com
------------------------------
Date: 4 Jun 1998 15:11:50 GMT
From: mike@stok.co.uk (Mike Stok)
Subject: Re: pattern match
Message-Id: <6l6dfm$ses@news-central.tiac.net>
In article <35766717.82A66AA3@something.com>,
Iqbal gandham <iqbal@etc.prestel.co.uk> wrote:
>Hi
>
>I have a text file full of lines like below
>
>
>
>NW1*, NW8*, SE1, SE11, SE17, SE5, SW1, SW10, SW11, SW1A,
>SW1E, SW1H, SW1P, SW1V, SW1W, SW1X, SW1Y, SW3, SW4, SW5, SW6,
>SW7, SW8, W1*, W1A*, W1E*, W1H*, W1M*, W1N*, W1P*, W1R*,
>W1V*, W1X*, W1Y*, W2*, W8*, W9*;D5482.htm
>
>
>What I have is a web page where users enter in a string. I want to see
>which line contains a match, and then goto that file listed at the end.
>
>Firstly how do I match on either the first three or four characters,
>because as you can see some like SW8 only have 3 characters in them, so if
>someone enters SW83 2ER, I need that to match just as if someone entered
>SW1V 2ER. I can't match on just the first two characters because some SW*
>point to other files.
>
>Also what's the quickest way of doing such a match? I presume it's not to
>grep and then split on the semicolon; it is quicker to read the file
>into an array or something.
>
>I think what I need to do is to first match the first 3 characters, and
>then match the fourth.
>
>One last thing, if I have a wildcard in the file eg NW8*, how do I get
>it to match if someone enters in NW81 3WE, because it will see NW8* in
>the file, and not match.
Is it possible to preprocess the file so that you have perl style regular
expressions and a filename on each line of the file?
NW1*, NW8*, SE1, SE11, SE17, SE5, SW1, SW10, ..., W8*, W9*;D5482.htm
Could become
(?:NW1.*|NW8.*|SE1|SE11|SE17|SE5|SW1|SW10|W8.*|W9.*);D5482.htm
and then as you read the file you could say something like this (no error
checking, untested code...)
$dest = undef;
while (<FILE>) {
    chomp;                            # drop the newline so $file is clean
    ($pattern, $file) = split /;/;
    if ($input =~ /^$pattern/) {
        $dest = $file;
        last;
    }
}
if (defined $dest) {
    ...
}
If you put the common ones near the beginning of the file then you can
avoid too much "unproductive" work.
If you put some thought into the preprocessing of the file then you could
end up with much more efficient regexes than the one I used as an
illustration. The example I used should be easy to construct from your
file, though.
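The preprocessing step might look something like the sketch below. It assumes one record per line, codes separated by commas, a trailing * meaning "any suffix", and a single ";" before the file name. The function name line_to_regex is invented here.

```perl
use strict;
use warnings;

# Turn "NW1*, NW8*, SE1;D5482.htm" into "(?:NW1.*|NW8.*|SE1);D5482.htm".
sub line_to_regex {
    my ($line) = @_;
    chomp $line;
    my ($codes, $file) = split /;/, $line, 2;
    my @alternatives;
    for my $code (split /\s*,\s*/, $codes) {
        $code =~ s/^\s+|\s+$//g;            # trim stray whitespace
        my $wild = ($code =~ s/\*$//);      # true if a trailing * was removed
        push @alternatives, quotemeta($code) . ($wild ? '.*' : '');
    }
    return '(?:' . join('|', @alternatives) . ');' . $file;
}

print line_to_regex("NW1*, NW8*, SE1, SW10;D5482.htm"), "\n";
```

Records that wrap over several lines, as in the sample from the original question, would need to be joined before being passed in.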
If you were using something where the perl script was cached (e.g. Apache
+ mod_perl) then it would be worth reading the data file at startup time
and generating a routine which would return the appropriate destination
page from user input. This would be an extension of the technique near
the end of http://www.perl.com/CPAN-local//doc/FMTEYEWTK/regexps.html
The amount of effort you might want to put into optimising depends on you,
the traffic to your site and the machine it's on. My suggestion might do
the job, but at an unacceptable cost.
Another approach might be to find all the valid first chunks of a London
post code and make a DB file mapping them to the appropriate destination
file name. That would allow you to detect "illegal" post codes and would
just be a single DB lookup. The time saved is paid for in space and the
work done generating the DB file.
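The lookup-table idea can be sketched with an ordinary hash. Tying the hash to a DB file is left out here, and the table contents and the name find_page are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical table: first chunk of the post code => destination page.
my %page_for = (
    SW1V => 'D5482.htm',
    SW8  => 'D5482.htm',
    NW8  => 'D5482.htm',
);

# Look up the first chunk (everything before the first space).
# Returns undef for chunks that are not in the table.
sub find_page {
    my ($input) = @_;
    my ($chunk) = uc($input) =~ /^\s*(\S+)/;
    return defined $chunk ? $page_for{$chunk} : undef;
}

print find_page('sw1v 2er'), "\n";
```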
Hope this helps,
Mike
--
mike@stok.co.uk | The "`Stok' disclaimers" apply.
http://www.stok.co.uk/~mike/ | PGP fingerprint FE 56 4D 7D 42 1A 4A 9C
http://www.tiac.net/users/stok/ | 65 F3 3F 1D 27 22 B7 41
stok@colltech.com | Collective Technologies (work)
------------------------------
Date: Thu, 04 Jun 1998 14:43:14 GMT
From: John Porter <jdporter@min.net>
Subject: Re: pattern match
Message-Id: <3576B428.3A02@min.net>
Iqbal gandham wrote:
>
>
> NW1*, NW8*, SE1, SE11, SE17, SE5, SW1, SW10, SW11, SW1A,
> SW1E, SW1H, SW1P, SW1V, SW1W, SW1X, SW1Y, SW3, SW4, SW5, SW6,
> SW7, SW8, W1*, W1A*, W1E*, W1H*, W1M*, W1N*, W1P*, W1R*,
> W1V*, W1X*, W1Y*, W2*, W8*, W9*;D5482.htm
>
> Firstly how do I match on either the first three or four characters,
> because as you can see some like SW8 only have 3 characters in them, so if
> someone enters SW83 2ER, I need that to match just as if someone entered
> SW1V 2ER.
Umm, are you sure that "SW83 2ER" should be matched by "SW8"?
If so, then what's the difference between "SW8" and "SW8*"?
Thanks,
John Porter
(my email is down temporarily)
------------------------------
Date: Thu, 4 Jun 1998 09:59:07 -0400
From: Any more mini-dilemmas I should know about? <jefpin@bergen.org>
To: "Matt L." <matt@att.net>
Subject: Re: Please Help: How do you make a vote/polling script???
Message-Id: <Pine.SGI.3.95.980604095837.6411A-100000@vangogh.bergen.org>
>Thanks a lot for the tip! I'll check it out. I'm just getting started
>with PERL, you know, reading the books and writing scripts.
What books are you using? I suggest you look at "Learning Perl" and
"Programming Perl" by O'Reilly... written by the guys that know.
--
Ambiguity is the key to... it.
- Jeff Pinyan
-- Jeff Pinyan | users.bergen.org/%7Ejefpin | techmaster@bergen.org --
NYPM | ICQ# 10222129 | 10222129@pager.mirabilis.com | qw[jeff] on EFnet
&jp('"($``','','$)EDF8```','$*52J4```','$+E1G4```','#J``@','#2__`');sub
jp{for$w(@_){$_=unpack('B48',unpack('u',$w));$c=~tr/10/# /;print;}}
------------------------------
Date: Thu, 04 Jun 1998 11:34:49 -0300
From: Craig Morris <cdm2@formalsys.ca>
Subject: Problem with namespaces colliding?
Message-Id: <3576B089.329E8021@formalsys.ca>
I've been experimenting with packages and I have come across a problem
when I use a package and a library in the same script. It appears as
though the interpreter is looking for the variables from a given library
in the package namespace. The variables mentioned below ($CUI_NOPROMPT,
$CUI_PROMPT, etc.) originate from a library cui.pl, while
Navigator::ReadHTML is a completely separate package. I've searched the
FAQs but I didn't find anything that addressed this problem specifically.
I thought each package used a separate namespace so as not to conflict
with other packages. Any help would be greatly appreciated.
Craig
I'm running Perl5.003 for Solaris 2.5.1 for the PC.
The script:
--------------------------------------------------
#!/usr/local/bin/perl5 -w # -*-Perl-*-
use strict;
use diagnostics;
use Navigator::ReadHTML;
require "cui.pl";
my($success, $error, @test) =
ReadHTML("/home/users/cdm2/test/htmlcust/n2k_rpts/viewer.htm");
print @test;
The package:
-------------------------------------------------
package Navigator::ReadHTML;
require Exporter;
use strict;
use vars qw(@ISA
@EXPORT
@EXPORT_OK
);
@ISA = qw(Exporter);
@EXPORT = qw(ReadHTML);
@EXPORT_OK = qw(ReadHTML);
sub ReadHTML
{
my($file,
$error) = @_;
my($fh) = \*INPUT_HTML;
my(@contents);
if (open ($fh, $file))
{
@contents = <$fh>;
close $fh;
return (1, $error, @contents);
}
else
{
$error++;
return(0, $error);
}
}
1;
The output:
------------------------------------------------------------
Identifier "main::error" used only once: possible typo at pack-test.perl
line 40.
(W) Typographical errors often show up as unique identifiers. If you
had a good reason for having a unique identifier, then just mention it
again somehow to suppress the message.
Identifier "main::success" used only once: possible typo at
pack-test.perl line 40.
Identifier "Navigator::ReadHTML::CUI_PROMPT" used only once: possible
typo at /opt/cp_bin/cui.pl line 201.
Identifier "Navigator::ReadHTML::CUI_NOPROMPT" used only once: possible
typo at /opt/cp_bin/cui.pl line 202.
<html>
<head>
<title>Impact Analysis</title>
<meta http-equiv="Content-Type" content="text/html">
</head>
<frameset rows="20%,*">
<frame src="./index.htm" name="index">
<frame src="./splash.htm" name="main">
</frameset>
</html>
------------------------------
Date: Thu, 04 Jun 1998 15:27:21 GMT
From: Tom Phoenix <rootbeer@teleport.com>
Subject: Re: Problem with namespaces colliding?
Message-Id: <Pine.GSO.3.96.980604082427.13600d-100000@user2.teleport.com>
On Thu, 4 Jun 1998, Craig Morris wrote:
> I've been experimenting with packages and I have come across a problem
> when I use a package and a library in the same script.
Actually, you can't use Perl without using at least one package. :-)
> Identifier "main::error" used only once: possible typo at pack-test.perl
> line 40.
> (W) Typographical errors often show up as unique identifiers. If you
> had a good reason for having a unique identifier, then just mention it
> again somehow to suppress the message.
Although that message doesn't say so, 'use vars' is a good way to declare
variables so that they won't be "used only once".
> Identifier "main::success" used only once: possible typo at
> pack-test.perl line 40.
> Identifier "Navigator::ReadHTML::CUI_PROMPT" used only once: possible
> typo at /opt/cp_bin/cui.pl line 201.
> Identifier "Navigator::ReadHTML::CUI_NOPROMPT" used only once: possible
> typo at /opt/cp_bin/cui.pl line 202.
Those are the only diagnostic errors you mentioned; I don't see any
package- or library-related problems here. Maybe you simply need to use
'use vars'? Hope this helps!
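As a rough illustration of the 'use vars' suggestion, the sketch below declares two package globals so that they pass 'use strict' and are no longer flagged as used only once. The package name and the values are hypothetical, and whether this matches what cui.pl actually needs is an assumption.

```perl
package Navigator::Example;    # hypothetical package name
use strict;
use vars qw($CUI_PROMPT $CUI_NOPROMPT);    # declare the package globals

# The values are invented; only the declaration matters here.
$CUI_PROMPT   = 1;
$CUI_NOPROMPT = 0;

package main;
print "prompt flag: $Navigator::Example::CUI_PROMPT\n";
```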
--
Tom Phoenix Perl Training and Hacking Esperanto
Randal Schwartz Case: http://www.rahul.net/jeffrey/ovs/
------------------------------
Date: Thu, 04 Jun 1998 15:09:16 -0700
From: Dean Jenkins <jenkinsrd@cf.ac.uk>
Subject: Re: searching
Message-Id: <35771B0C.552E@cardiff.ac.uk>
Justin Archie wrote:
> Another thing i want to do is if the script detects a URL or
> email address that it automatically sets up a link with that http://,
> gopher://, plus the url in it.
#find and encode URLs
$line =~ s/(http:\/\/[^ \t\n]*)/<a href=\"$1\">$1<\/a>/ig;
$line =~ s/mailto:([^ \t\n]*)/<a href=\"mailto:$1\">$1<\/a>/ig;
$line =~ s/(ftp:\/\/[^ \t\n]*)/<a href=\"$1\">$1<\/a>/ig;
$line =~ s/(news:[^ \t\n]*)/<a href=\"$1\">$1<\/a>/ig;
etc.
--
Dean Jenkins - Llandough Hospital, Cardiff, Wales.
12-lead ECG library
http://homepages.enterprise.net/djenkins/ecghome.html
------------------------------
Date: 4 Jun 1998 14:57:58 GMT
From: klassa@aursgh.aur.alcatel.com (John Klassa)
Subject: Re: searching
Message-Id: <6l6clm$pgd$1@aurwww.aur.alcatel.com>
On Thu, 04 Jun 1998 15:09:16 -0700, Dean Jenkins <jenkinsrd@cf.ac.uk> wrote:
->Justin Archie wrote:
->
->> Another thing i want to do is if the script detects a URL or
->> email address that it automatically sets up a link with that http://,
->> gopher://, plus the url in it.
->
-> #find and encode URLs
-> $line =~ s/(http:\/\/[^ \t\n]*)/<a href=\"$1\">$1<\/a>/ig;
-> $line =~ s/mailto:([^ \t\n]*)/<a href=\"mailto:$1\">$1<\/a>/ig;
-> $line =~ s/(ftp:\/\/[^ \t\n]*)/<a href=\"$1\">$1<\/a>/ig;
-> $line =~ s/(news:[^ \t\n]*)/<a href=\"$1\">$1<\/a>/ig;
I think you're going to get more than you bargained for with
[^ \t\n]*...
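For instance, a URL followed by a comma or a closing parenthesis would drag that character into the link. One possible tightening, offered as a sketch rather than a complete URL grammar, is to refuse common trailing punctuation as the last character. The function name link_urls and the exact set of excluded characters are choices made here, not something from the posts above.

```perl
use strict;
use warnings;

# Link http:// URLs but leave trailing punctuation outside the link.
sub link_urls {
    my ($line) = @_;
    $line =~ s{(http://[^\s<>"]*[^\s<>".,;:!?)])}{<a href="$1">$1</a>}gi;
    return $line;
}

print link_urls("See http://www.perl.com/, then reply."), "\n";
```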
--
John Klassa / Alcatel Telecom / Raleigh, NC, USA <><
------------------------------
Date: Thu, 04 Jun 1998 15:21:37 GMT
From: Tom Phoenix <rootbeer@teleport.com>
Subject: Re: Sendmail script
Message-Id: <Pine.GSO.3.96.980604081912.13600c-100000@user2.teleport.com>
On Wed, 3 Jun 1998, Justin Archie wrote:
> Does anyone have a script that sendmails internally... Calling its own
> ports and timeout status...
There are some modules on CPAN which may help you to do what you want.
Good luck!
--
Tom Phoenix Perl Training and Hacking Esperanto
Randal Schwartz Case: http://www.rahul.net/jeffrey/ovs/
------------------------------
Date: Thu, 04 Jun 1998 14:21:59 GMT
From: sdinitto@kronos.com (Scott DiNitto)
Subject: splitting via LEADING whitespace
Message-Id: <3576ad7c.1207533150@news>
I want to be able to split an ASCII table that looks like this:
NAME            DATE      TIME    MESSAGE LEFT

John Doe        6-3-98    10:11   This is a message
Ronald Regan    6-3-98    12:03   My diaper's leaking again
Eric Cartman    6-4-98    06:09   YOU WILL RESPECT MY AUTHORIT-AY!
(trust me, it looks all even in regular ASCII)
Anyways, what I am trying to do is assign each field to an array. For
example. Let's assume $tableScalar holds the third line of the ascii
file. So far, I am doing:
@tableArray = split (/ +/, $tableScalar);
this gives me the result:
@tableArray[0] "John"
@tableArray[1] "Doe"
@tableArray[2] "6-3-98"
@tableArray[3] "10:11"
@tableArray[4] "This"
@tableArray[5] "is"
@tableArray[6] "a"
@tableArray[7] "message"
Now, as far as I know I did a split using leading whitespace as the
delimiter... right?
My problem is, however, I want the array to look like this:
@tableArray[0] "John Doe"
@tableArray[1] "6-3-98"
@tableArray[2] "10:11"
@tableArray[3] "This is a message"
You see? I want to split where it looks like there is a tab. However,
if I try splitting via /\t/, it doesn't work (obviously my ASCII file
doesn't really contain tab characters). I always thought splitting via
leading whitespace meant the split occurs wherever there is
whitespace that is more than one character long. This is true, it does
split via that, HOWEVER if there is any text separated by one character
of whitespace, it splits there too and I don't want that!!
Does anyone know the best way to perform a split and have the fields
assigned the way I want it?
Thanks
SD
------------------------------
Date: 04 Jun 1998 16:47:11 +0200
From: Tony Curtis <Tony.Curtis+usenet@vcpc.univie.ac.at>
To: sdinitto@kronos.com
Subject: Re: splitting via LEADING whitespace
Message-Id: <7x1zt5b4wg.fsf@beavis.vcpc.univie.ac.at>
Re: splitting via LEADING whitespace, Scott
<sdinitto@kronos.com> said:
Scott> @tableArray = split (/ +/, $tableScalar);
/ +/ is "one or more SPACEs"
Scott> @tableArray[0] "John Doe" @tableArray[1] "6-3-98"
Scott> @tableArray[2] "10:11" @tableArray[3] "This is a
Scott> message"
Scott> You see? I want to split where it looks like there is
Scott> a tab. However, if I try splitting via /\t/, it
Why not actually have TAB in the file?
Then split on /\t+/
Scott> via leading whitespace meant the split occurs where
Scott> ever there is whitespace that is more than one
Scott> character long. This is true, it does split via that
In that case you need to use a regexp which signifies >= 2
SPACE chars. See "perldoc perlre" for the {n,} syntax.
Scott> Does anyone know the best way to perform a split and
Scott> have the fields assigned the way I want it?
If the file is rigid on columns then you could use substr()
instead.
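A fixed-column version with substr() might look like the sketch below. The offsets and widths are assumptions and would have to be measured in the real file. The name fields_by_column is invented.

```perl
use strict;
use warnings;

# [offset, width] for each column; these numbers are assumed.
my @columns = ([0, 16], [16, 10], [26, 8], [34, 40]);

sub fields_by_column {
    my ($line) = @_;
    my @fields;
    for my $col (@columns) {
        my ($offset, $width) = @$col;
        # Guard against short lines: substr past the end would warn.
        my $field = $offset < length($line) ? substr($line, $offset, $width) : '';
        $field =~ s/\s+$//;    # drop the column padding
        push @fields, $field;
    }
    return @fields;
}

my $sample = sprintf '%-16s%-10s%-8s%s', 'John Doe', '6-3-98', '10:11', 'This is a message';
print join('|', fields_by_column($sample)), "\n";
```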
hth
tony
--
Tony Curtis, Systems Manager, VCPC, | Tel +43 1 310 93 96 - 12; Fax - 13
Liechtensteinstrasse 22, A-1090 Wien, AT | http://www.vcpc.univie.ac.at/
"You see? You see? Your stupid minds! Stupid! Stupid!" ~ Eros, Plan9 fOS.
------------------------------
Date: Thu, 4 Jun 1998 14:43:35 GMT
From: adelton@fi.muni.cz (Honza Pazdziora)
Subject: Re: splitting via LEADING whitespace
Message-Id: <adelton.896971415@nemesis>
sdinitto@kronos.com (Scott DiNitto) writes:
> I want to be able to split an ASCII table that looks like this:
>
> NAME            DATE      TIME    MESSAGE LEFT
>
> John Doe        6-3-98    10:11   This is a message
> Ronald Regan    6-3-98    12:03   My diaper's leaking again
> Eric Cartman    6-4-98    06:09   YOU WILL RESPECT MY AUTHORIT-AY!
[...]
> Does anyone know the best way to perform a split and have the fields
> assigned the way I want it?
@array = split /\s\s+/, $line;
or
@array = split /\s{2,}/, $line;
provided there are always at least two whitespace characters between
those columns, thus no
Malicky krtecek leze 6-3-98    10:11   This is a message
and there is never more than one whitespace character inside the fields,
thus no
Malicky  krtecek     6-3-98    10:11   This is a message
If the fields always start at the same positions, it is a good
candidate for
@array = unpack 'A20A10A3A45', $line;
or whatever. Check the perlfunc man page for pack to see the
specifications and why I chose 'A' and not 'a'.
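The difference between 'A' and 'a' shows up in a short sketch. The column widths below are assumptions chosen to fit the sample line, not the ones from the template above.

```perl
use strict;
use warnings;

my $line = sprintf '%-16s%-10s%-8s%s', 'John Doe', '6-3-98', '10:11', 'This is a message';

my @stripped = unpack 'A16 A10 A8 A40', $line;    # 'A' drops trailing spaces
my @padded   = unpack 'a16 a10 a8',     $line;    # 'a' keeps the padding

print join('|', @stripped), "\n";
print join('|', @padded),   "\n";
```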
Hope this helps,
--
------------------------------------------------------------------------
Honza Pazdziora | adelton@fi.muni.cz | http://www.fi.muni.cz/~adelton/
I can take or leave it if I please
------------------------------------------------------------------------
------------------------------
Date: 4 Jun 1998 14:58:47 GMT
From: ehood@geneva.acs.uci.edu (Earl Hood)
Subject: Re: splitting via LEADING whitespace
Message-Id: <6l6cn8$gh2@news.service.uci.edu>
[mail & posted]
In article <3576ad7c.1207533150@news>,
Scott DiNitto <sdinitto@kronos.com> wrote:
>NAME            DATE      TIME    MESSAGE LEFT
>
>John Doe        6-3-98    10:11   This is a message
>Ronald Regan    6-3-98    12:03   My diaper's leaking again
>Eric Cartman    6-4-98    06:09   YOU WILL RESPECT MY AUTHORIT-AY!
>@tableArray = split (/ +/, $tableScalar);
>@tableArray[0] "John"
>@tableArray[1] "Doe"
>@tableArray[2] "6-3-98"
>@tableArray[3] "10:11"
>@tableArray[4] "This"
>@tableArray[5] "is"
>@tableArray[6] "a"
>@tableArray[7] "message"
>
>Now, as far as I know I did a split using leading whitespace as the
>delimiter... right?
No. You got the term "leading whitespace" all wrong. What you
told perl to do was split on any sequence of one, or more, spaces.
>My problem is, however, I want the array to look like this:
>
>@tableArray[0] "John Doe"
>@tableArray[1] "6-3-98"
>@tableArray[2] "10:11"
>@tableArray[3] "This is a message"
>
>You see? I want to split where it looks like there is a tab. However,
>if I try splitting via /\t/, it doesn't work (obviously my ASCII file
>doesn't really contain tab characters). I always thought splitting via
>leading whitespace meant the split occurs wherever there is
The "leading whitespace" mentioned in the perl documentation refers
to whitespace at the beginning of the string. You can get perl to
ignore the leading whitespace in a split operation (if you want to
split on white space) as follows:
split(' ', $somestring)
>whitespace that is more than one character long. This is true, it does
>split via that, HOWEVER if there is any text separated by one character
>of whitespace, it splits there too and I don't want that!!
If you know your columns are always separated by 2, or more, whitespaces
and none of the data will contain 2, or more, whitespaces, just use
the following:
@tableArray = split (/ {2,}/, $tableScalar);
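The three split forms discussed in this thread behave differently on the same kind of input, which a small sketch makes visible. The sample strings are made up.

```perl
use strict;
use warnings;

my $indented = "  John Doe 6-3-98";
my @plain = split / +/, $indented;    # leading spaces yield an empty first field
my @awk   = split ' ',  $indented;    # leading whitespace is skipped

my $row     = "John Doe    6-3-98   10:11";
my @columns = split / {2,}/, $row;    # only runs of 2 or more spaces separate fields

print join('|', @plain),   "\n";
print join('|', @awk),     "\n";
print join('|', @columns), "\n";
```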
--ewh
--
Earl Hood | University of California: Irvine
ehood@medusa.acs.uci.edu | Electronic Loiterer
http://www.oac.uci.edu/indiv/ehood/ | Dabbler of SGML/WWW/Perl/MIME
------------------------------
Date: Thu, 04 Jun 1998 15:58:11 +0100
From: "F.Quednau" <quednauf@nortel.co.uk>
Subject: Re: splitting via LEADING whitespace
Message-Id: <3576B602.3C8812D2@nortel.co.uk>
Scott DiNitto wrote:
> I want to be able to split an ASCII table that looks like this:
>
> NAME            DATE      TIME    MESSAGE LEFT
>
> John Doe        6-3-98    10:11   This is a message
> Ronald Regan    6-3-98    12:03   My diaper's leaking again
> Eric Cartman    6-4-98    06:09   YOU WILL RESPECT MY AUTHORIT-AY!
>
> (trust me, it looks all even in regular ASCII)
>
> Anyways, what I am trying to do is assign each field to an array. For
> example. Let's assume $tableScalar holds the third line of the ascii
> file. So far, I am doing:
>
> @tableArray = split (/ +/, $tableScalar);
>
what about
@tableArray = split (/ {2,}/, $tableScalar);
That should split where there are at least 2 whitespaces.
But I suppose you are in trouble again when the name is so long that
there is only one whitespace between the name and your date. Oh well...
--
____________________________________________________________
Frank Quednau
http://www.surrey.ac.uk/~me51fq
________________________________________________
------------------------------
Date: Thu, 04 Jun 1998 09:12:00 -0400
From: Ala Qumsieh <aqumsieh@matrox.com>
Subject: Re: Taking data out of strings
Message-Id: <35769D20.C2ADB3AA@matrox.com>
Marek Jedlinski wrote:
> [posted and emailed]
>
> Ray Rarey <rayr@accessus.net> wrote:
>
> >For our school, we have a website that lists all the people that made
> >the honor roll (about 1,600, so I want to make the pages with perl) and
> >each line of data in the files is written like this:
> >LAST NAME, FIRST NAME GRAD. YEAR GPA
> >I want to get perl to read the files, take out each line one at a time,
> >and print the person's names in a table cell, their grad year in a
> >cell, and their gpa in a cell.
>
> You didn't say _precisely_ how your data is formatted in the original file.
I think he did:
LAST NAME, FIRST NAME GRAD. YEAR GPA
> How are the fields in each record delimited - with spaces, tabs, otherwise?
> At any rate, you probably need `split'. Assuming the simplest possible
> case, data items separated by single spaces:
>
> # read a line from the file into $line, and then do:
> @recs = split( ' ', $line );
>
But, it appears that he has a "," followed by a " " separating the first and
last names. So, the last name will contain a comma.
> You'll then have:
> LAST NAME in $recs[0]
> FIRST NAME in $recs[1]
> GRAD YEAR in $recs[2]
> GPA in $recs[3]
>
> which you can then print out in any order, etc.
>
> Of course if there is, for instance, variable number of spaces between the
> items, you'll need to collapse them into single spaces first, etc.
>
> .marek
Why not just split on "one or more spaces" from the beginning?
@recs = split( /\s+/, $line );
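Since names can contain spaces and the last name ends in a comma, a pattern match may be easier to get right than split. The sketch below assumes a four-digit graduation year and a numeric GPA at the end of the line. Both are guesses about the data, and the sample line and the name parse_honor_line are invented.

```perl
use strict;
use warnings;

# Assumed layout: "LAST, FIRST  YEAR  GPA".
# Returns an empty list when the line does not fit that layout.
sub parse_honor_line {
    my ($line) = @_;
    my ($last, $first, $year, $gpa) =
        $line =~ /^\s*([^,]+?)\s*,\s*(.+?)\s+(\d{4})\s+(\d+(?:\.\d+)?)\s*$/
        or return;
    return ($last, $first, $year, $gpa);
}

print join('|', parse_honor_line("Van Dyke, Mary Ann  1999  3.85")), "\n";
```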
--
Ala Qumsieh | No .. not just another
ASIC Design Engineer | Perl Hacker!!!!!
Matrox Graphics Inc. |
Montreal, Quebec | (Not yet!)
------------------------------
Date: Thu, 04 Jun 1998 14:21:26 GMT
From: John Porter <jdporter@min.net>
Subject: Re: Use of HTML, POD, etc in Usenet (was: Re: map in void context regarded as evil - suggestion)
Message-Id: <3576AF0B.F68@min.net>
Zenin wrote:
>
> Chris Nandor <pudge@pobox.com> wrote:
> : Huh? I have never seen a POD reader built-in to a newsreader. Hence, POD
> : in a post will look the same to everyone, necessarily.
>
> If you're using a format other than text/plain, by RFC 1036 it must
> be declared as such. Therefore, the reader would try to run whatever
> it had mapped to handle type text/x-pod, if it had such a tool
> available.
Problem is, many (most?) people's newsreaders can't render anything
other than plain text. Declaring the content type to be anything else
might just hide it altogether. Maybe that's good, for e.g.
application/x-postscript... At least pod (and even html) is somewhat
human-readable. May as well give the user a chance to see it, even if
actually reading it is beyond her patience.
John Porter
------------------------------
Date: 4 Jun 1998 14:48:49 GMT
From: ehood@geneva.acs.uci.edu (Earl Hood)
Subject: Re: Use of HTML, POD, etc in Usenet (was: Re: map in void context regarded as evil - suggestion)
Message-Id: <6l6c4h$gag@news.service.uci.edu>
In article <3576AF0B.F68@min.net>, John Porter <jdporter@min.net> wrote:
>Zenin wrote:
>> If you're using a format other than text/plain, by RFC 1036 it must
>> be declared as such. Therefore, the reader would try to run whatever
>> it had mapped to handle type text/x-pod, if it had such a tool
>> available.
>
>Problem is, many (most?) people's newsreaders can't render anything
>other than plain text. Declaring the content type to be anything else
>might just hide it altogether.
No it won't. If a newsreader does not understand MIME, it will
display the post as a regular message. If it does understand MIME,
the RFCs state that any text/* type not understood, should be
treated as text/plain. If it does not, and does something bad like
not even display the data, then throw out your newsreader.
--ewh
--
Earl Hood | University of California: Irvine
ehood@medusa.acs.uci.edu | Electronic Loiterer
http://www.oac.uci.edu/indiv/ehood/ | Dabbler of SGML/WWW/Perl/MIME
------------------------------
Date: 8 Mar 97 21:33:47 GMT (Last modified)
From: Perl-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 8 Mar 97)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.misc (and this Digest), send your
article to perl-users@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
The Meta-FAQ, an article containing information about the FAQ, is
available by requesting "send perl-users meta-faq". The real FAQ, as it
appeared last in the newsgroup, can be retrieved with the request "send
perl-users FAQ". Due to their sizes, neither the Meta-FAQ nor the FAQ
are included in the digest.
The "mini-FAQ", which is an updated version of the Meta-FAQ, is
available by requesting "send perl-users mini-faq". It appears twice
weekly in the group, but is not distributed in the digest.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V8 Issue 2806
**************************************