[18769] in Perl-Users-Digest
Perl-Users Digest, Issue: 937 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat May 19 09:10:25 2001
Date: Sat, 19 May 2001 06:10:10 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <990277810-v10-i937@ruby.oce.orst.edu>
Content-Type: text
Perl-Users Digest Sat, 19 May 2001 Volume: 10 Number: 937
Today's topics:
Re: Searching for Postal Code <boqichi0@earthlink.net>
Re: Searching for Postal Code <a.v.a@home.nl>
Re: Simple Search problem <a.v.a@home.nl>
Re: Simple Search problem (Gwyn Judd)
Re: Simple Search problem <reevehotNOSPAM@hotmail.com>
Re: Simple Search problem (Gwyn Judd)
Re: Simple Search problem <boqichi0@earthlink.net>
Re: Simple Search problem <flavell@mail.cern.ch>
Re: splitting strings <boqichi0@earthlink.net>
Re: Stubborn regex won't work <boqichi0@earthlink.net>
Re: Stubborn regex won't work <keesh@users.pleaseremovethisbit.sourceforge.net>
Re: Stubborn regex won't work (Kai Henningsen)
Re: word doc to txt <boqichi0@earthlink.net>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Sat, 19 May 2001 10:54:51 GMT
From: Franco Luissi <boqichi0@earthlink.net>
Subject: Re: Searching for Postal Code
Message-Id: <3B067CCF.C58EA1E7@earthlink.net>
What does your version of a postal code look like, for one?
I don't know Canadian postal codes offhand. It looks like you want
"letter", "digit", "letter", "space", "digit", "letter", "digit",
but you are trying to substitute without providing
something to substitute with. Were you just trying to match? Maybe you
wanted to use:
m/[a-z][0-9][a-z]\s[0-9][a-z][0-9]/i
Anyway, if you still have problems, try to explain in more detail what you
want to match.
Jean Cooper wrote:
> Hi,
>
> I'm trying to search through some text and find the Postal Code
> (Canadian). The form of the PC is M4M 2B2. So I wrote a little Perl
> script, which isn't working too well. If anyone can let me know where
> I've gone wrong, I'd appreciate it.
>
> open (PC, "PCFile.txt") or die "Can't find file\n";
>
> if ($ARGV[0] =~ s/[a-z][0-9][a-z]\s[0-9][a-z][0-9]/i {
>
> print "Found something that resembles a Postal Code\n"
>
> }
>
> Jean
------------------------------
Date: Sat, 19 May 2001 11:43:17 GMT
From: AvA <a.v.a@home.nl>
Subject: Re: Searching for Postal Code
Message-Id: <3B063470.B5A71524@home.nl>
Jean Cooper wrote:
> Hi,
>
> I'm trying to search through some text and find the Postal Code
> (Canadian). The form of the PC is M4M 2B2. So I wrote a little Perl
> script, which isn't working too well. If anyone can let me know where
> I've gone wrong, I'd appreciate it.
>
> open (PC, "PCFile.txt") or die "Can't find file\n";
>
> if ($ARGV[0] =~ s/[a-z][0-9][a-z]\s[0-9][a-z][0-9]/i {
>
lose the "s" and it will work fine.
the "s" is for substitution; what you need is the "m", but that can be
omitted.
so:
if ($ARGV[0] =~ m/[a-z][0-9][a-z]\s[0-9][a-z][0-9]/i) {
or:
if ($ARGV[0] =~ /[a-z][0-9][a-z]\s[0-9][a-z][0-9]/i) {
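A minimal sketch of the match-only version, for illustration. The sample strings and the sub name looks_like_postal_code are made up, and the pattern only checks the letter-digit-letter, space, digit-letter-digit shape; it does not check that the code is one that actually exists.

```perl
use strict;
use warnings;

# Returns 1 if the text contains something shaped like "M4M 2B2", else 0.
# Shape check only: it does not validate against real postal codes.
sub looks_like_postal_code {
    my ($text) = @_;
    return $text =~ /[a-z][0-9][a-z]\s[0-9][a-z][0-9]/i ? 1 : 0;
}

# Hypothetical sample inputs, not from the original post.
for my $sample ('Ship to M4M 2B2 please', 'no code here') {
    printf "%-25s %s\n", $sample,
        looks_like_postal_code($sample) ? 'match' : 'no match';
}
```

Note that the original script opens PCFile.txt but tests $ARGV[0]; to search the file itself, the same test would go inside a loop over the lines read from the PC filehandle.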
------------------------------
Date: Sat, 19 May 2001 10:32:37 GMT
From: AvA <a.v.a@home.nl>
Subject: Re: Simple Search problem
Message-Id: <3B0623E0.8179798B@home.nl>
James R wrote:
> I'm applying Matt Wright's Simple Search perl script to an HTML-based CD Rom
> which I'm constructing (where a virtual web server is running via software
> called "Microweb").
>
> See http://worldwidemart.com/scripts/search.shtml and
> http://www.indigostar.com/microweb.htm for more information. The whole
> script is very small and found at that first site.
>
> The search works fine on defined files - eg. @files = ('xyz.htm','abc.htm').
> However, not when I try to use wild-cards (which the help files indicates
> should work).
>
> Here is some of the suspect code...
>
> $basedir = '/'; # I've brought everything back to the root for testing simplicity
> @files = ('*.htm');
try it with double quotes : "*htm"
------------------------------
Date: Sat, 19 May 2001 11:16:27 GMT
From: tjla@guvfybir.qlaqaf.bet (Gwyn Judd)
Subject: Re: Simple Search problem
Message-Id: <slrn9gcpdf.1nc.tjla@thislove.dyndns.org>
"Mein Lufkissenfahrzeug ist voller Aale"
said AvA (a.v.a@home.nl) in
<3B0623E0.8179798B@home.nl>:
>James R wrote:
>> $basedir = '/'; # I've brought everything back to the root for testing simplicity
>> @files = ('*.htm');
>
>try it with double quotes : "*htm"
And just what do you think that will do?
--
Gwyn Judd (print `echo 'tjla@guvfybir.qlaqaf.bet' | rot13`)
If a guru falls in the forest with no one to hear him, was he really a
guru at all?
-- Strange de Jim, "The Metasexuals"
------------------------------
Date: Sat, 19 May 2001 21:20:29 +1000
From: "James R" <reevehotNOSPAM@hotmail.com>
Subject: Re: Simple Search problem
Message-Id: <bFsN6.836$Ld4.37769@ozemail.com.au>
Thanks, but no luck. That did not fix the problem.
"AvA" <a.v.a@home.nl> wrote in message news:3B0623E0.8179798B@home.nl...
> James R wrote:
>
> > I'm applying Matt Wright's Simple Search perl script to an HTML-based CD Rom
> > which I'm constructing (where a virtual web server is running via software
> > called "Microweb").
> >
> > See http://worldwidemart.com/scripts/search.shtml and
> > http://www.indigostar.com/microweb.htm for more information. The whole
> > script is very small and found at that first site.
> >
> > The search works fine on defined files - eg. @files = ('xyz.htm','abc.htm').
> > However, not when I try to use wild-cards (which the help files indicates
> > should work).
> >
> > Here is some of the suspect code...
> >
> > $basedir = '/'; # I've brought everything back to the root for testing simplicity
> > @files = ('*.htm');
>
> try it with double quotes : "*htm"
------------------------------
Date: Sat, 19 May 2001 11:19:58 GMT
From: tjla@guvfybir.qlaqaf.bet (Gwyn Judd)
Subject: Re: Simple Search problem
Message-Id: <slrn9gcpk3.1nc.tjla@thislove.dyndns.org>
"Mein Lufkissenfahrzeug ist voller Aale"
said James R (reevehotNOSPAM@hotmail.com) in
<evnN6.712$Ld4.32283@ozemail.com.au>:
>I'm applying Matt Wright's Simple Search perl script to an HTML-based CD Rom
>which I'm constructing (where a virtual web server is running via software
>called "Microweb").
Heh. There's your problem right there. Matt Wright is a horrible Perl
programmer. Better to start again with something else.
>The search works fine on defined files - eg. @files = ('xyz.htm','abc.htm').
>However, not when I try to use wild-cards (which the help files indicates
>should work).
>
>Here is some of the suspect code...
>
>$basedir = '/'; # I've brought everything back to the root for testing simplicity
>@files = ('*.htm');
I think you might mean:
@files = <*.htm>;
That is the globbing operator and will return a list of files ending in
".htm".
> foreach $file (@files) {
> $ls = "ls $file"; # the original script had single inverted commas
>here (') but it was causing an error
I suspect the original script may have had backticks ``. They run the
program inside the quotes and return the output:
$ls = `ls $file`;
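A rough sketch of the glob suggestion above, assuming a scratch directory with made-up file names (the names and the temporary directory are only for illustration and are not part of the Simple Search script). It uses glob(), the function form of the <*.htm> operator, and needs no external ls.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Scratch directory with hypothetical files, removed when the program exits.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(abc.htm xyz.htm notes.txt)) {
    open my $fh, '>', "$dir/$name" or die "Can't create $name: $!";
    close $fh;
}

# A quoted '*.htm' is just a five-character string; nothing is expanded.
my @literal = ('*.htm');

# glob() expands the pattern against the current directory.
my $old_dir = getcwd();
chdir $dir or die "Can't chdir to $dir: $!";
my @files = sort glob('*.htm');
chdir $old_dir or die "Can't chdir back to $old_dir: $!";

print "literal: @literal\n";    # prints literal: *.htm
print "globbed: @files\n";      # prints globbed: abc.htm xyz.htm
```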
You're on your own from there. I need to shower now, having touched such
horrible nasty code. *shudder*
--
Gwyn Judd (print `echo 'tjla@guvfybir.qlaqaf.bet' | rot13`)
Hear about...
the fellow who maintains a special register of particularly
accommodating girls? He refers to it as his little blew book.
------------------------------
Date: Sat, 19 May 2001 11:27:36 GMT
From: Franco Luissi <boqichi0@earthlink.net>
Subject: Re: Simple Search problem
Message-Id: <3B06847C.70D580A9@earthlink.net>
I would not use any Matt Wright script, certainly not on a CD-ROM for other
people to use. That's asking for trouble. Still, looking at the code presented
as objectively as I can: why are you running "ls"? It seems an unlikely choice
for a search; grep, maybe, or readdir. I guess you just didn't copy that part.
So you run ls to get all the info on the file? You say "I get
no responses even when I search for a word I know to be in the html
files," but I don't see where you are searching *in* the files. It looks more
like an attempt at searching through the names of the files.
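For what it's worth, a minimal sketch of searching *in* the files with readdir, not taken from the Simple Search script. The sub name files_containing, the scratch directory, and the sample pages are all made up for the example, and the word is matched literally and case-insensitively.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Return the names of the .htm files in $dir whose contents contain $word.
# Hypothetical helper: literal, case-insensitive match, one directory level.
sub files_containing {
    my ($dir, $word) = @_;
    opendir my $dh, $dir or die "Can't open directory $dir: $!";
    my @htm = sort grep { /\.htm$/i && -f "$dir/$_" } readdir $dh;
    closedir $dh;

    my @hits;
    for my $name (@htm) {
        open my $fh, '<', "$dir/$name" or die "Can't read $name: $!";
        my $found = grep { /\Q$word\E/i } <$fh>;    # count of matching lines
        close $fh;
        push @hits, $name if $found;
    }
    return @hits;
}

# Made-up test data in a scratch directory, removed on exit.
my $dir = tempdir(CLEANUP => 1);
my %pages = ('abc.htm' => "<p>hovercraft</p>\n", 'xyz.htm' => "<p>eels</p>\n");
for my $name (keys %pages) {
    open my $fh, '>', "$dir/$name" or die "Can't create $name: $!";
    print {$fh} $pages{$name};
    close $fh;
}

print join(' ', files_containing($dir, 'eels')), "\n";    # prints xyz.htm
```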
James R wrote:
> I'm applying Matt Wright's Simple Search perl script to an HTML-based CD Rom
> which I'm constructing (where a virtual web server is running via software
> called "Microweb").
>
> See http://worldwidemart.com/scripts/search.shtml and
> http://www.indigostar.com/microweb.htm for more information. The whole
> script is very small and found at that first site.
>
> The search works fine on defined files - eg. @files = ('xyz.htm','abc.htm').
> However, not when I try to use wild-cards (which the help files indicates
> should work).
>
> Here is some of the suspect code...
>
> $basedir = '/'; # I've brought everything back to the root for testing simplicity
> @files = ('*.htm');
> ....
> foreach $file (@files) {
>     $ls = "ls $file"; # the original script had single inverted commas here (') but it was causing an error
>     @ls = split(/\s+/,$ls);
>     foreach $temp_file (@ls) {
>         if (-d $file) {
>             $filename = "$file$temp_file";
>             if (-T $filename) {
>                 push(@FILES,$filename);
>             }
>         }
>         elsif (-T $temp_file) {
>             push(@FILES,$temp_file);
>         }
>
> I get no responses even when I search for a word I know to be in the html
> files.
>
> Any help would be greatly appreciated as I'm not much of a Perl guru...
>
> James
------------------------------
Date: Sat, 19 May 2001 13:51:52 +0200
From: "Alan J. Flavell" <flavell@mail.cern.ch>
Subject: Re: Simple Search problem
Message-Id: <Pine.LNX.4.30.0105191351070.21301-100000@lxplus003.cern.ch>
On Sat, 19 May 2001, James R wrote:
> Thanks, but no luck. That did not fix the problem.
OK, that's it. I'm putting NOSPAM@hotmail into the killfile.
------------------------------
Date: Sat, 19 May 2001 12:36:17 GMT
From: Franco Luissi <boqichi0@earthlink.net>
Subject: Re: splitting strings
Message-Id: <3B069497.78648F98@earthlink.net>
$string = "abcdefg";
@array = split //, $string;
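For illustration, the same two lines as a complete script under strict and warnings, with the result printed so it can be checked:

```perl
use strict;
use warnings;

my $string = "abcdefg";

# An empty pattern makes split break the string between every character.
my @array = split //, $string;

print join(',', @array), "\n";    # prints a,b,c,d,e,f,g
print scalar(@array), "\n";       # prints 7
```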
Christian Seeberger wrote:
> Hi all !
>
> I want to split a string, so that each letter of it is an element in an
> array. My idea is something like:
>
> $string = "abcdefg";
> @array = split /magic reg exp/, $string;
>
> @array should look like (a,b,c,d,e,f,g) after the split. What is the
> 'magic reg exp' I need in the code above ?? Is it possible to do it this
> way at all ?? Up to now I use a construct with substr() and suchlike,
> but I just have the feeling, that there is a better, more elegant way of
> doing this.
>
> TIA
>
> Chris
------------------------------
Date: Sat, 19 May 2001 11:35:50 GMT
From: Franco Luissi <boqichi0@earthlink.net>
Subject: Re: Stubborn regex won't work
Message-Id: <3B06866A.D0E6871A@earthlink.net>
Why are you using \G and /g? In this case I think just
if ($t =~ /(\w+?)\.$/) {$a = $1}
would make more sense. Notice the ? to stop it from being greedy. But
you still need to mark what comes before the part you want, so that it
isn't included, say something like:
if ($t =~ /\](\w+?)\.$/) {$a = $1}
or whatever consistently precedes what you want to match. Note that \.$
only fits a name that ends in the period; drop the $ if an extension or
version number can follow. Better yet, replace the \w+? with something
more specific, like (ERI000001) if it always is that, or (\w{3}\d{6}) or
something, with the \. after that.
HTH a little
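A sketch of the bracket-anchored idea, run over the four strings from the question. It leaves off the trailing $, since \.$ can only match a name that ends in the period, and it assumes the directory part always ends in "]". The sub name base_name is made up for this example.

```perl
use strict;
use warnings;

# The four sample names from the original question.
my @names = (
    'DISK02:[USERS.EUSWILC]ERI000001.INO;1',
    'DISK02:[USERS.EUSWILC]ERI000001.;1',
    'DISK02:[USERS.EUSWILC]ERI000001.',
    'DISK02:[USERS.EUSWILC]ERI000001.INO',
);

# Capture the word between the closing bracket and the period that follows it.
# Returns undef when the name does not have that shape.
sub base_name {
    my ($name) = @_;
    return $name =~ /\](\w+)\./ ? $1 : undef;
}

for my $name (@names) {
    my $base = base_name($name);
    print defined $base ? "$base\n" : "no match for $name\n";
}
```

Each of the four names should print ERI000001.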
William Cardwell wrote:
> In the following, I'm trying to pull the "ERI000001" in every case, but
> I can't get it.
> Can anyone help?
>
> $x='DISK02:[USERS.EUSWILC]ERI000001.INO;1';
> $y='DISK02:[USERS.EUSWILC]ERI000001.;1';
> $z='DISK02:[USERS.EUSWILC]ERI000001.';
> $w='DISK02:[USERS.EUSWILC]ERI000001.INO';
>
> # Last word string terminated by a period
> for $t ($x,$y,$z,$w) {
> if ($t =~ /(\w+)\.\G/g) {$a = $1} # this for example doesn't work
> print "$a\n";
> }
>
> Thanks so much.
>
> Will Cardwell
------------------------------
Date: Sat, 19 May 2001 13:14:39 +0100
From: "Ciaran McCreesh" <keesh@users.pleaseremovethisbit.sourceforge.net>
Subject: Re: Stubborn regex won't work
Message-Id: <9e5nuv$m8i$1@news6.svr.pol.co.uk>
In article <slrn9gb2r7.87b.abigail@tsathoggua.rlyeh.net>, "Abigail"
<abigail@foad.org> wrote:
> ][ Surely foreach ?
>
> Why? Is there a difference?
Readability. I have to admit, I forgot that for could be used like that...
--
Ciaran McCreesh
mail: keesh@users.sourceforge.net
web: http://www.opensourcepan.com/
------------------------------
Date: 19 May 2001 13:08:00 +0200
From: kaih=81842hTHw-B@khms.westfalen.de (Kai Henningsen)
Subject: Re: Stubborn regex won't work
Message-Id: <81842hTHw-B@khms.westfalen.de>
abigail@foad.org (Abigail) wrote on 18.05.01 in <slrn9gb2r7.87b.abigail@tsathoggua.rlyeh.net>:
> Ciaran McCreesh (keesh@users.pleaseremovethisbit.sourceforge.net) wrote
> on MMDCCCXVII September MCMXCIII in <URL:news:9e3me7$52l$1@news6.svr.pol.co.uk>:
> ][ In article <3B055618.1AB65E3C@am1.ericsson.se>, "William Cardwell"
> ][ <EUSWMCL@am1.ericsson.se> wrote:
> ][ > # Last word string terminated by a period
> ][ > for $t ($x,$y,$z,$w) {
> ][
> ][ Surely foreach ?
>
> Why? Is there a difference?
Of course. There's a pretty fundamental difference.
One is four whole letters longer.
> Abigail
You, of all people, should have known that!
Kai
--
http://www.westfalen.de/private/khms/
"... by God I *KNOW* what this network is for, and you can't have it."
- Russ Allbery (rra@stanford.edu)
------------------------------
Date: Sat, 19 May 2001 12:38:22 GMT
From: Franco Luissi <boqichi0@earthlink.net>
Subject: Re: word doc to txt
Message-Id: <3B069515.65F3696@earthlink.net>
CPAN.
MS Word isn't ASCII text. Probably Win32::OLE or some other Win32 module;
search around, there is something.
sven wrote:
> Hi,
>
> I want to extract all ascii strings in a microsoft word document. I am
> not interested in layout or anything, just in the text.
>
> I did something like:
>
> while ($line = <FileHandle>) {
> $line =~ s/[^A-Za-z0-9]+/ /g;
> ...
> }
>
> but this tends to produce strings, that are not visible in the document
> at all.
>
> Any suggestions ?
>
> thanks in advance, Sven
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 937
**************************************