[18769] in Perl-Users-Digest


Perl-Users Digest, Issue: 937 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat May 19 09:10:25 2001

Date: Sat, 19 May 2001 06:10:10 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <990277810-v10-i937@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Sat, 19 May 2001     Volume: 10 Number: 937

Today's topics:
    Re: Searching for Postal Code <boqichi0@earthlink.net>
    Re: Searching for Postal Code <a.v.a@home.nl>
    Re: Simple Search problem <a.v.a@home.nl>
    Re: Simple Search problem (Gwyn Judd)
    Re: Simple Search problem <reevehotNOSPAM@hotmail.com>
    Re: Simple Search problem (Gwyn Judd)
    Re: Simple Search problem <boqichi0@earthlink.net>
    Re: Simple Search problem <flavell@mail.cern.ch>
    Re: splitting strings <boqichi0@earthlink.net>
    Re: Stubborn regex won't work <boqichi0@earthlink.net>
    Re: Stubborn regex won't work <keesh@users.pleaseremovethisbit.sourceforge.net>
    Re: Stubborn regex won't work (Kai Henningsen)
    Re: word doc to txt <boqichi0@earthlink.net>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Sat, 19 May 2001 10:54:51 GMT
From: Franco Luissi <boqichi0@earthlink.net>
Subject: Re: Searching for Postal Code
Message-Id: <3B067CCF.C58EA1E7@earthlink.net>

What does your version of the postal code look like, for one?
I don't know Canadian postal codes offhand. It looks like you want
"letter", "digit 0-9", "letter", "space", "digit 0-9", "letter",
"digit 0-9", but you are trying to substitute without providing
anything to substitute with. Were you just trying to match? Maybe you
wanted to use:
m/[a-z][0-9][a-z]\s[0-9][a-z][0-9]/i
Anyway, if you still have problems, try to explain in more detail what
you want to match.


Jean Cooper wrote:

> Hi,
>
> I'm trying to search through some text and find the Postal Code
> (Canadian).  The form of the PC is M4M 2B2.  So I wrote a little Perl
> script, which isn't working too well.  If anyone can let me know where
> I've gone wrong, I'd appreciate it.
>
> open (PC, "PCFile.txt") or die "Can't find file\n";
>
> if ($ARGV[0] =~ s/[a-z][0-9][a-z]\s[0-9][a-z][0-9]/i   {
>
>         print "Found something that resembles a Postal Code\n"
>
> }
>
> Jean



------------------------------

Date: Sat, 19 May 2001 11:43:17 GMT
From: AvA <a.v.a@home.nl>
Subject: Re: Searching for Postal Code
Message-Id: <3B063470.B5A71524@home.nl>

Jean Cooper wrote:

> Hi,
>
> I'm trying to search through some text and find the Postal Code
> (Canadian).  The form of the PC is M4M 2B2.  So I wrote a little Perl
> script, which isn't working too well.  If anyone can let me know where
> I've gone wrong, I'd appreciate it.
>
> open (PC, "PCFile.txt") or die "Can't find file\n";
>
> if ($ARGV[0] =~ s/[a-z][0-9][a-z]\s[0-9][a-z][0-9]/i   {
>

Lose the "s" and it will work fine.
The "s" is for substitution; what you need is the "m" (match), but that
can be omitted when the delimiters are slashes. You also need the closing
parenthesis after the pattern.
So:
if ($ARGV[0] =~ m/[a-z][0-9][a-z]\s[0-9][a-z][0-9]/i) {
or:
if ($ARGV[0] =~ /[a-z][0-9][a-z]\s[0-9][a-z][0-9]/i) {
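
A minimal sketch of the match-only approach, in case it helps. It assumes the aim is to pull postal-code-shaped strings out of a piece of text; the sub name find_postal_codes and the sample address are made up for illustration and are not from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: return every substring of $text that has the
# letter-digit-letter, whitespace, digit-letter-digit shape.
sub find_postal_codes {
    my ($text) = @_;
    # In list context, /g returns all captured matches at once.
    return $text =~ /\b([a-z][0-9][a-z]\s[0-9][a-z][0-9])\b/gi;
}

my @found = find_postal_codes("Ship to 12 Main St, Toronto M4M 2B2, Canada");
print "Found something that resembles a Postal Code: $_\n" for @found;
```

To scan a file instead, the same sub could be called on each line read from the filehandle. The pattern only checks the shape, not whether the code is actually assigned.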



------------------------------

Date: Sat, 19 May 2001 10:32:37 GMT
From: AvA <a.v.a@home.nl>
Subject: Re: Simple Search problem
Message-Id: <3B0623E0.8179798B@home.nl>

James R wrote:

> I'm applying Matt Wright's Simple Search perl script to an HTML-based CD Rom
> which I'm constructing (where a virtual web server is running via software
> called "Microweb").
>
> See http://worldwidemart.com/scripts/search.shtml and
> http://www.indigostar.com/microweb.htm for more information. The whole
> script is very small and found at that first site.
>
> The search works fine on defined files - eg. @files = ('xyz.htm','abc.htm').
> However, not when I try to use wild-cards (which the help files indicates
> should work).
>
> Here is some of the suspect code...
>
> $basedir = '/';  # I've brought everything back to the root for testing simplicity
> @files = ('*.htm');

try it with double quotes : "*htm"




------------------------------

Date: Sat, 19 May 2001 11:16:27 GMT
From: tjla@guvfybir.qlaqaf.bet (Gwyn Judd)
Subject: Re: Simple Search problem
Message-Id: <slrn9gcpdf.1nc.tjla@thislove.dyndns.org>

"Mein Lufkissenfahrzeug ist voller Aale"
said AvA (a.v.a@home.nl) in 
<3B0623E0.8179798B@home.nl>:
>James R wrote:
>> $basedir = '/';  # I've brought everything back to the root for testing simplicity
>> @files = ('*.htm');
>
>try it with double quotes : "*htm"

And just what do you think that will do?

-- 
Gwyn Judd (print `echo 'tjla@guvfybir.qlaqaf.bet' | rot13`)
If a guru falls in the forest with no one to hear him, was he really a
guru at all?
		-- Strange de Jim, "The Metasexuals"


------------------------------

Date: Sat, 19 May 2001 21:20:29 +1000
From: "James R" <reevehotNOSPAM@hotmail.com>
Subject: Re: Simple Search problem
Message-Id: <bFsN6.836$Ld4.37769@ozemail.com.au>

Thanks, but no luck. That did not fix the problem.

"AvA" <a.v.a@home.nl> wrote in message news:3B0623E0.8179798B@home.nl...
> James R wrote:
>
> > I'm applying Matt Wright's Simple Search perl script to an HTML-based CD Rom
> > which I'm constructing (where a virtual web server is running via software
> > called "Microweb").
> >
> > See http://worldwidemart.com/scripts/search.shtml and
> > http://www.indigostar.com/microweb.htm for more information. The whole
> > script is very small and found at that first site.
> >
> > The search works fine on defined files - eg. @files = ('xyz.htm','abc.htm').
> > However, not when I try to use wild-cards (which the help files indicates
> > should work).
> >
> > Here is some of the suspect code...
> >
> > $basedir = '/';  # I've brought everything back to the root for testing simplicity
> > @files = ('*.htm');
>
> try it with double quotes : "*htm"
>
>




------------------------------

Date: Sat, 19 May 2001 11:19:58 GMT
From: tjla@guvfybir.qlaqaf.bet (Gwyn Judd)
Subject: Re: Simple Search problem
Message-Id: <slrn9gcpk3.1nc.tjla@thislove.dyndns.org>

"Mein Lufkissenfahrzeug ist voller Aale"
said James R (reevehotNOSPAM@hotmail.com) in 
<evnN6.712$Ld4.32283@ozemail.com.au>:
>I'm applying Matt Wright's Simple Search perl script to an HTML-based CD Rom
>which I'm constructing (where a virtual web server is running via software
>called "Microweb").

Heh. There's your problem right there. Matt Wright is a horrible Perl
programmer. Better to start again with something else.

>The search works fine on defined files - eg. @files = ('xyz.htm','abc.htm').
>However, not when I try to use wild-cards (which the help files indicates
>should work).
>
>Here is some of the suspect code...
>
>$basedir = '/';  # I've brought everything back to the root for testing simplicity
>@files = ('*.htm');

I think you might mean:

@files = <*.htm>;

That is the globbing operator; it returns a list of the files in the
current directory whose names end in ".htm".
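
As a rough sketch of how globbing could replace the "ls" step entirely, assuming all that is wanted is the list of text files matching *.htm in one directory. The sub name htm_files and the scratch files below are invented for the demonstration and are not part of the Simple Search script.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical helper: list the text files in $dir whose names end in .htm.
sub htm_files {
    my ($dir) = @_;
    my $pattern = File::Spec->catfile($dir, '*.htm');
    # glob() is the function form of the <*.htm> operator.
    # Caveat: plain glob() splits its argument on whitespace, so this
    # sketch assumes $dir contains no spaces.
    return grep { -T $_ } glob($pattern);
}

# Demonstrate on a scratch directory so the sketch is self-contained.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(abc.htm xyz.htm notes.txt)) {
    my $path = File::Spec->catfile($dir, $name);
    open my $fh, '>', $path or die "Can't write $path: $!";
    print $fh "<html>hello</html>\n";
    close $fh;
}

print "$_\n" for htm_files($dir);   # lists abc.htm and xyz.htm, not notes.txt
```

Doing the expansion inside Perl also avoids depending on an external "ls" program, which may not exist on the machine the CD is run on.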

>   foreach $file (@files) {
>      $ls = "ls $file";   # the original script had single inverted commas here (') but it was causing an error

I suspect the original script may have had backticks ``. They run the
program inside the quotes and return the output:

$ls = `ls $file`;

You're on your own from there. I need to shower now, having touched such
horrible nasty code. *shudder*

-- 
Gwyn Judd (print `echo 'tjla@guvfybir.qlaqaf.bet' | rot13`)
Hear about...
	the fellow who maintains a special register of particularly
	accommodating girls?  He refers to it as his little blew book.


------------------------------

Date: Sat, 19 May 2001 11:27:36 GMT
From: Franco Luissi <boqichi0@earthlink.net>
Subject: Re: Simple Search problem
Message-Id: <3B06847C.70D580A9@earthlink.net>

I would not use any Matt Wright script, and certainly not on a CD-ROM for
other people to use. That's asking for trouble. Still, looking at the code
as presented: why are you running "ls"? That seems unlikely for a search;
grep or readdir would be more natural. I guess you just didn't copy that
part, and the "ls" is only there to get the information on each file. You
say "I get no responses even when I search for a word I know to be in the
html files," but I don't see where you are searching *in* the files. What
you posted looks more like an attempt at searching through file names.


James R wrote:

> I'm applying Matt Wright's Simple Search perl script to an HTML-based CD Rom
> which I'm constructing (where a virtual web server is running via software
> called "Microweb").
>
> See http://worldwidemart.com/scripts/search.shtml and
> http://www.indigostar.com/microweb.htm for more information. The whole
> script is very small and found at that first site.
>
> The search works fine on defined files - eg. @files = ('xyz.htm','abc.htm').
> However, not when I try to use wild-cards (which the help files indicates
> should work).
>
> Here is some of the suspect code...
>
> $basedir = '/';  # I've brought everything back to the root for testing simplicity
> @files = ('*.htm');
> ....
>    foreach $file (@files) {
>       $ls = "ls $file";   # the original script had single inverted commas here (') but it was causing an error
>       @ls = split(/\s+/,$ls);
>       foreach $temp_file (@ls) {
>          if (-d $file) {
>             $filename = "$file$temp_file";
>             if (-T $filename) {
>                push(@FILES,$filename);
>             }
>          }
>          elsif (-T $temp_file) {
>             push(@FILES,$temp_file);
>          }
>
> I get no responses even when I search for a word I know to be in the html
> files.
>
> Any help would be greatly appreciated as I'm not much of a Perl guru...
>
> James



------------------------------

Date: Sat, 19 May 2001 13:51:52 +0200
From: "Alan J. Flavell" <flavell@mail.cern.ch>
Subject: Re: Simple Search problem
Message-Id: <Pine.LNX.4.30.0105191351070.21301-100000@lxplus003.cern.ch>

On Sat, 19 May 2001, James R wrote:

> Thanks, but no luck. That did not fix the problem.

OK, that's it.  I'm putting NOSPAM@hotmail into the killfile.




------------------------------

Date: Sat, 19 May 2001 12:36:17 GMT
From: Franco Luissi <boqichi0@earthlink.net>
Subject: Re: splitting strings
Message-Id: <3B069497.78648F98@earthlink.net>

$string = "abcdefg";
@array = split //, $string;
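
Spelled out as a complete script, for anyone who wants to try it. This is only a sketch of the same two lines with strict and warnings added and a print so the result is visible.

```perl
use strict;
use warnings;

my $string = "abcdefg";

# An empty pattern matches between every pair of characters,
# so split returns one element per character.
my @array = split //, $string;

print join(",", @array), "\n";   # prints a,b,c,d,e,f,g
```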

Christian Seeberger wrote:

> Hi all !
>
> I want to split a string, so that each letter of it is an element in an
> array. My idea is something like:
>
> $string = "abcdefg";
> @array = split /magic reg exp/, $string;
>
> @array should look like (a,b,c,d,e,f,g) after the split. What is the
> 'magic reg exp' I need in the code above ?? Is it possible to do it this
> way at all ?? Up to now I use a construct with substr() and suchlike,
> but I just have the feeling that there is a better, more elegant way of
> doing this.
>
> TIA
>
> Chris



------------------------------

Date: Sat, 19 May 2001 11:35:50 GMT
From: Franco Luissi <boqichi0@earthlink.net>
Subject: Re: Stubborn regex won't work
Message-Id: <3B06866A.D0E6871A@earthlink.net>

Why are you using \G and /g? In this case I think just
 if ($t =~ /(\w+?)\.$/) {$a = $1}
would make more sense. Notice the ? to stop it from being greedy. The $
only lets this match when the period is the last character (as in $z),
so drop the $ to cover the names that carry an extension or version.
You also still need to mark what comes before the part you want, so that
it is not included, say something like:

 if ($t =~ /\](\w+?)\./) {$a = $1}
or whatever consistently precedes what you want to match. Better yet,
replace the \w+? with something more specific: (ERI000001) if it always
is that, or (\w{3}\d{6}) or something similar, with the \. after it.

HTH a little
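
A small sketch that runs against all four strings from the question, assuming the wanted part always sits between the closing "]" and the first period after it. The sub name base_name is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the word that follows the bracketed
# directory and precedes the next period, or undef if there is none.
sub base_name {
    my ($spec) = @_;
    return $spec =~ /\](\w+)\./ ? $1 : undef;
}

for my $t ('DISK02:[USERS.EUSWILC]ERI000001.INO;1',
           'DISK02:[USERS.EUSWILC]ERI000001.;1',
           'DISK02:[USERS.EUSWILC]ERI000001.',
           'DISK02:[USERS.EUSWILC]ERI000001.INO') {
    my $name = base_name($t);
    print defined $name ? "$name\n" : "no match\n";   # prints ERI000001 each time
}
```

Since \w cannot match a period, greedy and non-greedy quantifiers give the same result here; anchoring on the "]" is what keeps the directory part out of the capture.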



William Cardwell wrote:

> In the following, I'm trying to pull the "ERI000001" in every case, but
> I can't get it.
> Can anyone help?
>
> $x='DISK02:[USERS.EUSWILC]ERI000001.INO;1';
> $y='DISK02:[USERS.EUSWILC]ERI000001.;1';
> $z='DISK02:[USERS.EUSWILC]ERI000001.';
> $w='DISK02:[USERS.EUSWILC]ERI000001.INO';
>
> # Last word string terminated by a period
> for $t ($x,$y,$z,$w) {
>   if ($t =~ /(\w+)\.\G/g) {$a = $1} # this for example doesn't work
>   print "$a\n";
> }
>
> Thanks so much.
>
> Will Cardwell



------------------------------

Date: Sat, 19 May 2001 13:14:39 +0100
From: "Ciaran McCreesh" <keesh@users.pleaseremovethisbit.sourceforge.net>
Subject: Re: Stubborn regex won't work
Message-Id: <9e5nuv$m8i$1@news6.svr.pol.co.uk>

In article <slrn9gb2r7.87b.abigail@tsathoggua.rlyeh.net>, "Abigail"
<abigail@foad.org> wrote:
> ][  Surely foreach ?
> 
> Why? Is there a difference?

Readability. I have to admit, I forgot that for could be used like that...
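
For the record, a tiny sketch showing that the two keywords behave the same way over a list. The variable names are invented for the example.

```perl
use strict;
use warnings;

my @names = qw(x y z w);

# "for" and "foreach" are interchangeable keywords in Perl.
my (@with_for, @with_foreach);
for my $t (@names)     { push @with_for,     uc $t }
foreach my $t (@names) { push @with_foreach, uc $t }

print "@with_for\n";       # prints X Y Z W
print "@with_foreach\n";   # prints X Y Z W
```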

-- 
Ciaran McCreesh
mail:    keesh@users.sourceforge.net
web:     http://www.opensourcepan.com/


------------------------------

Date: 19 May 2001 13:08:00 +0200
From: kaih=81842hTHw-B@khms.westfalen.de (Kai Henningsen)
Subject: Re: Stubborn regex won't work
Message-Id: <81842hTHw-B@khms.westfalen.de>

abigail@foad.org (Abigail)  wrote on 18.05.01 in <slrn9gb2r7.87b.abigail@tsathoggua.rlyeh.net>:

> Ciaran McCreesh (keesh@users.pleaseremovethisbit.sourceforge.net) wrote
> on MMDCCCXVII September MCMXCIII in <URL:news:9e3me7$52l$1@news6.svr.pol.co.
> uk>: ][  In article <3B055618.1AB65E3C@am1.ericsson.se>, "William Cardwell"
> ][ <EUSWMCL@am1.ericsson.se> wrote:
> ][ > # Last word string terminated by a period
> ][ > for $t ($x,$y,$z,$w) {
> ][
> ][  Surely foreach ?
>
> Why? Is there a difference?

Of course. There's a pretty fundamental difference.

One is four whole letters longer.

> Abigail

You, of all people, should have known that!

Kai
-- 
http://www.westfalen.de/private/khms/
"... by God I *KNOW* what this network is for, and you can't have it."
  - Russ Allbery (rra@stanford.edu)


------------------------------

Date: Sat, 19 May 2001 12:38:22 GMT
From: Franco Luissi <boqichi0@earthlink.net>
Subject: Re: word doc to txt
Message-Id: <3B069515.65F3696@earthlink.net>

Try CPAN.

An MS Word document isn't plain ASCII text. Win32::OLE or some other Win32
module can probably do this; search around, there is something there.




sven wrote:

> Hi,
>
> I want to extract all ascii strings in a microsoft word document. I am
> not interested in layout or anything, just in the text.
>
> I did something like:
>
> while ($line = <FileHandle>) {
>   $line =~ s/[^A-Za-z0-9]+/ /g;
>   ...
> }
>
> but this tends to produce strings, that are not visible in the document
> at all.
>
> Any suggestions ?
>
> thanks in advance, Sven



------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 937
**************************************
