[19543] in Perl-Users-Digest
Perl-Users Digest, Issue: 1738 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Sep 12 18:10:35 2001
Date: Wed, 12 Sep 2001 15:10:12 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <1000332611-v10-i1738@ruby.oce.orst.edu>
Content-Type: text
Perl-Users Digest Wed, 12 Sep 2001 Volume: 10 Number: 1738
Today's topics:
regex Q: how to exclude strings in a search? <unspecified@location.com>
Re: regex Q: how to exclude strings in a search? <Laocoon@eudoramail.com>
Re: regex Q: how to exclude strings in a search? <dtweed@acm.org>
Re: regex Q: how to exclude strings in a search? <unspecified@location.com>
regex Q: tough find and replace question (all of it, th <unspecified@location.com>
Search and Replace and Reformat <dhunter@storm.ca>
Re: Search and Replace and Reformat (Tad McClellan)
Send HTML page! <marovini@tiscalinet.it>
Re: Send HTML page! <spam@funnybytes.com>
Re: Send HTML page! <clay@panix.com>
Strange print behavior (P.C.)
Re: Strange print behavior (Logan Shaw)
Re: Strange print behavior (Tad McClellan)
substitution problem (Larry S)
Re: substitution problem <davidhilseenews@yahoo.com>
Re: substitution problem <dwilga-MUNGE@mtholyoke.edu>
Re: substitution problem <dtweed@acm.org>
system() command not wrking in cgi / works fine in .pl <anpandey@cisco.com>
Re: system() command not wrking in cgi / works fine in (Tad McClellan)
Using seek and print to insert data into a file. <robin_corcoran@3b2.com>
Re: What is package? (Tad McClellan)
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Wed, 12 Sep 2001 12:49:46 -0700
From: Bravan Dahn <unspecified@location.com>
Subject: regex Q: how to exclude strings in a search?
Message-Id: <3B9FBC5A.EFD65CCE@location.com>
Hello. I'm not much at perl, but my text editor contains a regex find
and replace utility. Unfortunately, there isn't any documentation online
simple enough for me to figure this out. I need to find any string of
text such as the following:
<font face="Verdana, Arial, Helvetica, sans-serif" size="-2">
<font face="Verdana, Arial, Helvetica, sans-serif" size="-1">
<font face="Verdana, Arial, Helvetica, sans-serif">
<font size="-2">
but NOT
<font face="verdana,arial,helvetica" color="#FF0000" size="-1">
------------------------------
Date: Wed, 12 Sep 2001 22:21:20 +0200
From: Laocoon <Laocoon@eudoramail.com>
Subject: Re: regex Q: how to exclude strings in a search?
Message-Id: <Xns911AE4CEEE70ALaocooneudoramail@62.153.159.134>
*snip*
> <font face="Verdana, Arial, Helvetica, sans-serif" size="-2">
*snip*
> but NOT
> <font face="verdana,arial,helvetica" color="#FF0000" size="-1">
m/<font[^#]+?>/
------------------------------
Date: Wed, 12 Sep 2001 21:10:43 GMT
From: Dave Tweed <dtweed@acm.org>
Subject: Re: regex Q: how to exclude strings in a search?
Message-Id: <3B9FCE32.69CEA60E@acm.org>
Do you want to give us a clue as to what it is about the fifth string
that rules it out? Is it:
- lack of capitalization on font names?
- lack of spaces after commas?
- lack of sans-serif font?
- presence of color parameter?
- something else?
If you want to match literally the first four strings, just combine them
into one big RE with alternation:
/string1|string2|string3|string4/
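As a rough sketch of that alternation, assuming the four tags from the original post are the only ones wanted: quotemeta() escapes each literal string so that nothing in it is read as regex syntax. The variable names are only illustrative.

```perl
#!/usr/bin/perl -w
use strict;

# The four literal tags from the original post.
my @wanted = (
    '<font face="Verdana, Arial, Helvetica, sans-serif" size="-2">',
    '<font face="Verdana, Arial, Helvetica, sans-serif" size="-1">',
    '<font face="Verdana, Arial, Helvetica, sans-serif">',
    '<font size="-2">',
);

# Escape each string, then join them into one big alternation.
my $alternatives = join '|', map { quotemeta($_) } @wanted;
my $re = qr/(?:$alternatives)/;

# Try the four wanted tags plus the one that should be left alone.
for my $tag ( @wanted,
    '<font face="verdana,arial,helvetica" color="#FF0000" size="-1">' ) {
    print( ( $tag =~ $re ? 'match:    ' : 'no match: ' ), $tag, "\n" );
}
```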
-- Dave Tweed
------------------------------
Date: Wed, 12 Sep 2001 14:58:21 -0700
From: Bravan Dahn <unspecified@location.com>
Subject: Re: regex Q: how to exclude strings in a search?
Message-Id: <3B9FDA7D.8ED01B63@location.com>
Sorry about the confusion. I posted accidentally before I finished the
message. There are too many variations of the <font> tag for me to include
them all. That's why I want one which will find every tag except the ones
which have a backslash in them, which specify 7pt font, or which specify red
font. Please look at my post
regex Q: how to exclude strings in a search? (all of it this time)
thanks!
Dave Tweed wrote:
> Do you want to give us a clue as to what it is about the fifth string
> that rules it out? Is it:
> - lack of capitalization on font names?
> - lack of spaces after commas?
> - lack of sans-serif font?
> - presence of color parameter?
> - something else?
>
> If you want to match literally the first four strings, just combine them
> into one big RE with alternation:
>
> /string1|string2|string3|string4/
>
> -- Dave Tweed
------------------------------
Date: Wed, 12 Sep 2001 13:05:44 -0700
From: Bravan Dahn <unspecified@location.com>
Subject: regex Q: tough find and replace question (all of it, this time)
Message-Id: <3B9FC018.E8E51F72@location.com>
Ok, so, now that I have posted half of my question...(oops)
I am trying to edit a website, all at once. I am using "Advanced Find
and Replace", which has a regex-enabled advanced search. The files I'm
working on are text files, UTF-8 encoding. I'm replacing all of the font
tags except:
ones which are part of the Java in these JSP files
ones which specify 7pt font
ones which specify red font
The Java strings contain backslashes. Is excluding them actually as
simple as [^\] ? Also, how do I use a regex to ignore strings which
contain whole words? I know [^q] is going to ignore the letter q, but it
won't work with [^"hat"] or [^hat] or [^(hat)], will it?
Any help with this is highly appreciated. Thank you in advance.
-Will
will
at
tkai
dot
com
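On the two regex questions: in Perl a backslash inside a character class has to be escaped, so the class is [^\\] rather than [^\]. A class such as [^hat] excludes the single characters h, a and t, not the word "hat". One way to reject whole strings is a negative lookahead. The sketch below is in Perl and assumes the excluded tags can be recognised by a backslash, the text "7pt", or the colour code #FF0000; those markers are guesses for illustration, and whether "Advanced Find and Replace" supports lookahead is not known here.

```perl
#!/usr/bin/perl -w
use strict;

# Match a <font ...> tag unless it contains a backslash, "7pt", or the
# red colour code. The excluded strings are assumptions for illustration.
my $font_tag = qr{<font\b(?![^>]*(?:\\|7pt|\#FF0000))[^>]*>}i;

my @samples = (
    '<font size="-2">',
    '<font face="verdana,arial,helvetica" color="#FF0000" size="-1">',
    '<font style="font-size:7pt">',
    'out.print("<font face=\"Arial\">");',
);

for my $line (@samples) {
    print( ( $line =~ $font_tag ? 'replace: ' : 'keep:    ' ), $line, "\n" );
}
```

The lookahead (?! ... ) is tried right after "<font" and scans up to the closing ">", so the tag is rejected if any of the listed strings appears anywhere inside it.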
------------------------------
Date: Wed, 12 Sep 2001 14:56:15 -0400
From: "dh" <dhunter@storm.ca>
Subject: Search and Replace and Reformat
Message-Id: <9nob62$3ov$1@bcarh8ab.ca.nortel.com>
Hi - I have a 6 meg flat file that I'd like to reformat. Here's a small
example of its current state:
ID_number: absn0002
ID BASE TAR RE CAT ORIG_DATE O TITLE
apc03 apc26 m88k f3 gen 1996-10-29 * CFW
apc03 apc26 m68k f3 gen 1996-10-29 * CFW
ID_number: snnn0005
ID BASE TAR RE CAT ORIG_DATE O TITLE
apc03 apc26 m88k f3 gen 1996-10-29 * CFW
apc03 apc26 m68k f3 gen 1996-10-29 * CFW
This is what i would like it to look like:
absn0002|apc03|apc26|m88k|f3|gen|1996-10-29|*
absn0002|apc03|apc26|m68k|f3|gen|1996-10-29|*
snnn0005|apc03|apc26|m88k|f3|gen|1996-10-29|*
snnn0005|apc03|apc26|m68k|f3|gen|1996-10-29|*
In other words, I'd like to put the ID_number at the beginning of each line
to which it applies, take out the column headings, and insert the pipe
delimiter where there is currently only a space.
The ID_numbers DO NOT repeat. Every 'block' of data is: ID_number, column
headings, data.
Any thoughts on where to start?
------------------------------
Date: Wed, 12 Sep 2001 21:41:09 GMT
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: Search and Replace and Reformat
Message-Id: <slrn9pvinu.ain.tadmc@tadmc26.august.net>
dh <dhunter@storm.ca> wrote:
>Hi - I have a 6 meg flat file that I'd like to reformat. Here's a small
>example of its current state:
[snip data]
>This is what i would like it to look like:
>
>absn0002|apc03|apc26|m88k|f3|gen|1996-10-29|*
>absn0002|apc03|apc26|m68k|f3|gen|1996-10-29|*
>snnn0005|apc03|apc26|m88k|f3|gen|1996-10-29|*
>snnn0005|apc03|apc26|m68k|f3|gen|1996-10-29|*
------------------------
#!/usr/bin/perl -w
use strict;
my $id;         # the ID_number
my $rec = '';   # bar-separated records
while ( <DATA> ) {
    next if /^ID\s/;                # skip header lines
    if ( /^ID_number: (.*)/ ) {
        $id = $1;                   # remember the ID number
        print $rec;                 # output previous record
        $rec = '';                  # empty out the record buffer
    }
    else {
        $rec .= join '|', $id, (split)[0..6];   # "list slice"
        $rec .= "\n";
    }
}
print $rec;     # don't forget the one still in the buffer
__DATA__
ID_number: absn0002
ID BASE TAR RE CAT ORIG_DATE O TITLE
apc03 apc26 m88k f3 gen 1996-10-29 * CFW
apc03 apc26 m68k f3 gen 1996-10-29 * CFW
ID_number: snnn0005
ID BASE TAR RE CAT ORIG_DATE O TITLE
apc03 apc26 m88k f3 gen 1996-10-29 * CFW
apc03 apc26 m68k f3 gen 1996-10-29 * CFW
------------------------
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: Wed, 12 Sep 2001 22:48:34 +0200
From: "Massimo Rovini" <marovini@tiscalinet.it>
Subject: Send HTML page!
Message-Id: <9nohps$rv8$1@lacerta.tiscalinet.it>
Hello!
I'd like to create a Perl application that's able to get and send web pages.
The aim of this program is to download a web page without using the browser,
do some processing, and then compile a query. Depending on the received data
I'd like to get some link.
Can anybody help me?
Thanks a lot,
Massimo
------------------------------
Date: Wed, 12 Sep 2001 22:59:19 +0200
From: "Admin UsePad" <spam@funnybytes.com>
Subject: Re: Send HTML page!
Message-Id: <9noi3v$p57$1@news.hccnet.nl>
#!/usr/bin/perl -w
use strict;
use LWP::Simple;
print "Content-type: text/html\n\n";
# get the HTML page
my $url = "http://www.domain.com";
my $geturl = get($url);
# get() returns undef on failure
die "Could not fetch $url\n" unless defined $geturl;
# do stuff with the html
# print the html
print $geturl;
"Massimo Rovini" <marovini@tiscalinet.it> wrote in message
news:9nohps$rv8$1@lacerta.tiscalinet.it...
> Hello!
> I'd like to create a Perl application that's able to get and send web
> pages. The aim of this program is to download a web page without using
> the browser, do some processing, and then compile a query. Depending on
> the received data I'd like to get some link.
> Can anybody help me?
>
> Thanks a lot,
>
> Massimo
------------------------------
Date: 12 Sep 2001 21:22:30 GMT
From: Clay Irving <clay@panix.com>
Subject: Re: Send HTML page!
Message-Id: <slrn9pvki8.ftq.clay@panix1.panix.com>
In article <9nohps$rv8$1@lacerta.tiscalinet.it>, Massimo Rovini wrote:
> I'd like to create a Perl application that's able to get and send web pages.
> The aim of this program is to download a web page without using the browser,
> do some processing, and then compile a query. Depending on the received data
> I'd like to get some link.
> Can anybody help me?
This command:
perldoc -q HTML
includes this in the output:
How do I fetch an HTML file?
This command:
perldoc -q Mail
includes this in the output:
How do I send mail?
--
Clay Irving <clay@panix.com>
A school should not be a preparation for life. A school should be life.
- Elbert Green Hubbard
------------------------------
Date: 12 Sep 2001 12:43:19 -0700
From: pietro28@yahoo.com (P.C.)
Subject: Strange print behavior
Message-Id: <a1f0067e.0109121143.28d8cb98@posting.google.com>
Currently, I've developed a program that creates a report based on
current system usage statistics, database size, and processes. It
takes a few minutes to complete before the report is generated, so I'd
like to keep track of progress while the user is waiting (like a hash
function in ftp). I thought doing a simple print "..."; might work...
but alas! It doesn't print a SINGLE THING until the end of the program
(after which it prints everything)... UNLESS there is a carriage
return at the end of the string, which defeats the purpose of keeping
it uniform and on the same line. My questions are: why is this
happening? Is there any way around it, so that I don't need a carriage
return to see a few characters printed on the screen to show progress?
Thanks!
------------------------------
Date: 12 Sep 2001 16:03:03 -0500
From: logan@cs.utexas.edu (Logan Shaw)
Subject: Re: Strange print behavior
Message-Id: <9noii7$jhb$1@charity.cs.utexas.edu>
In article <a1f0067e.0109121143.28d8cb98@posting.google.com>,
P.C. <pietro28@yahoo.com> wrote:
>Currently, I've developed a program that creates a report based on
>current system usage statistics, database size, and processes. It
>takes a few minutes to complete before the report is generated, so I'd
>like to keep track of progress while the user is waiting (like a hash
>function in ftp). I thought doing a simple print "..."; might work...
>but alas! It doesn't print a SINGLE THING until the end of the program
>(after which it prints everything)... UNLESS there is a carriage
>return at the end of the string, which defeats the purpose of keeping
>it uniform and on the same line. My questions are: why is this
>happening?
It's called buffering by line, and it's done for reasons of
efficiency.
Try doing "$| = 1;" before you do your first print.
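A minimal sketch of that suggestion. The do_one_step() routine is a made-up stand-in for one slow piece of the real report; only the "$| = 1;" line is the actual fix.

```perl
#!/usr/bin/perl -w
use strict;

$| = 1;    # autoflush the currently selected handle (STDOUT by default)

# Hypothetical stand-in for one slow step of the report.
sub do_one_step { select( undef, undef, undef, 0.25 ) }

for my $step ( 1 .. 8 ) {
    do_one_step();
    print '#';    # shows up right away, with no newline needed
}
print "\n";
```

Without the "$| = 1;" line the hash marks would normally sit in the buffer until a newline is printed or the program exits.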
- Logan
--
"Our grandkids love that we get Roadrunner and digital cable."
(Advertisement for Time Warner cable TV and internet access, July 2001)
------------------------------
Date: Wed, 12 Sep 2001 21:41:08 GMT
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: Strange print behavior
Message-Id: <slrn9pvh3e.ain.tadmc@tadmc26.august.net>
P.C. <pietro28@yahoo.com> wrote:
>It doesn't print a SINGLE THING until the end of the program
>(after which it prints everything)... UNLESS there is a carriage
>return at the end of the string,
Terminals are often "line buffered".
>My questions are: why is this
>happening?
Buffering.
>Is there any way around it, so that I don't need a carriage
>return to see a few characters printed on the screen to show progress?
See the $| variable in:
perldoc perlvar
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: 12 Sep 2001 08:55:58 -0700
From: dime0000@yahoo.com (Larry S)
Subject: substitution problem
Message-Id: <8fd7acb0.0109120755.347c2428@posting.google.com>
ok, i found something on
http://www.perl.com/pub/a/2000/11/begperl3.html but it isn't
working... i'm trying to do this
$text =~ /(<IMG SRC="\/product\/100\/.*?" width="100" height="100"
border=0 align="left" hspace=10>)/;
print $1;
Now, $1 should be <IMG SRC="\/product\/100\/.*?" width="100"
height="100" border=0 align="left" hspace=10> (where .*? is a
wildcard), but it isn't working! Any help?
------------------------------
Date: Wed, 12 Sep 2001 18:10:53 GMT
From: "David Hilsee" <davidhilseenews@yahoo.com>
Subject: Re: substitution problem
Message-Id: <NyNn7.5902$gU.1783589@news1.rdc1.md.home.com>
"Larry S" <dime0000@yahoo.com> wrote in message
news:8fd7acb0.0109120755.347c2428@posting.google.com...
> ok, i found something on
> http://www.perl.com/pub/a/2000/11/begperl3.html but it isn't
> working... i'm trying to do this
>
> $text =~ /(<IMG SRC="\/product\/100\/.*?" width="100" height="100"
> border=0 align="left" hspace=10>)/;
> print $1;
>
>
> now, $1 should now be <IMG SRC="\/product\/100\/.*?" width="100"
> height="100" border=0 align="left" hspace=10> (where .*? is a
> wildcard), but it isnt working! any help?
It's hard to determine what's going on if you don't know what $text is, eh?
David Hilsee
------------------------------
Date: Wed, 12 Sep 2001 18:33:15 GMT
From: Dan Wilga <dwilga-MUNGE@mtholyoke.edu>
Subject: Re: substitution problem
Message-Id: <dwilga-MUNGE-32B82B.14331212092001@nap.mtholyoke.edu>
In article <8fd7acb0.0109120755.347c2428@posting.google.com>,
dime0000@yahoo.com (Larry S) wrote:
> ok, i found something on
> http://www.perl.com/pub/a/2000/11/begperl3.html but it isn't
> working... i'm trying to do this
>
> $text =~ /(<IMG SRC="\/product\/100\/.*?" width="100" height="100"
> border=0 align="left" hspace=10>)/;
> print $1;
Two suggestions:
First, it's generally a bad idea to make assumptions about the content of the
string you're searching. This is especially important in the case of HTML
code, since the HTML language treats any number of whitespace characters the
same way it treats one space. Also, a whitespace character can be lots of
things, not just ' '.
Second, to make your code more readable, when searching with a pattern that
contains /, it's best to explicitly use the "m" operator, with some other
delimiter.
Taking these two things into account, you might try this:
$text =~
m{(<IMG\s+SRC="/product/100/.*?"\s+width="100"\s+height="100"\s+border=0\s+align="left"\s+hspace=10>)};
print $1;
Of course, that still makes some pretty broad assumptions about the order of
the width=, height=, etc., and the use of "quotes", but it should work if the
HTML string is typed in exactly right, including the right case.
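To loosen the case requirement as well, the same pattern can be written with the /i and /x modifiers. This is only a sketch: the sample $text is invented, and [^"]* stands in for .*? so the image path cannot run past its closing quote.

```perl
#!/usr/bin/perl -w
use strict;

# Invented sample: lower-case tag, extra spaces, and a line break.
my $text = qq{<p><img  src="/product/100/widget.gif"\n width="100" height="100" border=0 align="left" hspace=10></p>};

# /x lets the pattern be laid out over several lines; /i ignores case.
my ($tag) = $text =~ m{
    ( <IMG \s+ SRC="/product/100/ [^"]* "    # tag start and image path
      \s+ width="100" \s+ height="100"
      \s+ border=0 \s+ align="left" \s+ hspace=10 > )
}xi;

print defined $tag ? "found: $tag\n" : "no match\n";
```

Capturing into a lexical and testing it with defined avoids printing a stale $1 when the match fails.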
--
Dan Wilga dwilga-MUNGE@mtholyoke.edu
** Remove the -MUNGE in my address to reply **
------------------------------
Date: Wed, 12 Sep 2001 20:58:42 GMT
From: Dave Tweed <dtweed@acm.org>
Subject: Re: substitution problem
Message-Id: <3B9FCB63.BBA7B559@acm.org>
Larry S wrote:
> ok, i found something on
> http://www.perl.com/pub/a/2000/11/begperl3.html but it isn't
> working... i'm trying to do this
>
> $text =~ /(<IMG SRC="\/product\/100\/.*?" width="100" height="100"
> border=0 align="left" hspace=10>)/;
> print $1;
>
> now, $1 should now be <IMG SRC="\/product\/100\/.*?" width="100"
> height="100" border=0 align="left" hspace=10> (where .*? is a
> wildcard), but it isnt working! any help?
I strongly recommend reading "perldoc perlre".
Are you absolutely certain that the capitalization and whitespace
(including newlines) of your RE matches that in your $text? Obviously,
we can't tell since you didn't show us $text, or at least the part(s)
you expect to match.
Note also that the quotes on most of the values are not actually
required, and you seem to be using them or not rather randomly.
BTW, $1 won't actually contain the backslashes if the match does
succeed.
Next time, post some code that actually works under -w and "use
strict" that demonstrates your problem.
-- Dave Tweed
------------------------------
Date: Wed, 12 Sep 2001 22:08:56 +0530
From: "Anupam" <anpandey@cisco.com>
Subject: system() command not wrking in cgi / works fine in .pl
Message-Id: <1000313268.640953@sj-nntpcache-3>
Hi all,
I have an Informix database and an Apache web server on a single Sun
machine running Solaris 2.7.
I have a setup where I can get data from the Informix database by
executing a command like:
echo " select * from table1 " | dbaccess database_name
I put this in a script called A.pl with entries like:
#!/usr/local/bin/perl -w
system ("./sql_user_connection ");
where sql_user_connection is nothing but a Korn shell script with the entry:
echo " select * from table1 " | dbaccess database_name > result
If I execute A.pl like this:
perl A.pl
it works fine.
If I use the same thing in a CGI script, i.e. I call the page from my
web browser, the redirection file (result) is always blank.
Please help me.
thanks a million
~Anupam
excerpt from Sample cgi file :-
----------------------------------------------
#!/usr/local/bin/perl -w
require "cgi-bin.pl";
&ReadParse(*input);
print &PrintVariables(*input);
$Ci=$input{'ConnI'};
print $Ci;
system ( "sql_user_connection >/tmp/zz");   # creates an empty file
$var=`sql_user_connection >/tmp/yx`;        # creates an empty file
system("ls -l > /tmp/aa") ;                 # has the proper output
print $var;
print <<labl1;
<html>
<body bgcolor="#9999FF">
<head>
<title>WebDb Browser</title>
</head>
------------------------------
Date: Wed, 12 Sep 2001 21:41:07 GMT
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: system() command not wrking in cgi / works fine in .pl
Message-Id: <slrn9pvghh.ain.tadmc@tadmc26.august.net>
Anupam <anpandey@cisco.com> wrote:
>system ("./sql_user_connection ");
          ^^
          ^^ current directory
What is your current directory when run as a CGI?
I'll betcha it isn't the same directory you were in when
you ran it from the command line.
chdir() to the correct directory, or use absolute pathnames.
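A sketch of both options together, assuming sql_user_connection sits in the same directory as the CGI script (that location is a guess). FindBin is a standard module that reports the script's own directory.

```perl
#!/usr/local/bin/perl -w
use strict;
use FindBin qw($Bin);    # directory that holds this script

# Assumption: sql_user_connection lives next to the CGI script.
chdir $Bin or die "Cannot chdir to $Bin: $!";

# Absolute pathname, so the current directory no longer matters.
my $cmd = "$Bin/sql_user_connection";

if ( -x $cmd ) {
    # system() returns 0 on success; anything else goes to the error log.
    system($cmd) == 0 or warn "$cmd failed, status $?\n";
}
else {
    warn "$cmd not found or not executable\n";
}
```

The warn() messages go to STDERR, which for a CGI program normally ends up in the server's error log.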
>perl A.pl --------------> this works fine .
>
>If i use the same thing in a cgi script -- i.e
>i call this page from my web browser :-
>
>redirection file i.e result is always blank .
And what is in the server log? Error messages are helpful
for fixing errors.
Or, have the error messages sent to the browser instead:
perldoc -q CGI
"How can I get better error messages from a CGI program?"
>Please help me ..
Below is a whole lot of help that you didn't even know you needed :-)
>#!/usr/local/bin/perl -w
use strict;
>require "cgi-bin.pl";
use CGI; # cgi-bin.pl is really really old
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: Wed, 12 Sep 2001 17:43:38 -0400
From: "Robin Corcoran" <robin_corcoran@3b2.com>
Subject: Using seek and print to insert data into a file.
Message-Id: <IHQn7.160$Ph3.371617@news.uswest.net>
Hello all,
I am trying to insert some processing instructions into an xml file at
specified locations. I am using "seek" to jump to the desired location in
the file, and "print" to output my processing instructions. Everything works
like I want it to, except print is overwriting the data that is already
there. I just want to insert new data, not overwrite existing data.
Following is the code I am using, the @Insert_ is being populated by an
external program (3b2). Any thoughts/suggestions would be much appreciated.
Robin
# Open the input file and store contents in a variable.
open (INPUT, "C:/PiTest.xml") or die "Cannot open C:/PiTest.xml: $!";
@contents = <INPUT>;
$contents = join('', @contents);
close (INPUT);
# Get rid of hard returns.
$contents =~ s/\n+//g;
# Transfer contents of variable to new file.
open (OUTPUT, ">C:/NewPiTest.xml") or die "Cannot open C:/NewPiTest.xml: $!";
print OUTPUT $contents;
# Iterate through file inserting PIs for line numbers (using seek to get to
# correct position).
while ($Num_To_Insert > 0)
{
    $position = $Insert_[$Num_To_Insert];
    seek (OUTPUT, $position, 0);
    print OUTPUT "<?linestrt line=\"$Num_To_Insert\"?>";
    $Num_To_Insert--;
}
close (OUTPUT);
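A file is a flat sequence of bytes, so print after seek always writes over what is already at that position; nothing is shifted along to make room. One possible approach is to do the insertions in the in-memory string and write the file only once afterwards. The sketch below assumes the offsets index into the newline-stripped contents; the insert_pis() name and the sample offsets are made up.

```perl
#!/usr/bin/perl -w
use strict;

# Insert a <?linestrt?> processing instruction at each offset.
# %$pi_at maps line number => offset into $text (hypothetical layout).
sub insert_pis {
    my ( $text, $pi_at ) = @_;
    # Work from the highest offset down so the lower offsets stay valid.
    for my $line ( sort { $pi_at->{$b} <=> $pi_at->{$a} } keys %$pi_at ) {
        # 4-argument substr with length 0 inserts without overwriting.
        substr( $text, $pi_at->{$line}, 0, qq{<?linestrt line="$line"?>} );
    }
    return $text;
}

my $xml     = '<doc><p>one</p><p>two</p></doc>';
my %offsets = ( 1 => 5, 2 => 15 );    # made-up offsets
print insert_pis( $xml, \%offsets ), "\n";
```

The returned string can then be printed to the output file in a single pass, with no seek needed.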
------------------------------
Date: Wed, 12 Sep 2001 15:20:02 GMT
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: What is package?
Message-Id: <slrn9pusii.a02.tadmc@tadmc26.august.net>
Tad McClellan <tadmc@augustmail.com> wrote:
>Dav Lam <crud@hongkong.com> wrote:
>
>>what is package?
>
>
>Packages are documented right at the top of:
>
> perldoc perlmod
Oh yeah. See also "Coping with Scoping":
http://perl.plover.com/FAQs/Namespaces.html
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 1738
***************************************