[23387] in Perl-Users-Digest
Perl-Users Digest, Issue: 5606 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Oct 2 18:05:44 2003
Date: Thu, 2 Oct 2003 15:05:08 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Thu, 2 Oct 2003 Volume: 10 Number: 5606
Today's topics:
Re: data varification code logic <usenet@dwall.fastmail.fm>
Re: data varification code logic (Tad McClellan)
Re: data varification code logic <usenet@dwall.fastmail.fm>
Re: DBI <mbudash@sonic.net>
Re: DBI <tore@aursand.no>
Re: DBI <mbudash@sonic.net>
Free Windows NNTP Server for Perl to post? (Great Deals)
Help with Encoding Extended Characters <notspam@spamfree.dud>
Re: How to use proxy in Net::HTTP, not in LWP::UserAgen gisle@activestate.com
Re: Inherit file descriptors from parent process? (jeff)
Re: Inherit file descriptors from parent process? (Malcolm Dew-Jones)
Re: IsSorted @list ? 1 : 0 <krahnj@acm.org>
Re: IsSorted @list ? 1 : 0 <tore@aursand.no>
Re: Perl script crashing at lockfile ? (Malcolm Dew-Jones)
Re: regex behavior <davido@pacifier.com>
regex for URL in a log file <darthlover@yahoo.com>
regex for URL in a log file <darthlover@yahoo.com>
Re: regex for URL in a log file <darthlover@yahoo.com>
Re: regex for URL in a log file <darthlover@yahoo.com>
Re: regex for URL in a log file <xx087@freenet.carleton.ca>
Re: regex for URL in a log file <florian265@uboot.com>
Re: screen output lags behind, or script appears to 'sw <skweek@no.spam>
Re: select() on socket (Vorxion)
Re: Splitting subroutines out of a file <FirstName.LastNameWithUnderscoreForSpace@Pandora.Be.RemoveThis>
test news chelito@eromaker.es
Thanks <florian265@uboot.com>
Re: Unexpected alteration of array's content <mpapec@yahoo.com>
Re: <bwalton@rochester.rr.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Thu, 02 Oct 2003 20:20:57 -0000
From: "David K. Wall" <usenet@dwall.fastmail.fm>
Subject: Re: data varification code logic
Message-Id: <Xns9408A64FAB228dkwwashere@216.168.3.30>
King <king21122@yahoo.com> wrote:
> I have a data file and need to build another file with the data
> sorted and duplicated data removed based on the first column
> value.
Here's a hint:
perldoc -q duplicate
Unique values should always suggest "use a hash".
--
David Wall
------------------------------
Date: Thu, 2 Oct 2003 15:37:59 -0500
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: data varification code logic
Message-Id: <slrnbnp356.ar.tadmc@magna.augustmail.com>
King <king21122@yahoo.com> wrote:
> Subject: Re: data varification code logic
I do not see any data validation going on...
> I have a data file and need to build another file with the data sorted
> and duplicated data removed based on the first column value.
This should get you most of the way there:
----------------------------------------
#!/usr/bin/perl
use strict;
use warnings;
# Schwartzian Transform from FAQ: How do I sort an array by (anything)?
my %seen;
my @sorted = map  { $_->[0] }
             sort { $a->[1] cmp $b->[1] }
             map  { my $key = (split)[0];         # do the split() only once
                    $seen{$key}++ ? ()            # empty list (skip duplicates)
                                  : [ $_, $key ]; # or reference to array
                  } <DATA>;
print @sorted; # sorted & uniqified lines
print "----\n";
print "$_ seen $seen{$_} times\n" for sort keys %seen;
__DATA__
one two
one two
One Two
one three
Two five
two three
Two three
----------------------------------------
> while (<DATA>) {
> my @L = split;
> push @col_1, $L[0]; #could those 2 lines be combined in one?
> }
Yes, by using a "list slice", see perldata.pod:
push @col_1, (split)[0];
But you can combine the whole while loop too:
my @col_1 = map { (split)[0] } <DATA>; # untested
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: Thu, 02 Oct 2003 21:02:27 -0000
From: "David K. Wall" <usenet@dwall.fastmail.fm>
Subject: Re: data varification code logic
Message-Id: <Xns9408AD59254CEdkwwashere@216.168.3.30>
Tad McClellan <tadmc@augustmail.com> wrote:
> King <king21122@yahoo.com> wrote:
>
>> I have a data file and need to build another file with the data
>> sorted and duplicated data removed based on the first column
>> value.
>
> This should get you most of the way there:
>
> ----------------------------------------
> #!/usr/bin/perl
> use strict;
> use warnings;
>
> # Schwartzian Transform from FAQ: How do I sort an array by (anything)?
>
> my %seen;
> my @sorted = map  { $_->[0] }
>              sort { $a->[1] cmp $b->[1] }
>              map  { my $key = (split)[0];         # do the split() only once
>                     $seen{$key}++ ? ()            # empty list (skip duplicates)
>                                   : [ $_, $key ]; # or reference to array
>                   } <DATA>;
>
> print @sorted; # sorted & uniqified lines
Pretty. I was trying to hint at something simpler. If the OP was
having trouble getting unique values, perhaps a Schwartzian Transform
might be a bit premature? <shrug>
my %hash;
while (<DATA>) {
    my ($first, $rest) = split /\s+/, $_, 2;
    next if exists $hash{$first};
    $hash{$first} = $rest;
}
print "$_ $hash{$_}" for sort keys %hash;
But of course the ST is better in several ways... and a good tool to
know about ASAP. (not counting the difference in the split(), which
may be important.)
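
To make the split() difference concrete, here is a small sketch (the sub names are made up for illustration). A bare split behaves like split ' ' and skips leading whitespace, while split /\s+/ returns an empty first field when the line starts with whitespace:

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this illustration.

# split ' ' is what a bare split() does: leading whitespace is skipped.
sub first_field_awk {
    my ($line) = @_;
    return (split ' ', $line)[0];
}

# split /\s+/ keeps an empty leading field if the line starts with whitespace.
sub first_field_regex {
    my ($line) = @_;
    return (split /\s+/, $line, 2)[0];
}

print "[", first_field_awk("  one two\n"),   "]\n";   # key is "one"
print "[", first_field_regex("  one two\n"), "]\n";   # key is the empty string
```

So for indented lines the two approaches would key the hash differently.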
--
David Wall
------------------------------
Date: Thu, 02 Oct 2003 18:30:04 GMT
From: Michael Budash <mbudash@sonic.net>
Subject: Re: DBI
Message-Id: <mbudash-5D612B.11300402102003@typhoon.sonic.net>
In article <pan.2003.10.02.12.40.42.160229@aursand.no>,
Tore Aursand <tore@aursand.no> wrote:
> Problems occur when the data you insert into the SQL query contain single
> quotes (or other non-escaped special characters).
>
> The easiest way to deal with problems like this is to _always_ bind
> variables into your SQL query. I recommend this way of doing it even when
> you are 110% sure that the data you're about to insert into the SQL is
> "clean";
very good advice. also it prevents having to think about which method
you'll use - just _always_ use placeholders.
>
> my $stExists = $dbh->prepare('SELECT COUNT(*)
>                               FROM tickets
>                               WHERE email = ?');
> $stExists->execute( $email );
> my ( $exists ) = $stExists->fetchrow_array();
> $stExists->finish();
>
> if ( $exists ) {
> my $stUpdate = $dbh->prepare('UPDATE tickets
> SET time = ?,
> ticket = ?
> WHERE email = ?');
> $stUpdate->execute( $thetime, $ticket, $email );
> $stUpdate->finish();
> }
> else {
> my $stInsert = $dbh->prepare('INSERT INTO tickets
>                               (email, time, ticket)
>                               VALUES (?, ?, ?)');
> $stInsert->execute( $email, $thetime, $ticket );
> $stInsert->finish();
> }
>
> Lookup 'perldoc DBI' for more information about bind'ing values into SQL
> queries.
also, the 'replace' query can be very useful:
http://www.mysql.com/doc/en/REPLACE.html
--
Michael Budash
------------------------------
Date: Thu, 02 Oct 2003 22:50:53 +0200
From: Tore Aursand <tore@aursand.no>
Subject: Re: DBI
Message-Id: <pan.2003.10.02.20.50.44.27486@aursand.no>
On Thu, 02 Oct 2003 18:30:04 +0000, Michael Budash wrote:
>> The easiest way to deal with problems like this is to _always_ bind
>> variables into your SQL query. I recommend this way of doing it even when
>> you are 110% sure that the data you're about to insert into the SQL is
>> "clean";
> very good advice. also it prevents having to think about which method
> you'll use - just _always_ use placeholders.
That's right. I _always_ bind nowadays. Don't need that quote() method
or anything. Just bind away! :-)
> also, the 'replace' query can be very useful:
Isn't that _very_ MySQL specific?
--
Tore Aursand <tore@aursand.no>
------------------------------
Date: Thu, 02 Oct 2003 21:11:17 GMT
From: Michael Budash <mbudash@sonic.net>
Subject: Re: DBI
Message-Id: <mbudash-E7C7D0.14111602102003@typhoon.sonic.net>
In article <pan.2003.10.02.20.50.44.27486@aursand.no>,
Tore Aursand <tore@aursand.no> wrote:
> On Thu, 02 Oct 2003 18:30:04 +0000, Michael Budash wrote:
> >> The easiest way to deal with problems like this is to _always_ bind
> >> variables into your SQL query. I recommend this way of doing it even when
> >> you are 110% sure that the data you're about to insert into the SQL is
> >> "clean";
>
> > very good advice. also it prevents having to think about which method
> > you'll use - just _always_ use placeholders.
>
> That's right. I _always_ bind nowadays. Don't need that quote() method
> or anything. Just bind away! :-)
>
> > also, the 'replace' query can be very useful:
>
> Isn't that _very_ MySQL specific?
not sure, quite possibly, but the o.p. did say:
>> I have a mySQL database with a table called "tickets" in it.
--
Michael Budash
------------------------------
Date: 2 Oct 2003 14:03:15 -0700
From: deals@slip-12-64-108-121.mis.prserv.net (Great Deals)
Subject: Free Windows NNTP Server for Perl to post?
Message-Id: <cafe07c7.0310021303.6f35103d@posting.google.com>
Isn't NNTP like mail relay? As long as I have my own NNTP server, I
can take part in the server-to-server communication.
There are many free NNTP servers for reading, but very few are open for
posting.
As I understand it, NNTP in Perl needs a server to post to. I want to
know how I can get this kind of server, so that I can use Perl to post
to my own server and then have the server transfer the article to the
central database of the newsgroup.
------------------------------
Date: Thu, 02 Oct 2003 20:39:40 GMT
From: Sean O'Dwyer <notspam@spamfree.dud>
Subject: Help with Encoding Extended Characters
Message-Id: <notspam-926B66.16452602102003@news-server.nyc.rr.com>
Hi,
I have data coming into a script from a Web form. Everything's fine and
running normally, but when the Perl script sends the data via e-mail to
my (Hispanic) client the extended characters are munged...
>D'a
>DIRECCIîN
>CîDIGO POSTAL
>DIRECCIîN PAêS: Espa-a
...when they should read...
>Día
>DIRECCIÓN
>CÓDIGO POSTAL
>DIRECCIÓN PAÍS: España
These characters are embedded into my script as follows...
>print MAIL "TELÉFONO: $variable\n";
>print MAIL "DIRECCIÓN: $variable\n";
>print MAIL "CÓDIGO POSTAL: $variable\n";
>print MAIL "DIRECCIÓN PAÍS: $variable\n";
>etc.
The problem may be related to the fact I'm Mac-based. I upload via FTP
binary and use UNIX linebreaks, etc., but obviously I'm not encoding the
extended chars correctly.
Any help appreciated.
Kind regards,
Sean
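
For reference, the munged output above is what Mac OS Roman bytes look like when they are displayed as ISO-8859-1 or Windows-1252 (for example, byte 0x92 is í in MacRoman but an apostrophe in Windows-1252). A minimal sketch of one possible fix, assuming the script file was saved as MacRoman and the recipient's mail reader expects ISO-8859-1; the sub name is made up for illustration, and the Encode module needs Perl 5.8 or later:

```perl
use strict;
use warnings;
use Encode qw(from_to);

# Hypothetical helper: re-encode a byte string from MacRoman to ISO-8859-1.
# Assumes the input really is MacRoman; other input would be mangled.
sub macroman_to_latin1 {
    my ($bytes) = @_;
    from_to($bytes, 'MacRoman', 'iso-8859-1');   # converts the copy in place
    return $bytes;
}

# Usage sketch: convert each label before printing it to the MAIL handle, e.g.
#   print MAIL macroman_to_latin1("DIRECCI\xEEN: "), "$variable\n";
```

The message would probably also want a "Content-Type: text/plain; charset=ISO-8859-1" header so that the mail reader knows which charset to use.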
------------------------------
Date: 02 Oct 2003 12:27:48 -0700
From: gisle@activestate.com
Subject: Re: How to use proxy in Net::HTTP, not in LWP::UserAgent?
Message-Id: <m3lls3mocr.fsf@eik.i-did-not-set--mail-host-address--so-shoot-me>
deals@slip-12-64-108-121.mis.prserv.net (Great Deals) writes:
> I need some features that are in Net::HTTP but not in LWP::UserAgent.
Which feature is that?
> I used proxy in LWP::UserAgent, and I think LWP::UserAgent is using
> Net::HTTP, but I can not find how I can use proxy in Net::HTTP
> directly...
Read the HTTP spec. Or monitor how some application (like LWP) does
it when it talks via a proxy. Basically all you have to do is provide
a full URL in the request line to the proxy server.
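
In other words, a request sent through a proxy differs mainly in its first line. A minimal sketch that builds the request text by hand; the sub name and the example URL are made up for illustration, and error handling is kept to a bare minimum:

```perl
use strict;
use warnings;

# Hypothetical helper: build the text of a GET request as a proxy expects it.
# A direct request would say "GET /index.html"; via a proxy the request
# line carries the full URL instead.
sub proxy_request {
    my ($url) = @_;
    my ($host) = $url =~ m{^http://([^/]+)}i
        or die "not an http URL: $url";
    return join "\r\n",
        "GET $url HTTP/1.0",
        "Host: $host",
        "", "";                      # blank line ends the headers
}

print proxy_request('http://www.example.com/index.html');
```

With Net::HTTP the same idea should amount to connecting to the proxy's host and port and passing the full URL as the URI argument of write_request().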
--
Gisle Aas
------------------------------
Date: Thu, 02 Oct 2003 20:32:06 +0100
From: Jeff.Turnbull@belvoirlettings.com (jeff)
Subject: Re: Inherit file descriptors from parent process?
Message-Id: <Jeff.Turnbull-0210032032060001@192.168.0.5>
In article <cabf0e7b.0309260908.28ecec97@posting.google.com>,
james@unifiedmind.com (James Thornton) wrote:
> How do you inherit the stdin and stdout file descriptors from a parent
> process? Specifically, the following function "run_filters" forks and
> calls several perl scripts via execvp.
>
> One of the perl scripts needs to execute a virus scanner on the file
> pointed to by the parent file descriptor, modifying the file as
> needed. The virus scanner will be invoked via a system call where
> $file is the file name associated with the file parent file
> descriptor:
>
> system $virus_binary $file;
>
> // C program that forks and executes several
> // Perl scripts via execvp
> int run_filters(command* first, int fdin)
> {
>   command* c;
>
>   for(c = first; c; c = c->next) {
>     pid_t pid;
>     int status;
>     int fdout;
>
>     fdout = mktmpfile();
>     if(fdout == -1)
>       return -QQ_WRITE_ERROR;
>     pid = fork();
>     if(pid == -1)
>       return -QQ_OOM;
>     if(pid == 0) {
>       if(close(0) == -1 ||
>          dup2(fdin, 0) != 0 ||
>          close(1) == -1 ||
>          dup2(fdout, 1) != 1)
>         exit(QQ_WRITE_ERROR);
>       execvp(c->argv[0], c->argv);
>       exit(QQ_INTERNAL);
>     }
>     if(waitpid(pid, &status, WUNTRACED) == -1)
>       return -QQ_INTERNAL;
>     if(!WIFEXITED(status))
>       return -QQ_INTERNAL;
>     if(WEXITSTATUS(status))
>       return -WEXITSTATUS(status);
>     close(fdin);
>     if(lseek(fdout, 0, SEEK_SET) != 0)
>       return -QQ_WRITE_ERROR;
>     fdin = fdout;
>   }
>   return fdin;
> }
------------------------------
Date: 2 Oct 2003 14:48:22 -0800
From: yf110@vtn1.victoria.tc.ca (Malcolm Dew-Jones)
Subject: Re: Inherit file descriptors from parent process?
Message-Id: <3f7c9d26@news.victoria.tc.ca>
James Thornton (james@unifiedmind.com) wrote:
: How do you inherit the stdin and stdout file descriptors from a parent
: process? Specifically, the following function "run_filters" forks and
: calls several perl scripts via execvp.
Normally you inherit them by doing nothing. The command that is spawned
will read from the same stdin and write to the same stdout. (If several
programs are all reading the same stdin at the same time then they each
get a turn at reading.)
For example, this runs perl, then spawns the cat command, which
echoes its stdin to stdout, and then when it returns to perl, perl
prints to stdout.
C:> perl -e "print 'this is perl'; system('cat'); print 'back to perl'"
The output I get (includes some things I typed)
this is perl
hello
hello
back to perl
(My example uses system(), but all system does is fork, exec and wait,
so it's really no different.)
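
If the descriptors need to be redirected first, the fork/dup2/execvp steps of the C code can be written in Perl roughly as follows. This is only a sketch: the sub name run_filter and its file-path arguments are assumptions for the example, not part of the original program.

```perl
use strict;
use warnings;

# Hypothetical helper: run @cmd with stdin read from $in_path and stdout
# written to $out_path, then return the child's exit status.
sub run_filter {
    my ($in_path, $out_path, @cmd) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: reopen STDIN and STDOUT (descriptors 0 and 1).
        # The exec'd program inherits them without doing anything itself.
        open STDIN,  '<', $in_path  or die "cannot read $in_path: $!";
        open STDOUT, '>', $out_path or die "cannot write $out_path: $!";
        exec { $cmd[0] } @cmd or die "cannot exec $cmd[0]: $!";
    }

    waitpid($pid, 0);   # parent waits, as system() would
    return $? >> 8;
}

# Usage sketch: run_filter('in.txt', 'out.txt', $^X, '-pe', '$_ = uc');
```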
------------------------------
Date: Thu, 02 Oct 2003 18:10:37 GMT
From: "John W. Krahn" <krahnj@acm.org>
Subject: Re: IsSorted @list ? 1 : 0
Message-Id: <3F7C69E5.89BDC9F@acm.org>
King wrote:
>
> I need to find out if a data file is sorted by its first column, which is
> supposed to be a unique value.
> I am thinking
> while (<FH>) {
> push @key, split [0];
> }
> IsSorted @key ? do this : do that;
my $is_sorted = 1;
my $prev = '';
while ( <FH> ) {
    my $curr = (split)[0];
    # strictly greater, since the keys are supposed to be unique
    ( $is_sorted = $curr gt $prev ) or last;
    $prev = $curr;
}
something() unless $is_sorted;
John
--
use Perl;
program
fulfillment
------------------------------
Date: Thu, 02 Oct 2003 22:46:30 +0200
From: Tore Aursand <tore@aursand.no>
Subject: Re: IsSorted @list ? 1 : 0
Message-Id: <pan.2003.10.02.16.40.51.362708@aursand.no>
On Fri, 03 Oct 2003 01:22:21 +1000, King wrote:
> I need to find out if a data file is sorted by its first column
Well. Define "sorted", please. There are several ways of sorting
something, depending on what type of data we're dealing with.
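
As a sketch of why the definition matters (the sub and variable names here are made up for illustration), the same keys can be sorted as strings and unsorted as numbers:

```perl
use strict;
use warnings;

# Hypothetical helper: true if @keys is strictly ascending according to $cmp.
# "Strictly" because the keys are supposed to be unique.
sub is_sorted {
    my ($cmp, @keys) = @_;
    for my $i (1 .. $#keys) {
        return 0 if $cmp->($keys[$i - 1], $keys[$i]) >= 0;
    }
    return 1;
}

my $by_string = sub { $_[0] cmp $_[1] };
my $by_number = sub { $_[0] <=> $_[1] };

my @keys = qw(10 2 9);
print "as strings: ", is_sorted($by_string, @keys), "\n";   # "10" lt "2" lt "9"
print "as numbers: ", is_sorted($by_number, @keys), "\n";   # 10 > 2
```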
--
Tore Aursand <tore@aursand.no>
------------------------------
Date: 2 Oct 2003 11:35:49 -0800
From: yf110@vtn1.victoria.tc.ca (Malcolm Dew-Jones)
Subject: Re: Perl script crashing at lockfile ?
Message-Id: <3f7c7005@news.victoria.tc.ca>
Peter Richards (jehoshua@my-deja.com) wrote:
: Hi,
: I've recently moved a website and one Perl script will not work. The
: previous site had a Unix box with Perl 5.006 , and the new site has a
: Linux box with Perl 5.006001
: After inserting "print" statements all over the place, finally this
: piece of code is where the script is stopping:
: ----------------------------------------------------------------------
: system ("lockfile -2 -r 5 $base_dir/.lock" ) == 0 or diehtml("Lock
: error: ", $? >> 8, "\n" ); # TODO stop stderr of system
: --------------------------------------------------------------------
: There is no msg appearing (what happened to the "diehtml" ?), the
: script just stops.
What do you mean by "just stops"?
Is the script still running but not doing anything? I would use the ps
command to check. I typically type the command
$ ps -ef | grep my-user-name
to see what programs I am running, though how you do that on your cgi
server will depend.
I would guess the lockfile program is waiting to acquire a lock on the
flag file ($base_dir/.lock). I suspect that if you use ps you will see
the lockfile program "running", but not taking any time (because it is
just waiting), and your perl script will also be "running" but not taking
any time (because it too is waiting - for the lockfile program).
It could be that the web server will kill the processes if they stay stuck
too long, in which case you will not get an error message. It could be
that the lockfile program got stuck just once (for a reason I cannot
guess at) and now every time you run the perl-script/lockfile-program
it hangs because of the earlier stuck process.
You could try an alarm around the system() call
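# untested, from memory
eval {
    # without a handler, SIGALRM would simply kill the script
    local $SIG{ALRM} = sub { die "timeout\n" };
    alarm(3); # three seconds
    system("lockfile -2 -r 5 $base_dir/.lock");
    alarm(0);
};
if ($@)
{ diehtml("system() timed out after three seconds");
}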
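
One caveat with an alarm around system() is that the child process may keep running after the timeout. A rough sketch of a variant that forks explicitly so it can kill the child; the sub name run_with_timeout is an assumption for the example, and it relies on Unix-style fork and signals:

```perl
use strict;
use warnings;

# Hypothetical helper: run @cmd, giving up after $seconds.
# Returns the exit status, or undef if the command timed out.
sub run_with_timeout {
    my ($seconds, @cmd) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        exec { $cmd[0] } @cmd or die "cannot exec $cmd[0]: $!";
    }

    my $timed_out = 0;
    eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        waitpid($pid, 0);
        alarm 0;
        1;
    } or do {
        die $@ unless $@ eq "timeout\n";   # rethrow unrelated errors
        $timed_out = 1;
        kill 'TERM', $pid;                 # do not leave the child behind
        waitpid($pid, 0);
    };

    return $timed_out ? undef : $? >> 8;
}

# Usage sketch:
#   my $rc = run_with_timeout(3, 'lockfile', '-2', '-r', '5', "$base_dir/.lock");
#   diehtml("lockfile timed out") unless defined $rc;
```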
------------------------------
Date: Thu, 2 Oct 2003 12:29:01 -0700
From: "David Oswald" <davido@pacifier.com>
Subject: Re: regex behavior
Message-Id: <vnov3vhrqn2pea@corp.supernews.com>
> > So, 'a bb ccc dddd' =~ /(\w)+/g; returns for each substring of
> > consecutive word characters the last one, resulting in 'a', 'b', 'c'
> > and 'd'.
>
> That tests out as you said, so it's MY thinking that's off. :)
> Hopefully, you can clue me in. :)
>
> I expected it to result in "a,bb,ccc,dddd". Now I realize that
> it's the positioning of the + that causes it to get a single
> character from each group. If the + is inside the (), it
> prints what I expected.
>
> But... What is causing the original /(\w)+/ to get the LAST
> character from each group instead of the FIRST character from
> each group?
Because, walking through your string of "a bb ccc dddd", look at what your
regexp is doing:
Pass 1, step 1: Find and capture 'a'. Return 'a'.
Pass 2, step 1: Find and capture first 'b'.
Pass 2, step 2: Find 2nd 'b', and replace the first 'b' with the second one.
Return 2nd 'b'.
Pass 3, step 1: Find first 'c' and capture it.
Pass 3, step 2: Find second 'c' and put it where first 'c' had been captured.
Pass 3, step 3: Find third 'c' and put it where the 2nd 'c' had been
captured. Return 3rd 'c'.
Pass 4..... you should get the idea by now.
Think of the capturing parens as your pocket, and it only has room for one
thing. The regexp puts the first thing it matches into the pocket. When it
finds (due to the quantifier) that it matches a 2nd thing, it takes the first
one out and puts the 2nd one in. And so on.
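
A quick way to see the difference, as a small sketch (the sub names are made up for illustration):

```perl
use strict;
use warnings;

# Quantifier outside the parens: the one "pocket" is overwritten on each
# repetition, so only the last character of each word survives.
sub last_chars {
    my ($string) = @_;
    return $string =~ /(\w)+/g;
}

# Quantifier inside the parens: the whole run of word characters is captured.
sub whole_words {
    my ($string) = @_;
    return $string =~ /(\w+)/g;
}

print join(',', last_chars('a bb ccc dddd')),  "\n";   # a,b,c,d
print join(',', whole_words('a bb ccc dddd')), "\n";   # a,bb,ccc,dddd
```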
------------------------------
Date: Thu, 2 Oct 2003 14:16:45 -0400
From: "Jaga" <darthlover@yahoo.com>
Subject: regex for URL in a log file
Message-Id: <blhq2d$4v4$1@newsmaster.cc.columbia.edu>
hail all,
I am trying to write a regular expression to match a url in a text file.
the test file looks like below under the *********
I would like to match all the urls and print them out...
I think this is easy for most but a pain in the neck for me
thanks!
************
°;V8q|Ã`<F- ÃL/&¤ ?Q ` h þ
6/$h :2003091520030922:
tfred@http://quintillium.com/mslegal/tssi986
URL ssóq|Ã`<F- ÃL/²¥ ?Q ` h þ
6/$h :2003091520030922:
tfred@http://ninet/Lists/Announcements/DispForm.h
------------------------------
Date: Thu, 2 Oct 2003 15:28:17 -0400
From: "Jaga" <darthlover@yahoo.com>
Subject: regex for URL in a log file
Message-Id: <blhu8p$891$1@newsmaster.cc.columbia.edu>
> hail all,
> I am trying to write a regular expression to match a url in a text
file.
> the test file looks like below under the *********
> I would like to match all the urls a print them out...
> I think this is easy for most but a pain in the neck for me
>
> thanks!
>
>
> ************
> °;V8q|Ã`<F- ÃL/&¤ ?Q ` h þ
> 6/$h :2003091520030922:
> tfred@http://quintillium.com/mslegal/tssi986
> URL ssóq|Ã`<F- ÃL/²¥ ?Q ` h þ
> 6/$h :2003091520030922:
> tfred@http://ninet/Lists/Announcements/DispForm.h
>
>
------------------------------
Date: Thu, 2 Oct 2003 15:53:24 -0400
From: "Jaga" <darthlover@yahoo.com>
Subject: Re: regex for URL in a log file
Message-Id: <bli00h$9l0$1@newsmaster.cc.columbia.edu>
Hail again,
here is some code I 'lifted' from different places to do pretty much
what I want... unfortunately, it doesn't work and I am working on trying to
fix it...
##########################
open IFILE,"<log.txt" or die "Can't Open file:: $!";
@lines=<IFILE>;
$text = join "\n", @lines;
@hrefs=($text=~ m{ \"(?:(-)|http\:\/\/(.*?))\"\s+ }x);
print "list of href values\n";
$count = 1;
foreach $href (@hrefs) {
print "$href\n";
$count++;
}
print $count;
close IFILE;
##########################
thanks,
Jaga
------------------------------
Date: Thu, 2 Oct 2003 16:16:24 -0400
From: "Jaga" <darthlover@yahoo.com>
Subject: Re: regex for URL in a log file
Message-Id: <bli12p$af9$1@newsmaster.cc.columbia.edu>
I changed the regex to look like this:
@hrefs=($text=~ m{http\:\/\/(.*?)\s+ }x);
unfortunately, it only returns:
quintillium.com/mslegal/tssi986
and doesn't return the other url
how can I do it recursively throughout the whole $text string?
or how can I do this more efficiently...
------------------------------
Date: 2 Oct 2003 20:50:31 GMT
From: Glenn Jackman <xx087@freenet.carleton.ca>
Subject: Re: regex for URL in a log file
Message-Id: <slrnbnp3tm.jb8.xx087@smeagol.ncf.ca>
Jaga <darthlover@yahoo.com> wrote:
> I am trying to write a regular expression to match a url in a text file.
Don't reinvent the wheel:
use Regexp::Common qw(URI);
my @urls;
while (<>) {
push @urls, /$RE{URI}{HTTP}/g;
}
--
Glenn Jackman
NCF Sysadmin
glennj@ncf.ca
------------------------------
Date: 02 Oct 2003 23:59:27 +0200
From: Florian von Savigny <florian265@uboot.com>
Subject: Re: regex for URL in a log file
Message-Id: <m3brsze1xc.fsf@uboot.com>
One way to do it:
$text = "blabla soiu apoj match poi aigjpo match poua ier";
while ($text =~ /[^a-z](match)[^a-z]/g) {
print $1, "\n";
}
this outputs:
match
match
The crucial thing is the /g (global) modifier, which causes the
matching to go on after the first match, until there's no more.
> @hrefs=($text=~ m{http\:\/\/(.*?)\s+ }x);
> unfortunately, it only returns:
> quintillium.com/mslegal/tssi986
This seems obvious, since you've excluded the "http://" from the
parentheses. I've never formulated such a thing the way you have done
here, but you might try adding the g modifier to your x (x alone does
not help here: it means "extended regular expressions", which means that
you can use comments and whitespace inside your regex to make it more
readable); it might work similarly to my while () loop. However, as this
seems to return the contents of the first pair of parentheses (all $1,
so to speak), I wouldn't want to guess what it returns if you use more
than one pair.
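
A small sketch of the list-context form, for comparison with the while () loop (the sub names are made up for illustration). In list context, a match with /g returns every capture from every match, and with more than one pair of parentheses the captures come back as one flat list:

```perl
use strict;
use warnings;

# Hypothetical helper: every http:// URL in $text, including the "http://".
sub extract_urls {
    my ($text) = @_;
    return $text =~ m{(http://\S+)}g;
}

# Two pairs of parentheses: the result is flattened into
# (host1, path1, host2, path2, ...).
sub extract_host_path_pairs {
    my ($text) = @_;
    return $text =~ m{http://([^/\s]+)(\S*)}g;
}

my $text = "tfred\@http://quintillium.com/mslegal/tssi986\n"
         . "tfred\@http://ninet/Lists/Announcements/DispForm.h\n";
print "$_\n" for extract_urls($text);
```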
Some more hints:
- if you use delimiters other than //, as you have done, you need not
escape the "/" in the regex; and you never need to escape ":"
- it is often a good idea to define matches by what they must NOT be:
e.g., formulate the body of the URL as "[^\s]+" (assuming it is
indeed delimited by some whitespace character). This has the side
effect of being helpful with tools such as grep, which don't support
minimal matching quantifiers (*?).
- if you do not want to exclude protocols other than HTTP, you might
want to say sth like "(http|ftp|news|mailto)" instead of just
"http" (but see above). You'd have to adjust the slashes, of course.
--
Florian v. Savigny
If you are going to reply in private, please be patient, as I only
check for mail something like once a week. - Si vous allez répondre
personellement, patientez s.v.p., car je ne lis les courriels
qu'environ une fois par semaine.
------------------------------
Date: Thu, 02 Oct 2003 21:35:23 +0200
From: AlV <skweek@no.spam>
Subject: Re: screen output lags behind, or script appears to 'switch statements'
Message-Id: <blhuls$1gk$1@news-reader5.wanadoo.fr>
Florian von Savigny wrote:
> Hi,
Hi,
[snip]
> Does anyone have a clue what kind of problem this could be, and where
> it typically arises?
You should turn off Perl's output buffering before printing a string
without a newline character. For example, you might do this by setting $|
to something true (1 is a reasonable value for that purpose ;o)
Restoring $| to false might be a good idea thereafter.
Actually, $| has a "use English;" name: $OUTPUT_AUTOFLUSH
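
A minimal sketch of the idea, using the autoflush() method from IO::Handle (it sets $| for the given handle); the sub name print_unbuffered is made up for this example:

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical helper: print text without a trailing newline and make sure
# it reaches the handle right away, then restore the previous setting.
sub print_unbuffered {
    my ($fh, @text) = @_;
    my $was = $fh->autoflush(1);   # like setting $| = 1 for this handle
    print {$fh} @text;
    $fh->autoflush($was);          # restore the previous value of $|
    return;
}

print_unbuffered(\*STDOUT, "Working...");
sleep 1;                           # the text is already visible during the wait
print " done\n";
```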
------------------------------
Date: 2 Oct 2003 15:50:03 -0400
From: vorxion@fairlite.com (Vorxion)
Subject: Re: select() on socket
Message-Id: <3f7c816b$1_1@news.iglou.com>
In article <u9k77nfu9d.fsf@wcl-l.bham.ac.uk>, Brian McCauley wrote:
>vorxion@fairlite.com (Vorxion) writes:
>
>> I'm having a problem with select() on a socket. I'm setting the bits, and
>> it's really strange, because even when there's -nothing- to be read,
>> vec($rout,fileno(DATA) comes back as 0 instead of 1. The bits come back
>> 0 1 0 in order of $rout $wout $eout.
>>
>> Okay, so exactly how do you tell when the remote end is sitting there with
>> nothing to say? I've used select(2) in C, and there appears to be no ready
>> equivalent of FD_ISSET() in perl.
>
>Eh?
>
>> If someone can tell me how to do this properly, or what I'm doing wrong (I'm
>> checking for $rout to be set back to 0 (via vec) after the select().
>
>> select($rout=$rin,$wout=$win,$eout=$ein,${timeout_select});
>
>You are not checking the return value of select().
>
>It is possible that it's failing.
Yes, I discovered that in the wee hours of last night. Fixed. It helped
to go through the core modules and look at how it was used there.
Now I need to find a way to determine if the socket has been closed from
remote. Now that I have the return value of select being used, maybe I can
try using EBITS?
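
For what it's worth, a common way to spot a close from the remote end is that select() reports the socket as readable and a read then returns 0 bytes. A rough sketch under that assumption; the sub name socket_state is made up, and vec() on the returned bit vector plays the role of FD_ISSET():

```perl
use strict;
use warnings;

# Hypothetical helper: classify a socket as 'idle' (nothing to read within
# $timeout seconds), 'data' (something was read) or 'closed' (EOF).
sub socket_state {
    my ($sock, $timeout) = @_;

    my $rin = '';
    vec($rin, fileno($sock), 1) = 1;

    my $nfound = select(my $rout = $rin, undef, undef, $timeout);
    die "select failed: $!" if $nfound == -1;

    # vec() on the result is the Perl counterpart of FD_ISSET().
    return 'idle' unless vec($rout, fileno($sock), 1);

    my $n = sysread($sock, my $buf, 4096);
    die "sysread failed: $!" unless defined $n;
    return $n == 0 ? 'closed' : 'data';   # 0 bytes on a readable socket is EOF
}
```

A real program would keep the data it read rather than throw $buf away; the sketch only shows the three cases.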
--
Vorxion - Member of The Vortexa Elite
------------------------------
Date: Thu, 02 Oct 2003 20:44:59 GMT
From: Jos De Laender <FirstName.LastNameWithUnderscoreForSpace@Pandora.Be.RemoveThis>
Subject: Re: Splitting subroutines out of a file
Message-Id: <f70fb.54446$j17.2294849@phobos.telenet-ops.be>
James E Keenan wrote:
>
> "Bob Walton" <noemail@rochester.rr.com> wrote in message
> news:3F7701F3.3070905@rochester.rr.com...
>> Jos De Laender wrote:
>>
>>
>> My approach would be to manually examine the source in an editor and
>> insert a special flag between subs (something like a line consisting of:
>>
>> #~~sub~~
>>
>> -- anything that won't otherwise appear in the code. Then process the
>> file with that as the input record separator. You will have to decide
>> how to name the files, since there might be duplicate sub names between
>> files and perhaps between packages in one file (and perhaps anonymous
>> subs too?) -- use a sequence number?
>>
>> But all in all, I don't grok what doing all that would buy you over
>> simply manipulating the code into the files you want using an editor --
>> it looks like that's what you're going to end up doing anyway.
>>
>> --
> Jos: I think Bob is right on the money here.
Thanks to you all.
I used Bob's approach for identifying and copying out the subs.
You won't believe it, but the cgi application worked with some 20 files
(called by different buttons in the webpage) in which 8 subroutines were
more or less the same. In file1 it was patched, in file2 it wasn't, etc ...
That's why I wanted to reorganise drastically and not simply manipulate
code into the files ....
Once more thanks.
Jos
--
Jos De Laender
------------------------------
Date: Thu, 2 Oct 2003 17:38:40 +0000 (UTC)
From: chelito@eromaker.es
Subject: test news
Message-Id: <blhnr0$3rq$1@localhost.localdomain>
darnier test news
sorry for the inconvenience
--
Chelito
Fotos Eroticas
http://www.personal.able.es/ensoriano
------------------------------
Date: 02 Oct 2003 23:13:46 +0200
From: Florian von Savigny <florian265@uboot.com>
Subject: Thanks
Message-Id: <m3fzibe41h.fsf@uboot.com>
Thanks a lot, I had really not figured out that 'no newline at
the end' was what those print statements had in common. I did think
something really weird was going on, possibly something beyond the realm
of Perl (such as the shell commands I am calling).
I have taken to setting and resetting $| before and after each such
print statement, which may have been more work than Brian's proposal
after all. But in any case, it has worked like a snap.
Many thanks!
--
Florian v. Savigny
If you are going to reply in private, please be patient, as I only
check for mail something like once a week. - Si vous allez répondre
personellement, patientez s.v.p., car je ne lis les courriels
qu'environ une fois par semaine.
------------------------------
Date: Thu, 02 Oct 2003 20:34:10 +0200
From: Matija Papec <mpapec@yahoo.com>
Subject: Re: Unexpected alteration of array's content
Message-Id: <hdronv01mtn2q3gke24vakffnf8j5i88h9@4ax.com>
X-Ftn-To: Brian McCauley
Brian McCauley <nobull@mail.com> wrote:
>> >effect, I've just returned to Perl programming.
>>
>> It is a documented feature, to avoid it try,
>> for my $line (map $_, @lines){
>
>Personally I'm surprised that works.
? :)
>But even though it does (on at
>least one version of Perl I've tried) I still prefer:
>
> for my $line (@{[@lines]}){
Do you know if there is some significant speed difference?
I've started using double maps since a single map started to corrupt hash
slices (very frustrating). IMO, aliasing $_ in grep and map is of little
or no use.
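
To illustrate the aliasing under discussion, a small sketch (the sub names are made up for this example): the loop variable of a foreach is an alias for each array element, so changing it changes the array, while looping over a copy such as @{[ @lines ]} leaves the original alone.

```perl
use strict;
use warnings;

# Hypothetical helper: the loop variable aliases the caller's elements,
# so the caller's array is modified.
sub trim_in_place {
    my ($lines) = @_;
    for my $line (@$lines) {
        $line =~ s/\s+\z//;
    }
    return;
}

# Hypothetical helper: @{[ ... ]} builds an anonymous copy first,
# so the caller's array is left untouched.
sub trimmed_copy {
    my ($lines) = @_;
    my @out;
    for my $line (@{[ @$lines ]}) {
        $line =~ s/\s+\z//;
        push @out, $line;
    }
    return @out;
}
```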
--
Matija
------------------------------
Date: Sat, 19 Jul 2003 01:59:56 GMT
From: Bob Walton <bwalton@rochester.rr.com>
Subject: Re:
Message-Id: <3F18A600.3040306@rochester.rr.com>
Ron wrote:
> Tried this code get a server 500 error.
>
> Anyone know what's wrong with it?
>
> if $DayName eq "Select a Day" or $RouteName eq "Select A Route") {
(---^
> dienice("Please use the back button on your browser to fill out the Day
> & Route fields.");
> }
...
> Ron
...
--
Bob Walton
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 5606
***************************************