[18189] in Perl-Users-Digest


Perl-Users Digest, Issue: 357 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Feb 26 09:05:40 2001

Date: Mon, 26 Feb 2001 06:05:10 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <983196310-v10-i357@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Mon, 26 Feb 2001     Volume: 10 Number: 357

Today's topics:
    Re: #!/usr/local/bin/perl -w || /usr/bin/perl -w (Michael Wang)
    Re: [OT] Re: Problem with Program - can you help? (Betastar)
        [Perl] How to find the Perl FAQ <rootbeer&pfaq*finding*@redcat.com>
    Re: Any way to modify a script while executing it? (Gwyn Judd)
    Re: delete the own directory (Gwyn Judd)
    Re: Difficult Split Question <ianb@ot.com.au>
    Re: Difficult Split Question <ianb@ot.com.au>
    Re: Driving me mad!. How would you do this?? <enterprise@ozemail.com.au>
    Re: FAQ 4.7:   How do I perform an operation on a serie (Gwyn Judd)
    Re: Optimizing hash slice construct (Anno Siegel)
    Re: Problem with Program - can you help? (Gwyn Judd)
    Re: Problem with Program - can you help? (Gwyn Judd)
    Re: regexp clarification (Martien Verbruggen)
    Re: Regexp to match Web urls? (Csaba Raduly)
    Re: timeout connect() with select? <perlCHEESE@CHEESEatlaswebmail.com>
    Re: timeout connect() with select? (Anno Siegel)
        tricky perl regex problem <enterprise@ozemail.com.au>
    Re: tricky perl regex problem (Bernard El-Hagin)
    Re: tricky perl regex problem (Rafael Garcia-Suarez)
    Re: tricky perl regex problem <enterprise@ozemail.com.au>
    Re: XML parsing for large documents <matt@sergeant.org>
        Digest Administrivia (Last modified: 16 Sep 99) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Mon, 26 Feb 2001 13:21:15 GMT
From: mwang@mindspring.com (Michael Wang)
Subject: Re: #!/usr/local/bin/perl -w || /usr/bin/perl -w
Message-Id: <fLsm6.1039$Jb2.58298@news.uswest.net>

Rafael Garcia-Suarez <rgarciasuarez@free.fr> wrote:
>#!/usr/bin/perl
>use Config;
>if ($Config{version} =~ /^5\.0/ && -x '/usr/local/bin/perl') {
>  exec('/usr/local/bin/perl', $0, @ARGV) or die $!;
>}
>... the script here ...

Thanks. This solves the problem of

if /usr/local/bin/perl is there
  use it
else
  use /usr/bin/perl  # and we know that it is there
fi

But what if the question is changed to

if /usr/local/bin/perl is there
  use it
elif /usr/bin/perl is there
  use it
fi

Here we only know that at least one of /usr/local/bin/perl and /usr/bin/perl
is there, but not which one.

I think this has to be

#!/bin/ksh

plus some smart construct that both ksh and perl understand.
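
One construct of this kind is the shell/Perl polyglot header described in
perlrun. The following is only a sketch, assuming a Bourne-compatible
/bin/sh (ksh should behave the same, but that is untested here) and the two
paths named in the post. The shell runs the eval and execs whichever perl it
finds; perl, started with -x, skips to the "#!perl" line and never executes
the eval because of the "if 0".

```perl
#!/bin/sh
#!perl -w
eval 'for p in /usr/local/bin/perl /usr/bin/perl; do
          [ -x "$p" ] && exec "$p" -x -S "$0" ${1+"$@"}
      done
      echo "no perl found" >&2
      exit 127'
    if 0;

# Only perl reaches this point. The shell has already replaced itself
# with perl (or exited with status 127) inside the eval above.
print "Running under perl $] at $^X\n";
```

The exit status 127 and the error message are illustrative choices, not
something taken from the thread.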


------------------------------

Date: Mon, 26 Feb 2001 13:13:32 GMT
From: insq@yahoo.com (Betastar)
Subject: Re: [OT] Re: Problem with Program - can you help?
Message-Id: <3a9a55cf.81373263@news-server.nc.rr.com>

On 26 Feb 2001 00:15:01 -0500, Joe Schaefer
<joe+usenet@sunstarsys.com> wrote:
>[...]earch.
>
>Let's hope you will publish your results and not go looking for yet 
>another genome patent ;-)

I'm a grad student on a federal grant - I have to publish ;)

I just hope I get some good results.  We're trying to control a plant
parasite without putting the farmer in danger from chemical sprays. =)



------------------------------

Date: Mon, 26 Feb 2001 11:21:04 GMT
From: Tom Phoenix <rootbeer&pfaq*finding*@redcat.com>
Subject: [Perl] How to find the Perl FAQ
Message-Id: <pfaqmessage983186641.1604@news.teleport.com>

Archive-name: perl-faq/finding-perl-faq
Posting-Frequency: weekly
Last-modified: 29 Apr 2000

[ That "Last-modified:" date above refers to this document, not to the
Perl FAQ itself! The last _major_ update of the Perl FAQ was in Summer
of 1998; of course, ongoing updates are made as needed. ]

For most people, this URL should be all you need in order to find Perl's
Frequently Asked Questions (and answers).

    http://www.cpan.org/doc/FAQs/

Please look over (but never overlook!) the FAQ and related docs before
posting anything to the comp.lang.perl.* family of newsgroups.

For an alternative way to get answers, check out the Perlfaq website.

    http://www.perlfaq.com/

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 

Beginning with Perl version 5.004, the Perl distribution itself includes
the Perl FAQ. If everything is pro-Perl-y installed on your system, the
FAQ will be stored alongside the rest of Perl's documentation, and one
of these commands (or your local equivalents) should let you read the FAQ.

    perldoc perlfaq
    man perlfaq

If a recent version of Perl is not properly installed on your system,
you should ask your system administrator or local expert to help. If you
find that a recent Perl distribution is lacking the FAQ or other important
documentation, be sure to complain to that distribution's author.

If you have a web connection, the first and foremost source for all things
Perl, including the FAQ, is the Comprehensive Perl Archive Network (CPAN).
CPAN also includes the Perl source code, pre-compiled binaries for many
platforms, and a large collection of freely usable modules, among its
560_986_526 bytes (give or take a little) of super-cool (give or take
a little) Perl resources.

    http://www.cpan.org/
    http://www.perl.com/CPAN/
    http://www.cpan.org/doc/FAQs/FAQ/html/
    http://www.perl.com/CPAN/doc/FAQs/FAQ/html/

You may wish or need to access CPAN via anonymous FTP. (Within CPAN,
you will find the FAQ in the /doc/FAQs/FAQ directory. If none of these
selected FTP sites is especially good for you, a full list of CPAN sites
is in the SITES file within CPAN.)

    California     ftp://ftp.cdrom.com/pub/perl/CPAN/
    Texas          ftp://ftp.metronet.com/pub/perl/
    South Africa   ftp://ftp.is.co.za/programming/perl/CPAN/
    Japan          ftp://ftp.dti.ad.jp/pub/lang/CPAN/
    Australia      ftp://cpan.topend.com.au/pub/CPAN/
    Netherlands    ftp://ftp.cs.ruu.nl/pub/PERL/CPAN/
    Switzerland    ftp://sunsite.cnlab-switch.ch/mirror/CPAN/
    Chile          ftp://ftp.ing.puc.cl/pub/unix/perl/CPAN/

If you have no connection to the Internet at all (so sad!) you may wish
to purchase one of the commercial Perl distributions on CD-ROM or other
media. Your local bookstore should be able to help you to find one.

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 

Comments and suggestions on the contents of this document
are always welcome. Please send them to the author at
<pfaq&finding*comments*@redcat.com>. Of course, comments on
the docs and FAQs mentioned here should go to their respective
maintainers.

Have fun with Perl!

-- 
Tom Phoenix       Perl Training and Hacking       Esperanto
Randal Schwartz Case:     http://www.rahul.net/jeffrey/ovs/


------------------------------

Date: Mon, 26 Feb 2001 13:32:19 GMT
From: tjla@guvfybir.qlaqaf.bet (Gwyn Judd)
Subject: Re: Any way to modify a script while executing it?
Message-Id: <slrn99kmn0.dkv.tjla@thislove.dyndns.org>

I was shocked! How could Dick <donkan7@yahoo.com>
say such a terrible thing:

>Each time I run this it of course says the default is 33. What I'd like to
>be able to do is to have a change in the default value "stick". For example

Language War!!! I just want to point out that in AppleScript (shudder)
if you declare a variable as a "property" variable instead of a normal
variable type then you get persistence by default and whatever value it
has at the termination of the script sticks...just like cr*p sticks to a
wall. This is about the only really nice feature of that horrible blight
of a language I can remember. Okay off-topic episode is over, back to
your normal everyday programming everyone, nothing to see here.

-- 
Gwyn Judd (print `echo 'tjla@guvfybir.qlaqaf.bet' | rot13`)
I believe OS/2...to be the most important OS...of all time.
-- Bill Gates, 1987


------------------------------

Date: Mon, 26 Feb 2001 13:38:00 GMT
From: tjla@guvfybir.qlaqaf.bet (Gwyn Judd)
Subject: Re: delete the own directory
Message-Id: <slrn99kn1m.dkv.tjla@thislove.dyndns.org>

I was shocked! How could Christian Gersch <c.gersch@team.isneurope.com>
say such a terrible thing:
>Hi there!
>
>The following situation:
>The Perl script "script.pl" has been executed and has deleted itself with
>"unlink <script.pl>;".
>
>Okay, now the directory where the script _was_ is empty.
>
>How can I get it to delete the empty directory when the script has
>already deleted itself?

I just have to know. Why are you doing this? There must be a better way,
whatever problem you are trying to solve.

>The name of the directory is not known to the script - but there is a
>variable for it, isn't there?

No. Not in general. It may be possible to guess depending on the value
of $0.

>...and what's the command to delete a directory...I think "unlink" is only
>for files...?

rmdir. You may need to change your working directory out of this
directory before it will succeed.
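
The advice above can be sketched as follows. This is a minimal example, not
the poster's code: the sub name remove_script_and_dir is made up for
illustration, and it assumes the path it is given (typically $0) really does
lead to the script, which, as noted, is not guaranteed in general.

```perl
use strict;
use warnings;
use Cwd qw(abs_path);
use File::Basename qw(dirname);

# Delete the file at $script, then its directory if that is now empty.
# Returns 1 on success, 0 on any failure ($! holds the reason).
sub remove_script_and_dir {
    my ($script) = @_;

    my $path = abs_path($script);
    return 0 unless defined $path && -f $path;
    my $dir = dirname($path);

    unlink $path or return 0;

    # Step out of the directory first; many systems refuse to remove
    # the current working directory.
    chdir dirname($dir) or return 0;

    # rmdir only succeeds on an empty directory.
    return rmdir($dir) ? 1 : 0;
}

# Typical (destructive!) use from inside the script itself:
#   remove_script_and_dir($0) or warn "cleanup failed: $!";
```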

-- 
Gwyn Judd (print `echo 'tjla@guvfybir.qlaqaf.bet' | rot13`)
 "She has brought something evil from the past...a soldier of darkness."
	-- Markab Ambassador, "The Long Dark"


------------------------------

Date: Mon, 26 Feb 2001 22:37:54 +1100
From: Ian Boreham <ianb@ot.com.au>
Subject: Re: Difficult Split Question
Message-Id: <3A9A4011.6E29505@ot.com.au>

Joe Schaefer wrote:

> Ian Boreham <ianb@ot.com.au> writes:

 ...

> >    (@Array) = split /,(?![\w,]*\))/;
>
> Since this is a split-derived solution, unlike name3
> (but like name1) it ignores trailing sequences of commas.
> I guess that's where the original disclaimer regarding
> assumptions fits in.

Just to clarify this point, this and similar split-derived solutions ignore
trailing commas, but if a negative argument is passed to split:

  @Array = split /,(?![\w,]*\))/, $_, -1;

then the trailing empty elements are retained. The user can decide which behaviour
is preferable.
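
A small illustration of the difference, using a made-up input line (the data
and the sub name split_fields are assumptions for the example, not from the
thread):

```perl
use strict;
use warnings;

# Split on commas that are not inside a (non-nested) parenthesised group.
# A limit of 0 is split's default behaviour; -1 keeps trailing empty fields.
sub split_fields {
    my ($text, $limit) = @_;
    $limit = 0 unless defined $limit;
    return split /,(?![\w,]*\))/, $text, $limit;
}

my $line = 'a,f(b,c),d,,';

my @trimmed = split_fields($line);        # trailing empty fields dropped
my @kept    = split_fields($line, -1);    # trailing empty fields kept

print scalar(@trimmed), " fields by default\n";       # prints "3 fields by default"
print scalar(@kept),    " fields with limit -1\n";    # prints "5 fields with limit -1"
```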

Regards,


Ian




------------------------------

Date: Tue, 27 Feb 2001 00:09:53 +1100
From: Ian Boreham <ianb@ot.com.au>
Subject: Re: Difficult Split Question
Message-Id: <3A9A55A1.8734990B@ot.com.au>

Joe Schaefer wrote:

> IMHO split() isn't feasible for splitting on quotes("), since I think
> it would require variable-width lookbehinds to see if you're inside a
> quoted region.  However I think it can be done in OP's case since a
> (negative) lookahead can accomplish this for parentheses.

It's certainly not possible to do the task (splitting on unquoted commas) as
nicely as the FAQ version, but depending on your assumptions, I believe it can
be done. I certainly wouldn't recommend using this solution, though, because
for long strings it would get very slow (and why bother when there's a FAQ to
use as a basis?), but I was just curious to try it out. However, in my
experimentation, I found some strange behaviour in the FAQ solution, which it
might pay to be aware of if you're using it.

I will include a solution below and a comparison with the FAQ results for each
case.

The assumptions I have made are:

 . Fields can include quoted regions that may include commas.

 . Fields can include escaped (\) characters.

 . All quoted sections are terminated (no trailing unclosed quotes) - there is
no perfect solution to this problem; the data is bad, so there just
needs to be defined behaviour for the user to expect (this solution does not
handle it very gracefully). The string should be validated beforehand, or the
solution should reject the string during processing.

---

What I found strange about the FAQ answer as it stands (although it could be
modified to suit, since it is less restrictive than the split solution, and
some of this is merely my aesthetic opinion) is:

 . Quotes surrounding a whole field are stripped, but internal quotes are left
alone, and escaped characters retain the escape slash; this just seems
inconsistent. (I think leaving the escaped characters alone is fine, but I
think the quotes should be left alone too.)

 . A section starting with quotes but with the quotes terminated internally is
split into the quoted part and the unquoted remainder, even though there is no
comma separating them. Adding a second quoted part just makes it start to give
surprising results.

 . An unclosed quote (which looks like it can only happen if it is the final
quote character) is ignored. I'd prefer it to effectively quote the remainder
of the string. (My split-based solution is no better; probably worse,
depending on how you define worse when there is no right solution other than
"die".)


The principle on which this version works is as follows:

 . To avoid lookbehind, especially of the variable-length variety, all
quote-checking is based on lookahead (same as my related parentheses version).
This problem is harder than parentheses, since the opening and closing
delimiters are the same.

 . The saving grace is that because they are the same, there can be no nesting.
To determine if we are outside a quoted section, we look ahead to the end of
the string (very inefficient) and check that there are an even number of
unescaped quote chars ahead.

 . Commas not in a quoted section are used to split.

---

The overall behaviour:

 . Quoted sections of fields are left alone, as are escape slashes, i.e. no
processing is done other than the split itself.

 . Multiple quoted sections are allowed between unquoted commas; their only
effect is to hide internal commas.

 . Escaped characters are left alone.

 . Unterminated (final) quoted sections screw up all previous sections. I think
it would be essential before using a solution with this behaviour (or any
other solution, in fact) to validate the string first, using a regex similar
to that used in the lookahead part of the code below, rather than to blindly
accept the output produced.


Here is some code (at last):

#! /usr/bin/perl -w

my @strings = (
        'SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"',
        'SAR001,"","Cimetrix, Inc","Bob Smith", \\", Weird\\"string,here\\",6',
        'SAR001,"","Cimetrix, Inc","Bob Smith", \\", Weird\\"string,here,6',
        'SAR001,"","Cimetrix, Inc","Bob \\"Dog, Hound\\" Smith",,"Blah",,',
        'String "with embedded quotes" in it, other stuff',
        '"Starting but not ending" with a quote, other stuff',
        '"Starting quote" and "Ending quote", other stuff',
        'String, here, that will screw it up: "hello, how are you',
        );

foreach my $text (@strings)
{
    print "Splitting text: '$text':\n";

    # FAQ version -----------------------------------------------------------
    @new = ();
    push(@new, $+) while $text =~ m{
 "([^\"\\]*(?:\\.[^\"\\]*)*)",?  # groups the phrase inside the quotes
       | ([^,]+),?
       | ,
       }gx;
    push(@new, undef) if substr($text,-1,1) eq ',';
    # End FAQ version -------------------------------------------------------

    # Inserted by me to suppress warnings on printing:
    map {$_ = "" unless defined $_} @new;
    print "     FAQ RESULTS: [", join("] [", @new), "]\n";

    # Split version ---------------------------------------------------------
    my @array = split
        /,(?=(?:\\.|[^"\\])*(?:"(?:\\.|[^"\\])*"(?:\\.|[^"\\])*)*$)/,
        $text, -1;
    print "   SPLIT RESULTS: [", join("] [", @array), "]\n";
    # End Split version -----------------------------------------------------

    print "\n";
}


__END__


Here is the output, if you don't care to run it:

Splitting text: 'SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error,
Core Dumped"':
     FAQ RESULTS: [SAR001] [] [Cimetrix, Inc] [Bob Smith] [CAM] [N] [8] [1]
[0] [7] [Error, Core Dumped]
   SPLIT RESULTS: [SAR001] [""] ["Cimetrix, Inc"] ["Bob Smith"] ["CAM"] [N]
[8] [1] [0] [7] ["Error, Core Dumped"]

Splitting text: 'SAR001,"","Cimetrix, Inc","Bob Smith", \",
Weird\"string,here\",6':
     FAQ RESULTS: [SAR001] [] [Cimetrix, Inc] [Bob Smith] [ \"] [
Weird\"string] [here\"] [6]
   SPLIT RESULTS: [SAR001] [""] ["Cimetrix, Inc"] ["Bob Smith"] [ \"] [
Weird\"string] [here\"] [6]

Splitting text: 'SAR001,"","Cimetrix, Inc","Bob Smith", \",
Weird\"string,here,6':
     FAQ RESULTS: [SAR001] [] [Cimetrix, Inc] [Bob Smith] [ \"] [
Weird\"string] [here] [6]
   SPLIT RESULTS: [SAR001] [""] ["Cimetrix, Inc"] ["Bob Smith"] [ \"] [
Weird\"string] [here] [6]

Splitting text: 'SAR001,"","Cimetrix, Inc","Bob \"Dog, Hound\"
Smith",,"Blah",,':
     FAQ RESULTS: [SAR001] [] [Cimetrix, Inc] [Bob \"Dog, Hound\" Smith] []
[Blah] [] []
   SPLIT RESULTS: [SAR001] [""] ["Cimetrix, Inc"] ["Bob \"Dog, Hound\" Smith"]
[] ["Blah"] [] []

Splitting text: 'String "with embedded quotes" in it, other stuff':
     FAQ RESULTS: [String "with embedded quotes" in it] [ other stuff]
   SPLIT RESULTS: [String "with embedded quotes" in it] [ other stuff]

Splitting text: '"Starting but not ending" with a quote, other stuff':
     FAQ RESULTS: [Starting but not ending] [ with a quote] [ other stuff]
   SPLIT RESULTS: ["Starting but not ending" with a quote] [ other stuff]

Splitting text: '"Starting quote" and "Ending quote", other stuff':
     FAQ RESULTS: [Starting quote] [ and "Ending quote"] [ other stuff]
   SPLIT RESULTS: ["Starting quote" and "Ending quote"] [ other stuff]

Splitting text: 'String, here, that will screw it up: "hello, how are you':
     FAQ RESULTS: [String] [ here] [ that will screw it up: "hello] [ how are
you]
   SPLIT RESULTS: [String, here, that will screw it up: "hello] [ how are you]



Well, I thought it was an interesting exercise, although I wouldn't use it. I
probably wouldn't use the FAQ as it stands, though, either. I'd define my
requirements for a particular use, and then rewrite it with tests to make sure
it did what I wanted.
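
The validation step recommended above (rejecting strings with an unclosed
quote before splitting) could look something like this. It is a sketch that
reuses the lookahead pattern from the split version; the names $UNQUOTED,
$BALANCED and split_unquoted are introduced here for illustration, and
\z is used in place of $ so that a trailing newline is not silently ignored.

```perl
use strict;
use warnings;

# A run of escaped characters or ordinary (non-quote, non-backslash) ones.
my $UNQUOTED = qr/(?:\\.|[^"\\])*/s;

# The whole string contains an even number of unescaped quote characters.
my $BALANCED = qr/\A $UNQUOTED (?: " $UNQUOTED " $UNQUOTED )* \z/x;

# Split $text on commas outside quoted sections; die on unbalanced quotes.
sub split_unquoted {
    my ($text) = @_;
    die "unbalanced quote in: $text\n" unless $text =~ $BALANCED;
    return split /,(?=$UNQUOTED(?:"$UNQUOTED"$UNQUOTED)*\z)/, $text, -1;
}

my @fields = split_unquoted('SAR001,"","Cimetrix, Inc",N,');
print "[", join("] [", @fields), "]\n";

my @bad = eval { split_unquoted('String, here: "hello, how are you') };
print "rejected: $@" if $@;
```

Like the split version it is derived from, this leaves quotes and escape
slashes in the fields untouched and rescans to the end of the string for every
comma, so it is slow on long inputs.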

Regards,


Ian




------------------------------

Date: Mon, 26 Feb 2001 22:26:48 +1100
From: Marc Fearby <enterprise@ozemail.com.au>
Subject: Re: Driving me mad!. How would you do this??
Message-Id: <3A9A3D78.8B5FC469@ozemail.com.au>

This is gonna be dodgy, but I guess you could use the LWP module to
actually attempt to retrieve the remote web page. If the buffer comes back
empty or the request fails, then you know there is a problem with the page.
You could also do some quick checking to make sure the first few lines
don't contain the number "404" or something, then send the file to
/dev/null when you're done.
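
A rough sketch of that idea, assuming LWP::UserAgent is installed (it is a
CPAN module, not part of core Perl). It uses a HEAD request rather than
fetching the page body, which is a substitution for the "retrieve and
discard" approach described above; some servers answer HEAD differently from
GET, so treat the result with care. The sub name url_is_up is hypothetical.

```perl
use strict;
use warnings;

# Return 1 if a HEAD request for $url yields a 2xx response, else 0.
# $ua is expected to behave like an LWP::UserAgent object.
sub url_is_up {
    my ($ua, $url) = @_;
    my $response = $ua->head($url);
    return $response->is_success ? 1 : 0;
}

# When URLs are given on the command line, check each one.
if (@ARGV) {
    require LWP::UserAgent;
    my $ua = LWP::UserAgent->new(timeout => 10);
    for my $url (@ARGV) {
        printf "%-4s %s\n", (url_is_up($ua, $url) ? 'OK' : 'BAD'), $url;
    }
}
```

In the database loop, each URL fetched from MySQL would be passed to
url_is_up in place of the command-line arguments.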

Chile wrote:
> 
> Hi,
> 
> I would really appreciate it if someone could recommend the best method to
> do this and any examples would be appreciated.
> 
> Basically I have a MySQL db with URLs to files that are being stored on
> remote servers. These files are important to our business and are stored on
> clients' servers, and we need to constantly check that these files are up.
> 
> Now I know it's easy enough to ping the client's URL and check if it's
> reachable, but the way the clients' servers are set up, the files are stored
> on a different server, so the site can be reachable, but clients tend to
> remove/delete our files. So I have been given the task of writing a script
> that will loop through the db of URLs to these FILES and test for their
> reachability.
> 
> I have the script that loops through the db and prints out the URLs etc.,
> but I just can't figure out how to write a script that I can pass these URLs
> to for it to check their reachability and then report back...
> 
> I saw in a previous post someone asking the same thing, and using this was
> suggested:
> 
> lynx -head -source http://www.example.com/ | grep "200 OK"
> 
> Now this works perfectly and returns "200 OK" if it's found or "Error" if
> it's not, but I don't know how to incorporate this into a script so I can
> pass a URL to a file and it tells me good or bad.
> 
> I hope I have described this well enough for someone to help.
> 
> thanks for ANY help,
> Chud


------------------------------

Date: Mon, 26 Feb 2001 13:43:24 GMT
From: tjla@guvfybir.qlaqaf.bet (Gwyn Judd)
Subject: Re: FAQ 4.7:   How do I perform an operation on a series of integers?
Message-Id: <slrn99knbp.dkv.tjla@thislove.dyndns.org>

I was shocked! How could PerlFAQ Server <faq@denver.pm.org>
say such a terrible thing:
>This message is one of several periodic postings to comp.lang.perl.misc
>intended to make it easier for perl programmers to find answers to
>common questions. The core of this message represents an excerpt
>from the documentation provided with every Standard Distribution of
>Perl.
>
>+
>  How do I perform an operation on a series of integers?

I think that this should mention the List::Util module. I know it's not
in core but it's so handy frankly I feel it bloody well should be :)
Also it has the added advantage of running as compiled C code on a
number of platforms.
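
For readers who have not met the module, a brief sketch of what it offers
for a series of integers (List::Util has been bundled with Perl since 5.8,
so on a current installation no extra install should be needed; the series
itself is made up):

```perl
use strict;
use warnings;
use List::Util qw(sum max first);

my @series = (1 .. 10);

my $total     = sum(@series);                    # 55
my $largest   = max(map { $_ * $_ } @series);    # 100, the largest square
my $first_big = first { $_ > 7 } @series;        # 8, first element over 7

print "sum=$total max_square=$largest first_over_7=$first_big\n";
```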

-- 
Gwyn Judd (print `echo 'tjla@guvfybir.qlaqaf.bet' | rot13`)
Confucius say too much.
-Chinese proverb


------------------------------

Date: 26 Feb 2001 12:41:53 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: Optimizing hash slice construct
Message-Id: <97diuh$rbp$1@mamenchi.zrz.TU-Berlin.DE>

According to Raymund Hofmann <RAY_electronic_design@t-online.de>:
> i use the following working hash slice construct:
> 
> map {$_->[1][2][0][1]} @$cellports{keys %$cellports}
> 
> Could i do this hash slice more elegantly without the map ?
> 
> something like:
> 
> @$cellports{keys %$cellports}->[1][2][0][1]
> 
> only gives the first element of the hash slice, not all like the first
> construct.
> 
> $cellports is a ref to a hash with values that contain references to a more
> complicated data structure (refs to hashes and arrays).
> 
> Please note that I do not necessarily want all keys of a hash like in the
> example here '@$cellports{keys %$cellports}', but maybe
> '@$cellports{('some','keys')}'

Your problem has little to do with hash slices.  What you have is
a list of (nested) array references.  That it is obtained via a
hash slice is secondary.

I don't see a way to apply an index expression ( ->[1][2][0][1] ) to
each of the list elements without some looping construct.
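
If the map is repeated in many places, one option is to hide the loop in a
small sub. A sketch with made-up data of the nesting depth from the
question; the name deep_values and the sample structure are assumptions for
illustration:

```perl
use strict;
use warnings;

# Hypothetical data: each value is nested so that ->[1][2][0][1] exists.
my $cellports = {
    a => [0, [0, 0, [[0, 'port-a']]]],
    b => [0, [0, 0, [[0, 'port-b']]]],
    c => [0, [0, 0, [[0, 'port-c']]]],
};

# Apply the index expression to the values for @keys (all keys, sorted,
# when none are given). The loop is still there; it is just out of sight.
sub deep_values {
    my ($hash, @keys) = @_;
    @keys = sort keys %$hash unless @keys;
    return map { $_->[1][2][0][1] } @{$hash}{@keys};
}

my @some = deep_values($cellports, 'a', 'c');
my @all  = deep_values($cellports);

print "@some\n";    # prints "port-a port-c"
print "@all\n";     # prints "port-a port-b port-c"
```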

Anno


------------------------------

Date: Mon, 26 Feb 2001 13:59:30 GMT
From: tjla@guvfybir.qlaqaf.bet (Gwyn Judd)
Subject: Re: Problem with Program - can you help?
Message-Id: <slrn99ko9v.dkv.tjla@thislove.dyndns.org>

I was shocked! How could Betastar <insq@yahoo.com>
say such a terrible thing:

>I have a file that looks something like this:
>
>>First one
>1a
>b2
>3x
>y4
>5z
>
>>Second one
>1G
>C2
>3GGC

<snip>

>I want to make five separate files that have each >line followed by
>the text in a separate file.   (Actually, I have a file with 6,626

Other people have pointed out why your program didn't work. I want to
suggest a slightly different way. If you look at the document 'perldoc
perlvar' you will see it describes a variable called '$/'. This is
called the "Input record separator". What it does is it tells Perl what
string to look for that separates each record of input. By default the
<> operator gets you one line at a time, but if you do this:

$/ = ''; # set to the empty string

It will get you one paragraph at a time, that is each chunk of text
separated by a blank line, which is what your code seemed to want to do.
ergo (untested, but it should work):

#!/usr/bin/perl -w
use strict;

$/ = '';

while (<>)
{
    $inc++;
    
    open OUT, ">$inc.file" or die "Couldn't open $inc.file: $!";

    print OUT;
}

Much simpler, I think.

-- 
Gwyn Judd (print `echo 'tjla@guvfybir.qlaqaf.bet' | rot13`)
Fast. Powerful. User-friendly. Now choose any two.
-Eric Daniels


------------------------------

Date: Mon, 26 Feb 2001 14:03:22 GMT
From: tjla@guvfybir.qlaqaf.bet (Gwyn Judd)
Subject: Re: Problem with Program - can you help?
Message-Id: <slrn99koh7.dkv.tjla@thislove.dyndns.org>

I was shocked! How could Gwyn Judd <tjla@guvfybir.qlaqaf.bet>
say such a terrible thing:

>#!/usr/bin/perl -w
>use strict;
>
>$/ = '';
>
>while (<>)
>{
>    $inc++;

Of course this *won't* work because some idiot (namely yours truly)
forgot to declare $inc. Add the line 'my $inc' somewhere up there above
the while() and it will work.
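
With that fix applied, the whole thing might look like the sketch below. It
is a reworking rather than the poster's exact script: the logic is wrapped
in a sub (split_records, a name chosen here), $/ is localised so the change
does not leak, and lexical filehandles are used so each output file is
closed as soon as it is written.

```perl
use strict;
use warnings;

# Read blank-line-separated records from the filehandle $in and write
# each one to "$dir/N.file". Returns the number of files written.
sub split_records {
    my ($in, $dir) = @_;
    local $/ = '';    # paragraph mode: one record per chunk of text
    my $inc = 0;

    while (my $record = <$in>) {
        $inc++;
        my $name = "$dir/$inc.file";
        open my $out, '>', $name or die "Couldn't open $name: $!";
        print {$out} $record;
        close $out or die "Couldn't close $name: $!";
    }
    return $inc;
}

# Usage: script.pl input-file   (output goes to the current directory)
if (@ARGV) {
    open my $in, '<', $ARGV[0] or die "Couldn't open $ARGV[0]: $!";
    split_records($in, '.');
}
```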

-- 
Gwyn Judd (print `echo 'tjla@guvfybir.qlaqaf.bet' | rot13`)
A meeting is an event where minutes are taken and hours wasted.

		-- Unknown


------------------------------

Date: Mon, 26 Feb 2001 23:40:31 +1100
From: mgjv@tradingpost.com.au (Martien Verbruggen)
Subject: Re: regexp clarification
Message-Id: <slrn99kjlv.gro.mgjv@martien.heliotrope.home>

On Mon, 26 Feb 2001 02:38:25 -0800,
	Inkswamp <inkswamp@nas.com> wrote:
> 
> BTW, I should have said in my previous post that I don't have access to
> a machine or web server where I can test this out which is why I asked
> here.

If you have a computer at home, it is very unlikely that you have an
OS/architecture combination for which Perl isn't available, or on which
perl won't compile. 

If you have a Windows box, try www.activestate.com for a port of Perl
for Win32. If you have a Mac try www.macperl.com. If you have anything
else, start looking here:

http://www.perl.com/pub/language/info/software.html

Of course, if you don't have a computer, then you can't install Perl.
But then the question remains: What did you use to post to Usenet? And
what are you going to use to connect to the place where you can use an
installed Perl?

Martien
-- 
Martien Verbruggen              | 
Interactive Media Division      | I'm just very selective about what I
Commercial Dynamics Pty. Ltd.   | accept as reality - Calvin
NSW, Australia                  | 


------------------------------

Date: Mon, 26 Feb 2001 10:42:09 +0000 (UTC)
From: real.email@signature.this.is.invalid (Csaba Raduly)
Subject: Re: Regexp to match Web urls?
Message-Id: <Xns905465853quuxi@194.203.134.135>

And so it came to pass that abigail@foad.org (Abigail) on 21 Feb 2001
wrote <slrn998hqg.hvb.abigail@tsathoggua.rlyeh.net>: 

>Eli the Bearded (elijah@workspot.net) wrote on MMDCCXXXI September
>MCMXCIII in <URL:news:eli$0102211629@qz.little-neck.ny.us>:
>"" In comp.lang.perl.misc, Clay Shirky <clays@panix.com> wrote:
>"" > I need the canonical regexp to match urls beginning with
>http:// (I "" > don't need to worry about ftp:, telnet: or mailto:,
>in other words) "" > and though I don't want to roll my own, Google
>searches of the form "" 
>"" Maybe not cannonical, but
>"" 
[snip regex]
>
>
>Sorry, that allows too much.
>
>Here's a better one (it cheats on ldap:// though) (remove the
>newlines): 
>
[100+ line regex deleted]
>
>Abigail

Pity I can't sigquote it ( I'd get flamed all over :-)

-- 
Csaba Raduly, Software Developer ,             Sophos Anti-Virus
email:csaba.raduly@sophos.com             http://www.sophos.com/
US Support +1 888 SOPHOS 9            UK Support +44 1235 559933
In case of water landing, YOU may be used as a flotation device!


------------------------------

Date: Mon, 26 Feb 2001 23:14:10 +1100
From: "h." <perlCHEESE@CHEESEatlaswebmail.com>
Subject: Re: timeout connect() with select?
Message-Id: <3a9a4deb@nexus.comcen.com.au>

Anno Siegel wrote in message <97dc0n$i65$1@mamenchi.zrz.TU-Berlin.DE>...
>Here you're using $eout in the position where the file numbers
>to be checked for errors are expected.  Also, you're setting it
>to the integer 1, but it must be a bit string of file numbers.
>You should acquaint yourself with select by reading perldoc select
>and man select.

Ok. I read the man pages, they are written for C, of course.
In C, the 3rd argument to select has something to do with
 "exceptions", not  "errors". - eg. signals as well as errors.

If the connect() call fails for a socket handle, then the
socket immediately becomes readable and writable.
Seems weird to me,  but that's what the documentation says.
That's why I've been putting the file descriptor (whatever
that is) in the third argument for the select call.

I could not find the select perl doc; could someone please let me know where
I can find it? Also, if someone could please let me know which perldoc
explains what a file number is and what a file descriptor is, I'd appreciate
it. All the documentation I've read for select() mentions file numbers, file
descriptors, et al. I fear I'll never be able to understand select() until I
can get a solid handle on what a file descriptor is, what it looks like, and
how it functions on the bit level.

p.s. Please note that my main goal is to understand, not just to find a way.

>Anno
Thanks for your help, Anno.




------------------------------

Date: 26 Feb 2001 12:57:41 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: timeout connect() with select?
Message-Id: <97djs5$rbp$2@mamenchi.zrz.TU-Berlin.DE>

According to h. <perlCHEESE@CHEESEatlaswebmail.com>:
> Anno Siegel wrote in message <97dc0n$i65$1@mamenchi.zrz.TU-Berlin.DE>...
> >Here you're using $eout in the position where the file numbers
> >to be checked for errors are expected.  Also, you're setting it
> >to the integer 1, but it must be a bit string of file numbers.
> >You should acquaint yourself with select by reading perldoc select
> >and man select.

Uh... that should have been "perldoc -f select".
 
> Ok. I read the man pages, they are written for C, of course.
> In C, the 3rd argument to select has something to do with
>  "exceptions", not  "errors". - eg. signals as well as errors.
> 
> If the connect() call fails for a socket handle, then the
> socket immediately becomes readable and writable.
> Seems weird to me,  but that's what the documentation says.
> That's why I've been putting the file descriptor (whatever
> that is) in the third argument for the select call.
> 
> I could not find the select perl doc, could someone please let me know where
> I can find it. Also, if someone could please let me know what perldoc
> explains
> what a file number is and what a file descriptor is, I'd appreciate it. All

Use perldoc -f select (see above), and perldoc -f fileno for file numbers.
These may point you further to corresponding Unix man pages.
File numbers are small integers used to identify an open file of a
process.  In Perl you rarely have to deal with them.  select() is an
exception, but if you use IO::Select this part is hidden in the
module.

> the documentation I've read for select() mentions file numbers, file
> descriptors, et. al.
> I fear I'll never be able to understand select() until I can get a solid
> handle on what
> a file descriptor is, and what it looks like and how it functions on the bit
> level.

The "bit level" (in the form of vec() expressions) has nothing to do
with the way files function.  It is just a convenient way to put
many file numbers in a single string so they can all be referenced at
once.
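
To make that concrete, here is a small sketch using a pipe instead of a
socket (an assumption made to keep the example self-contained; on Windows,
select() works on sockets only). The helper name fd_bits is made up for the
example.

```perl
use strict;
use warnings;

# Build the bit string select() expects: bit number N is set when the
# file with file number N should be watched.
sub fd_bits {
    my @handles = @_;
    my $bits = '';
    vec($bits, fileno($_), 1) = 1 for @handles;
    return $bits;
}

pipe(my $reader, my $writer) or die "pipe: $!";
syswrite($writer, "x") or die "syswrite: $!";    # make $reader readable

# select() overwrites its arguments with the handles that are ready,
# so pass it a copy of the bit string.
my $rin    = fd_bits($reader);
my $rout   = $rin;
my $nfound = select($rout, undef, undef, 1);     # wait at most 1 second

print "file number of reader: ", fileno($reader), "\n";
print "handles ready: $nfound\n";
print "reader bit set: ", vec($rout, fileno($reader), 1), "\n";
```

IO::Select wraps the same mechanism: its add() and can_read() methods take
filehandles, so the vec() bookkeeping never appears in your own code.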

Anno


------------------------------

Date: Mon, 26 Feb 2001 22:18:24 +1100
From: Marc Fearby <enterprise@ozemail.com.au>
Subject: tricky perl regex problem
Message-Id: <3A9A3B80.1BD37885@ozemail.com.au>

I am having problems trying to write a regular expression that will
find all instances of <?sp> where ? is a number and replace the 
entire <?sp> with spaces -ie, so that even the < sp> bit is replaced 
with spaces.

I can get it to find the number between the < and > but don't know
how to tell perl how to trash the characters used to find the match
also. I'm trying to do this to show some of my Visual Basic friends
the power of Perl. I know it can do it :-)

Thanks heaps

mfearby@yahoo.com


------------------------------

Date: Mon, 26 Feb 2001 11:22:15 +0000 (UTC)
From: bernard.el-hagin@lido-tech.net (Bernard El-Hagin)
Subject: Re: tricky perl regex problem
Message-Id: <slrn99kf2e.n6i.bernard.el-hagin@gdndev32.lido-tech>

On Mon, 26 Feb 2001 22:18:24 +1100, Marc Fearby
<enterprise@ozemail.com.au> wrote:
>I am having problems trying to write a regular expression that will
>find all instances of <?sp> where ? is a number and replace the 
>entire <?sp> with spaces -ie, so that even the < sp> bit is replaced 
>with spaces.

Assuming your searched string is in $_:

s/<\dsp>/     /g;
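
A quick check of this substitution on a made-up sample string (the
input is only an assumption for illustration). Note that \d matches
exactly one digit, so a tag with a longer number is left alone:

```perl
use strict;
use warnings;

# Hypothetical sample input: one single-digit tag, one two-digit tag.
my $text = "a<3sp>b<12sp>c";

# Copy the string, then replace each single-digit tag with five spaces.
(my $result = $text) =~ s/<\dsp>/     /g;

# Brackets make the spaces visible in the output.
print "[$result]\n";
```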

Cheers,
Bernard
--
#requires 5.6.0
perl -le'* = =[[`JAPH`]=>[q[Just another Perl hacker,]]];print @ { @ = [$ ?] }'


------------------------------

Date: Mon, 26 Feb 2001 11:31:21 GMT
From: rgarciasuarez@free.fr (Rafael Garcia-Suarez)
Subject: Re: tricky perl regex problem
Message-Id: <slrn99kfjf.mgh.rgarciasuarez@rafael.kazibao.net>

Bernard El-Hagin wrote in comp.lang.perl.misc:
> On Mon, 26 Feb 2001 22:18:24 +1100, Marc Fearby
> <enterprise@ozemail.com.au> wrote:
> >I am having problems trying to write a regular expression that will
> >find all instances of <?sp> where ? is a number and replace the 
> >entire <?sp> with spaces -ie, so that even the < sp> bit is replaced 
> >with spaces.
> 
> Assuming your searched string is in $_:
> 
> s/<\dsp>/     /g;

Or, with multi-digit numbers (the replacement keeps the original width):
  s/<(\d+)sp>/' 'x(4+length $1)/eg;

or, if you want to replace <NNNsp> with NNN spaces (which sounds more
logical to me, if 'sp' is intended to mean 'space'):
  s/<(\d+)sp>/' 'x$1/eg;
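
To see how the two variants differ, here is a small sketch run on a
made-up sample string (the input and variable names are assumptions
for illustration only):

```perl
use strict;
use warnings;

# Hypothetical sample input with a one-digit and a two-digit tag.
my $text = "a<3sp>b<12sp>c";

# Variant 1: blank out each tag with as many spaces as the tag is long
# ('<', 's', 'p' and '>' make 4 characters, plus the digits).
(my $same_width = $text) =~ s/<(\d+)sp>/' ' x (4 + length $1)/eg;

# Variant 2: treat the number as a count and emit that many spaces.
(my $n_spaces = $text) =~ s/<(\d+)sp>/' ' x $1/eg;

# Brackets make the spaces visible in the output.
print "[$same_width]\n";
print "[$n_spaces]\n";
```

The /e modifier is what makes the replacement side run as Perl code
rather than being taken as a literal string.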

-- 
Rafael Garcia-Suarez / http://rgarciasuarez.free.fr/


------------------------------

Date: Mon, 26 Feb 2001 22:47:26 +1100
From: Marc Fearby <enterprise@ozemail.com.au>
Subject: Re: tricky perl regex problem
Message-Id: <3A9A424E.74367070@ozemail.com.au>

Merci beaucoup!  That works beautifully!!!!

Rafael Garcia-Suarez wrote:
> 
> Bernard El-Hagin wrote in comp.lang.perl.misc:
> > On Mon, 26 Feb 2001 22:18:24 +1100, Marc Fearby
> > <enterprise@ozemail.com.au> wrote:
> > >I am having problems trying to write a regular expression that will
> > >find all instances of <?sp> where ? is a number and replace the
> > >entire <?sp> with spaces -ie, so that even the < sp> bit is replaced
> > >with spaces.
> >
> > Assuming your searched string is in $_:
> >
> > s/<\dsp>/     /g;
> 
> Or, with multi-digit numbers :
>   s/<(\d+)sp>/' 'x(4+length $1)/eg;
> 
> or, if you want to replace <NNNsp> by NNN spaces : (sounds more logical
> to me, if 'sp' is intended to mean 'space')
>   s/<(\d+)sp>/' 'x$1/eg;
> 
> --
> Rafael Garcia-Suarez / http://rgarciasuarez.free.fr/


------------------------------

Date: Mon, 26 Feb 2001 11:23:15 +0000
From: Matt Sergeant <matt@sergeant.org>
Subject: Re: XML parsing for large documents
Message-Id: <3A9A3CA3.1A6E7F6E@sergeant.org>

Ajaman2001 wrote:
> 
> I am parsing a large XML document ( 50 - 100 MB) using XML::Parser module and
> the entire operation takes more than an hour. I am using the "Subs" style for
> XML parser. Is there a faster way to process such large XML documents in Perl?
> Any pointers would be greatly helpful. Thanks in advance

Why not just use the raw parser (by setting your own handlers)? There's no
way it should take that long, unless you're building a tree from the
document and going into swap memory.
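
A minimal sketch of what setting your own handlers can look like,
assuming all you need is something simple such as a count per element
(the sample document and the %count hash are made up for
illustration):

```perl
use strict;
use warnings;
use XML::Parser;

# Tally of element names seen so far (illustrative only).
my %count;

# Register a Start handler directly instead of using a style.
# No tree is built, so memory use stays flat however large the input is.
my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $element, %attrs) = @_;
            $count{$element}++;
        },
    },
);

# A tiny inline document stands in for the real file here.
$parser->parse('<log><entry id="1"/><entry id="2"/></log>');

print "$_: $count{$_}\n" for sort keys %count;
```

For a real document you would call parsefile() with the file name
instead of parse() with a string.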

-- 
<Matt/>

    /||    ** Founder and CTO  **  **   http://axkit.com/     **
   //||    **  AxKit.com Ltd   **  ** XML Application Serving **
  // ||    ** http://axkit.org **  ** XSLT, XPathScript, XSP  **
 // \\| // ** mod_perl news and resources: http://take23.org  **
     \\//
     //\\
    //  \\


------------------------------

Date: 16 Sep 99 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 16 Sep 99)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

| NOTE: The mail to news gateway, and thus the ability to submit articles
| through this service to the newsgroup, has been removed. I do not have
| time to individually vet each article to make sure that someone isn't
| abusing the service, and I no longer have any desire to waste my time
| dealing with the campus admins when some fool complains to them about an
| article that has come through the gateway instead of complaining
| to the source.

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending Perl questions to the -request address; I don't have time to
answer them, even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 357
**************************************
