[18437] in Perl-Users-Digest


Perl-Users Digest, Issue: 605 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Apr 2 21:26:34 2001

Date: Mon, 2 Apr 2001 18:26:00 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <986261160-v10-i605@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Mon, 2 Apr 2001     Volume: 10 Number: 605

Today's topics:
    Re: Regex question <goldbb2@earthlink.net>
    Re: Regex question <uri@sysarch.com>
    Re: Regular exasperation <nospam@nospam.com>
    Re: Regular exasperation (Jay Tilton)
    Re: Regular exasperation <nospam@nospam.com>
    Re: Regular exasperation <elijah@workspot.com>
        Regular expression for conversions that exclude HTML-co <Tim.Lauterborn@gmx.de>
    Re: require problem, help?? <mickm@ix.netcom.com>
        Script optimization question <andrew@mvt.ie>
    Re: Script optimization question <mjcarman@home.com>
    Re: Script optimization question (Anno Siegel)
    Re: Script optimization question <jonni@ifm.liu.se>
    Re: Script optimization question <andrew@mvt.ie>
    Re: Script optimization question (Abigail)
    Re: Script or Application (Damian James)
    Re: simultaneously open file handles -- limit? <wayne.keenan@ntlworld.com>
    Re: SMTP Connections <moiraine@qwest.net>
    Re: sort array contents from file question (Jay Tilton)
        Strings in XML files and Unicode <thomastk@prodigy.net>
        Summary: closures and //o (was Re: regex-qr// for searc <rick.delaney@home.com>
    Re: Taint problem? <Daniel.Maloney@bms.com>
    Re: Taint problem? <Daniel.Maloney@bms.com>
        Digest Administrivia (Last modified: 16 Sep 99) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Mon, 02 Apr 2001 20:15:25 GMT
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: Regex question
Message-Id: <3AC8DEAE.F475A956@earthlink.net>

Uri Guttman wrote:
> 
> >>>>> "AW" == Ave Wrigley <Ave.Wrigley@itn.co.uk> writes:
> 
>   >> BEGIN{ @words = qw( foo bar baz ) ; $" = "|" ; }
>   >>
>   >> print $1 if /((?:\s*(?:\b(?:@words)\b)\s*)+)/ ;
> 
>   AW> Surely this only matches if the substring only contains @words? I.e. for:
> 
>   AW> "some text here baz foo blah foo bar more text here"
> 
>   AW> it will match:
> 
>   AW> " baz foo "
> 
> you have been much less than clear on your goal. fixing it to match any
> string with has those words starting or ending it is a simple mod:
> 
> print $1 if /((?:\s*(?:\b(?:@words)\b)\s*)+.+?(?:\s*(?:\b(?:@words)\b)\s*))/ ;

No, this isn't quite right, as what he *wants* is "the smallest substring that contains all of a given set of words, in any order."

Yours will (I think) find the *first* substring which starts and ends with a word from the list.

He needs to match *all* the words in the list, and get the *shortest* such substring.

Really, the thing to do is start a new thread, asking for just this.

Perhaps this [untested] code will work:
my $string = "some text here foo baz foo blah foo bar more text here";
my @wanted = qw(foo bar baz);
# split with a capture keeps the matched words: odd indices are matches
my @string = split /\b(foo|bar|baz)\b/, $string;
my $shortest;
for my $start (0..$#string) {
	next unless $start & 1; #don't start with nonmatches
	my %seen;
	for my $end ($start..$#string) {
		next unless $end & 1; #don't end with a nonmatch
		$seen{$string[$end]}++;
		next unless keys(%seen) == @wanted; #need every word at least once
		my $slice = join "", @string[$start..$end];
		$shortest = $slice
			if !defined($shortest) or length($slice) < length($shortest);
		last; #a later $end from this $start can only be longer
	}
}
print "$shortest\n" if defined $shortest;

-- 
Sometimes the journey *is* its own reward--but not when you're trying to get to the bathroom in time.


------------------------------

Date: Mon, 02 Apr 2001 20:29:40 GMT
From: Uri Guttman <uri@sysarch.com>
Subject: Re: Regex question
Message-Id: <x7puevujub.fsf@home.sysarch.com>

>>>>> "BG" == Benjamin Goldberg <goldbb2@earthlink.net> writes:

  BG> Uri Guttman wrote:

  >> print $1 if /((?:\s*(?:\b(?:@words)\b)\s*)+.+?(?:\s*(?:\b(?:@words)\b)\s*))/ ;

  BG> No, this isn't quite right, as what he *wants* is "the smallest
  BG> substring that contains all of a given set of words, in any
  BG> order."

  BG> Yours will (I think) find the *first* substring which starts and
  BG> ends with a word from the list.

i gave up on this as his spec was never well thought out. i bet it is an
X Y problem where his real goal is easy  to solve but he has us barking
up this tree trying to solve it his way.

  BG> He needs to match *all* the words in the list, and get the *shortest*

a POSIX regex will consider all matches and return the leftmost-longest
one (not the shortest). perl's won't do that for efficiency reasons.

uri

-- 
Uri Guttman  ---------  uri@sysarch.com  ----------  http://www.sysarch.com
SYStems ARCHitecture, Software Engineering, Perl, Internet, UNIX Consulting
The Perl Books Page  -----------  http://www.sysarch.com/cgi-bin/perl_books
The Best Search Engine on the Net  ----------  http://www.northernlight.com


------------------------------

Date: 2 Apr 2001 08:24:15 -0600
From: "William D. Ezell" <nospam@nospam.com>
To: Uri Guttman <uri@sysarch.com>
Subject: Re: Regular exasperation
Message-Id: <3AC87250.F787FF2E@nospam.com>


>   WDE> Please reply via e-mail (see below) if possible.
> 
> unless you use a normal email address, i won't reply via email. you
> post here, you read here.
> 
> uri

Thanks for the reply and solution.  Sorry if the e-mail address was
problematic but I've held this ISP account now for a couple of years
with virtually no spam, due almost entirely to not posting a valid
'Reply-To' address on USENET.  Yes, it is a royal pain for everyone but
an unfortunate fact of life thanks to Spammers and the Direct Marketing
Association's lobby efforts.


******  To reply, remove the "~" from  "wd~ezell@snowhill.com"  ******


------------------------------

Date: Fri, 30 Mar 2001 23:19:05 GMT
From: tiltonj@erols.com (Jay Tilton)
Subject: Re: Regular exasperation
Message-Id: <3ac513c7.143961479@news.erols.com>

On 30 Mar 2001 15:04:55 -0600, "William D. Ezell" <nospam@nospam.com>
wrote:

>I can't figure out a simple
>regular expression to pick off [a-z] following a whitespace character
>and convert it to uppercase.
>
>I last tried
>  $string =~ tr/(?<= )[a-z]/[A-Z]/;
>
>which I thought would convert any [a-z] preceded by a space to uppercase
>but obviously I'm way off.

Despite appearances, tr has nothing to do with regular expressions. 
Abandon that approach.

This meets the stated criteria:
  $string =~ s/(\s)([a-z])/$1\u$2/g;
But it's not what you want.  It won't catch the first word.

A better solution than looking for characters following whitespace is
to look for word boundaries.  The difference between the two is
subtle, and it's worth investigating on your own.

This one does what you really want:
  $string =~ s/\b([a-z])/\u$1/g;

For more info:
  perldoc perlre
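
To see the difference between the two approaches, here is a small sketch.
The sample string and the sub names are made up for illustration; only the
two substitutions come from the post above.

```perl
use strict;
use warnings;

# Capitalize a lowercase letter only when whitespace precedes it.
sub cap_after_space {
    my ($s) = @_;
    $s =~ s/(\s)([a-z])/$1\u$2/g;
    return $s;
}

# Capitalize a lowercase letter at every word boundary.
sub cap_at_boundary {
    my ($s) = @_;
    $s =~ s/\b([a-z])/\u$1/g;
    return $s;
}

my $name = "john a smith-jones, dmd";   # hypothetical input
print cap_after_space($name), "\n";     # first word and "jones" stay lowercase
print cap_at_boundary($name), "\n";     # every word start is capitalized
```

Note that \b also sits after an apostrophe, so the second form turns
"don't" into "Don'T". Whether that matters depends on the data.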


------------------------------

Date: 2 Apr 2001 08:33:09 -0600
From: "William D. Ezell" <nospam@nospam.com>
To: Jay Tilton <tiltonj@erols.com>
Subject: Re: Regular exasperation
Message-Id: <3AC87460.BEF22FFF@nospam.com>

Thanks for the direction!

******  To reply, remove the "~" from  "wd~ezell@snowhill.com"  ******


------------------------------

Date: 30 Mar 2001 23:42:20 GMT
From: Eli the Bearded <elijah@workspot.com>
Subject: Re: Regular exasperation
Message-Id: <eli$0103301837@qz.little-neck.ny.us>

In comp.lang.perl.misc, William D. Ezell <nospam@nospam.com> wrote:
> I'm in need of a little enlightenment regarding regular expressions. 
> Specifically, I'm writing a Perl subroutine to perform "proper"
> capitalization for strings with unusual, special-case acronyms.

I'll assume you don't need to care about cases where a capitalized
version can have a different meaning than a lower-case one.

> Passing 
>   $string = 'john a smith, dmd llc';
>  
> through a series of s///g sequences such as
>   $string =~ s/dmd/DMD/g;
> 
> I can arrive at 'john a smith, DMD LLC' but I can't figure out a simple
> regular expression to pick off [a-z] following a whitespace character
> and convert it to uppercase.

You mean capitalize the name that remains there?

	$string =~ s/(^|\s)(\w)/$1\u$2/g;

I have \w instead of [a-z] so that locale can take effect. Upper-casing
numbers is not likely to bother anyone.

> I last tried
>   $string =~ tr/(?<= )[a-z]/[A-Z]/;

tr/// does not use regexps.
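
A quick sketch of what that means in practice (the variable names are only
for illustration): tr/// treats "(", "?", "<", "=" and the space as ordinary
characters in its search list, so the attempted lookbehind just becomes part
of a character-for-character mapping.

```perl
use strict;
use warnings;

my $string = 'john a smith';

# Every character of the search list, punctuation and space included,
# is mapped to the character at the same position in the replacement
# list. Nothing here is a lookbehind.
(my $mangled = $string) =~ tr/(?<= )[a-z]/[A-Z]/;
print "$mangled\n";                # not 'john A Smith'

# What tr/// is good at: a plain one-for-one character mapping.
(my $upper = $string) =~ tr/a-z/A-Z/;
print "$upper\n";                  # JOHN A SMITH
```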

> Please reply via e-mail (see below) if possible.

Munged in the headers and the body? No way.

Elijah
------
uses (^|\s) and (\s|$) constructs often


------------------------------

Date: Tue, 3 Apr 2001 00:11:27 +0200
From: "Tim Lauterborn" <Tim.Lauterborn@gmx.de>
Subject: Regular expression for conversions that exclude HTML-comments
Message-Id: <9aatem$1e8$1@nets3.rz.RWTH-Aachen.DE>

Hi,

I have got the following problem:

Example: $input="ä<!--ä is a nice letter-->"
should become "&auml;<!--ä is a nice letter-->"

What is the regular expression which achieves the result and excludes
HTML-comments?

Maybe someone can help?

Greetings,
Tim







------------------------------

Date: Fri, 30 Mar 2001 15:43:28 -0800
From: Mickey Mestel <mickm@ix.netcom.com>
To: nobull@mail.com
Subject: Re: require problem, help??
Message-Id: <3AC51A1F.AF4DC628@ix.netcom.com>

> What you are apparently missing is the self discipline to read the
> manual entry for require() all the way through before posting to
> usenet.

    yes, you are correct, i did fail to read it all the way through, as
it is there blatantly at the end, and had i done so, i would have saved
your time as well as mine.  but otoh, i would not have provided you with
the apparently necessary satisfaction of berating someone new for their
lack of measuring up to the standards of self discipline that you
yourself possess.  so in the end, i think i did both of us a favor!

> > also, i don't have access to the newsgroups from work, or very
> > often, so i have to get there through this channel.
> 
> Not only is this matter addressed in the manual but it was also
> addressed in comp.lang.perl.* several times within the last few weeks.

    not getting to the newsgroups means that i don't read them, and so
don't follow threads and such.  it is unfortunate for the moment, but
out of my control.

    thanks,

    mickm
-- 

-----------------------------------------------------------------------
This is a signature file.  This is only a signature file.  Had this 
been an actual piece of useful information, you would have been 
instructed on what to do with it.
-----------------------------------------------------------------------


------------------------------

Date: Mon, 2 Apr 2001 15:42:47 +0100
From: "Andrew" <andrew@mvt.ie>
Subject: Script optimization question
Message-Id: <9aa35c$gsd$1@kermit.esat.net>

The problem that I'm trying to solve is this;
I have a file, each line of which contains an x coordinate, a y coordinate
and a unique reference.  I want to be able to specify a distance threshold
and have all items in the list which are closer to each other than this
threshold merged together.  The solution I have come up with is as follows;

I read the file and create an array for each variable.
I extract the first element from each array and put these into new temporary
arrays.
I then go through the original arrays and if any element is less than the
specified distance from the entry in the temporary array it is removed from
the original array and added to the end of the temporary array.
Next I go back through the original array and check for the distance from
all of the elements that were added to the new array in the last loop,
moving them if they're close enough.
Once I get to a loop where I didn't add any more elements I call a
subroutine that processes the new array, combining the elements into one.
Then I reset the temporary array, extract the first element from the
original array and repeat the above process until the original array is
empty.

This works well, except that it's a bit slow.  A sample file might have
around 2000 lines and sometimes the combined items can be composed of more
than 30 source items.  This takes almost a minute to process, and I want to
know if I can speed things up at all.  I'm posting the relevant section of
the script below, and any help in improving it would be appreciated.

thanks,
Andrew

do
{
    push @temp_x, shift @x;
    push @temp_y, shift @y;
    push @temp_ref, shift @ref;

    do
    {
        $found = 0;
        $last = 0;
        for($i = 0;$i <= $#ref;$i++)
        {
            $k = $#temp_ref;
            for($j = $last;$j <= $k;$j++)
            {
                if(abs($x[$i]-$temp_x[$j]) <= $dist &&
                   abs($y[$i]-$temp_y[$j]) <= $dist)
                {
                    push @temp_x, splice @x,$i,1;
                    push @temp_y, splice @y,$i,1;
                    push @temp_ref, splice @ref,$i,1;
                    $found = 1;
                }
            }
        }
        $last = $k;
    }
    while($found == 1);
    &process_glob;
    @temp_x = @temp_y = @temp_ref = ();
}
while(@ref);




------------------------------

Date: Mon, 02 Apr 2001 16:50:37 -0500
From: Michael Carman <mjcarman@home.com>
Subject: Re: Script optimization question
Message-Id: <3AC8F42D.77DE37A1@home.com>

Andrew wrote:
> 
> The problem that I'm trying to solve is this;
> I have a file, each line of which contains an x coordinate,
> a y coordinate and a unique reference.  I want to be able to specify
> a distance threshold and have all items in the list which are closer
> to each other than this threshold merged together.

Sounds like quantization. There are many ways of doing this. I'm a bit
fuzzy on your purpose, though. To me, quantization is determining the
subset which best represents the original data. But you have to define
the size of the subset up front. You seem to want to control the amount
of quantization and let the size of the subset vary.

> The solution I have come up with is as follows;
> 
> I read the file and create an array for each variable.
> I extract the first element from each array and put these into
> new temporary arrays. I then go through the original arrays and 
> if any element is less than the specified distance from the entry
> in the temporary array it is removed from the original array and 
> added to the end of the temporary array.
> Next I go back through the original array and check for the distance from
> all of the elements that were added to the new array in the last loop,
> moving them if they're close enough.
> Once I get to a loop where I didn't add any more elements I call a
> subroutine that processes the new array, combining the elements into one.
> Then I reset the temporary array, extract the first element from the
> original array and repeat the above process until the original array is
> empty.

Well, that's an odd algorithm. What is it that you're actually trying to
do? Is this the algorithm you have to use, or is this the algorithm you
came up with to do something?

[If it's the former, you can stop reading now.]

My suggestion would be to try using vector quantization instead. It will
give you a subset (of the desired size) which has a much better
representation of the original data.

The steps are (roughly) as follows:
1) Define an arbitrary starting set of regions based on the dimensions
   of your data set and determine their centroids. (A grid is simplest.)
2) Determine which points lie in which region based on their 
   proximity to each region's centroid.
3) Calculate the new centroid of the region based on the weighted
   average of the data points it contains.
4) Repeat steps 2 and 3 until your data converges.

This is the K-means algorithm. How fast your data converges will depend
on your starting point in step 1. If you know certain things about your
data, you may be able to speed up the process by making a better "first
cut."
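
The four steps above could be sketched roughly as follows. This is only an
illustration under some assumptions: the starting centroids are passed in by
the caller rather than derived from a grid, every point has weight 1 (so the
"weighted average" is a plain mean), and the names kmeans and dist2 are made
up here.

```perl
use strict;
use warnings;

# Squared Euclidean distance between two [x, y] points.
sub dist2 {
    my ($p, $q) = @_;
    return ($p->[0] - $q->[0])**2 + ($p->[1] - $q->[1])**2;
}

# $points: array ref of [x, y]; $centroids: initial guesses (step 1).
# Returns the final centroids as an array ref.
sub kmeans {
    my ($points, $centroids, $max_iter) = @_;
    my @c = map { [@$_] } @$centroids;           # work on a copy
    for (1 .. ($max_iter || 100)) {
        # Step 2: assign each point to its nearest centroid.
        my @sum = map { [0, 0, 0] } @c;          # sum of x, sum of y, count
        for my $p (@$points) {
            my $best = 0;
            for my $i (1 .. $#c) {
                $best = $i if dist2($p, $c[$i]) < dist2($p, $c[$best]);
            }
            $sum[$best][0] += $p->[0];
            $sum[$best][1] += $p->[1];
            $sum[$best][2]++;
        }
        # Step 3: move each centroid to the mean of its points.
        my $moved = 0;
        for my $i (0 .. $#c) {
            next unless $sum[$i][2];             # empty region: leave it alone
            my @new = ($sum[$i][0] / $sum[$i][2], $sum[$i][1] / $sum[$i][2]);
            $moved = 1 if $new[0] != $c[$i][0] or $new[1] != $c[$i][1];
            $c[$i] = \@new;
        }
        last unless $moved;                      # step 4: nothing changed
    }
    return \@c;
}

# Hypothetical data and hand-picked starting centroids.
my @points = ([1,0], [1,1], [0,1], [1,3], [12,5], [13,4], [25,1]);
my $result = kmeans(\@points, [[0,0], [10,0], [20,0]]);
printf "%.2f,%.2f\n", @$_ for @$result;
```

Unlike a distance threshold, this fixes the number of output points up
front (three here), which is the trade-off discussed earlier in this post.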

I hope I didn't leave anything major out -- it's been quite a while
since I've actively done this sort of thing. If you want to try this and
have difficulties, email me and I can send you some more information.

-mjc


------------------------------

Date: 2 Apr 2001 15:11:51 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: Script optimization question
Message-Id: <9aa4rn$r0g$1@mamenchi.zrz.TU-Berlin.DE>

According to Andrew <andrew@mvt.ie>:
> The problem that I'm trying to solve is this;
> I have a file, each line of which contains an x coordinate, a y coordinate
> and a unique reference.  I want to be able to specify a distance threshold
> and have all items in the list which are closer to each other than this
> threshold merged together.  The solution I have come up with is as follows;

Pity you don't say what "merged together" means, but probably you are
replacing two sufficiently close points with one new point.  At what
co-ordinates?  And will it take part in the game again?

I am asking because it seems to me that your problem doesn't have a
unique solution.  Depending on which elements you merge first you may
end up with different patterns of which elements make up which of the
final entries.

So your optimization question is hard to deal with, because we don't
know what counts as a solution.

I would think of the problem in terms of shifting discs (or, as your
code seems to imply, squares) of a given size around the plane, trying
to cover all the points with a minimal number of discs.  This is a
coverage problem, and they love to turn out NP-hard, so maybe all you
can expect is an approximate solution.

The problem is probably covered (ha ha) in computational geometry.
Abigail?

Anno

[algorithm and code snipped]


------------------------------

Date: Mon, 2 Apr 2001 18:30:46 +0200
From: "Jonas Nilsson" <jonni@ifm.liu.se>
Subject: Re: Script optimization question
Message-Id: <9aa9ba$bc8$1@newsy.ifm.liu.se>

> > I am asking because it seems to me that your problem doesn't have a
> > unique solution.  Depending on which elements you merge first you may
> > end up with different patterns of which elements make up which of the
> > final entries.
>
> There should be one unique solution.  Hopefully the clarification above
> should indicate why.
>

There isn't a unique solution using this code, as Anno says. Example:
Threshold distance<=2
points=(0,0) (0,2) (0,3)

Replacing from the beginning gives:
points=(0,0) (0,2) (0,3) => points=(0,1)          (0,3)
points=(0,1)          (0,3) => points=(0,2)
Results in all points within distance, result: centers=(0,2)

Working from end gives:
points=(0,0) (0,2) (0,3) => points=(0,1) (0,2.5)
Results in two different clusters, result: centers=(0,1) (0,2.5)

So you must specify the algorithm more before an optimization can be done...
/jN



--
_______________________________
Jonas Nilsson




------------------------------

Date: Mon, 2 Apr 2001 17:59:51 +0100
From: "Andrew" <andrew@mvt.ie>
Subject: Re: Script optimization question
Message-Id: <9aab6f$kbv$1@kermit.esat.net>

Alright - I'll try explaining it again...
Say you have the following list and a threshold distance of 2;
1,0
1,1
12,5
25,1
0,1
13,4
1,3

Take the first element and place it in a new list;
1,1                    1,0
12,5
25,1
0,1
13,4
1,3
Compare each element in list 1 to each element (currently only 1 element) in
list 2, if they're close enough move them to list 2;
12,5                    1,0
25,1                    1,1 < these are added because they're close enough
13,4                    0,1 < to element 1
1,3
Repeat the process, comparing to the last batch of points added:
12,5                    1,0
25,1                    1,1
13,4                    0,1
                        1,3 < this is added because it's next to 1,1
Keep repeating until nothing new is added.  Then process the result, getting
a value of 0.75,1.25 as the centroid.
Clear the second list and move the first element of the original list across
25,1                    12,5
13,4
And so on.....

There can only be one solution doing it the way I'm describing.  Hopefully
the example above illustrates that.  It doesn't matter where you start,
since you keep iterating through until nothing is sufficiently close to the
point list you have generated.  In your example below you've only gone
through the process once.
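
A rough sketch of the grouping described above, written as a queue-based
search instead of repeated passes over the whole list. It is not the script
from the original post: the sub name group_points, the [x, y, ref] layout and
the one-letter references are assumptions for illustration. It uses the same
"both |dx| and |dy| within the threshold" test, and buckets the points into
cells of side $dist so that each point is only compared with points in the
neighbouring cells.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# $points: array ref of [x, y, ref]; $dist: threshold (must be > 0).
# Returns a list of groups, each an array ref of points.
sub group_points {
    my ($points, $dist) = @_;

    # Bucket point indices into square cells of side $dist.
    my %cell;
    for my $i (0 .. $#$points) {
        my ($cx, $cy) = map { floor($_ / $dist) } @{ $points->[$i] }[0, 1];
        push @{ $cell{"$cx,$cy"} }, $i;
    }

    my (@seen, @groups);
    for my $start (0 .. $#$points) {
        next if $seen[$start]++;                 # already in some group
        my @queue = ($start);
        my @group;
        while (defined(my $i = shift @queue)) {
            push @group, $points->[$i];
            my ($x, $y) = @{ $points->[$i] };
            my ($cx, $cy) = (floor($x / $dist), floor($y / $dist));
            # Only the 3x3 block of cells around the point can hold neighbours.
            for my $nx ($cx - 1 .. $cx + 1) {
                for my $ny ($cy - 1 .. $cy + 1) {
                    for my $j (@{ $cell{"$nx,$ny"} || [] }) {
                        next if $seen[$j];
                        my $q = $points->[$j];
                        next if abs($x - $q->[0]) > $dist
                             or abs($y - $q->[1]) > $dist;
                        $seen[$j] = 1;
                        push @queue, $j;
                    }
                }
            }
        }
        push @groups, \@group;
    }
    return @groups;
}

# The list from the example above, with made-up references a..g.
my @points = ([1,0,'a'], [1,1,'b'], [12,5,'c'], [25,1,'d'],
              [0,1,'e'], [13,4,'f'], [1,3,'g']);
for my $g (group_points(\@points, 2)) {
    my ($sx, $sy) = (0, 0);
    ($sx, $sy) = ($sx + $_->[0], $sy + $_->[1]) for @$g;
    printf "%s -> %.2f,%.2f\n", join(' ', map { $_->[2] } @$g),
        $sx / @$g, $sy / @$g;
}
```

Each point is visited once and compared only with nearby points, so the
running time should grow far more slowly than with the nested loops and
splices, at least when the points are not all crammed into a few cells.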

Andrew

"Jonas Nilsson" <jonni@ifm.liu.se> wrote in message
news:9aa9ba$bc8$1@newsy.ifm.liu.se...
> > > I am asking because it seems to me that your problem doesn't have a
> > > unique solution.  Depending on which elements you merge first you may
> > > end up with different patterns of which elements make up which of the
> > > final entries.
> >
> > There should be one unique solution.  Hopefully the clarification above
> > should indicate why.
> >
>
> There isn't a unique solution using this code, as Anno says. Example:
> Threshold distance<=2
> points=(0,0) (0,2) (0,3)
>
> Replacing from the beginning gives:
> points=(0,0) (0,2) (0,3) => points=(0,1)          (0,3)
> points=(0,1)          (0,3) => points=(0,2)
> Results in all points within distance, result: centers=(0,2)
>
> Working from end gives:
> points=(0,0) (0,2) (0,3) => points=(0,1) (0,2.5)
> Results in two different clusters, result: centers=(0,1) (0,2.5)
>
> So you must specify the algorithm more before an optimization can be done...
> /jN
>
>
>
> --
> _______________________________
> Jonas Nilsson
>
>




------------------------------

Date: Mon, 2 Apr 2001 18:06:43 +0000 (UTC)
From: abigail@foad.org (Abigail)
Subject: Re: Script optimization question
Message-Id: <slrn9chftj.pp.abigail@tsathoggua.rlyeh.net>

Anno Siegel (anno4000@lublin.zrz.tu-berlin.de) wrote on MMDCCLXXI
September MCMXCIII in <URL:news:9aa4rn$r0g$1@mamenchi.zrz.TU-Berlin.DE>:
'' According to Andrew <andrew@mvt.ie>:
'' > The problem that I'm trying to solve is this;
'' > I have a file, each line of which contains an x coordinate, a y coordinate
'' > and a unique reference.  I want to be able to specify a distance threshold
'' > and have all items in the list which are closer to each other than this
'' > threshold merged together.  The solution I have come up with is as follows;
'' 
'' The problem is probably covered (ha ha) in computational geometry.
'' Abigail?


Probably. I'd say "All Nearest Neighbours" and "Voronoi Diagrams", but
the problem isn't specified clearly, so I can't say more.



Abigail
-- 
sub f{sprintf$_[0],$_[1],$_[2]}print f('%c%s',74,f('%c%s',117,f('%c%s',115,f(
'%c%s',116,f('%c%s',32,f('%c%s',97,f('%c%s',0x6e,f('%c%s',111,f('%c%s',116,f(
'%c%s',104,f('%c%s',0x65,f('%c%s',114,f('%c%s',32,f('%c%s',80,f('%c%s',101,f(
'%c%s',114,f('%c%s',0x6c,f('%c%s',32,f('%c%s',0x48,f('%c%s',97,f('%c%s',99,f(
'%c%s',107,f('%c%s',101,f('%c%s',114,f('%c%s',10,)))))))))))))))))))))))))


------------------------------

Date: 2 Apr 2001 23:09:31 GMT
From: damian@qimr.edu.au (Damian James)
Subject: Re: Script or Application
Message-Id: <slrn9ci1jn.eu5.damian@puma.qimr.edu.au>

Tom chose Mon, 02 Apr 2001 21:30:10 GMT to say this:
>I have a question... should a perl program be called a script or an
>application ?
>

Really, it's whatever you want to call it. Language is arbitrary, there are
no ontological links between words and their meanings, merely conventions.
Of course you only annoy and/or confuse people by ignoring conventions...

That said, $high_horse->mount(<<DONE_RAMBLING);

The only 'scripting languages' are shells. To refer to every Perl program
as a 'script' because you believe that Perl is a 'scripting language' is
the act of a {pick suitable term of invective, so long as it carries the
connotation of some sort of mental deficit}.

A script is what actors follow, and, depending on your regional usage,
what your doctor writes to your pharmacist. Fit those analogies in to
however you look at your program. My take is that if it uses anything
other than external programs and basic flow control, then it's a 
*program*, dammit! 

DONE_RAMBLING

Cheers,
Damian
-- 
@:=grep!($;+=m!$/|#!),split//,<DATA>;@;=0..$#:;while(@;){for($;=@;;--$;;){;(
$:=rand$;+$|)==$;&&next;@;[$;,$:]=@;[$:,$;]}push@|,shift@;if$;[0]==@|;select
$,,$,,$,,1/80;print qq x\bxx((@;+@|)*$|++),@:[@|,@;],!@;&&$/} __END__
Just another Perl Hacker # rev 3 -- a JAPH in progress, I guess...


------------------------------

Date: Mon, 02 Apr 2001 20:40:18 +0100
From: "wayne.keenan" <wayne.keenan@ntlworld.com>
Subject: Re: simultaneously open file handles -- limit?
Message-Id: <3AC8D5A2.CF3EB48C@ntlworld.com>

perldoc FileCache


B McDonald wrote:

> Hi. Would someone please tell me if there is a limit on the number of
> simultaneously open file handles? I have a script that processes XML files
> and generates a number of CSV output files for bulk load into SQL 7. I was
> generating 10 of these "slurp" files, but then, when I added two new files
> to the total, file 12 was not getting written to... and it looked as if some
> data was not getting written to other files as well.
>
> I've checked the ActivePerl (5.6.0.613 under Win98) documentation and don't
> see any info on the max number of files that may be simultaneously open for
> writing. Does anyone have any information/experience on this?
>
> Thanks,
>
> Brian



------------------------------

Date: Mon, 02 Apr 2001 17:57:30 -0700
From: Me <moiraine@qwest.net>
Subject: Re: SMTP Connections
Message-Id: <3AC91FFA.C3B8E1D8@qwest.net>

Yes, but what's a good way to receive and store mail in $user_mailbox without
having the user be listed under /etc/passwd?

Lots of info on sending....none on receiving....of course....that's not perl....but
what the hell, I'm not getting an answer anywhere, so I might as well try here.


ted wrote:

> Good answer!
>
> ted
>
> Ilmari Karonen wrote:
>
> > In article <9a6vcq$g8c$1@news.netmar.com>, peter.reid2000@ntlworld.com wrote:
> > >Guys, I need some help here. Can any1 give me a COMPLETE walkthrough on how
> > >to send email from Perl.
> >
> >  1. Open a web browser.
> >
> >  2. Go to "http://search.cpan.org/".
> >
> >  3. Click on the link "Mail and Usenet News".
> >
> >  4. Click on the link "MailTools".
> >
> >  5. Click on the link "Mail::Send".
> >
> >  6. Read.
> >
> > There are, of course, many ways to do it.  You could, for example, use
> > Net::SMTP directly -- in fact, I might prefer doing it that way.  Or you
> > could open a pipe to sendmail or some other mailer.  Or you could look
> > more closely at the module list between steps 3 and 4 to see what other
> > modules have been written for this purpose.
> >
> > --
> > Ilmari Karonen - http://www.sci.fi/~iltzu/
> > Please ignore Godzilla / Kira -- do not feed the troll.

--
Geekette

"Try Not.  Do or do not.  There is no try."
-If you don't know who said this,
I don't want to talk to you. ;-)

"Nothing is impossible, no matter how improbable."
-Anonymous




------------------------------

Date: Sat, 31 Mar 2001 06:47:31 GMT
From: tiltonj@erols.com (Jay Tilton)
Subject: Re: sort array contents from file question
Message-Id: <3ac57cc3.170841288@news.erols.com>

On Sat, 31 Mar 2001 00:28:17 -0600, CoralBanded
<whynot117@hotmail.com> wrote:

>------------------------------------------ now I want the foreach loop
>to work on a sorted @lines list but the following methods do not work,
>the foreach line fails :

Fails in what way?  Dies with an error?  Sorts poorly?  Power supply
catches fire?

>open(FILE,"$filename");
> foreach $line (@sortedlist) {
>   ...
>}

Should work.

> foreach $line (sort @lines) {
>   ...
>}

Ditto.

>open(FILE,"$filename");

You're not checking the return value.  Are you certain the file is
being read?
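
A minimal sketch of checking it, assuming the goal is simply to read the
lines and loop over them sorted. The sub name read_lines and taking the file
name from the command line are made up for illustration.

```perl
use strict;
use warnings;

# Read all lines of a file, dying with the OS error if it can't be opened.
sub read_lines {
    my ($filename) = @_;
    open(my $fh, '<', $filename) or die "Can't open '$filename': $!";
    my @lines = <$fh>;
    close $fh;
    return @lines;
}

my $filename = shift @ARGV;          # hypothetical: file named on the command line
if (defined $filename) {
    my @lines = read_lines($filename);
    foreach my $line (sort @lines) {
        print $line;
    }
}
```

If the open fails, the message in $! usually says why (no such file,
permission denied, and so on), which answers the "fails in what way?"
question directly.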


------------------------------

Date: Fri, 30 Mar 2001 23:14:20 -0800
From: "Thomas Theakanath" <thomastk@prodigy.net>
Subject: Strings in XML files and Unicode
Message-Id: <9a3veg$6r1e$1@newssvr05-en0.news.prodigy.com>

Hi Netters,

I have a problem retrieving strings correctly from XML files. If anybody has
done similar to what I am trying to do, please give some suggestions. I am
kind of clueless..

The XML file contains non-ASCII characters in the text nodes and encoding in
XML is defined as "ISO-8859-1". These files are created using regular Perl
print statements. The text strings are read using DOM methods. The output of
these methods seems to be in UTF-8 format. I tried to convert it back to
8859-1 (Latin 1) format using the following transliteration:
$str =~ tr/\0-\x{FF}//UC;
But some of the characters are not converted properly. And that is the
problem.

The Perl scripts are run on a  Perl, v5.6.0 for sun4-solaris installation. I
think, these scripts worked as expected on a Windows NT installation. So, I
don't know if the problem is OS specific.

Anyways, if somebody has done anything similar to this, please give me some
pointers on what I am doing wrong here. Can I use Unicode::Map8 or
Unicode::String to fix this problem? Also, please give me some insights into
how Perl stores non-ASCII characters in text files and how XML files are
handled within XML::DOM module.

Thanks in advance
Thomas.




------------------------------

Date: Sat, 31 Mar 2001 05:09:18 GMT
From: Rick Delaney <rick.delaney@home.com>
Subject: Summary: closures and //o (was Re: regex-qr// for search and replace)
Message-Id: <3AC569BB.49AAC734@home.com>

[posted to clpmisc and cc'ed p5p]

This is supposed to be a summary but it's kinda long.

Bart Lateur wrote:
> 
>         sub regexsub {
>             my $re = shift;
>             return sub {
>                 shift =~ /$re/o;
>             }
>         }
>         $sub[0] = regexsub('fo+');
>         $sub[1] = regexsub('ba+r');
>         
>         foreach (qw'foo baaar fooooo bar bbbbbbbbar') {
>            foreach my $i(0, 1) {
>               $sub[$i]->($_) and print "Match for '$_' with regex $i\n";
>            }
>         }
> -->
>         Match for 'foo' with regex 0
>         Match for 'baaar' with regex 1
>         Match for 'fooooo' with regex 0
>         Match for 'bar' with regex 1
>         Match for 'bbbbbbbbar' with regex 1
> 
> In pre-5.6.0, //o and closures don't work too well together. There,
> you'd have to put the closure (the sub that gets returned by regexsub)
> in an eval STRING construct.

I wrote that I get

Match for 'foo' with regex 0
Match for 'foo' with regex 1
Match for 'fooooo' with regex 0
Match for 'fooooo' with regex 1

with my copy of 5.6.0 (and up).

Bart wrote:
> Gee. It works properly in ActivePerl 5.6.0 Build 623 for
> MSWin32-x86-multi-thread,
 ...
> Could it be in the patches for multi-thread support?

Yes!

I recompiled my copy of perl-5.6.0 to include threads and it now 
exhibits the same behaviour as ActivePerl.  I also downloaded and 
compiled ActivePerl without threads and it only gives me the /fo+/ 
matches.

But...

It appears that multi-thread support has actually broken /o which is
the only reason it "works".

$ perl -le 'for (qw/foo bar baz/) {print if /$_/o }'
foo 
$ perl -v           

This is perl, v5.6.0 built for i686-linux
 ... 
$ as_perl -le 'for (qw/foo bar baz/) {print if /$_/o }'
foo
bar
baz
$ as_perl -v

This is perl, v5.6.0 built for i686-linux-thread-multi
(with 1 registered patch, see perl -V for more detail) 
 ...

When run under -Mre=debug you can see that the regexp is compiled
3 times with threads but only once without.

Maybe this is to be expected?

Also see bug ID 20000810.008 and

http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2000-08/msg00770.html
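
For comparison, a sketch of the same closure built on qr// rather than //o,
as the original thread subject hints. It is offered as one way around the
problem, not as what either build of perl does with //o. Each closure holds
its own compiled pattern, so there is no per-op "compile once" flag for two
closures to fight over.

```perl
use strict;
use warnings;

# Compile the pattern once per closure with qr//; no /o involved.
sub regexsub {
    my $re = qr/$_[0]/;
    return sub { $_[0] =~ $re };
}

my @sub = (regexsub('fo+'), regexsub('ba+r'));
foreach my $word (qw(foo baaar fooooo bar bbbbbbbbar)) {
    foreach my $i (0, 1) {
        print "Match for '$word' with regex $i\n" if $sub[$i]->($word);
    }
}
```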

-- 
Rick Delaney
rick.delaney@home.com


------------------------------

Date: Mon, 02 Apr 2001 09:18:18 -0400
From: Daniel P Maloney <Daniel.Maloney@bms.com>
Subject: Re: Taint problem?
Message-Id: <3AC87C1A.4B23B551@bms.com>

Tad McClellan wrote:

> >> What would cause this snippet to work fine in the shell but fail
> >> within a setuid script?
> 
> [snip code]
> 
> >> When I run it in my program, I get nothing
> >> Perl isn't complaining about insecurities.
> 
> I assume then, that you've looked at the server logs?

I hadn't when I posted, but here's the error log:
 
[Sun Apr  1 04:02:03 2001] [notice] Apache/1.3.12 (Unix)  (Red Hat/Linux) ApacheJServ/1.1.1 PHP/3.0.18 mod_perl/1.21 configured -- resuming normal operations

[Mon Apr  2 08:37:46 2001] .qc_upload.cgi: Name "main::i" used only once: possible typo at /home/httpd/cgi-bin/qc/.qc_upload.cgi line 29.
                           ^^^^^^^^^^^^^^
Note that I am running setuid: the script is wrapped with wrapsuid,
mode 6775, owner=root, group="systems".

[/home/systems/microbial/data/tmp.zip]
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
zipinfo:  cannot find zipfile directory in one of /home/systems/microbial/data/tmp.zip or
          /home/systems/microbial/data/tmp.zip.zip, and cannot find /home/systems/microbial/data/tmp.zip.ZIP, period.

> >Here are some suggestions.
> >
> >1. From your program, open and close /tmp/xxxx. In Linux, look at the owner
> >of /tmp/xxx.

[root@zipper] ls -l /tmp/foo.bar
-rw-r--r--    1 root     systems         0 Apr  2 08:51 /tmp/foo.bar

Same output when run from shell or cgi.
 
> Or, do it from Perl:
> 
>    perl -e 'print "I am running setuid\n" if $< != $>'

It's definitely running setuid from cgi, but not from the shell. 
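
A slightly longer version of that check, as a rough sketch only: it
reports the real and effective IDs on STDERR, which under CGI normally
ends up in the server error log. The variable names are mine; since the
wrapper is mode 6775 (setuid and setgid), the group IDs are compared as
well.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# $< is the real UID and $> the effective UID. $( and $) hold
# space-separated GID lists whose first entry is the real and
# effective GID respectively.
my ($real_uid, $eff_uid) = ($<, $>);
my ($real_gid) = split ' ', $(;
my ($eff_gid)  = split ' ', $);

my $report = sprintf "uid real=%d effective=%d, gid real=%d effective=%d",
    $real_uid, $eff_uid, $real_gid, $eff_gid;

# The process is running set-id if either pair differs.
my $is_setid = ($real_uid != $eff_uid) || ($real_gid != $eff_gid);

print STDERR "$report\n";
print STDERR ($is_setid ? "running set-id\n" : "not running set-id\n");
```

Running the same lines from the shell and from the CGI wrapper gives two
reports that can be compared side by side.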
================================================================
 Dan Maloney                             daniel.maloney@bms.com
 Instrumentation Technologies Group                203.677.7135
 Discovery Technologies		              fax: 203.677.6417
 Bristol-Myers Squibb Co.


------------------------------

Date: Mon, 02 Apr 2001 09:19:21 -0400
From: Daniel P Maloney <Daniel.Maloney@bms.com>
Subject: Re: Taint problem?
Message-Id: <3AC87C59.1F65538F@bms.com>

Ilmari Karonen wrote:
> 
> In article <3AC4F261.1C208527@bms.com>, Daniel P Maloney wrote:
> >What would cause this snippet to work fine in the shell but fail
> >within a setuid script?
> >
> >$file = shift;
> 
> Okay, obvious question: Is that shifting from @ARGV or from @_, and are
> you certain it yields the same value in both scripts?

@ARGV
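
To make the distinction in the question concrete, a small sketch (the
sub name and the fallback path are made up for illustration): a bare
shift at file scope takes from @ARGV, while the same statement inside a
sub takes from @_. Printing the value on STDERR makes it easy to compare
what the script receives from the shell and from the CGI wrapper.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in argument so the sketch runs without a command line.
@ARGV = ('/tmp/example.zip') unless @ARGV;

sub first_of_args {
    my $file = shift;    # inside a sub: shifts @_
    return $file;
}

my $from_sub  = first_of_args('from-the-caller');
my $from_argv = shift;   # at file scope: shifts @ARGV

# Show exactly what arrived, with brackets to expose stray whitespace.
print STDERR "from sub:  [$from_sub]\n";
print STDERR "from ARGV: ",
    (defined $from_argv ? "[$from_argv]" : "(undef)"), "\n";
```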

================================================================
 Dan Maloney                             daniel.maloney@bms.com
 Instrumentation Technologies Group                203.677.7135
 Discovery Technologies		              fax: 203.677.6417
 Bristol-Myers Squibb Co.


------------------------------

Date: 16 Sep 99 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 16 Sep 99)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

| NOTE: The mail to news gateway, and thus the ability to submit articles
| through this service to the newsgroup, has been removed. I do not have
| time to individually vet each article to make sure that someone isn't
| abusing the service, and I no longer have any desire to waste my time
| dealing with the campus admins when some fool complains to them about an
| article that has come through the gateway instead of complaining
| to the source.

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending Perl questions to the -request address; I don't have time to
answer them, even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 605
**************************************

