[24549] in Perl-Users-Digest
Perl-Users Digest, Issue: 6727 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Jun 25 00:05:36 2004
Date: Thu, 24 Jun 2004 21:05:06 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Thu, 24 Jun 2004 Volume: 10 Number: 6727
Today's topics:
Re: Perl Programmers Needed 4-6 month contract position <dha@panix2.panix.com>
Trim Multiple Dirs to Max Total Space Used - by Date <heiby_u@falkor.chi.il.us>
Re: Trim Multiple Dirs to Max Total Space Used - by Dat <jurgenex@hotmail.com>
Trying to write my first Regex's <ducott@hotmail.com>
Re: Trying to write my first Regex's <invalid-email@rochester.rr.com>
Re: Trying to write my first Regex's <bigiain@mightymedia.com.au>
Re: Trying to write my first Regex's <ducott@hotmail.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Fri, 25 Jun 2004 03:54:36 +0000 (UTC)
From: "David H. Adler" <dha@panix2.panix.com>
Subject: Re: Perl Programmers Needed 4-6 month contract position
Message-Id: <slrncdn8fs.1sa.dha@panix2.panix.com>
On 2004-06-24, Brigitte <brigitte.wenzel@modisit.com> wrote:
>
> My client is looking for consultants with experience in Perl
You have posted a job posting or a resume in a technical group.
Longstanding Usenet tradition dictates that such postings go into
groups with names that contain "jobs", like "misc.jobs.offered", not
technical discussion groups like the ones to which you posted.
Had you read and understood the Usenet user manual posted frequently to
"news.announce.newusers", you might have already known this. :) (If
n.a.n is quieter than it should be, the relevant FAQs are available at
http://www.faqs.org/faqs/by-newsgroup/news/news.announce.newusers.html)
Another good source of information on how Usenet functions is
news.newusers.questions (information from which is also available at
http://www.geocities.com/nnqweb/).
Please do not explain your posting by saying "but I saw other job
postings here". Just because one person jumps off a bridge, doesn't
mean everyone does. Those postings are also in error, and I've
probably already notified them as well.
If you have questions about this policy, take it up with the news
administrators in the newsgroup news.admin.misc.
http://jobs.perl.org may be of more use to you.
Yours for a better usenet,
dha
--
David H. Adler - <dha@panix.com> - http://www.panix.com/~dha/
"...for all you know we're a bunch of malcontents who couldn't get
sci.corned-beef, and are going to reject all the submitted articles
that aren't about corned beef." - Mark-Jason Dominus
------------------------------
Date: Fri, 25 Jun 2004 03:43:53 GMT
From: Ron Heiby <heiby_u@falkor.chi.il.us>
Subject: Trim Multiple Dirs to Max Total Space Used - by Date
Message-Id: <jd7nd0pheg4881cf1tgpsnvculfcrn7m26@4ax.com>
Hi! I've done a lot of FAQ reading and Google-ing and reading in O'Reilly books, but
I'm still stuck.
I have a system where data files are created in multiple directories. I need to run a
daily script that will total the disk space used by all the files in all the
directories and see whether the space exceeds some MAXSPACE value. In this case, all
but one of the directories are subdirectories of a common parent dir, while the other
one is off on its own. If the space does exceed the maximum, I need to start deleting
files, oldest first, until the total space used drops just below the maximum.
I've been looking at File::Find, and File::stat, among others, but don't quite see how
this all can be hung together to accomplish this seemingly simple task.
Any help would be much appreciated. Thanks!
P.S. I'll be looking for responses here. If using Email, remove the "_u" from my name
to avoid getting shuffled into an infrequently perused mailbox.
--
Ron.
------------------------------
Date: Fri, 25 Jun 2004 04:02:15 GMT
From: "Jürgen Exner" <jurgenex@hotmail.com>
Subject: Re: Trim Multiple Dirs to Max Total Space Used - by Date
Message-Id: <btNCc.26712$a61.5805@nwrddc01.gnilink.net>
Ron Heiby wrote:
> Hi! I've done a lot of FAQ reading and Google-ing and reading in
> O'Reilly books, but I'm still stuck.
>
> I have a system where data files are created in multiple directories.
> I need to run a daily script that will total the disk space used by
> all the files in all the directories and see whether the space
> exceeds some MAXSPACE value. In this case, all but one of the
> directories are subdirectories of a common parent dir, while the
> other one is off on its own. If the space does exceed the maximum, I
> need to start deleting files, oldest first, until the total space
> used drops just below the maximum.
>
> I've been looking at File::Find, and File::stat, among others, but
> don't quite see how this all can be hung together to accomplish this
> seemingly simple task.
I would attack the problem in four steps:
First loop through all the directories to create an internal array of all
files which you are interested in. Forget File::Find, you don't need it
because you already have the comprehensive list of all directories.
For your purposes a file consists of the name including the full path, the
file size, and the date.
The obvious data structure would be an array of hashes, where each hash
contains three items, namely the fully qualified file name, the size, and
the date.
In step two you simply add all the sizes to determine your total used space.
Or you can do that while collecting the files in step 1 already.
Then sort the array by the date element.
And then beginning with the oldest file delete files (you got the fully
qualified name in the hash) until the added size of all deleted files is
larger than the difference between desired size and actual size as
determined in step 2.
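The four steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a finished script: the sub name files_to_trim, the directory names, and the byte limit are made up for illustration; it only looks at plain files directly inside each listed directory; and it counts file length as reported by stat, which may differ from the disk blocks actually allocated. It returns the names of the files to delete, oldest first, rather than deleting them, so the list can be inspected before anything is removed.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the files (oldest first) that would have to
# go so that the total size no longer exceeds $max_bytes.
sub files_to_trim {
    my ($max_bytes, @dirs) = @_;

    # Step 1: one hash per plain file: full path, size, modification time.
    my @files;
    for my $dir (@dirs) {
        opendir(my $dh, $dir) or die "Cannot open $dir: $!";
        for my $entry (readdir $dh) {
            my $path = "$dir/$entry";
            next unless -f $path;    # skips ".", ".." and subdirectories
            my ($size, $mtime) = (stat $path)[7, 9];
            push @files, { name => $path, size => $size, mtime => $mtime };
        }
        closedir $dh;
    }

    # Step 2: total space used.
    my $total = 0;
    $total += $_->{size} for @files;

    # Step 3: sort by date, oldest first.
    my @by_age = sort { $a->{mtime} <=> $b->{mtime} } @files;

    # Step 4: pick files until the total no longer exceeds the limit.
    my @doomed;
    for my $file (@by_age) {
        last if $total <= $max_bytes;
        push @doomed, $file->{name};
        $total -= $file->{size};
    }
    return @doomed;
}

# Usage (paths and limit are placeholders):
# for my $file (files_to_trim(500 * 1024 * 1024, '/data/a', '/data/b', '/other')) {
#     unlink $file or warn "Could not delete $file: $!";
# }
```

Keeping the selection separate from the unlink call makes a dry run trivial, which seems prudent for a daily job that deletes data.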
jue
------------------------------
Date: Fri, 25 Jun 2004 02:05:28 GMT
From: "Robert TV" <ducott@hotmail.com>
Subject: Trying to write my first Regex's
Message-Id: <ILLCc.847056$Pk3.308032@pd7tw1no>
Hi, I am trying to learn the fine points of writing correct regexes to
untaint my data. I have gone through a few tutorials and I have a very basic
idea of how they operate. I would like some assistance writing them
correctly.
Example 1
$name = "Jimmy Spenser";
# allow $name to only have letters or spaces by filtering out unwanted junk
if ($name =~ /\d|[\!\@\#\$\%\^\&\*\(\)\-\=\_\+]/) {
    print "Bad";
} else {
    print "Good";
}
I'm sure the above is sloppy and right now you're laughing. Also, there are
other characters that exist that were not included in the filter. It was my
goal to filter out any digits "\d" and all the trailing characters. I tried
$name =~ /\W/ but that wouldn't allow spaces. What is the best way to allow
$name to contain only letters (of either case) or spaces?
Example 2
$address = "#12 - 4243 Jones Street.";
# allow $address to only have letters, digits, the # sign or spaces by
# filtering out unwanted junk
if ($address =~ /[\!\@\$\%\^\&\*\(\)\-\=\_\+]/) {
    print "Bad";
} else {
    print "Good";
}
Now my filter needs to allow digits and the # sign as well as letters,
periods, spaces, etc. Is there a better way to write these filters so that
I can "define" what I consider allowable instead of filtering out what is
bad? For instance, $name is allowed to have digits, letters, the number
sign, periods, and spaces but does not HAVE to contain them; any other
character would be detected as bad.
My end goal is to create a web form that is secure because it does not
allow bad stuff.
Thank you all
Robert
------------------------------
Date: Fri, 25 Jun 2004 03:22:23 GMT
From: Bob Walton <invalid-email@rochester.rr.com>
Subject: Re: Trying to write my first Regex's
Message-Id: <40DB9A6D.60107@rochester.rr.com>
Robert TV wrote:
> Hi, I am trying to learn the fine points of writing correct regex's to
> untaint my data. I have gone through a few tutorials and I have a very basic
> idea of their operations. I would like some assistance writing them
> correctly.
>
> Example 1
>
> $name = "Jimmy Spenser";
> # allow $name to only have letters or spaces by filtering out unwanted junk
> if ($name =~ /\d|[\!\@\#\$\%\^\&\*\(\)\-\=\_\+]/;) {
You'd better carefully read and study "perldoc perlre" -- that regexp
isn't even close. It will match any string containing anywhere in it
one of the characters: a digit, !, @, #, $, %, ^, &, *, (, ), -, =, _,
+, but will fail to match many many other characters you probably don't
want either, like all the control characters, ~, `, [, {, |, \, etc etc.
If you wanted to match any string which contains a character that is
not a letter or whitespace, you might try:
if($name =~ /[^a-z\s]/i){
But warning: that is not how to untaint stuff. Keep reading.
> print "Bad"
> } else {
> print "Good";
> }
>
Well, you want to design a regexp that will allow only what you want,
not one that disallows specific stuff -- if you happen to neglect a
disallow item, it would get through. So to have a regexp that matches
only on all letters or whitespace, try:
if ($name =~ /^[a-z\s]*$/i) {
    print "Good\n";
}
else {
    print "Bad\n";
}
In that regexp, the /i switch is used on the end to make it case
insensitive (saves making the character class [a-zA-Z\s]). The ^
anchors the start of the match at the beginning of the string so
something like ***blah won't match, and the $ anchors the end of the
match at the end of the string so something like blah*** won't match.
Note that \s is a code for a regexp that matches any one single
whitespace character.
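To see the effect of the anchors and the /i switch, here is a small sketch that runs the same pattern over a few sample strings. The sub name is_letters_and_spaces and the sample values are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 only if the whole string is made of
# ASCII letters and whitespace, otherwise 0.
sub is_letters_and_spaces {
    my ($text) = @_;
    return ($text =~ /^[a-z\s]*$/i) ? 1 : 0;
}

for my $candidate ('Jimmy Spenser', '***blah', 'blah***', 'R2D2', '') {
    my $verdict = is_letters_and_spaces($candidate) ? 'Good' : 'Bad';
    printf "%-17s %s\n", "'$candidate'", $verdict;
}
```

Only 'Jimmy Spenser' and the empty string come out as Good. The empty string passes because * permits zero characters; using + instead of * would require at least one character.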
You should also read up on tainting (perldoc perlsec) where you will
learn that you need to assign a variable's value from one of the $1, $2
etc variables which result from a successful pattern match from a regexp
containing parentheses groupings. This means something like:
...
if ($name =~ /^([a-z\s]*)$/i) {
    $name = $1;    # $name is now untainted
}
else {
    die "\$name had a bad value which I refuse to untaint: $name";
}
...
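The capture-and-assign idiom can be wrapped in a small sub so that each form field goes through one checked path. This is only a sketch: the name untaint_name is made up, and the data is only actually tainted when the script runs under taint mode (perl -T); without -T the code behaves the same but there is no taint flag to clear.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the captured copy of the value (which is
# what Perl treats as untainted under -T), or undef if the value contains
# anything other than ASCII letters and whitespace.
sub untaint_name {
    my ($raw) = @_;
    if ($raw =~ /^([a-z\s]*)$/i) {
        return $1;
    }
    return;
}

my $name = untaint_name('Jimmy Spenser');
defined $name or die "Refusing to untaint the name";
print "Accepted: $name\n";
```

Returning undef instead of dying inside the sub leaves the caller free to decide whether a bad value is fatal or should just produce an error message on the form.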
> Im sure the above is sloppy and right now your laughing. Also there are
> other charaters that exist that were not included in the filter. It was my
> goal to filter out and digits "\d" and all the trailing characters. I tried
> $name =~ /\W/ but that wouldn't allow spaces. What is the best was to allow
> $name to only have any case letters or spaces?
>
> Example 2
>
> $address = "#12 - 4243 Jones Street.";
> # allow $address to only have letters, digits, the # sign or spaces by
> filtering out unwanted junk
> if ($name =~ /[\!\@\$\%\^\&\*\(\)\-\=\_\+]/;) {
> print "Bad"
> } else {
> print "Good";
> }
>
Again, write a regexp to match only on what you *want to permit*, like:
if ($address =~ /^([a-z\d#\s]*)$/i) {
    $address = $1;    # $address now untainted
}
else {
    die "I refuse to untaint this tainted crap: $address";
}
I note, though, that this will fail on your example string because it
contains a period and a hyphen, neither of which is among your defined
permitted characters above.
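If periods and hyphens are meant to be allowed in addresses, they can simply be added to the character class. A sketch, assuming that is what is wanted (the sub name untaint_address is made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: permits ASCII letters, digits, '#', '.', '-' and
# whitespace. The hyphen is placed last in the class so it is taken
# literally rather than as a range operator.
sub untaint_address {
    my ($raw) = @_;
    if ($raw =~ /^([a-z\d#.\s-]*)$/i) {
        return $1;
    }
    return;
}

my $address = untaint_address('#12 - 4243 Jones Street.');
print defined $address ? "Good: $address\n" : "Bad\n";
```

Every character added to the class widens what gets through, so each addition deserves the same scrutiny as the original list.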
> Now my filter needs to allow digits and the # sign as well as letters and
> periods and spaces etc. Is there a way to better write these filters so that
> I can "define" what I consider allowable instead of filtering out what is
> bad? $name is allowed to have for instance /digits/letters/number
> sign/period/spaces/ but does not HAVE to contain them, any other charater
> would be detected as bad.
>
> My end goal will be creating a web form that will be secsure by not allowing
> bad stuff.
An admirable goal. Be sure to very carefully think through what you
permit, as making a bad decision in your untainting regexp can leave
security holes. Just the fact that Perl considers the data to be
untainted does not mean it is secure -- that is up to your regexp. Perl
helps you a lot by letting you know it is certain that you did pass the
data through an untainting regexp.
...
> Robert
--
Bob Walton
Email: http://bwalton.com/cgi-bin/emailbob.pl
------------------------------
Date: Fri, 25 Jun 2004 13:30:09 +1000
From: Iain Chalmers <bigiain@mightymedia.com.au>
Subject: Re: Trying to write my first Regex's
Message-Id: <bigiain-8520E5.13300925062004@individual.net>
In article <ILLCc.847056$Pk3.308032@pd7tw1no>,
"Robert TV" <ducott@hotmail.com> wrote:
> Hi, I am trying to learn the fine points of writing correct regex's to
> untaint my data. I have gone through a few tutorials and I have a very basic
> idea of their operations. I would like some assistance writing them
> correctly.
>
> Example 1
>
> $name = "Jimmy Spenser";
> # allow $name to only have letters or spaces by filtering out unwanted junk
> if ($name =~ /\d|[\!\@\#\$\%\^\&\*\(\)\-\=\_\+]/;) {
> print "Bad"
> } else {
> print "Good";
> }
>
> Im sure the above is sloppy and right now your laughing. Also there are
> other charaters that exist that were not included in the filter. It was my
> goal to filter out and digits "\d" and all the trailing characters. I tried
> $name =~ /\W/ but that wouldn't allow spaces. What is the best was to allow
> $name to only have any case letters or spaces?
Note the ^ as the first character in a character class negates the
class, so:
if ($name =~ /[^A-Za-z ]/) { print "Bad"}
means "if $name contains anything that's not in [A-Za-z ]"
>
> Example 2
>
> $address = "#12 - 4243 Jones Street.";
> # allow $address to only have letters, digits, the # sign or spaces by
> filtering out unwanted junk
> if ($name =~ /[\!\@\$\%\^\&\*\(\)\-\=\_\+]/;) {
> print "Bad"
> } else {
> print "Good";
> }
if ($address=~ /[^0-9A-Za-z#. ]/) { print "Bad"}
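The same negated class can also report which characters were rejected, which may be handy for a form error message. A sketch, with a made-up sub name (bad_chars):

```perl
use strict;
use warnings;

# Hypothetical helper: returns each disallowed character once, in order
# of first appearance. An empty list means the string is acceptable.
sub bad_chars {
    my ($text) = @_;
    my %seen;
    return grep { !$seen{$_}++ } $text =~ /([^0-9A-Za-z#. ])/g;
}

my @bad = bad_chars('#12 - 4243 Jones Street.');
print @bad ? "Bad: @bad\n" : "Good\n";
```

Note that with this class the sample address is still reported as bad, because the hyphen is not among the permitted characters.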
>
> Now my filter needs to allow digits and the # sign as well as letters and
> periods and spaces etc. Is there a way to better write these filters so that
> I can "define" what I consider allowable instead of filtering out what is
> bad? $name is allowed to have for instance /digits/letters/number
> sign/period/spaces/ but does not HAVE to contain them, any other charater
> would be detected as bad.
See character classes in perlre
perldoc perlre
cheers,
big
--
"I ran out of gas! I got a flat tire! I didn't have change for cab fare!
I lost my tux at the cleaners! I locked my keys in the car! An old friend
came in from out of town! Someone stole my car! There was an earthquake!
A terrible flood! Locusts! It wasn't my fault I swear to god!" Jake Blues
------------------------------
Date: Fri, 25 Jun 2004 04:04:55 GMT
From: "Robert TV" <ducott@hotmail.com>
Subject: Re: Trying to write my first Regex's
Message-Id: <HvNCc.892644$Ig.317644@pd7tw2no>
"Bob Walton" <invalid-email@rochester.rr.com> wrote
> An admirable goal. Be sure to very carefully think through what you
> permit, as making a bad decision in your untainting regexp can leave
> security holes. Just the fact that Perl considers the data to be
> untainted does not mean it is secure -- that is up to your regexp. Perl
> helps you a lot by letting you know it is certain that you did pass the
> data through an untaining regexp.
Thank you, Bob, that was an excellent reply. Your suggestions and advice
will be of great value in my learning process. I really appreciate your
assistance.
Robert
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 6727
***************************************