[24288] in Perl-Users-Digest
Perl-Users Digest, Issue: 6479 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Apr 27 18:16:23 2004
Date: Tue, 27 Apr 2004 15:15:15 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Tue, 27 Apr 2004 Volume: 10 Number: 6479
Today's topics:
Perl newbie q: trying to access array using variable na (m)
Re: Perl newbie q: trying to access array using variabl <ittyspam@yahoo.com>
Re: Perl newbie q: trying to access array using variabl <Juha.Laiho@iki.fi>
Re: Perl newbie q: trying to access array using variabl <tore@aursand.no>
Re: Perl newbie q: trying to access array using variabl <postmaster@castleamber.com>
Re: Perl newbie q: trying to access array using variabl <noreply@gunnar.cc>
Re: Perl newbie q: trying to access array using variabl <dwall@fastmail.fm>
Re: Perl scares me ... <perl@my-header.org>
Re: Perl scares me ... <nobull@mail.com>
Re: Perl scares me ... <pkent77tea@yahoo.com.tea>
Re: RFC: Text similarity <bik.mido@tiscalinet.it>
Re: sending data from one program to a perl prog <andrew@localhost.localdomain>
Re: sort numeric lists <robin @ infusedlight.net>
Re: variable scope and use strict <PerlGuRu2b@bobotheclown.org>
Re: variable scope and use strict <tore@aursand.no>
Re: variable scope and use strict (Lack Mr G M)
Re: variable scope and use strict <tore@aursand.no>
Which characters are really unsafe to use in Linux file <recycle@bin.com>
Re: Which characters are really unsafe to use in Linux <Juha.Laiho@iki.fi>
Re: Which characters are really unsafe to use in Linux (Randal L. Schwartz)
Re: Which characters are really unsafe to use in Linux <t@REMOVETHISbrowse.to>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: 27 Apr 2004 11:25:58 -0700
From: mpepple@hotmail.com (m)
Subject: Perl newbie q: trying to access array using variable name
Message-Id: <4af35cf2.0404271025.282ae093@posting.google.com>
i checked the archives, but couldn't find anything that made sense to
me (blush). this is a generalized example of my problem:
i have a set of 100 lists:
@list1 = ...
@list2 = ...
@list3 = ...
@list4 = ...
@list5 = ...
...
@list 99 = ...
@list100 = ...
i want to then check a value to see what list it's in. instead of
writing 100 of these:
foreach $list1 (@list1) {
if $value =~ /$list1/ {
print "found the value in list1\n";
}
}
i want to put that in a loop that runs 100 times. in my mind, it
seems like i will be calling a list by a variable name that increments
$listTrigger = "list"
$counter = 1;
$listTrigger .= $counter;
until ( !exists @$listTrigger) {
if $value =~ /$list1/ {
print "found the value in $listTrigger\n";
}
++$counter;
}
i receive the following message: Can't use string ("list1") as an
ARRAY ref while "strict refs". any help would make me the happiest
boy on the planet.
matt
------------------------------
Date: Tue, 27 Apr 2004 14:46:48 -0400
From: Paul Lalli <ittyspam@yahoo.com>
Subject: Re: Perl newbie q: trying to access array using variable name
Message-Id: <20040427143555.J1107@dishwasher.cs.rpi.edu>
On Tue, 27 Apr 2004, m wrote:
> i checked the archives, but couldn't find anything that made sense to
> me (blush). this is a generalized example of my problem:
>
> i have a set of 100 lists:
>
> @list1 = ...
> @list2 = ...
> @list3 = ...
> @list4 = ...
> @list5 = ...
> ...
> @list 99 = ...
> @list100 = ...
>
> i want to then check a value to see what list it's in. instead of
> writing 100 of these:
>
> foreach $list1 (@list1) {
> if $value =~ /$list1/ {
> print "found the value in list1\n";
> }
> }
>
> i want to put that in a loop that runs 100 times. in my mind, it
> seems like i will be calling a list by a variable name that increments
>
> $listTrigger = "list"
> $counter = 1;
> $listTrigger .= $counter;
>
> until ( !exists @$listTrigger) {
^^^^^^^^^
"until not"? Ew. Why not just use 'while'? But I digress...
> if $value =~ /$list1/ {
> print "found the value in $listTrigger\n";
> }
> ++$counter;
> }
>
> i receive the following message: Can't use string ("list1") as an
> ARRAY ref while "strict refs". any help would make me the happiest
> boy on the planet.
What you are trying to do is use what are known as "symbolic references"
(sometimes called "soft references"). That is, use a variable containing a
string as the name of a different variable. This is generally considered
A Very Bad Idea, which is why it is disallowed under 'use strict'.
Now, the cheesy way out of your predicament might just be to get rid of
strict refs while in that block:
{
    no strict 'refs';
    $listTrigger = "list1";
    while (@$listTrigger){ ... }   # 'exists' cannot be applied to a whole array
}
But this is another Bad Idea. (Indeed, I'll probably be yelled at by
members of this group for even suggesting it). Instead, what you should
be doing is redesigning your program just slightly. Rather than having
100 distinct arrays, why not have one array containing 100 array
references:
my @big_array = (
    [1, 2, 3, 4],
    ['foo', 'bar', 'baz'],
    # etc
);
Now you can just have a double loop to go through everything:
for (my $i = 0; $i < @big_array; $i++) {
    foreach my $list_item (@{$big_array[$i]}) {
        if ($value =~ /$list_item/) {
            print "Found it in array $i\n";
        }
    }
}
(Note that I've used your logic from above, which is testing to see if
some unknown value contains as a substring the list item that's currently
being examined. This may or may not be what you actually want.)
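To make that caveat concrete, here is a minimal sketch. The sample data and the names @pattern_hits and @exact_hits are assumptions made up for the illustration, not the original poster's data. It contrasts the pattern test with a plain string comparison:

```perl
use strict;
use warnings;

# Hypothetical sample data, for illustration only.
my @big_array = (
    [ 'foo', 'bar' ],
    [ 'a.c', 'baz' ],
);
my $value = 'abc';

my (@pattern_hits, @exact_hits);
for my $i (0 .. $#big_array) {
    for my $list_item (@{ $big_array[$i] }) {
        # Pattern test: 'a.c' is treated as a regex, so it matches 'abc'.
        push @pattern_hits, $i if $value =~ /$list_item/;
        # Exact test: true only when the two strings are identical.
        push @exact_hits, $i if $value eq $list_item;
    }
}

print "pattern matches in arrays: @pattern_hits\n";
print "exact matches in arrays: @exact_hits\n";
```

If a substring test really is what you want, writing /\Q$list_item\E/ keeps it but makes any regex metacharacters in the list item match literally.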
I hope this was clear enough to get you pointed in the right direction.
If you've never used multi-dimensional arrays, you might want to look over
the following documentation:
perldoc perlreftut
perldoc perldsc
perldoc perllol
Paul Lalli
------------------------------
Date: Tue, 27 Apr 2004 18:47:02 GMT
From: Juha Laiho <Juha.Laiho@iki.fi>
Subject: Re: Perl newbie q: trying to access array using variable name
Message-Id: <c6m9qd$21a$2@ichaos.ichaos-int>
mpepple@hotmail.com (m) said:
>i checked the archives, but couldn't find anything that made sense to
>me (blush). this is a generalized example of my problem:
>
>i have a set of 100 lists:
>
>@list1 = ...
>@list2 = ...
>@list3 = ...
>@list4 = ...
>@list5 = ...
>...
>@list 99 = ...
>@list100 = ...
And there you have a design problem that calls for multi-dimensional
arrays, aka arrays of arrays (AoA in short, or LoL -- lists of lists).
See "perldoc perllol".
>i want to then check a value to see what list it's in. instead of
>writing 100 of these:
>
>foreach $list1 (@list1) {
> if $value =~ /$list1/ {
> print "found the value in list1\n";
> }
>}
... and this makes me wonder whether the inner elements in the structure
should actually be hashes instead of arrays. Hashes are indexed by name, so
you could store your values directly as hash keys, and this checking
for existence would become much easier (your proposed =~ is a pattern match
against each element, not a membership test; with hashes you could use a
test like "if exists $list{$value}").
Perhaps if you could describe a higher-level view of what it is you're
building, we could give you even better directions on how to achieve that.
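A minimal sketch of the hash idea, assuming each list is stored as a hash whose keys are the list's values (the data and the names @lists and @found are hypothetical):

```perl
use strict;
use warnings;

# Hypothetical data: each inner hash holds one list's values as keys.
my @lists = (
    { apple => 1, pear  => 1 },
    { plum  => 1, apple => 1 },
    { fig   => 1 },
);
my $value = 'apple';

# exists() is a direct key lookup, so no inner loop over elements is needed.
my @found = grep { exists $lists[$_]{$value} } 0 .. $#lists;

print "found '$value' in list(s): @found\n";
```

This only answers exact-membership questions; if substring or pattern matching is needed, the arrays have to be scanned as before.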
--
Wolf a.k.a. Juha Laiho Espoo, Finland
(GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V
PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++
"...cancel my subscription to the resurrection!" (Jim Morrison)
------------------------------
Date: Tue, 27 Apr 2004 20:57:45 +0200
From: Tore Aursand <tore@aursand.no>
Subject: Re: Perl newbie q: trying to access array using variable name
Message-Id: <pan.2004.04.27.18.56.53.222304@aursand.no>
On Tue, 27 Apr 2004 11:25:58 -0700, m wrote:
> i checked the archives, but couldn't find anything that made sense to
> me (blush). this is a generalized example of my problem:
>
> i have a set of 100 lists:
>
> @list1 = ...
> @list2 = ...
> @list3 = ...
> @list4 = ...
> @list5 = ...
> ...
> @list 99 = ...
> @list100 = ...
What you really _want_ is an array of arrays. You _really_ don't want to
deal with a static number of lists (as above) when you don't have to.
> i want to then check a value to see what list it's in. instead of
> writing 100 of these:
>
> foreach $list1 (@list1) {
> if $value =~ /$list1/ {
> print "found the value in list1\n";
> }
> }
Are you sure you want a regular expression like the one above? The one
above will only match if '$value' contains the value of '$list1'. Are you
sure you don't want it the other way around, ie. to check if '$list1'
contains '$value'?
If you had an array of arrays, you could do it like this:
my @lists = ( ... ); # An array which contains your sub-lists
my $value = ...;     # The value you are searching for
for ( my $i = 0; $i <= $#lists; $i++ ) {
    foreach ( @{$lists[$i]} ) {
        if ( /$value/ ) {
            print "'$value' found in list #" . $i . "\n";
        }
    }
}
If you don't care which list contains the match, you could replace the
outer 'for()' loop with a 'foreach ()' loop and drop the '$i' variable.
You could also enhance the code above by using 'grep' where it is
appropriate.
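For what it's worth, one way the 'grep' variant might look. This is a sketch with made-up sample data; @matching is a name introduced here, and the \Q...\E quoting is an addition that the loop above does not have:

```perl
use strict;
use warnings;

# Hypothetical sample data, for illustration only.
my @lists = (
    [ 'red',  'green' ],
    [ 'blue', 'greenish' ],
    [ 'black' ],
);
my $value = 'green';

# Keep the index of every sub-list with at least one element containing $value.
# \Q...\E makes any regex metacharacters in $value match literally.
my @matching = grep {
    my $list = $lists[$_];
    grep { /\Q$value\E/ } @$list;
} 0 .. $#lists;

print "'$value' found in list #$_\n" for @matching;
```

The inner grep is evaluated in scalar context, so it yields the number of matching elements, which the outer grep treats as true or false.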
--
Tore Aursand <tore@aursand.no>
"There are three kinds of lies: lies, damn lies, and statistics."
(Benjamin Disraeli)
------------------------------
Date: Tue, 27 Apr 2004 13:58:54 -0500
From: John Bokma <postmaster@castleamber.com>
Subject: Re: Perl newbie q: trying to access array using variable name
Message-Id: <408ead70$0$196$58c7af7e@news.kabelfoon.nl>
Tore Aursand wrote:
> On Tue, 27 Apr 2004 11:25:58 -0700, m wrote:
>
>>i checked the archives, but couldn't find anything that made sense to
>>me (blush). this is a generalized example of my problem:
>>
>>i have a set of 100 lists:
>>
>>@list1 = ...
>>@list2 = ...
>>@list3 = ...
>>@list4 = ...
>>@list5 = ...
>>...
>>@list 99 = ...
>>@list100 = ...
>
>
> What you really _want_ is an array of arrays. You _really_ don't want to
> deal with a static number of lists (as above) when you don't have to.
>
>
>>i want to then check a value to see what list it's in. instead of
>>writing 100 of these:
>>
>>foreach $list1 (@list1) {
>> if $value =~ /$list1/ {
>> print "found the value in list1\n";
>> }
>>}
>
>
> Are you sure you want a regular expression like the one above? The one
> above will only match if '$value' contains the value of '$list1'. Are you
> sure you don't want it the other way around, ie. to check if '$list1'
> contains '$value'?
>
> If you had an array of arrays, you could do it like this;
>
> my @lists = ( ... ); # An array which contains your sub-lists
> my $value = ...; # The value you are searching for
>
> for ( my $i = 0; $i <= $#lists; $i++ ) {
> foreach ( @{$lists[$i]} ) {
> if ( /$value/ ) {
> print "'$value' found in list #" . $i . "\n";
> }
> }
> }
>
> If you don't care about what list there is a match, you could replace the
> outer 'for()' loop with a 'foreach ()' loop and drop the '$i' variable.
>
> You could also enhance the code above by using 'grep' where it is
> appropriate.
Or when you need to do a lot of lookups, use hashes.
--
John MexIT: http://johnbokma.com/mexit/
personal page: http://johnbokma.com/
Experienced Perl programmer available: http://castleamber.com/
------------------------------
Date: Tue, 27 Apr 2004 21:02:12 +0200
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: Perl newbie q: trying to access array using variable name
Message-Id: <c6mbcj$dnbq3$1@ID-184292.news.uni-berlin.de>
m wrote:
> i have a set of 100 lists:
>
> @list1 = ...
> @list2 = ...
> @list3 = ...
> @list4 = ...
> @list5 = ...
> ...
> @list 99 = ...
> @list100 = ...
Even if I don't know how you populate those lists, it appears to be an
inappropriate starting point. Instead of 100 arrays you should use an
array of arrays or a hash of arrays. This is an example of a hash of
arrays:
my %lists = (
    1 => [ 1, 2, 3 ],
    2 => [ 4, 5, 6 ],
    3 => [ 7, 8, 9 ],
);
> i want to then check a value to see what list it's in. instead of
> writing 100 of these:
>
> foreach $list1 (@list1) {
> if $value =~ /$list1/ {
> print "found the value in list1\n";
> }
> }
>
> i want to put that in a loop that runs 100 times.
With a hash of arrays it can be done like this:
for my $list (keys %lists) {
    for my $elem ( @{ $lists{$list} } ) {
        if ($value =~ /$elem/) {
            print "found the value $value in list $list\n";
        }
    }
}
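One caveat with a hash of arrays: 'keys' returns the keys in no particular order, so the lists are not visited as 1, 2, 3. If the order of the report matters, sorting the keys numerically is one option. A minimal sketch, where the key 10, the search value, and @report are assumptions made up for the example:

```perl
use strict;
use warnings;

my %lists = (
    1  => [ 1, 2, 3 ],
    2  => [ 4, 5, 6 ],
    10 => [ 7, 8, 9 ],
);
my $value = '258';   # hypothetical search value

my @report;
# Numeric sort: a plain string sort would visit the keys as 1, 10, 2.
for my $list (sort { $a <=> $b } keys %lists) {
    for my $elem ( @{ $lists{$list} } ) {
        push @report, $list if $value =~ /$elem/;
    }
}
print "found the value $value in list $_\n" for @report;
```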
HTH
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
------------------------------
Date: Tue, 27 Apr 2004 21:11:42 -0000
From: "David K. Wall" <dwall@fastmail.fm>
Subject: Re: Perl newbie q: trying to access array using variable name
Message-Id: <Xns94D8AEEAF2235dkwwashere@216.168.3.30>
m <mpepple@hotmail.com> wrote:
> i have a set of 100 lists:
>
> @list1 = ...
> @list2 = ...
> @list3 = ...
> @list4 = ...
> @list5 = ...
> ...
> @list 99 = ...
> @list100 = ...
As others have pointed out, all these lists are probably the wrong
approach.
> i want to then check a value to see what list it's in. instead of
> writing 100 of these:
Below is an idea you might be able to adapt. The list of lists (@LoL)
is hacked together just to get a demo working, so it's ugly and not
what I would really use (given a choice).
It would help if we knew more about the data. If we knew what the
data is, how it's read, and what you want to do with it we might
suggest something different.
use strict;
use warnings;
my @list0 = qw(j s t j o 2);
my @list1 = qw(a b d 2 h d s);
my @list2 = qw(d l w s v n 7 2 h);
my @list3 = qw(X w f h 4 j df c b);
my @list4 = qw(X g h j 4 f x m x);
my @list5 = qw(X t f j m 4 c);
# somehow the list of lists gets populated....
my @LoL = (\@list0, \@list1, \@list2, \@list3, \@list4, \@list5);
my %hash;
for my $list_number (0 .. $#LoL) {
    for my $element ( @{$LoL[$list_number]} ) {
        $hash{$element}{$list_number}++;
    }
}
my $find = 'X';
print "$find exists in list(s) ",
    join ', ', sort keys %{$hash{$find}};
------------------------------
Date: Tue, 27 Apr 2004 17:08:46 +0200
From: Matija Papec <perl@my-header.org>
Subject: Re: Perl scares me ...
Message-Id: <fhts801cft3b9fu85lp67dfij945gphplt@4ax.com>
X-Ftn-To: Richard Gration
"Richard Gration" <richard@zync.co.uk> wrote:
>If there was any further proof needed that Perl updates are received in a
>back alley somewhere from some dude with red eyes who smells of sulphur
>... I mean, honestly !!! ;-)
Err, and what was your expected output?
>And then there's my slight amazement that #1 needs "no strict 'refs'" and
>#2 doesn't ...
References in perl are very well documented, you may want to check "perldoc
perlref"
--
Matija
------------------------------
Date: 27 Apr 2004 18:12:16 +0100
From: Brian McCauley <nobull@mail.com>
Subject: Re: Perl scares me ...
Message-Id: <u9r7u9tken.fsf@wcl-l.bham.ac.uk>
"Richard Gration" <richard@zync.co.uk> writes:
> my $one = "mysub";
> #1
> $one->('scalar');
> #2
> 'mysub'->('string');
> And then there's my slight amazement that #1 needs "no strict 'refs'" and
> #2 doesn't ...
Yeah, it's a known problem (er.. bug) that strict does not pick up on
stuff that gets optimised away during compilation. Indeed there seems
to have been a spate of threads about this lately.
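For reference, a small sketch of case #1 and of a way to get a real code reference from a name without switching strict off. The body of mysub is a made-up stand-in; taking a reference with \&{...} is the exception described in "perldoc strict":

```perl
use strict;
use warnings;

# Hypothetical stand-in for the sub in the quoted example.
sub mysub { return "called with @_" }

my $one = 'mysub';

# Calling through a string is a symbolic reference, so strict refs
# rejects it at run time.
my $ok = eval { $one->('scalar'); 1 };
print "symbolic call failed: $@" unless $ok;

# Taking a reference with \&{...} is allowed under strict refs.
my $code   = \&{$one};
my $result = $code->('coderef');
print "$result\n";
```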
--
\\ ( )
. _\\__[oo
.__/ \\ /\@
. l___\\
# ll l\\
###LL LL\\
------------------------------
Date: Tue, 27 Apr 2004 21:26:52 +0100
From: pkent <pkent77tea@yahoo.com.tea>
Subject: Re: Perl scares me ...
Message-Id: <pkent77tea-A32E36.21265127042004@pth-usenet-01.plus.net>
In article <c6lqjn$kgc$1@news.freedom2surf.net>,
"Richard Gration" <richard@zync.co.uk> wrote:
> While trying to work out the syntax for building a dispatch table of
> coderefs from strings in a database I had occasion to construct the
> following:
...
<snip example code>
Obviously I have no idea what you were trying to do, or what your
restrictions were, but... wasn't there a better way? :-) My dispatch
tables always look something like:
%foo = (
    bar => \&bar,
    baz => \&baz,
    # etc...
);
# insert more code here
sub bar {
    my ($thing, $whatever, $boo) = @_;
    # some code here
}
and as luck would have it I'm working on a database-based app right now
at work, too.
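A sketch of how such a table might be used when the name arrives as a string, for example from a database. The handler bodies, %dispatch, and run_action are hypothetical names invented for the illustration:

```perl
use strict;
use warnings;

# Hypothetical handlers; the names and behaviour are made up for the example.
sub bar { my ($thing) = @_; return "bar got $thing" }
sub baz { my ($thing) = @_; return "baz got $thing" }

my %dispatch = (
    bar => \&bar,
    baz => \&baz,
);

# Look a handler up by name and call it; unknown names are rejected
# instead of being turned into symbolic references.
sub run_action {
    my ($name, @args) = @_;
    my $handler = $dispatch{$name}
        or die "No handler registered for '$name'\n";
    return $handler->(@args);
}

my $result = run_action('bar', 42);
print "$result\n";
```

Because only names present in the table can be called, strings from outside the program cannot reach arbitrary subs.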
P
--
pkent 77 at yahoo dot, er... what's the last bit, oh yes, com
Remove the tea to reply
------------------------------
Date: Tue, 27 Apr 2004 23:12:40 +0200
From: Michele Dondi <bik.mido@tiscalinet.it>
Subject: Re: RFC: Text similarity
Message-Id: <frft80t1g17o1ntt6vt1r3uvqgg7urapln@4ax.com>
On Tue, 27 Apr 2004 00:37:29 +0200, Tore Aursand <tore@aursand.no>
wrote:
>> I know that this may seem naive, but in a popular science magazine I
>> read that a paper has been published about a technique that indeed
>> identifies the (natural) language some documents are written in by
>> compressing (e.g. LZW) them along with some more text from samples taken
>> from a bunch of different languages and comparing the different
>> compressed sizes. You may try some variation on this scheme...
>
>I really don't have the opportunity to categorize any of the documents;
>Everything must be 100% automatic without human interference.
Well, you may try matching limited-sized portions of the documents
(after having converted them to pure text) against each other (I mean
across documents, not within the *same* document) and average the
result over a document.
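The compression idea from the quoted text can be tried out with very little code. The following is only a rough sketch under stated assumptions: it uses deflate via Compress::Zlib rather than LZW, the "normalized compression distance" formula is a swapped-in technique and not necessarily what the paper used, and the sample documents and the names csize and ncd are made up:

```perl
use strict;
use warnings;
use Compress::Zlib qw(compress);

# Size in bytes of the deflate-compressed text.
sub csize { return length compress($_[0]) }

# Normalized compression distance: smaller when the two texts share more.
sub ncd {
    my ($x, $y) = @_;
    my ($cx, $cy, $cxy) = (csize($x), csize($y), csize($x . $y));
    my ($min, $max) = $cx < $cy ? ($cx, $cy) : ($cy, $cx);
    return ($cxy - $min) / $max;
}

# Hypothetical sample documents.
my $doc_a = 'The quick brown fox jumps over the lazy dog while the farmer '
          . 'walks slowly to the old barn near the river.';
my $doc_b = 'The quick brown fox jumps over the lazy cat while the farmer '
          . 'walks quickly to the old barn near the river.';
my $doc_c = 'Perl hashes map string keys to scalar values, and lookups by '
          . 'key take roughly constant time on average.';

printf "a vs b: %.2f\n", ncd($doc_a, $doc_b);
printf "a vs c: %.2f\n", ncd($doc_a, $doc_c);
```

On texts this short the compressor's fixed overhead distorts the numbers, so treat the result as a ranking hint rather than a calibrated score.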
Just my 2x10^-12 Eur,
Michele
--
$\=q.,.,$_=q.print' ,\g,,( w,a'c'e'h,,map{$_-=qif/g/;chr
}107..q[..117,q)[map+hex,split//,join' ,2B,, w$ECDF078D3'
F9'5F3014$,$,];];$\.=$/,s,q,32,g,s,g,112,g,y,' , q,,eval;
------------------------------
Date: Tue, 27 Apr 2004 22:52:50 +0100
From: Andrew Wheeler <andrew@localhost.localdomain>
Subject: Re: sending data from one program to a perl prog
Message-Id: <pan.2004.04.27.21.52.48.848028@localhost.localdomain>
On Mon, 26 Apr 2004 23:42:43 +0200, Gunnar Hjalmarsson wrote:
> my $maxsize = 131072;
what is the significance of 131072 as a max size ?
------------------------------
Date: Tue, 27 Apr 2004 11:48:31 -0800
From: "Robin" <robin @ infusedlight.net>
Subject: Re: sort numeric lists
Message-Id: <c6m6ds$20o3$3@news.f.de.plusline.net>
> What code? And how about testing your code yourself?
The code below, it's a sort routine. And you're right, I should have
referenced it.
> You should take a little more care in preparing your articles. Quoting
> fifty lines just to add a single one is not the done thing. Forgetting
> to add necessary text (or to remove large chunks of unwanted text) may
> happen once in a while, but with your posts it happens with a regularity
> that seems to say you don't care.
> Take the time to look over your articles before you post them and make
> sure they contain all you want them to contain and nothing you don't
> want them to contain. On Usenet, that's part of common courtesy.
>
> Anno
Ok, good call.
-Robin
------------------------------
Date: Tue, 27 Apr 2004 14:57:24 GMT
From: Rocky <PerlGuRu2b@bobotheclown.org>
Subject: Re: variable scope and use strict
Message-Id: <pan.2004.04.27.14.55.15.943706@bobotheclown.org>
On Tue, 27 Apr 2004 10:32:20 -0400, Richard Morse wrote:
> In article <pan.2004.04.27.12.43.01.8345@bobotheclown.org>,
> Rocky <PerlGuRu2b@bobotheclown.org> wrote:
>
>> I am trying to use strict all the time now. I have a problem with the
>> following code. When I use strict the value of $highest won't leave the
>> foreach construct. I know this is by design but I cannot figure out how
>> make that variable available outside the loop. Any advice?
>>
>> Also if someone could tell me where to find perldoc regarding "use
>> strict;" or scoping in general I would appreciate it.
>>
>>
>> #!/usr/bin/perl
>> my $number = 1;
>> my $dir = "/etc/backup";
>> opendir DIR1, $dir;
>> my @files = readdir(DIR1);
>> closedir DIR1;
>> foreach my $filename (@files)
>> {
>> push(@filenum, $1) if ($filename =~ /^CUX(\d+)\.txt$/);
>> #my $highest = (sort { $a <=> $b } @filenum)[-1];
>> }
>> $highest = (sort { $a <=> $b } @filenum)[-1];
>> $final = $highest + $number;
>> print "CUX00" . "$final" . ".txt\n";
>
> I _think_ (although I'm not certain), you're asking how to make a
> variable available outside a loop.
>
> Try this example:
>
> #!/usr/bin/perl
> use strict;
> use warnings;
>
> my $highest;
> $highest = 0;
> foreach (1 .. 9) {
> if ($_ > $highest) {
> $highest = $_;
> }
> }
>
> print $highest, "\n";
> __END__
>
> The key point (if I understand your question) is to declare the variable
> before you enter the loop (I also initialized it to 0, so I didn't get a
> warning about uninitialized values being used in the comparison). Then
> it remains available once you leave the loop.
>
> One trick -- make sure you don't rescope the variable inside the loop
> (ie, don't use 'my $highest' inside the loop), because that would hide
> the original $highest from the statements inside the loop.
>
> HTH,
> Ricky
perfect. Thank you
------------------------------
Date: Tue, 27 Apr 2004 18:52:48 +0200
From: Tore Aursand <tore@aursand.no>
Subject: Re: variable scope and use strict
Message-Id: <pan.2004.04.27.16.27.35.552481@aursand.no>
On Tue, 27 Apr 2004 13:17:01 +0000, Rocky wrote:
>>> my $dir = "/etc/backup";
>> Needless use of double quotes;
> is this because the value of $dir does not contain another variable and
> does not need to be interpolated?
As David already has pointed out (in an excellent way): Yes.
> Can I open and close a directory before I push it's contents into an
> array?
You mean before you _read_ from the directory? No. That doesn't make
sense, does it?
opendir( DIR, $dir ) or die "$!\n";
# Read the contents of <DIR>
closedir( DIR );
>> opendir( DIR1, $dir ) or die "$!\n";
>> my @files = sort {$a <=> $b} grep {/^CUX\d+\.txt$/} readdir(DIR1);
>> closedir( DIR1 );
>>
>> print 'Highest valued filename: ' . $files[-1] . "\n";
> The filename is returned to a korn shell script, but the filename is
> highest + 1 so that the korn shell script can create the file and add
> details about last night's backups.
No problem adding 1 to the value extracted above.
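One thing to watch: 'sort { $a <=> $b }' applied to whole file names compares them as numbers, and a name such as 'CUX001.txt' has the numeric value 0, so the digits have to be extracted first. A minimal sketch, assuming three-digit zero padding and using a made-up directory listing in place of readdir():

```perl
use strict;
use warnings;

# Hypothetical directory listing, standing in for readdir() output.
my @files = ('.', '..', 'CUX001.txt', 'CUX010.txt', 'CUX002.txt', 'notes.txt');

# Pull out the numeric part of each matching name and keep the largest.
my $highest = 0;
for my $name (@files) {
    next unless $name =~ /^CUX(\d+)\.txt$/;
    $highest = $1 if $1 > $highest;
}

# Assumption: numbers are zero-padded to three digits, as in CUX001.txt.
my $next = sprintf 'CUX%03d.txt', $highest + 1;
print "$next\n";
```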
--
Tore Aursand <tore@aursand.no>
"What we see depends mainly on what we look for." (Sir John Lubbock)
------------------------------
Date: Tue, 27 Apr 2004 18:34:51 BST
From: gml4410@ggr.co.uk (Lack Mr G M)
Subject: Re: variable scope and use strict
Message-Id: <2004Apr27.183451@ukwit01>
In article <pan.2004.04.27.12.43.01.8345@bobotheclown.org>, Rocky <PerlGuRu2b@bobotheclown.org> writes:
|>
Why do you remember everything, when all you want is the maximum?
And reading in all file names just to process them sequentially isn't
necessary.
my $number = 1;
opendir DIR1, $dir or die "opendir: $!\n";
my $highest = 0;    # Declared outside the loop, so visible after it
while (1) {
    my $filename = readdir(DIR1);
    last if (not defined $filename);
    if ($filename =~ /^CUX(\d+)\.txt$/) {
        $highest = $1 if ($1 > $highest);
    }
}
closedir DIR1 or die "closedir: $!\n";
my $final = $highest + $number;
print "CUX00" . "$final" . ".txt\n";
--
--------- Gordon Lack --------------- gml4410@ggr.co.uk ------------
This message *may* reflect my personal opinion. It is *not* intended
to reflect those of my employer, or anyone else.
------------------------------
Date: Tue, 27 Apr 2004 20:57:46 +0200
From: Tore Aursand <tore@aursand.no>
Subject: Re: variable scope and use strict
Message-Id: <pan.2004.04.27.18.47.58.237400@aursand.no>
On Tue, 27 Apr 2004 18:34:51 +0000, Lack Mr G M wrote:
> my $number = 1;
> opendir DIR1, $dir or die "opendir: $!\n";
> my $highest; # Now defined outside of the loop, so visible after it
> while (1) {
> my $filename = readdir(DIR1);
> last if (not defined $filename);
> if ($filename =~ /^CUX(\d+)\.txt$/)) {
> $highest = $1 if ($1 > $highest);
> }
> }
> closedir DIR1 or die "closedir: $!\n";
> my $final = $highest + $number;
> print "CUX00" . "$final" . ".txt\n";
I don't like this code at all. Why do you call readdir() in scalar
context when there's no need to? Do you fear that there are too many
files to fit in memory? If so, you could at least have dropped that
'while (1)' thing:
while ( my $filename = readdir(DIR1) ) {
    if ( $filename =~ /^CUX(\d+)\.txt$/ ) {
        $highest = $1 if !defined $highest || $1 > $highest;
    }
}
I still prefer the other suggestion I had (in a previous posting), though;
Using 'sort' and 'grep'.
--
Tore Aursand <tore@aursand.no>
"First get your facts; then you can distort them at your leisure."
(Mark Twain)
------------------------------
Date: Tue, 27 Apr 2004 18:19:53 +0200
From: "Craig Manley" <recycle@bin.com>
Subject: Which characters are really unsafe to use in Linux filenames (from Perl)?
Message-Id: <408e882d$0$81335$e4fe514c@dreader6.news.xs4all.nl>
Hi,
From testing (using Perl + Slackware Linux) I've found that the only
characters I can't use in a directory/file name are the 0 byte and path
separator /. Below is my test script and function that makes tainted strings
safe to use as directory/file names. Because a mistake or wrong assumption here
can open a huge security hole I'd like to know if this is really correct in
the opinions of others and if this idea is valid for all *nix variants. My
goal is to create a filename validator for html form uploaded file names
that is as unrestrictive as possible (yet safe).
Another question for those of you who know much about MSWin32: which characters
can't be used in a MSWin32 directory/filename (I think it's many more than
Linux)?
Another question: are these single byte character file systems?
-Craig Manley.
#!/usr/bin/perl -w
use strict;
use bytes;
sub safe {
    my $s = shift;
    # replace path separators
    $s =~ s|/|_|g;
    # replace 0 bytes.
    $s =~ s|\000|_|g;
    # keep length <= 255 characters
    return substr($s,0,255);
}
my $backslash = '\\';
# these all work
#mkdir('hoi' . $backslash . 'nbla') || warn $!;
#mkdir('hoi..bla') || warn $!;
#mkdir('hoi' . $backslash . 'bla') || warn $!;
#mkdir('..hoi') || warn $!;
# these don't work
#mkdir($backslash . '/hoi') || warn $!;
#mkdir($backslash . '../hoi') || warn $!;
# try all possible bytes
my %chars;
for (my $i = 0; $i <= 255; $i++) {
    $chars{$i} = chr($i);
}
my $s = join('',sort values(%chars));
if (mkdir(safe($s))) {
    my $h;
    opendir($h,'.');
    my @entries = grep(/.{20,}/,readdir($h));
    closedir($h);
    open($h, '>t.bin') or die $!;
    binmode $h;
    print $h join("\n\n",@entries);
    close($h);
}
else {
    warn($!);
}
------------------------------
Date: Tue, 27 Apr 2004 18:27:03 GMT
From: Juha Laiho <Juha.Laiho@iki.fi>
Subject: Re: Which characters are really unsafe to use in Linux filenames (from Perl)?
Message-Id: <c6m8le$1lo$1@ichaos.ichaos-int>
"Craig Manley" <recycle@bin.com> said:
>From testing (using Perl + Slackware Linux) I've found that the only
>characters I can't use in a directory/file name are the 0 byte and path
>seperator /.
Correct, in the strictly technical sense. The reason to forbid '/' is that
it is the directory separator character, so allowing it inside a name
would make path names ambiguous: for example, is "/tmp/x" a file named "tmp/x"
in the root directory, or a file named "x" in the /tmp directory? The reason
to forbid \0 comes from its use as the string terminator in the C language.
No such reasons exist for any of the other possible byte values, so they
are allowed.
Words of warning, though: there are tools that have problems properly
understanding anything except printable US-ASCII (so, byte values from 32
to 126 inclusive), and even within this range there are some characters that
I'd consider ill-advised. Space (32) is perhaps the hardest one; there
are tools that emit/expect lists of file names using whitespace as the
separator, and for them whitespace within a file name is a problem that
cannot be overcome. The most commonly seen pair of such tools are "find"
and "xargs" ("cpio" being yet another tool having this problem). There
are implementations (GNU) of these tools that have workarounds for this
problem, but the workarounds are not generally applicable (as the
availability of the GNU tools cannot be universally assumed).
Other characters that I would consider problematic are
!, ", ', `, *, ?, $, {, }, [, ], (, ), ~, <, >, |, #, & and \,
as these have special meanings in various shells and tools.
So, complementing this would leave characters
-, _, ,, ., ;, :, ^, =, +, %, 0-9, a-z and A-Z as the safe ones.
"-" and "." are ill-advised as the first characters in a file name.
Then, lately I've heard some reports of XFS filesystem on Linux having
trouble coping with UTF-8 byte sequences; it apparently is trying to
do something smart with non-US-ASCII file names and failing miserably.
>Another question: are these single byte character file systems?
Unix filesystems tend to be; until I heard about the XFS issues, I had
assumed all Unix filesystems to be purely byte-oriented.
What you might do to allow "any" character in names at application level,
though, is to encode known problematic characters -- something like URL
encoding (%xx where xx is the two-digit hexadecimal value for the
character) should be usable -- just remember that when using this, %
becomes an unsafe character, so it needs to be encoded, too.
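A minimal sketch of that encoding idea. The function names encode_name and decode_name and the exact whitelist are my assumptions; the whitelist is deliberately narrower than the "safe" set listed above, and a leading '.' or '-' is encoded as well:

```perl
use strict;
use warnings;

# Encode every byte outside a conservative whitelist as %XX.
# '%' is not in the whitelist, so it is encoded too and decoding
# stays unambiguous.
sub encode_name {
    my ($name) = @_;
    utf8::encode($name) if utf8::is_utf8($name);   # work on bytes
    $name =~ s/([^A-Za-z0-9_.-])/sprintf '%%%02X', ord $1/ge;
    # A leading '.' or '-' is ill-advised, so encode that as well.
    $name =~ s/^([.-])/sprintf '%%%02X', ord $1/e;
    return $name;
}

# Reverse the encoding; the result is a byte string.
sub decode_name {
    my ($name) = @_;
    $name =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $name;
}

my $raw  = "my file/50%.txt";
my $safe = encode_name($raw);
print "$safe\n";
```

The sketch does not deal with empty names or length limits; those would still need separate checks.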
--
Wolf a.k.a. Juha Laiho Espoo, Finland
(GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V
PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++
"...cancel my subscription to the resurrection!" (Jim Morrison)
------------------------------
Date: Tue, 27 Apr 2004 21:57:00 GMT
From: merlyn@stonehenge.com (Randal L. Schwartz)
Subject: Re: Which characters are really unsafe to use in Linux filenames (from Perl)?
Message-Id: <f0a0aaf57b25e959de4fdc159c6c75f2@news.teranews.com>
>>>>> "Craig" == Craig Manley <recycle@bin.com> writes:
Craig> From testing (using Perl + Slackware Linux) I've found that the only
Craig> characters I can't use in a directory/file name are the 0 byte and path
Craig> seperator /.
This has been true for every version of Unix I've used since 1977.
Can't say what it was before Unix V6 though... didn't get to use
those. :)
Gets fun when you permit \n in a filename. Lots of programs
don't expect that, and break. But those are broken programs, I say.
Not a broken filename.
print "Just another Perl hacker,"; # the first!
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
------------------------------
Date: Tue, 27 Apr 2004 18:43:59 +0000 (UTC)
From: Tom <t@REMOVETHISbrowse.to>
Subject: Re: Which characters are really unsafe to use in Linux filenames (from Perl)?
Message-Id: <c6m9lf$6ld$1@sparta.btinternet.com>
Craig Manley wrote...
<>
> Another question for those of you know much about MSWin32: which
> characters can't be used in a MSWin32 directory/filename (I think
> it's much more than Linux)?
\/:*?"<>|
you shouldn't use a leading space or a leading dot
you also need to avoid reserved names like:
CON
COM1
LPT1
PRN
AUX
...etc.
http://support.microsoft.com/default.aspx?scid=kb;EN-US;120716
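As a rough sketch, the rules above could be turned into a check like the following. The function name win32_name_ok is hypothetical, and the full reserved-name list and the rejection of control bytes are my additions, not taken from the article linked above:

```perl
use strict;
use warnings;

# Characters listed above, plus control bytes (an added assumption).
my $bad_chars = qr{[\\/:*?"<>|\x00-\x1F]};
# Reserved device names, with or without an extension.
my $reserved  = qr{^(?:CON|PRN|AUX|NUL|COM[1-9]|LPT[1-9])(?:\..*)?$}i;

sub win32_name_ok {
    my ($name) = @_;
    return 0 if $name eq '' or $name =~ $bad_chars;
    return 0 if $name =~ $reserved;
    return 0 if $name =~ /^[ .]/;    # leading space or dot, per the advice above
    return 1;
}

for my $name ('report.txt', 'a:b', 'LPT1') {
    my $verdict = win32_name_ok($name) ? 'ok' : 'rejected';
    print "$verdict: $name\n";
}
```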
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 6479
***************************************