[32741] in Perl-Users-Digest
Perl-Users Digest, Issue: 4005 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Jul 31 18:09:31 2013
Date: Wed, 31 Jul 2013 15:09:03 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Wed, 31 Jul 2013 Volume: 11 Number: 4005
Today's topics:
Re: 'portable' filesystem manipulation <rweikusat@mssgmbh.com>
Re: 'portable' filesystem manipulation (Seymour J.)
Re: 'portable' filesystem manipulation <ben@morrow.me.uk>
Re: lest talk a litle more about directories <gravitalsun@hotmail.foo>
Re: lest talk a litle more about directories <cwilbur@chromatico.net>
Re: lest talk a litle more about directories (Seymour J.)
Re: No more than N element of an array <rweikusat@mssgmbh.com>
Re: No more than N element of an array <derykus@gmail.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Mon, 29 Jul 2013 11:40:51 +0100
From: Rainer Weikusat <rweikusat@mssgmbh.com>
Subject: Re: 'portable' filesystem manipulation
Message-Id: <87wqo9abt8.fsf@sapphire.mobileactivedefense.com>
Ben Morrow <ben@morrow.me.uk> writes:
> Quoth Rainer Weikusat <rweikusat@mssgmbh.com>:
>>
>> perl seems a little schizophrenic in this respect since it both
>> includes File::Spec which provides 'portable pathname manipulation' in
>> an IMHO sensible way[*] by delegating all actual work to a
>> system-specific implementation module and File::Basename (used by
>> File::Path) which employs if - elsif - else-cascades to select the
>> code to execute for a particular system at runtime.
>
> Yes, I agree, it's a bit peculiar. However, they both work, and have
> been thoroughly tested on many systems over many years, so I suppose
> no one thinks it's worth changing them.
File::Basename consists of a system-dependent part, namely, the regexes
& associated handling code dealing with the actual pathnames, and the
'pure Perl' switching logic. And I doubt that there was ever a perl
port where if - elsif - elsif - else didn't work because of inherent
'portability problems' in this Perl code. Consequently, the 'tested on
many systems' part doesn't really apply to that.
>> [*] Mostly. It still relies on a hardcoded 'system list' in order to
>> determine which module to load,
>
> What else would you do?
>
>> uses @ISA for forwarding calls to 'implementation modules' and loads
>> these via require. Ideally, it should select a module based on 'system
>> name' only (possibly with an override facility)
>
> System names are too whimsical to use like that; they need to be
> normalised to something sensible. That's all the list of 'known systems'
> is doing.
There are two lists of 'known systems', one in Basename.pm and one in
Spec.pm and they don't only differ but even treat the same system in
different ways, at least seemingly. I wouldn't be surprised if there
were some more different 'lists of known systems' (handling seemingly
identical systems in different ways) in the Perl core and I'm certain
that there will be n more (with n possibly being large) 'different
lists of known systems' (handling ...) all over CPAN. That's an
unmaintainable mess and bound to cause problems (eg, will someone
fixing a bug in File::Spec always remember or even know that
File::Basename needs to be changed as well and how much additional
work is it to make a semantically equivalent change to both which
is correct within the constraints of either?).
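One way to picture the 'single list of known systems' argued for here is a lookup table that every pathname module would share. The following is only a minimal sketch under that assumption: the table name, the function name `system_family` and the entries are hypothetical and are not taken from Basename.pm or Spec.pm.

```perl
use strict;
use warnings;

# Hypothetical shared table: raw $^O value => normalised system family.
# The entries are illustrative only, not the list used by any core module.
my %FAMILY_FOR = (
    MSWin32 => 'Win32',
    cygwin  => 'Cygwin',
    VMS     => 'VMS',
    os2     => 'OS2',
);

# Normalise an OS name; anything unknown is treated as Unix-like.
# An optional second argument overrides the lookup entirely.
sub system_family {
    my ($osname, $override) = @_;
    return $override if defined $override;
    return $FAMILY_FOR{$osname} // 'Unix';
}

print system_family($^O), "\n";
```

With one such table, a fix to the normalisation would only have to be made in one place, which is the maintenance concern raised above.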
Also, I wouldn't be surprised if people who actually need 'portable
filename manipulation' usually ended up trying n different modules
in turn until they found one which works for their problem and then
forgot about the issue, IOW, in the end, bugs won't ever or won't
usually get fixed.
[...]
>> The File::Spec::Functions
[...]
>> uses
>>
>> foreach my $meth (@EXPORT, @EXPORT_OK) {
>> my $sub = File::Spec->can($meth);
>> no strict 'refs';
>> *{$meth} = sub {&$sub('File::Spec', @_)};
>> }
>>
>> at runtime in order to generate proxy subs. The 'double call' is
>> necessary to work around the IMHO totally weird decision to use
>> 'method calls' for this in the first place (explanation why this could
>> make any sense greatly appreciated) ...
>
> The method implementation is indeed bizarre. I think the only reason it
> works like that is that the module was written in the very early days of
> Perl OO, before any sort of best practices had emerged, and someone
> thought it would be a clever idea. (IMHO the same applies to Exporter
> and Dynaloader, neither of which ought to use inheritance. AutoLoader
> used to do the same, apparently, but has been changed.)
>
> OTOH, does it matter? The code's there, it's tested, it works, and it's
> reliable.
It matters to me: I wouldn't mind using 'portable pathname
manipulation code' if Perl offered a sensible implementation. Since it
IMHO doesn't, I'll stick to using 'regexes for UNIX(*)'. That's all I
usually care for and even if I had to support 'weird filesystems', I
would rather lift the filesystem-specific bits from one of the modules
and (in the most 'complicated' case), integrate them with some
'selection logic' which doesn't unduly 'penalize' my code.
If a library 'wants' to be used, it should offer something better than
'this may save you an hour of work' --- 'Oh me Gawd, how can I get
this done at all!' is not a universal problem.
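For readers who have not seen the proxy-sub technique quoted earlier in this message, here is a small self-contained sketch of the same idea. The package name `My::PathFuncs` is hypothetical and merely stands in for File::Spec::Functions; only `File::Spec->can`, `catfile` and `catdir` are real File::Spec interfaces.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical package standing in for File::Spec::Functions.
package My::PathFuncs;

for my $meth (qw(catfile catdir)) {
    my $sub = File::Spec->can($meth);   # code ref behind the class method
    no strict 'refs';
    # Install a plain function that supplies the class name as first argument.
    *{__PACKAGE__ . "::$meth"} = sub { $sub->('File::Spec', @_) };
}

package main;

my $via_method = File::Spec->catfile('a', 'b', 'c.txt');
my $via_proxy  = My::PathFuncs::catfile('a', 'b', 'c.txt');
print "$via_method\n$via_proxy\n";
```

Each call through the proxy costs one extra subroutine call, which is the 'double call' complained about above.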
------------------------------
Date: Mon, 29 Jul 2013 11:27:18 -0400
From: Shmuel (Seymour J.) Metz <spamtrap@library.lspace.org.invalid>
Subject: Re: 'portable' filesystem manipulation
Message-Id: <51f689d6$11$fuzhry+tra$mr2ice@news.patriot.net>
In <g3kgca-2uc.ln1@anubis.morrow.me.uk>, on 07/28/2013
at 11:55 PM, Ben Morrow <ben@morrow.me.uk> said:
>OTOH, does it matter? The code's there, it's tested, it works, and
>it's reliable.
There are unresolved problems in File::Spec, e.g.,
<https://rt.cpan.org/Public/Bug/Display.html?id=75319>
--
Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>
Unsolicited bulk E-mail subject to legal action. I reserve the
right to publicly post or ridicule any abusive E-mail. Reply to
domain Patriot dot net user shmuel+news to contact me. Do not
reply to spamtrap@library.lspace.org
------------------------------
Date: Tue, 30 Jul 2013 00:00:59 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: 'portable' filesystem manipulation
Message-Id: <bq8jca-qji1.ln1@anubis.morrow.me.uk>
Quoth Rainer Weikusat <rweikusat@mssgmbh.com>:
> Ben Morrow <ben@morrow.me.uk> writes:
> > Quoth Rainer Weikusat <rweikusat@mssgmbh.com>:
> >>
> >> perl seems a little schizophrenic in this respect since it both
> >> includes File::Spec which provides 'portable pathname manipulation' in
> >> an IMHO sensible way[*] by delegating all actual work to a
> >> system-specific implementation module and File::Basename (used by
> >> File::Path) which employs if - elsif - else-cascades to select the
> >> code to execute for a particular system at runtime.
> >
> > Yes, I agree, it's a bit peculiar. However, they both work, and have
> > been thoroughly tested on many systems over many years, so I suppose
> > no one thinks it's worth changing them.
>
> File::Basename consists of a system-dependent part, namely, the regexes
> & associated handling code dealing with the actual pathnames, and the
> 'pure Perl' switching logic. And I doubt that there was ever a perl
> port where if - elsif - elsif - else didn't work because of inherent
> 'portability problems' in this Perl code. Consequently, the 'tested on
> many systems' part doesn't really apply to that.
The inherent problem with the sort of logic File::Basename uses is that
it becomes very hard to maintain, so there is a tendency for the
less-popular branches to suffer bitrot. The fact that, in practice, it
(mostly) works suggests this hasn't happened yet.
> >> [*] Mostly. It still relies on a hardcoded 'system list' in order to
> >> determine which module to load,
> >
> > What else would you do?
> >
> >> uses @ISA for forwarding calls to 'implementation modules' and loads
> >> these via require. Ideally, it should select a module based on 'system
> >> name' only (possibly with an override facility)
> >
> > System names are too whimsical to use like that; they need to be
> > normalised to something sensible. That's all the list of 'known systems'
> > is doing.
>
> There are two lists of 'known systems', one in Basename.pm and one in
> Spec.pm and they don't only differ but even treat the same system in
> different ways, at least seemingly.
I don't think that can be the case in practice, at least not on the
platforms on which perl still builds. There are clearly branches in both
modules for systems like 'AmigaOS' and 'Symbian' which end up behaving
differently, but perl doesn't build on those systems any more.
> I wouldn't be surprised if there
> were some more different 'lists of known systems' (handling seemingly
> identical systems in different ways) in the Perl core and I'm certain
> that there will be n more (with n possibly being large) 'different
> lists of known systems' (handling ...) all over CPAN. That's an
> unmaintainable mess and bound to cause problems (eg, will someone
> fixing a bug in File::Spec always remember or even know that
> File::Basename needs to be changed as well and how much additional
> work is it to make a semantically equivalent change to both which
> is correct within the constraints of either?).
I agree entirely: the whole thing is a stinking mess once you start
looking inside. I'm not saying this is a good thing, I'm just saying
that there are (reasonable) reasons it ended up that way and reasons it
hasn't been thrown out and replaced. The most important of these, of
course, is that at least File::Spec is a Toolchain Module, meaning the
consequences of breaking it are very high. This tends to lead to
somewhat excessively conservative maintenance.
> Also, I wouldn't be surprised if people who actually need 'portable
> filename manipulation' usually ended up trying n different modules
> in turn until they found one which works for their problem and then
> forgot about the issue, IOW, in the end, bugs won't ever or won't
> usually get fixed.
I don't know of any other portable-filename modules on CPAN; do you?
(Path::Class just uses File::Spec underneath.) I would quite like to see
a reasonable alternative.
> >> The File::Spec::Functions
>
> [...]
>
> >> uses
> >>
> >> foreach my $meth (@EXPORT, @EXPORT_OK) {
> >> my $sub = File::Spec->can($meth);
> >> no strict 'refs';
> >> *{$meth} = sub {&$sub('File::Spec', @_)};
> >> }
> >>
> >> at runtime in order to generate proxy subs. The 'double call' is
> >> necessary to work around the IMHO totally weird decision to use
> >> 'method calls' for this in the first place (explanation why this could
> >> make any sense greatly appreciated) ...
> >
> > The method implementation is indeed bizarre. I think the only reason it
> > works like that is that the module was written in the very early days of
> > Perl OO, before any sort of best practices had emerged, and someone
> > thought it would be a clever idea. (IMHO the same applies to Exporter
> > and Dynaloader, neither of which ought to use inheritance. AutoLoader
> > used to do the same, apparently, but has been changed.)
> >
> > OTOH, does it matter? The code's there, it's tested, it works, and it's
> > reliable.
>
> It matters to me: I wouldn't mind using 'portable pathname
> manipulation code' if Perl offered a sensible implementation. Since it
> IMHO doesn't, I'll stick to using 'regexes for UNIX(*)'.
You are, I presume, writing code which won't be published, for use on
known systems? I tend to do the same in that situation. basename and
dirname are sometimes more convenient than regexes, but I find the
File::Spec interface (even F:S:Functions, which is an improvement)
generally makes things more confusing rather than less.
The situation is completely different when writing code to be published.
Code on CPAN Should Be Portable; there's no real excuse for not making
it at least as portable as File::Spec.
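To make the trade-off concrete, here is a minimal sketch that splits one path three ways: with a Unix-only regex, with File::Basename, and by rejoining with File::Spec::Functions. The sample path is invented, and the regex deliberately assumes '/' as the only separator, so the comparison only holds on a Unix-like system.

```perl
use strict;
use warnings;
use File::Basename qw(basename dirname);
use File::Spec::Functions qw(catfile);

my $path = '/usr/local/lib/perl5/Foo.pm';

# Unix-only regex: capture everything before and after the last slash.
my ($dir_re, $base_re) = $path =~ m{\A(.*)/([^/]*)\z}s;

# Core-module equivalents.
my $dir  = dirname($path);
my $base = basename($path);

print "$dir_re $base_re\n";
print "$dir $base\n";
print catfile($dir, $base), "\n";
```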
> That's all I
> usually care for and even if I had to support 'weird filesystems', I
> would rather lift the filesystem-specific bits from one of the modules
> and (in the most 'complicated' case), integrate them with some
> 'selection logic' which doesn't unduly 'penalize' my code.
You can do that, but once you start trying to support 'everything
reasonable' it rapidly gets out of hand.
Ben
------------------------------
Date: Mon, 29 Jul 2013 14:15:49 +0300
From: George Mpouras <gravitalsun@hotmail.foo>
Subject: Re: lest talk a litle more about directories
Message-Id: <kt5itb$11v8$1@news.ntua.gr>
On 29/7/2013 11:55, Rainer Weikusat wrote:
> Keith Keller <kkeller-usenet@wombat.san-francisco.ca.us> writes:
>> On 2013-07-28, Rainer Weikusat <rweikusat@mssgmbh.com> wrote:
>
> [...]
>
>>> It certainly wasn't ever peer-reviewed by someone with
>>> at least half a clue (eg, the outgoing check for 'has another process
>>> created the directory meanwhile' is totally useless because said other
>>> process could create it one microsecond after the check) and isn't a
>>> particularly good implementation.
>>
>> Did you submit a patch to the current maintainer?
>
> I don't use this code. Hence, I don't modify it, either. And the only
> sensible modification here would be a wholesale rewrite to get rid of
> the recursion. Assuming I did that for some other reason than
> 'performing an experiment whose outcome interested me' (like the two
> subroutines I posted in a related thread), I'd probably just use the
> result. Interactions with open source projects tend to be longish,
> flame-happy (since you basically 'appear on the scene' telling one of
> the established bigwigs that he did something wrong) and lead nowhere
> (eg, I have DBD::Pg with fully-working support for asynchronous
> interactions with Postgres. I tried to 'submit' some preliminary
> patches to 'the maintainer' and he even accepted some of them. As soon as it got
> into 'I changed this because it was stupidly written' [in a very
> slight way], that came to an end. Considering that I'm usually
> strongly urged to spend as little time as possible on each individual
> work task, why would I clean up the async stuff to a degree where it
> could be published, given that I know that it works, so that that can
> be shelved [or - at best - rewritten by one of the 'core guys'] as
> well?)
>
>>> If someone else either doesn't think so or
>>> wants to spend some time with researching sensible solutions to a
>>> particular problem, even a problem you really don't care about, it
>>> would be appropriate to let him instead of demanding that he has to
>>> make the same choices you happen to have made (even if you
>>> encompasses the population of all of China) because you happen to
>>> have made them.
>>
>> I am not demanding that George use File::Path. I am suggesting that he
>> not suggest that others use his code,
>
> He didn't do that. He published some musings about 'directory creation'
> which included two routines actually doing this.
>
>> and instead offer patches to File::Path,
>
> [...]
>
>> Do you have good, verifiable reason to think that these patches, if they
>> passed tests, would not be accepted?
>
> To be honest: So far, I submitted a single patch to something
> maintained as part of perl and that got accepted, it was even
> preferred over a similar one written by someone else. But that
> happened indirectly through the Debian BTS. OTOH, I remember (all too
> clearly) a period of time where I was trying to live on EUR 300/ month
> (fixed rent cost of EUR 210) while expected to work. This was
> ultimately caused (or at least stimulated) by colliding with some
> 'Perl community' people who pulled some strings after drawing me into
> a series of heated exchanges, something I absolutely couldn't deal with
> ten years ago, on USENET. It is not difficult to find some of the
> names in various 'Perl core' stuff and this is an experience I'm not
> keen to repeat, especially considering that what people believe about
> other people never changes, no matter how hard observable reality
> contradicts it.
>
> Consequently, I wouldn't want to try and would advise others to tread
> very carefully in this area.
>
I like your thinking, Rainer. In everyday work I have to solve
difficult practical problems, so in order to meet the deadlines I do
not reinvent the wheel and use as many modules as I can, in order to be as
effective and fast as possible.
When I have some time at work, I like to think about some points that
look interesting, and I had the (wrong) idea that here is a good place
for discussion.
As many guys have put me in their blacklist, maybe it will not hurt
them if I talk a little more.
Record insertion into a data structure is an interesting issue in
general. Directory creation is the same problem as how to make the ~~
operator faster.
I mean what Perl is doing at "... if ('foo' ~~ @Array) { .... "
When we are talking about directories, the upper directory of course
must exist. So the problem is not symmetrical. I mean there is a greater
probability for a directory to exist in the left half than the right. So
there is a way to make the algorithm even faster.
That's all ... only some ideas; I do not want to patch any module either.
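The asymmetry described above (upper directories are more likely to exist than lower ones) can be exploited without recursion. The following is only a sketch under stated assumptions: `make_dirs` is a hypothetical helper, not the File::Path implementation, and it assumes Unix-style '/' separators. It probes upward from the full path until it finds an existing directory, then creates the missing components top-down, treating 'already exists' as success so that a concurrent creator does no harm.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Create $path and any missing ancestors; returns how many were missing.
# Hypothetical helper; assumes '/' is the only path separator.
sub make_dirs {
    my ($path) = @_;
    my @missing;

    # Walk upward until an existing directory is found.
    my $probe = $path;
    while (length($probe) && !-d $probe) {
        unshift @missing, $probe;
        last unless $probe =~ s{/+[^/]+/*\z}{};
    }

    # Create the missing components top-down. If another process created
    # one in the meantime, mkdir fails with EEXIST, which is accepted as
    # long as the path is now a directory.
    for my $dir (@missing) {
        next if mkdir $dir;
        die "mkdir $dir: $!" unless $!{EEXIST} && -d $dir;
    }
    return scalar @missing;
}

my $root = tempdir(CLEANUP => 1);
print make_dirs("$root/a/b/c"), "\n";
```

Checking the error after the fact, rather than testing for existence before calling mkdir, is what avoids the check-then-create race mentioned in the quoted text.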
------------------------------
Date: Mon, 29 Jul 2013 12:10:50 -0400
From: Charlton Wilbur <cwilbur@chromatico.net>
Subject: Re: lest talk a litle more about directories
Message-Id: <87d2q1cpo5.fsf@new.chromatico.net>
>>>>> "RW" == Rainer Weikusat <rweikusat@mssgmbh.com> writes:
RW> Interactions with open source projects tend to be longish,
RW> flame-happy (since you basically 'appear on the scene' telling
RW> one of the established bigwigs that he did something wrong) and
RW> lead nowhere (eg, I have DBD::Pg with fully-working support for
RW> asynchronous interactions with Postgres. I tried to 'submit'
RW> some preliminary patches to 'the maintainer' and he even
RW> accepted some of them. As soon as it got into 'I changed this because
RW> it was stupidly written' [in a very slight way], that came to an
RW> end.
If you enter into communication assuming it will go badly, it usually
does; if you enter into communication assuming it will go well, it
usually does. I tend to be fairly direct and frequently abrasive, and I
haven't had anywhere near the level of difficulty you report.
And I am far from an "established bigwig"; for roughly the past year,
Perl has been a hobby interest of mine rather than a profession, due to
many tech companies being trend-following sheep.
Charlton
--
Charlton Wilbur
cwilbur@chromatico.net
------------------------------
Date: Tue, 30 Jul 2013 10:11:35 -0400
From: Shmuel (Seymour J.) Metz <spamtrap@library.lspace.org.invalid>
Subject: Re: lest talk a litle more about directories
Message-Id: <51f7c997$26$fuzhry+tra$mr2ice@news.patriot.net>
In <87fvuxu4ml.fsf@sapphire.mobileactivedefense.com>, on 07/29/2013
at 09:55 AM, Rainer Weikusat <rweikusat@mssgmbh.com> said:
>Interactions with open source projects tend to be longish,
>flame-happy (since you basically 'appear on the scene' telling one of
>the established bigwigs that he did something wrong) and lead
>nowhere
It's true that a resolution may take a long time, but I don't recall
ever being flamed for reporting a bug. I have been asked for
additional information, and I have been asked to test fixes, but IMHO
both are reasonable.
>or - at best - rewritten by one of the 'core guys'
If the issue is resolved, why do you care whether they write their own
code instead of using yours? If I submit a code fix as part of a
problem report, I expect the developers to decide whether to use it as
is, modify it or write a fix from scratch.
>Consequently, I wouldn't want to try and would advise others to
>tread very carefully in this area.
I will continue to report bugs, including suggested fixes and
work-arounds, as appropriate. YMMV.
------------------------------
Date: Tue, 30 Jul 2013 17:52:28 +0100
From: Rainer Weikusat <rweikusat@mssgmbh.com>
Subject: Re: No more than N element of an array
Message-Id: <87k3k8q9bn.fsf@sapphire.mobileactivedefense.com>
Ben Morrow <ben@morrow.me.uk> writes:
> Quoth Rainer Weikusat <rweikusat@mssgmbh.com>:
>> Ben Morrow <ben@morrow.me.uk> writes:
>> > Quoth Charles DeRykus <derykus@gmail.com>:
>>
>> >> Setting $#results = MAXSEARCHRESULTS undoubtedly comes out of the wash
>> >> purest and fastest.
>> >
>> > It's probably easiest, though turning off the warning and using Tim's
>> >
>> > splice @results, MAXSEARCHRESULTS - 1;
>> >
>> > is probably better, on balance. Using $#ary as an lvalue has some
>> > permanent side-effects on the array; you can see them with Devel::Peek.
>>
>> While there is little reason to prefer one or the other, I
>> nevertheless want to make an argument in favor of assigning to $#ary:
>>
>> 'Splicing' usually refers to connecting things together. This can
>> still be seen in the '4 argument splice' which 'works' the contents of
>> a list into an array.
>
> 'Splice' is not an ideal name for the operation; however, it's no worse
> than 'substr', which is exactly the same operator on strings.
They're sort-of the inverse of each other in this respect: The
4-argument splice actually splices something (in a sense at least -- I
figure that whoever invented this name wasn't a conscript mariner in
some navy :-), the other three don't: They seem to have 'grown' on the
splice implementation because it happened to be a suitable environment
for them. For substr, the same three cases actually extract substrings
from a string while the 4-argument one does something different.
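A short sketch may make the comparison easier to follow. It shows the 4-argument forms of both operators next to their 2-argument forms; the sample data is invented for illustration.

```perl
use strict;
use warnings;

# 4-argument splice: replace a run of elements, returning what was removed.
my @a = (1, 2, 3, 4, 5);
my @removed = splice(@a, 1, 2, 'x', 'y', 'z');

# 4-argument substr: replace a run of characters, returning the old substring.
my $s = 'abcdef';
my $old = substr($s, 1, 2, 'XYZ');

# 2-argument forms: splice modifies the array, substr only reads the string.
my @b = (1, 2, 3, 4, 5);
my @tail = splice(@b, 2);
my $t = 'abcdef';
my $rest = substr($t, 2);

print "@a | @removed | $s | $old\n";
print "@b | @tail | $t | $rest\n";
```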
>> It is a more general 'array element manipulation
>> operator' in Perl but statements like
>>
>> "splice(@a, @a, 0, $x, $y) is equivalent to push(@a, $x, $y)"
>> [paraphrase of a part of 'perldoc -f splice']
> [...]
>> The splice-operation I quoted above is
>> similar to this, expressing a relatively simple 'well-known' operation
>> in a more complicated way than necessary by invoking splice with two
>> additional arguments (compared to push) in order to work around the
>> 'actual' semantics of the 4-argument splice, namely, replace some run
>> of array elements with a run of other "datas" (datums?). Making the
>> simple appear complicated may be good for achieving a "Wow!" effect
>> but it isn't a good strategy for software:
>
> Did it occur to you that this was not intended to explain what 'push'
> does, but rather to help explain what 'splice' does?
I think it was intended to explain what splice can be made to do by
feeding 'cleverly selected arguments' to it.
> I would agree with you that the push is a simpler expression than
> the splice, but for example the similar equivalence
>
> shift(@a) is equivalent to splice(@a, 0, 1)
>
> shows you how to use splice to shift multiple elements at once.
Conceptually, the Perl splice can be thought of as a combination of
two 'primitive operations', namely a
remove(@array, $offset, $length)
which removes @array[$offset .. $offset + $length - 1] from @array and
a
insert(@array, $offset, @list)
which inserts the elements of @list into @array starting at offset
$offset. The latter can be expressed in Perl as
splice(@array, $offset, 0, @list)
(another 'clever abuse'). There doesn't seem to be any good reason for
combining both in this way except that this means the implementation
can be 'smart' wrt changing the array length for the 'splicing'
splice case. Where Perl provides another 'built-in' way to perform a
particular array manipulation, ie, shift/unshift, push/pop and
assignment to $#array for truncation, the 'generic splice' should IMHO
be avoided because of this 'optimized combo-opness'.
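As a sketch of this decomposition, the two 'primitive operations' can be written as thin wrappers around splice. The subroutine names `remove` and `insert` are the hypothetical ones used in the text, not Perl built-ins, and the value chosen for MAXSEARCHRESULTS is an assumption for the example. The last lines show truncation through $#array; note that keeping at most N elements means setting the last index to N - 1.

```perl
use strict;
use warnings;
use constant MAXSEARCHRESULTS => 3;   # assumed value, for illustration only

# Hypothetical helper: delete $length elements starting at $offset
# and return them.
sub remove (\@$$) {
    my ($array, $offset, $length) = @_;
    return splice(@$array, $offset, $length);
}

# Hypothetical helper: put @list into the array starting at $offset,
# removing nothing.
sub insert (\@$@) {
    my ($array, $offset, @list) = @_;
    splice(@$array, $offset, 0, @list);
    return;
}

my @array = qw(a b c d e);
my @gone = remove(@array, 1, 2);
insert(@array, 1, qw(x y));

# Truncation by assigning to $#array.
my @results = (1 .. 10);
$#results = MAXSEARCHRESULTS - 1 if @results > MAXSEARCHRESULTS;

print "@array | @gone | @results\n";
```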
------------------------------
Date: Tue, 30 Jul 2013 22:10:59 -0700
From: Charles DeRykus <derykus@gmail.com>
Subject: Re: No more than N element of an array
Message-Id: <kta6a0$ke1$1@speranza.aioe.org>
On 7/30/2013 9:52 AM, Rainer Weikusat wrote:
...
>
> Conceptually, the Perl splice can be thought of as a combination of
> two 'primitive operations', namely a
>
> remove(@array, $offset, $length)
>
> which removes @array[$offset .. $offset + $length - 1] from @array and
> a
> ...
"Splicing" in a bit of a tangent here... the unsuspecting might think,
however briefly, that 'delete' on an array could pinch hit for 'remove'
above. Except that 'delete' on arrays was DWIM-challenged and almost
never what you wanted.
'delete' on arrays has been euthanized, ie, 'deprecated' (which is
euthanasia...just dragging it on for years).
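A minimal sketch of why 'delete' is a poor stand-in for 'remove': deleting an element in the middle of an array leaves a hole rather than closing the gap, while splice shifts the following elements down. The sample arrays are invented for illustration.

```perl
use strict;
use warnings;

# delete on a middle element: the slot stays, its value becomes undef.
my @a = qw(a b c d);
delete $a[1];
my $length_after_delete = scalar @a;
my $slot_exists = exists $a[1] ? 1 : 0;

# splice actually removes the element and closes the gap.
my @b = qw(a b c d);
splice(@b, 1, 1);

print "after delete: $length_after_delete elements, slot exists: $slot_exists\n";
print "after splice: @b\n";
```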
A remove/zap/delete for arrays though would fill the gap nicely, maybe a
List::Util function that, passed an array and an indices list, eg, some
faster equivalent of:
sub zap(+@) { die "not an array ref" unless ref $_[0] eq 'ARRAY';
# remove highest indices first so the remaining indices stay valid
splice( @{$_[0]}, $_, 1 ) for sort { $b <=> $a } @_[1..$#_]; }
called like, eg, zap( @results, MAXSEARCHRESULTS..$#results)
--
Charles DeRykus
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 4005
***************************************