[31983] in Perl-Users-Digest


Perl-Users Digest, Issue: 3247 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Dec 27 06:09:23 2010

Date: Mon, 27 Dec 2010 03:09:06 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Mon, 27 Dec 2010     Volume: 11 Number: 3247

Today's topics:
    Re: Oddity with File::Find and timestamps <tadmc@seesig.invalid>
    Re: Oddity with File::Find and timestamps <skye.shaw@gmail.com>
    Re: Oddity with File::Find and timestamps <hjp-usenet2@hjp.at>
    Re: Oddity with File::Find and timestamps <dave@invalid.invalid>
    Re: Oddity with File::Find and timestamps <rvtol+usenet@xs4all.nl>
    Re: Oddity with File::Find and timestamps <dave@invalid.invalid>
    Re: Oddity with File::Find and timestamps <tadmc@seesig.invalid>
    Re: Oddity with File::Find and timestamps <hjp-usenet2@hjp.at>
    Re: Oddity with File::Find and timestamps <dave@invalid.invalid>
    Re: Oddity with File::Find and timestamps <dave@invalid.invalid>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Sat, 25 Dec 2010 14:48:30 -0600
From: Tad McClellan <tadmc@seesig.invalid>
Subject: Re: Oddity with File::Find and timestamps
Message-Id: <slrnihcmrq.8ks.tadmc@tadbox.sbcglobal.net>

Skye Shaw!@#$ <skye.shaw@gmail.com> wrote:
>> > > On Dec 24, 9:39 am, "Dave Saville" <d...@invalid.invalid> wrote:
>> > > In a "wanted" routine -M _ differs from -M $_ depending on whether one
>> > > processes files or not:
>
>> > On Fri, 24 Dec 2010 21:01:50 UTC, "Skye Shaw!@#$" wrote:
>> > Try passing this function to finddepth()
>> > sub notwanted
>> > {
>> >     return if m/^\.$/; # Don't need '.'
>> >     return if $d && ! -d; # only


The order of operations in this statement is why the OP was
not getting the output that was expected.


>> > directories
>> >     print $_, ' ';
>> >     print -M _, " ";
>> >     print " (using a file's stat entry) " if -f _;
>> >     print -M $_, " (using ${_}'s stat entry)";
>> >     print "\n";
>> > }
>
>
>> On Dec 25, 1:33 am, "Dave Saville" <d...@invalid.invalid> wrote:
>> I don't see what that proved.
>
> It shows why you're not getting the output you expected.


No it doesn't.

It shows a different way of getting the expected output.

(that is, a way that does not use the _ filehandle)
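
A minimal sketch of the short-circuit effect, for anyone who wants to see
it outside File::Find. The scratch files and the names $old, $new and $skip
are made up for the illustration; it assumes a filesystem that stores the
whole-second mtimes it is given.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical fixture: one day-old file and one fresh file.
my $dir = tempdir(CLEANUP => 1);
my ($old, $new) = ("$dir/old", "$dir/new");
for my $path ($old, $new) {
    open my $fh, '>', $path or die "open $path: $!";
    close $fh;
}
my $now = time;
utime $now - 86400, $now - 86400, $old or die "utime $old: $!";
utime $now, $now, $new or die "utime $new: $!";

my $d = 0;                          # flag off: do not restrict to directories

stat $old or die "stat $old: $!";   # _ now describes $old

my ($stale, $fresh);
for ($new) {
    my $skip = $d && ! -d $_;       # $d is false, so -d never runs: no stat
    $stale = (stat _)[9];           # _ still holds $old's data
}
for ($new) {
    my $skip = ! -d $_ && $d;       # -d runs first and refreshes _
    $fresh = (stat _)[9];           # _ now holds $new's data
}
printf "stale: %d  fresh: %d\n", $stale, $fresh;
```

With the flag false, the first loop reports the mtime of whatever was
stat'ed last, not the mtime of the current $_.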


-- 
Tad McClellan
email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
The above message is a Usenet post.
I don't recall having given anyone permission to use it on a Web site.


------------------------------

Date: Sat, 25 Dec 2010 14:39:27 -0800 (PST)
From: "Skye Shaw!@#$" <skye.shaw@gmail.com>
Subject: Re: Oddity with File::Find and timestamps
Message-Id: <5fae57d5-e60e-4d78-aadf-a11a0b52075c@q8g2000prm.googlegroups.com>

On Dec 25, 12:48 pm, Tad McClellan <ta...@seesig.invalid> wrote:
> Skye Shaw!@#$ <skye.s...@gmail.com> wrote:
> >> > > On Dec 24, 9:39 am, "Dave Saville" <d...@invalid.invalid> wrote:
> >> > > In a "wanted" routine -M _ differs from -M $_ depending on whether one
> >> > > processes files or not:
> >> > directories
> >> >     print $_, ' ';
> >> >     print -M _, " ";
> >> >     print " (using a file's stat entry) " if -f _;
> >> >     print -M $_, " (using ${_}'s stat entry)";
> >> >     print "\n";
> >> > }
>
> >> On Dec 25, 1:33 am, "Dave Saville" <d...@invalid.invalid> wrote:
> >> I don't see what that proved.
>
> > It shows why you're not getting the output you expected.
>
> No it doesn't.

The output differs because _ was never updated, which is what I
showed.

> >> >     print " (using a file's stat entry) " if -f _;
> >> >     print -M $_, "

> It shows a different way of getting the expected output.
> (that is, a way that does not use the _ filehandle)

It shows that too.

-Skye


------------------------------

Date: Sun, 26 Dec 2010 13:13:33 +0100
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: Oddity with File::Find and timestamps
Message-Id: <slrnihecbd.mm3.hjp-usenet2@hrunkner.hjp.at>

On 2010-12-25 15:49, Tad McClellan <tadmc@seesig.invalid> wrote:
> Dave Saville <dave@invalid.invalid> wrote:
>> but the point is the docs imply that $_ has been stat'ed
>> before wanted() gets control for each file/dir. 
>
> Huh?
>
> What docs are you referring to?

perldoc File::Find, presumably:

|         *     It is guaranteed that an lstat has been called before the
|               user’s "wanted()" function is called. This enables fast file
|               checks involving _.  Note that this guarantee no longer holds
|               if follow or follow_fast are not set.

Maybe Dave read only the first but not the second sentence of this
paragraph.

	hp




------------------------------

Date: Sun, 26 Dec 2010 13:58:08 +0000 (UTC)
From: "Dave Saville" <dave@invalid.invalid>
Subject: Re: Oddity with File::Find and timestamps
Message-Id: <fV45K0OBJxbE-pn2-sOo0D6T5ruyI@localhost>

On Sat, 25 Dec 2010 15:49:50 UTC, Tad McClellan <tadmc@seesig.invalid>
wrote:

> Dave Saville <dave@invalid.invalid> wrote:
> > On Fri, 24 Dec 2010 21:01:50 UTC, "Skye Shaw!@#$" 
> ><skye.shaw@gmail.com> wrote:
> >
> >> On Dec 24, 9:39 "Dave Saville" <d...@invalid.invalid> wrote:
> 
> >> > return if $d && ! -d; # only directories
> 
> 
> When $d is false, there is no stat call.
> 
> 
> >>     return if $d && ! -d; # only
> 
> 
> When $d is false, there is still no stat call.

I know - I put that test in to save editing the script for doing all 
or just doing directories every time.

> 
> 
> > but the point is the docs imply that $_ has been stat'ed
> > before wanted() gets control for each file/dir. 
> 
> 
> Huh?
> 
> What docs are you referring to?
> 
> $_ will have been stat()ed only when $d is true.
> 
> Since you want it to be stat()ed every time, you must do the file
> test before you test the $d flag:
> 
>     return if ! -d && $d;
> 

Let's start again :-)

use strict;
use warnings;
use File::Find;

$^T=1293201891; # fix basetime to keep numbers the same between runs
finddepth(\&wanted1, '.');
print "\n";
finddepth(\&wanted2, '.');
exit;

sub wanted1
{
  return if m/^\.$/; # Don't need '.'
  return if ! -d; # only directories
  print $_, ' ', -M _, " ", "\n";
}

sub wanted2
{
  return if m/^\.$/; # Don't need '.'
  print $_, ' ', -M _, " ", "\n";
}


[T:\tmp\test]../try.pl
dir1 0.00167824074074074
dir2 0.00149305555555556

stuff1 0.00533564814814815
dir1 0.00533564814814815
stuff2 0.00140046296296296
dir2 0.00140046296296296

It looks to me as though files always get a stat call performed for you,
but directories don't unless a test is done that implicitly calls
stat(). Inserting a "return if ! -e;" into wanted2 then gets the
correct time for the directories, whilst file timestamps remain the
same, indicating to me that files *always* get stat'ed before wanted
gets a look in.

I can't use follow* settings to check what Peter pointed out because 
the port for OS/2, which does not have symlinks by default, fails with
"The stat preceding -l _ wasn't an lstat at 
D:/usr/lib/perl/lib/5.8.2/File/Find.pm line 515". Find.pm obviously 
needs whatever the code is there for Windows.




> 
> I gave this solution to your problem yesterday in the duplicate 
> version of this thread.
> 
> Why did you start a duplicate version of this thread?
> 
> 

Huh? I didn't. I see only one thread and never saw your answer 
yesterday.

-- 
Regards
Dave Saville


------------------------------

Date: Sun, 26 Dec 2010 15:28:58 +0100
From: "Dr.Ruud" <rvtol+usenet@xs4all.nl>
Subject: Re: Oddity with File::Find and timestamps
Message-Id: <4d17512a$0$41110$e4fe514c@news.xs4all.nl>

On 2010-12-26 14:58, Dave Saville wrote:

> sub wanted1
> {
>    return if m/^\.$/; # Don't need '.'

ITYM:

      return if $_ eq '.';

(and be aware that you are skipping a file called ".\n")
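
A small sketch of the difference, assuming nothing beyond core Perl; the
helper name dot_checks is made up for the illustration.

```perl
use strict;
use warnings;

# Returns three flags for a name: /^\.$/ match, /^\.\z/ match, eq '.' match.
sub dot_checks {
    my ($name) = @_;
    return (
        $name =~ m/^\.$/  ? 1 : 0,   # $ also matches just before a trailing newline
        $name =~ m/^\.\z/ ? 1 : 0,   # \z matches only at the very end of the string
        $name eq '.'      ? 1 : 0,   # plain string comparison
    );
}

for my $name (".", ".\n") {
    (my $shown = $name) =~ s/\n/\\n/;   # make the newline visible
    printf "%-6s dollar=%d z=%d eq=%d\n", qq{"$shown"}, dot_checks($name);
}
```

Only the first test treats ".\n" the same as ".".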


>    return if ! -d; # only directories
>    print $_, ' ', -M _, " ", "\n";

Are you afraid of printf?

      printf qq{%s %s\n},  $_, -M _;

(and why did you have the space before the newline?)

-- 
Ruud


------------------------------

Date: Sun, 26 Dec 2010 16:03:46 +0000 (UTC)
From: "Dave Saville" <dave@invalid.invalid>
Subject: Re: Oddity with File::Find and timestamps
Message-Id: <fV45K0OBJxbE-pn2-eVgqLQdg6OoV@localhost>

On Sun, 26 Dec 2010 14:28:58 UTC, "Dr.Ruud" <rvtol+usenet@xs4all.nl> 
wrote:

> On 2010-12-26 14:58, Dave Saville wrote:
> 
> > sub wanted1
> > {
> >    return if m/^\.$/; # Don't need '.'
> 
> ITYM:
> 
>       return if $_ eq '.';
> 
> (and be aware that you are skipping a file called ".\n")
> 
> 
> >    return if ! -d; # only directories
> >    print $_, ' ', -M _, " ", "\n";
> 
> Are you afraid of printf?
> 
>       printf qq{%s %s\n},  $_, -M _;
> 
> (and why did you have the space before the newline?)
> 

Because the whole sorry mess is me hacking stuff about, inserting and
deleting bits of code and joining lines together to see WTF is going
on with the timestamps and whether or not stat() is automatically
invoked. It appears not to be for directories in the default
mode of operation. I am sorry if my temporary diagnostics are not to
your purist taste. :-)

-- 
Regards
Dave Saville


------------------------------

Date: Sun, 26 Dec 2010 12:03:54 -0600
From: Tad McClellan <tadmc@seesig.invalid>
Subject: Re: Oddity with File::Find and timestamps
Message-Id: <slrnihf1j4.clp.tadmc@tadbox.sbcglobal.net>

Dave Saville <dave@invalid.invalid> wrote:
> On Sat, 25 Dec 2010 15:49:50 UTC, Tad McClellan <tadmc@seesig.invalid>
> wrote:
>
>> Dave Saville <dave@invalid.invalid> wrote:
>> > On Fri, 24 Dec 2010 21:01:50 UTC, "Skye Shaw!@#$" 
>> ><skye.shaw@gmail.com> wrote:
>> >
>> >> On Dec 24, 9:39 "Dave Saville" <d...@invalid.invalid> wrote:


>> > but the point is the docs imply that $_ has been stat'ed
>> > before wanted() gets control for each file/dir. 


>> What docs are you referring to?


If Peter has quoted the docs that you were referring to, then
I don't see what your question is.

What is your question?


> Let's start again :-)


OK, but what question is it that you hope to have answered?


> I can't use follow* settings to check what Peter pointed out 


Then there is no guarantee that lstat has been called.

So you need to arrange to have it called yourself.
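
A sketch of what "arrange to have it called yourself" could look like. The
scratch tree, the %mtime hash and the name wanted_explicit are made up for
the illustration, and it assumes a system where utime works on directories
(untested on OS/2).

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical fixture: top/dir1/stuff1, with the directory older than the file.
my $top = tempdir(CLEANUP => 1);
mkdir "$top/dir1" or die "mkdir: $!";
open my $fh, '>', "$top/dir1/stuff1" or die "open: $!";
close $fh;
my $now = time;
utime $now, $now, "$top/dir1/stuff1"        or die "utime: $!";
utime $now - 3600, $now - 3600, "$top/dir1" or die "utime: $!";

my %mtime;    # directory name => mtime read through _ after our own lstat

sub wanted_explicit {
    return if $_ eq '.';
    lstat $_ or return;          # refresh _ ourselves; skip entries that vanished
    return unless -d _;          # only directories, reusing the lstat above
    $mtime{$_} = (stat _)[9];    # same cached data, no further system call
}

finddepth(\&wanted_explicit, $top);
print "$_ $mtime{$_}\n" for sort keys %mtime;
```

The single lstat at the top is the only call that touches the filesystem;
every later test on _ reads the cached result.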


>> I gave this solution to your problem yesterday in the duplicate 
>> version of this thread.
>> 
>> Why did you start a duplicate version of this thread?
>> 
>> 
>
> Huh? I didn't. I see only one thread and never saw your answer 
> yesterday.


The one with:

    Subject: Oddity with Find::File and -M

As opposed to this thread with:

    Subject: Oddity with File::Find and timestamps


-- 
Tad McClellan
email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
The above message is a Usenet post.
I don't recall having given anyone permission to use it on a Web site.


------------------------------

Date: Mon, 27 Dec 2010 00:20:29 +0100
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: Oddity with File::Find and timestamps
Message-Id: <slrnihfjdu.h8t.hjp-usenet2@hrunkner.hjp.at>

On 2010-12-26 13:58, Dave Saville <dave@invalid.invalid> wrote:
> Let's start again :-)
>
> use strict;
> use warnings;
> use File::Find;
>
> $^T=1293201891; # fix basetime to keep numbers the same between runs
> finddepth(\&wanted1, '.');
> print "\n";
> finddepth(\&wanted2, '.');
> exit;
>
> sub wanted1
> {
>   return if m/^\.$/; # Don't need '.'
>   return if ! -d; # only directories
>   print $_, ' ', -M _, " ", "\n";
> }
>
> sub wanted2
> {
>   return if m/^\.$/; # Don't need '.'
>   print $_, ' ', -M _, " ", "\n";
> }
>
>
> [T:\tmp\test]../try.pl
> dir1 0.00167824074074074
> dir2 0.00149305555555556
>
> stuff1 0.00533564814814815
> dir1 0.00533564814814815
> stuff2 0.00140046296296296
> dir2 0.00140046296296296
>
> It looks to me as though files always get a stat call performed for you,
> but directories don't unless a test is done that implicitly calls
> stat(). Inserting a "return if ! -e;" into wanted2 then gets the
> correct time for the directories, whilst file timestamps remain the
> same, indicating to me that files *always* get stat'ed before wanted
> gets a look in.

I don't know the OS/2 file system. It is possible that on OS/2 every
directory entry has to be stat'ed to determine whether it is a
subdirectory or a file (on Unix this can sometimes be optimized).
Since you are using finddepth the sequence would be something like:

read directory . (returns ".", "..", "dir1", "dir2").
stat "dir1" (aha, it's a directory, so enter it)
read "." (returns ".", "..", "stuff1")
call wanted(".")
skip ".." (special case)
stat "stuff1" (it's a file)
call wanted("stuff1")
(now we are done with the contents of "dir1", so we return to the parent dir and)
call wanted("dir1")
stat "dir2" (aha, it's a directory, so enter it)
read "." (returns ".", "..", "stuff2")
call wanted(".")
skip ".." (special case)
stat "stuff2" (it's a file)
call wanted("stuff2")
(now we are done with the contents of "dir2", so we return to the parent dir and)
call wanted("dir2")

You see that dir1 and dir2 are stat'ed, but wanted() isn't called
immediately after the stat; it is called only after all the files in
the directory are stat'ed, too. So _ contains the data from the last
file.
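
The ordering itself is easy to check. A minimal sketch, with a made-up
scratch tree mirroring the one in the thread; it records only the order in
which wanted() is called and says nothing about when File::Find stats each
entry, which may differ between platforms.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical fixture: dir1/stuff1 and dir2/stuff2 under a scratch directory.
my $top = tempdir(CLEANUP => 1);
for my $n (1, 2) {
    mkdir "$top/dir$n" or die "mkdir: $!";
    open my $fh, '>', "$top/dir$n/stuff$n" or die "open: $!";
    close $fh;
}

my @order;    # names in the order wanted() sees them
finddepth(sub { push @order, $_ unless $_ eq '.' }, $top);
print join(' ', @order), "\n";
```

Each directory shows up after its contents, so anything cached in _ when
wanted() finally gets the directory may well belong to one of its files.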


The lesson you should learn here is not that File::Find behaves as
in this example (it behaves differently on a Unix system),
but that if the docs say you cannot rely on something you really
shouldn't rely on it, even if it seems to work most of the time.

	hp




------------------------------

Date: Mon, 27 Dec 2010 10:42:51 +0000 (UTC)
From: "Dave Saville" <dave@invalid.invalid>
Subject: Re: Oddity with File::Find and timestamps
Message-Id: <fV45K0OBJxbE-pn2-8ylqxs0GfpYP@localhost>

On Sun, 26 Dec 2010 18:03:54 UTC, Tad McClellan <tadmc@seesig.invalid>
wrote:

> Dave Saville <dave@invalid.invalid> wrote:
> > On Sat, 25 Dec 2010 15:49:50 UTC, Tad McClellan <tadmc@seesig.invalid>
> > wrote:
> >
> >> Dave Saville <dave@invalid.invalid> wrote:
> > I can't use follow* settings to check what Peter pointed out 
> 
> 
> Then there is no guarantee that lstat has been called.
> 
> So you need to arrange to have it called yourself.
>

I had worked that out by now :-) But....

If, as it appears, all but directories *are* stat'ed then by doing a 
test that invokes stat one is doing it twice for 99% of the cases. Now
I agree that most of the time this matters not a jot, but it just so 
happens that the structure I am really processing, rather than the 
Mickey Mouse test case, has upwards of 60 directories with around
100,000 files. That *must* be a performance hit, surely?
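
One way to put a number on it rather than guess is the core Benchmark
module. This is only a sketch: the scratch file and the sub names by_name
and by_cache are made up, it times a single file rather than a real tree,
and the result will depend heavily on the OS and its file cache.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempfile);

# Hypothetical fixture: a single scratch file stands in for a directory entry.
my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;

# Three file tests by name: each one performs its own stat.
sub by_name {
    return (-e $path, -f $path, -M $path);
}

# One explicit stat, then the cached _ structure for every test.
sub by_cache {
    stat $path or return;
    return (-e _, -f _, -M _);
}

# Run each sub for at least one CPU second and print a comparison table.
cmpthese(-1, { by_name => \&by_name, by_cache => \&by_cache });
```

Whatever the table says for one file can then be scaled to the 100,000 or
so entries in the real tree to judge whether the extra stat matters.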
 
> 
> >> I gave this solution to your problem yesterday in the duplicate 
> >> version of this thread.
> >> 
> >> Why did you start a duplicate version of this thread?
> >> 
> >> 
> >
> > Huh? I didn't. I see only one thread and never saw your answer 
> > yesterday.
> 
> 
> The one with:
> 
>     Subject: Oddity with Find::File and -M
> 
> As opposed to this thread with:
> 
>     Subject: Oddity with File::Find and timestamps
> 
> 

Hmm, then there is something very wrong with my newsreader. I posted 
the first - but I never saw it nor a reply *and* upon checking found 
it was not in my "sent" folder so I assumed I had closed the compose 
window rather than hitting send and so did it again. Sorry if it 
caused confusion. I will take it up with the maintainer of the 
newsreader. 

Thanks for helping get things straight in my mind.
-- 
Regards
Dave Saville


------------------------------

Date: Mon, 27 Dec 2010 10:54:44 +0000 (UTC)
From: "Dave Saville" <dave@invalid.invalid>
Subject: Re: Oddity with File::Find and timestamps
Message-Id: <fV45K0OBJxbE-pn2-PvOBwhe4QD35@localhost>

On Sun, 26 Dec 2010 23:20:29 UTC, "Peter J. Holzer" 
<hjp-usenet2@hjp.at> wrote:

<good explanation snipped>

> You see that dir1 and dir2 are stat'ed, but wanted() isn't called
> immediately after the stat; it is called only after all the files in
> the directory are stat'ed, too. So _ contains the data from the last
> file.

Makes complete sense.

> 
> 
> The lesson you should learn here is not that File::Find behaves as
> in this example (it behaves differently on a Unix system),

Actually it fails the same way on Ubuntu - First thing I tried.

> but that if the docs say you cannot rely on something you really
> shouldn't rely on it, even if it seems to work most of the time.

Actually, I had not read that before you, or someone else, pointed it 
out. Perldoc does not play well on OS/2 and I usually don't bother and
use Google. Unfortunately, I found what must have been an early 
version that did not carry that particular nugget of information. 
Previous uses and sample code I had seen suggested it always worked - 
hence the confusion. Law of expected behaviour :-)

Thanks.
-- 
Regards
Dave Saville


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests. 

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 3247
***************************************

