Perl-Users Digest, Issue: 4313 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat Nov 22 03:09:16 2014

Date: Sat, 22 Nov 2014 00:09:03 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sat, 22 Nov 2014     Volume: 11 Number: 4313

Today's topics:
        A hash of references to arrays of references to hashes. <see.my.sig@for.my.address>
    Re: A hash of references to arrays of references to hashes <gamo@telecable.es>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Fri, 21 Nov 2014 21:20:48 -0800
From: Robbie Hatley <see.my.sig@for.my.address>
Subject: A hash of references to arrays of references to hashes... is there a better way?
Message-Id: <gfednc6pXr-tgO3JnZ2dnUVZ5703t52d@giganews.com>

Greetings, group. I've been away several years, but I'm back. :-)
I've changed jobs (several times), changed homes (several times).
I've also changed programming platforms. I got rid of Windows 2000
and djgpp, and I'm currently using the following 2 platforms:
1. Perl on Cygwin on Win 8.1 on notebook computer
2. Perl on Point Linux on desktop computer

I'd gotten somewhat good at Perl in 2005-2008, stopped using it
for a while (got distracted), but started getting back into it in
November 2014. I'm taking up a program I started writing in 2005
but never finished. (At least, not in Perl; I have a version
written in C++, but that only works on djgpp on Win2K, so it's
not portable.) The program will eventually find and erase
duplicate files. I've pasted what I've written so far at the end
of this message for reference.

This program uses a hash of references to arrays of references
to hashes. (Gulp.) Seems to me there's got to be an easier way.
In C++ I just use "multimaps" from the C++ Standard Template
Library. Maybe there's something like that in CPAN but I haven't
looked yet.
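
(To make sure I'm describing the structure right, here it is in
miniature, with made-up names:

my %FilesBySize =
   (
      1024 => [ { "Name" => "a.txt" }, { "Name" => "b.txt" } ],
      2048 => [ { "Name" => "c.txt" } ],
   );

As far as I can tell, that's the usual Perl stand-in for a
multimap: one key, many values.)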

My question is this: do the programmers here see any places in what
I've written below where things could be expressed more briefly or
clearly?

The part where I'm adding file-record hashes to arrays seems clunky
to me. The idea is this: We're riffling through all files in the
current directory, storing each file's record as a hash, grouping
records of the same file size into arrays, and inserting those
arrays into a hash keyed by file size. So, if a file of the same
size as the current file has already been processed, add the
current file's record to the appropriate array; otherwise, create
a new array holding the record and insert it into the outer hash.
But the way I have this implemented below is kinda ugly. Is there
a better way of doing this?
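
(I half-suspect "push" plus autovivification would collapse the
whole if/else into a single statement, something like this
untested sketch:

push @{$CurDirFiles{$Size}},
   {
      "Date" => $ModDate,
      "Time" => $ModTime,
      "Type" => $Type,
      "Size" => $Size,
      "Attr" => $mode,
      "Name" => $FileName
   };

but I'm not sure whether that's considered good style.)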

And the following line looks too complicated to me; it works,
but is there a better way to do this?

foreach my $HashRef (@{$CurDirFiles{$Size}})
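
(The only alternative I've come up with so far is to name the
reference first, something like:

my $RecordsForSize = $CurDirFiles{$Size};
foreach my $HashRef (@$RecordsForSize)

where "$RecordsForSize" is just a name I made up; I can't tell
whether that's actually clearer.)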


Here's the entire program (what I've written of it so far) for
reference:

#!/usr/bin/perl

################################################################################
# dedup3                                                                       #
# Duplicate file finding/erasing program.                                      #
# Written by Robbie Hatley, starting 2005-06-21, as a "learn Perl" exercise.   #
# Plan: Recursively descend directory tree starting from current working       #
# directory, and make a master list of all files encountered on this branch.   #
# Order the list by size.  Within each size group, compare each file, from     #
# left to right, to all the files to its right.  If a duplicate pair is found, #
# alert user and get user input.  Give user these choices:                     #
# 1. Erase left file                                                           #
# 2. Erase right file                                                          #
# 3. Ignore this pair of duplicate files and move to next                      #
# 4. Quit                                                                      #
# If user elects to delete a file, delete it, then move to next duplicate.     #
# Edit history:                                                                #
#    Tue Jun 21, 2005 - Started writing it.                                    #
#    Thu Nov 20, 2014 - Getting back to this exercise after 9-year hiatus.     #
################################################################################

use v5.14;
use strict;
use warnings;

use Cwd;

sub time_from_mtime;
sub date_from_mtime;

my $CurDir;
my %CurDirFiles;

$CurDir = getcwd();
print "CWD = ", $CurDir, "\n";
opendir(my $Dot, ".") or die "Can't open directory: $!";

# "defined" guards against the loop ending early on a file named "0":
while (defined(my $FileName = readdir($Dot)))
{
    my ($dev,     $ino,     $mode,    $nlink,   $uid,
        $gid,     $rdev,    $size,    $atime,   $mtime,
        $ctime,   $blksize, $blocks)
       = stat($FileName);

    my $ModDate = date_from_mtime($mtime);
    my $ModTime = time_from_mtime($mtime);
    my $Size = -s _;
    my $Type;

    if ( -d _ )
    {
       $Type = "Dir";
    }
    else
    {
       $Type = "File";
    }

    if ($CurDirFiles{$Size})
    {
       $CurDirFiles{$Size} =
          [
             @{$CurDirFiles{$Size}},
             {
                "Date" => $ModDate,
                "Time" => $ModTime,
                "Type" => $Type,
                "Size" => $Size,
                "Attr" => $mode,
                "Name" => $FileName
             }
          ];
    }
    else
    {
       $CurDirFiles{$Size} =
          [
             {
                "Date" => $ModDate,
                "Time" => $ModTime,
                "Type" => $Type,
                "Size" => $Size,
                "Attr" => $mode,
                "Name" => $FileName
             }
          ];
    }
}

closedir($Dot);

foreach my $Size (reverse sort {$a<=>$b} keys %CurDirFiles)
{
    ##### Could this next line be written a better way? #####
    foreach my $HashRef (@{$CurDirFiles{$Size}})
    {
       print($$HashRef{Date}, "  ");
       print($$HashRef{Time}, "  ");
       print($$HashRef{Type}, "  ");
       print($$HashRef{Size}, "  ");
       print($$HashRef{Attr}, "  ");
       print($$HashRef{Name}, "\n");
    }
}

sub date_from_mtime
{
    # Relies on the fixed-width output of scalar localtime,
    # e.g. "Thu Nov 20 12:34:56 2014".
    my $TimeDate = scalar localtime shift @_;
    my $Date = substr ($TimeDate, 0, 10);    # "Thu Nov 20"
    $Date .= ", ";
    $Date .= substr ($TimeDate, 20, 4);      # "2014"
    return $Date;
}

sub time_from_mtime
{
    # Extracts the "12:34:56" field from the same fixed-width string.
    my $TimeDate = scalar localtime shift @_;
    my $Time = substr ($TimeDate, 11, 8);
    return $Time;
}
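
(For the "compare each file to all the files to its right" stage in
the plan above, I'm picturing something like this untested sketch,
using the core module File::Compare, whose compare() returns 0 when
two files have identical contents:

use File::Compare;

foreach my $Size (keys %CurDirFiles)
{
   # Only regular files can be duplicates; skip directories.
   my @Group = grep { $_->{Type} eq "File" } @{$CurDirFiles{$Size}};
   for my $i (0 .. $#Group - 1)
   {
      for my $j ($i + 1 .. $#Group)
      {
         if (0 == compare($Group[$i]{Name}, $Group[$j]{Name}))
         {
            print "Duplicate pair: $Group[$i]{Name}  $Group[$j]{Name}\n";
         }
      }
   }
}

But I haven't gotten that far yet.)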




-- 
Robbie Hatley
lonewolf [at] well [dot] com


------------------------------

Date: Sat, 22 Nov 2014 08:06:04 +0100
From: gamo <gamo@telecable.es>
Subject: Re: A hash of references to arrays of references to hashes... is there a better way?
Message-Id: <m4pckr$3oh$1@speranza.aioe.org>

On 22/11/14 at 06:20, Robbie Hatley wrote:
> and djgpp, and I'm currently using the following 2 platforms:
> 1. Perl on Cygwin on Win 8.1 on notebook computer
> 2. Perl on Point Linux on desktop computer

Since you use 2), you can't code the thing as if
`locate -b "$filename"` did not exist.

Another consideration: after you find candidate dupes by
their file attributes, you must check whether they really
are dupes, so the use of Digest::SHA3 is advisable.
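
For example (an untested sketch; it assumes Digest::SHA3 follows
the Digest::SHA-style interface, and confirm_dupes is just a name
I made up):

use Digest::SHA3;

# Group candidate file names by SHA3-256 digest; only groups
# with more than one member are true duplicates.
sub confirm_dupes {
    my @candidates = @_;
    my %by_digest;
    for my $file (@candidates) {
        my $sha3 = Digest::SHA3->new(256);
        $sha3->addfile($file);    # addfile accepts a file name
        push @{ $by_digest{ $sha3->hexdigest } }, $file;
    }
    return grep { @$_ > 1 } values %by_digest;
}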

-- 
http://www.telecable.es/personales/gamo/


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests. 

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 4313
***************************************

