[32761] in Perl-Users-Digest
Perl-Users Digest, Issue: 4025 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Sep 3 16:09:38 2013
Date: Tue, 3 Sep 2013 13:09:05 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Tue, 3 Sep 2013 Volume: 11 Number: 4025
Today's topics:
Cannot have locale word characters in a variable fmassion@web.de
Re: Cannot have locale word characters in a variable <klaus03@gmail.com>
Re: Cannot have locale word characters in a variable <derykus@gmail.com>
Re: Cannot have locale word characters in a variable <hSoPrAsMt@raSdPnAeMrs.de>
Re: Cannot have locale word characters in a variable <hjp-usenet3@hjp.at>
Re: Cannot have locale word characters in a variable <ben@morrow.me.uk>
Re: Cannot have locale word characters in a variable <klaus03@gmail.com>
Re: Cannot have locale word characters in a variable <ben@morrow.me.uk>
Re: Cannot have locale word characters in a variable <derykus@gmail.com>
Re: Cannot have locale word characters in a variable fmassion@web.de
Re: File deduplication <justin.1303@purestblue.com>
Re: File deduplication <rweikusat@mobileactivedefense.com>
Re: File deduplication <gravitalsun@hotmail.foo>
Re: File deduplication <gravitalsun@hotmail.foo>
Re: File deduplication <rweikusat@mobileactivedefense.com>
Re: File deduplication <justin.1303@purestblue.com>
Re: File deduplication <gravitalsun@hotmail.foo>
Re: File deduplication <rweikusat@mobileactivedefense.com>
Re: File deduplication <jurgenex@hotmail.com>
Re: File deduplication <john@castleamber.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Mon, 2 Sep 2013 10:34:57 -0700 (PDT)
From: fmassion@web.de
Subject: Cannot have locale word characters in a variable
Message-Id: <ca0b4394-2f91-4153-ba3b-bcbfc0c53a2c@googlegroups.com>
My test file:
höheneinstellbar 1234
bedienbar 5678
1111 Müller
größer 8765
My script:
#!/usr/bin/perl -w
use locale;
open(FILE,'test.txt') ;
@sentence = <FILE>;
foreach $sentence (@sentence) {
chomp $sentence;
if ($sentence =~ m/(\w+)(\s)(\d+)/gx) {
print "$1\n";
}}
Instead of "use locale" I have also tried unsuccessfully:
(1)
use utf8;
(2)
use POSIX qw(locale_h);
(3)
use POSIX qw(locale_h);
my $locale = setlocale(LC_ALL, "de_DE");
Result (words broken at German special characters):
heneinstellbar (instead of the expected "höheneinstellbar")
bedienbar
ßer (instead of the expected "größer")
The script works with [\wöäüßÄÖÜ] instead of \w but I assume
there is a better solution.
------------------------------
Date: Mon, 02 Sep 2013 20:22:01 +0200
From: klaus03 <klaus03@gmail.com>
Subject: Re: Cannot have locale word characters in a variable
Message-Id: <l02l05$ttf$1@speranza.aioe.org>
On 02/09/2013 19:34, fmassion@web.de wrote:
> My test file:
> höheneinstellbar 1234
> [...]
> if ($sentence =~ m/(\w+)(\s)(\d+)/gx) {
> print "$1\n";
> [...]
> Result (words broken at German special characters):
> heneinstellbar (instead of the expected "höheneinstellbar")
> [...]
> The script works with [\wöäüßÄÖÜ] instead of \w but I assume there is a better solution.
What is the perl version you are using ?
My very simple test.pl with perl 5.018...
( no "use locale", no "use utf8", no "setlocale()" ):
======================================
use 5.018;
use warnings;
my $sentence = 'höheneinstellbar 1234';
if ($sentence =~ m/(\w+)(\s)(\d+)/gx) {
print "$1\n";
}
======================================
...shows:
höheneinstellbar
------------------------------
Date: Mon, 02 Sep 2013 12:45:16 -0700
From: Charles DeRykus <derykus@gmail.com>
Subject: Re: Cannot have locale word characters in a variable
Message-Id: <l02ps8$c1p$1@speranza.aioe.org>
On 9/2/2013 10:34 AM, fmassion@web.de wrote:
> My test file:
>
> höheneinstellbar 1234
> bedienbar 5678
> 1111 Müller
> größer 8765
>
>
> My script:
> #!/usr/bin/perl -w
> use locale;
> open(FILE,'test.txt') ;
> @sentence = <FILE>;
> foreach $sentence (@sentence) {
> chomp $sentence;
> if ($sentence =~ m/(\w+)(\s)(\d+)/gx) {
> print "$1\n";
> }}
>
> Instead of "use locale" I have also tried unsuccessfully:
> (1)
> use utf8;
> (2)
> use POSIX qw(locale_h);
> (3)
> use POSIX qw(locale_h);
> my $locale = setlocale(LC_ALL, "de_DE");
>
> Result (words broken at German special characters):
>
> heneinstellbar (instead of the expected "höheneinstellbar")
> bedienbar
> ßer (instead of the expected "größer")
>
> The script works with [\wöäüßÄÖÜ] instead of \w but I assume there is a better solution.
>
binmode(STDOUT, ":utf8");
--
Charles DeRykus
------------------------------
Date: Mon, 02 Sep 2013 22:05:37 +0200
From: "Horst-W. Radners" <hSoPrAsMt@raSdPnAeMrs.de>
Subject: Re: Cannot have locale word characters in a variable
Message-Id: <l02r2h$cet$1@online.de>
fmassion@web.de wrote on 02.09.2013 19:34:
> My test file:
>
> höheneinstellbar 1234
> bedienbar 5678
> 1111 Müller
> größer 8765
>
>
> My script:
> #!/usr/bin/perl -w
> use locale;
> open(FILE,'test.txt') ;
> @sentence = <FILE>;
> foreach $sentence (@sentence) {
> chomp $sentence;
> if ($sentence =~ m/(\w+)(\s)(\d+)/gx) {
> print "$1\n";
> }}
>
> Instead of "use locale" I have also tried unsuccessfully:
> (1)
> use utf8;
> (2)
> use POSIX qw(locale_h);
> (3)
> use POSIX qw(locale_h);
> my $locale = setlocale(LC_ALL, "de_DE");
>
> Result (words broken at German special characters):
>
> heneinstellbar (instead of the expected "höheneinstellbar")
> bedienbar
> ßer (instead of the expected "größer")
>
> The script works with [\wöäüßÄÖÜ] instead of \w but I assume there is a better solution.
>
It depends on the encoding of your input file.
Perl assumes Latin-1 encoding unless told otherwise.
If your input encoding is UTF-8, you'll need
open(my $FILE, '<:encoding(utf8)', 'test.txt') or die;
and don't use locale.
Furthermore, on the output side, if your terminal encoding is UTF-8 too,
you'll need
binmode(STDOUT, ':utf8');
to get the output right.
Please read at least
perldoc perluniintro
Regards, Horst
--
<remove S P A M 2x from my email address to get the real one>
------------------------------
Date: Mon, 2 Sep 2013 22:08:19 +0200
From: "Peter J. Holzer" <hjp-usenet3@hjp.at>
Subject: Re: Cannot have locale word characters in a variable
Message-Id: <slrnl29s1j.eb7.hjp-usenet3@hrunkner.hjp.at>
On 2013-09-02 19:45, Charles DeRykus <derykus@gmail.com> wrote:
> On 9/2/2013 10:34 AM, fmassion@web.de wrote:
>> My test file:
>>
>> höheneinstellbar 1234
>> bedienbar 5678
>> 1111 Müller
>> größer 8765
Which character encoding does the file use?
>> My script:
>> #!/usr/bin/perl -w
>> use locale;
>> open(FILE,'test.txt') ;
>> @sentence = <FILE>;
>> foreach $sentence (@sentence) {
>> chomp $sentence;
>> if ($sentence =~ m/(\w+)(\s)(\d+)/gx) {
>> print "$1\n";
>> }}
>>
>> Instead of "use locale" I have also tried unsuccessfully:
>> (1)
>> use utf8;
>> (2)
>> use POSIX qw(locale_h);
>> (3)
>> use POSIX qw(locale_h);
>> my $locale = setlocale(LC_ALL, "de_DE");
>>
>> Result (words broken at German special characters):
>>
>> heneinstellbar (instead of the expected "höheneinstellbar")
>> bedienbar
>> ßer (instead of the expected "größer")
>>
>> The script works with [\wöäüßÄÖÜ] instead of \w but I assume there is
>> a better solution.
>>
>
>
> binmode(STDOUT, ":utf8");
Maybe, but that's secondary. First the file must be read correctly, then
you can worry about printing the results correctly.
So he needs to apply the correct encoding filter to FILE:
open(FILE, "<:encoding($encoding)", 'test.txt')
or
binmode FILE, ":encoding($encoding)";
(of course, $encoding must be set to the correct value first, e.g. "UTF-8" or
"ISO-8859-15")
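Why the value of $encoding matters can be sketched with the core Encode
module, assuming the same text might arrive either as ISO-8859-15 or as
UTF-8 bytes. The byte string and variable names below are illustrative
only and are not taken from the thread:

```perl
use strict;
use warnings;
use Encode qw(decode);

# "Müller" as ISO-8859-15 bytes: ü is the single byte 0xFC.
my $latin_bytes = "M\xFCller";

# Decoding with the matching encoding yields six characters.
my $ok = decode('ISO-8859-15', $latin_bytes);

# Decoding the same bytes strictly as UTF-8 fails, because a lone 0xFC
# is not a valid UTF-8 sequence. FB_CROAK makes decode() die instead of
# substituting a replacement character; LEAVE_SRC keeps $latin_bytes intact.
my $strict = eval {
    decode('UTF-8', $latin_bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
};
my $utf8_failed = !defined $strict;

printf "ISO-8859-15: %d characters, all word characters: %s\n",
    length($ok), ($ok =~ /\A\w+\z/ ? 'yes' : 'no');
print "strict UTF-8 decode failed: ", ($utf8_failed ? 'yes' : 'no'), "\n";
```

The same mismatch applies to an ":encoding(...)" layer on a filehandle:
the layer can only decode correctly if it names the encoding the file
was actually written in.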
perldoc perlunitut.
hp
PS: Lexical file handles are preferred over bare filehandles.
--
_ | Peter J. Holzer | Fluch der elektronischen Textverarbeitung:
|_|_) | | Man feilt solange an seinen Text um, bis
| | | hjp@hjp.at | die Satzbestandteile des Satzes nicht mehr
__/ | http://www.hjp.at/ | zusammenpaßt. -- Ralph Babel
------------------------------
Date: Mon, 2 Sep 2013 21:40:42 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Cannot have locale word characters in a variable
Message-Id: <an9ffa-9ue2.ln1@anubis.morrow.me.uk>
Quoth "Peter J. Holzer" <hjp-usenet3@hjp.at>:
> On 2013-09-02 19:45, Charles DeRykus <derykus@gmail.com> wrote:
> > On 9/2/2013 10:34 AM, fmassion@web.de wrote:
> >>
> >> use locale;
> >> open(FILE,'test.txt') ;
> >> @sentence = <FILE>;
> >> foreach $sentence (@sentence) {
> >> chomp $sentence;
> >> if ($sentence =~ m/(\w+)(\s)(\d+)/gx) {
> >> print "$1\n";
> >> }}
> >>
> >> Instead of "use locale" I have also tried unsuccessfully:
> >> (1)
> >> use utf8;
> >> (2)
> >> use POSIX qw(locale_h);
> >> (3)
> >> use POSIX qw(locale_h);
> >> my $locale = setlocale(LC_ALL, "de_DE");
> >
> > binmode(STDOUT, ":utf8");
>
> Maybe, but that's secondary. First the file must be read correctly, then
> you can worry about printing the results correctly.
>
> So he needs to apply the correct encoding filter to FILE:
>
> open(FILE, "<:encoding($encoding)", 'test.txt')
>
> or
>
> binmode FILE, ":encoding($encoding)";
>
> (of course, $encoding must be set to the correct value first, e.g. "UTF-8" or
> "ISO-8859-15")
If you want de_DE rather than Unicode \w semantics, you also need perl
5.14, and you need to call setlocale and either 'use locale' or use the
/l regex flag.
Ben
------------------------------
Date: Tue, 03 Sep 2013 00:48:54 +0200
From: klaus03 <klaus03@gmail.com>
Subject: Re: Cannot have locale word characters in a variable
Message-Id: <l034km$9g0$1@speranza.aioe.org>
On 02/09/2013 22:40, Ben Morrow wrote:
>
> Quoth "Peter J. Holzer" <hjp-usenet3@hjp.at>:
>> On 2013-09-02 19:45, Charles DeRykus <derykus@gmail.com> wrote:
>>> On 9/2/2013 10:34 AM, fmassion@web.de wrote:
>>>>
>>>> use locale;
>>>> open(FILE,'test.txt') ;
>>>> @sentence = <FILE>;
>>>> foreach $sentence (@sentence) {
>>>> chomp $sentence;
>>>> if ($sentence =~ m/(\w+)(\s)(\d+)/gx) {
>>>> print "$1\n";
>>>> }}
>>>>
>>>> Instead of "use locale" I have also tried unsuccessfully:
>>>> (1)
>>>> use utf8;
>>>> (2)
>>>> use POSIX qw(locale_h);
>>>> (3)
>>>> use POSIX qw(locale_h);
>>>> my $locale = setlocale(LC_ALL, "de_DE");
>>>
>>> binmode(STDOUT, ":utf8");
>>
>> Maybe, but that's secondary. First the file must be read correctly, then
>> you can worry about printing the results correctly.
>>
>> So he needs to apply the correct encoding filter to FILE:
>>
>> open(FILE, "<:encoding($encoding)", 'test.txt')
>>
>> or
>>
>> binmode FILE, ":encoding($encoding)";
>>
>> (of course, $encoding must be set to the correct value first, e.g. "UTF-8" or
>> "ISO-8859-15")
>
> If you want de_DE rather than Unicode \w semantics
de_DE semantics is probably not needed, the usual Unicode semantics of
\w should by default include all German umlauts + other special German
characters.
> you also need perl 5.14,
Yes, Unicode semantics requires a recent perl.
> and you need to call setlocale and either 'use locale' or use the
> /l regex flag.
That's not necessarily needed:
My understanding is that Unicode takes precedence over any locales.
However, you might have to call setlocale, 'use locale' or /l regex
flag, but only if you don't have Unicode semantics (that is: only if
your perl is older than 5.014)
------------------------------
Date: Tue, 3 Sep 2013 00:50:47 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Cannot have locale word characters in a variable
Message-Id: <nrkffa-chr2.ln1@anubis.morrow.me.uk>
Quoth klaus03 <klaus03@gmail.com>:
> On 02/09/2013 22:40, Ben Morrow wrote:
> >
> > If you want de_DE rather than Unicode \w semantics
>
> de_DE semantics is probably not needed, the usual Unicode semantics of
> \w should by default include all German umlauts + other special German
> characters.
Yes. However, Unicode will include (for example) non-Latin letter
characters as letters, which I would not expect a German locale to do.
The difference may or may not be important.
> > you also need perl 5.14,
>
> Yes, Unicode semantics requires a recent perl.
>
> > and you need to call setlocale and either 'use locale' or use the
> > /l regex flag.
>
> That's not necessarily needed:
>
> My understanding is that Unicode takes precedence over any locales.
>
> However, you might have to call setlocale, 'use locale' or /l regex
> flag, but only if you don't have Unicode semantics (that is: only if
> your perl is older than 5.014)
Your understanding is out of date. Up until 5.12, whether regexes
matched with Unicode, ISO8859-1 or locale semantics was rather
unpredictable, though in general if either the pattern or the string was
Unicode then Unicode rules were used. In 5.12 the unpredictability was
fixed, so Unicode semantics were (IIRC) always used.
In 5.14 the /luad regex flags were introduced, which explicitly control
which semantics to use. (These flags do not exist on older perls.) 'use
locale' was also changed to implicitly apply /l to any patterns compiled
in the scope of the pragma. See perl5140delta, which also has pointers
to the relevant core documentation.
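
A small sketch of the difference between two of these flags, assuming
perl 5.14 or later and a UTF-8 encoded source file (hence 'use utf8').
The sample words are illustrative; /l is left out because its result
depends on which locales are installed on the machine:

```perl
use 5.014;    # the /a, /u, /l and /d flags need perl 5.14 or later
use strict;
use warnings;
use utf8;     # the string literals below contain non-ASCII characters

my $german = 'größer';
my $greek  = 'λόγος';

# /u: Unicode rules, so any Unicode letter counts as \w.
my ($u_german) = $german =~ /(\w+)/u;
my $u_greek    = $greek  =~ /\A\w+\z/u ? 1 : 0;

# /a: \w is restricted to [A-Za-z0-9_], so the match stops before "ö".
my ($a_german) = $german =~ /(\w+)/a;
my $a_greek    = $greek  =~ /\w/a ? 1 : 0;

binmode(STDOUT, ':encoding(UTF-8)');
say "/u on größer: $u_german";
say "/a on größer: $a_german";
say "/u accepts Greek letters: $u_greek";
say "/a accepts Greek letters: $a_greek";
```

The Greek word illustrates the point made above: Unicode \w accepts
letters that a German locale would not be expected to treat as letters.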
This ought to mean that, for the first time since 5.6, locales are again
usable and useful in Perl. However, I have not yet had occasion to test
this.
Ben
------------------------------
Date: Mon, 02 Sep 2013 22:21:04 -0700
From: Charles DeRykus <derykus@gmail.com>
Subject: Re: Cannot have locale word characters in a variable
Message-Id: <l03rk1$nk7$1@speranza.aioe.org>
On 9/2/2013 1:08 PM, Peter J. Holzer wrote:
> On 2013-09-02 19:45, Charles DeRykus <derykus@gmail.com> wrote:
>> On 9/2/2013 10:34 AM, fmassion@web.de wrote:
>>> My test file:
>>>
>>> höheneinstellbar 1234
>>> bedienbar 5678
>>> 1111 Müller
>>> größer 8765
>
> Which character encoding does the file use?
>
>
>>> My script:
>>> #!/usr/bin/perl -w
>>> use locale;
>>> open(FILE,'test.txt') ;
>>> @sentence = <FILE>;
>>> foreach $sentence (@sentence) {
>>> chomp $sentence;
>>> if ($sentence =~ m/(\w+)(\s)(\d+)/gx) {
>>> print "$1\n";
>>> }}
>>>
>>> Instead of "use locale" I have also tried unsuccessfully:
>>> (1)
>>> use utf8;
>>> (2)
>>> use POSIX qw(locale_h);
>>> (3)
>>> use POSIX qw(locale_h);
>>> my $locale = setlocale(LC_ALL, "de_DE");
>>>
>>> Result (words broken at German special characters):
>>>
>>> heneinstellbar (instead of the expected "höheneinstellbar")
>>> bedienbar
>>> ßer (instead of the expected "größer")
>>>
>>> The script works with [\wöäüßÄÖÜ] instead of \w but I assume there is
>>> a better solution.
>>>
>>
>>
>> binmode(STDOUT, ":utf8");
>
> Maybe, but that's secondary. First the file must be read correctly, then
> you can worry about printing the results correctly.
>
> So he needs to apply the correct encoding filter to FILE:
>
> open(FILE, "<:encoding($encoding)", 'test.txt')
>
> or
>
> binmode FILE, ":encoding($encoding)";
>
> (of course, $encoding must be set to the correct value first, e.g. "UTF-8" or
> "ISO-8859-15")
> ...
With 'use locale' plus 'binmode(STDOUT,":utf8")', there is correct
output but maybe there are potential shortcomings since locale can be
problematic.
IIUC, doesn't Perl store strings internally as Latin-1, for example, and
seamlessly upgrade them to Unicode as needed? It seems clunky, then, to
nail down the input encoding as well, although perhaps the idea is to
throw an error if the input doesn't validate against the specified encoding?
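
One way to see why the input encoding has to be stated: a string read
without a decoding layer holds bytes, and Perl does not guess what they
mean. A minimal sketch, assuming perl 5.14 or later for the /u flag; the
byte string stands in for a line read from a UTF-8 file and is not taken
from the thread:

```perl
use 5.014;
use warnings;
use Encode qw(decode);

# "größer" as it sits in a UTF-8 file: ö and ß take two bytes each.
my $bytes = "gr\xC3\xB6\xC3\x9Fer";

# Without a decoding step Perl sees eight separate "characters",
# one per byte, and \w is applied to each byte individually.
my ($raw_word) = $bytes =~ /(\w+)/u;

# After decoding, the same data is six characters and one word.
my $chars = decode('UTF-8', $bytes);
my ($decoded_word) = $chars =~ /(\w+)/u;

printf "raw:     length %d, first \\w+ run is %d long\n",
    length($bytes), length($raw_word);
printf "decoded: length %d, first \\w+ run is %d long\n",
    length($chars), length($decoded_word);
```

The internal upgrade from Latin-1 to Unicode storage keeps the meaning
of each character; it does not reinterpret a two-byte UTF-8 sequence as
one character, which is what the decoding step does.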
--
Charles DeRykus
------------------------------
Date: Tue, 3 Sep 2013 05:45:10 -0700 (PDT)
From: fmassion@web.de
Subject: Re: Cannot have locale word characters in a variable
Message-Id: <97ac2d53-97de-401d-887b-09325e383329@googlegroups.com>
Thanks to all of you for your support. The suggestion below didn't work, for whatever reason. I am using Perl v5.14.1 (on Windows 7).
> With 'use locale' plus 'binmode(STDOUT,":utf8")', there is correct
> output but maybe there are potential shortcomings since locale can be
> problematic.
>
I had also tried without success:
use utf8;
binmode STDIN, ":utf8";
binmode STDOUT, ":utf8";
open(FILE,'testfile.txt') or die;
Finally, the following was successful:
open(FILE, '<:encoding(utf8)', 'testfile.txt') or die;
binmode STDOUT, ":utf8"; # output
@sentence = <FILE>;
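
Put together as a complete script, the working approach might look like
the sketch below. The sub name words_before_numbers, the lexical
filehandle and 'use strict' are additions for illustration, not part of
the posted code; the file is assumed to be UTF-8, and 'testfile.txt' is
used only if no path is given on the command line:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the word in front of each "word, whitespace, digits" match,
# reading $path as UTF-8.
sub words_before_numbers {
    my ($path) = @_;
    open(my $fh, '<:encoding(UTF-8)', $path)
        or die "Cannot open '$path': $!";
    my @words;
    while (my $line = <$fh>) {
        chomp $line;
        push @words, $1 if $line =~ /(\w+)\s\d+/;
    }
    close $fh;
    return @words;
}

binmode(STDOUT, ':encoding(UTF-8)');    # assumes a UTF-8 capable terminal

my $path = @ARGV ? $ARGV[0] : 'testfile.txt';
if (-e $path) {
    print "$_\n" for words_before_numbers($path);
}
```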
Francois
------------------------------
Date: Mon, 2 Sep 2013 15:44:36 +0100
From: Justin C <justin.1303@purestblue.com>
Subject: Re: File deduplication
Message-Id: <krkefa-a0e.ln1@zem.masonsmusic.co.uk>
On 2013-09-02, George Mpouras <gravitalsun@hotmail.foo> wrote:
> here is a Perl function to deduplicate your files. Not perfect but works
>
>
> http://antarcticasurfer.wordpress.com/2013/09/02/deduplicate-files-contents/
I think fdupes is much more likely to serve your
purpose correctly and efficiently.
Stop trying to re-invent the wheel, and stop pushing
your code here, no one is asking for it, no one else
does it. If you've a perl problem then post a snippet
and explain what you expect it to do, you'll get any
help you need. But I for one am fed up with what you
keep posting, it's not helpful, useful, or wanted[1].
You're circling the black hole that is my KF, unless
you alter your trajectory you won't escape it.
Justin.
1. Please correct me if I'm wrong. If you look
forward to the next installment of George's code
posting please say and I'll re-align what I consider
this group to be.
--
Justin C, by the sea.
------------------------------
Date: Mon, 02 Sep 2013 16:52:34 +0100
From: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Subject: Re: File deduplication
Message-Id: <87ppsrte4d.fsf@sapphire.mobileactivedefense.com>
Justin C <justin.1303@purestblue.com> writes:
> On 2013-09-02, George Mpouras <gravitalsun@hotmail.foo> wrote:
>> here is a Perl function to deduplicate your files. Not perfect but works
>>
>>
>> http://antarcticasurfer.wordpress.com/2013/09/02/deduplicate-files-contents/
>
> I think fdupes is much more likely to serve your
> purpose correctly and efficiently.
>
> Stop trying to re-invent the wheel,
Some people believe that they've accomplished a technical feat
equivalent to inventing the wheel whenever they've managed to tack
three lines of code together which do something else than 'crash
immediately'. I'd call this another nice example of the
Dunning-Kruger effect in action.
------------------------------
Date: Mon, 02 Sep 2013 23:36:35 +0300
From: George Mpouras <gravitalsun@hotmail.foo>
Subject: Re: File deduplication
Message-Id: <l02ssu$tjt$1@news.ntua.gr>
On 2/9/2013 17:44, Justin C wrote:
> On 2013-09-02, George Mpouras <gravitalsun@hotmail.foo> wrote:
>> here is a Perl function to deduplicate your files. Not perfect but works
> I think fdupes is much more likely to serve your
> purpose correctly and efficiently.
Technically speaking, fdupes finds identical files;
the code I posted deduplicates the content of multiple files, in place.
I think you wrote your reply without spending even 5 seconds reading what
the post was about.
------------------------------
Date: Tue, 03 Sep 2013 00:04:53 +0300
From: George Mpouras <gravitalsun@hotmail.foo>
Subject: Re: File deduplication
Message-Id: <l02uhv$1233$1@news.ntua.gr>
On 2/9/2013 18:52, Rainer Weikusat wrote:
> Justin C <justin.1303@purestblue.com> writes:
>> On 2013-09-02, George Mpouras <gravitalsun@hotmail.foo> wrote:
>>> here is a Perl function to deduplicate your files. Not perfect but works
>>>
>>>
>>> http://antarcticasurfer.wordpress.com/2013/09/02/deduplicate-files-contents/
>>
>> I think fdupes is much more likely to serve your
>> purpose correctly and efficiently.
>>
>> Stop trying to re-invent the wheel,
>
> Some people believe that they've accomplished a technical feat
> equivalent to inventing the wheel whenever they've managed to tack
> three lines of code together which do something else than 'crash
> immediately'. I'd call this another nice example of the
> Dunning-Kruger effect in action.
>
I do not know of any Perl "wheel" that deduplicates the content of a set of files.
------------------------------
Date: Mon, 02 Sep 2013 22:11:47 +0100
From: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Subject: Re: File deduplication
Message-Id: <874na3szcc.fsf@sapphire.mobileactivedefense.com>
George Mpouras <gravitalsun@hotmail.foo> writes:
> On 2/9/2013 18:52, Rainer Weikusat wrote:
>> Justin C <justin.1303@purestblue.com> writes:
>>> On 2013-09-02, George Mpouras <gravitalsun@hotmail.foo> wrote:
>>>> here is a Perl function to deduplicate your files. Not perfect but works
>>>>
>>>>
>>>> http://antarcticasurfer.wordpress.com/2013/09/02/deduplicate-files-contents/
>>>
>>> I think fdupes is much more likely to serve your
>>> purpose correctly and efficiently.
>>>
>>> Stop trying to re-invent the wheel,
>>
>> Some people believe that they've accomplished a technical feat
>> equivalent to inventing the wheel whenever they've managed to tack
>> three lines of code together which do something else than 'crash
>> immediately'. I'd call this another nice example of the
>> Dunning-Kruger effect in action.
>
> I do not know of any Perl "wheel" that deduplicates the content of a set of files.
This 're-invent the wheel' statement is incredibly stupid for two
reasons:
1. 'The wheel' is not some 'static' piece of technology: new kinds
of wheels are constantly being developed, and different kinds, e.g.,
wheels used in high-speed trains vs wheels used for wheelbarrows, are
very much different.
2. The basic design of 'the wheel' represents a very simple way to solve a
particular problem 'perfectly' and has thus been unchanged for a few
thousand years. In contrast to this, software which hasn't either
vanished altogether or undergone a serious redesign for, say, thirty
years, is extremely rare. The same is true for most other 'human
inventions': Usually, they're useless trifles and vanish quickly.
------------------------------
Date: Tue, 3 Sep 2013 09:14:49 +0100
From: Justin C <justin.1303@purestblue.com>
Subject: Re: File deduplication
Message-Id: <pcigfa-70t.ln1@zem.masonsmusic.co.uk>
On 2013-09-02, George Mpouras <gravitalsun@hotmail.foo> wrote:
> On 2/9/2013 17:44, Justin C wrote:
>> On 2013-09-02, George Mpouras <gravitalsun@hotmail.foo> wrote:
>>> here is a Perl function to deduplicate your files. Not perfect but works
>
>> I think fdupes is much more likely to serve your
>> purpose correctly and efficiently.
>
>
>
> Technically speaking, fdupes finds identical files;
> the code I posted deduplicates the content of multiple files, in place.
>
> I think you wrote your reply without spending even 5 seconds reading what
> the post was about.
You go ahead and think what you like, but, for once,
I'm with Rainer, your posts appear to be no more than
Dunning-Kruger in action.
Justin.
--
Justin C, by the sea.
------------------------------
Date: Tue, 03 Sep 2013 12:15:57 +0300
From: George Mpouras <gravitalsun@hotmail.foo>
Subject: Re: File deduplication
Message-Id: <l049cq$2mn5$1@news.ntua.gr>
On 3/9/2013 11:14, Justin C wrote:
> On 2013-09-02, George Mpouras <gravitalsun@hotmail.foo> wrote:
>> On 2/9/2013 17:44, Justin C wrote:
>>> On 2013-09-02, George Mpouras <gravitalsun@hotmail.foo> wrote:
>>>> here is a Perl function to deduplicate your files. Not perfect but works
>>
>>> I think fdupes is much more likely to serve your
>>> purpose correctly and efficiently.
>>
>>
>>
>> Technically speaking, fdupes finds identical files;
>> the code I posted deduplicates the content of multiple files, in place.
>>
>> I think you wrote your reply without spending even 5 seconds reading what
>> the post was about.
>
> You go ahead and think what you like, but, for once,
> I'm with Rainer, your posts appear to be no more than
> Dunning-Kruger in action.
>
>
> Justin.
>
I do not "think"; I am basing this on facts, such as the man pages.
------------------------------
Date: Tue, 03 Sep 2013 13:52:59 +0100
From: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Subject: Re: File deduplication
Message-Id: <871u563w44.fsf@sapphire.mobileactivedefense.com>
George Mpouras <gravitalsun@hotmail.foo> writes:
> here is a Perl function to deduplicate your files. Not perfect but works
>
>
> http://antarcticasurfer.wordpress.com/2013/09/02/deduplicate-files-contents/
This looks like a URL to me. As a comment which is not a flame: You're
doing 'OS detection' at runtime and execute different code based
on that:
if ($^O=~/(?i)MSWin/) {
    unless (0 == system("RD /Q /S \"$temp/$_\"")) {
        die "Could not delete \"$temp/$_\" directory because\"$^E\"\n"
    }
} else {
    unless (0 == system("rm -rf \"$temp/$_\"")) {
        die "Could not delete \"$temp/$_\" directory because \"$^E\"\n"
    }
}
but the OS will rarely ever change at runtime. You should rather move
this into a BEGIN block and create 'a suitable function' you could
then call from the main code. I think you should also consider using
the 'list form' of system so that the runtime doesn't have to parse
your command in order to determine how to execute it, especially
as this would also get around the (broken) 'quoted text
interpolation'. Example:
---------------
BEGIN {
    if ($^O eq 'linux') {
        *rmtree = sub {
            system(qw(rm -rf), $_[0]) == 0 and return;
            die("could not delete '$_[0]': $?");
        };
    }
}
rmtree($ARGV[0]);
---------------
Using $^E/ $! here doesn't make much sense because this will only
contain information about a problem which caused system to fail, not
about one encountered by the program which was started.
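
Both points can be sketched as follows: the list form of system and the
decoding of $? to find out how the started program ended. The sub name
run_checked is hypothetical. The second half uses remove_tree from the
core File::Path module, which is a different technique from the one
suggested above (it was not mentioned in the thread) and avoids the
external command altogether:

```perl
use strict;
use warnings;
use File::Path qw(remove_tree);
use File::Temp qw(tempdir);

# Run a command via the list form of system(), so no shell parses the
# arguments, and report how the child ended by decoding $?.
sub run_checked {
    my @cmd = @_;
    my $rc = system(@cmd);
    return 0 if $rc == 0;
    die "could not start '$cmd[0]': $!\n" if $rc == -1;   # $! is valid only here
    die "'$cmd[0]' died from signal " . ($? & 127) . "\n" if $? & 127;
    return $? >> 8;                                       # child's exit status
}

# Demonstration: have the running perl ($^X) exit with status 3.
my $status = run_checked($^X, '-e', 'exit 3');
print "child exit status: $status\n";

# Portable directory removal without an external command.
my $dir = tempdir();                       # scratch directory for the demo
mkdir "$dir/sub" or die "mkdir: $!";
remove_tree($dir, { error => \my $errors });
print "removed: ", (-d $dir ? 'no' : 'yes'), "\n";
print "errors:  ", scalar(@$errors), "\n";
```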
You should also consider getting rid of the 'inverted comparisons'
habit: This isn't even theoretically useful when both compared objects
are lvalues and mainly communicates a certain mathetic refusal to
accept reality: A lot of programming languages use == as comparison
operator, have been doing so for forty years, and partisan syntax
won't change that. Also, natural western languages work such that
questions are asked in order to determine properties of objects ('Is
the car blue?') and not objects of properties ('Is blue the colour of
the car?'). The latter is just awkward and outlandish style.
------------------------------
Date: Tue, 03 Sep 2013 07:36:43 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: File deduplication
Message-Id: <2tsb291v3qtb69reo1jk4gtg7flduu9loo@4ax.com>
George Mpouras <gravitalsun@hotmail.foo> wrote:
>here is a Perl function to deduplicate your files. Not perfect but works
>
>
>http://antarcticasurfer.wordpress.com/2013/09/02/deduplicate-files-contents/
<excerpt>
....
if ($#dirs == -1)
...
</excerpt>
You must be kidding....
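
The excerpt is quoted without explanation; presumably the objection is
to testing the last index against -1 when the array itself can be used
as the condition. A sketch of the two equivalent tests, with @dirs set
up here as an empty array purely for illustration:

```perl
use strict;
use warnings;

my @dirs = ();

# $#dirs is the index of the last element, so -1 means "empty";
# an array in boolean (scalar) context says the same thing directly.
my $via_last_index = ($#dirs == -1) ? 1 : 0;
my $via_truth      = (!@dirs)       ? 1 : 0;

print "empty by \$#dirs test: $via_last_index\n";
print "empty by \@dirs test:  $via_truth\n";
```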
jue
------------------------------
Date: Tue, 03 Sep 2013 14:58:39 -0500
From: John Bokma <john@castleamber.com>
Subject: Re: File deduplication
Message-Id: <87hae1brtc.fsf@castleamber.com>
Justin C <justin.1303@purestblue.com> writes:
> On 2013-09-02, George Mpouras <gravitalsun@hotmail.foo> wrote:
>> here is a Perl function to deduplicate your files. Not perfect but works
>>
>>
>> http://antarcticasurfer.wordpress.com/2013/09/02/deduplicate-files-contents/
>
> I think fdupes is much more likely to serve your
> purpose correctly and efficiently.
>
> Stop trying to re-invent the wheel,
There's plenty that can be improved about fdupes. For example limiting
it to certain file extensions, skipping directories. Personally I would
like to have a program which I can give a list of dirs I want to "keep"
and a list of dirs I want to "empty". The program will remove all files
that are in the "to empty" list that have a duplicate in the "keep"
list. And no, that's not the same as the auto-delete option that fdupes has.
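
The "keep"/"empty" idea can be sketched in Perl with core modules only.
All sub names below are hypothetical, content is compared by SHA-256
digest (an assumption; a byte-by-byte comparison would be stricter), and
the sketch only lists candidates rather than deleting anything:

```perl
use strict;
use warnings;
use Digest::SHA;
use File::Find qw(find);

# SHA-256 of a file's content, read in binary mode.
sub content_digest {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open '$path': $!";
    my $digest = Digest::SHA->new(256)->addfile($fh)->hexdigest;
    close $fh;
    return $digest;
}

# All regular files below the given directories.
sub files_under {
    my @dirs = @_;
    my @files;
    find({ no_chdir => 1, wanted => sub { push @files, $_ if -f $_ } }, @dirs);
    return @files;
}

# Files under the "empty" dirs whose content also exists under a "keep" dir.
sub removable_duplicates {
    my ($keep_dirs, $empty_dirs) = @_;
    my %kept;
    $kept{ content_digest($_) } = 1 for files_under(@$keep_dirs);
    return grep { $kept{ content_digest($_) } } files_under(@$empty_dirs);
}

# Hypothetical usage: first argument is a "keep" dir, the rest are "empty" dirs.
if (@ARGV >= 2) {
    my ($keep, @empty) = @ARGV;
    print "$_\n" for removable_duplicates([$keep], \@empty);
}
```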
> and stop pushing your code here,
If the OP just drops links to his site, report him for spam. Otherwise I
suggest you use a kill file.
--
John Bokma j3b
Blog: http://johnbokma.com/ Perl Consultancy: http://castleamber.com/
Perl for books: http://johnbokma.com/perl/help-in-exchange-for-books.html
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 4025
***************************************