[176715] in North American Network Operators' Group

Re: Got a call at 4am - RAID Gurus Please Read

daemon@ATHENA.MIT.EDU (Joe Greco)
Wed Dec 10 18:40:57 2014

X-Original-To: nanog@nanog.org
From: Joe Greco <jgreco@ns.sol.net>
To: javier@advancedmachines.us (Javier J)
Date: Wed, 10 Dec 2014 18:07:24 -0600 (CST)
In-Reply-To: <CA+M5dWa+nXWOcn7D53UeerYgfGigrr_=oQ+Q1sYMjcLwieR9uQ@mail.gmail.com>
Cc: Rob Seastrom <rs@seastrom.com>, "nanog@nanog.org" <nanog@nanog.org>
Errors-To: nanog-bounces@nanog.org

> I'm just going to chime in here since I recently had to deal with bit-rot
> affecting a 6TB Linux RAID5 setup using mdadm (6x 1TB disks).
> 
> We couldn't rebuild because of 5 URE sectors on one of the other disks in
> the array after a power/UPS issue rebooted our storage box.
> 
> We are now using ZFS RAIDZ and the question I ask myself is, why wasn't I
> using ZFS years ago?
> 
> +1 for ZFS and RAIDZ

I hope you are NOT using single-parity RAIDZ (RAIDZ1).  The chance of
an error showing up during a resilver is uncomfortably high, and there
are no automatic tools to fix pool corruption with ZFS.  Ideally use
RAIDZ2 or RAIDZ3 to provide more appropriate levels of protection.
Errors introduced into a pool can cause substantial unrecoverable
damage to the pool, so you really want the bitrot detection and
correction mechanisms to be working "as designed."
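
To put a rough number on "uncomfortably high," here is a back-of-the-envelope
sketch.  It assumes a URE rate of one per 1e14 bits read (a figure often
quoted on consumer drive spec sheets, not something measured on the array
above) and treats bit errors as independent, which real drives do not
strictly obey.  The function name and the sizes are illustrative only.

```python
import math

def p_at_least_one_ure(bytes_read, ure_per_bit=1e-14):
    """Probability of hitting at least one URE while reading bytes_read bytes.

    Assumes independent bit errors at a fixed rate (ure_per_bit is an
    assumed spec-sheet value, not a measurement).
    """
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, computed via log1p/expm1 to avoid rounding to zero
    return -math.expm1(bits * math.log1p(-ure_per_bit))

TB = 10**12
# Rebuilding one failed disk in a 6x 1TB single-parity array means
# reading the 5 surviving disks in full (worst case: array is full).
p = p_at_least_one_ure(5 * TB)
print(f"chance of at least one URE during rebuild: {p:.0%}")
```

Under those assumptions the rebuild has roughly a one-in-three chance of
tripping over a URE, and with single parity there is nothing left to
reconstruct that sector from.  A ZFS resilver only reads allocated blocks,
so a mostly empty pool fares better, but the number grows quickly with
larger disks.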

... JG
-- 
Joe Greco - sol.net Network Services - Milwaukee, WI - http://www.sol.net
"We call it the 'one bite at the apple' rule. Give me one chance [and] then I
won't contact you again." - Direct Marketing Ass'n position on e-mail spam(CNN)
With 24 million small businesses in the US alone, that's way too many apples.