[145567] in North American Network Operators' Group


Re: [outages] News item: Blackberry services down worldwide, Egypt

daemon@ATHENA.MIT.EDU (Tayeb Meftah)
Wed Oct 12 11:57:34 2011

From: Tayeb Meftah <tayeb.meftah@gmail.com>
In-Reply-To: <CAOF0KO8oHvHhS=c-sn6o6b5xFiZX6SiJhK28MpbBQm8AuvYS-w@mail.gmail.com>
Date: Wed, 12 Oct 2011 17:56:40 +0200
To: Charles Mills <w3yni1@gmail.com>
Cc: "nanog@nanog.org" <nanog@nanog.org>
Errors-To: nanog-bounces+nanog.discuss=bloom-picayune.mit.edu@nanog.org

Idiotberry


Sent from my iPhone

On 12 Oct 2011 at 17:55, Charles Mills <w3yni1@gmail.com> wrote:

> +1
> On Oct 12, 2011 11:51 AM, <Valdis.Kletnieks@vt.edu> wrote:
>
>> On Wed, 12 Oct 2011 09:52:02 CDT, -Hammer- said:
>>> What kills me is what they have told the public. They lost a "core
>>> switch". I don't know if they actually mean network switch or not but
>>> I'm pretty sure any of us that work in an enterprise environment know
>>> how to factor N+1 just for these types of days. And then the backup
>>> solution failed? I'm not buying it either.
>>
>> Yeah, and that extra comma in the one config file that didn't make a
>> difference when you tested the failover in the lab *never* makes a
>> difference when it hits in the production network, right?  Or they
>> changed the config of the primary and it didn't get propagated just
>> right to the backup, or they had mismatched firmware levels on the
>> blades in the primary and backup switches, so traffic that didn't
>> tickle a bug on the primary blades caused the blade to crash on the
>> backup, or...
>>
>> Anybody on this list who's been around long enough probably has enough
>> "We should have had N+2 because the N+1'th device failed too" stories
>> to drain *several* pitchers of beer at a good pub... I've even had one
>> case where my butt got *saved* from an ohnosecond-class whoops because
>> the N+1'th device *was* crashed (stomped a config file, it replicated,
>> was able to salvage a copy from a device that didn't replicate because
>> it was down at the time).
>>
>>

