[154318] in North American Network Operators' Group
Re: FYI Netflix is down
daemon@ATHENA.MIT.EDU (steve pirk [egrep])
Sun Jul 1 22:39:12 2012
In-Reply-To: <14393687.11884.1341167937515.JavaMail.root@benjamin.baylink.com>
From: "steve pirk [egrep]" <steve@pirk.com>
Date: Sun, 1 Jul 2012 19:38:16 -0700
To: Jay Ashworth <jra@baylink.com>
Cc: NANOG <nanog@nanog.org>
Errors-To: nanog-bounces+nanog.discuss=bloom-picayune.mit.edu@nanog.org
On Sun, Jul 1, 2012 at 11:38 AM, Jay Ashworth <jra@baylink.com> wrote:
> Not entirely. Datacenters do go down, our best efforts to the contrary
> notwithstanding. Amazon doesn't guarantee you redundancy on EC2, only
> the tools to provide it yourself. 25% Amazon; 75% service provider clients;
> that's my appraisal of the blame.
>
From a Wired article:
> That's what was supposed to happen at Netflix Friday night. But it didn't
> work out that way. According to Twitter messages from Netflix Director of
> Cloud Architecture Adrian Cockcroft and Instagram Engineer Rick Branson, it
> looks like an Amazon Elastic Load Balancing service, designed to spread
> Netflix's processing loads across data centers, failed during the outage.
> Without that ELB service working properly, the Netflix and Pinterest
> services hosted by Amazon crashed.
http://www.wired.com/wiredenterprise/2012/06/real-clouds-crush-amazon/
The GSLB failover that was supposed to take place for the affected
services (those that had configured their applications to fail over) did
not happen.
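For anyone who has not set this up, here is a minimal sketch of the idea behind priority-ordered failover, which is roughly what a GSLB does when it decides where to send traffic. It is an illustration only: the region names, the `pick_endpoint` function and the `is_healthy` callback are all made up for this example and are not anything Amazon or Netflix actually run.

```python
from typing import Callable, Optional, Sequence


def pick_endpoint(
    endpoints: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first endpoint, in priority order, that passes its health check.

    Returns None when every endpoint fails, so the caller can decide what to do.
    """
    for endpoint in endpoints:
        try:
            if is_healthy(endpoint):
                return endpoint
        except Exception:
            # A health check that itself errors out is treated as a failed check.
            continue
    return None


if __name__ == "__main__":
    # Hypothetical regions and health states, for illustration only.
    state = {"us-east": False, "us-west": True, "eu-west": True}
    print(pick_endpoint(["us-east", "us-west", "eu-west"], state.__getitem__))
```

The sketch assumes the component doing the checking and picking stays up. Going by the Wired quote above, that seems to be the part that did not hold on Friday.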
I heard about this the day after Google announced Compute Engine, an
addition to their App Engine product line. The demo was awesome.
I imagine Google has GSLB down pat by now, so some companies might start
looking... ;-]
--steve