[154265] in North American Network Operators' Group
Re: FYI Netflix is down
daemon@ATHENA.MIT.EDU (Cameron Byrne)
Sat Jun 30 09:13:28 2012
In-Reply-To: <4FEEA9C1.1000401@bogus.com>
Date: Sat, 30 Jun 2012 06:12:27 -0700
From: Cameron Byrne <cb.list6@gmail.com>
To: joel jaeggli <joelja@bogus.com>
Cc: nanog@nanog.org
Errors-To: nanog-bounces+nanog.discuss=bloom-picayune.mit.edu@nanog.org
On Jun 30, 2012 12:25 AM, "joel jaeggli" <joelja@bogus.com> wrote:
>
> On 6/30/12 12:11 AM, Tyler Haske wrote:
>>>
>>> I am not a computer science guy but been around a long time. Data centers
>>> and clouds are like software. Once they reach a certain size, its
>>> impossible to keep the bugs out. You can test and test your heart out and
>>> something will slip by. You can say the same thing about nuclear reactors,
>>> Apollo moon missions, the NorthEast power grid, and most other technology
>>> disasters.
>>
>> How to run a datacenter 101. Have more then one location, preferably
>> far apart. It being Amazon I would expect more. :/
>
> there are 7 regions in ec2 three in north america two in asia one in
> europe and one in south america.
>
> us east coast, the one currently being impacted is further subdivided
> into 5 availability zones.
>
> us east 1d appears to be the only one currently being impacted.
>
> distributing your application is left as an exercise to the reader.
>
>
+1
Sorry to be the Monday-morning quarterback, but the sites that went down
learned a valuable lesson in single-point-of-failure analysis: a highly
redundant and professionally run data center is still a single point of
failure. Geo-redundancy is key. In fact, I would take distributed data
centers over RAID, UPS, or any other "fancy pants" © mechanisms any day.
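A back-of-the-envelope sketch of why geo-redundancy helps, in Python. The 99% per-site figure is made up for illustration, the function name is hypothetical, and the arithmetic assumes sites fail independently of one another, which does not hold if they share a provider-wide fault.

```python
def combined_availability(site_availability: float, sites: int) -> float:
    """Chance that at least one of `sites` sites is up.

    Assumes every site has the same availability and that failures
    are independent (an assumption, not a measured property).
    """
    if not 0.0 <= site_availability <= 1.0:
        raise ValueError("site_availability must be between 0 and 1")
    if sites < 1:
        raise ValueError("sites must be at least 1")
    # The service is down only when every site is down at the same time.
    all_down = (1.0 - site_availability) ** sites
    return 1.0 - all_down


if __name__ == "__main__":
    # Illustrative per-site availability; not a figure from any provider.
    for n in (1, 2, 3):
        print(n, "site(s):", f"{combined_availability(0.99, n):.6f}")
```

Under those assumptions each extra site multiplies the chance of a total outage by the per-site failure rate. Correlated failures erode that gain, so the sites should share as little as possible.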
And AWS East also seems to be cursed. I would run out of West for a
while. :-)
I would also look into clouds of clouds, i.e. spreading across more than
one provider. ... Who knows. Amazon could have an Enron moment, at which
point a corporate entity with a tax ID is now a single point of failure.
Pay your money, take your chances.
CB