[164373] in North American Network Operators' Group


RE: What to expect after a cooling failure

daemon@ATHENA.MIT.EDU (Tony Patti)
Wed Jul 10 10:39:34 2013

From: "Tony Patti" <tony@swalter.com>
To: "'Erik Levinson'" <erik.levinson@uberflip.com>,
 "'NANOG mailing list'" <nanog@nanog.org>
In-Reply-To: <1373426894.69598008@apps.rackspace.com>
Date: Wed, 10 Jul 2013 10:39:10 -0400
Errors-To: nanog-bounces+nanog.discuss=bloom-picayune.mit.edu@nanog.org

This has been a very interesting thread.

Google pointed me to this Dell document, which specs some of their
servers as having an expanded operating temperature range
*** based on the amount of time spent at the elevated temperature, as a
percentage of annual operating hours. ***

ftp://ftp.dell.com/Manuals/all-products/esuprt_ser_stor_net/esuprt_poweredge/poweredge-r710_User%27s%20Guide4_en-us.pdf

I mention that because the "1% of annual operating hours" limit of 45 C
is two degrees higher than the 43 C reported in the original email.

It would seem that Dell recognizes that there might be situations, such
as this, where the "continuous operation" range (35 C) is briefly
exceeded.
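
To put that "1% of annual operating hours" allowance next to the
outage window from the original email, here is a rough sketch of the
arithmetic. It assumes a 365-day year of 24x7 operation and treats the
whole 18:45 to 01:15 event as time spent above the continuous range,
which is an upper bound since the email does not say how long the
temperature was actually elevated. The function names are mine, not
Dell's.

```python
from datetime import datetime, timedelta

# Assumption: non-leap year, equipment powered on around the clock.
HOURS_PER_YEAR = 365 * 24
# "1% of annual operating hours", as quoted from the Dell guide above.
EXCURSION_FRACTION = 0.01


def excursion_budget_hours(fraction=EXCURSION_FRACTION,
                           annual_hours=HOURS_PER_YEAR):
    """Hours per year allowed at the elevated temperature."""
    return fraction * annual_hours


def event_duration_hours(start="18:45", end="01:15"):
    """Length of an event given wall-clock start/end, allowing for midnight."""
    fmt = "%H:%M"
    t0 = datetime.strptime(start, fmt)
    t1 = datetime.strptime(end, fmt)
    if t1 <= t0:
        # The end time falls on the following day.
        t1 += timedelta(days=1)
    return (t1 - t0).total_seconds() / 3600


budget = excursion_budget_hours()
event = event_duration_hours()
print(f"Annual allowance at elevated temperature: {budget:.1f} h")
print(f"Event duration (upper bound on exposure): {event:.1f} h")
print(f"Share of annual allowance used: {event / budget:.1%}")
```

Under those assumptions the allowance is about 87.6 hours a year, and
the 6.5-hour event would use well under a tenth of it.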

Tony Patti
CIO
S. Walter Packaging Corp.

-----Original Message-----
From: Erik Levinson [mailto:erik.levinson@uberflip.com]
Sent: Tuesday, July 09, 2013 11:28 PM
To: NANOG mailing list
Subject: What to expect after a cooling failure

As some may know, yesterday 151 Front St suffered a cooling failure
after Enwave's facilities were flooded.

One of the suites that we're in recovered quickly, but the other took
much longer and some of our gear shut down automatically due to
overheating. We remotely shut down many redundant and non-essential
systems in the hotter suite, and remotely transferred some others to the
cooler suite, to ensure that we had a minimum of all core systems
running in the hotter suite. We waited until the temperatures returned
to normal, and brought everything back online. The entire event lasted
from approximately 18:45 until 01:15. Apparently ambient temperature was
above 43 degrees Celsius at one point on the cool side of cabinets in
the hotter suite.

For those who have gone through such events in the past, what can one
expect in terms of long-term impact? Should we expect some premature
component failures? Does anyone have any stats to share?

Thanks

--
Erik Levinson
CTO, Uberflip
416-900-3830
1183 King Street West, Suite 100
Toronto ON  M6K 3C5
www.uberflip.com