[33312] in North American Network Operators' Group


Re: Operate until failure

daemon@ATHENA.MIT.EDU (Josh Richards)
Mon Jan 8 19:13:41 2001

Date: Mon, 8 Jan 2001 15:58:32 -0800
From: Josh Richards <jrichard@cubicle.net>
To: nanog@merit.edu
Message-ID: <20010108155831.A9069@datahaven.freedom.gen.ca.us>
In-Reply-To: <20010108223549.28492.cpmta@c004.sfo.cp.net>; from sean@donelan.com on Mon, Jan 08, 2001 at 02:35:49PM -0800

* Sean Donelan <sean@donelan.com> [20010108 15:05]:
>
> And what if you are not using APCs?

But still standalone UPSes?  Don't most data centers have larger UPS(es) or
battery plants (say, two) feeding the entire facility?  The ones I've worked
in have (well, not *all* of them, but those exceptions had much bigger
issues than worrying about how they were going to shut down all of the boxes
at once..)  And if you aren't using standalone UPSes, what do you care what
the interface is to the BigUPS(tm) as long as you can get one of your
network monitoring servers to talk to it (and reliably)?  None of your
servers in the server farm are going to be talking to your BigUPS(tm)
directly anyway..

> One issue with highly redundant data centers is the failure modes are
> "interesting."  You don't want to shut down due to a single UPS failure, so
> you don't use something simple like PowerChute Plus.  You most likely don't
> want to shut down based on any automatic signal.  However, you do want a way
> for an operator to gracefully shut down a lot of equipment quickly when
> the decision is made.

Agreed.  And in this case, the UPS has no involvement.  If the operator
wants the servers shut down, the operator shuts servers down.  No UPS
involved (OK, well not literally).  I realize this doesn't address your
entire point...one sec I'll get to that.

> For a server farm, with potentially thousands of individual systems, is
> there any standard piece of software you can install on all of the systems
> to act as a receiver of a signal to begin a graceful shutdown that does
> not depend on a vendor's proprietary interface?  Preferably one which
> does not involve running a lot of additional wires.

Sure, ssh/rsh[1]. :-) What vendor's proprietary interface -- the OS vendor
of the servers?  The UPSes don't have anything to do with the shutdown
process if the operator is the one making the call.  To accomplish that it's
a simple matter of scripting a bunch of:

    ssh webserver01 'shutdown -h now Power-Go-Bye-Bye'
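
A minimal sketch of that scripting, assuming a plain-text host list (one
name per line) and key-based root ssh to every box.  The shutdown_all
function, the host file, and the SSH override variable are made up for
illustration; they are not from any existing tool:

```shell
#!/bin/sh
# Hypothetical helper: run the shutdown command above on every host
# listed in a file.  Assumes non-interactive (key-based) root ssh access.

SSH="${SSH:-ssh}"    # overridable, e.g. to dry-run with a stub command

shutdown_all() {
    hostfile="$1"
    failed=0
    # "|| [ -n ... ]" also picks up a last line with no trailing newline
    while read -r host || [ -n "$host" ]; do
        # skip blank lines and "#" comments in the host list
        case "$host" in ''|'#'*) continue ;; esac
        # -n stops ssh from swallowing the rest of the host list on stdin;
        # BatchMode fails fast instead of prompting for a password
        if ! $SSH -n -o BatchMode=yes -o ConnectTimeout=5 "$host" \
                'shutdown -h now Power-Go-Bye-Bye'; then
            echo "FAILED: $host" >&2
            failed=$((failed + 1))
        fi
    done < "$hostfile"
    # succeed only if every host accepted the command
    [ "$failed" -eq 0 ]
}

# Run against a host file when one is given on the command line.
[ "$#" -eq 0 ] || shutdown_all "$1"
```

Saved as, say, shutdown-all.sh, it would be run as "sh shutdown-all.sh
hosts.txt".  This version walks the list one host at a time; with thousands
of boxes you would probably want to run the ssh calls in parallel.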

Of course, if you have unmanaged boxes (e.g. customer boxes you do not have
root access to) within the same data center, and you want to do the same
for those, that's a whole other story...

Oh, hmm, and Windows.  Well, remote command execution is possible there too,
from my understanding.

At that point, once all servers are gracefully shut down, you can just shut
the UPS(es) off if your intent is to eventually cut any and all power to the
facility.

Or did I completely miss your point?

> Again this is only needed if people want a graceful shutdown.  If
> you can live with a hard shutdown, you wouldn't require this.  If you
> use ctrl-alt-del as a normal management practice, I suspect you don't
> really require a graceful shutdown.

I'm being anal, but even ctrl-alt-del is graceful on most modern OSes.  The
power or reset button, on the other hand...  :-)

[1] rsh is only mentioned for historical reasons; please don't use it to
manage the remote power capability of your mission-critical server farm
located in your highly redundant data center unless you understand why you
might consider not doing so. :)

-jr

----
Josh Richards [JTR38/JR539-ARIN]
<jrichard@geekresearch.com/cubicle.net/fix.net/freedom.gen.ca.us>
Geek Research LLC - <URL:http://www.geekresearch.com/>
IP Network Engineering and Consulting

