[4226] in Release_7.7_team


Lore of how Red Hat Network trades Security against Scalability

daemon@ATHENA.MIT.EDU (William Cattey)
Wed Feb 25 14:00:44 2004

Mime-Version: 1.0 (Apple Message framework v612)
Content-Type: text/plain; charset=US-ASCII; format=flowed
Message-Id: <7C2FA857-67C4-11D8-ADB7-000A9596D0BC@mit.edu>
Content-Transfer-Encoding: 7bit
Cc: Dan Logcher <dlogcher@mit.edu>, Hal Abelson <hal@mit.edu>
From: William Cattey <wdc@MIT.EDU>
Date: Wed, 25 Feb 2004 13:57:40 -0500
To: release-team@mit.edu

In responding to a query from Theresa on related issues, I found myself 
documenting a discovery I made about how Red Hat Network trades Security 
against Scalability. That tradeoff hurt MIT a little last week, and it 
could hurt MIT a LOT if, for example, the Athena update were replaced 
with RHN Institute-wide.

I'm sending this lore out to a wider audience to help people who are 
unfamiliar with one system or the other understand both better.

The fundamental tradeoff is this: Red Hat insists that nobody may 
perform an update through RHN without a certifying credential.  So 
whenever that credential fails, the potential exists for a systemic 
outage that lasts until every system has been visited to update its 
certificates.

Last week we had an outage that prevented all users of our Red Hat 
Proxy into Red Hat Network from performing any updates until two 
problems at the Red Hat end were fixed, and then each and every machine 
was hand-tooled with new certificates.

Events of the outage:

	1. The Red Hat Certifying Authority Certificate expired.
	2. Nobody could update starting Saturday Feb 14.
	3. It took us three days to get past a Red Hat internal blunder that
	   kept their support people from talking to us.
	4. It took another two days to re-install the CA.  (Their baroque
	   repair procedures were an issue.)
	5. EVERY system using Red Hat Network then had to be hand-tooled to
	   re-install the Red Hat CA Cert and get updates going again.  The
	   announcement that the service was back up (including customer
	   instructions on how to do the hand tooling) was sent out on
	   20 February.

If the Athena Release Team were to go away, and the Athena update were 
converted to "just use Red Hat Network" we would be at risk of another 
such outage any time there was a problem with either the Red Hat 
Certifying Authority Certificate, or with the individual certificate on 
the client host.
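
To illustrate how an expiring certificate could be noticed before it 
causes an outage, here is a minimal sketch in Python.  It is not 
anything RHN or Athena provides.  The function names, the 30-day 
warning window, and the dates in the usage note are all hypothetical; 
the only library call used is the standard ssl.cert_time_to_seconds(), 
which parses the "notAfter" date string found in a certificate.

```python
import ssl

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after, now):
    """Return the number of days between `now` and a certificate's expiry.

    not_after: the certificate's notAfter field as a string in the usual
               form, e.g. "Feb 14 00:00:00 2030 GMT" (hypothetical date).
    now:       the current time in seconds since the epoch.
    """
    return (ssl.cert_time_to_seconds(not_after) - now) / SECONDS_PER_DAY


def needs_renewal(not_after, now, warn_days=30):
    """True if the certificate expires within `warn_days` days.

    The 30-day default is an illustrative assumption, not a recommendation
    from Red Hat.
    """
    return days_until_expiry(not_after, now) <= warn_days
```

A periodic job could call needs_renewal() with time.time() as `now` and 
alert someone well before the expiry date, for instance 
needs_renewal("Feb 14 00:00:00 2030 GMT", time.time()).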

Furthermore, the Red Hat Proxy server represents a single point of 
failure.  A client-driven update that pulls data from an enterprise 
filesystem with replication is a more reliable infrastructure.
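
The reliability argument for a client-driven pull can be sketched as 
follows.  This is only an illustration of the general idea of trying 
replicas in turn, not how the Athena update or any particular 
filesystem actually does it; `fetch_from_replicas`, the `fetch` 
callback, and the replica names are hypothetical.

```python
def fetch_from_replicas(replicas, fetch):
    """Try each replica in order and return the first successful result.

    replicas: a list of replica identifiers (hypothetical names).
    fetch:    a caller-supplied function that takes one replica and either
              returns the data or raises OSError if that replica is down.
    """
    errors = []
    for replica in replicas:
        try:
            return fetch(replica)
        except OSError as exc:
            # Remember the failure and move on to the next replica.
            errors.append((replica, exc))
    # Only when every replica has failed does the client see an outage.
    raise OSError("all replicas failed: %r" % (errors,))
```

With a single proxy, the list has one entry and any failure there stops 
every client.  With several replicas, an update fails only if all of 
them are unreachable at once.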

When the time comes to have a technical conversation with Red Hat, I'll 
try to point out the value of a scalable, non-secure pathway through 
their service to head off such failures.

-wdc

