[47927] in North American Network Operators' Group

Network Reliability Engineering

daemon@ATHENA.MIT.EDU (Pete Kruckenberg)
Sat May 18 19:13:35 2002

Date: Sat, 18 May 2002 17:13:02 -0600 (MDT)
From: Pete Kruckenberg <pete@kruckenberg.com>
To: <nanog@merit.edu>
Message-ID: <Pine.LNX.4.33.0205181701090.32373-100000@minot.kruckenberg.com>


I'm looking for some good reference materials to do some
"reliability engineering" calculations and projections.

This is to justify increased redundancy, and I want to
include quantifiable numbers based on MTBF data and other
reliability factors, kind of a scientific justification
instead of just the typical emotional appeal using
analyst/vendor FUD.
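
As a minimal sketch of the kind of calculation meant here, the
snippet below uses the standard steady-state availability formula
A = MTBF / (MTBF + MTTR) and the textbook result for N redundant
units in parallel. The function names and the example figures are
hypothetical placeholders, not measured data, and the parallel
formula assumes the units fail independently, which real networks
often violate (shared power, shared software bugs, human error).

```python
HOURS_PER_YEAR = 8760.0  # non-leap year


def availability(mtbf_hours, mttr_hours):
    """Steady-state availability of one unit: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def parallel_availability(unit_availability, n_units):
    """Availability of n redundant units where any one suffices.

    Assumes failures are statistically independent.
    """
    return 1.0 - (1.0 - unit_availability) ** n_units


def downtime_hours_per_year(avail):
    """Expected annual downtime implied by an availability figure."""
    return (1.0 - avail) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical inputs: a router with a 50,000 h MTBF and 4 h MTTR.
    single = availability(50_000.0, 4.0)
    dual = parallel_availability(single, 2)
    print(f"single: {single:.6f}, {downtime_hours_per_year(single):.3f} h/yr down")
    print(f"dual:   {dual:.9f}, {downtime_hours_per_year(dual):.6f} h/yr down")
```

Comparing the two downtime figures against the cost of the second
unit is one way to put a number on the redundancy argument.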

I'd appreciate references on how to do this in a network
environment (what data to collect, how to collect it, how to
analyze it, etc.). Also any data (or rules of thumb) on typical
MTBFs for network events that I won't find on vendor product
slicks (like the MTBF of IOS, or of human-caused service
outages of various types).

If someone has put together something remotely like this
that they'd care to share, that'd be incredibly helpful.

Thanks.
Pete.


