[77619] in North American Network Operators' Group


Resilience: faults, causes, statistics, open issues

daemon@ATHENA.MIT.EDU (András Császá)
Thu Jan 27 06:40:32 2005

From: András Császár (IJ/ETH) <Andras.Csaszar@ericsson.com>
To: nanog@merit.edu
Date: Thu, 27 Jan 2005 12:39:32 +0100
Errors-To: owner-nanog-outgoing@merit.edu


Hi people!

I've begun research on carrier-grade (aka telecom-grade) resiliency in
IP transport networks. The first step is to collect possible
failure events, their causes and consequences, and statistics on
downtime (mean time to repair) and mean time between failures. I
would also like to identify which problems are most typical: HW bugs,
SW bugs, cable cuts, unplugged cables (link going down), or severe
misconfiguration.

I think this is the perfect forum to get some feedback from real
network-operational experience.

Is there anyone out there who has statistics or documents that would help
me in any way?

Also, do you have any suggestions on open research issues to be solved
in the area?

Any thoughts or comments would be most welcome!

Thanks!

András
