[55689] in North American Network Operators' Group


Re: Cascading Failures Could Crash the Global Internet

daemon@ATHENA.MIT.EDU (Jack Bates)
Fri Feb 7 08:54:37 2003

From: "Jack Bates" <jbates@brightok.net>
To: "N. Richard Solis" <nrsolis@aol.net>,
	"Vadim Antonov" <avg@kotovnik.com>
Cc: <nanog@merit.edu>
Date: Fri, 7 Feb 2003 07:53:50 -0600
Errors-To: owner-nanog-outgoing@merit.edu


N. Richard Solis wrote:

> Yeah yeah yeah.  I know that everything isn't simple.  I actually
> worked at a power plant so none of this is new to me.  Can cascading
> failures occur?  Yes.  Witness the Great Blackout in NYC.  My point
> was that there are places where the electrical network is designed to
> "blow the bolts" to TRY and protect everything.  Does it work?  Most
> of the time, yes.  All of the time?  NO.

Bringing this back on topic: what you are referring to is similar to a
failure within an AS. When one section of your network starts having
problems that could jeopardize the rest of the network, you cut it off
until the problem can be fixed. Does it work? Most of the time, yes.
All the time? No. Sometimes the failure spreads too rapidly to prevent
a cascade within the AS. This practice is separate from grid and AS
interconnects.

> It is a complicated problem but you'd be surprised at how fast things
> can happen when you HAVE to keep the system running.  There is a
> tremendous amount of skill concentrated in that field and they do a
> good job of keeping everything running well.  How many turbine
> overspeed events do <snip>

I agree. The same can be said for many networks. The difference is that
dealing with some networking problems is new to many engineers. Without
proper training, and without expecting a cascade failure, how do you
know the fastest way to deal with it? I've had lots of practice on my
network. My average stabilization time is about 5 minutes now, but
then, I redesigned my network a long time ago so that such problems
could be dealt with in a shorter time span.

> The loss of a single transmission line isn't going to cause a whole
> station to trip.  If you're losing a bunch though, you've probably got
> lots of other problems to worry about.

Also true of many networks today. However, this topic falls within a
single grid. The original analogy dealt with grid interconnects, which
have different requirements and must be protected at all costs. If an
entire grid lost integrity (and I don't think that has happened in a
very long time), it would be unacceptable for the failure to cascade
into the other two grids, so extra precautions are put in place. In the
same regard, many Autonomous Systems have different policies for their
interconnects than for their internal network.

-Jack






