[3360] in Release_7.7_team


how much bandwidth will the 9.1 release require

daemon@ATHENA.MIT.EDU (Jonathon Weiss)
Tue Jun 25 00:37:06 2002

Message-Id: <200206250437.AAA03249@attraction.mit.edu>
To: release-team@MIT.EDU
Date: Tue, 25 Jun 2002 00:37:03 -0400
From: Jonathon Weiss <jweiss@MIT.EDU>


I did some rough calculations on how much network bandwidth we'll
need to push out the 9.1 release.  My recommendations are highlighted
with asterisks.  We should probably at least mention this briefly on
Monday.

	Jonathon



Notes:
    All AFS servers have 100Mbps interfaces
    The w20 network is connected to the rest of campus at 100Mbps
    The w92 network is connected at 1Gbps
    There is no shortage of AFS disk space
    Assume a 4 hour desynchronization of updates (there may be an
        additional two hours from the hesiod TTLs, but don't count on it)
    The w20 machine counts are from cview
    The total machine counts are from the numbers Bill tracks,
        which means machines that have rebooted in the last 6
        months, paying no attention to whether the machine has
        hesiod info
    The amount of data to be transferred is based upon df output from
        several 9.1 machines.


PC  2GB x 960 machines  (43 in the w20 cluster)
but how many of these are student linux machines that aren't around
for the summer?

Athinfo run of 1217 machines that either had public-linux hesiod or
    were in Bill's list (a rough script for this kind of survey is
    sketched after the tally):
	453 AUTOUPDATE=true
	 70 AUTOUPDATE=false
	155 connection refused (presumably running another OS right now)
	539 down
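
For what it's worth, here is a rough sketch of how a survey like this
can be scripted.  The "autoupdate" query name, the hosts.txt host
list, and the 5 second timeout are my assumptions, not the actual
invocation used:

    #!/usr/bin/env python3
    # Tally athinfo results for a list of hosts.  The "autoupdate"
    # query name and the hosts.txt file are assumptions made for
    # illustration, not the exact invocation used for the survey.
    import subprocess

    counts = {"true": 0, "false": 0, "refused": 0, "down": 0}
    for host in open("hosts.txt").read().split():
        try:
            out = subprocess.run(["athinfo", host, "autoupdate"],
                                 capture_output=True, text=True,
                                 timeout=5)
        except subprocess.TimeoutExpired:
            counts["down"] += 1      # no response at all
            continue
        if out.returncode != 0:
            counts["refused"] += 1   # refused (running another OS?)
        elif "AUTOUPDATE=true" in out.stdout:
            counts["true"] += 1
        else:
            counts["false"] += 1
    print(counts)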
	
2GB x 1024MB/GB x 8Mb/MB = 16384Mb per machine
16384Mb x 960 machines = 15728640Mb
4 hours x 3600s/h = 14400s
# Calculate the cumulative bandwidth required to push the update out
15728640Mb / 14400s = 1092Mbps
1092 / 2 = 546Mbps (if only 1/2 of the 960 machines will actually autoupdate)
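
The same arithmetic as a quick sanity-check script (all numbers
copied from above; nothing here is newly measured):

    # Aggregate bandwidth needed to push 2GB to 960 machines in 4 hours.
    gb_per_machine = 2       # per-machine update size, from df
    machines = 960
    window_s = 4 * 3600      # 4 hour desynchronization window

    total_mb = gb_per_machine * 1024 * 8 * machines  # total megabits
    print(total_mb / window_s)      # ~1092 Mbps, all machines
    print(total_mb / window_s / 2)  # ~546 Mbps, if half autoupdate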

 500M of Athena stuff (1/4 of the total data)
1500M of redhat stuff (3/4 of the total data)

Looking at just the w20 cluster:
16384Mb x 43 = 704512Mb; 704512Mb / 14400s = ~49Mbps (this is
potentially a problem, given that the w20 cluster is on a
significantly hubbed network.  My guess is that the cluster is served
by 3 10Mbps segments that are switched together.  Assuming I'm
correct, an optimal split, some additional desynchronization due to
users and hesiod, and not too much other traffic, we might squeak by
without huge delays.)

Looked at another way:
10Mbps x 14400s / (8Mb/MB) / (1024MB/GB) / (2GB/machine) = 8.8 machines
A single 10Mbps segment can't handle updates for more than about 8
machines in the window.
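
And the inverse calculation, as a sketch:

    # Machines one 10Mbps segment can update within the 4 hour window.
    segment_mbps = 10
    window_s = 4 * 3600
    gb_per_machine = 2

    print(segment_mbps * window_s / (8 * 1024 * gb_per_machine))  # ~8.8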

* Recommendation (based on the conservative assumption of up to 1/2 of
* the 960 machines being active and wanting to update): 2-3 rep sites of
* system.rhlnx.athena91.readonly with exactly 1 in w20, and 5 rep sites
* of system.rhlnx.rh73.rpm.readonly with exactly 1 in w20.  These
* numbers (especially the latter) can be reduced a week or two after
* the release goes out, since the packs are not referenced in day to
* day use, and small patch releases will generate smaller amounts of
* traffic.  Also do the release relatively early (11 or midnight), so
* if things do bog down there is some time to recover.




SUN 400M x 1124 machines (38 in the w20 cluster)

Without doing all of the math, let's assume that roughly all of the
suns are active, but each has less than 1/4 the data of a linux box.
That works out to 400MB x 8Mb/MB x 1124 machines / 14400s = ~250Mbps,
so we can safely bound this by the same 546Mbps as above.
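
Plugging the Sun numbers into the same formula (again just a sketch,
same window and assumptions as above):

    # Same calculation with the Sun numbers: 400MB each, all 1124 active.
    mb_per_machine = 400 * 8   # megabits per sun
    machines = 1124
    window_s = 4 * 3600

    print(mb_per_machine * machines / window_s)  # ~250 Mbps, under 546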

* Recommendation: Our existing 4 rep sites of the OS and 5 of the srvd
* are probably more than enough.  We might want to balance toward w92
* (particularly on the OS side), but I don't know how big a deal it is.

* Recommendation: Release Solaris and linux on different nights,
* particularly to protect slow client networks.  Release Irix probably
* the same night as Solaris.  Also do the release relatively early (11
* or midnight), so if things do bog down there is some time to recover.
