[188372] in North American Network Operators' Group


Re: Why the US Government has so many data centers

daemon@ATHENA.MIT.EDU (George Herbert)
Sat Mar 19 01:21:00 2016

X-Original-To: nanog@nanog.org
From: George Herbert <george.herbert@gmail.com>
In-Reply-To: <3B1CC2C8-560A-4D02-8AED-3CE8180EC7FA@n5tech.com>
Date: Fri, 18 Mar 2016 22:21:01 -0700
To: Todd Crane <todd.crane@n5tech.com>
Cc: "nanog@nanog.org" <nanog@nanog.org>
Errors-To: nanog-bounces@nanog.org


So...

Before I go on: I have not been in Todd's shoes, neither serving in nor directly supporting an org like that.

However, I have indirectly supported orgs like that and consulted at or supported literally hundreds of commercial and a few educational and nonprofit orgs over the last 30 years.

There are corner cases where distributed resilience is paramount, including a lot of field operations (of all sorts), operations on ships (and aircraft and spacecraft), and places where the net really is unstable.  Any generalization that sweeps those legitimate exceptions in is overreaching its valid descriptive range.

That said, in the vast bulk of normal-world environments, individuals make justifications like Todd's and argue for distributed services, private servers, etc.  And then they do not run them reliably, with patches, backups, central security management, asset tracking, redundancy, DR plans, etc.

And then they break, and in some cases are, and will forever be, lost.  In other cases they will "merely" take 2, 5, 10, or in one case more than 100 times longer to repair and more money to recover than they should have.

Statistically, this is very, very poor operational practice.  Not so much because of location (though that plays some part) but because of the lack of care and quality management once systems get distributed and lost out of IT's view.

Statistically, several hundred clients in and a hundred or so organizational assessments in, if I find servers that matter under desks, you have about a 2% chance that your IT org can handle supporting and managing them appropriately.

If you think that 98% of servers in a particular category being at high risk of unrecoverable or very difficult recovery when problems crop up is acceptable, your successor may be hiring me, or someone else who consults a lot, for a very bad day's cleanup.

I have literally been at a billion dollar IT disaster, and at tens of smaller multimillion dollar ones, trying to clean them up.  This is a very sad type of work.

I am not nearly as cheap for recoveries as for preventive management and proactive fixes.


George William Herbert
Sent from my iPhone

> On Mar 18, 2016, at 9:28 PM, Todd Crane <todd.crane@n5tech.com> wrote:
>
> I was trying to resist the urge to chime in on this one, but this discussion has continued for much longer than I had anticipated... So here it goes.
>
> I spent 5 years in the Marines (out now), during which one of my MANY duties was to manage these "data centers" (a part of me just died as I used that word to describe these server rooms). I can't get into what exactly I did or with what systems on such a public forum, but I'm pretty sure that most of the servers I managed would be exempted from this paper/policy.
>
> Anyways, I came across a lot of servers in my time, but I never came across one that I felt should've been located elsewhere. People have brought up the case of the personal share drive, but what about the combat camera (think public relations) unit that has to store large quantities (100s of 1000s) of high-resolution photos and retain them for years? Should I remove that COTS (commercial off the shelf) NAS from underneath the Boss' desk, put it in a data center 4 miles down the road, and force all that traffic down a network that was designed for light to moderate web browsing and email traffic, just so I can check a box for some politician's reelection campaign ads on how they made the government "more efficient"?
>
> Better yet, what about the backhoe operator who didn't call before he dug and cut my line to the datacenter? Now we cannot respond effectively to a natural disaster in the Asian Pacific, or a bombing in the Middle East, or a platoon that has come under fire and will die if they can't get air support, all because my watch officer can't even log in to his machine, since I can no longer have a backup domain controller on-site.
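
The login-dependency point can be put in rough availability terms; the circuit availability below is an assumed placeholder, not data from the thread:

    # Illustrative downtime math for a single circuit back to the datacenter.
    # Assumed availability; real SLAs and backhoe-repair times vary widely.

    link_availability = 0.999                  # one WAN path, assumed
    hours_per_year    = 8766
    wan_down_hours    = hours_per_year * (1 - link_availability)
    print(f"expected WAN downtime: ~{wan_down_hours:.1f} hours/year")

    # With no on-site domain controller, logins fail for all of those hours.
    # With an on-site backup DC, authentication rides through the cut, and
    # only services that genuinely must live off-site are affected.
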
>
> These seem very far-fetched to most civilian network operators, but to anybody who has maintained military systems, this is a very real scenario. As mentioned, I'm pretty sure my systems would be exempted, but most would not. When these systems are vital to national security and life-and-death situations, it can become a very real problem. I realize that this policy was intended for more run-of-the-mill scenarios, but the military is almost always grouped in with everyone else anyway.
>
> Furthermore, I don't think most people realize the scale of these networks. NMCI, the network that the Navy and Marine Corps used (when I was in), had over 500,000 active users in the AD forest. When you have a network that size, you have to be intentional about every decision, and you should not leave it up to a political appointee who has trouble even checking their email.
>
> When you read about how much money the US military hemorrhages, just remember...
> - The multi-million-dollar storage array, combined with a complete network overhaul and multiple redundant 100G+ DWDM links, was "more efficient" than a couple of NAS boxes that we picked up off of Amazon for maybe $300, sitting under a desk connected to the local switch.
> - Using an old machine that would otherwise be collecting dust to ensure that users can log in to their computers despite conditions outside of our control is apparently akin to treason and should be dealt with accordingly.
> </rant>
>
>
> --Todd
>
> Sent from my iPad
>
>>> On Mar 14, 2016, at 11:01 AM, George Metz <george.metz@gmail.com> wrote:
>>>
>>> On Mon, Mar 14, 2016 at 12:44 PM, Lee <ler762@gmail.com> wrote:
>>>
>>>
>>> Yes, *sigh*, another what kind of people _do_ we have running the govt
>>> story.  Altho, looking on the bright side, it could have been much
>>> worse than a final summing up of "With the current closing having been
>>> reported to have saved over $2.5 billion it is clear that inroads are
>>> being made, but ... one has to wonder exactly how effective the
>>> initiative will be at achieving a more effective and efficient use of
>>> government monies in providing technology services."
>>>
>>> Best Regards,
>>> Lee
>>
>> That's most likely an inaccurate cost-savings figure, though; it probably doesn't take into account the impacts of the consolidation on other items. As a personal example, we're in the middle of upgrading my site from an OC-3 to an OC-12, because we're running routinely at 95+% utilization on the OC-3 with 4,000+ seats at the site. The reason we're running that high is that several years ago they "consolidated" our file storage, so instead of file storage (and, actually, dot1x authentication, though that's relatively minor) being local, everyone has to hit a datacenter some 500+ miles away over that OC-3 every time they have to access a file share. And since they're supposed to save everything to their personal share drive instead of the actual machine they're sitting at, the results are predictable.
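
The per-seat arithmetic behind those utilization numbers is stark; a minimal sketch, using nominal line rates and the seat count from the post (everything else simplified):

    # Nominal payload rates, ignoring SONET and protocol overhead.
    seats = 4000
    for name, mbps in (("OC-3", 155), ("OC-12", 622)):
        print(f"{name:>5}: ~{mbps * 1000 / seats:.0f} kbps per seat")

    # Roughly 39 kbps per seat on the OC-3, and only ~156 kbps per seat even
    # after the OC-12 upgrade -- not much headroom for 4,000 people hitting
    # file shares 500+ miles away.
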
>>
>> So how much is it going to cost for the OC-12 over the OC-3 annually? Is that difference higher or lower than the cost to run a couple of storage servers on-site? I don't know the math personally, but I do know that if we had storage (and RADIUS auth and, hell, even a shell server) on site, we wouldn't be needing to upgrade to an OC-12.
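
One way to frame that comparison is sketched below; every dollar figure is a placeholder assumption, since no actual circuit or hardware prices appear in the thread:

    # Compare the recurring cost of the circuit upgrade against on-site gear.
    # All prices are made-up placeholders for illustration.

    oc3_monthly   = 8_000    # assumed monthly circuit cost, USD
    oc12_monthly  = 20_000   # assumed
    storage_capex = 30_000   # a couple of on-site storage/RADIUS boxes, assumed
    storage_opex  = 10_000   # assumed yearly power/admin/support

    upgrade_delta_per_year = (oc12_monthly - oc3_monthly) * 12
    onsite_cost_year_one   = storage_capex + storage_opex

    print(f"extra circuit cost per year : ${upgrade_delta_per_year:,}")
    print(f"on-site option, year one    : ${onsite_cost_year_one:,}")

    # If the recurring circuit delta exceeds the on-site cost, the claimed
    # consolidation "savings" never materialized -- which is exactly the
    # question Metz is asking.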
