[3189] in Release_7.7_team
black Dell cluster machine net still poor?
daemon@ATHENA.MIT.EDU (Erik Wile)
Tue Mar 26 15:32:38 2002
Message-Id: <200203262032.PAA10488@esw.mit.edu>
To: release-team@MIT.EDU
Date: Tue, 26 Mar 2002 15:32:35 -0500
From: Erik Wile <esw@MIT.EDU>
Some cluster machines have had piss poor network performance, on the
order of 50kB/s at 3am on a Friday night (don't ask how I know
this :), while adjacent machines get 800kB/s to the same host shortly
thereafter. I noticed a line about this in the 9.0.25 patch release
notes, excerpted here:
    * On Linux, a workaround is in place for the 3C509 autonegotiation
      problem causing poor network performance for some black Dell
      machines.
I just checked one of the poorly performing machines [m1-142-16] which
took the update last night:
    Athena Workstation (linux) Version Update Mon Mar 25 23:52:30 EST 2002
    Athena Workstation (linux) Version 9.0.25 Tue Mar 26 00:54:15 EST 2002
And it's still performing poorly. Granted, my quick test was at midday,
with presumably a fair amount of competing traffic, but it was still
suspiciously pegged at 50kB/s. Does anyone know what the root cause
of this is, what the workaround was, and why it might not have taken?
Grepping for eth0 in /var/log/messages didn't show anything telltale
when I did it on Friday night... I failed to check this afternoon and
I'm far from the cluster now. I noticed that "athinfo m1-142-16
interfaces" shows 0.1% RX-ERRs for eth0, and similarly low percentages
for other known bad machines, while it shows 0 errors for known good
machines such as m1-142-9 and m1-142-12. That's interesting. Hmmm,
now that I think about it, I might have just grepped for eth0 in dmesg,
not /var/log/messages. Oops.
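For anyone who wants to compare RX error rates across machines without
eyeballing counters, here is a rough sketch that computes the
percentage from text in the Linux /proc/net/dev layout. The function
name and the sample counters are made up for illustration (they are
not output from m1-142-16), and I'm assuming the percentage is simply
RX errors over RX packets, which may not be exactly how the figures
quoted above were derived.

```python
# Hypothetical sample in /proc/net/dev layout; the counters are invented.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:   52000     400    0    0    0     0          0         0    52000     400    0    0    0     0       0          0
  eth0: 9000000 1000000 1000    0    0  1000          0         0  7000000  800000    0    0    0     0       0          0
"""


def rx_error_rates(proc_net_dev_text):
    """Return {interface: RX errors as a percentage of RX packets}."""
    rates = {}
    for line in proc_net_dev_text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, stats = line.split(":", 1)
        fields = stats.split()
        if len(fields) < 3:
            continue  # not a counter line we understand
        rx_packets = int(fields[1])  # second receive column: packets
        rx_errs = int(fields[2])     # third receive column: errs
        # Assumption: rate = errs / packets; avoid dividing by zero.
        rates[name.strip()] = 100.0 * rx_errs / rx_packets if rx_packets else 0.0
    return rates


if __name__ == "__main__":
    # On a Linux machine, read the live counters instead of the sample.
    with open("/proc/net/dev") as f:
        for iface, pct in sorted(rx_error_rates(f.read()).items()):
            print("%-8s %.3f%% RX errors" % (iface, pct))
```

Run on a known bad and a known good machine, a consistently nonzero
eth0 figure on the bad one only would at least back up the pattern
above, though it wouldn't say anything about the cause.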
Am I imagining things here? [was the performance during my quick test
this afternoon just a strange coincidence?] Is there anything I can
do in the meantime -- a meta-workaround?
-erik