[7056] in testers

Immediate ssh disconnects

daemon@ATHENA.MIT.EDU (Greg Hudson)
Mon May 30 12:01:02 2005

From: Greg Hudson <ghudson@MIT.EDU>
To: testers@mit.edu
Content-Type: multipart/mixed; boundary="=-igV/a8Shgevr6GoYSsLQ"
Date: Mon, 30 May 2005 12:00:41 -0400
Message-Id: <1117468841.3570.107.camel@egyptian-gods.mit.edu>
Mime-Version: 1.0


--=-igV/a8Shgevr6GoYSsLQ
Content-Type: text/plain
Content-Transfer-Encoding: 7bit

I've seen this behavior on equal-rites, leading me to suspect a problem
with ssh in the release.  I can't trigger the problem at will, and it's
pretty sporadic, so I haven't collected much information.

David Glasser's comment suggests this might not be a new problem, but I
don't think I've ever seen it in 9.3, so I suspect something about RHEL
4 makes it worse.


--=-igV/a8Shgevr6GoYSsLQ
Content-Disposition: inline
Content-Description: Forwarded message - flaky sshd on quiche-lorraine?
Content-Type: message/rfc822

Return-Path: <belmonte@MIT.EDU>
Received: from po12.mit.edu ([unix socket]) by po12.mit.edu (Cyrus v2.1.5)
	with LMTP; Mon, 30 May 2005 11:47:07 -0400
X-Sieve: CMU Sieve 2.2
Received: from biscayne-one-station.mit.edu by po12.mit.edu (8.12.4/4.7) id
	j4UFl3El011407; Mon, 30 May 2005 11:47:03 -0400 (EDT)
Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103]) by
	biscayne-one-station.mit.edu (8.12.4/8.9.2) with ESMTP id j4UFksdn024675;
	Mon, 30 May 2005 11:46:54 -0400 (EDT)
Received: from coleco-sidewinder.mit.edu (COLECO-SIDEWINDER.MIT.EDU
	[18.187.2.149]) (authenticated bits=56) (User authenticated as
	belmonte@ATHENA.MIT.EDU) by outgoing.mit.edu (8.12.4/8.12.4) with ESMTP id
	j4UFklwe009514 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256
	verify=NOT); Mon, 30 May 2005 11:46:48 -0400 (EDT)
Received: (from belmonte@localhost) by coleco-sidewinder.mit.edu (8.12.9)
	id j4UFklMQ029461; Mon, 30 May 2005 11:46:47 -0400 (EDT)
Date: Mon, 30 May 2005 11:46:47 -0400 (EDT)
Message-Id: <200505301546.j4UFklMQ029461@coleco-sidewinder.mit.edu>
From: belmonte@MIT.EDU (Matthew Belmonte)
To: sipb-office@mit.edu
Subject: flaky sshd on quiche-lorraine?
X-Spam-Score: -1.959
X-Spam-Flag: NO
X-Scanned-By: MIMEDefang 2.42
Mime-Version: 1.0

When I try to ssh into quiche-lorraine or to scp files from quiche-lorraine,
about a third of the time the command hangs for a few seconds and then says
that quiche-lorraine closed the connection.  The rest of the time it works:

coleco-sidewinder% scp -p keesh:/tmp/mkb /tmp
Connection to keesh closed by remote host.
coleco-sidewinder% scp -p keesh:/tmp/mkb /tmp
Connection to keesh closed by remote host.
coleco-sidewinder% scp -p keesh:/tmp/mkb /tmp
Connection to keesh closed by remote host.
coleco-sidewinder% scp -p keesh:/tmp/mkb /tmp
mkb                  100% |*****************************|   952       00:00    
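
Since the failure is intermittent, one way to put a number on "about a third of the time" is to repeat the same command and count the failures. The following is only a rough sketch: the function name failure_rate and its attempts and timeout parameters are made up for illustration, and it assumes that a closed connection makes scp exit with a nonzero status.

```python
import subprocess


def failure_rate(cmd, attempts=20, timeout=60):
    """Run cmd `attempts` times and return the fraction of runs that failed.

    A run counts as failed if it exits nonzero or exceeds `timeout` seconds.
    """
    failures = 0
    for _ in range(attempts):
        try:
            result = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
            ok = result.returncode == 0
        except subprocess.TimeoutExpired:
            # A hang longer than the timeout is treated as a failure too.
            ok = False
        if not ok:
            failures += 1
    return failures / attempts


# Hypothetical usage, mirroring the transcript above:
#   rate = failure_rate(["scp", "-p", "keesh:/tmp/mkb", "/tmp"])
#   print(f"{rate:.0%} of attempts failed")
```

Running it before and after a change (for example, after restarting sshd) would give comparable numbers rather than an impression.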



keesh's log indicates that authorisation was granted on each try:

May 30 11:20:02 quiche-lorraine sshd[27479]: Authorized to belmonte, krb5 principal belmonte@ATHENA.MIT.EDU (krb5_kuserok)
May 30 11:20:02 quiche-lorraine sshd[27479]: Accepted external-keyx for belmonte from 18.187.2.149 port 49986 ssh2
May 30 11:20:11 quiche-lorraine sshd[27493]: Authorized to belmonte, krb5 principal belmonte@ATHENA.MIT.EDU (krb5_kuserok)
May 30 11:20:11 quiche-lorraine sshd[27493]: Accepted external-keyx for belmonte from 18.187.2.149 port 49987 ssh2
May 30 11:20:19 quiche-lorraine sshd[27495]: Authorized to belmonte, krb5 principal belmonte@ATHENA.MIT.EDU (krb5_kuserok)
May 30 11:20:19 quiche-lorraine sshd[27495]: Accepted external-keyx for belmonte from 18.187.2.149 port 49988 ssh2
May 30 11:20:27 quiche-lorraine sshd[27497]: Authorized to belmonte, krb5 principal belmonte@ATHENA.MIT.EDU (krb5_kuserok)
May 30 11:20:27 quiche-lorraine sshd[27497]: Accepted external-keyx for belmonte from 18.187.2.149 port 49989 ssh2
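
To match these entries against individual client attempts, it may help to pull the timestamp, sshd PID, and client port out of each "Accepted" line. This is a sketch only: the regular expression is fitted to the lines quoted above rather than to any documented sshd log format, and the name accepted_sessions is hypothetical.

```python
import re

# Pattern fitted to the "Accepted ..." lines quoted above (an assumption,
# not a documented format).
ACCEPT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) (?P<host>\S+) "
    r"sshd\[(?P<pid>\d+)\]: Accepted (?P<method>\S+) for (?P<user>\S+) "
    r"from (?P<addr>\S+) port (?P<port>\d+)"
)


def accepted_sessions(lines):
    """Return (timestamp, sshd pid, client port) for each 'Accepted' line."""
    sessions = []
    for line in lines:
        match = ACCEPT_RE.match(line)
        if match:
            sessions.append(
                (match["stamp"], int(match["pid"]), int(match["port"]))
            )
    return sessions
```

With the PID of each attempt in hand, any later messages from the same sshd process can be looked up to see whether the failed attempts differ from the one that succeeded.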



I first noticed this problem during the time when quiche-lorraine's AFS was
broken; I don't know whether it existed before that.  Restarting sshd doesn't
alter the problem.  If anyone has any bright ideas as to what may be causing
this, please share them.  Otherwise I'll just look into it further when I'm
motivated.


--=-igV/a8Shgevr6GoYSsLQ--
