[651] in linux-net channel archive

Re: Bad c'sums from csum_partial_copy()

daemon@ATHENA.MIT.EDU (Tom May)
Tue Jul 11 23:31:08 1995

Date: Mon, 10 Jul 1995 10:46:06 -0700
From: ftom@netcom.com (Tom May)
To: gpg109@rsphy1.anu.edu.au
CC: gpg109@rsphy1.anu.edu.au, iialan@iifeak.swan.ac.uk,
        torvalds@cs.helsinki.fi, linux-net@vger.rutgers.edu
In-reply-to: <9507100554.AA29755@rsphy9.anu.edu.au> (message from Paul Gortmaker on Mon, 10 Jul 1995 15:54:07 +1000 (EST))

>----------------------------------------------------------
>Now here are the results of the above test, for 2 target machines.
>Here is an alpha on the same subnet as testbox:
>----------------------------------------------------------
>testbox:~> telnet rsphy9
>Trying 150.203.15.148...
>Connected to rsphy9.anu.edu.au.
>Escape character is '^]'.
>eth0: bad csum: 0x1E62C350, actual=0x1C5FC553, len=23.

These are partial checksums, which means the actual sum is low word + hi
word with end-around carry:

C350 + 1E62 = E1B2
C553 + 1C5F = E1B2

so they match, no problem.
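
For anyone who wants to repeat that arithmetic, here is a minimal sketch
of the fold in C.  The names fold_partial_csum and partial_csums_match
are made up for illustration and are not taken from the kernel source.
It only assumes the printk values are 32-bit partial sums as described
above.

```c
#include <stdint.h>

/* Fold a 32-bit partial checksum down to 16 bits: add the low word to
 * the high word, then add any carry out of bit 15 back in (end-around
 * carry).  Hypothetical helper, not the kernel's own routine. */
uint16_t fold_partial_csum(uint32_t partial)
{
    uint32_t sum = (partial & 0xFFFFu) + (partial >> 16);

    /* The first add can reach at most 0x1FFFE, so one more fold is
     * enough to bring the result back into 16 bits. */
    sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Two partial sums describe the same checksum if they fold to the
 * same 16-bit value, even when the 32-bit values differ. */
int partial_csums_match(uint32_t a, uint32_t b)
{
    return fold_partial_csum(a) == fold_partial_csum(b);
}
```

With the pair from the alpha test, fold_partial_csum(0x1E62C350) and
fold_partial_csum(0x1C5FC553) both give 0xE1B2, so
partial_csums_match() returns 1 for them.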

>OSF/1 (rsphy9) (ttyq1)

>login: gpg109
>Password:
>Last login: Mon Jul 10 15:10:19 from testbox.anu.edu.au
>----------------------------------------------------------
>It seems fine after only one bad c'sum. 
>Now watch when I try a sun box in another building. (again user=gpg109)
>----------------------------------------------------------
>testbox:~> telnet csc
>Trying 150.203.2.12...
>Connected to huxley.anu.edu.au.
>Escape character is '^]'.
>eth0: bad csum: 0x2596745F, actual=0x2596432A, len=24.
>eth0: bad csum: 0x7FC42409, actual=0x4DAA1C16, len=23.
>eth0: bad csum: 0xB0D55479, actual=0x90ABDC14, len=20.

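By inspection, the first example has different 16-bit csums: the high
words are the same (2596) but the low words differ (745F vs. 432A), so
no amount of folding will make them match.  Ok, something is broken.
(Like you didn't know already.)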

I suspect the problem is elsewhere in the new net code (i.e., not the
checksum routines), which breaks when the other machine isn't on the
same subnet.  I have hacked in a similar printk to yours, only with a
partial csum to actual csum conversion, and I can talk to an NT
machine on my subnet with samba, ftp, and telnet and all the
precalculated csums match (I am printing them all out), including
packets of sizes 20, 21, 22, 23, and 24.  Likewise, when you are
talking to your alpha everything is working.  Since I am just running
a local net, I can't try connecting to anything more remote.

I could send you a copy of checksum.c which is fully reverted to the
1.3.0 snapshot code and builds the csum+copy routines on top of
csum_partial, if you want to be convinced the problem is not the csum
routines.  I suppose I'd have to throw in checksum.h, too.  Let me
know if you're interested.
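
To show what "builds the csum+copy routines on top of csum_partial"
means, here is a rough sketch of the layering.  It is not the 1.3.0
code: csum_partial_sketch and csum_partial_copy_sketch are hypothetical
names, the signatures are assumptions, and the summing loop is plain
portable C that takes 16-bit words in big-endian order rather than
whatever the real routine does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for csum_partial: add the 16-bit words of buf
 * to a running sum and return an unfolded 32-bit partial checksum.
 * Assumption: words are read big-endian; an odd trailing byte is
 * padded with a zero low byte. */
uint32_t csum_partial_sketch(const unsigned char *buf, size_t len,
                             uint32_t sum)
{
    uint64_t acc = sum;   /* wide accumulator so carries are not lost */
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        acc += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)
        acc += (uint32_t)buf[i] << 8;

    /* End-around carry from 64 bits down to 32 bits. */
    while (acc >> 32)
        acc = (acc & 0xFFFFFFFFu) + (acc >> 32);
    return (uint32_t)acc;
}

/* Copy-and-checksum layered on the routine above: do a plain copy,
 * then checksum the destination.  Slower than a fused loop, but any
 * mismatch can then only come from the simple checksum code. */
uint32_t csum_partial_copy_sketch(const unsigned char *src,
                                  unsigned char *dst, size_t len,
                                  uint32_t sum)
{
    memcpy(dst, src, len);
    return csum_partial_sketch(dst, len, sum);
}
```

The point of layering it this way is debugging, not speed: if the bad
csums still show up with the copy variant reduced to memcpy plus the
plain checksum, the copy variant is not the culprit.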

Tom.

