[1126] in linux-net channel archive
Kernel trap in tcp_rcv
daemon@ATHENA.MIT.EDU (Corey Minyard)
Tue Sep 26 07:11:15 1995
Date: Mon, 25 Sep 95 23:46 EDT
From: Corey Minyard <minyard@wf-rch.cirr.com>
To: linux-net@vger.rutgers.edu, linux-kernel@vger.rutgers.edu
Reply-to: minyard@metronet.com
I have seen a problem when using netscape across a ppp link with kernel
1.3.28 (X11R6 ELF, ELF everything) on a 386/33 with 8MB RAM, AHA1542B
with 2 SCSI disks and a tape, PAS-16 soundcard, CDU31A CDROM, and a
28.8 modem. I upgraded from 1.3.25 (also running ELF everything) and
had not seen the problem then or before. It seems to happen when
netscape opens and closes a bunch of sockets rapidly (like when you go
to a page that has lots of small symbols as pictures). It is random;
I've only seen it a few times, and I can pull up the same page after
the machine comes back up and it works fine. Anyway, the panic is:
IMPOSSIBLE 3
Oops: 0002
EIP: 0010:00139b04
EFLAGS: 00010002
eax: 0000002a ebx: 04060923 ecx: 5fde0000 edx: 00000002
esi: 0031c808 edi: 0024f654 ebp: 001b2e10 esp: 00460ef0
ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
Process netscape (pid: 1199, process nr: 27, stackpage=00460000)
Stack: 001a9f08 0024f428 0024f424 0024f654 0024f654 00000205 0023d019 0068002d
001cc660 00000001 0000002d 00000246 0024f438 00240214 0017b0b4 001cc660
00134547 0024f654 001b2e10 00000000 7b89f5c0 00000214 04060923 00000000
Call Trace: 0017b0b4 00134547 0012fccc 00114a59 0010a1bd 0018eb8f
Code: 89 38 52 9d fb 31 c0 5b 5e 5f 5d 83 c4 30 c3 90 c6 46 34 01
Aiee, killing interrupt handler
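As a reading aid for the "Oops: 0002" line, here is a minimal sketch that decodes the low three bits of an i386 page-fault error code. It assumes the trap was a page fault, which the faulting store through eax (0x2a) suggests. The macro names and the describe_fault() helper are made up for this example and are not kernel identifiers.

```c
#include <stdio.h>

/* Low three bits of the i386 page-fault error code.
 * The names are illustrative, not taken from the kernel source. */
#define PF_PROT  0x1  /* 0: page not present, 1: protection violation */
#define PF_WRITE 0x2  /* 0: read access,      1: write access         */
#define PF_USER  0x4  /* 0: fault in kernel mode, 1: in user mode     */

/* Hypothetical helper: write a one-line description of an error code
 * into buf and return the snprintf() result. */
static int describe_fault(unsigned long code, char *buf, size_t len)
{
    return snprintf(buf, len, "%s access to a %s page in %s mode",
                    (code & PF_WRITE) ? "write" : "read",
                    (code & PF_PROT) ? "protected" : "not-present",
                    (code & PF_USER) ? "user" : "kernel");
}
```

Under that assumption, describe_fault(0x0002, buf, sizeof(buf)) produces "write access to a not-present page in kernel mode", which matches a store through a near-NULL pointer from kernel code.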
The routines in the traceback are the following (no big surprises here):
001398ec T tcp_rcv
0017afa0 t ppp_doframe
00134160 T ip_rcv
0012fbcc T net_bh
00114a1c T do_bottom_half
0010a1b0 t handle_bottom_half
00108cdc t parse_options
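For anyone who wants to repeat the lookup, the sketch below shows one way to map an EIP or call-trace address to "symbol+offset": take the last symbol whose address is not above the address in question. The table holds only the symbols quoted above, so it is an assumption-laden stand-in for the full nm-style symbol list; struct ksym, resolve() and print_resolved() are hypothetical names.

```c
#include <stddef.h>
#include <stdio.h>

struct ksym {
    unsigned long addr;
    const char *name;
};

/* Only the symbols quoted in this message, sorted by address.
 * A real lookup needs the complete symbol list of the running kernel. */
static const struct ksym symtab[] = {
    { 0x00108cdc, "parse_options" },
    { 0x0010a1b0, "handle_bottom_half" },
    { 0x00114a1c, "do_bottom_half" },
    { 0x0012fbcc, "net_bh" },
    { 0x00134160, "ip_rcv" },
    { 0x001398ec, "tcp_rcv" },
    { 0x0017afa0, "ppp_doframe" },
};

/* Return the last symbol whose address is <= addr, or NULL if addr
 * lies below the first symbol in the table. */
static const struct ksym *resolve(unsigned long addr)
{
    const struct ksym *best = NULL;
    size_t i;

    for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++) {
        if (symtab[i].addr > addr)
            break;
        best = &symtab[i];
    }
    return best;
}

/* Print an address in "address symbol+0xoffset" form. */
static void print_resolved(unsigned long addr)
{
    const struct ksym *s = resolve(addr);

    if (s)
        printf("%08lx %s+0x%lx\n", addr, s->name, addr - s->addr);
    else
        printf("%08lx ?\n", addr);
}
```

With this table, print_resolved(0x00139b04) prints "00139b04 tcp_rcv+0x218". With a partial table like this one, an address inside a function that is not listed gets attributed to the nearest lower symbol, so a large offset should be treated as a sign that the real symbol is missing.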
Other occurrences of this panic were not preceded by the "IMPOSSIBLE 3"
message. The panic point corresponds to the following assembly:
   00139af3 <tcp_rcv+207> cli
   00139af4 <tcp_rcv+208> movl %eax,(%edi)
   00139af6 <tcp_rcv+20a> movl 0x64(%esi),%eax
   00139af9 <tcp_rcv+20d> movl %eax,0x4(%edi)
   00139afc <tcp_rcv+210> movl (%edi),%eax
   00139afe <tcp_rcv+212> movl %edi,0x4(%eax)
   00139b01 <tcp_rcv+215> movl 0x4(%edi),%eax
-> 00139b04 <tcp_rcv+218> movl %edi,(%eax)
   00139b06 <tcp_rcv+21a> pushl %edx
   00139b07 <tcp_rcv+21b> popf
   00139b08 <tcp_rcv+21c> sti
   00139b09 <tcp_rcv+21d> xorl %eax,%eax
   00139b0b <tcp_rcv+21f> popl %ebx
   00139b0c <tcp_rcv+220> popl %esi
   00139b0d <tcp_rcv+221> popl %edi
   00139b0e <tcp_rcv+222> popl %ebp
   00139b0f <tcp_rcv+223> addl $0x30,%esp
   00139b12 <tcp_rcv+226> ret
   00139b13 <tcp_rcv+227> nop
   00139b14 <tcp_rcv+228> movb $0x1,0x34(%esi)
   00139b18 <tcp_rcv+22c> sti
which corresponds to the following "C" code in net/ipv4/tcp.c:
    /* We may need to add it to the backlog here. */
    cli();
    if (sk->inuse)
    {
->      skb_queue_tail(&sk->back_log, skb);
        sti();
        return(0);
    }
    sk->inuse = 1;
    sti();
That call, in turn, expands to the following inlined code:
extern __inline__ void skb_queue_tail(struct sk_buff_head *list_, struct sk_buff *newsk)
{
    unsigned long flags;
    struct sk_buff *list = (struct sk_buff *)list_;
    save_flags(flags);
    cli();
    newsk->next = list;
    newsk->prev = list->prev;
    newsk->next->prev = newsk;
->  newsk->prev->next = newsk;
    restore_flags(flags);
}
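To make the mapping between the four stores and the disassembly easier to follow, here is a user-space model of the same tail insertion. It is only a sketch: struct node stands in for struct sk_buff, the list head is a full node so that no cast is needed, and the interrupt flag handling is left out. The instruction comments are my reading of the listing above (edi as the new buffer, esi as sk), not something verified against the compiler output.

```c
#include <stddef.h>

/* Stand-in for struct sk_buff: only the two link fields matter here.
 * With next first, prev sits at offset sizeof(void *), i.e. 4 on i386,
 * which fits the 0x4(%edi) and 0x4(%eax) operands in the listing. */
struct node {
    struct node *next;
    struct node *prev;
    int payload;            /* hypothetical payload field */
};

/* An empty circular list is a head that points at itself. */
static void queue_init(struct node *head)
{
    head->next = head;
    head->prev = head;
}

/* Same four stores as skb_queue_tail(), without save_flags()/cli(). */
static void queue_tail(struct node *head, struct node *newn)
{
    newn->next = head;          /* movl %eax,(%edi)                   */
    newn->prev = head->prev;    /* movl 0x64(%esi),%eax
                                   movl %eax,0x4(%edi)                */
    newn->next->prev = newn;    /* movl (%edi),%eax
                                   movl %edi,0x4(%eax)                */
    newn->prev->next = newn;    /* movl 0x4(%edi),%eax
                                   movl %edi,(%eax)   <- faults here  */
}
```

The last store is the only one that dereferences the old tail pointer read from the head, so it is the first place a bad head->prev value shows up.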
It seems that sk->back_log.prev is set to an invalid value (0x2a in
this case; it has been 0 in others). I have looked at all the obvious
places in the code and I can't find where it happens, unless it is a
race condition on shutdown (since flushing the backlog at shutdown is
not protected with cli()/sti()). I don't know the net code as well as
I used to, so I hope this helps someone track the problem down.
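If someone wants to catch the corruption closer to where it happens, one option is a consistency check on the queue head before each insertion. The following is a user-space sketch of such a check, not existing kernel code: struct node, LOW_PAGE_LIMIT and queue_looks_sane() are hypothetical, and the "pointer below 4096" test is only a heuristic chosen because both bad values seen so far (0 and 0x2a) fall in the first page.

```c
#include <stdint.h>

/* Stand-in for the link fields of struct sk_buff / struct sk_buff_head. */
struct node {
    struct node *next;
    struct node *prev;
};

/* Assumption: no valid list node lives in the first page of memory. */
#define LOW_PAGE_LIMIT 4096u

/* Return 1 if the head's links look consistent, 0 otherwise.
 * The low-address tests run first so that an obviously bad pointer
 * such as NULL or 0x2a is rejected without being dereferenced. */
static int queue_looks_sane(const struct node *head)
{
    const struct node *first = head->next;
    const struct node *last = head->prev;

    if ((uintptr_t)first < LOW_PAGE_LIMIT)
        return 0;
    if ((uintptr_t)last < LOW_PAGE_LIMIT)
        return 0;
    /* In a well-formed circular list both neighbours point back. */
    return last->next == head && first->prev == head;
}
```

A check like this only narrows the window; it would report that back_log was already damaged when tcp_rcv ran, not which code path damaged it.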
--
Corey Minyard Internet: minyard@metronet.com
Work: minyard@bnr.ca UUCP: minyard@wf-rch.cirr.com