[316] in linux-net channel archive
Re: ne2000 and 1.2.8
daemon@ATHENA.MIT.EDU (Paul Gortmaker)
Sat May 13 01:20:55 1995
From: Paul Gortmaker <gpg109@rsphy1.anu.edu.au>
To: rj@rainbow.in-berlin.de (Robert Joop)
Date: Sat, 13 May 1995 14:44:01 +1000 (EST)
Cc: linux-net@vger.rutgers.edu
In-Reply-To: <m0s9x8n-000fuPC@rainbow.in-berlin.de> from "Robert Joop" at May 12, 95 06:00:36 pm
> sadly, the ne2000 still appears to be broken. last night my machine
> hung again. but this time, the last line on the console didn't read
> eth0: DMAing conflict in ne_block_output.[DMAstat:2][irqlock:1][intr:0]
> as it did every time the machine hung before, but instead it read
>
> eth0: Tx access conflict. irq=0 lock=1 tx1=0 tx2=0 last=20
Yes, the lock=1 message for the ne2000 is fatal. I am aware of that,
and am trying to figure out the reason why. I have only been able to
cause it to happen 3 times since 1.2.8 was released, which makes it
very hard to track down. Please stay tuned. In the meantime, here is
a "band-aid" fix that I am presently using, which makes the above
non-fatal. Note that this condition is relatively hard to trigger
(it takes me about a *week* of abuse with cross-mounted NFS servers
plus TCP/IP traffic to cause it to appear). As you can see from
the above, it appears that two back-to-back dev_queue_xmit()
calls (in dev.c) on an otherwise idle transmitter cause
the problem. Hence bursts of traffic followed by idle time may
prove to be the trigger.
This patch also enables some extra debugging info for the ne2k. If
you use it to avoid the (hopefully) hard-to-trigger "lock=1" hangs,
please mail me any eth0 printk output that you get. When the "lock=1"
condition occurs, you won't hang, but you will get a dump of some time
values (in jiffies) that will help me reconstruct the scenario.
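To illustrate the kind of timestamp bookkeeping the debugging part of the patch relies on, here is a minimal user-space sketch. It is not driver code: the struct and function names (dma_trace, dma_trace_begin, dma_trace_end, dma_trace_format) are hypothetical, the "now" argument stands in for the kernel's jiffies counter, and only the four field names and their meanings are taken from the 8390.h hunk of the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model of the four debug fields the patch adds to
 * struct ei_device. "now" stands in for jiffies. */
enum { DMA_DIR_RX = 1, DMA_DIR_TX = 2 };

struct dma_trace {
    unsigned char lastdma;   /* direction of last DMA (1=Rx, 2=Tx) */
    unsigned long dmastart;  /* tick of last DMA start */
    unsigned long dmastop;   /* tick of last DMA stop */
    unsigned long laststop;  /* tick of 2nd last DMA stop */
};

/* Record the start of a remote DMA transfer. */
static void dma_trace_begin(struct dma_trace *t, unsigned long now)
{
    assert(t != NULL);
    t->dmastart = now;
}

/* Record the end of a transfer; the previous stop time is shifted
 * into laststop before dmastop is overwritten. */
static void dma_trace_end(struct dma_trace *t, int dir, unsigned long now)
{
    assert(t != NULL);
    t->lastdma = (unsigned char)dir;
    t->laststop = t->dmastop;
    t->dmastop = now;
}

/* Format a dump loosely modelled on the patch's extra printk.
 * Returns the snprintf() result. */
static int dma_trace_format(const struct dma_trace *t, unsigned long now,
                            char *buf, size_t len)
{
    assert(t != NULL && buf != NULL);
    return snprintf(buf, len, "dir=%d lstop=%lu start=%lu stop=%lu now=%lu",
                    t->lastdma, t->laststop, t->dmastart, t->dmastop, now);
}
```

Calling dma_trace_begin() and dma_trace_end() around each simulated transfer, and dma_trace_format() when a conflict is detected, yields the direction of the most recent transfer and how long ago the last two transfers finished relative to the current tick.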
NB: As this is just a "band-aid" fix, I don't want or expect this
patch to go into 1.2.9 -- however, as a stop-gap measure, we could
use something like this in 1.2.9 if *lots* of people hit it.
dev->name, dev->interrupt, ei_local->irqlock, ei_local->tx1,
ei_local->tx2, ei_local->lasttx);
- restore_flags(flags);
+ if (!dev->mem_start && ei_local->irqlock) {
+ restore_flags(flags);
+ ei_reset_8390(dev);
+ NS8390_init(dev, 1);
+ } else
+ restore_flags(flags);
return 1;
}
Paul.
diff -ur /opie/linux/drivers/net/8390.c linux/drivers/net/8390.c
--- /opie/linux/drivers/net/8390.c Sat Apr 29 16:49:58 1995
+++ linux/drivers/net/8390.c Tue May 9 19:38:42 1995
@@ -185,7 +185,16 @@
printk("%s: Tx access conflict. irq=%d lock=%d tx1=%d tx2=%d last=%d\n",
dev->name, dev->interrupt, ei_local->irqlock, ei_local->tx1,
ei_local->tx2, ei_local->lasttx);
- restore_flags(flags);
+ if (ei_local->irqlock) {
+ printk("dma=%d tx=%d dir=%d lstop=%ld start=%ld stop=%ld now=%ld\n",
+ ei_local->dmaing, ei_local->txing, ei_local->lastdma,
+ ei_local->laststop, ei_local->dmastart, ei_local->dmastop,
+ jiffies);
+ restore_flags(flags);
+ ei_reset_8390(dev);
+ NS8390_init(dev, 1);
+ } else
+ restore_flags(flags);
return 1;
}
diff -ur /opie/linux/drivers/net/8390.h linux/drivers/net/8390.h
--- /opie/linux/drivers/net/8390.h Tue May 9 18:01:40 1995
+++ linux/drivers/net/8390.h Tue May 9 19:34:26 1995
@@ -56,6 +56,10 @@
unsigned char reg0; /* Register '0' in a WD8013 */
unsigned char reg5; /* Register '5' in a WD8013 */
unsigned char saved_irq; /* Original dev->irq value. */
+ unsigned char lastdma; /* Direction of last DMA (1=Rx,2=Tx) */
+ unsigned long dmastart; /* jiffies of last DMA start. */
+ unsigned long dmastop; /* jiffies of last DMA stop. */
+ unsigned long laststop; /* jiffies of 2nd last DMA stop. */
/* The new statistics table. */
struct enet_statistics stat;
};
diff -ur /opie/linux/drivers/net/ne.c linux/drivers/net/ne.c
--- /opie/linux/drivers/net/ne.c Tue May 9 18:01:44 1995
+++ linux/drivers/net/ne.c Tue May 9 19:46:11 1995
@@ -86,7 +86,7 @@
#define NESM_START_PG 0x40 /* First page of TX buffer */
#define NESM_STOP_PG 0x80 /* Last page +1 of RX ring */
-#define NE_RDC_TIMEOUT 0x02 /* Max wait in jiffies for Tx RDC */
+#define NE_RDC_TIMEOUT 0x01 /* Max wait in jiffies for Tx RDC */
int ne_probe(struct device *dev);
static int ne_probe1(struct device *dev, int ioaddr);
@@ -368,6 +368,7 @@
return 0;
}
ei_status.dmaing |= 0x02;
+ ei_status.dmastart = jiffies;
outb_p(E8390_NODMA+E8390_PAGE0+E8390_START, nic_base+ NE_CMD);
outb_p(count & 0xff, nic_base + EN0_RCNTLO);
outb_p(count >> 8, nic_base + EN0_RCNTHI);
@@ -409,6 +410,9 @@
}
#endif
outb_p(ENISR_RDC, nic_base + EN0_ISR); /* Ack intr. */
+ ei_status.lastdma = 0x01; /* Last was a Rx */
+ ei_status.laststop = ei_status.dmastop;
+ ei_status.dmastop = jiffies;
ei_status.dmaing &= ~0x03;
return ring_offset + count;
}
@@ -439,6 +443,7 @@
return;
}
ei_status.dmaing |= 0x04;
+ ei_status.dmastart = jiffies;
/* We should already be in page 0, but to be safe... */
outb_p(E8390_PAGE0+E8390_START+E8390_NODMA, nic_base + NE_CMD);
@@ -510,6 +515,9 @@
}
outb_p(ENISR_RDC, nic_base + EN0_ISR); /* Ack intr. */
+ ei_status.lastdma = 0x02; /* Last was a Tx */
+ ei_status.laststop = ei_status.dmastop;
+ ei_status.dmastop = jiffies;
ei_status.dmaing &= ~0x05;
return;
}