[616] in linux-scsi channel archive
scsi generic driver deadlock - with possible fix.
daemon@ATHENA.MIT.EDU (Michael Morrison)
Wed Sep 13 21:23:17 1995
From: mike@ringo.reno.nv.us (Michael Morrison)
To: linux-scsi@vger.rutgers.edu (linux scsi)
Date: Wed, 13 Sep 1995 13:07:28 -0700 (PDT)
Hi all,
Bug report for scsi generic driver, with fix.
I have a test scenario which causes the scsi generic driver to deadlock.
The behavior has been observed in 1.2.10 and 1.3.20, using the aic7xxx
driver and an AHA 3940W, but could also happen in all kernels that have the
generic driver. It's very possible that this problem has not been seen,
or is rarely seen as the timing of the interaction described below is
critical for the deadlock to occur.
Here it is:
I write() a scsi command to the generic driver.
I then read() the scsi driver to get status or data.
In linux/drivers/scsi/sg.c:
In the sg_read() function, the code makes the decision to
call interruptible_sleep_on() because the command is not complete.
Somewhere between the decision to call interruptible_sleep_on()
and the actual call, the command completes and the sg_command_done()
function is called at interrupt level, which calls wake_up() to wake
up the possibly sleeping process. After the interrupt completes,
sg_read() continues and calls interruptible_sleep_on(). The process
is now deadlocked: the wake_up() has already happened, so nothing
will wake it.
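The ordering above can be sketched as a small single-threaded toy model. This is not kernel code: the struct and all race_* names are made up for illustration, and the "interrupt" is just a function call placed at a chosen point in the read path. It only shows why a wake-up that arrives before the process is queued is lost.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one process and one wait queue (hypothetical names). */
struct race_model {
    bool complete;  /* set by the completion "interrupt" */
    bool queued;    /* process is on the wait queue */
    bool asleep;    /* process is blocked until someone wakes it */
};

/* Models sg_command_done(): mark completion, wake a queued sleeper. */
static void race_command_done(struct race_model *m)
{
    m->complete = true;
    if (m->queued) {        /* a wake-up only reaches queued sleepers */
        m->queued = false;
        m->asleep = false;
    }
}

/* Models interruptible_sleep_on(): queue and block unconditionally. */
static void race_sleep_on(struct race_model *m)
{
    assert(!m->asleep);     /* a process cannot go to sleep twice */
    m->queued = true;
    m->asleep = true;
}

/*
 * Models the read path. When irq_in_window is true, the completion
 * interrupt fires after the "not complete" check but before the sleep.
 * Otherwise it fires after the process is already asleep.
 * Returns true if the process is left asleep with no wake-up coming.
 */
static bool race_read_hangs(bool irq_in_window)
{
    struct race_model m = { false, false, false };

    if (!m.complete) {                  /* decision to sleep */
        if (irq_in_window)
            race_command_done(&m);      /* wake-up finds nobody queued */
        race_sleep_on(&m);              /* sleeps although complete is set */
        if (!irq_in_window)
            race_command_done(&m);      /* usual case: wake-up comes later */
    }
    return m.asleep;
}
```

In this model, race_read_hangs(true) reports a hang and race_read_hangs(false) does not, which matches the observation that the deadlock depends entirely on timing.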
The fix.
In sg.c, function sg_read(), turn off interrupts before deciding to
call interruptible_sleep_on. The read_wait wait queue is manipulated
at interrupt level!
static int sg_read(struct inode *inode,struct file *filp,char *buf,int count)
{
    int dev=MINOR(inode->i_rdev);
    unsigned long flags;                        <<---- ADD THIS
    int i;
    struct scsi_generic *device=&scsi_generics[dev];
    if ((i=verify_area(VERIFY_WRITE,buf,count)))
        return i;
    /*
     * Wait until the command is actually done.
     */
    save_flags( flags );                        <<---- ADD THIS
    cli();                                      <<---- ADD THIS
    while(!device->pending || !device->complete)
    {
        if (filp->f_flags & O_NONBLOCK)
        {
            restore_flags( flags );             <<---- ADD THIS
            return -EWOULDBLOCK;
        }
        interruptible_sleep_on(&device->read_wait);
        if (current->signal & ~current->blocked)
        {
            restore_flags( flags );             <<---- ADD THIS
            return -ERESTARTSYS;
        }
    }
    restore_flags( flags );                     <<---- ADD THIS
    ....
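Why masking interrupts closes the window can be sketched with another toy model. Again this is not kernel code and every fix_* name is hypothetical. It rests on one assumption, marked in the comments: that interruptible_sleep_on() re-enables interrupts only after the caller has been put on the wait queue, so a completion that was held off by cli() is delivered to a process that is already queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one process, one wait queue, one interrupt (hypothetical). */
struct fix_model {
    bool complete;     /* set by the completion "interrupt" */
    bool queued;       /* process is on the wait queue */
    bool asleep;       /* process is blocked until someone wakes it */
    bool irq_masked;   /* models the state after cli() */
    bool irq_pending;  /* interrupt raised while masked, not yet delivered */
};

/* Models sg_command_done(): mark completion, wake a queued sleeper. */
static void fix_command_done(struct fix_model *m)
{
    m->complete = true;
    if (m->queued) {
        m->queued = false;
        m->asleep = false;
    }
}

/* The device raises its interrupt; delivery is held off while masked. */
static void fix_raise_irq(struct fix_model *m)
{
    if (m->irq_masked)
        m->irq_pending = true;
    else
        fix_command_done(m);
}

/* Unmask interrupts and deliver anything that was held off. */
static void fix_unmask(struct fix_model *m)
{
    m->irq_masked = false;
    if (m->irq_pending) {
        m->irq_pending = false;
        fix_command_done(m);
    }
}

/* Models interruptible_sleep_on(). */
static void fix_sleep_on(struct fix_model *m)
{
    assert(!m->asleep);
    m->queued = true;
    m->asleep = true;
    fix_unmask(m);  /* ASSUMPTION: unmasked only after queueing */
}

/* Returns true if the process is left asleep with no wake-up coming. */
static bool fix_read_hangs(bool use_cli)
{
    struct fix_model m = { false, false, false, false, false };

    if (use_cli)
        m.irq_masked = true;    /* save_flags(); cli(); */
    if (!m.complete) {          /* decision to sleep */
        fix_raise_irq(&m);      /* command finishes in the race window */
        fix_sleep_on(&m);
    }
    fix_unmask(&m);             /* restore_flags(), interrupts were on */
    return m.asleep;
}
```

Here fix_read_hangs(false) reproduces the hang, while fix_read_hangs(true) does not: the completion is delivered once the process is on the queue, so the wake-up reaches it.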
In my test, the generic driver would deadlock within about 5 minutes of
operation. With the fix, it has been running for about 4 hours.
That is not conclusive proof, but it *looks* like the fix solved the
problem.
Cheers
mike
mike@ringo.reno.nv.us