[440] in linux-scsi channel archive
Generic Scsi Driver Bug Report and comments.
daemon@ATHENA.MIT.EDU (Michael Morrison)
Sat Jul 29 10:07:30 1995
From: mike@ringo.reno.nv.us (Michael Morrison)
To: linux-scsi@vger.rutgers.edu (linux scsi)
Date: Fri, 28 Jul 1995 23:10:19 -0700 (PDT)
This is a combination bug report and some general comments on
the scsi generic driver. I'll summarize the bugs at the end.
IMHO the generic scsi driver interface is a little clumsy. The following
comments point out these problems. I intend to make the changes
to the sg driver we use to provide a better interface and device mapping
stratagy. I will follow this message up with a formal specification.
But before I do, I'd like comments from sg driver users. Do you agree
with the following comments? Would having another sg driver that is
based on the existing one but with a slightly different interface and
better error reporting be desirable?
1. Dynamic device mapping
From the scsi programming howto:
3.3. Device Mapping
These device files are dynamically mapped to SCSI id/LUNs on your SCSI
bus (LUN = logical unit). The mapping allocates devices consecutively
for each LUN of each device on each SCSI bus found at time of the SCSI
scan, beginning at the lower LUNs/ids/buses. It starts with the first
SCSI controller and continues without interruption with all following
controllers. This is currently done in the initialization of the SCSI
driver.
For example, assuming you had three SCSI devices hooked up with ids 1,
3, and 5 on the first SCSI bus (each having one LUN), then the
following mapping would be in effect:
/dev/sga -> SCSI id 1
/dev/sgb -> SCSI id 3
/dev/sgc -> SCSI id 5
If you now add a new device with id 4, then the mapping (after the
next rescan) will be:
/dev/sga -> SCSI id 1
/dev/sgb -> SCSI id 3
/dev/sgc -> SCSI id 4
/dev/sgd -> SCSI id 5
I believe this method of device mapping is not the best way.
A better way would to map devices as follows:
/dev/sg00 -> Controller 0, Id 0
/dev/sg01 -> Controller 0, Id 1
/dev/sg10 -> Controller 1, Id 0
/dev/sg11 -> Controller 1, Id 1
etc.
The problem with the current mapping is this:
I have one machine with hard disks at SCSI id 0 and 1
and another machine with one drive at id 0.
We typically assign an "exotic" SCSI device at id 6.
On machine 1 the "exotic" device is mapped to /dev/sgc
and on machine 2 the "exotic" device is mapped to /dev/sgb.
My application code wants to talk to the "exotic" device at
SCSI id 6, but there is no way to assertain which generic device
name this is mapped to without scanning the whole bus. Our application
software typically has command line arguments for controller number
and target id. It is impossible to translate controller number and
target id to a device file name in the current scheme.
The Linux generic driver device mappings is also very different
from other workstation's generic drivers (sgi, HP) which map devices
to to specific device names based on controller number and target id.
2. Generic Driver Interface.
The interface requires the application to fill in a buffer
with the sg_header structure, followed by the scsi command string, then
the data to be transfered. Our application code runs on top of a
'very generic' level of code that has functions like scsiWrite10,
scsiWrite6, inquiry etc. These functions must interface to the generic
scsi drivers on a number of different workstations, (Linux, sgi, HP,
DOS ASPI). This means that the application level code can't know
about the underlying generic driver's implementation. The problem is
that I must copy the data buffer from the application code to the
a buffer destined for the generic driver. This of course causes
a less efficient data transfer. A better way is to add a pointer
to the sg_header structure that points at the application's data
buffer.
By examining the generic driver source code one can see
that it assigns a pointer to the data buffer passed to it after
doing some math to figure out the offset in the buffer. So why not
just give it a pointer and avoid the math. The same thing happens with
the command string that immediatly follows the sg_header.
3. Error propagation
Error propagation is lacking because the error code returned up from
the lower level drivers is cleared in scsi generic driver's sg_read()
function. This seriously reduces the error information that gets
returned to the application. All you get is 16 bytes of sense data.
There is a discussion in the SCSI programming HOWTO that describes the
various errors that are set in the lower level scsi drivers, and describes
macros to dig various error bytes from the 32 bit error code, but there
is no facility in the generic driver's interface to get at these errors
from your application.
A solution is to add 2 fields to the sg_header structure.
target_status and initiator_status which can probagate to the application.
This bug can be fixed by commenting out the following line in sg_read.
device->header.result=0;
The only changes I am suggesting are a different device mapping scheme,
the addition of some additional fields to the sg_header structure and
propagation of errors.
Bug List: (1.3.9)
1. When the sg_command_done() function is called from the lower level
driver upon command completion, the sg_header.result field is
conditionally updated from the result of the lower level drivers
result. In sg_read(), sg_header.result is cleared.
To leave error data intact, comment out the line in sg_read():
This may or may not be a bug depending on how you look at it.
If the sense data is valid, you get sense data, otherwise you
get the result returned from the lower level driver. Personnaly,
I'd rather get as much error information as possible and and have
the option of ignoring it.
In sg_read ...
[snip]
while(!device->pending || !device->complete)
{
if (filp->f_flags & O_NONBLOCK)
return -EWOULDBLOCK;
interruptible_sleep_on(&device->read_wait);
if (current->signal & ~current->blocked)
return -ERESTARTSYS;
}
device->header.pack_len=device->header.reply_len;
device->header.result=0; <<<<<<<------- comment out.
if (count>=sizeof(struct sg_header))
[snip]
In sg_command_done...
[snip]
memcpy(device->header.sense_buffer, SCpnt->sense_buffer, sizeof(SCpnt->sense_buffer));
[Comment out this block]
if (SCpnt->sense_buffer[0])
{
device->header.result=EIO;
}
2. Using the default configuration (SG_BIG_BUFF == 32768). A read
or write of data buffers with size > 32256 < 32768 will fail.
This is because the code in sg_write rounds the buffer size up to
the next 512 byte boundary which is 32768 in this case. In sg_malloc
some code checks if the buffer size is < SG_BIG_BUFF. Since the
buffer size is == SG_BIG_BUFF, the sg_malloc function fails.
The solution is to change the line in sg_malloc:
static char *sg_malloc(int size)
{
if (size<=4096)
return (char *) scsi_malloc(size);
#ifdef SG_BIG_BUFF
if (size<SG_BIG_BUFF) <<<<----- change to if (size <= SG_BIG_BUFF)
{
while(big_inuse)
{
[snip]