[440] in linux-scsi channel archive

home help back first fref pref prev next nref lref last post

Generic Scsi Driver Bug Report and comments.

daemon@ATHENA.MIT.EDU (Michael Morrison)
Sat Jul 29 10:07:30 1995

From: mike@ringo.reno.nv.us (Michael Morrison)
To: linux-scsi@vger.rutgers.edu (linux scsi)
Date: Fri, 28 Jul 1995 23:10:19 -0700 (PDT)


This is a combination bug report and some general comments on
the scsi generic driver.  I'll summarize the bugs at the end.

IMHO the generic scsi driver interface is a little clumsy.  The following
comments point out these problems.  I intend to make the changes
to the sg driver we use to provide a better interface and device mapping
stratagy.  I will follow this message up with a formal specification.
But before I do, I'd like comments from sg driver users. Do you agree
with the following comments?  Would having another sg driver that is
based on the existing one but with a slightly different interface and
better error reporting be desirable?  


1.  Dynamic device mapping

    From the scsi programming howto:

    3.3.  Device Mapping

    These device files are dynamically mapped to SCSI id/LUNs on your SCSI
    bus (LUN = logical unit). The mapping allocates devices consecutively
    for each LUN of each device on each SCSI bus found at time of the SCSI
    scan, beginning at the lower LUNs/ids/buses. It starts with the first
    SCSI controller and continues without interruption with all following
    controllers. This is currently done in the initialization of the SCSI
    driver.

    For example, assuming you had three SCSI devices hooked up with ids 1,
    3, and 5 on the first SCSI bus (each having one LUN), then the
    following mapping would be in effect:

    /dev/sga -> SCSI id 1
    /dev/sgb -> SCSI id 3
    /dev/sgc -> SCSI id 5

    If you now add a new device with id 4, then the mapping (after the
    next rescan) will be:

    /dev/sga -> SCSI id 1
    /dev/sgb -> SCSI id 3
    /dev/sgc -> SCSI id 4
    /dev/sgd -> SCSI id 5


    I believe this method of device mapping is not the best way.
    A better way would to map devices as follows:

    /dev/sg00 -> Controller 0, Id 0
    /dev/sg01 -> Controller 0, Id 1
    /dev/sg10 -> Controller 1, Id 0
    /dev/sg11 -> Controller 1, Id 1
    etc.

    The problem with the current mapping is this:

    I have one machine with hard disks at SCSI id 0 and 1
    and another machine with one drive at id 0.
    We typically assign an "exotic" SCSI device at id 6.

    On machine 1 the "exotic" device is mapped to /dev/sgc
    and on machine 2 the "exotic" device is mapped to /dev/sgb.

    My application code wants to talk to the "exotic" device at
    SCSI id 6, but there is no way to assertain which generic device
    name this is mapped to without scanning the whole bus.  Our application
    software typically has command line arguments for controller number
    and target id.  It is impossible to translate controller number and
    target id to a device file name in the current scheme.

    The Linux generic driver device mappings is also very different
    from other workstation's generic drivers (sgi, HP) which map devices
    to to specific device names based on controller number and target id. 


2.  Generic Driver Interface.

    The interface requires the application to fill in a buffer
    with the sg_header structure, followed by the scsi command string, then
    the data to be transfered.  Our application code runs on top of a
    'very generic' level of code that has functions like scsiWrite10,
    scsiWrite6, inquiry etc. These functions must interface to the generic
    scsi drivers on a number of different workstations, (Linux, sgi, HP, 
    DOS ASPI).  This  means that the application level code can't know 
    about the underlying generic driver's implementation.  The problem is 
    that I must copy the data buffer from the application code to the 
    a buffer destined for the generic driver.  This of course causes 
    a less efficient data transfer.  A better way is to add a pointer
    to the sg_header structure that points at the application's data
    buffer. 
        By examining the generic driver source code one can see
    that it assigns a pointer to the data buffer passed to it after
    doing some math to figure out the offset in the buffer. So why not 
    just give it a pointer and avoid the math.  The same thing happens with
    the command string that immediatly follows the sg_header.


3.  Error propagation

    Error propagation is lacking because the error code returned up from
    the lower level drivers is cleared in scsi generic driver's sg_read() 
    function.  This seriously reduces the error information that gets
    returned to the application.  All you get is 16 bytes of sense data.

    There is a discussion in the SCSI programming HOWTO that describes the
    various errors that are set in the lower level scsi drivers, and describes
    macros to dig various error bytes from the 32 bit error code, but there 
    is no facility in the generic driver's interface to get at these errors
    from your application.

    A solution is to add 2 fields to the sg_header structure.
    target_status and initiator_status which can probagate to the application.

    This bug can be fixed by commenting out the following line in sg_read.

         device->header.result=0;



The only changes I am suggesting are a different device mapping scheme, 
the addition of some additional fields to the sg_header structure and
propagation of errors.




Bug List: (1.3.9)

    1.  When the sg_command_done() function is called from the lower level
        driver upon command completion, the sg_header.result field is 
        conditionally updated from the result of the lower level drivers
        result.  In sg_read(), sg_header.result is cleared.
        To leave error data intact, comment out the line in sg_read():
        This may or may not be a bug depending on how you look at it.
        If the sense data is valid, you get sense data, otherwise you 
        get the result returned from the lower level driver.  Personnaly,
        I'd rather get as much error information as possible and and have
        the option of ignoring it.

            In sg_read ...
            [snip]

            while(!device->pending || !device->complete)
             {
              if (filp->f_flags & O_NONBLOCK)
               return -EWOULDBLOCK;
              interruptible_sleep_on(&device->read_wait);
              if (current->signal & ~current->blocked)
               return -ERESTARTSYS;
             }
            device->header.pack_len=device->header.reply_len;
            device->header.result=0;   <<<<<<<------- comment out.
            if (count>=sizeof(struct sg_header))

            [snip]

            In sg_command_done...

            [snip]


            memcpy(device->header.sense_buffer, SCpnt->sense_buffer, sizeof(SCpnt->sense_buffer));

            [Comment out this block]
            if (SCpnt->sense_buffer[0])      
             {
              device->header.result=EIO;
             }

    2.  Using the default configuration (SG_BIG_BUFF == 32768). A read
        or write of data buffers with size > 32256 < 32768 will fail.
        This is because the code in sg_write rounds the buffer size up to 
        the next 512 byte boundary which is 32768 in this case.  In sg_malloc
        some code checks if the buffer size is < SG_BIG_BUFF.  Since the
        buffer size is == SG_BIG_BUFF, the sg_malloc function fails.

        The solution is to change the line in sg_malloc:

        static char *sg_malloc(int size)
         {
          if (size<=4096)
           return (char *) scsi_malloc(size);
        #ifdef SG_BIG_BUFF
          if (size<SG_BIG_BUFF)   <<<<----- change to if (size <= SG_BIG_BUFF)
           {
            while(big_inuse)
             {

             [snip]



home help back first fref pref prev next nref lref last post