[1491] in linux-scsi channel archive

Re: elevator sorting for the scsi subsystem.

daemon@ATHENA.MIT.EDU (Leonard N. Zubkoff)
Sun Mar 2 15:54:06 1997

Date: 	Sun, 2 Mar 1997 12:44:37 -0800
From: "Leonard N. Zubkoff" <lnz@dandelion.com>
To: Dario_Ballabio@milano.europe.dg.com
CC: linux-scsi@vger.rutgers.edu
In-reply-to: <199703022017.VAA21095@milano.europe.dg.com>
	(Dario_Ballabio@milano.europe.dg.com)

  Date: Sun, 2 Mar 1997 21:17:04 +0100
  From: Dario_Ballabio@milano.europe.dg.com

  If a request writes sectors 2 and 3, filling the two blocks with 1's,
  and then another request writes sectors 1 and 2, filling them with 0's,
  sector 2 should end up filled with 0's. If we sort the two
  requests, or if the drive reorders them, sector 2 ends up
  filled with 1's, which is not the expected result.
  I checked that there are (rare) cases of overlapping write requests
  in normal operation, typically at the very beginning of the
  disk partition. My concern is that if we use simple queue tags
  for write requests, cases like the above are likely to happen sooner
  or later, causing random disk corruption.
  I believe that we can safely sort write requests only provided that
  there are no overlapping requests in the batch to be sorted.
  In any case I would always use ordered queue tags for write requests.
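
The scenario quoted above can be sketched as a small simulation. This is only an illustration, not kernel or driver code: the request tuple layout and the names apply_writes, overlaps, safe_to_sort and elevator_sort are hypothetical, and the "disk" is a plain list. It shows that swapping the two overlapping writes changes what ends up in sector 2, and how a batch could be checked for overlap before sorting.

```python
from itertools import combinations

# Hypothetical request format: (first_sector, sector_count, fill_value).

def apply_writes(requests, disk_size=4):
    """Apply the writes in the given order and return the final sector contents."""
    disk = [None] * disk_size
    for first, count, fill in requests:
        for sector in range(first, first + count):
            disk[sector] = fill
    return disk

def overlaps(a, b):
    """True if the sector ranges of two requests share at least one sector."""
    return a[0] < b[0] + b[1] and b[0] < a[0] + a[1]

def safe_to_sort(batch):
    """True if no two requests in the batch touch the same sector."""
    return not any(overlaps(a, b) for a, b in combinations(batch, 2))

def elevator_sort(batch):
    """Sort by starting sector only when the batch has no overlapping requests."""
    if safe_to_sort(batch):
        return sorted(batch, key=lambda r: r[0])
    return list(batch)  # keep the issue order

# The two writes from the message, in the order they were issued.
issued = [(2, 2, 1), (1, 2, 0)]
in_order = apply_writes(issued)
reordered = apply_writes(sorted(issued, key=lambda r: r[0]))
print(in_order[2], reordered[2])
```

With the issue order preserved, sector 2 holds 0; after a naive sort by starting sector it holds 1. The guarded elevator_sort leaves this batch alone because the two requests overlap.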

If overlapping I/O's are generated and the driver sorts the requests, then you
are correct that we could have a problem, but I don't think you can convince
the kernel to rewrite sector 2 until the first I/O writing sectors 2 and 3 is
completed.  Once an I/O request is made in make_request, the buffer header is
locked (preventing further requests) until the I/O completes in
end_scsi_request.

However, even this wouldn't be a problem if the disk does the reordering,
though it is a problem if the driver or disk controller does.  Unless the disk
is specifically allowed to reorder the effect of commands by setting the Queue
Algorithm Modifier field in the Control Mode Page to 1, it will prevent the
case above (see below for excerpt from SCSI-2 spec).

I've been running systems with tagged queuing in this fashion, even with the
Queue Algorithm Modifier bit set to 1, for almost two years now, and have never
had such a corruption problem.  And ordered queue tags are only generated when
necessary to avoid starvation.

If there is a special case where we can generate overlapping write requests as
you indicate, I think it would be better to prevent this special case from
occurring than constrain our ability to use tagged queuing.

		Leonard




The queue algorithm modifier field (see table 97) specifies restrictions
on the algorithm used for reordering commands that are tagged with the
SIMPLE QUEUE TAG message.

                     Table 97 - Queue algorithm modifier
+===========-====================================+
|   Value   |  Definition                        |
|-----------+------------------------------------|
|     0h    |  Restricted reordering             |
|     1h    |  Unrestricted reordering allowed   |
|  2h - 7h  |  Reserved                          |
|  8h - Fh  |  Vendor-specific                   |
+================================================+


A value of zero in this field specifies that the target shall order the
actual execution sequence of the commands with a SIMPLE QUEUE tag such
that data integrity is maintained for that initiator.  This means that,
if the transmission of new commands is halted at any time, the final value
of all data observable on the medium shall have exactly the same value as
it would have if the commands had been executed in the same received
sequence without tagged queuing.  The restricted reordering value shall
be the default value.

A value of one in this field specifies that the target may reorder the
actual execution sequence of the commands with a SIMPLE QUEUE tag in any
manner.  Any data integrity exposures related to command sequence order
are explicitly handled by the initiator through the selection of
appropriate commands and queue tag messages. 
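
As a rough sketch of how an initiator might inspect this setting, the following decodes the field from raw Control Mode Page bytes. The excerpt above does not give the byte layout, so the code assumes the SCSI-2 layout in which the page code is 0Ah and the queue algorithm modifier occupies the upper four bits of byte 3. The function names and the sample page bytes are made up for illustration.

```python
def describe_queue_algorithm_modifier(value):
    """Map a 4-bit queue algorithm modifier value to its Table 97 definition."""
    if not 0x0 <= value <= 0xF:
        raise ValueError("queue algorithm modifier is a 4-bit field")
    if value == 0x0:
        return "Restricted reordering"
    if value == 0x1:
        return "Unrestricted reordering allowed"
    if value <= 0x7:
        return "Reserved"
    return "Vendor-specific"

def queue_algorithm_modifier(control_mode_page):
    """Extract the field from raw Control Mode Page bytes.

    Assumption (not stated in the excerpt): page code 0Ah in the low six
    bits of byte 0, and the field in bits 7-4 of byte 3.
    """
    if len(control_mode_page) < 4:
        raise ValueError("page too short")
    if control_mode_page[0] & 0x3F != 0x0A:
        raise ValueError("not a Control Mode Page")
    return control_mode_page[3] >> 4

# Made-up sample page with the field set to 1h.
sample_page = bytes([0x0A, 0x06, 0x00, 0x10, 0x00, 0x00, 0x00, 0x00])
value = queue_algorithm_modifier(sample_page)
print(value, describe_queue_algorithm_modifier(value))
```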
