[2510] in linux-scsi channel archive


Re: HP SCSI DAT problems..

daemon@ATHENA.MIT.EDU (Pete Popov)
Mon Sep 22 13:37:30 1997

To: Ivo Clarysse <soggie@riv.be>
cc: linux-scsi@vger.rutgers.edu
In-reply-to: Your message of "Mon, 22 Sep 1997 15:16:42 +0200."
             <Pine.LNX.3.95.970922151254.12927A-100000@daffodil.riv.be> 
Date: 	Mon, 22 Sep 1997 09:39:45 -0700
From: Pete Popov <pete@jones.asd.sel.sony.com>


Hello Ivo,

I've inserted some comments here and there. I hope they help.

> I can't get my system to work with my HP SCSI DAT drive.. :-(((

> Attached devices: 
> Host: scsi0 Channel: 00 Id: 04 Lun: 00
>   Vendor: HP       Model: HP35470A         Rev: T503
>   Type:   Sequential-Access                ANSI SCSI revision: 02
> Host: scsi0 Channel: 00 Id: 05 Lun: 00
>   Vendor: QUANTUM  Model: FIREBALL ST3.2S  Rev: 0F04
>   Type:   Direct-Access                    ANSI SCSI revision: 02
> Host: scsi0 Channel: 00 Id: 06 Lun: 00
>   Vendor: QUANTUM  Model: FIREBALL ST3.2S  Rev: 0F04
>   Type:   Direct-Access                    ANSI SCSI revision: 02
> 
> Also tried with an HP C1599A DDS-2, which reports as:
>   Vendor: HP       Model: C1533A           Rev: A612
> 
> The system is a RedHat/Intel 4.2 system,
> kernel-2.0.30-3.i386.rpm (redhat 4.2 update from RedHat)
> mt-st-0.4-2 (original redhat 4.2 distribution)

> Problem description:
> --------------------
> 
> I can't do large backups (larger than a couple of megabytes)..  this 
> results in:
> 
> kernel: scsi : aborting command due to timeout : pid 204223, scsi0, channel 0, 
> id 5, lun 0 Read (6) 01 c0 53 02 00
> kernel: aic7xxx: (abort) Aborting scb 1, TCL 5/0/0
> kernel: scsi : aborting command due to timeout : pid 204224, scsi0, channel 0, 
> id 5, lun 0 Read (6) 05 68 60 02 00
> kernel: aic7xxx: (abort) Aborting scb 0, TCL 5/0/0
> kernel: scsi : aborting command due to timeout : pid 204223, scsi0, channel 0, 
> id 5, lun 0 Read (6) 01 c0 53 02 00
>                                                      ^^^^^^
>                    weird.. I don't have any processes running at that
>                 time with a PID above 9000..
> kernel: aic7xxx: (abort) Aborting scb 1, TCL 5/0/0
> kernel: SCSI host 0 abort (pid 204223) timed out - resetting
> kernel: SCSI bus is being reset for host 0 channel 0.

> Small test-backups work fine though..

Apparently it's the command to the hard disk that's timing out.
Unless this is a Linux driver bug, I suspect that the HP drive does not 
always disconnect from the bus, thus starving the rest of the SCSI devices.
Get another SCSI adapter and attach the tape drive to it by itself.  I
think that's a good idea anyway, since tape devices are inherently slow
and will slow down your system significantly -- especially if the tape
device is not a good SCSI citizen (I've seen plenty of those).
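
One way to check which device the timeouts belong to is to count them
per SCSI id and compare against the device list. The following is only a
rough sketch: it assumes each kernel message sits on a single line in
the log (the quoted lines above are wrapped by the mailer), and the
function name timeouts_by_id is made up for illustration.

```shell
# Hypothetical helper: count "aborting command due to timeout" messages
# per SCSI id/lun from kernel log text on stdin.
# Assumes one message per line, in the format quoted above.
timeouts_by_id() {
    # Keep only the timeout lines and reduce each to "<id> <lun>".
    sed -n '/aborting command due to timeout/s/.*id \([0-9][0-9]*\), lun \([0-9][0-9]*\).*/\1 \2/p' |
        sort | uniq -c |
        awk '{ printf "id %s lun %s: %s timeouts\n", $2, $3, $1 }'
}

# Example (log path is an assumption): timeouts_by_id < /var/log/messages
```

In the log quoted above every timeout is for id 5, which the device
list shows as one of the QUANTUM disks, not the tape drive at id 4.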

> Also, when I do 'mt erase', and then, after a few seconds, 'mt rewind' - the
> tape device locks up.

That's strange.  Are you putting the erase command in the background
and then issuing the rewind?  Otherwise, the device will not come back
with good status until the erase is done (unless it's erase "immediate"),
but at that time any command should be OK.  Perhaps the HP drive is not
done with the erase even though it returned good status??  Are any of
the LEDs still blinking even though the erase "completed"?

In any case, there's really no reason to do an erase. I put a "date" mark
on each of my tapes:

mt -f /dev/st0 rewind
date > date.log
tar -cf /dev/nst0 date.log

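That effectively writes the "date" at the beginning of tape, and
the rest of the tape is "erased": writing there sets a new end-of-data
point, so whatever was recorded past it can no longer be read.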
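
To check the mark later, it should be enough to rewind and extract
date.log from the first archive on the tape. A small sketch, assuming
GNU tar (for -O, which writes the extracted file to standard output);
the function name read_date_mark is hypothetical.

```shell
# Hypothetical helper: print the date mark stored in the first tar
# archive on the given device (or file).
# Assumes the mark was written as date.log, as in the commands above.
read_date_mark() {
    tar -xOf "$1" date.log
}

# Example: mt -f /dev/st0 rewind && read_date_mark /dev/st0
```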

> What is weird, when I reboot afterwards, and do an 'mt status'; I get
> "ONLINE IM_REP_EN"; then I do 'mt rewind', and immediately it says
> "BOT ONLINE IM_REP_EN" when I do another 'mt status'.. Shouldn't it
> take some time to rewind ?

No, it's already at Beginning of Tape (BOT).

> Also, when I do an 'mt erase', and then an 'mt tell', it locks up.

> The SCSI bus is attached to the SCSI host adapter, which is terminated 
> (according to the manual).  The host adapter does not have an external SCSI 
> connector.  The flatcable has an active terminator installed on the other
> end. (which was included with the HP NetServer E30).

>        +------+ +-----+ +-----+
>        | Tape | | HDD | | HDD |
>        +------+ +-----+ +-----+
>  [Term]----+-------+-------+-------[Host Adapter]

>   Term= Active Terminator

> I patched the aic7xxx.o driver to negotiate at 5MHz instead of 10MHz, but
> that didn't help.

> I tried out the different DIP switch settings for RS6000 instead of those
> suggested by HP for 'PC-based Unix systems' (which are the same
> as for WinNT/MS-DOS/..). That didn't help either.

Unlike proprietary systems that restrict the use of "after market"
products (i.e., products not directly purchased from the vendor of the
system) by adding ridiculous settings to the system drivers, Linux
doesn't do that.  Thus, I suggest leaving the DIP switches in their
default positions.

> I mailed HP SureStore support, but they refused to help me:
>     "Unfortunatly, The DAT isn't supported by us on your "flavor" of Unix."

May I suggest switching vendors :-) next time? 

> Any clues, or -even better- solutions ?  Please ?

Again, unless someone can help you patch the Linux driver to
wait much longer before timing out, I think adding a second
SCSI adapter may be your best choice.  Try an NCR 53815; it's
what I have and it works quite well.


 Pete Popov
 Sony Electronics
 Advanced Storage Development
 3300 Zanker Rd, SJ3B2
 San Jose, CA 95134
 pete_popov@asd.sel.sony.com


