[5357] in Hotline Meeting


themis disk controller

daemon@ATHENA.MIT.EDU (John Carr)
Sat Jun 8 21:57:58 1991

To: ops@ATHENA.MIT.EDU
Cc: hotline@ATHENA.MIT.EDU
Date: Sat, 08 Jun 91 21:57:32 EDT
From: John Carr <jfc@ATHENA.MIT.EDU>


Themis crashed tonight with disk errors.  They started around 6 PM:

[system had been up for a few days, normal NFS server messages]
Jun  8 18:16:07 themis vmunix: uda0: controller error, sa=800e<ERR>
Jun  8 18:16:17 themis vmunix: uda0: DMA burst size set to 4
Jun  8 18:16:17 themis vmunix: ra0: ra82, size = 1216665 sectors
Jun  8 18:16:17 themis vmunix: ra2: ra90, size = 2376153 sectors
Jun  8 18:16:17 themis vmunix: ra0: unit 0, nspt 57, group 1, ntpc 15, rctsize 912, nrpt 1, nrct 4
Jun  8 18:16:17 themis vmunix: ra2: unit 2, nspt 69, group 1, ntpc 13, rctsize 414, nrpt 1, nrct 4
Jun  8 20:20:47 themis vmunix: 9, group 1, ntpc 13, rctsize 414, nrpt 1, nrct 4
Jun  8 20:20:48 themis vmunix: ra3: unit 3, nspt 69, group 1, ntpc 13, rctsize 414, nrpt 1, nrct 4
Jun  8 20:20:49 themis vmunix: uda0: soft error datagram: unit 2: level 0 retry 0, lbn 1213352: - unknown code (??) (code 25, subcode 1)
Jun  8 20:20:50 themis vmunix: uda0: soft error datagram: unit 2: level 0 retry 0, lbn 1213353: - unknown code (??) (code 25, subcode 1)
Jun  8 20:20:50 themis vmunix: uda0: lost interrupt
Jun  8 20:20:51 themis vmunix: uda0: Sudden Death!!!
[normal reboot messages, including disk online notices]
Jun  8 20:21:11 themis vmunix: uda0: controller error, sa=8c20<ERR,STEP1,NV>
Jun  8 20:21:11 themis vmunix: uda0: DMA burst size set to 4
Jun  8 20:21:12 themis vmunix: ra0: ra82, size = 1216665 sectors
[repeat for ra1, ra2, ra3]
Jun  8 20:21:15 themis vmunix: ra0: unit 0, nspt 57, group 1, ntpc 15, rctsize 912, nrpt 1, nrct 4
[repeat for ra1, ra2, ra3]
Jun  8 20:21:17 themis vmunix: uda0: controller error, sa=800e<ERR>
Jun  8 20:21:18 themis vmunix: uda0: lost interrupt
Jun  8 20:21:19 themis vmunix: uda0: Sudden Death!!!
[more reboot and disk online messages]
Jun  8 20:21:38 themis vmunix: uda0: controller error, sa=8005<ERR,GO>
Jun  8 20:21:39 themis vmunix: uda0: DMA burst size set to 4
Jun  8 20:21:40 themis vmunix: ra0: ra82, size = 1216665 sectors
[repeat for ra1, ra2, ra3]
Jun  8 20:21:42 themis vmunix: ra0: unit 0, nspt 57, group 1, ntpc 15, rctsize 912, nrpt 1, nrct 4
[repeat for ra1, ra2, ra3]

It looks like it needs a new disk controller.
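For anyone who wants to tally these failures from a longer syslog, here is a minimal sketch in Python. It is not part of the original report. It pulls out the `uda0: controller error` lines, decodes the `sa` value, and counts the "Sudden Death" resets. The bit masks in `SA_BITS` are an assumption: only the four names the kernel printed above are listed, with values chosen to agree with those three log lines. The function names `decode_sa` and `summarize` are hypothetical.

```python
import re

# ASSUMED masks for the uda0 "sa" status word. Only the names that appear
# in the log are listed; the values are an assumption that happens to be
# consistent with all three sa= lines the kernel printed.
SA_BITS = [(0x8000, "ERR"), (0x0800, "STEP1"), (0x0400, "NV"), (0x0001, "GO")]

# Matches e.g. "uda0: controller error, sa=8c20<ERR,STEP1,NV>"
SA_RE = re.compile(r"uda0: controller error, sa=([0-9a-f]+)<([^>]*)>")


def decode_sa(value):
    """Return the names of the assumed flag bits set in an sa value."""
    return [name for mask, name in SA_BITS if value & mask]


def summarize(lines):
    """Collect controller errors and count 'Sudden Death' messages."""
    errors = []
    deaths = 0
    for line in lines:
        match = SA_RE.search(line)
        if match:
            value = int(match.group(1), 16)
            logged = match.group(2).split(",")
            # line[:15] is the syslog timestamp, e.g. "Jun  8 18:16:07"
            errors.append((line[:15], value, decode_sa(value), logged))
        elif "Sudden Death" in line:
            deaths += 1
    return errors, deaths


# Lines copied verbatim from the log above.
SAMPLE = [
    "Jun  8 18:16:07 themis vmunix: uda0: controller error, sa=800e<ERR>",
    "Jun  8 20:20:51 themis vmunix: uda0: Sudden Death!!!",
    "Jun  8 20:21:11 themis vmunix: uda0: controller error, sa=8c20<ERR,STEP1,NV>",
    "Jun  8 20:21:19 themis vmunix: uda0: Sudden Death!!!",
    "Jun  8 20:21:38 themis vmunix: uda0: controller error, sa=8005<ERR,GO>",
]

if __name__ == "__main__":
    errors, deaths = summarize(SAMPLE)
    for stamp, value, decoded, logged in errors:
        print(stamp, hex(value), decoded, "kernel said:", logged)
    print("Sudden Death count:", deaths)
```

Comparing the decoded names with the names the kernel logged is a cheap check on the assumed masks. Any low-order bits left unnamed (such as the `0e` in `800e`) are not interpreted here.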

If someone wants to look for a better way to handle lost interrupts
than rebooting without syncing disks, this controller might be a good
test case (once it has been moved to a non-server).

