[3138] in Release_Engineering
Dot disk 5 dropped off line
daemon@ATHENA.MIT.EDU (epeisach@MIT.EDU)
Sat Dec 18 19:43:00 1993
From: epeisach@MIT.EDU
Date: Sat, 18 Dec 93 19:42:25 -0500
To: builder@MIT.EDU, op@MIT.EDU, rel-eng@MIT.EDU
I noticed that the fileserver appeared hung on Dot (the server, not the
person), so I tried restarting it. The fileserver went into disk wait and
didn't respond. Then Dot itself hung. It was on the network but wouldn't
respond, so I said 'sh-t' and thought - great, the root disk went off
line.
I got in and tried the three finger halt on Dot - no response. I had to
power cycle it, and disk 5 didn't come back.
BTW: Dot's disks need to be labelled.
I noticed then that there was no power to drive 5, but when I touched the
drive enclosure the power came back on. All the cables appeared to be
secure, but I double checked.
Theory: the power failed on drive 5, which then hung the SCSI bus when
something tried to access it. Someone should keep an eye on this.
There was a complete salvage... There was one piece of damage:
12/18/93 19:22:52 system.sun4.srvd.78 (536979286) updated 12/09/93 13:11
12/18/93 19:22:56 dir vnode 25: invalid entry deleted: ./SUNWabe/ps/ADVOSUG/Credits (vnode 52, unique 44)
Can someone check on this?
Side effects - it looks like this process caused ajax's vlserver to
restart (dumping core in the process).
I also took the liberty of restarting the kaserver and ptserver on ajax
so that the processes no longer think about the old minos. (The config
file was changed a while ago, but ajax has been too reliable to need a
restart since.)
Status:
Dot: All ok (apart from the one deleted file)
Machine room alarm: Armed
--Ezra