[29032] in Hotline Meeting

tardis hdisk1

daemon@ATHENA.MIT.EDU (hartmans@MIT.EDU)
Fri Jul 7 16:45:01 1995

From: hartmans@MIT.EDU
Date: Fri, 7 Jul 1995 16:44:53 -0400
To: hotline@MIT.EDU
Cc: wdc@MIT.EDU, probe@MIT.EDU, op@MIT.EDU, cfields@MIT.EDU

	OK, after a long and arduous struggle, I fixed tardis so that
it boots.  After examining the error logs, I determined that the
problem was caused by bad block replacement on hdisk1, the root disk.
As blocks that could not be read were replaced, large runs of nulls
were written into the affected files; one of the files with significant
corruption was /usr/lib/objrepos/PdAt, the list of predefined
attributes.  This file must be structurally intact for a successful boot.
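
	For anyone who wants to check other files on the disk for the
same kind of damage, here is a minimal sketch (not the method used
above) that reports long runs of NUL bytes in a file.  The function
name find_null_runs and the 512-byte threshold are assumptions made
for illustration; binary files such as the ODM databases may contain
some legitimate NUL padding, so a hit is a hint to look closer, not
proof of corruption.

```python
def find_null_runs(data: bytes, min_len: int = 512):
    """Return (offset, length) for each run of NUL bytes >= min_len.

    min_len is a hypothetical threshold; tune it for the file type.
    """
    runs = []
    start = None  # offset where the current NUL run began, if any
    for i, b in enumerate(data):
        if b == 0:
            if start is None:
                start = i
        elif start is not None:
            # The run ended at i; keep it only if it is long enough.
            if i - start >= min_len:
                runs.append((start, i - start))
            start = None
    # Handle a run that extends to the end of the data.
    if start is not None and len(data) - start >= min_len:
        runs.append((start, len(data) - start))
    return runs


if __name__ == "__main__":
    import sys

    # Usage: python find_null_runs.py FILE [FILE ...]
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            for offset, length in find_null_runs(f.read()):
                print(f"{path}: {length} NUL bytes at offset {offset}")
```

Comparing the output against a known-good copy of the same file from
another machine would help separate normal padding from damage.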

	In other words, the problem was caused by AIX dealing with a
bad hard disk.  This isn't surprising; hdisk1 is one of the ill-fated
857 MB disks that IBM issued several warnings about.  These disks are
notorious for failing.

	It would be a good idea to replace this disk.  Since we don't
want to reinstall, the swap will require coordination between me and
the person who replaces the disk; the replacement will need to proceed
in the following order:

* Remove tardis's current hdisk0
* Insert the new disk with SCSI ID the same as hdisk0
* Migrate data from hdisk1 to hdisk0
* Remove hdisk1
* Replace it with tardis's old hdisk0, using the SCSI ID of hdisk1

--Sam
