[89] in linux-scsi channel archive
Kernel panic (non-fatal) with AHA1542B in 1.1.94 sd_open
daemon@ATHENA.MIT.EDU (Al Longyear)
Wed Mar 1 17:25:14 1995
Date: Wed, 1 Mar 1995 08:58:24 -0800
To: linux-scsi@vger.rutgers.edu
From: longyear@netcom.com (Al Longyear)
I am having a great deal of trouble getting my Adaptec 1542B SCSI
controller to work with the secondary drives under the newer kernels.
It works correctly with the 1.1.85 kernel that I was using before I
needed to rebuild the libc code. According to H.J., I need a new kernel,
so, OK, I'll update the kernel. Now the new kernel will not run.
I thought that this might be the same problem that others have had with
the new ext2fs fsck. It dies in the same manner whether I run
the fsck operations in parallel or in serial.
The kernels generate the following panic:
[leading insignificant messages cut.]
Configuring Adaptec at IO:330, IRQ 11, DMA priority 5
scsi0 : Adaptec 1542
scsi : 1 host.
Vendor: QUANTUM Model: LP240S GM240S01X Rev: 4.6
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sda at scsi0, id 1, lun 0
Vendor: TOSHIBA Model: CD-ROM XM-3301TA Rev: 2411
Type: CD-ROM ANSI SCSI revision: 02
Detected scsi CD-ROM sr0 at scsi0, id 4, lun 0
scsi : detected 1 SCSI cdrom 1 SCSI disk total.
SCSI Hardware sector size is 512 bytes on device sda
Memory: 15148k/16384k available (640k kernel code, 384k reserved, 212k data)
[other insignificant messages cut.]
Linux version 1.1.94 (root@longyear) (gcc version 2.6.3) #2 Sat Feb 25
12:43:24 PST 1995
Partition check:
sda: sda1
hda: WDC AC2540H, 515MB w/128KB Cache, CHS=1048/16/63, MaxMult=16
hda: hda1 hda2 hda3 < hda5 >
VFS: Mounted root (ext2 filesystem) readonly.
At this point the system runs the rc scripts and tries to validate the file
systems listed in fstab. The first partition is on a normal IDE controller and
this works correctly. The second partition is on the SCSI controller and
this dies with the following panic from the kernel drivers.
sd_sizes is in the eax register at the time of the trap. The edx register has
the dev number, which is 1.
Unable to handle kernel NULL pointer dereference at virtual address c0000004
current->tss.cr3 = 00f03000, %cr3 = 00f03000
*pde = 00102067
*pte = 00000027
Oops: 0000
EIP: 0010:00196fc8
EFLAGS: 00010246
eax: 00000000 ebx: 00000000 ecx: 0008a7e0 edx: 00000001
esi: 00000000 edi: 0008ca80 ebp: 00f08000 esp: 00f55f5c
ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
Process fsck.ext2 (pid: 9, process nr: 3, stackpage=00f55000)
Stack: 0008ca80 00000003 00000000 0012460c 0008ca80 0008a7e0 0008a7e0 00122bd3
0008ca80 0008a7e0 00000000 00000001 00000002 bfffff4f 0008ca80 00122c6a
00f08000 00000002 00000001 00f12000 0002a000 bfffff59 00f08000 0010efa1
Call Trace: 0012460c 00122bd3 00122c6a 0010efa1
Code: 83 3c 90 00 74 91 8b 0d 4c b4 1a 00 89 f2 c1 e2 04 8b 44 11
The kernel continues to run. It just terminates the fsck process, and the
rc scripts carry on with the usual message that you must reboot
immediately.
The problem lies in the function sd_open. By the time that it is called
to open the drive, the sd_sizes pointer is zero.
The problem is that the pointer should have been initialized in the sd_init
function.
I cannot use the current kernels with this code in place. Do you have
any idea what may be wrong with the changes? I suppose that I could put
back the older sd.c code and get the current kernels to work, but that is
not the 'proper' way to use the Linux kernels.
I have thought of changing the rc file's fsck operation so that it validates
only the root partition, mounts it read/write, then mounts and unmounts
the secondary partition, runs fsck on it, and then mounts it again. The
trap does not seem to occur if the partition has been mounted first: I mounted
it, unmounted it, and then ran fsck manually without the problem.
Does anyone have any good idea what may have changed since 1.1.84 (my prior
WORKING kernel version)?
I am willing to supply the trace information if this will help.
The /etc/fstab which I use is:
/dev/hda1   /       ext2    defaults    0 1
none        /proc   proc    defaults    0 0
/dev/sda1   /mnt    ext2    defaults    0 2
sd.c has the following code for the sd_open function.
static int sd_open(struct inode * inode, struct file * filp)
{
    int target;
    target = DEVICE_NR(MINOR(inode->i_rdev));

    if(target >= sd_template.dev_max || !rscsi_disks[target].device)
        return -ENXIO;   /* No such device */

    /* Make sure that only one process can do a check_change_disk at one time.
       This is also used to lock out further access when the partition table is
       being re-read. */

    while (rscsi_disks[target].device->busy);

    if(rscsi_disks[target].device->removable) {
        check_disk_change(inode->i_rdev);

        if(!rscsi_disks[target].device->access_count)
            sd_ioctl(inode, NULL, SCSI_IOCTL_DOORLOCK, 0);
    };
    /*
     * See if we are requesting a non-existent partition.  Do this
     * after checking for disk change.
     */
    if(sd_sizes[MINOR(inode->i_rdev)] == 0)   // IT DIES HERE!!!
        return -ENXIO;

    rscsi_disks[target].device->access_count++;
    if (rscsi_disks[target].device->host->hostt->usage_count)
        (*rscsi_disks[target].device->host->hostt->usage_count)++;
    return 0;
}
The assembled code is:
 200 0210 8B441808      movl 8(%eax,%ebx),%eax
 201 0214 83C404        addl $4,%esp
 202 0217 83781000      cmpl $0,16(%eax)
 203 021b 7512          jne L236
 204 021d 6A00          pushl $0
 205 021f 68805300      pushl $21376            << this is the DOORLOCK
 205      00
 206 0224 6A00          pushl $0
 207 0226 57            pushl %edi
 208 0227 E8D4FDFF      call _sd_ioctl
 208      FF
 209 022c 83C410        addl $16,%esp
 210                L236:
 211 022f 0FB65710      movzbl 16(%edi),%edx
 212 0233 A1242700      movl _sd_sizes,%eax
 212      00
 213 0238 833C9000      cmpl $0,(%eax,%edx,4)   <<<<< DEATH HERE
 214 023c 7491          je L231
 215 023e 8B0D9426      movl _rscsi_disks,%ecx
 215      0000
 216 0244 89F2          movl %esi,%edx
--
Al Longyear longyear@netcom.com longyear@sii.com
The public pgp 2.6 key is available by fingering longyear@netcom.com.