[89] in linux-scsi channel archive


Kernel panic (non-fatal) with AHA1542B in 1.1.94 sd_open

daemon@ATHENA.MIT.EDU (Al Longyear)
Wed Mar 1 17:25:14 1995

Date: Wed, 1 Mar 1995 08:58:24 -0800
To: linux-scsi@vger.rutgers.edu
From: longyear@netcom.com (Al Longyear)

I am having a great deal of trouble getting my Adaptec 1542B SCSI
controller to work with the secondary drives under the newer kernels.

It works correctly with the 1.1.85 kernel that I was using before I
needed to rebuild the libc code. According to H.J., I need a new kernel,
so, OK, I'll update the kernel. Now the new kernel will not run.

I thought that this might be the same problem that others have had with
the new ext2fs fsck. It dies in the same manner whether I run the fsck
operations in parallel or in serial.

The kernels generate the following panic:

[leading insignificant messages cut.]

Configuring Adaptec at IO:330, IRQ 11, DMA priority 5
scsi0 : Adaptec 1542
scsi : 1 host.
  Vendor: QUANTUM   Model: LP240S GM240S01X  Rev: 4.6 
  Type:   Direct-Access                      ANSI SCSI revision: 02
Detected scsi disk sda at scsi0, id 1, lun 0
  Vendor: TOSHIBA   Model: CD-ROM XM-3301TA  Rev: 2411
  Type:   CD-ROM                             ANSI SCSI revision: 02
Detected scsi CD-ROM sr0 at scsi0, id 4, lun 0
scsi : detected 1 SCSI cdrom 1 SCSI disk total.
SCSI Hardware sector size is 512 bytes on device sda
Memory: 15148k/16384k available (640k kernel code, 384k reserved, 212k data)

[other insignificant messages cut.]

Linux version 1.1.94 (root@longyear) (gcc version 2.6.3) #2 Sat Feb 25
12:43:24 PST 1995
Partition check:
  sda: sda1
  hda: WDC AC2540H, 515MB w/128KB Cache, CHS=1048/16/63, MaxMult=16
  hda: hda1 hda2 hda3 < hda5 >
VFS: Mounted root (ext2 filesystem) readonly.

At this point the system runs the rc scripts and tries to validate the file
systems listed in fstab. The first partition is on a normal IDE controller and
this works correctly. The second partition is on the SCSI controller and
this dies with the following panic from the kernel drivers.

At the time of the trap, the eax register holds sd_sizes (zero) and the edx
register holds the dev number, which is 1.

Unable to handle kernel NULL pointer dereference at virtual address c0000004
current->tss.cr3 = 00f03000, %cr3 = 00f03000
*pde = 00102067
*pte = 00000027
Oops: 0000
EIP:    0010:00196fc8
EFLAGS: 00010246
eax: 00000000   ebx: 00000000   ecx: 0008a7e0   edx: 00000001
esi: 00000000   edi: 0008ca80   ebp: 00f08000   esp: 00f55f5c
ds: 0018   es: 0018   fs: 002b   gs: 002b   ss: 0018
Process fsck.ext2 (pid: 9, process nr: 3, stackpage=00f55000)
Stack: 0008ca80 00000003 00000000 0012460c 0008ca80 0008a7e0 0008a7e0 00122bd3 
       0008ca80 0008a7e0 00000000 00000001 00000002 bfffff4f 0008ca80 00122c6a 
       00f08000 00000002 00000001 00f12000 0002a000 bfffff59 00f08000 0010efa1 
Call Trace: 0012460c 00122bd3 00122c6a 0010efa1 
Code: 83 3c 90 00 74 91 8b 0d 4c b4 1a 00 89 f2 c1 e2 04 8b 44 11 

The kernel continues to run after the trap. Only the fsck process is
terminated, and the system carries on with the usual message that you must
reboot immediately.

The problem lies in the function sd_open. By the time it is called to open
the drive, the sd_sizes pointer is zero. That pointer should have been
initialized in the sd_init function.

I cannot use the current kernels with this code in place. Do you have
any idea what may be wrong with the changes? I suppose that I could put
back the older sd.c code and get the current kernels to work, but that is
not the 'proper' way to use the Linux kernels.

I have thought of changing the rc file's fsck step so that it validates
only the root partition and mounts it read/write, then mounts and unmounts
the secondary partition, runs fsck on it, and mounts it again. The trap does
not seem to occur if the partition has been mounted first: I mounted it,
unmounted it, and then ran fsck manually without the problem.

Does anyone have a good idea what may have changed since 1.1.84 (my prior
WORKING kernel version)?

I am willing to supply the trace information if this will help.

The /etc/fstab which I use is:

/dev/hda1	/	ext2	defaults	0	1
none		/proc	proc	defaults	0	0
/dev/sda1	/mnt	ext2	defaults	0	2

sd.c has the following code for the sd_open function.

static int sd_open(struct inode * inode, struct file * filp)
{
        int target;
	target =  DEVICE_NR(MINOR(inode->i_rdev));

	if(target >= sd_template.dev_max || !rscsi_disks[target].device)
	  return -ENXIO;   /* No such device */
	
/* Make sure that only one process can do a check_change_disk at one time.
 This is also used to lock out further access when the partition table is
being re-read. */

	while (rscsi_disks[target].device->busy);

	if(rscsi_disks[target].device->removable) {
	  check_disk_change(inode->i_rdev);

	  if(!rscsi_disks[target].device->access_count)
	    sd_ioctl(inode, NULL, SCSI_IOCTL_DOORLOCK, 0);
	};
	/*
	 * See if we are requesting a non-existent partition.  Do this
	 * after checking for disk change.
	 */
	if(sd_sizes[MINOR(inode->i_rdev)] == 0)  // IT DIES HERE!!!
	  return -ENXIO;

	rscsi_disks[target].device->access_count++;
	if (rscsi_disks[target].device->host->hostt->usage_count)
	  (*rscsi_disks[target].device->host->hostt->usage_count)++;
	return 0;
}

The assembled code is:

 200 0210 8B441808 		movl 8(%eax,%ebx),%eax
 201 0214 83C404   		addl $4,%esp
 202 0217 83781000 		cmpl $0,16(%eax)
 203 021b 7512     		jne L236
 204 021d 6A00     		pushl $0
 205 021f 68805300 		pushl $21376    << this is the DOORLOCK
 205      00
 206 0224 6A00     		pushl $0
 207 0226 57       		pushl %edi
 208 0227 E8D4FDFF 		call _sd_ioctl
 208      FF
 209 022c 83C410   		addl $16,%esp
 210              	L236:
 211 022f 0FB65710 		movzbl 16(%edi),%edx
 212 0233 A1242700 		movl _sd_sizes,%eax
 212      00
 213 0238 833C9000 		cmpl $0,(%eax,%edx,4)   <<<<< DEATH HERE
 214 023c 7491     		je L231
 215 023e 8B0D9426 		movl _rscsi_disks,%ecx
 215      0000
 216 0244 89F2     		movl %esi,%edx

-- 
Al Longyear               longyear@netcom.com            longyear@sii.com
The public pgp 2.6 key is available by fingering longyear@netcom.com.
