[5407] in linux-scsi channel archive
RELEASE: RAID-0,1,4,5 patch 1998.12.14, 2.0.36/2.1.131-ac9
daemon@ATHENA.MIT.EDU (MOLNAR Ingo)
Mon Dec 14 00:50:34 1998
Date: Mon, 14 Dec 1998 06:47:10 +0100 (CET)
From: MOLNAR Ingo <mingo@valerie.inf.elte.hu>
Reply-To: MOLNAR Ingo <mingo@valerie.inf.elte.hu>
To: linux-raid@vger.rutgers.edu, linux-scsi@vger.rutgers.edu
cc: Erik Troan <ewt@redhat.com>, "Stephen C. Tweedie" <sct@redhat.com>,
Doug Ledford <dledford@redhat.com>
this is an alpha release of the latest Linux RAID0145 drivers, against
kernel 2.1.131-ac9 and 2.0.36. (ac10 and ac11 should patch cleanly too)
WARNING: we are still not out of alpha status, some of the new features
were tested only on my box. It should be mostly ok, but a backup never
hurts ...
i have fixed every architectural thing i wanted to fix, now i aim for a
'stable' release of the RAID driver before the end of the year, which will
probably be forked later on. As usual, the patches will follow both the
2.0 and 2.2 kernel line.
the files raid0145-19981214-2.0.36.gz, raidtools-19981214-0.90.tar.gz and
raid0145-19981214-2.1.131-ac9.gz can be found at the usual place:
http://linux.kernel.org/pub/linux/daemons/raid/alpha
(i have sanity compiled/booted these patches on vanilla source trees)
Fixes in this release:
======================
-- reconstruction/resync speed and stability
reconstruction is much more stable and error-resistant, and does not
trash the buffer cache anymore. People who have reported system slowdown
during resync/reconstruction should recheck whether this is still there.
From now on any noticeable system slowdown during reconstruction is
considered a major bug.
-- device name renumbering, SCSI-reordering
i have completely rewritten this feature, and it's much more robust now.
In the last few days i have tested various likely and unlikely
removal/addition and device-renumbering scenarios, and all of them were
properly resolved. Some people have reported MD_BUG() failures for
certain 'crash-test' scenarios; these should all be fixed in this patch.
(please, if anyone has a noncritical array, try to break this patch by
(re)moving disks/IDs. It is also safe to move disks between controllers
and to change the SCSI ID.)
Also there is a new 64-bit 'event counter' in every superblock, this does
not fail when the system clock is nonmonotonic. (we used utime until now,
but this is not reliable across RT-clock-failure) This feature needs no
upgrade, the event counter starts from zero for everybody.
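
(illustration, not code from the patch: a minimal sketch of why a
per-superblock event counter is a safer freshness test than a timestamp.
The Superblock class, its field names and the helper functions are
hypothetical and do not reflect the on-disk layout.)

```python
from dataclasses import dataclass

EVENT_MASK = (1 << 64) - 1  # the counter is described as 64-bit


@dataclass
class Superblock:
    """Hypothetical in-memory view of one member disk's superblock."""
    device: str
    utime: int        # wall-clock update time; may jump backwards
    events: int = 0   # starts from zero, as the announcement says


def bump(sb: Superblock) -> None:
    """Record one superblock update, independent of the system clock."""
    sb.events = (sb.events + 1) & EVENT_MASK


def freshest(superblocks):
    """Pick the most recently updated superblock by event count."""
    return max(superblocks, key=lambda sb: sb.events)


def stale(superblocks):
    """Devices whose superblock missed at least one update."""
    top = freshest(superblocks).events
    return [sb.device for sb in superblocks if sb.events < top]
```

In this sketch a disk whose utime looks newer because the clock was wrong
at some point is still recognised as out of date, since only the event
count is compared.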
-- autodetection/raidstart fixes
autodetection and raidstart got rewritten too, it's now much more robust:
the mechanism behind this is the 128-bit array-UUID that has been silently
generated through /dev/random for every newly created array for some time;
now it's used to categorize arrays.
Disks now 'arrive' into the RAID subsystem, and we have an 'all devices
have arrived' event, after which we start and categorize the pool of
devices detected up to that point. This approach resolves the 'should we
start the array in degraded mode when N-1 disks have arrived' dilemma
pretty nicely.
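
(illustration, not code from the patch: a rough sketch of grouping the
'arrived' devices by array UUID and deciding what to start only once the
'all devices have arrived' event fires. The record layout, the function
names and the max_missing threshold are assumptions made for this example.)

```python
from collections import defaultdict


def categorize(arrived):
    """Group (device, uuid, raid_disks) records by array UUID."""
    arrays = defaultdict(list)
    for device, uuid, raid_disks in arrived:
        arrays[uuid].append((device, raid_disks))
    return arrays


def plan_start(arrived, max_missing=1):
    """Decide per array what to do; call only after all devices arrived.

    max_missing is an assumption: a redundant level is taken to tolerate
    one missing member, while RAID-0 would need max_missing=0.
    """
    plan = {}
    for uuid, members in categorize(arrived).items():
        expected = members[0][1]          # raid_disks from the superblock
        missing = expected - len(members)
        if missing <= 0:
            plan[uuid] = "start"
        elif missing <= max_missing:
            plan[uuid] = "start-degraded"
        else:
            plan[uuid] = "incomplete"
    return plan
```

Because the decision is taken once, over the whole pool, no array has to
be started speculatively when only N-1 of its disks have shown up so far.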
-- set_blocksize() bugfix
when creating new filesystems with different blocksize, all Linux kernels
show a serious slowdown for no good reason, until next reboot. The
problem was that set_blocksize() did not flush old-size buffers
aggressively enough, which resulted in way too big buffer-lists.
-- raidtools fixes
apart from the important --upgrade fix, there is now a new --debug option
which prints the current RAID state into the syslog. Some usability fixes
were added too.
plus countless smaller bugs were fixed along the way, and some Alpha and
Sparc/UltraSparc fixes went in too. But modules are still somewhat broken,
so i recommend compiling the RAID drivers into the kernel statically.
New features/improvements in the release:
=========================================
-- New block_read()
new block_read() implementation from Gadi Oxman: this helps e2fsck
speed and 'hdparm -tT /dev/md0' benchmark numbers.
-- Self-tuning readahead and reconstruction
the RAID code now properly tunes the page-cache to do readahead
depending on the number of data disks. this is also used for
reconstruction. 'bonnie' numbers should improve. (but please also report
if there is degradation somewhere)
-- Improved reconstruction, 'low-prio' IO
i've added a new feature to the buffer-cache: 'low-prio IO requests'. The
reconstruction code tries to be as nonintrusive as possible. The block
layer got extended to detect idle block devices. The resync of my test
array has dropped from 40 minutes to 6 minutes (!).
'low-prio' IO requests are generated during resync, and are detected by
the RAID1 and RAID4/5 driver and are special cased: for resync we do not
have to write back the 'source' disk. This optimization is significant
for both RAID1 and RAID5, eg. for RAID1 we read one disk, and write all
other disks, without rewriting the first disk. This results in faster
resync.
also, resync/reconstruction now does a proper drop-behind pass through
the disk, ie. a resync will create no new buffer-cache pages.
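
(illustration, not code from the patch: a toy model of the RAID1 case,
where resync reads one 'source' disk and writes only the other mirrors.
Disks are modelled as Python lists of blocks; resync_raid1 is a
hypothetical name.)

```python
def resync_raid1(disks, source=0):
    """Copy every block of the source disk to all other mirrors.

    Returns the number of block writes issued. The source disk is only
    read, never rewritten.
    """
    writes = 0
    for block_no, block in enumerate(disks[source]):
        for idx, disk in enumerate(disks):
            if idx == source:
                continue          # skip writing back the source disk
            disk[block_no] = block
            writes += 1
    return writes
```

With n mirrors and b blocks this toy model issues (n-1)*b writes instead
of n*b, which is the saving the special-casing is after.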
-- physical dependency discovery, serialized reconstruction
this is a feature request by Mark Anthony Lisher, we now properly 'delay'
the resync of an array, if a running resync involves partitions on the
same physical disk. Even a single 'overlapping' partition makes the
resync delayed. A 'delayed' array is clearly marked as such in
/proc/mdstat, and will be started when the first array has finished
reconstruction. Independent resyncs will execute in parallel. This
mechanism is just as transparent as resync is, the array is completely
usable.
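
(illustration, not code from the patch: a small sketch of the 'delay the
resync if it shares a physical disk with a running one' rule. Deriving the
physical disk by stripping the trailing partition number from names like
'sda1' is an assumption of this example, as are the function names.)

```python
import re


def physical_disk(partition):
    """Map a partition name to its disk, e.g. 'sda1' -> 'sda' (assumed)."""
    return re.sub(r"\d+$", "", partition)


def schedule(arrays):
    """Split pending resyncs into running and delayed ones.

    arrays maps an array name to its member partitions, in the order the
    resyncs were requested. A single overlapping disk delays the array.
    """
    busy = set()                  # physical disks used by running resyncs
    running, delayed = [], []
    for name, partitions in arrays.items():
        disks = {physical_disk(p) for p in partitions}
        if disks & busy:
            delayed.append(name)
        else:
            running.append(name)
            busy |= disks
    return running, delayed
```

For example, with md0 on sda1+sdb1, md1 on sda2+sdc1 and md2 on sdd1+sde1,
md0 and md2 resync in parallel while md1 is delayed because it shares sda
with md0.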
-- 'noautodetect'
there is a new 'noautodetect' kernel boot option, which can be used to
skip autodetection even if partitions are marked type 0xfd. This is a
debugging aid.
plus most of the 'configuration management' and ioctl code got rewritten
to be more robust.
enjoy. Reports, comments, feature-requests welcome. I'd especially ask
everybody to re-check 'bonnie' and 'hdparm' numbers, i might have missed
some cases.
-- mingo