[2073] in linux-scsi channel archive
Re: compression mode for HP1533A streamer
daemon@ATHENA.MIT.EDU (Trevor Johnson)
Thu Jun 26 15:43:03 1997
Date: Thu, 26 Jun 1997 12:20:01 -0700 (PDT)
From: Trevor Johnson <trevor@jpj.net>
To: ClimServ <climserv@info13.polytechnique.fr>
cc: Uwe Schmeling <uschmeli@gibson.csa.de>, linux-scsi@vger.rutgers.edu
In-Reply-To: <33AFA90A.560C7FEE@info13.polytechnique.fr>
> You _are_ in compressed mode !
> My own experience is that you cannot record much more than 1.7 GB
> on a "2 GB" DAT, with compression off. You can record only a little bit
> more than 2 GB of random binary data in compressed mode.
> I think the announced 2GB native is a raw capacity, which does not
> take into account the blocking factor, file gaps, and so on.
> A 2X compression factor may be reached when you compress very
> redundant data, with long series of identical bytes. This is generally
> not the case with binary files.
A truly random data stream, such as /dev/urandom is meant to resemble, is
incompressible. If you can get 2 GB of such data on your tape, your tape
holds 2 GB uncompressed. Files picked "at random" from your $PATH are
likely to compress by 50% or so with gzip (better yet with bzip), unless
they're already gzexe'd.
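
To make the point concrete, here is a minimal sketch that compares how
well random and highly redundant data compress. It uses Python's zlib
(the same deflate algorithm gzip uses) rather than the drive's own
compression, so the exact ratios are only indicative; the 1 MiB sample
size and the ratio() helper are arbitrary choices for illustration.

```python
import os
import zlib

def ratio(data):
    """Compressed size divided by original size (lower means better)."""
    return len(zlib.compress(data, 9)) / len(data)

SIZE = 1 << 20  # 1 MiB sample; arbitrary, just large enough to be meaningful

random_ratio = ratio(os.urandom(SIZE))   # stand-in for /dev/urandom output
redundant_ratio = ratio(b"\0" * SIZE)    # long run of identical bytes

print(f"random data:    {random_ratio:.3f}")
print(f"redundant data: {redundant_ratio:.3f}")
```

The random sample should come out at about 1.0 (no gain, or a slight
loss to overhead), while the run of identical bytes shrinks to a tiny
fraction of its size.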
> When I have to shrink 2.5 GB of binary, I do not use the internal
> compression scheme of the HP1533. I use
> ...|gzip -9|dd of=<my_device> bs=<my_blocksize> conv=sync .
> It takes a lot of cpu, but it is efficient in terms of compression
> ratio.
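
One caveat with that pipeline: conv=sync pads every short input block
with NULs, and reads from a pipe can come up short, so padding may land
in the middle of the gzip stream. The sketch below shows the reblocking
idea in Python instead: gather full blocks and pad only the last one.
BLOCK_SIZE and write_blocked() are hypothetical names, and an in-memory
buffer stands in for the tape device.

```python
import io
import zlib

BLOCK_SIZE = 10240  # hypothetical tape block size, for illustration only

def write_blocked(src, dst, block_size):
    """Copy src to dst in full blocks, NUL-padding only the final block."""
    blocks = 0
    buf = b""
    while True:
        chunk = src.read(block_size)  # may return fewer bytes than asked
        if not chunk:
            break
        buf += chunk
        while len(buf) >= block_size:
            dst.write(buf[:block_size])
            buf = buf[block_size:]
            blocks += 1
    if buf:
        dst.write(buf.ljust(block_size, b"\0"))  # pad the tail only
        blocks += 1
    return blocks

payload = b"some binary data " * 5000
compressor = zlib.compressobj(9, zlib.DEFLATED, 31)  # wbits=31: gzip format
stream = compressor.compress(payload) + compressor.flush()

tape = io.BytesIO()  # stand-in for the tape device
blocks = write_blocked(io.BytesIO(stream), tape, BLOCK_SIZE)

# Reading back: the gzip trailer marks the end, so tail padding is ignored.
reader = zlib.decompressobj(wbits=31)
restored = reader.decompress(tape.getvalue())
```

Padding after the end of the gzip stream is harmless, because the
decompressor stops at the trailer and leaves the NULs unread.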
I've found that a 166 MHz Pentium is inadequate for this when there are
many incompressible files. It can't sustain a throughput of about
172 kB/s, so even a bottom-feeding DAT drive has to stop streaming and
reposition, which slows it down still more. Another problem with piping
through gzip is that if there's an error on the tape, nothing after that
error can be recovered. There are some backup programs which overcome
this, but I forget which ones (afio maybe?).
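
The usual way around the recovery problem is to compress each file
separately, so that a bad spot costs only the file it falls in. A
minimal sketch of the idea follows, with an in-memory list standing in
for an archive whose member boundaries are stored uncompressed; pack()
and unpack() are hypothetical helpers, not the interface of any
particular backup program.

```python
import zlib

def pack(files):
    """Compress each file on its own so the members stay independent."""
    return [(name, zlib.compress(data, 9)) for name, data in files.items()]

def unpack(members):
    """Return (recovered, lost); a damaged member costs only that file."""
    recovered, lost = {}, []
    for name, blob in members:
        try:
            recovered[name] = zlib.decompress(blob)
        except zlib.error:
            lost.append(name)
    return recovered, lost

files = {
    "a": b"alpha " * 1000,
    "b": b"bravo " * 1000,
    "c": b"charlie " * 1000,
}
members = pack(files)

# Simulate a tape error: corrupt one byte in the middle of the second member.
name, blob = members[1]
damaged = bytearray(blob)
damaged[len(damaged) // 2] ^= 0xFF
members[1] = (name, bytes(damaged))

recovered, lost = unpack(members)
print("recovered:", sorted(recovered))
print("lost:", lost)
```

With one gzip stream over the whole archive, the same single bad byte
would take everything after it along. The trade-off is a somewhat worse
compression ratio, since each member starts with an empty dictionary.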
___
Trevor Johnson