[107788] in Cypherpunks

home help back first fref pref prev next nref lref last post

CDR: Redundancy and Randomness [Information Theory] (fwd)

daemon@ATHENA.MIT.EDU (Jim Choate)
Sat Jan 23 21:07:35 1999

From: Jim Choate <ravage@EINSTEIN.ssz.com>
To: cypherpunks@EINSTEIN.ssz.com
Date: Sat, 23 Jan 1999 19:47:22 -0600 (CST)
Reply-To: Jim Choate <ravage@EINSTEIN.ssz.com>


----- Forwarded message from mgraffam@idsi.net -----

From: mgraffam@idsi.net
Date: Sat, 23 Jan 1999 19:21:44 -0500 (EST)
Subject: CDR: Redundancy and Randomness [Information Theory]

I've been working on some stuff lately, and it occurred to me that
it hinges on the assumption that if a message has 0 redundancy, then it
consists of essentially random bits. 

I'm not comfortable with this assumption intuitively, so I was wondering
if anyone knows of a proof that such a message must be random, or can
think of an example where this would not be the case. 

----- End of forwarded message from mgraffam@idsi.net -----

There is a questions I have for clarification. However, I think I understand
what you are asking....

What does one mean by redundancy?

Now if we assume we mean a repetition of a pattern, then we must then ask
within what scale?

For example a random pattern will be redundant at the bit-pair level because
there will be repetitions of ...10... or ...01... all through the sequence.
As we extend this pattern length we become able to use compression because we
can substitution a code for a particular bit string where that code is of a
shorter length than the bit stream it is replacing and that bit string
appears often enough to be worth replacing. For example if one had a very
long bit sequence and within a particular scale a bit sequence appears twice
it wouldn't be worth compressing. In order to build the dictionary, or
equation, or whatever we need to include the bit sequence itself as a sort
of index at least once.

Now an infinitely long random bit sequence would allow for some compression
but it would be limited upon the particular lengths of bit sequences you
were looking at and how often they occurred within the span of the random
sequence you were looking at. If the sequence is truly random this would
imply that only short bit sequences were repeated (low information content)
rather than long sequences (high information content).


    ____________________________________________________________________

          What raises the standard of living may well diminish the
          quality of life.

                                                 The Club of Rome

       The Armadillo Group       ,::////;::-.          James Choate
       Austin, Tx               /:'///// ``::>/|/      ravage@ssz.com
       www.ssz.com            .',  ||||    `/( e\      512-451-7087
                           -====~~mm-'`-```-mm --'-
    --------------------------------------------------------------------



home help back first fref pref prev next nref lref last post