[8401] in cryptography@c2.net mail archive

home help back first fref pref prev next nref lref last post

Re: Perfect compression and true randomness

daemon@ATHENA.MIT.EDU (Arnold G. Reinhold)
Mon Jan 8 15:01:14 2001

Mime-Version: 1.0
Message-Id: <v04210100b67e9fd5563b@[24.218.56.92]>
In-Reply-To: <200101050520.VAA10489@shell7.ba.best.com>
Date: Mon, 8 Jan 2001 09:41:09 -0500
To: Nick Szabo <szabo@best.com>, cryptography@c2.net
From: "Arnold G. Reinhold" <reinhold@world.std.com>
Cc: burton rosenberg <burt@math.miami.edu>
Content-Type: text/plain; charset="iso-8859-1" ; format="flowed"
Content-Transfer-Encoding: quoted-printable

I don't think Chaitin/Kolomogorv complexity is relevant here. In real=20
world systems both parties have a lot of a priori knowledge. Your=20
probably_perfect_compress program is not likely to compress this=20
sentence at all, but PKZIP can.  The probably_perfect_compress=20
argument would work (ignoring run time) if Alice first had to send=20
Bob the entire PKZIP program, but in reality she doesn't. Also=20
discussing "perfect compression" doesn't make sense in the absence of=20
a space of possible messages and a probability distribution on that=20
space.

I don't agree that the assumption of randomness in OTP's is on the=20
same footing as "perfect" compression.  The laws of physics let you=20
put a lower bound on the entropy per bit for practical noise=20
generators.  You can then distill the collected bits to produce fewer=20
bits which are completely random.

In any case, as I tried to point out before, perfect compression,=20
what ever it may be, does not prevent a know-plaintext attack.  If=20
Malfoy knows the plaintext and the compression algorithm, he has=20
every thing he needs to guess or exhaust keys. If he has a large=20
number of plaintexts or can choose plaintexts he might be able to=20
effect more sophisticated attacks attacks.

Arnold Reinhold


At 9:20 PM -0800 1/4/2001, Nick Szabo wrote:
>Anonymous wrote (responding to the idea of "perfect compression"):
>> ... Once you have specified
>> such a probability distribution, you can evaluate how well a particular
>> compression algorithm works.  But speaking of absolute compression or
>> absolute entropy is meaningless.
>
>These ideas have on a Turing machine the same meaning as the idea of
>"truly random numbers", and for the same reason.  The assumption of
>randomness used in proving that OTPs and other protocols are
>"unconditionally" secure is very similar to the assumption that a string
>is "perfectly compressed".  The problem is that determining the absolute
>entropy of a string, as well as the equivalent problem of determining
>whether it is "real random", is both uncomputable and language-dependent.=
=A0
>
>Empirically, it seems likely that generating truly random numbers is much
>more practical than perfect compression.  If one has access to certain
>well-observed physical phenomena, one can make highly confident, if
>still mathematically unproven, assumptions of "true randomness", but
>said phenomena don't help with perfect compression.=A0
>
>If we restrict ourselves to Turing machines, we can do something *close*
>to perfect compression and tests of true randomness -- but not quite.
>And *very* slow.  From a better physical source there is still the problem
>that if we can't sufficiently test them, how can we be so confident
>they are random anyway?  Such assumptions are based on the extensive and
>various, but imperfect, statistical tests physicists have done (has
>anybody tried cryptanalyzing radioactive decay?  :-)
>
>We can come close to testing for true randomness and and doing perfect
>compression on a Turing machine.   For example, here is an algorithm that,
>for sufficiently long but finite number of steps t, will *probably* give yo=
u
>the perfect compression (I believe the probability converges on
>a number related to Chaitin's "Omega" halting probability as t grows,
>but don't quote me -- this would make an interesting research topic).
>
>probably_perfect_compress(data,t) {
>for all binary programs smaller than data {
>        run program until it halts or it has run for time t
>        if (output of program =3D=3D data AND
>                length(program) < length(shortest_program)) {
>                shortest_program =3D program
>        }
>}
>print "the data: ", data
>print "the (probably) perfect compression of the data", shortest_program
>return shortest_program
>}
>
>(We have to makes some reasonable assumption about what the binary
>programming language is -- see below).
>
>We can then use our probably-perfect compression algorithm as a statstical
>test of randomness as follows:
>
>probably_random_test(data,t) {
>	if length(probably_perfect_compress(data,t)) =3D length(data)
>	then print "data is probably random"
>	else print "pattern found, data is not random"
>}
>
>We can't *prove* that we've found the perfect compression.  However,
>I bet we can get a good idea of the *probability* that we've found the
>perfect compression by examining this algorithm in terms
>of the algorithmic probability of the data and Chaitin's halting
>probability.
>
>Nor is the above algorithm efficient.   Similarly, you can't prove
>that you've found truly random numbers, nor is it efficient to
>generate such numbers on a Turing machine.  (Pseudorandom
>numbers are another story, and numbers derived from non-Turing
>physical sources are another story).=A0=A0=A0=A0=A0=A0=A0=A0=A0
>
>We could generate (non-cryptographic) probably-random numbers as follows:
>
>probably_random_generate(seed,t) {
>	return probably_perfect_compress(seed,t)
>}
>
>For cryptographic applications there are two important ideas,
>one-wayness and expanding rather than contracting the seed, that
>are not captured here.  Probably_random_generate is more like the idea
>of hashing an imperfect entropy source to get the "core" entropy
>one believes exists in it.  Only probably_random_generate is far=20
>more reliable,
>as one can actually formally analyze the probability of having generated
>truly random numbers.   It is, alas, much slower than hashing. :-(
>
>Back to the theoretical point about whether there is such a thing
>as "absolute" entropy or compression.  The Kolmogorov complexity
>(the smallest program that, when run, produce the decompressed data)
>is clearly defined and fully general for Turing machines.  If we could
>determine the Kolmogorov complexity we wouldn't need to invoke any
>probability distribution to determine the absolute minimum possible
>entropy of any data to be compressed on a Turning machine.
>
>It is, alas, uncomputable.  To find the Kolmogorov complexity we could
>simply search through the space of all programs smaller than the data.
>But due to the halting problem we cannot always be certain that there
>does not exist a smaller program that, run for a sufficiently long period
>of time, will produce the decompressed data.  When we can't prove that
>there is no smaller program than the data which generates the data,
>we also can't prove that there is not a pattern hidden in the data which
>makes it less than "truly random".  The finite version of this search
>process, in the program probably_perfect_compression, circumvents
>the halting problem by arbitrarily halting programs that have already
>run for t steps.
>
>Also, since the length of the program depends on what language it's
>written in, absolute Kolmogorov complexity is good only for
>analyzing growth rates.  The choice of language adds a constant
>length to the program.  We'd have to look at probably_perfect_compression
>in this context to see if the choice of binary language is a
>reasonable one or if other languages would give better compressions
>on the data we are likely to encounter.
>
>One consequence is that OTPs themselves have a big problem when we
>assume "truly random" numbers.  This assumption is, in terms of
>the provability of security, no weaker or stronger than than
>an assumption of "perfect compression".   (Which assumption is
>more practical is a different question -- as per above, if one has
>access to certain well-observed physical phenomena, one can make
>highly confident, if still mathematically unproven, assumptions of
>"true randomness", but said phenomena don't help with perfect
>compression).
>
>For more on this topic see http://www.best.com/~szabo/kolmogorov.html and
>the references therein.



home help back first fref pref prev next nref lref last post