[6306] in cryptography@c2.net mail archive
Re: rate of finding collisions
daemon@ATHENA.MIT.EDU (David Wagner)
Mon Jan 3 14:27:27 2000
To: cryptography@c2.net
From: daw@cs.berkeley.edu (David Wagner)
Date: 3 Jan 2000 10:20:01 -0800
Message-ID: <84qp8h$ls4$1@blowfish.isaac.cs.berkeley.edu>
In article <87embzmioc.fsf@hedonism.demon.co.uk>,
Paul Crowley <paul@hedonism.demon.co.uk> wrote:
> iang@cs.berkeley.edu (Ian Goldberg) writes:
> > The expected number of collisions you get if you sample S items out of
> > a universe of size U (=2^N in the above case) is about (S^2)/U.
>
> I know this is a month old but I'm only now catching up on the
> newsgroup. I'd be surprised if the expected number isn't about
> (S^2)/2U where S is large but much smaller than U. Where S is small
> you have to use S(S-1)/2U. The source of error in this reckoning is
> that you treat every pair of members of S as independent, but it's
> very small when U is much bigger than S.
>
> If the expected number of collisions is x, the probability of *no*
> collisions is I think roughly e^-x.
Looks right to me.
(I suspect Ian meant to absorb this small error in the word `about'.)
> If I keep generating new members of S from U until I get a collision,
> does anyone know the expected size of S when I succeed?
Yes. \sqrt{\pi U / 2}. In general, finding k collisions is expected
to take \sqrt{k \pi U / 2} trials. The bible on this topic is
P. Flajolet and A.M. Odlyzko, ``Random mapping statistics'', EUROCRYPT '89.