[140780] in North American Network Operators' Group
Re: Had an idea - looking for a math buff to tell me if it's possible
daemon@ATHENA.MIT.EDU (Valdis.Kletnieks@vt.edu)
Wed May 18 22:22:54 2011
To: Heath Jones <hj1980@gmail.com>
In-Reply-To: Your message of "Thu, 19 May 2011 01:01:43 BST."
<BANLkTinO5exbdnRd26Cn7EsLjbUxgix2jg@mail.gmail.com>
From: Valdis.Kletnieks@vt.edu
Date: Wed, 18 May 2011 22:21:24 -0400
Cc: nanog <nanog@nanog.org>
Errors-To: nanog-bounces+nanog.discuss=bloom-picayune.mit.edu@nanog.org
--==_Exmh_1305771684_2570P
Content-Type: text/plain; charset=us-ascii
On Thu, 19 May 2011 01:01:43 BST, Heath Jones said:
> My point here is it IS possible to transfer just a hash and counter value
> and effectively generate identical data at the remote end.
Nope. Let's use phone numbers as an example. I want to send you the phone
number 540-231-6000. The hash function is "number mod 17 plus 5". So
5402316000 mod 17 plus 5 is '7'. Yes, it's a poor hash function, except it has
two nice features - it can be worked with pencil and paper or a calculator, and
it has similar output distributions to really good hash functions (math geeks
would say it's an "onto function", but not a "one-to-one" function).
http://www.regentsprep.org/Regents/math/algtrig/ATP5/OntoFunctions.htm
Go read that, and get your mind wrapped around onto and one-to-one. Almost
all good hashes are onto, and almost none are one-to-one.
OK. counter = 0. Hash that, we got 5. increment and hash, we get 6. Increment
and hash, we got 7. If we keep incrementing and hashing, we'll also get 7 for
19, 36, 53, 70, and roughly 317,783,289 other numbers before you get to my
phone number.
Now if I send you 2 and 7, how do you get that phone number back out, and be
sure you wanted *that* phone number and not 212-555-3488, which *also* ends up
with a hash of 7, so you'd send a counter of 2? Or a number in Karubadam, Tajikistan
that starts with +992 3772 but also hashes to 7?
The problem is that if the number of input values is longer than the hash output,
there *will* be collisions. The hash function above generates 17 numbers from 5 to
22 - if you try to hash an 18th number, it *has* to collide with a number
already used. Think a game of musical chairs, which is interesting only because
it's an "onto" function (every chair gets a butt mapped to it), but it's not
one-to-one (not all chairs have *exactly one* butt aimed at them).
(And defining the hash function so that it's one-to-one and every possible
input value generates a different output value doesn't work either - because at
that point, the only counter that generates the same hash as the number you're
trying to send *is that number*. So if 5552316000 generates a hash value of
8834253743, you'll hash 0, 1, 2,3, ... and only get that same hash again when you get
to the phone number. Then you send me "5552316000,8834253743" and I hash some
5,552,315,999 other numbers till I reach the phone number.. which you sent me
already as the counter value.
tl;dr: If the hash function is onto but not one-to-one, you get collisions that
you can't resolve. And if the hash function *is* one-to-one, you end up
sending a counter that's equal to the data.
--==_Exmh_1305771684_2570P
Content-Type: application/pgp-signature
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Exmh version 2.5 07/13/2001
iD8DBQFN1H6kcC3lWbTT17ARApIjAKC8Qra4V7gAEMsSOclLATZq/UUXVgCfSE+U
j0UKetv/tnvjlTOzqqzY6UM=
=4JiJ
-----END PGP SIGNATURE-----
--==_Exmh_1305771684_2570P--