[350] in WWW Security List Archive
Re: Experimental implementation of SimpleMD5
daemon@ATHENA.MIT.EDU (Phillip M. Hallam-Baker)
Wed Jan 25 20:50:24 1995
To: www-security@ns2.rutgers.edu
cc: hallam@dxal18.cern.ch
In-reply-to: Your message of "Wed, 25 Jan 1995 22:49:19 +0900."
<9501251349.AA09188@link.osf.org>
Date: Thu, 26 Jan 1995 04:49:36 +0900
From: "Phillip M. Hallam-Baker" <hallam@dxal18.cern.ch>
Reply-To: www-security@ns2.rutgers.edu
Errors-To: owner-www-security@ns2.rutgers.edu
Mez sez:
>In terms of comparing proposals, I'm afraid I understand the writeup
>of SimpleMD5 way better than the Simple Digest scheme. For example,
>the writeup of SimpleMD5 clearly states that usernames and passwords
>are propagated as before (it doesn't say what before was, but I guess
>that means ad hoc). I don't quite understand the pragmatics of Simple
>Digest, because I think Phill wrote somewhere something about
>passwords not being stored in cleartext. Could someone who understands
>this dimension of both schemes (how passwords are made available to
>user and server, and how stored), explain it clearly and completely?
The Digest scheme is conceptually similar to SimpleMD5 in this respect
except that where SimpleMD5 uses the password (eg "woof"), the Digest
scheme uses the value MD5(username '@' domain ':' password) eg
MD5 ("pluto@disney.com:woof").
The reasons for this are several :-
The overall aim is to prevent the password from being compromised. This is
especially important because user pinhead may have accounts on two machines (eg
nuclear.nsa.gov and moo.phrack.com):
1) We do not want the password lying around in cleartext on either machine.
2) Since we can't use encryption, we have to use a one way hash.
Having decided to use a one way hash, the server does
not know the password and cannot even reconstruct it! Thus the client
has to use the `processed value' of the password, not the real one.
3) We do not want the sysop of moo.phrack.com accessing nuclear.nsa.gov.
If we simply used MD5(password) as the processed value then this
would be the case. If instead we use MD5 (domain ':' password)
this is not the case.
[Thanks to Alan Schifman for this one]
4) We would like to have a facility like the UNIX salt for the password file
so that an exhaustive attack does not get easier for password files
with large numbers of users. If we use
MD5(username '@' domain ':' password) this is OK.
[I thought this one up and then Alan Schifman made the
same suggestion]
To bring the UNIX 'salt' analogy closer, think of the domain and username
being the "salt". Later on I use MD5(password, salt) as a shorthand.
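As a rough sketch of the `processed value' described above, the following Python computes MD5(username '@' domain ':' password). The function name processed_value and the extra user "goofy" are made up for illustration, and the UTF-8 encoding and hex output are assumptions, not part of the proposal.

```python
import hashlib

def processed_value(username: str, domain: str, password: str) -> str:
    """Return MD5(username '@' domain ':' password) as a hex string.

    Hypothetical helper; UTF-8 encoding and hex output are assumptions.
    """
    material = f"{username}@{domain}:{password}".encode("utf-8")
    return hashlib.md5(material).hexdigest()

# Same user and password on two machines: the stored values differ,
# so the value held by one server is not the value the other expects.
at_nsa = processed_value("pinhead", "nuclear.nsa.gov", "woof")
at_moo = processed_value("pinhead", "moo.phrack.com", "woof")

# Two users sharing a password on one machine also get different values,
# which is the salt-like effect of including the username.
pluto = processed_value("pluto", "disney.com", "woof")
goofy = processed_value("goofy", "disney.com", "woof")  # hypothetical user
```

Here at_nsa and at_moo differ even though the password is the same, which is the property points 3) and 4) are after.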
A useful side effect: the server never needs to know the password itself.
This could be supported in the HTML forms. This would prevent sysops of MOOs
etc acting as password snarfers in their spare time - it's very common to
find a system compromised because employee Mr Dweeble T. Brain has used his
work password at moo.hackers.r.us. [*]
(* I always knew there was a use for the `us' hierarchy)
On the nonce question :-
------------------------
The CERN server is forking. Under the UNIX madness there is no threads
system we can expect to work for reference code - extreme :-(. This
means that connections *have* to be idempotent. So the use of nonces becomes
a problem: how do we work out which nonces we have sent out??
1) If we allow a multiple operation HTTP session (aka S-HTTP) we get round
this; unfortunately we are no longer "simple".
2) If we have a "global nonce" that we change periodically in the main
forking loop we effectively have a time stamp. In fact we are even worse off
since the granularity is much coarser than we could achieve using a timestamp.
This being the case I would prefer to dispense with the nonce value idea and
use a timestamp :-)
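A minimal sketch of why a timestamp suits a forking server: each child can judge freshness from the request alone, with no shared record of issued nonces. The function name and the 300 second window are assumptions chosen for illustration; the post does not specify an acceptance window.

```python
import time
from typing import Optional

MAX_SKEW_SECONDS = 300  # hypothetical acceptance window, not from the proposal

def timestamp_is_fresh(client_timestamp: int,
                       now: Optional[int] = None,
                       max_skew: int = MAX_SKEW_SECONDS) -> bool:
    """Accept a request whose timestamp lies within max_skew of server time.

    No per-connection state is consulted, so any forked child can run this.
    """
    if now is None:
        now = int(time.time())
    return abs(now - client_timestamp) <= max_skew
```

Note that this sketch on its own does not detect a replay made inside the window; it only bounds how long a captured request stays usable.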
An alternative approach is to use a scheme suggested by Dave Raggett in a
different context. Here the client calculates a "session id" which is "unique"
- forget how. This has two parameters, a stream-id and a count. For a given
session id and stream-id the count must increment monotonically. The reason for
the stream-id is that it allows different windows from the same browser to
be run independently (eg different processes) while appearing as the "same"
session. This is useful for applications where one wants to put up a warning
message for the first time an access to the site is attempted but not for
further accesses.
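The counting rule can be sketched as follows. The class name CountTracker is made up, and the in-memory dictionary is an assumption made to keep the example short: a forking server would need to keep this state somewhere all its children can see.

```python
from typing import Dict, Tuple

class CountTracker:
    """Track the last count seen for each (session id, stream-id) pair.

    Hypothetical sketch; state is held in memory in a single process.
    """

    def __init__(self) -> None:
        self._last: Dict[Tuple[str, str], int] = {}

    def accept(self, session_id: str, stream_id: str, count: int) -> bool:
        """Return True only if count is greater than the last one accepted."""
        key = (session_id, stream_id)
        last = self._last.get(key)
        if last is not None and count <= last:
            return False  # repeated or stale count: reject
        self._last[key] = count
        return True
```

Two windows of one browser would use the same session id with different stream-ids, so their counts advance independently.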
Could we do without so many MD5s?
---------------------------------
The expense of an MD5 operation is pretty much linear in the number of bits,
thus the number of operations started and stopped is not really a speed issue.
Note however that several message digest functions (including MD5) have
a weakness for what is called an EXTENSION attack. That is given MD5(foo)
I might be able to calculate MD5 (foo + bar) :-(. For any given digest
an extension attack is always at least as easy as a direct forgery attack.
To foil such naughty people the checksum is constructed in a crafty way:-
checksum = MD5 (MD5(password+salt) + timestamp + MD5 (head))
If instead we used
checksum = MD5 (MD5(password+salt) + timestamp + head)
Then Ms Naughty Person might figure out an extension attack on head.
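A sketch of the first construction, under stated assumptions: '+' is taken to mean plain concatenation, digests are carried as hex strings, and the timestamp is a decimal string. None of these encodings is specified in the post, and the names md5_hex and checksum are made up here.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """MD5 of data as a 32 character hex string."""
    return hashlib.md5(data).hexdigest()

def checksum(processed_password: str, timestamp: str, head: bytes) -> str:
    """checksum = MD5(MD5(password+salt) + timestamp + MD5(head)).

    processed_password is the MD5(password, salt) shorthand value in hex.
    Concatenating hex strings is an assumption made for this sketch.
    """
    head_digest = md5_hex(head)  # fixed length whatever the size of head
    outer = processed_password + timestamp + head_digest
    return md5_hex(outer.encode("ascii"))

# Example with arbitrary values.
secret = md5_hex(b"pluto@disney.com:woof")
head = b"GET /widgets HTTP/1.1\r\n"
c_original = checksum(secret, "791066976", head)
c_extended = checksum(secret, "791066976", head + b"X-Extra: 1\r\n")
```

Because the outer MD5 only ever sees the fixed-length MD5(head), material appended to head changes that inner digest instead of extending the outer input.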
On authenticating the message itself :-
---------------------------------------
I've not done this yet because the CERN client does not have PUT and POST :-)
Two thoughts occurred to me here. First it might be useful to separate the
authentication of the body of a PUT from the authentication of the headers. This
would be much more convenient if using a stream based library since the
authentication digest for the body could appear at the end of the body and
be calculated as the message is pumped through the stream rather than force it
to be calculated in advance. I can't think how to get this working through
proxies though :-(
Secondly in many cases the digest might have been pre-calculated (eg in a
server). For this case we really do want it to be arrived at independently
so the authentication function should be something like :-
checksum = MD5 (MD5(password+salt) + timestamp +
                MD5 (head) + MD5 (BODY))
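Both thoughts can be sketched together: the body digest is accumulated while the chunks are pumped through, and only the finished digest enters the checksum, so a pre-calculated body digest could be dropped in just as well. The function names, the list used as a stand-in for the output stream, and the hex concatenation are all assumptions for illustration.

```python
import hashlib
from typing import Iterable, List

def pump_body(chunks: Iterable[bytes], sink: List[bytes]) -> str:
    """Forward each chunk to sink while updating the body digest.

    sink stands in for the outgoing stream; returns MD5(BODY) in hex.
    """
    body_md5 = hashlib.md5()
    for chunk in chunks:
        body_md5.update(chunk)   # digest grows as the data flows past
        sink.append(chunk)       # hypothetical write to the stream
    return body_md5.hexdigest()

def message_checksum(processed_password: str, timestamp: str,
                     head: bytes, body_digest: str) -> str:
    """MD5(MD5(password+salt) + timestamp + MD5(head) + MD5(BODY))."""
    head_digest = hashlib.md5(head).hexdigest()
    outer = processed_password + timestamp + head_digest + body_digest
    return hashlib.md5(outer.encode("ascii")).hexdigest()
```

The body never has to be held in memory as a whole; message_checksum can be called once the last chunk has gone out.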
On the server Authenticating itself to the client
-------------------------------------------------
Again I haven't yet done this BUT it would be very easy. The only trick
here is that the server should use the same timestamp as the client originally
sent :-)
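Since this part is not implemented, the following is only a guess at how it might look: the server computes the same kind of checksum over its response head using the client's timestamp, and the client checks both the echoed timestamp and the checksum. All names and the hex encoding are assumptions.

```python
import hashlib
import hmac

def _md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

def response_checksum(processed_password: str, request_timestamp: str,
                      response_head: bytes) -> str:
    """Server side: reuse the timestamp the client originally sent."""
    outer = processed_password + request_timestamp + _md5_hex(response_head)
    return _md5_hex(outer.encode("ascii"))

def client_accepts(processed_password: str, sent_timestamp: str,
                   echoed_timestamp: str, response_head: bytes,
                   received_checksum: str) -> bool:
    """Client side: the reply must carry our timestamp and a matching checksum."""
    if echoed_timestamp != sent_timestamp:
        return False
    expected = response_checksum(processed_password, sent_timestamp, response_head)
    # compare_digest is used only as a constant-time string comparison.
    return hmac.compare_digest(expected, received_checksum)
```

Tying the reply to the client's own timestamp means a response recorded for one request does not verify against a later one.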
On the difference between my scheme and S-HTTP
----------------------------------------------
S-HTTP uses ASN.1 BER encoding as part of the PKCS-7 encapsulation scheme.
This encoding method has alternative use as a means for frightening small
children. Judge Lance Ito has banned its broadcast on the cable hook up.
So to make everything sweetness and light vis-a-vis S-HTTP we add in a
Content-Privacy-Encoding line and give it a new value - shen.
Basically the Digest scheme is just a means of giving an encapsulation
format that is easy to describe and implement. I used the name shen because
that was the first thing that came into my head :-) Digest is NOT
appropriate because later on I might want a "simple" encryption scheme
using the same hook. Simple would be a good name EXCEPT that Eric had
"stolen" it :-( In fact that's what it was at one point in the discussions
between Alan S. and me.
Proxy Authentication
--------------------
This should be possible, in fact all it means is that we want to send an
extra header line to give the proxy authentication information. But we don't
want to encapsulate a second time so we want something like :-
GET http://splunge.com/widgets HTTP/1.1
Content-Privacy-Domain: Simple (or shen or whatever we call it)
Authenticate: blah
Proxy-Authenticate: proxy1.edu blah1
GET /widgets HTTP/1.1
Accept:
........
Ah what has happened to the second line I hear you cry? Simple, the last
proxy in the chain has to provide the authenticated message to the home
server which means that the URL should be stripped of the domain identification.
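The URL stripping done by the last proxy can be sketched like this. The function name to_origin_form is made up, and handling of the query string is an assumption that goes beyond the example given.

```python
from urllib.parse import urlsplit

def to_origin_form(request_line: str) -> str:
    """Rewrite 'GET http://host/path HTTP/1.1' as 'GET /path HTTP/1.1'.

    Request lines that already carry only a path are returned unchanged.
    """
    method, target, version = request_line.split(" ", 2)
    parts = urlsplit(target)
    if not parts.scheme:
        return request_line      # already stripped of the domain
    path = parts.path or "/"     # a bare host maps to the root path
    if parts.query:
        path += "?" + parts.query
    return f"{method} {path} {version}"
```

For the request above, to_origin_form("GET http://splunge.com/widgets HTTP/1.1") gives "GET /widgets HTTP/1.1", which is the form the home server would have been authenticated against.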
Also note that we need to think about how an HTTP/1.1 proxy requiring
authentication knows to drop the authentication.
[That's enough comments on the `Simple' scheme - ed]
Phill Hallam-Baker