[114112] in cryptography@c2.net mail archive
Re: [tahoe-dev] Surely M$ can patent this process?
daemon@ATHENA.MIT.EDU (zooko)
Sun Jan 27 12:15:03 2008
In-Reply-To: <001801c8609f$4e287540$ea795fc0$@net>
From: zooko <zooko@zooko.com>
Date: Sun, 27 Jan 2008 09:18:50 -0700
To: tahoe-dev@allmydata.org,
theory and practice of decentralized computer networks <p2p-hackers@lists.zooko.com>,
Cryptography <cryptography@metzdowd.com>
[adding Cc: p2p-hackers and cryptography mailing lists as explained
below; Please trim your follow-ups as appropriate.]
Dear Gary Sumner:
On Jan 26, 2008, at 9:44 PM, Gary Sumner wrote:
> I was researching on the weekend and came across Tahoe...very
> exciting and can't wait to delve in and understand more in detail.
>
> I was reading over Plank's work around erasure encoding and that
> led me to Tahoe. One thing that I was really looking for was to be
> able to encrypt the data before storing it and so was very excited
> when I read your architecture doc and it says "When a file is to be
> added to the grid, it is first encrypted using a key that is
> derived from the hash of the file itself." This seems a perfectly
> logical and natural way to apply this technique. However,
> researching also led me to a patent M$ has been granted on this
> exact process:
>
> Encryption Systems and Methods for Identifying and Coalescing
> Identical Objects Encrypted with Different Keys -
> http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&co1=AND&d=PTXT&s1=6983365.PN.&OS=PN/6983365&RS=PN/6983365
>
I haven't read that patent, so I can't say whether it applies to what
allmydata.org Tahoe does or not. By default, for immutable files
(but not for mutable files or directories), Tahoe sets the encryption
key equal to the tagged hash of the file contents. (A tagged hash is
simply a hash of the data prefixed by a tag to distinguish it from
other uses of hash functions.) You don't have to use Tahoe this way,
however:
> The encryption before storing is critical for my application.
>
If, for any reason, you don't want to let your encryption key be
produced from the secure hash of the file contents, then Tahoe can
instead use a randomly-generated encryption key. The drawback of
doing it this way -- with a random encryption key -- is that you lose
the "deduplication" feature: two people who independently store the
same file contents will use twice as much space, instead of each of
them having a pointer to a single stored copy. The advantages of
doing it with a random encryption key are that you get a stronger
guarantee about the confidentiality of the contents of your files,
and it is faster as you don't need to process the whole file (in
order to generate the encryption key) before beginning to upload the
file.
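To make the trade-off concrete, here is a rough Python sketch of the
two ways of choosing a key. It is an illustration only, not Tahoe's
actual code: the tag string, the use of plain SHA-256, the 16-byte key
length, and the helper names (tagged_hash, convergent_key, random_key,
toy_encrypt, stored_copies) are all assumptions of mine, and
toy_encrypt is a toy stand-in for a real cipher.

```python
import hashlib
import secrets

# Hypothetical tag; Tahoe's real tag strings and hash construction may differ.
TAG = b"example-convergent-key-v1:"
KEY_BYTES = 16  # assumed key length for this sketch


def tagged_hash(tag: bytes, data: bytes) -> bytes:
    """Hash the data prefixed by a tag, so this use of the hash is distinct."""
    return hashlib.sha256(tag + data).digest()


def convergent_key(contents: bytes) -> bytes:
    """Derive the key from the file contents: same contents, same key."""
    return tagged_hash(TAG, contents)[:KEY_BYTES]


def random_key() -> bytes:
    """Pick a fresh key without reading the file at all."""
    return secrets.token_bytes(KEY_BYTES)


def toy_encrypt(key: bytes, contents: bytes) -> bytes:
    """Toy stand-in for a real cipher: XOR with a SHA-256 counter keystream.

    For illustration only; do not use this to protect real data.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(contents):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(c ^ s for c, s in zip(contents, stream))


def stored_copies(ciphertexts) -> int:
    """Count distinct ciphertexts, as a store that coalesces identical blobs would."""
    return len({hashlib.sha256(ct).hexdigest() for ct in ciphertexts})


if __name__ == "__main__":
    contents = b"the same file, stored independently by two people"

    # Content-derived keys: both uploads produce the same ciphertext.
    shared = [toy_encrypt(convergent_key(contents), contents) for _ in range(2)]
    print("content-derived keys:", stored_copies(shared), "stored copy")

    # Random keys: the ciphertexts differ, so nothing can be coalesced.
    separate = [toy_encrypt(random_key(), contents) for _ in range(2)]
    print("random keys:", stored_copies(separate), "stored copies")
```

Note that convergent_key has to read the entire file before any of it
can be encrypted, while random_key does not touch the file at all,
which is the speed difference described above.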
> Surely there must be prior art on this technique to refute this
> patent?
>
That's an interesting question, and I'm carbon-copying the
p2p-hackers and cryptography mailing lists to ask if anyone knows. I
learned about this technique from Jim McCoy and Doug Barnes in their
design of Mojo Nation. I don't remember whether this technique was
mentioned in Jim McCoy's personal communication of Mojo Nation to me
in the summer of 1998, but it was definitely present in the design
when I started working for Jim and Doug on Mojo Nation in 1999, and
when Mojo Nation was first announced to the world at DefCon in July
2000 [1, 2]. I don't know if Jim came up with the idea ex nihilo or
was exposed to it in the swirling soup of ideas that we lived in at
the time: cypherpunks / Electric Communities (which had many ideas
gleaned from Xanadu) / Financial Cryptography / etc.
I remember reading about the newly announced Freenet project in 2000
and being surprised at how many similarities its design had to our
unannounced Mojo Nation project. The influential Freenet paper [3]
was published in July 2000 -- two months too late to count as prior
art for that patent, which was filed in May 2000. However, that paper
was based on Ian Clarke's master's thesis, which was published in
1999. Let's see... Ah, there it is: [4]. Hm, no, it does not seem to
contain the notion that the 2000 Freenet paper would popularize as
"Content Hash Keys".
I've also just now re-read The Eternity Service (Anderson, 1996) [5],
and it, like Clarke 1999, omits details of encryption.
It's an interesting puzzle of intellectual history. The idea
certainly seems to have been "in the air", as both Mojo Nation and
Freenet were working on it before the May 2000 patent submission by
Douceur et al., but Mojo Nation and Freenet each published the idea
shortly after May 2000. According to my limited understanding of
patent law, this means that they don't count as prior art on that
patent.
Regards,
Zooko
[1] http://www.mccullagh.org/image/950-12/jim-mccoy-mojonation.html
[2] http://web.archive.org/web/20001118214000/http://www.mojonation.net/docs/technical_overview.shtml
[3] http://citeseer.ist.psu.edu/420356.html
[4] http://citeseer.ist.psu.edu/380453.html
[5] http://citeseer.ist.psu.edu/anderson96eternity.html
---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo@metzdowd.com