Earlier this week I heard about a new startup called Bitcasa which is offering “infinite” secure cloud storage for a low monthly fee. Now, I’m not particularly interested in relying on a brand-new startup for all of my off-site storage needs, but one of Bitcasa’s technical claims seems to have raised a few eyebrows on the Internet. In particular, I learned from an episode of the podcast Security Now that Bitcasa claims to use exclusively client-side encryption and also to be able to de-duplicate files server-side.

Think about that for a moment, and it may at first seem impossible. How can you de-duplicate plaintext that the server never has access to? But it’s not impossible, and I wouldn’t doubt that many ways of doing this are widely known, but I did find it to be a really interesting computer science brainteaser. Here’s the problem, in my words:

Design a service that allows users to store blocks of data and retrieve them later. The service must have the following properties:

  1. No block is stored more than once.
  2. No more than O(1) additional space is used per user who “owns” a particular block.
  3. No user is able to decrypt a block that he does not own.
  4. The service could not be compelled by any authority to decrypt the blocks it stores.

Extra Credit:

  1. No communication between users, directly or through the service, is required.
  2. The service could not be compelled by any authority to divulge which users own which blocks.

In the Security Now episode in question, the host gave a solution which satisfied 1-4, but not 5 or 6, so I have labelled them as extra credit. If you want to try solving this problem yourself, stop reading now — my solution follows.
Continue reading