Earlier this week I heard about a new startup called Bitcasa which is offering “infinite” secure cloud storage for a low monthly fee. Now, I’m not particularly interested in relying on a brand-new startup for all of my off-site storage needs, but one of Bitcasa’s technical claims seems to have raised a few eyebrows on the Internet. In particular, I learned from an episode of the podcast Security Now that Bitcasa claims to use exclusively client-side encryption and also to be able to de-duplicate files server-side.
Think about that for a moment, and it may at first seem impossible. How can you de-duplicate plaintext that the server never has access to? But it’s not impossible, and I wouldn’t doubt that many ways of doing this are widely known, but I did find it to be a really interesting computer science brainteaser. Here’s the problem, in my words:
Design a service that allows users to store blocks of data and retrieve them later. The service must have the following properties:
- No block is stored more than once.
- No more than O(1) additional space is used per user who “owns” a particular block.
- No user is able to decrypt a block that he does not own.
- The service could not be compelled by any authority to decrypt the blocks it stores.
- No communication between users, directly or through the service, is required.
- The service could not be compelled by any authority to divulge which users own which blocks.
In the Security Now episode in question, the host gave a solution which satisfied 1-4, but not 5 or 6, so I have labelled them as extra credit. If you want to try solving this problem yourself, stop reading now — my solution follows.