Storing Private Data Securely
First, let's go back in time to the Sony hack. This was especially embarrassing to Sony because they stored all their passwords in plain text. For the less technical people out there: there is a way to derive data from a password, called a hash, which cannot practically be reversed to recover the password. A web site can store that hash and later check whether a user entered their password correctly by hashing what they typed and checking whether it matches the stored hash. It is fairly easy to store password hashes instead of the passwords themselves, and it is far more secure to do so.
Admittedly, it is a little more complicated than that. In order to thwart rainbow tables (precomputed lists of hashes for likely passwords), web site owners need to store a "salt" (a large random number) for each user and then create a hash based on the salt and the password together. The bottom line, however, is that it is not that difficult to store a password securely.
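To make the salt-and-hash idea concrete, here is a minimal sketch in Python using only the standard library. The function names (`hash_password`, `verify_password`) and the iteration count are my own assumptions for illustration. I have also swapped in PBKDF2 for the generic "hash", since a deliberately slow function makes guessing passwords more expensive.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

# Assumed work factor for this sketch; tune it for your own hardware.
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, hash). A fresh random salt is made when none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)  # one random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, stored_hash: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_hash)


# At registration, store both values next to the username.
salt, stored_hash = hash_password("my old password")

# At login, only the salt and hash are needed; the password is never stored.
print(verify_password("my old password", salt, stored_hash))  # True
print(verify_password("a wrong guess", salt, stored_hash))    # False
```

Because each user gets their own salt, two users with the same password end up with different hashes, which is what makes a precomputed table useless.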
It is not as easy (if what is explained above can be called easy) to store credit card information and addresses securely, and it is likely that the majority of web site owners don't even try to do so.
So how do you, as a web site owner, keep your users' personal information personal?
Like this: Elliptic Curve Cryptography. Briefly, ECC is a public/private key scheme, meaning you use one "key" (bit of data) to encrypt information and one to decrypt information. Since you don't care who encrypts your data as long as you can decrypt it, you share the "encrypt" key with everyone while keeping the "decrypt" key private, which is where the "public/private" part of the name comes from.
The cool thing is that ECC allows you to encrypt data with very small keys. For example, web site traffic using the 'https://' prefix has typically used RSA keys between 1024 and 4096 bits long (128 to 512 bytes). Compare that to this string of characters, which is the ECC public key corresponding to one of my (old) passwords:
The above string of characters is longer than a sentence, perhaps, but nowhere near the length of a typical 'https://' key. And the private key is simply as long as... say... a password. Let me state that again: when the private key is derived directly from a passphrase, the actual data used to decrypt user info under the ECC scheme IS that user's password. It is now possible, then, for a user to decrypt their data by typing in their password upon login to the web site. Further, since a web site owner does not need to store a user's password (only its hash), not even the web site owner can decrypt private user data. The owner could then keep valuable metadata, such as login and link-clicking history, while keeping data about the user who is doing the clicking private.
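Here is a rough sketch of how a password can become an ECC key pair, and how short the resulting public key is. Everything specific in it is an assumption on my part: the curve (secp256k1, picked only because its parameters are widely published), the PBKDF2 stretching step, and the helper names. It is a pure-Python toy that needs Python 3.8 or later, it is not constant-time, and it is not meant for production use.

```python
import base64
import hashlib

# secp256k1 domain parameters: y^2 = x^3 + 7 over the field of integers mod P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its mirror image
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P       # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord line
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P


def mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def keypair_from_password(password: str, salt: bytes):
    """Stretch the password into a private scalar, then derive the public point."""
    seed = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    private = int.from_bytes(seed, "big") % (N - 1) + 1  # always in 1..N-1
    x, y = mul(private, G)
    # Compressed encoding: one byte for the parity of y, then the 32-byte x.
    public = bytes([2 + (y & 1)]) + x.to_bytes(32, "big")
    return private, public


private, public = keypair_from_password("correct horse battery staple", b"per-user-salt")
print(len(public))                         # 33 (bytes)
print(base64.b64encode(public).decode())   # 44 printable characters
```

The private number never has to be stored anywhere: it can be recomputed from the password and salt whenever the user logs in.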
Here is what to do as a website owner to store user data securely.
- Store the user's salt.
- Store the hash of the user's password+salt. This allows you to securely check whether a user entered the right password on login.
- Store the user's username in plain text. While unfortunate, you need to store the username in plain text so that you can look up that user's data in the database. It is then incumbent on the user to use a private username, say "superSquirrel", instead of "Daniel.Haskin".
- Upon registration of a user, generate and store that user's public key based on the password they provide. (Don't worry; it is usually intractably difficult to get a user's password based on their public key.)
- Use that public key to encrypt the rest of the user's data housed within the database.
- When a user logs in to the web site:
  - The web site uses the username to look up the database entry.
  - The password is verified using the salt and hash.
  - The password is used to decrypt the user data.
  - The unencrypted user data can then be used while the user is logged in, but it is never recorded anywhere by the web site.
- On log out (or the closing of the user session), the data is again encrypted and stored in the database.
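The steps above can be sketched end to end in Python. This is a toy under stated assumptions, not a vetted implementation: the curve (secp256k1), the ECIES-style encryption built from SHAKE-256 and HMAC, the in-memory `db` dictionary, and every function name are my own stand-ins. One detail the list does not spell out is handled by the hypothetical `purpose` argument: the stored login hash and the private key must be derived differently, or the stored hash would itself be the decryption key.

```python
import hashlib
import hmac
import secrets

# secp256k1 parameters (assumed curve): y^2 = x^3 + 7 mod P, base point G of order N.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P

def mul(k, point):
    """Double-and-add scalar multiplication (not constant-time)."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result

def kdf(password, salt, purpose):
    """Slow hash; `purpose` keeps the login hash and the private key distinct."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt + purpose, 100_000)

def private_key(password, salt):
    return int.from_bytes(kdf(password, salt, b"key"), "big") % (N - 1) + 1

def _keys(shared_point, length):
    """Turn the shared point into a MAC key and a one-time pad of `length` bytes."""
    stream = hashlib.shake_256(shared_point[0].to_bytes(32, "big")).digest(32 + length)
    return stream[:32], stream[32:]

def encrypt(pubkey, plaintext):
    """Needs only the public key, so the site can do this without the password."""
    k = secrets.randbelow(N - 1) + 1  # fresh one-time secret per message
    mac_key, pad = _keys(mul(k, pubkey), len(plaintext))
    body = bytes(x ^ y for x, y in zip(plaintext, pad))
    return mul(k, G), body, hmac.new(mac_key, body, "sha256").digest()

def decrypt(priv, blob):
    ephemeral, body, tag = blob
    mac_key, pad = _keys(mul(priv, ephemeral), len(body))
    if not hmac.compare_digest(tag, hmac.new(mac_key, body, "sha256").digest()):
        raise ValueError("wrong key or corrupted data")
    return bytes(x ^ y for x, y in zip(body, pad))

db = {}  # stand-in for a real database table, keyed by plain-text username

def register(username, password, profile):
    salt = secrets.token_bytes(16)
    pubkey = mul(private_key(password, salt), G)
    db[username] = {"salt": salt, "pw_hash": kdf(password, salt, b"login"),
                    "pubkey": pubkey, "profile": encrypt(pubkey, profile)}

def login(username, password):
    """Return the decrypted profile, or None if the password is wrong."""
    row = db[username]
    if not hmac.compare_digest(kdf(password, row["salt"], b"login"), row["pw_hash"]):
        return None
    return decrypt(private_key(password, row["salt"]), row["profile"])

def save(username, profile):
    """On log out: re-encrypt the (possibly changed) data with the public key."""
    db[username]["profile"] = encrypt(db[username]["pubkey"], profile)

register("superSquirrel", "correct horse", b"123 Acorn Lane")
print(login("superSquirrel", "correct horse"))  # b'123 Acorn Lane'
print(login("superSquirrel", "wrong guess"))    # None
```

Notice that `save` never touches the password. Anyone holding the database can add or overwrite encrypted data, but only someone who knows the password can read it back.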
This plan would allow any web service using it to happily comply with the NSA and hand over everything it holds about its users, because the personal user data would be encrypted with a key that the web service doesn't have: the user's password. The web service would still store metadata unencrypted and link it to the encrypted user data. That way, the web service could still use the metadata for targeted marketing and data mining. It also means the NSA can analyze the metadata, which gives them enough information to fight terrorism while still protecting the privacy of individuals.
In our example, they don't know who user "#1acd3fr3" is; however, if their software flags that user as a possible terrorist, they can ask a judge for a warrant to go after that user's identity through other means of investigation. But even if the NSA asked a web service owner who a particular user was, the owner would not be able to tell them, because the owner wouldn't have access to that information. Thus, the NSA can still do its job, and the web service can still use customer data to make more money, but sensitive user data stays protected from hackers and "bad apples".
Tools of the Trade
Any database software vendor, such as Percona or Oracle, can supply the underlying database software. As for the encryption itself, a free and open-source implementation of ECC that I've come to know and love is called seccure. Some quick scripting to complete the algorithm outlined here should get anyone seriously committed to user privacy well on their way to protecting their users' private data.