Far too often, you hear about incidents of major password leaks, after which users are asked to change their passwords. But why does this happen so often? And is it possible, as a developer, to store passwords securely?
In case you don't like to read, there's a video available on KingOfDog International as well as KingOfDog which explains this very topic.
In this article, we will refer to an imaginary service with some kind of user system. In the background, the user data are stored in a database which is not strictly specified.
We take our database and want to assign the passwords to the users. Accordingly, we create a new entry with ID, username, email address, and (of course) the password.
We're leaving the password in its purest form, the way God intended it to be: as plain and simple text.
Our imaginary service is attacked by an imaginary hacker. In this article, we don't look at security issues like SQL injection or at attacks on the client side. Instead, we focus only on an attack directed at our server and its database.
Coincidentally, the hacker knows our pet's name which we chose as the database password. In no time, they can steal all entries from the database, including the users' passwords. Maybe even financial data such as credit card information.
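To make the problem concrete, here is a minimal sketch of such a plain-text user table, using an in-memory SQLite database. The table layout, the user "alice", and her password are assumptions for illustration only; the article does not prescribe any particular database.

```python
import sqlite3

# Hypothetical schema for the imaginary service (names are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, username TEXT, email TEXT, password TEXT)"
)

# The password is stored exactly as the user typed it.
conn.execute(
    "INSERT INTO users (username, email, password) VALUES (?, ?, ?)",
    ("alice", "alice@example.com", "123456"),
)
conn.commit()

# Anyone with access to the database can read every password verbatim.
leaked = conn.execute("SELECT email, password FROM users").fetchall()
print(leaked)
```

Nothing stands between database access and the passwords themselves: a single query returns them all.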
It is obvious that this way of storing passwords is completely insecure and should never ever be used by anyone. Also — it would be smart for a service provider to choose a better password.
By the way, there are still plenty of websites and services out there that store passwords in such a fashion, or rather: give them away. You can recognize them when they are so kind as to send you your password in case you've forgotten it! Well, run as soon as you see your password fluttering into your inbox. Also, it would be nice to inform the administrator about this gaping security issue.
Hashes to the rescue
What can one do to prevent such a scenario? “Hash functions” are coming to the rescue! Examples of those are “MD5”, “SHA-256”, or “SHA-512”, just to name a few. Hash functions generate a fixed-length bit sequence that is (statistically) unique for any sort of input. Often, you see those hashes represented as hexadecimal numbers. The exact same input value always produces the same output hash. However, it is practically impossible to “decrypt” a given hash without brute-forcing possible inputs; hash functions are designed to be irreversible.
I tend to incorrectly refer to the process of hashing a string as encryption. Encryption always has the goal that the original plaintext can be recovered later, at least as long as you possess the key. Hash functions, on the other hand, are not meant to be reversed.
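The properties of a hash function described above are easy to try out. The following is a small sketch using Python's standard `hashlib` module; the helper name `sha512_hex` and the sample inputs are made up for this example.

```python
import hashlib

def sha512_hex(text: str) -> str:
    """Return the SHA-512 hash of text as a hexadecimal string."""
    return hashlib.sha512(text.encode("utf-8")).hexdigest()

first = sha512_hex("correct horse")
second = sha512_hex("correct horse")
other = sha512_hex("correct horsf")  # differs in a single character

print(first == second)  # the same input always gives the same hash
print(first == other)   # a tiny change gives a completely different hash
print(len(first))       # 128 hex characters, i.e. 512 bits

# SHA-256 produces a shorter, 64-character hex string.
print(len(hashlib.sha256(b"correct horse").hexdigest()))
```

Note that there is no function for going back from `first` to "correct horse"; the module only offers the forward direction.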
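First, we pick a hash function. Nowadays you should at least take something like SHA-512, as it spits out rather long hash values and is thereby quite tedious to crack. (For passwords specifically, deliberately slow functions such as bcrypt, scrypt, or Argon2 are the sturdier choice, but the principle stays the same.)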
Using this hash function, we hash the password the user provided during registration. It is essential that only the hashed password is stored in the database while the original input is discarded from memory.
As soon as they want to log in, we compare the hash of the entered password with the password hash stored in the database. If the two match, the user has submitted the correct password.
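Registration and login with a plain (unsalted) hash could be sketched roughly like this. The dictionary `users` stands in for the database, and the function names are hypothetical; the comparison uses `hmac.compare_digest` from the standard library, which compares strings in constant time.

```python
import hashlib
import hmac

users = {}  # stand-in for the database: username -> password hash

def hash_password(password: str) -> str:
    """Hash the password with SHA-512 (no salt yet)."""
    return hashlib.sha512(password.encode("utf-8")).hexdigest()

def register(username: str, password: str) -> None:
    # Only the hash is stored; the plain password is never written anywhere.
    users[username] = hash_password(password)

def login(username: str, password: str) -> bool:
    stored = users.get(username)
    if stored is None:
        return False
    # Hash the entered password and compare it with the stored hash.
    return hmac.compare_digest(stored, hash_password(password))

register("alice", "123456")
print(login("alice", "123456"))  # correct password
print(login("alice", "654321"))  # wrong password
```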
Now, the same hacker turns up again and, as we still haven’t changed the database password, they can download all entries again. It would appear that the intruder now stands no chance of reading the passwords.
Unfortunately, there are tables full of hash strings and their corresponding input values.
There are plenty of web interfaces to easily search these tables. You just enter any hash and if it exists in the table, the service returns the password (or whatever the original value was).
These so-called rainbow tables are especially widespread for MD5, which makes it an extremely poor choice for securing passwords. However, there are plenty of rainbow tables for other hash functions as well, for instance SHA-256 or SHA-512. Most at risk are the (sadly) very common passwords such as “123456”, “password”, or “qwerty”. These pose no challenge to crack, no matter how long the hash strings are.
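The idea behind such a lookup can be shown in a few lines. Note that this sketch uses a plain dictionary of precomputed hashes, not an actual rainbow table (real rainbow tables use hash chains to save space), and the three-entry password list is just the examples from the text.

```python
import hashlib

def sha512_hex(text: str) -> str:
    return hashlib.sha512(text.encode("utf-8")).hexdigest()

# The attacker computes the hashes of common passwords once, ahead of time.
common_passwords = ["123456", "password", "qwerty"]
lookup = {sha512_hex(p): p for p in common_passwords}

# A hash found in the leaked database (simulated here).
stolen_hash = sha512_hex("qwerty")
print(lookup.get(stolen_hash))  # the original password is found instantly

# A hash of something that is not in the table cannot be resolved this way.
unknown_hash = sha512_hex("a long and unusual passphrase")
print(lookup.get(unknown_hash))
```

The length of the hash makes no difference here: finding an entry is a single dictionary lookup.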
Adding the salt to the soup
The solution for secure password storage is called “salting”. Salting in computer science is only distantly related to the popular mineral found in every kitchen. Still, similar to sodium chloride, you do add something to something else.
However, it's not a delicious meal you're adding salt to but the user's password. Before applying the hash function, you append (or prepend for that matter) something to the raw password. That can be the corresponding email address or a randomly generated string.
The security of salting depends mainly on the salt being long and unpredictable enough that precomputing tables for it is not practical. That's why using the email address as salt is not recommended: email addresses vary wildly in length and are easy for an attacker to guess.
Adding this string (= salt) makes every password, even those frequently used passwords (“123456” etc.), unique, as the salt is generated on a per-user basis. In turn, the password hash is also unique for every user.
At this point, even rainbow tables can't help anymore.
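A quick sketch shows the effect: two users pick the same weak password, yet their stored hashes differ. The randomly generated salts from `secrets.token_hex` and the helper name `hash_with_salt` are assumptions chosen for this example.

```python
import hashlib
import secrets

def hash_with_salt(password: str, salt: str) -> str:
    """Append the salt to the password, then hash the result with SHA-512."""
    return hashlib.sha512((password + salt).encode("utf-8")).hexdigest()

# Each user gets their own random salt (32 hex characters here).
salt_alice = secrets.token_hex(16)
salt_bob = secrets.token_hex(16)

# Both users chose the same weak password.
hash_alice = hash_with_salt("123456", salt_alice)
hash_bob = hash_with_salt("123456", salt_bob)
unsalted = hashlib.sha512(b"123456").hexdigest()

print(hash_alice == hash_bob)  # False: the salts make the hashes differ
print(hash_alice == unsalted)  # False: a table built for "123456" no longer matches
```

The salt itself does not need to be secret; it is stored next to the hash so the server can repeat the calculation at login.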
In practice, a registration using salts would look something like this:
- The user signs up with their password and email.
- This input is sent to the server.
- The server now appends the email address to the password.
- Using a hash function, let's say SHA-512, this string is now hashed.
- The server stores the hashed password, username and email in its database.
Accordingly, the login would look like this:
- The user enters their email and password as always.
- The entered email address is appended to the entered password, exactly as during registration, and this construct is again hashed with the very same hash function.
- The server queries the stored password hash from the database.
- In the last step, the two password hashes are compared with each other. If they are identical, the user is logged in.
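The two step lists above can be sketched as follows. This follows the steps literally and uses the email address as the salt; the `database` dictionary, the function names, and the sample users are hypothetical. If you prefer a random salt, you would generate one at registration and store it alongside the hash instead.

```python
import hashlib
import hmac

database = {}  # stand-in for the database: email -> user record

def salted_hash(password: str, email: str) -> str:
    """Append the email (the salt) to the password and hash with SHA-512."""
    return hashlib.sha512((password + email).encode("utf-8")).hexdigest()

def register(username: str, email: str, password: str) -> None:
    # Store the hashed password, username and email; never the raw password.
    database[email] = {
        "username": username,
        "password_hash": salted_hash(password, email),
    }

def login(email: str, password: str) -> bool:
    record = database.get(email)
    if record is None:
        return False
    # Same order as at registration: password first, then the email.
    candidate = salted_hash(password, email)
    return hmac.compare_digest(record["password_hash"], candidate)

register("alice", "alice@example.com", "123456")
register("bob", "bob@example.com", "123456")

print(login("alice@example.com", "123456"))  # correct password
print(login("alice@example.com", "qwerty"))  # wrong password
```

The order of concatenation has to be identical at registration and login; otherwise the hashes never match, even for the correct password.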
Looking to the future: Quantum Computing
At the moment, cracking passwords stored this way is impractical in most cases, and all services dealing with passwords or sensitive user data should use some form of this “encryption”.
However, the growing capabilities of quantum computing could become dangerous for password salting, because sufficiently powerful quantum computers could, in theory, search for the input of a hash considerably faster than any regular system. That may require us to come up with new ways of storing passwords in the future, or to throw them away completely in favor of something better.
Don't store any passwords in plaintext. Attach a long string to the password, which is unique for every user. Hash the result with a secure hash function.
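As a closing sketch, these three rules can be combined using only Python's standard library. Note that this swaps the single SHA-512 pass described in the article for PBKDF2 (`hashlib.pbkdf2_hmac`), which applies SHA-512 many times to slow down brute-force attempts. The iteration count and the record layout are assumptions for illustration, not recommendations.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor; adjust for your own hardware

def create_record(password: str) -> dict:
    """Create what would be stored in the database for one user."""
    salt = secrets.token_bytes(16)  # random and unique per user
    digest = hashlib.pbkdf2_hmac(
        "sha512", password.encode("utf-8"), salt, ITERATIONS
    )
    # Only the salt and the hash are kept, never the password itself.
    return {"salt": salt.hex(), "hash": digest.hex()}

def verify(password: str, record: dict) -> bool:
    """Recompute the hash with the stored salt and compare."""
    salt = bytes.fromhex(record["salt"])
    digest = hashlib.pbkdf2_hmac(
        "sha512", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(digest.hex(), record["hash"])

record = create_record("123456")
print(verify("123456", record))  # correct password
print(verify("654321", record))  # wrong password
```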