Jerry on Java: Secure Password Storage - Lots of don'ts, a few dos, and a concrete Java SE example

Note: this post frequently refers to "encrypting" passwords, a term that usually implies that they could be decrypted. We're really talking about doing a one-way hash. I used the term "encrypt" to make it more accessible to those who are less familiar with cryptography, but "hash" would have been more precise.

The importance of storing passwords securely

As software developers, one of our most important responsibilities is the protection of our users' personal information. Without technical knowledge of our applications, users have no choice but to trust that we're fulfilling this responsibility. Sadly, when it comes to passwords, the software development community has a spotty track record.

While it's impossible to build a 100% secure system, there are fortunately some simple steps we can take to make our users' passwords safe enough to send would-be hackers in search of easier prey.

If you don't want all the background, feel free to skip to the Java SE example below.

The Don'ts

First, let's quickly discuss some of the things you shouldn't do when building an application that requires authentication:

Don't store authentication data unless you really have to. This may seem like a cop-out, but before you start building a database of user credentials, consider letting someone else handle it. If you're building a public application, consider using OAuth providers such as Google or Facebook. If you're building an internal enterprise application, consider using any internal authentication services that may already exist, like a corporate LDAP or Kerberos service. Whether it's a public or internal application, your users will appreciate not needing to remember another user ID and password, and it's one less database out there for hackers to attack.
If you must store authentication data, for Gosling's sake don't store the passwords in clear text. This should be obvious, but it bears mentioning. Let's at least make the hackers break a sweat.
Don't use two-way encryption unless you really need to retrieve the clear-text password. You only need to know their clear-text password if you are using their credentials to interact with an external system on their behalf. Even then, you're better off having the user authenticate with that system directly. To be clear, you do not need to use the user's original clear-text password to perform authentication in your application. I'll go into more detail on this later, but when performing authentication, you will be applying an encryption algorithm to the password the user entered and comparing it to the encrypted password you've stored.
Don't use outdated hashing algorithms like MD5. Honestly, hashing a password with MD5 is virtually useless. Here's an MD5-hashed password: 569a70c2ccd0ac41c9d1637afe8cd932. Go to http://www.md5hacker.com/ and you can decrypt it in seconds.
Don't come up with your own encryption scheme. There are a handful of brilliant encryption experts in the world that are capable of outwitting hackers and devising a new encryption algorithm. I am not one of them, and most likely, neither are you. If a hacker gets access to your user database, they can probably get your code too. Unless you've invented the next great successor to PBKDF2 or bcrypt, they will be cackling maniacally as they quickly crack all your users' passwords and publish them on the darknet.

The Dos

Okay, enough lecturing on what not to do. Here are the things you need to focus on:

Choose a one-way encryption algorithm. As I mentioned above, once you've encrypted and stored a user's password, you never need to know the real value again. When a user attempts to authenticate, you'll just apply the same algorithm to the password they entered, and compare that to the encrypted password that you stored.
Make the encryption as slow as your application can tolerate. Any modern password encryption algorithm should allow you to provide parameters that increase the time needed to encrypt a password (i.e. in PBKDF2, specifying the number of iterations). Why is slow good? Your users won't notice if it takes an extra 100ms to encrypt their password, but a hacker trying a brute-force attack will notice the difference as they run the algorithm billions of times.
Pick a well-known algorithm. The National Institute of Standards and Technology (NIST) recommends PBKDF2 for passwords. bcrypt is a popular and established alternative, and scrypt is a relatively new algorithm that has been well-received. All these are popular for a reason: they're good.

PBKDF2

Before I give show you some concrete code, let's talk a little about why PBKDF2 is a good choice for encrypting passwords:

Recommended by the NIST. Section 5.3 of Special Publication 800-132 recommends PBKDF2 for encrypting passwords. Security officials will love that.
Adjustable key stretching to defeat brute force attacks. The basic idea of key stretching is that after you apply your hashing algorithm to the password, you then continue to apply the same algorithm to the result many times (the iteration count). If hackers are trying to crack your passwords, this greatly increases the time it takes to try the billions of possible passwords. As mentioned previously, the slower, the better. PBKDF2 lets you specify the number of iterations to apply, allowing you to make it as slow as you like.
A required salt to defeat rainbow table attacks and prevent collisions with other users. A salt is a randomly generated sequence of bits that is unique to each user and is added to the user's password as part of the hashing. This prevents rainbow table attacks by making a precomputed list of results unfeasible. And since each user gets their own salt, even if two users have the same password, the encrypted values will be different. There is a lot of conflicting information out there on whether the salts should be stored someplace separate from the encrypted passwords. Since the key stretching in PBKDF2 already protects us from brute-force attacks, I feel it is unnecessary to try to hide the salt. Section 3.1 of NIST SP 800-132 also defines salt as a "non-secret binary value," so that's what I go with.
Part of Java SE 6. No additional libraries necessary. This is particularly attractive to those working in environments with restrictive open-source policies.

Finally, a concrete example

Okay, here's some code to encrypt passwords using PBKDF2. Only Java SE 6 is required.

import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.security.spec.InvalidKeySpecException;
import java.security.spec.KeySpec;
import java.util.Arrays;

import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class PasswordEncryptionService {

 public boolean authenticate(String attemptedPassword, byte[] encryptedPassword, byte[] salt)
   throws NoSuchAlgorithmException, InvalidKeySpecException {
  // Encrypt the clear-text password using the same salt that was used to
  // encrypt the original password
  byte[] encryptedAttemptedPassword = getEncryptedPassword(attemptedPassword, salt);

  // Authentication succeeds if encrypted password that the user entered
  // is equal to the stored hash
  return Arrays.equals(encryptedPassword, encryptedAttemptedPassword);
 }

 public byte[] getEncryptedPassword(String password, byte[] salt)
   throws NoSuchAlgorithmException, InvalidKeySpecException {
  // PBKDF2 with SHA-1 as the hashing algorithm. Note that the NIST
  // specifically names SHA-1 as an acceptable hashing algorithm for PBKDF2
  String algorithm = "PBKDF2WithHmacSHA1";
  // SHA-1 generates 160 bit hashes, so that's what makes sense here
  int derivedKeyLength = 160;
  // Pick an iteration count that works for you. The NIST recommends at
  // least 1,000 iterations:
  // http://csrc.nist.gov/publications/nistpubs/800-132/nist-sp800-132.pdf
  // iOS 4.x reportedly uses 10,000:
  // http://blog.crackpassword.com/2010/09/smartphone-forensics-cracking-blackberry-backup-passwords/
  int iterations = 20000;

  KeySpec spec = new PBEKeySpec(password.toCharArray(), salt, iterations, derivedKeyLength);

  SecretKeyFactory f = SecretKeyFactory.getInstance(algorithm);

  return f.generateSecret(spec).getEncoded();
 }

 public byte[] generateSalt() throws NoSuchAlgorithmException {
  // VERY important to use SecureRandom instead of just Random
  SecureRandom random = SecureRandom.getInstance("SHA1PRNG");

  // Generate a 8 byte (64 bit) salt as recommended by RSA PKCS5
  byte[] salt = new byte[8];
  random.nextBytes(salt);

  return salt;
 }
}

The flow goes something like this:

When adding a new user, call generateSalt(), then getEncryptedPassword(), and store both the encrypted password and the salt. Do not store the clear-text password. Don't worry about keeping the salt in a separate table or location from the encrypted password; as discussed above, the salt is non-secret.
When authenticating a user, retrieve the previously encrypted password and salt from the database, then send those and the clear-text password they entered to authenticate(). If it returns true, authentication succeeded.
When a user changes their password, it's safe to reuse their old salt; you can just call getEncryptedPassword() with the old salt.

Easy enough, right? If you're building or maintaining an application that violates any of the "don'ts" above, then please do your users a favor and use something like PBKDF2 or bcrypt. Help them, Obi-Wan Developer, you're their only hope.

References

NIST: Special Publication 800-132
Wikipedia: PBKDF2
Wikipedia: Key Stretching
Wikipedia: Salt (cryptography)
Wikipedia: Rainbow Table Attacks
Vladimir Katalov: Smartphone Forensics: Cracking BlackBerry Backup Passwords

21 comments:

PiyushJune 17, 2012 at 8:27 AM
Hi Jerry, I dont know MD5 is hackable or not, however they do store the password with MD5 hash which is used to search in their db.
Jovan JovanovicDecember 3, 2012 at 9:12 PM
Thank you, thank you, thank you, thank you SO MUCH! I'm searching for real HMAC-SHA1 as used in WPA2 for 2 days. I MEAN, SEARCHING, ON INTERNET, CONSTANTLY for 2 days. This is first working right!!! THANK YOU!
BonaciDecember 30, 2012 at 4:54 PM
Thank you for your article! It really is very useful and concise!
AnonymousJanuary 22, 2013 at 11:35 AM
Great article. I'm sharing it with developers as part of my campaign against md5. I had not realized that Java6 came with pbkdf2.
UnknownFebruary 15, 2013 at 7:38 PM
Hi Jerry, I love your class it works great but I'm a little frustrated because when I use it I expected a byte array of 160 bytes but I get one of like 20 bytes and I don't know what I'm doing wrong, should I convert the string to a byte array before running getEncryptedPassword? or should I append like 100 spaces after? why is my encrypted password so little?
Can you help me?

Thanks!
MikiJMarch 9, 2013 at 3:43 PM
Hi and THANKS Jerry! Complete cudos for putting together good cryptological practices in one concise article. There are plenty of usages where I do not need to be an expert and still have some basic security in the application.

Any way, I do have a question, or more a polite big request... I am not sure if you know enough about the topic, but a similar do's and don't of authentication over the network would be really appreciated. I have some ideas about how to do it, but just like you point in your article, making my own algorithm is error-prone to say the least. So the scope is something like, authenticate the user over an unencrypted connection, and after authentication, client should receive encryption key to use for the session.

Well, even if you cannot help with the network stuff, thanks again as the article is great.

Regards,

Miki.
Jin KwonMarch 26, 2013 at 3:23 AM
Great article. This is the most valuable entry that every Java developers should read.
AmolJuly 9, 2013 at 3:55 PM
How to decode password back to normal text using same algorithm? This is helpful in retrieving password when user forget password.
lord aldebaranMarch 24, 2014 at 5:59 PM
How use the class in case 1? Give me example
UnknownOctober 2, 2014 at 9:12 AM
I ran into some trouble using the code, any ideas why?
http://stackoverflow.com/questions/26149242/storing-byte-pbekeyspec-to-a-postgres-behave-differently-on-openshift
AnonymousFebruary 28, 2019 at 9:48 PM
It is an informative post.
Randy TurnerOctober 10, 2024 at 4:56 AM
This is such an inspiring read! Your insights really resonate and make me think differently. Thank you for sharing!

Wednesday, May 16, 2012

Secure Password Storage - Lots of don'ts, a few dos, and a concrete Java SE example