Jerry on Java: 2012

Thursday, August 2, 2012

A petition to release government-developed software to the OSS community

In July 2012, a petition was created to mandate that U.S. federal government-developed software be released to the open source community. I think this is a fantastic idea, and I'd like to elaborate on some of the points made in the petition (highlighting is mine):

Openness: Open Sourcing ensures basic fairness and transparency by making software and related artifacts available to the citizens who provided funding, consistent with the President’s 2009 declaration that “Information maintained by the Federal Government is a national asset.”

If you pay taxes in the United States, then you are paying for the software developed by the government; taxpayers would reap maximum value from their investment if that software were released to the open source community.

There's even a logical precedent for this; a "work of the United States government" is not entitled to copyright protection (essentially public domain). An excellent example is photos taken by the federal government, which are public domain and freely available.

Supports the Federal “Shared First” Agenda: Maximizes value to the government by significantly increasing reuse and collaborative development between federal agencies and the private sector...

I think this collaboration could result in a few interesting scenarios:

Software developed by the government (which costs taxpayers a significant sum of money) could be leveraged by private sector developers, just as software developed by the private sector and then open sourced has driven so many fantastic projects. In fact, the National Security Agency has already given their Accumulo NoSQL database to Apache.
Private sector developers could actually improve the software the government has developed! Imagine what some of those sharp Google Summer of Code developers could do for a summer project! That sounds to me like democracy for the 21st century.
Open sourcing the software could be an important bridge to opening up the wealth of data our government collects. Obviously there would be privacy and security concerns, but far more data would be released if private sector developers were contributing new APIs to open sourced government software.

What you can do

The easiest thing you can do is sign the petition and get the word out through whatever channels you prefer (Twitter, Facebook, your own blog, whatever).

Unfortunately, the petition needs 25,000 signatures by August 16 to receive a White House response; with fewer than 700 signatures as of August 2, this seems unlikely. However, this petition could just be the beginning of a movement. The more publicity it gets, the more likely the effort is to take off.


Friday, June 22, 2012

Beware of Dojo Selects, Stores, and numeric IDs

A handy feature of Dijit Select (or an extending widget such as FilteringSelect or ComboBox) is the integration with Dojo Data stores. You can populate your Select/FilteringSelect/ComboBox with any data that can be accessed through a Dojo Data store, including JSON REST requests, HTTP queries, CSV files, or even Wikipedia.

However, there is a limitation to this that has caused me problems, even though it's documented and I've run into it a couple times: Selects do not play nicely with stores that have non-string IDs (like integers). To quote from a Dojo tutorial:

dijit.form.Select possesses an important limitation: it is implemented in such a way that it does not handle non-string item identities well. Particularly, setting the current value of the widget programmatically via select.set("value", id) will not work with non-string (e.g. numeric) identities.

To drive this point home, I'll give a concrete example illustrating the problem. You can also see the example in action on OrionHub or download the source code and run it yourself.

The goal

We're going to create a pair of Selects that will display a list of ships: one backed by a store with String IDs, and one backed by a store with numeric IDs. The generated page will look like this:

The data

Our store will be a dojo/data/ItemFileReadStore, which will load JSON from a given file. The String ID JSON looks like this:

{
    "identifier": "shipId",
    "label": "shipName",
    "items": [
        { "shipId": "1", "shipName": "Constitution" },
        { "shipId": "2", "shipName": "Enterprise" },
        // You get the point...
        { "shipId": "6", "shipName": "Yorktown" }
    ]
}

And the numeric ID JSON looks like this:

{
    "identifier": "shipId",
    "label": "shipName",
    "items": [
        { "shipId": 1, "shipName": "Constitution" },
        { "shipId": 2, "shipName": "Enterprise" },
        // You get the point...
        { "shipId": 6, "shipName": "Yorktown" }
    ]
}

The HTML

The HTML is pretty uninteresting; it basically loads the JavaScript files and throws in some DIV placeholders for the widgets:

<html>
 <head>
  <link rel="stylesheet"
   href="http://ajax.googleapis.com/ajax/libs/dojo/1.7.2/dijit/themes/claro/claro.css">
  
  <script src="//ajax.googleapis.com/ajax/libs/dojo/1.7.2/dojo/dojo.js"
   data-dojo-config="async: true"></script>
  <script src="//ajax.googleapis.com/ajax/libs/dojo/1.7.2/dijit/dijit.js"
   data-dojo-config="async: true"></script>
  <script src="selectStoreExample.js"
   data-dojo-config="async: true"></script>
 </head>
 
 <body class="claro">

  
  <h1>Using String IDs</h1>
  <div>
   shipSelectString: 
   <div id="shipSelectString"></div>
   
   <div id="selectStringEnterpriseButton"></div>
   
   <div id="selectStringYorktownButton"></div>
  </div>
  
  <h1>Using Numeric IDs</h1>
  <div>
   shipSelectNumber: 
   <div id="shipSelectNumber"></div>
   
   <div id="selectNumberEnterpriseButton"></div>
   
   <div id="selectNumberYorktownButton"></div>
  </div>
 </body>
</html>

The JavaScript

Here's where all the Dijit bindings happen. For both the String and numeric ID stores, we create a Select and a couple buttons that call select.set("value", [the ID]):

require(["dojo", "dijit", "dojo/data/ItemFileReadStore", "dijit/form/Select", 
   "dijit/form/Button", "dojo/domReady!"
  ], function(dojo, dijit, ItemFileReadStore, Select, Button) {
 //****** Using Strings for IDs ******
 //Create the data store using a JSON file
 var shipStoreString = new ItemFileReadStore({
  url: "ships-string.json"
 });
 
 //Create and start the Select, using the ItemFileReadStore as its store
 var selectString = new Select({
  name: "shipSelectString",
  store: shipStoreString
  }, "shipSelectString");

 selectString.startup();
 
 //Create buttons that will use the Select widget's set() function to set the value
 new Button({
  label: "Set shipSelectString to Enterprise",
  onClick: function() {
   selectString.set("value", "2");
  }
 }, "selectStringEnterpriseButton");
 
 new Button({
  label: "Set shipSelectString to Yorktown",
  onClick: function() {
   selectString.set("value", "6");
  }
 }, "selectStringYorktownButton");
 
 //****** Using numbers for IDs ******
 //Create the data store using a JSON file
 var shipStoreNumber = new ItemFileReadStore({
  url: "ships-number.json"
 });
 
 //Create and start the Select, using the ItemFileReadStore as its store
 var selectNumber = new Select({
  name: "shipSelectNumber",
  store: shipStoreNumber
  }, "shipSelectNumber");

 selectNumber.startup();
 
 //Create buttons that will use the Select widget's set() function to set the value
 new Button({
  label: "Set shipSelectNumber to Enterprise",
  onClick: function() {
   selectNumber.set("value", 2);
  }
 }, "selectNumberEnterpriseButton");
 
 new Button({
  label: "Set shipSelectNumber to Yorktown",
  onClick: function() {
   selectNumber.set("value", 6);
  }
 }, "selectNumberYorktownButton");
});

Now if you fire up the HTML page in your browser, you'll see the screen shown above. Click the buttons, and you'll see that the Select with String IDs changes, but nothing happens to the Select with numeric IDs. This is the limitation (bug?) that prevents you from using numeric IDs in a store that's backing a Select.
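If the store's JSON is generated by a Java back end, one possible way around this is to write the numeric IDs out as JSON strings so the store only ever sees String identities. This is my own assumption about a workaround, not something the Dojo tutorial prescribes. The class name ShipJsonSketch and its methods are hypothetical, and the hand-rolled escaping only handles quotes and backslashes; a real application would use a JSON library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds ItemFileReadStore-style JSON with String IDs.
class ShipJsonSketch {

    // Writes each numeric ID as a quoted JSON string, e.g. 2 becomes "2".
    static String toStoreJson(Map<Integer, String> ships) {
        StringBuilder json = new StringBuilder();
        json.append("{\"identifier\":\"shipId\",\"label\":\"shipName\",\"items\":[");
        boolean first = true;
        for (Map.Entry<Integer, String> ship : ships.entrySet()) {
            if (!first) {
                json.append(',');
            }
            first = false;
            json.append("{\"shipId\":\"").append(ship.getKey())
                .append("\",\"shipName\":\"").append(escape(ship.getValue()))
                .append("\"}");
        }
        return json.append("]}").toString();
    }

    // Simplified escaping (backslash and double quote only); an assumption for this sketch.
    static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        Map<Integer, String> ships = new LinkedHashMap<Integer, String>();
        ships.put(1, "Constitution");
        ships.put(2, "Enterprise");
        System.out.println(toStoreJson(ships));
    }
}
```

On the JavaScript side you would then call select.set("value", "2") with a String, just like the working example.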


Wednesday, May 16, 2012

Secure Password Storage - Lots of don'ts, a few dos, and a concrete Java SE example

Note: this post frequently refers to "encrypting" passwords, a term that usually implies that they could be decrypted. We're really talking about doing a one-way hash. I used the term "encrypt" to make it more accessible to those who are less familiar with cryptography, but "hash" would have been more precise.

The importance of storing passwords securely

As software developers, one of our most important responsibilities is the protection of our users' personal information. Without technical knowledge of our applications, users have no choice but to trust that we're fulfilling this responsibility. Sadly, when it comes to passwords, the software development community has a spotty track record.

While it's impossible to build a 100% secure system, there are fortunately some simple steps we can take to make our users' passwords safe enough to send would-be hackers in search of easier prey.

If you don't want all the background, feel free to skip to the Java SE example below.

The Don'ts

First, let's quickly discuss some of the things you shouldn't do when building an application that requires authentication:

Don't store authentication data unless you really have to. This may seem like a cop-out, but before you start building a database of user credentials, consider letting someone else handle it. If you're building a public application, consider using OAuth providers such as Google or Facebook. If you're building an internal enterprise application, consider using any internal authentication services that may already exist, like a corporate LDAP or Kerberos service. Whether it's a public or internal application, your users will appreciate not needing to remember another user ID and password, and it's one less database out there for hackers to attack.
If you must store authentication data, for Gosling's sake don't store the passwords in clear text. This should be obvious, but it bears mentioning. Let's at least make the hackers break a sweat.
Don't use two-way encryption unless you really need to retrieve the clear-text password. You only need to know their clear-text password if you are using their credentials to interact with an external system on their behalf. Even then, you're better off having the user authenticate with that system directly. To be clear, you do not need to use the user's original clear-text password to perform authentication in your application. I'll go into more detail on this later, but when performing authentication, you will be applying an encryption algorithm to the password the user entered and comparing it to the encrypted password you've stored.
Don't use outdated hashing algorithms like MD5. Honestly, hashing a password with MD5 is virtually useless. Here's an MD5-hashed password: 569a70c2ccd0ac41c9d1637afe8cd932. Go to http://www.md5hacker.com/ and you can decrypt it in seconds.
Don't come up with your own encryption scheme. There are a handful of brilliant encryption experts in the world that are capable of outwitting hackers and devising a new encryption algorithm. I am not one of them, and most likely, neither are you. If a hacker gets access to your user database, they can probably get your code too. Unless you've invented the next great successor to PBKDF2 or bcrypt, they will be cackling maniacally as they quickly crack all your users' passwords and publish them on the darknet.

The Dos

Okay, enough lecturing on what not to do. Here are the things you need to focus on:

Choose a one-way encryption algorithm. As I mentioned above, once you've encrypted and stored a user's password, you never need to know the real value again. When a user attempts to authenticate, you'll just apply the same algorithm to the password they entered, and compare that to the encrypted password that you stored.
Make the encryption as slow as your application can tolerate. Any modern password encryption algorithm should allow you to provide parameters that increase the time needed to encrypt a password (e.g. in PBKDF2, by specifying the number of iterations). Why is slow good? Your users won't notice if it takes an extra 100ms to encrypt their password, but a hacker trying a brute-force attack will notice the difference as they run the algorithm billions of times.
Pick a well-known algorithm. The National Institute of Standards and Technology (NIST) recommends PBKDF2 for passwords. bcrypt is a popular and established alternative, and scrypt is a relatively new algorithm that has been well-received. All these are popular for a reason: they're good.
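To get a feel for "as slow as your application can tolerate," here is a rough sketch that doubles the PBKDF2 iteration count until a single hash takes about as long as a target you choose. The class name IterationCalibrator, the starting count of 1,000, and the 10,000,000 cap are my own assumptions for illustration, and a single timing per step is noisy, so treat the result as a ballpark figure for your hardware rather than a recommendation.

```java
import java.security.GeneralSecurityException;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical helper for picking an iteration count on the machine it runs on.
class IterationCalibrator {

    // Times one PBKDF2 derivation (160-bit output) and returns elapsed milliseconds.
    static long millisFor(int iterations) {
        char[] password = "benchmark-password".toCharArray();
        byte[] salt = {1, 2, 3, 4, 5, 6, 7, 8}; // fixed salt, acceptable only for benchmarking
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
            long start = System.nanoTime();
            factory.generateSecret(new PBEKeySpec(password, salt, iterations, 160)).getEncoded();
            return (System.nanoTime() - start) / 1000000L;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2WithHmacSHA1 is unavailable", e);
        }
    }

    // Doubles the iteration count until one hash takes at least targetMillis.
    static int calibrate(long targetMillis) {
        millisFor(1000); // warm-up call so provider loading does not skew the first timing
        int iterations = 1000;
        while (iterations < 10000000 && millisFor(iterations) < targetMillis) {
            iterations *= 2;
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println("Iterations for roughly 100 ms: " + calibrate(100));
    }
}
```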

PBKDF2

Before I show you some concrete code, let's talk a little about why PBKDF2 is a good choice for encrypting passwords:

Recommended by the NIST. Section 5.3 of Special Publication 800-132 recommends PBKDF2 for encrypting passwords. Security officials will love that.
Adjustable key stretching to defeat brute force attacks. The basic idea of key stretching is that after you apply your hashing algorithm to the password, you then continue to apply the same algorithm to the result many times (the iteration count). If hackers are trying to crack your passwords, this greatly increases the time it takes to try the billions of possible passwords. As mentioned previously, the slower, the better. PBKDF2 lets you specify the number of iterations to apply, allowing you to make it as slow as you like.
A required salt to defeat rainbow table attacks and prevent collisions with other users. A salt is a randomly generated sequence of bits that is unique to each user and is added to the user's password as part of the hashing. This prevents rainbow table attacks by making a precomputed list of results unfeasible. And since each user gets their own salt, even if two users have the same password, the encrypted values will be different. There is a lot of conflicting information out there on whether the salts should be stored someplace separate from the encrypted passwords. Since the key stretching in PBKDF2 already protects us from brute-force attacks, I feel it is unnecessary to try to hide the salt. Section 3.1 of NIST SP 800-132 also defines salt as a "non-secret binary value," so that's what I go with.
Part of Java SE 6. No additional libraries necessary. This is particularly attractive to those working in environments with restrictive open-source policies.
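The effect of the salt is easy to see in a few lines. The sketch below hashes the same password with two different salts and then twice with the same salt; the class name SaltDemo and the sample password are made up for illustration, and the iteration count and key length simply mirror the values used later in this post.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical demo class showing how the salt changes the stored hash.
class SaltDemo {

    // PBKDF2 with HMAC-SHA1, 20,000 iterations, 160-bit output.
    static byte[] hash(String password, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password.toCharArray(), salt, 20000, 160);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1")
                    .generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2WithHmacSHA1 is unavailable", e);
        }
    }

    // Random 8-byte salt from SecureRandom.
    static byte[] newSalt() {
        byte[] salt = new byte[8];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    public static void main(String[] args) {
        byte[] aliceSalt = newSalt();
        byte[] bobSalt = newSalt();

        // Same password, different salts: the hashes should differ
        // (barring the astronomically unlikely case of identical random salts).
        System.out.println("Different salts match? "
                + Arrays.equals(hash("hunter2", aliceSalt), hash("hunter2", bobSalt)));

        // Same password, same salt: the hash is reproducible, which is what
        // makes authentication possible.
        System.out.println("Same salt matches? "
                + Arrays.equals(hash("hunter2", aliceSalt), hash("hunter2", aliceSalt)));
    }
}
```

An attacker with a precomputed table for "hunter2" would need a separate table for every possible salt, which is the whole point.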

Finally, a concrete example

Okay, here's some code to encrypt passwords using PBKDF2. Only Java SE 6 is required.

import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.security.spec.InvalidKeySpecException;
import java.security.spec.KeySpec;
import java.util.Arrays;

import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class PasswordEncryptionService {

 public boolean authenticate(String attemptedPassword, byte[] encryptedPassword, byte[] salt)
   throws NoSuchAlgorithmException, InvalidKeySpecException {
  // Encrypt the clear-text password using the same salt that was used to
  // encrypt the original password
  byte[] encryptedAttemptedPassword = getEncryptedPassword(attemptedPassword, salt);

  // Authentication succeeds if encrypted password that the user entered
  // is equal to the stored hash
  return Arrays.equals(encryptedPassword, encryptedAttemptedPassword);
 }

 public byte[] getEncryptedPassword(String password, byte[] salt)
   throws NoSuchAlgorithmException, InvalidKeySpecException {
  // PBKDF2 with SHA-1 as the hashing algorithm. Note that the NIST
  // specifically names SHA-1 as an acceptable hashing algorithm for PBKDF2
  String algorithm = "PBKDF2WithHmacSHA1";
  // SHA-1 generates 160 bit hashes, so that's what makes sense here
  int derivedKeyLength = 160;
  // Pick an iteration count that works for you. The NIST recommends at
  // least 1,000 iterations:
  // http://csrc.nist.gov/publications/nistpubs/800-132/nist-sp800-132.pdf
  // iOS 4.x reportedly uses 10,000:
  // http://blog.crackpassword.com/2010/09/smartphone-forensics-cracking-blackberry-backup-passwords/
  int iterations = 20000;

  KeySpec spec = new PBEKeySpec(password.toCharArray(), salt, iterations, derivedKeyLength);

  SecretKeyFactory f = SecretKeyFactory.getInstance(algorithm);

  return f.generateSecret(spec).getEncoded();
 }

 public byte[] generateSalt() throws NoSuchAlgorithmException {
  // VERY important to use SecureRandom instead of just Random
  SecureRandom random = SecureRandom.getInstance("SHA1PRNG");

  // Generate an 8 byte (64 bit) salt as recommended by RSA PKCS5
  byte[] salt = new byte[8];
  random.nextBytes(salt);

  return salt;
 }
}

The flow goes something like this:

When adding a new user, call generateSalt(), then getEncryptedPassword(), and store both the encrypted password and the salt. Do not store the clear-text password. Don't worry about keeping the salt in a separate table or location from the encrypted password; as discussed above, the salt is non-secret.
When authenticating a user, retrieve the previously encrypted password and salt from the database, then send those and the clear-text password they entered to authenticate(). If it returns true, authentication succeeded.
When a user changes their password, it's safe to reuse their old salt; you can just call getEncryptedPassword() with the old salt.
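Here is a minimal, self-contained sketch of that flow, with a HashMap standing in for the database. Everything named here (InMemoryUserStore, Credentials, register) is hypothetical and exists only for illustration; the hashing parameters mirror the service above, and a real application would persist the salt and hash in its user table instead of in memory.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical in-memory stand-in for a user table.
class InMemoryUserStore {

    // One "row": the salt and the hashed password. No clear-text password is kept.
    private static final class Credentials {
        final byte[] salt;
        final byte[] hash;

        Credentials(byte[] salt, byte[] hash) {
            this.salt = salt;
            this.hash = hash;
        }
    }

    private final Map<String, Credentials> users = new HashMap<String, Credentials>();
    private final SecureRandom random = new SecureRandom();

    // Adding a user: generate a salt, hash the password, store both.
    void register(String userId, String password) {
        byte[] salt = new byte[8];
        random.nextBytes(salt);
        users.put(userId, new Credentials(salt, hash(password, salt)));
    }

    // Authenticating: re-hash the attempt with the stored salt and compare.
    boolean authenticate(String userId, String attemptedPassword) {
        Credentials stored = users.get(userId);
        if (stored == null) {
            return false; // unknown user
        }
        return Arrays.equals(stored.hash, hash(attemptedPassword, stored.salt));
    }

    private static byte[] hash(String password, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password.toCharArray(), salt, 20000, 160);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1")
                    .generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2WithHmacSHA1 is unavailable", e);
        }
    }

    public static void main(String[] args) {
        InMemoryUserStore store = new InMemoryUserStore();
        store.register("kirk", "enterprise");
        System.out.println("Correct password: " + store.authenticate("kirk", "enterprise"));
        System.out.println("Wrong password:   " + store.authenticate("kirk", "excelsior"));
    }
}
```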

Easy enough, right? If you're building or maintaining an application that violates any of the "don'ts" above, then please do your users a favor and use something like PBKDF2 or bcrypt. Help them, Obi-Wan Developer, you're their only hope.

References

NIST: Special Publication 800-132
Wikipedia: PBKDF2
Wikipedia: Key Stretching
Wikipedia: Salt (cryptography)
Wikipedia: Rainbow Table Attacks
Vladimir Katalov: Smartphone Forensics: Cracking BlackBerry Backup Passwords

Monday, January 16, 2012

JSF and the "immediate" Attribute - Command Components

The immediate attribute in JSF is commonly misunderstood. If you don't believe me, check out Stack Overflow. Part of the confusion is likely due to immediate being available on both input (e.g. <h:inputText />) and command (e.g. <h:commandButton />) components, each of which affects the JSF lifecycle differently.

Here is the standard JSF lifecycle:

For the purposes of this article, I'll assume you are familiar with the basics of the JSF lifecycle. If you need an introduction or a memory refresher, check out the Java EE 6 Tutorial - The Lifecycle of a JavaServer Faces Application.

Note: the code examples in this article are for JSF 2 (Java EE 6), but the principles are the same for JSF 1.2 (Java EE 5).

immediate=true on Command components

In the standard JSF lifecycle, the action attribute on a Command component is evaluated in the Invoke Application phase. For example, say we have a User entity/bean:

public class User implements Serializable {

 @NotBlank
 @Length(max = 50)
 private String firstName;

 @NotBlank
 @Length(max = 50)
 private String lastName;

 /* Snip constructors, getters/setters, a nice toString() method, etc */
}

And a UserManager to serve as our managed bean:

@SessionScoped
@ManagedBean
public class UserManager {
 private User newUser;

 /* Snip some general page logic... */

 public String addUser() {
  //Snip logic to persist newUser

  FacesContext.getCurrentInstance().addMessage(null,
    new FacesMessage("User " + newUser.toString() + " added"));

  return "/home.xhtml";
 }
}

And a basic Facelets page, newUser.xhtml, to render the view:

<h:form>
 <h:panelGrid columns="2">

  <h:outputText value="First Name: " />
  <h:panelGroup>
   <h:inputText id="firstName"
    value="#{userManager.newUser.firstName}" />
   <h:message for="firstName" />
  </h:panelGroup>

  <h:outputText value="Last Name: " />
  <h:panelGroup>
   <h:inputText id="lastName" value="#{userManager.newUser.lastName}" />
   <h:message for="lastName" />
  </h:panelGroup>

 </h:panelGrid>

 <h:commandButton value="Add User" action="#{userManager.addUser()}" />
</h:form>

Which all combine to produce this lovely form:

When the user clicks on the Add User button, #{userManager.addUser} will be called in the Invoke Application phase; this makes sense, because we want the input fields to be validated, converted, and applied to newUser before it is persisted.

Now let's add a "cancel" button to the page, in case the user changes his/her mind. We'll add another <h:commandButton /> to the page:

<h:form>
 <!-- Snip Input components --> 

 <h:commandButton value="Add User" action="#{userManager.addUser()}" />
 <h:commandButton value="Cancel" action="#{userManager.cancel()}" />
</h:form>

And the cancel() method to UserManager:

public String cancel() {
 newUser = new User();

 FacesContext.getCurrentInstance().addMessage(null,
   new FacesMessage("Cancelled new user"));

 return "/home.xhtml";
}

Looks good, right? But when we actually try to use the cancel button, we get errors complaining that first and last name are required:

This is because #{userManager.cancel} isn't called until the Invoke Application phase, which occurs after the Process Validations phase. Since we didn't enter a first and last name, validation fails, the lifecycle skips straight to rendering the response, and #{userManager.cancel} is never called.

We certainly don't want to require the end user to enter a valid user before cancelling! Fortunately, JSF provides the immediate attribute on Command components. When immediate is set to true on a Command component, the action is invoked in the Apply Request Values phase:

This is perfect for our Cancel use case. If we add immediate=true to the Cancel button, #{userManager.cancel} will be called in the Apply Request Values phase, before any validation occurs.

<h:form>  
 <!-- Snip Input components -->

 <h:commandButton value="Add User" action="#{userManager.addUser()}" />
 <h:commandButton value="Cancel" action="#{userManager.cancel()}" immediate="true" />
</h:form>

So now when we click cancel, #{userManager.cancel} is called in the Apply Request Values phase, and we are directed back to the home page with the expected cancellation message; no validation errors!
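To make the ordering concrete, here is a toy model of a single postback in plain Java. It is not the JSF API and it glosses over a lot (listeners, conversion, partial processing); the class name LifecycleSketch and the "action invoked" marker are invented for illustration. It only captures the ordering described above: a normal command's action runs in Invoke Application and is skipped when validation fails, while an immediate command's action runs at the end of Apply Request Values.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the phase ordering; NOT the real JSF implementation.
class LifecycleSketch {

    // Returns the phases that run for one postback, with a marker where the action fires.
    static List<String> run(boolean immediate, boolean inputsValid) {
        List<String> log = new ArrayList<String>();
        log.add("Restore View");
        log.add("Apply Request Values");
        if (immediate) {
            // immediate=true: the action fires here, before any validation
            log.add("action invoked");
            log.add("Render Response");
            return log;
        }
        log.add("Process Validations");
        if (!inputsValid) {
            // validation failed: skip straight to rendering, action never fires
            log.add("Render Response");
            return log;
        }
        log.add("Update Model Values");
        log.add("Invoke Application");
        log.add("action invoked");
        log.add("Render Response");
        return log;
    }

    public static void main(String[] args) {
        System.out.println("Cancel, not immediate, empty form: " + run(false, false));
        System.out.println("Cancel, immediate, empty form:     " + run(true, false));
        System.out.println("Add User, valid form:              " + run(false, true));
    }
}
```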

What about Input components?

Input components have the immediate attribute as well, which also moves all their logic into the Apply Request Values phase. However, the behavior is slightly different from Command components, especially depending on whether or not the validation on the Input component succeeds. My next article will address immediate=true on Input components. For now, here's a preview of how the JSF lifecycle is affected: