Sunday, February 12, 2012

Varargs Runtime Type Tokens

A common pattern for accessing generic type information in Java at runtime is to use class literals as runtime type tokens. For example, consider the following generic interface:

public interface Factory<T>
{
    T make() throws Exception;
}

We can implement this using class literals to access T at runtime even though this information is erased at compile time:

public class ClassLiteralFactory<T> implements Factory<T>
{
    private final Class<T> type;
    
    public ClassLiteralFactory(Class<T> type)
    {
        this.type = type;
    }
    
    public T make() throws Exception
    {
        return type.newInstance();
    }
}

This allows us to create a Factory for any object with a default constructor by supplying the corresponding class literal:

Factory<Date> factory = new ClassLiteralFactory<Date>(
    Date.class);
Date date = factory.make();

All rather straightforward so far, but can we implement this interface without requiring an explicit class literal? It turns out that we can, by using varargs:

public class VarArgsFactory<T> implements Factory<T>
{
    private final Class<? extends T> type;
    
    public VarArgsFactory(T... type)
    {
        this.type = (Class<? extends T>) type.getClass()
            .getComponentType();
    }
    
    public T make() throws Exception
    {
        return type.newInstance();
    }
}

The trick here is to take advantage of the code generated by the compiler when invoking a varargs method. For example, when we write:

Factory<Date> factory = new VarArgsFactory<Date>();

The compiler actually generates the following:

Factory<Date> factory = new VarArgsFactory<Date>(
    new Date[0]);
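To watch this implicit array arrive at runtime, here is a minimal sketch. The class ImplicitArrayDemo and its method tokenOf are hypothetical names introduced only for illustration; they are not part of the factory classes above.

```java
import java.util.Date;

// Hypothetical helper (not from the article): exposes the array that the
// compiler synthesizes when a varargs argument is omitted.
final class ImplicitArrayDemo
{
    static <T> Class<?> tokenOf(T... type)
    {
        // 'type' is the compiler-generated array; its component type is
        // the erasure of the type argument chosen at the call site.
        return type.getClass().getComponentType();
    }

    public static void main(String[] args)
    {
        // Compiled as tokenOf(new Date[0])
        Class<?> token = ImplicitArrayDemo.<Date>tokenOf();
        System.out.println(token.getName());
    }
}
```

No array appears in the source of the call, yet the method still receives one and can read its component type.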

This empty array then allows us to obtain a Class object to use as the runtime type token, just as the class literal did previously. The subtle difference with this technique, though, is that we can only guarantee Class<? extends T> as opposed to Class<T>, due to the covariant nature of Java arrays. For instance, we could legitimately write:

Factory<Date> factory = new VarArgsFactory<Date>(
    new Timestamp[0]);

This would happily become a factory for the Date subclass java.sql.Timestamp (if it had a default constructor). Note that ClassLiteralFactory does not suffer from this problem: Timestamp.class would be an invalid argument for the Class<Date> parameter, since parameterized types are invariant.
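Since Timestamp has no default constructor, the effect is easier to observe with a different pair of types. The sketch below assumes CharSequence and StringBuilder as stand-ins, and swaps the Class.newInstance() call for getDeclaredConstructor().newInstance(), because the former is deprecated on newer JDKs. CovariantTokenDemo is a hypothetical name.

```java
interface Factory<T>
{
    T make() throws Exception;
}

// Same shape as the VarArgsFactory above
class VarArgsFactory<T> implements Factory<T>
{
    private final Class<? extends T> type;

    @SuppressWarnings("unchecked")
    public VarArgsFactory(T... type)
    {
        this.type = (Class<? extends T>) type.getClass()
            .getComponentType();
    }

    public T make() throws Exception
    {
        // Assumption: stands in for the deprecated Class.newInstance()
        return type.getDeclaredConstructor().newInstance();
    }
}

final class CovariantTokenDemo
{
    static CharSequence makeFromSubtypeArray() throws Exception
    {
        // StringBuilder[] is a CharSequence[], so this compiles and the
        // token becomes StringBuilder rather than CharSequence.
        Factory<CharSequence> factory =
            new VarArgsFactory<CharSequence>(new StringBuilder[0]);
        return factory.make();
    }

    public static void main(String[] args) throws Exception
    {
        System.out.println(
            makeFromSubtypeArray().getClass().getName());
    }
}
```

The factory is declared for CharSequence but instantiates StringBuilder, which is why the token can only be typed as Class<? extends T>.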

So can we rationalise the unchecked cast in the constructor? Strictly speaking, all the compiler can guarantee for the class of type is Class<? extends Object[]>, since Object[] is the erasure of T[]. In this case, however, our constructor is not annotated with @SafeVarargs, so we can safely assume that T is a reifiable type; otherwise the caller would encounter an unchecked warning and type safety would no longer be guaranteed. This provides the justification to cast the class of type to Class<? extends T[]>, and hence its component type to Class<? extends T>.

Considering the case when T is non-reifiable leads us to discover some interesting benefits of this pattern over class literals. For example, if we annotate the constructor with @SafeVarargs, our factory can also support parameterized types:

Factory<ArrayList<Date>> factory =
    new VarArgsFactory<ArrayList<Date>>();
ArrayList<Date> dates = factory.make();

Here, the actual type argument ArrayList<Date> is erased to the raw type ArrayList to create the vararg, then instantiated, and essentially cast back to ArrayList<Date>. This is type safe since all generic instantiations share the same raw type. Note that allowing non-reifiable types like this means that we should revisit how the runtime type token is declared. Because the raw type is a supertype of its generic subtypes, the runtime type token conceptually becomes Class<? extends ? super T>. That is not expressible in Java, so it can only be safely declared as Class<?>. This has the unfortunate consequence that each use of the runtime type token must rationalise its own unchecked warning.
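One way this variant might look is sketched below. The class names ErasedVarArgsFactory and ParameterizedTokenDemo are hypothetical, and getDeclaredConstructor().newInstance() is again an assumption standing in for the deprecated Class.newInstance().

```java
import java.util.ArrayList;
import java.util.Date;

interface Factory<T>
{
    T make() throws Exception;
}

// Hypothetical variant of VarArgsFactory that accepts non-reifiable T
class ErasedVarArgsFactory<T> implements Factory<T>
{
    // Only the raw class is known at runtime, hence Class<?>
    private final Class<?> type;

    @SafeVarargs
    public ErasedVarArgsFactory(T... type)
    {
        this.type = type.getClass().getComponentType();
    }

    @SuppressWarnings("unchecked")
    public T make() throws Exception
    {
        // Unchecked cast: justified only while the token is the
        // erasure of T, which a type-variable T can break.
        return (T) type.getDeclaredConstructor().newInstance();
    }
}

final class ParameterizedTokenDemo
{
    static ArrayList<Date> makeDates() throws Exception
    {
        // The implicit vararg here is new ArrayList[0]
        Factory<ArrayList<Date>> factory =
            new ErasedVarArgsFactory<ArrayList<Date>>();
        return factory.make();
    }

    public static void main(String[] args) throws Exception
    {
        ArrayList<Date> dates = makeDates();
        dates.add(new Date());
        System.out.println(dates.size());
    }
}
```

The unchecked cast has moved from the constructor into make(), which is the cost of declaring the token as Class<?>.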

One caveat of allowing non-reifiable types is that they can violate type safety when type variables are used. For instance, the following method will always return an Object instance irrespective of the actual type argument specified:

public <T> T unsafeMake() throws Exception
{
    // T is a type variable here, so the implicit vararg is new Object[0]
    return new VarArgsFactory<T>().make();
}

So the following will throw a ClassCastException:

Date date = unsafeMake();
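The failure can be reproduced with the sketch below, which assumes a @SafeVarargs factory with a Class<?> token as described earlier. ErasedVarArgsFactory and UnsafeMakeDemo are hypothetical names, and getDeclaredConstructor().newInstance() stands in for the deprecated Class.newInstance().

```java
import java.util.Date;

interface Factory<T>
{
    T make() throws Exception;
}

// Hypothetical @SafeVarargs variant with a Class<?> token
class ErasedVarArgsFactory<T> implements Factory<T>
{
    private final Class<?> type;

    @SafeVarargs
    public ErasedVarArgsFactory(T... type)
    {
        this.type = type.getClass().getComponentType();
    }

    @SuppressWarnings("unchecked")
    public T make() throws Exception
    {
        return (T) type.getDeclaredConstructor().newInstance();
    }
}

final class UnsafeMakeDemo
{
    static <T> T unsafeMake() throws Exception
    {
        // T erases to Object, so the implicit vararg is new Object[0]
        // and the token is Object.class for every caller.
        return new ErasedVarArgsFactory<T>().make();
    }

    static boolean failsWithClassCastException() throws Exception
    {
        try
        {
            // The compiler inserts a cast to Date at this call site
            Date date = UnsafeMakeDemo.<Date>unsafeMake();
            return date == null;
        }
        catch (ClassCastException e)
        {
            return true;
        }
    }

    public static void main(String[] args) throws Exception
    {
        System.out.println(failsWithClassCastException());
    }
}
```

The exception surfaces at the caller's assignment rather than inside the factory, which can make the root cause hard to spot.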

Still, the benefits of supporting parameterized types may outweigh these drawbacks if they are clearly documented.

In conclusion, the varargs pattern for runtime type tokens has two advantages over class literals: it needs less boilerplate, since no class literal is required, and it can support parameterized types to a degree. It also has two disadvantages: the upper-bounded runtime type token restricts its use, and type safety can be violated when non-reifiable type support is required. Nevertheless, it's a useful trick to keep up an API designer's sleeve.