Java Generics

From Wiki Notes @ WuJiewen.com, by Jiewen Wu

Revision as of 09:05, 14 March 2011

Generics are a means of expressing type constraints on the behavior of a class or method in terms of unknown types, such as "whatever the types of parameters x and y of this method are, they must be the same type," "you must provide a parameter of the same type to both of these methods," or "the return value of foo() is the same type as the parameter of bar()."

Wildcards

Wildcards — ? — are a means of expressing a type constraint in terms of an unknown type. They were not part of the original design for generics (derived from the Generic Java (GJ) project); they were added as the design process played out over the five years between the formation of JSR 14 and its final release.

The wildcard type List<?> is different from both the raw type List and the concrete type List<Object>. To say a variable x has type List<?> means that there exists some type T for which x is of type List<T>, that x is homogeneous even though we don't know what particular type its elements have. It's not that the contents can be anything, it's that we don't know what the type constraints on the contents are — but we know that there is a constraint. On the other hand, the raw type List is heterogeneous; we are not able to place any type constraints on its elements, and the concrete type List<Object> means that we explicitly know that it can contain any object.
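The distinction can be made concrete with a small sketch. The class and method names (WildcardListDemo, countNonNull, addAnything) are hypothetical and chosen only for illustration; the commented-out lines mark calls the compiler would reject.

```java
import java.util.ArrayList;
import java.util.List;

class WildcardListDemo {
    // Accepts a list of any element type; each element can be read as an Object.
    static int countNonNull(List<?> xs) {
        int n = 0;
        for (Object x : xs) {
            if (x != null) {
                n++;
            }
        }
        // xs.add("text");  // would not compile: the element type is unknown
        return n;
    }

    // List<Object> explicitly allows any object to be stored.
    static void addAnything(List<Object> xs) {
        xs.add("text");
        xs.add(42);
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<String>();
        strings.add("a");
        strings.add("b");
        System.out.println(countNonNull(strings)); // a List<String> is a List<?>

        List<Object> objects = new ArrayList<Object>();
        addAnything(objects);
        System.out.println(countNonNull(objects));
        // addAnything(strings);  // would not compile: List<String> is not a List<Object>
    }
}
```

A List<String> can be passed where a List<?> is expected, but not where a List<Object> is expected, which is the practical difference between the two types.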

One benefit of wildcards is that they allow you to write code that can operate on variables of generic types without knowing their exact type bound.

Generics

Type Erasure

When a generic type is instantiated, the compiler translates it by a technique called type erasure, a process in which the compiler removes all information related to type parameters and type arguments within a class or method. Type erasure enables Java applications that use generics to maintain binary compatibility with Java libraries and applications that were created before generics.

For instance, Box<String> is translated to type Box, the raw type — a raw type is a generic class or interface name without any type arguments. This means that you can't find out what type of Object a generic class is using at runtime. The following operations are not possible:

   public class MyClass<E> {
       public void myMethod(Object item) {
           if (item instanceof E) {  //Compiler error
               ...
           }
           E item2 = new E();   //Compiler error
           E[] iArray = new E[10]; //Compiler error
           E obj = (E)new Object(); //Unchecked cast warning
       }
   }

The operations flagged in the comments cannot be checked at runtime because the compiler removes all information about the actual type argument (represented by the type parameter E) at compile time.
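One way to observe erasure directly is to compare the runtime classes of two differently parameterized lists. This is a minimal sketch; ErasureDemo and sameRuntimeClass are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Returns true when both lists have the same runtime class.
    static boolean sameRuntimeClass(List<?> a, List<?> b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();

        // The type arguments String and Integer are not part of the runtime class.
        System.out.println(sameRuntimeClass(strings, numbers));
        System.out.println(strings.getClass().getName());
    }
}
```

Both lists report java.util.ArrayList as their class, with no trace of the type argument each was declared with.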

Type erasure exists so that new code may continue to interface with legacy code. Using a raw type for any other reason is considered bad programming practice and should be avoided whenever possible.

Generics are not covariant

Arrays are covariant; because Integer is a subtype of Number, the array type Integer[] is a subtype of Number[], and therefore an Integer[] value can be supplied wherever a value of Number[] is required. (More formally, if Number is a supertype of Integer, then Number[] is a supertype of Integer[].) On the other hand, generics are not covariant; List<Integer> is not a subtype of List<Number>, and attempting to supply a List<Integer> where a List<Number> is demanded is a type error.

It turns out there's a good reason it doesn't work that way: It would break the type safety generics were supposed to provide. Imagine you could assign a List<Integer> to a List<Number>. Then the following code would allow you to put something that wasn't an Integer into a List<Integer>:

List<Integer> li = new ArrayList<Integer>();
List<Number> ln = li; // illegal
ln.add(new Float(3.1415));
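For comparison, the same mistake with arrays compiles because arrays are covariant, and is caught only at run time. The sketch below assumes nothing beyond the standard library; CovarianceDemo and arrayStoreFails are hypothetical names.

```java
class CovarianceDemo {
    // Returns true if storing a Float through a Number[] view of an Integer[] fails at run time.
    static boolean arrayStoreFails() {
        Integer[] ints = new Integer[1];
        Number[] nums = ints;                     // legal: Integer[] is a subtype of Number[]
        try {
            nums[0] = Float.valueOf(3.1415f);     // compiles, but the array is really an Integer[]
            return false;
        } catch (ArrayStoreException e) {
            return true;                          // the JVM rejects the store
        }
    }

    public static void main(String[] args) {
        System.out.println(arrayStoreFails());
    }
}
```

With generics the equivalent assignment is rejected by the compiler, so the error surfaces before the program runs rather than as an ArrayStoreException.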

Use Wildcards

Look at this interface Box.

public interface Box<T> {
    public T get();
    public void put(T element);
}

The unbox() method uses a wildcard in its parameter type.

public void unbox(Box<?> box) {
    System.out.println(box.get());
}

unbox() can call the get() method, and it can call any of the methods inherited from Object (such as hashCode()). The only thing it cannot do is call the put() method, and this is because it cannot verify the safety of such an operation without knowing the type parameter T for this Box instance. Because box is a Box<?>, and not a raw Box, the compiler knows that there is some T that serves as a type parameter for box, but because it doesn't know what that T is, it will not let you call put() because it cannot verify that doing so will not violate the type safety constraints for Box. (Actually, you can call put() in one special case: when you pass the null literal. We may not know what type T represents, but we know that the null literal is a valid value for any reference type.)
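The rules described above can be sketched as follows. The Box interface repeats the one defined earlier; SimpleBox, UnboxDemo, describe, and clear are hypothetical names added so that the example is self-contained.

```java
interface Box<T> {
    T get();
    void put(T element);
}

// Hypothetical minimal implementation, used only for this sketch.
class SimpleBox<T> implements Box<T> {
    private T element;
    public T get() { return element; }
    public void put(T element) { this.element = element; }
}

class UnboxDemo {
    static String describe(Box<?> box) {
        Object contents = box.get();   // allowed: whatever T is, its values are Objects
        return String.valueOf(contents);
    }

    static void clear(Box<?> box) {
        box.put(null);                 // allowed: null is valid for every reference type
        // box.put("x");               // would not compile: T is unknown
    }

    public static void main(String[] args) {
        SimpleBox<String> box = new SimpleBox<String>();
        box.put("hello");
        System.out.println(describe(box));
        clear(box);
        System.out.println(describe(box));
    }
}
```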

It might be tempting to write rebox() as follows:

public void rebox(Box<?> box) {
    box.put(box.get());
}

However, the compiler rejects this code:

Rebox.java:8: put(capture#337 of ?) in Box<capture#337 of ?> cannot be applied
  to (java.lang.Object)
   box.put(box.get());
      ^
1 error


When the compiler encounters a variable with a wildcard in its type, such as the box parameter of rebox(), it knows that there must have been some T for which box is a Box<T>. It does not know what type T represents, but it can create a placeholder for that type to refer to the type that T must be. That placeholder is called the capture of that particular wildcard. In this case, the compiler has assigned the name "capture#337 of ?" to the wildcard in the type of box. Each occurrence of a wildcard in each variable declaration gets a different capture, so in the generic declaration foo(Pair<?,?> x, Pair<?,?> y), the compiler would assign a different name to the capture of each of the four wildcards because there is no relationship between any of the unknown type parameters.

In this case, because ? essentially means "? extends Object," the compiler has already concluded that the type of box.get() is Object, not "capture#337 of ?," and it can't statically verify that an Object is an acceptable value for the type identified by the placeholder "capture#337 of ?."

Generic Methods

The following implementation of rebox(), along with a generic helper method, does the trick:

public void rebox(Box<?> box) {
    reboxHelper(box);
}
private<V> void reboxHelper(Box<V> box) {
    box.put(box.get());
}


Generic methods introduce additional type parameters (placed in angle brackets before the return type), which are usually used to formulate type constraints between the parameters and/or return value of the method. In the case of reboxHelper(), however, the generic method does not use the type parameter to specify a type constraint; it allows the compiler (through type inference) to give a name to the type parameter of box's type.

When rebox() calls reboxHelper(), the compiler knows that doing so is safe because the box parameter of rebox() must be a Box<T> for some unknown T. Because the type parameter V is introduced in the method signature and not tied to any other type parameter, it can stand for any unknown type as well, so a Box<T> for some unknown T might as well be a Box<V> for some unknown V. (This is similar to alpha conversion in the lambda calculus, which allows you to rename bound variables.) Now the expression box.get() in reboxHelper() no longer has type Object; it has type V, and it is allowable to pass a V to Box<V>.put().
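A self-contained version of the capture helper might look like the sketch below. The methods are made static here only so that they can be called from main(); CountingBox is a hypothetical implementation that counts put() calls so the effect of rebox() can be observed.

```java
interface Box<T> {
    T get();
    void put(T element);
}

// Hypothetical implementation that counts put() calls.
class CountingBox<T> implements Box<T> {
    private T element;
    int puts = 0;
    public T get() { return element; }
    public void put(T element) { this.element = element; puts++; }
}

class ReboxDemo {
    // Public API: the unknown type needs no name here.
    static void rebox(Box<?> box) {
        reboxHelper(box);              // the wildcard is captured and inferred as V
    }

    // Private capture helper: inside this method the unknown type is called V.
    private static <V> void reboxHelper(Box<V> box) {
        box.put(box.get());            // get() returns V, and put() accepts V
    }

    public static void main(String[] args) {
        CountingBox<String> box = new CountingBox<String>();
        box.put("hello");
        rebox(box);
        System.out.println(box.get() + " " + box.puts); // prints "hello 2"
    }
}
```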

We could have declared rebox() as a generic method in the first place, like reboxHelper(), but that is considered bad API design style. The governing design principle here is "don't give something a name if you're never going to refer to it by name." In the case of generic methods, if a type parameter appears only once in the method signature, then it probably should be a wildcard rather than a named type parameter. Because the name can always be resurrected with a private capture helper if needed, this approach gives you the opportunity to keep APIs clean without throwing useful information away.
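As a rough illustration of this guideline, the sketch below contrasts a method whose type parameter would appear only once with one where the name relates two parameters. SimpleBox, NamingDemo, isEmpty, and transfer are hypothetical names, not part of the Box API discussed here.

```java
interface Box<T> {
    T get();
    void put(T element);
}

// Hypothetical minimal implementation, used only for this sketch.
class SimpleBox<T> implements Box<T> {
    private T element;
    public T get() { return element; }
    public void put(T element) { this.element = element; }
}

class NamingDemo {
    // A type parameter would appear only once, so a wildcard is enough.
    static boolean isEmpty(Box<?> box) {
        return box.get() == null;
    }

    // T appears twice and ties the two parameters together, so it needs a name.
    static <T> void transfer(Box<T> from, Box<T> to) {
        to.put(from.get());
    }

    public static void main(String[] args) {
        SimpleBox<String> a = new SimpleBox<String>();
        SimpleBox<String> b = new SimpleBox<String>();
        a.put("x");
        System.out.println(isEmpty(b));
        transfer(a, b);
        System.out.println(b.get());
    }
}
```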

Type Inference

The compiler tries to infer the most specific type it can for the type parameters when resolving a call to a generic method. For example, if an Integer is passed to a generic method whose single parameter has type T, the compiler could infer that T is Integer, Number, Serializable, or Object, but it chooses Integer because that is the most specific type that fits the constraints.
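A minimal sketch of this behavior, assuming a hypothetical generic method identity() that returns its argument unchanged:

```java
class InferenceDemo {
    // Hypothetical generic method used only to illustrate inference.
    static <T> T identity(T arg) {
        return arg;
    }

    public static void main(String[] args) {
        Integer i = 3;

        // T is inferred as Integer, so the result can be assigned to an Integer.
        Integer same = identity(i);

        // A less specific choice is still possible, but must be written explicitly.
        Number wider = InferenceDemo.<Number>identity(i);

        System.out.println(same + " " + wider); // prints "3 3"
    }
}
```

If the compiler had inferred Number or Object for T, the first assignment would not compile without a cast.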

You can use type inference to reduce some of the redundancy when constructing generic instances. For example, using our Box class, creating a Box<String> requires you to specify the type parameter String twice:

Box<String> box = new BoxImpl<String>();

This violation of the DRY (Don't Repeat Yourself) principle can be irksome. However, if the implementation class BoxImpl provides a generic factory method, you can reduce this redundancy in client code:

A generic factory method that allows you to avoid redundantly specifying type parameters

public class BoxImpl<T> implements Box<T> {
   public static<V> Box<V> make() {
       return new BoxImpl<V>();
   }
}


If you instantiate a Box using the BoxImpl.make() factory, you need only specify the type parameter once:

Box<String> myBox = BoxImpl.make();


The generic make() method returns a Box<V> for some type V, and the return value is being used in a context that requires a Box<String>. The compiler determines that String is the most specific type that V could take on that satisfies the type constraints. You still have the option of manually specifying the value of V as follows:

Box<String> myBox = BoxImpl.<String>make();
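Putting the pieces together, a runnable sketch might look like the following. The element field and the bodies of get() and put() are assumptions, since the listing of BoxImpl shows only the factory method; FactoryDemo is a hypothetical driver class, and the types are declared without the public modifier so that they fit in one source file.

```java
interface Box<T> {
    T get();
    void put(T element);
}

class BoxImpl<T> implements Box<T> {
    private T element;  // assumed storage; not shown in the original listing

    public T get() { return element; }
    public void put(T element) { this.element = element; }

    // Generic factory method: V is inferred from the context of the call.
    public static <V> Box<V> make() {
        return new BoxImpl<V>();
    }
}

class FactoryDemo {
    public static void main(String[] args) {
        Box<String> inferred = BoxImpl.make();            // V inferred as String
        Box<String> explicit = BoxImpl.<String>make();    // V stated explicitly

        inferred.put("hello");
        explicit.put("world");
        System.out.println(inferred.get() + " " + explicit.get()); // prints "hello world"
    }
}
```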

