Issue
I know that when a Java compiler compiles a generic type, it performs Type erasure and removes all references to the generic from the code and my ArrayList<Cheesecake>
becomes just an ArrayList
.
The question that I don’t have a clear answer for is, whether this lack of a type (and therefore mandatory Type-casting) causes slow down. In order words, if I am using the standard Java compiler and the standard JVM 1.7 from Oracle:
- Does the bytecode include a type cast?
- If yes, does it include a run-time check to check if its the correct type?
- Does the conversion itself take a non-trivial amount of time?
- Would making my own
CheesecakeList
class that looks identical toArrayList
, just with all of theObject
s turned toCheesecake
s would I gain anything (again just a hypothetical and I am not interested in any side effects of having a larger binary because of more classes nor the duplication in my code).
I am not interested in any other problems that type erasure causes, just a simple answer from someone who understand the JVM better than I do.
I am most interested in the cost in this example:
List<Cheesecake> list = new ArrayList<Cheesecake>();
for (int i = 0; i < list.size(); i++)
{
// I know this gets converted to ((Cheesecake)list.get(i)).at(me)
list.get(i).eat(me);
}
Is that cast inside the loop expensive and/or significant as compared to:
CheesecakeList list = new CheesecakeList();
for (int i = 0; i < list.size(); i++)
{
//where the return of list.get() is a Cheesecake, therefore there is no casting.
list.get(i).eat(me);
}
DISCLAIMER: This is mostly a question of academic curiosity. I really doubt that there is any significant performance issues with the type cast and removing the need for them wouldn’t be even one of the top 5 things I would do if I found a performance bug in my code. If you are reading this because you DO have a performance issue do yourself a favor and start up a profiler and find out where the bottle neck is. If you really believe its in the type-casting, only then should you try and optimize it somehow.
Solution
The short story: Yes, there is a type check. Here’s the proof –
Given the following classes:
// Let's define a generic class.
public class Cell<T> {
public void set(T t) { this.t = t; }
public T get() { return t; }
private T t;
}
public class A {
static Cell<String> cell = new Cell<String>(); // Instantiate it.
public static void main(String[] args) {
// Now, let's use it.
cell.set("a");
String s = cell.get();
System.out.println(s);
}
}
The bytecode that A.main()
is compiled into (decompiled via javap -c A.class
) is as follows:
public static void main(java.lang.String[]);
Code:
0: getstatic #20 // Field cell:Lp2/Cell;
3: ldc #22 // String a
5: invokevirtual #24 // Method p2/Cell.set:(Ljava/lang/Object;)V
8: getstatic #20 // Field cell:Lp2/Cell;
11: invokevirtual #30 // Method p2/Cell.get:()Ljava/lang/Object;
14: checkcast #34 // class java/lang/String
17: astore_1
18: getstatic #36 // Field java/lang/System.out:Ljava/io/PrintStream;
21: aload_1
22: invokevirtual #42 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
25: return
}
As you can in offset 14
, the result of cell.get()
is type checked to verify that is indeed a string:
14: checkcast #34 // class java/lang/String
This does incur some runtime penalty. However, as the JVM is pretty well optimized, the effect of this is likely to be minimal.
The longer story:
How would you go about implementing such a CheesecakeList
class? Wouldn’t this class define an array to hold the elements? Be advised that every assignment to an array incurs a hidden type check. So you won’t gain as much as you think (although it is likely that your program will perform more read operations than write operations so a Cheesecake[]
array will give you something).
Bottom line: don’t optimize prematurely.
A final comment. People often think that type erasure means that Cell<String>
is compiled into Cell<Object>
. That’s not true. The erasure is applied only to the definition of the generic class/method. It is not applied to the usage sites of these classes/methods.
In other words, the class Cell<T>
is compiled as if it were written as Cell<Object>
. If T
has an upper bound (say Number
) then it is compiled into Cell<Number>
. In other places in the code, typically variables/parameters whose type is an instantiation of the Cell
class (such as Cell<String> myCell
), the erasure is not applied. The fact that myCell
is of type Cell<String>
is kept in the classfile. This allows the compiler to type check the program correctly.
Answered By – Itay Maman
Answer Checked By – David Goodson (BugsFixing Volunteer)