Suspicious comparison of Integer references in Java
Have you ever compared two separate (?) Integer instances with the == operator and got the result true? What do you think is the reason for this? Before we examine this situation in detail, let’s remember some basic information about Java types.
There are 8 primitive types (byte, short, int, long, float, double, char, and boolean) in Java. These are stored directly in memory as binary bits. For example, int a = 5; int b = 5;
Here a and b directly hold the binary value of 5, and if we compare a and b using a == b, we are actually comparing 5 == 5, which returns true.
All other types are reference types: classes, interfaces, records, arrays, and so on. A variable of a reference type holds the address of an object instead of the object itself. For example, Integer a = new Integer(5); Integer b = new Integer(5);
Here, instead of holding the value 5, the variables a and b hold the addresses of two separate Integer objects. So if we compare a and b using a == b, we are actually comparing the addresses of the objects, which returns false. If we want to compare the values of the objects, we use a.equals(b), because the Integer class overrides the Object#equals method to compare values instead of references. https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Integer.html#equals(java.lang.Object)
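The difference between the two comparisons can be sketched as follows. This is only an illustration: the class name ReferenceVsValue and the helper fresh are made up here, and the deprecated Integer(int) constructor is used on purpose (with its warnings suppressed) because it always creates a new object.

```java
final class ReferenceVsValue {

    // The Integer(int) constructor is deprecated; it is used here only
    // because it always creates a fresh object.
    @SuppressWarnings({"deprecation", "removal"})
    static Integer fresh(int value) {
        return new Integer(value);
    }

    public static void main(String[] args) {
        Integer a = fresh(5);
        Integer b = fresh(5);

        System.out.println(a == b);      // false: two distinct objects
        System.out.println(a.equals(b)); // true: same wrapped int value
    }
}
```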
So, with the == operator, the addresses of two objects are compared, not their values. Now look at the code block below:
Integer a = 5;
Integer b = 5;
System.out.print(a == b); // true
How does it print true? Doesn’t the == operator compare addresses of objects? Let’s take a look at the hashcodes of a and b.
System.out.println(a.hashCode()); // 5
System.out.println(b.hashCode()); // 5
I guess the Integer class returns its value as hashcode. Well, to be sure, let’s do something like this:
System.out.println(System.identityHashCode(a)); // 1712121640
System.out.println(System.identityHashCode(b)); // 1712121640
Ok, now I’m convinced. It turns out that a and b are references to the same object. Let’s try something like this:
Integer a = new Integer(5);
Integer b = new Integer(5);
System.out.print(a == b); // false
Finally, the answer we’ve been waiting for. So auto-boxing in our previous example doesn’t call the Integer(int) constructor. Let’s take a look at the Java Language Specification. (https://docs.oracle.com/javase/specs/jls/se17/html/jls-5.html#jls-5.1.7)
If the value p being boxed is the result of evaluating a constant expression of type boolean, byte, char, short, int, or long, and the result is true, false, a character in the range ‘\u0000’ to ‘\u007f’ inclusive, or an integer in the range -128 to 127 inclusive, then let a and b be the results of any two boxing conversions of p. It is always the case that a == b.
Let’s check whether the range the JLS gives for integers holds in practice. If we use 128 instead of 5:
Integer a = 128;
Integer b = 128;
System.out.print(a == b); // false
Great! The Integer class caches Integer objects that have values between -128 and 127 and returns the cached instance instead of creating a new one. In this way, it avoids creating a new object for frequently used values. Clever! So where does it make this check and return the object from the cache? Which method does auto-boxing call? Let’s dive a little further into the documentation.
When you read the definition of the Integer#valueOf(int) method (https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Integer.html#valueOf(int)), it says:
Returns an Integer instance representing the specified int value. If a new Integer instance is not required, this method should generally be used in preference to the constructor Integer(int), as this method is likely to yield significantly better space and time performance by caching frequently requested values. This method will always cache values in the range -128 to 127, inclusive, and may cache other values outside of this range.
Super! When Integer a = 5; is compiled, it becomes Integer a = Integer.valueOf(5); not Integer a = new Integer(5); If you don’t believe me, try it yourself… By the way, the Integer(int) constructor has been deprecated since Java 9. The documentation says this about it: https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Integer.html#%3Cinit%3E(int)
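One way to try it yourself is sketched below. The class name BoxingCheck and the method boxed are made up for this example. After compiling it, running javap -c BoxingCheck should show an invokestatic call to Integer.valueOf inside boxed, at least with the javac versions I am aware of. The runtime check is only a consistency test: it passes because both paths return the same cached object for 5.

```java
final class BoxingCheck {

    // Autoboxing: the compiler inserts the int-to-Integer conversion here.
    static Integer boxed(int value) {
        Integer result = value;
        return result;
    }

    public static void main(String[] args) {
        // 5 is inside the range that valueOf is documented to cache, so
        // both expressions refer to the same cached object.
        System.out.println(boxed(5) == Integer.valueOf(5)); // true
    }
}
```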
It is rarely appropriate to use this constructor. The static factory valueOf(int) is generally a better choice, as it is likely to yield significantly better space and time performance.
Look at the Integer#valueOf method.
If an instance with the requested value is in the cache, it returns that instance; if not, it creates a new instance with the new keyword and returns it. When we look at the Integer class, we see that it contains the nested static IntegerCache class.
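The logic described above can be sketched with a small stand-in class. This is not the JDK source: BoxedInt, LOW, HIGH and CACHE are names invented for this sketch, and the bounds are fixed here rather than configurable.

```java
final class BoxedInt {

    // Bounds mirror the range that the valueOf documentation promises to cache.
    private static final int LOW = -128;
    private static final int HIGH = 127;

    // One pre-built instance per value in [LOW, HIGH].
    private static final BoxedInt[] CACHE = new BoxedInt[HIGH - LOW + 1];

    static {
        for (int i = 0; i < CACHE.length; i++) {
            CACHE[i] = new BoxedInt(LOW + i);
        }
    }

    private final int value;

    private BoxedInt(int value) {
        this.value = value;
    }

    static BoxedInt valueOf(int i) {
        if (i >= LOW && i <= HIGH) {
            return CACHE[i - LOW]; // cache hit: shared instance
        }
        return new BoxedInt(i);    // cache miss: fresh instance
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof BoxedInt && ((BoxedInt) other).value == value;
    }

    @Override
    public int hashCode() {
        return value;
    }

    public static void main(String[] args) {
        System.out.println(BoxedInt.valueOf(5) == BoxedInt.valueOf(5));          // true
        System.out.println(BoxedInt.valueOf(128) == BoxedInt.valueOf(128));      // false
        System.out.println(BoxedInt.valueOf(128).equals(BoxedInt.valueOf(128))); // true
    }
}
```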
In IntegerCache, -128 is assigned as low, but we can set high with the property -Djava.lang.Integer.IntegerCache.high=<size>
I don’t know why the low value can’t be configured by a property. If anyone knows, please write in the comments :)
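To see the effect of the high property, a small probe like the one below can be run with and without the flag. CacheProbe and SCAN_LIMIT are hypothetical names, and the scan limit is an arbitrary choice so that the loop always terminates. With default settings I would expect it to print 127, and a larger number when the property is set to a larger value.

```java
final class CacheProbe {

    // Upper bound for the scan so the loop always terminates; an arbitrary
    // choice for this sketch.
    private static final int SCAN_LIMIT = 1_000_000;

    // Starting from 0 (always cached), walks upward while boxing the next
    // value twice yields the same object, and returns the last such value.
    static int highestCachedValue() {
        int n = 0;
        while (n < SCAN_LIMIT && Integer.valueOf(n + 1) == Integer.valueOf(n + 1)) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(highestCachedValue());
    }
}
```

For example: java -Djava.lang.Integer.IntegerCache.high=1000 CacheProbe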
To summarize, the wrapper classes of primitive types cache their instances for certain ranges of values. When an instance in that range is needed, the cached instance is returned instead of creating a new object. We talked about Integer in this story, but according to the JLS passage quoted above, the same guarantee applies to boxed boolean, byte, char, short, and long values in the listed ranges. It does not extend to Float and Double.
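A quick check for the other wrappers could look like this. The class name OtherWrappers and the method guaranteedCasesShareInstances are made up for the example, and the values are picked from the ranges listed in the JLS passage. The Double comparison is printed without an expected result, since nothing in that passage covers it.

```java
final class OtherWrappers {

    // Boxes constants from the ranges the JLS lists and checks that each
    // pair of boxings refers to the same object.
    static boolean guaranteedCasesShareInstances() {
        Long l1 = 127L, l2 = 127L;
        Short s1 = 100, s2 = 100;
        Byte b1 = 10, b2 = 10;
        Character c1 = 'a', c2 = 'a';
        Boolean t1 = true, t2 = true;
        return l1 == l2 && s1 == s2 && b1 == b2 && c1 == c2 && t1 == t2;
    }

    public static void main(String[] args) {
        System.out.println(guaranteedCasesShareInstances()); // true

        // Not covered by the JLS guarantee, so no result is promised here.
        Double d1 = 1.0, d2 = 1.0;
        System.out.println(d1 == d2);
        System.out.println(d1.equals(d2)); // true: equals compares values
    }
}
```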