Collisions on a Switch on Strings
Today, I felt on a really nice blog post about how a switch is implemented for String under the hood.
It basically explains that the switch is implemented on the String hash code. Easy and efficient. But it made me wondering what would happen if the was a collision in the hash codes. By that I mean: What happen if you got two cases containing strings having the same hash code?
To test it, I've created such a switch. First I was needing to found two strings with the same hash code. The way the hash code of calculate for a string is quite simple. It loops over the characters, adding their value to the previous result multiplied by 31. So, doing a bit of algebra, the two characters strings below will be in collision.
Decompiling this code brings a surprising result.
It basically explains that the switch is implemented on the String hash code. Easy and efficient. But it made me wondering what would happen if the was a collision in the hash codes. By that I mean: What happen if you got two cases containing strings having the same hash code?
To test it, I've created such a switch. First I was needing to found two strings with the same hash code. The way the hash code of calculate for a string is quite simple. It loops over the characters, adding their value to the previous result multiplied by 31. So, doing a bit of algebra, the two characters strings below will be in collision.
From where I just had to create a switch with the result. For instance "10" and "0O" both have a hash code of 1567.
int c1 = some_character, c2 = some_other_character;
int c3 = c1-1, c4 = 31*c1+c2-31*c3;
String s1 = new String(new char[] {(char)c1, (char)c2});
String s2 = new String(new char[] {(char)c3, (char)c4});
switch (args[0]) {
case "10":
System.out.println("Here");
break;
case "0O":
System.out.println("There");
break;
case "123":
System.out.println("Away");
break;
}
Decompiling this code brings a surprising result.
String str1 = args[0];int i = -1;
switch (str1.hashCode())
{
case 1567:
if (str1.equals("0O")) {
i = 1;
} else if (str1.equals("10")) {
i = 0;
}
break;
case 48690:
if (str1.equals("123")) {
i = 2;
}
break;
}
switch (i)
{
case 0:
System.out.println("Here");
break;
case 1:
System.out.println("There");
break;
case 2:
System.out.println("Away");
}
So, in case of collision, the compiler changes the strategy and add a redirection. It is worthy to mention that the redirection applies even to cases that are not in collision (like "123" in the example).
So, unless the JVM manage to optimize this at runtime (which is highly possible), if we are unlucky enough to have a collision in your switch, you code will be slower than if there wasn't any.