Map best practices
Today’s topic is about Map
and misuses I’ve seen during many code reviews.
The idea with a Map
is to do whatever you need by doing as less hashing as possible.
A hash occurs each time you access the Map
(e.g. get
, containsKey
, put
).
In Java 8, some useful new methods were added.
Let’s say you want to check if something is in a Map
:
- If it is, return it
- If it’s not, add it and return it
The classical way to do it is:
if (map.containsKey(key)) { // one hash
return map.get(key); // two hash
}
List<String> list = new ArrayList<>();
map.put(key, list); // three hash
return list;
It is also the slowest. A better way is:
List<String> list = map.get(key); // one hash
if(list == null) {
list = new ArrayList<>();
map.put(key, list); // two hash
}
return list;
This is already much better. You save one hash.
Important: This isn’t valid if the value might be null
.
But I highly recommend you to never have null values
But since Java 8, you have three better solutions.
The first one is:
map.putIfAbsent(key, new ArrayList<>()); // one hash
return map.get(key); // two hash
It is better but not much. You still have two hashes.
And the ArrayList
is instantiated even if it is already in the map.
You can improve with the longer:
List<String> list = new ArrayList<>();
List<String> result = map.putIfAbsent(key, list); // one hash only!
if(result == null) {
return list;
}
return result;
Now we’re talking, only one hash! But still the ArrayList
is instantiated uselessly.
Which brings us to another Java 8 method that does the trick.
return map.computeIfAbsent(key, unused -> new ArrayList<>()); // one hash only!
Job done.
One line and the fastest we can get.
The ArrayList
will be instantiated only when needed.
Important: Do not do map.computeIfAbsent(key, ArrayList::new)
.
computeIfAbsent
takes a Function<KEY, VALUE>
in parameter.
So this will in general not compile unless the KEY
matches the parameter of one of the ArrayList constructors.
An example is when the KEY
is an Integer.
Passing a constructor method reference will actually call new ArrayList(KEY)
… which is obviously not what you want.
In order to convince you that it’s the best solution, I have made a little benchmark using JMH. Here are the results:
Benchmark Mode Cnt Score Error Units
MapBenchmark.computeIfAbsent_there thrpt 40 25134018.341 ± 687925.885 ops/s (the best!)
MapBenchmark.containsPut_there thrpt 40 21459978.028 ± 401003.399 ops/s
MapBenchmark.getPut_there thrpt 40 24268773.005 ± 690893.070 ops/s
MapBenchmark.putIfAbsentGet_there thrpt 40 18230032.343 ± 238803.546 ops/s
MapBenchmark.putIfAbsent_there thrpt 40 20579085.677 ± 527246.125 ops/s
MapBenchmark.computeIfAbsent_notThere thrpt 40 8229212.547 ± 341295.641 ops/s (the best!)
MapBenchmark.containsPut_notThere thrpt 40 6996790.450 ± 191176.603 ops/s
MapBenchmark.getPut_notThere thrpt 40 8009163.041 ± 288765.384 ops/s
MapBenchmark.putIfAbsentGet_notThere thrpt 40 6212712.165 ± 333023.068 ops/s
MapBenchmark.putIfAbsent_notThere thrpt 40 7227880.072 ± 289581.816 ops/s
Til next time: Happy mapping.