Map best practices

Today’s topic is about Map and misuses I’ve seen during many code reviews.

The idea with a Map is to do whatever you need by doing as less hashing as possible. A hash occurs each time you access the Map (e.g. get, containsKey, put).

In Java 8, some useful new methods were added. Let’s say you want to check if something is in a Map:

The classical way to do it is:

if (map.containsKey(key)) { // one hash
    return map.get(key); // two hash
}
List<String> list = new ArrayList<>();
map.put(key, list); // three hash
return list;

It is also the slowest. A better way is:

List<String> list = map.get(key); // one hash
if(list == null) {
    list = new ArrayList<>();
    map.put(key, list); // two hash
}
return list;

This is already much better. You save one hash.

Important: This isn’t valid if the value might be null. But I highly recommend you to never have null values

But since Java 8, you have three better solutions.

The first one is:

map.putIfAbsent(key, new ArrayList<>()); // one hash
return map.get(key); // two hash

It is better but not much. You still have two hashes. And the ArrayList is instantiated even if it is already in the map.

You can improve with the longer:

List<String> list = new ArrayList<>();
List<String> result = map.putIfAbsent(key, list); // one hash only!
if(result == null) {
    return list;
}
return result;

Now we’re talking, only one hash! But still the ArrayList is instantiated uselessly.

Which brings us to another Java 8 method that does the trick.

return map.computeIfAbsent(key, unused -> new ArrayList<>()); // one hash only!

Job done. One line and the fastest we can get. The ArrayList will be instantiated only when needed.

Important: Do not do map.computeIfAbsent(key, ArrayList::new). computeIfAbsent takes a Function<KEY, VALUE> in parameter. So this will in general not compile unless the KEY matches the parameter of one of the ArrayList constructors. An example is when the KEY is an Integer. Passing a constructor method reference will actually call new ArrayList(KEY)… which is obviously not what you want.

In order to convince you that it’s the best solution, I have made a little benchmark using JMH. Here are the results:

Benchmark                               Mode  Cnt         Score        Error  Units
MapBenchmark.computeIfAbsent_there     thrpt   40  25134018.341 ± 687925.885  ops/s (the best!)
MapBenchmark.containsPut_there         thrpt   40  21459978.028 ± 401003.399  ops/s
MapBenchmark.getPut_there              thrpt   40  24268773.005 ± 690893.070  ops/s
MapBenchmark.putIfAbsentGet_there      thrpt   40  18230032.343 ± 238803.546  ops/s
MapBenchmark.putIfAbsent_there         thrpt   40  20579085.677 ± 527246.125  ops/s

MapBenchmark.computeIfAbsent_notThere  thrpt   40   8229212.547 ± 341295.641  ops/s (the best!)
MapBenchmark.containsPut_notThere      thrpt   40   6996790.450 ± 191176.603  ops/s
MapBenchmark.getPut_notThere           thrpt   40   8009163.041 ± 288765.384  ops/s
MapBenchmark.putIfAbsentGet_notThere   thrpt   40   6212712.165 ± 333023.068  ops/s
MapBenchmark.putIfAbsent_notThere      thrpt   40   7227880.072 ± 289581.816  ops/s

Til next time: Happy mapping.