One Method to Rule Them All: Map.merge()

 I don’t often explain a single method in JDK, but when I do, it’s about Map.merge(). This is probably the most versatile operation in the key-value universe. But it’s also rather obscure and rarely used.

merge() can be explained as follows: it either puts new value under the given key (if absent) or updates existing key with a given value (UPSERT). Let’s start with the most basic example: counting unique word occurrences. Pre-Java 8 (read: pre-2014!) code was quite messy and the essence was lost in implementation details:

var map = new HashMap<String, Integer>();
words.forEach(word -> {
    var prev = map.get(word);
    if (prev == null) {
        map.put(word, 1);
    } else {
        map.put(word, prev + 1);
    }
});

However, it works, and for the given input, it produces the desired output:

var words = List.of("Foo", "Bar", "Foo", "Buzz", "Foo", "Buzz", "Fizz", "Fizz");
//...
{Bar=1, Fizz=2, Foo=3, Buzz=2}

OK, but let’s try to refactor it to avoid conditional logic:

words.forEach(word -> {
    map.putIfAbsent(word, 0);
    map.put(word, map.get(word) + 1);
});

That’s nice!

putIfAbsent()  is a necessary evil; otherwise, the code breaks on the first occurrence of a previously unknown word. Also, I find map.get(word) insidemap.put() to be a bit awkward. Let’s get rid of it as well!

words.forEach(word -> {
    map.putIfAbsent(word, 0);
    map.computeIfPresent(word, (w, prev) -> prev + 1);
});

computeIfPresent() invokes the given transformation only if the key in question (word) exists. Otherwise, it does nothing. We made sure the key exists by initializing it to zero, so incrementation always works. Can we do better? Well, we can cut the extra initialization, but I wouldn’t recommend it:

words.forEach(word ->
        map.compute(word, (w, prev) -> prev != null ? prev + 1 : 1)
);

compute ()is likecomputeIfPresent(), but it is invoked irrespective to the existence of the given key. If the value for the key does not exist, the prevargument is null. Moving a simpleif to a ternary expression hidden in lambda is far from optimal. This is where themerge()operator shines. Before I show you the final version, let’s see a slightly simplified default implementation ofMap.merge():

default V merge(K key, V value, BiFunction<V, V, V> remappingFunction) {
    V oldValue = get(key);
    V newValue = (oldValue == null) ? value :
               remappingFunction.apply(oldValue, value);
    if (newValue == null) {
        remove(key);
    } else {
        put(key, newValue);
    }
    return newValue;
}

The code snippet is worth a thousand words. merge() works in two scenarios. If the given key is not present, it simply becomes put(key, value). However, if the said key already holds some value, our remappingFunction may merge (duh!) the old and the one. This function is free to:

  • overwrite old value by simply returning the new one: (old, new) -> new
  • keep the old value by simply returning the old one: (old, new) -> old
  • somehow merge the two, e.g.: (old, new) -> old + new
  • or even remove old value: (old, new) -> null

As you can see, merge() is quite versatile. So, how does our academic problem look like with merge()? It’s quite pleasing:

words.forEach(word ->
        map.merge(word, 1, (prev, one) -> prev + one)
);

You can read it as follows: put 1 under the word key if absent; otherwise, add 1 to the existing value. I named one of the parameters “ one” because in our example it’s always…  1.

Sadly, remappingFunctiontakes two parameters, where the second one is the value we are about to upsert (insert or update). Technically, we know this value already, so (word, 1, prev -> prev + 1)  would be much easier to digest. But there’s no such API.

All right, but is merge() really useful? Imagine you have an account operation (constructor, getters, and other useful properties omitted):

class Operation {
    private final String accNo;
    private final BigDecimal amount;
}

And a bunch of operations for different accounts:

var operations = List.of(
    new Operation("123", new BigDecimal("10")),
    new Operation("456", new BigDecimal("1200")),
    new Operation("123", new BigDecimal("-4")),
    new Operation("123", new BigDecimal("8")),
    new Operation("456", new BigDecimal("800")),
    new Operation("456", new BigDecimal("-1500")),
    new Operation("123", new BigDecimal("2")),
    new Operation("123", new BigDecimal("-6.5")),
    new Operation("456", new BigDecimal("-600"))
);

We would like to compute balance (total over operations’ amounts) for each account. Without merge(), this is quite cumbersome:

var balances = new HashMap<String, BigDecimal>();
operations.forEach(op -> {
    var key = op.getAccNo();
    balances.putIfAbsent(key, BigDecimal.ZERO);
    balances.computeIfPresent(key, (accNo, prev) -> prev.add(op.getAmount()));
});

But with a little help of merge():

operations.forEach(op ->
        balances.merge(op.getAccNo(), op.getAmount(), 
                (soFar, amount) -> soFar.add(amount))
);

Do you see a method reference opportunity here?

operations.forEach(op ->
        balances.merge(op.getAccNo(), op.getAmount(), BigDecimal::add)
);

I find this astoundingly readable. For each operation, add the given amount to the given accNo. The results are as expected:

{123=9.5, 456=-100}

ConcurrentHashMap

Map.merge() shines even brighter when you realize it’s properly implemented inConcurrentHashMap. This means we can atomically perform an insert-or-update operation — single line and thread-safe.

ConcurrentHashMap is obviously thread-safe, but not across many operations, e.g. get() and thenput(). However, merge() makes sure no updates are lost.

Java SortedMap and TreeMap

This tutorial helps you understand SortedMap with TreeMap implementation in the Java Collections Framework.First, let’s review the API hierarchy. TreeMap doesn’t only implement the Map interface, it also implements the SortedMapand NavigableMap interfaces. Therefore, besides the behaviors inherited from the Map, TreeMap also inherits the behaviors defined by SortedMap and NavigableMap. The following picture depicts the API hierarchy of TreeMap:

TreeMapAPI

1. Understanding SortedMap

The main characteristic of a SortedMap is that, it orders the keys by their natural ordering, or by a specified comparator. So consider using a TreeMap when you want a map that satisfies the following criteria:

  • null key or null value are not permitted.
  • The keys are sorted either by natural ordering or by a specified comparator.

The following example realizes the concept of a SortedMap:

1
2
3
4
5
6
7
8
9
SortedMap<String, String> mapDomains = new TreeMap<>();
mapDomains.put(".com", "International");
mapDomains.put(".us", "United States");
mapDomains.put(".uk", "United Kingdom");
mapDomains.put(".jp", "Japan");
mapDomains.put(".au", "Australia");
System.out.println(mapDomains);

Output:

1
{.au=Australia, .com=International, .jp=Japan, .uk=United Kingdom, .us=United States}

Here, this map contains mappings of domain=country, and as we see in the output, the domains (keys) are sorted by alphabetic order (natural ordering of Strings).

Besides the operations inherited from the Map interface, the SortedMap also defines the following operations:

  • Range view: returns a sub sorted map whose keys fall within a range of keys in the original map.
  • Endpoints: returns the first or last key in the sorted map.
  • Comparator access: returns the comparator (implements the Comparator interface), if any, used to sort the map.

Hence the following code is the definition of a SortedMap:

1
2
3
4
5
6
7
8
public interface SortedMap<K, V> extends Map<K, V>{
    Comparator<? super K> comparator();
    SortedMap<K, V> subMap(K fromKey, K toKey);
    SortedMap<K, V> headMap(K toKey);
    SortedMap<K, V> tailMap(K fromKey);
    K firstKey();
    K lastKey();
}

Let’s look at each type of operation in details. * Range View Operations:+ subMap(K fromKey, K toKey): returns a sorted map whose keys range from fromKey, inclusive, to toKey, exclusive.+ headMap(K toKey): returns a sorted map whose keys are strictly less than toKey.+ tailMap(K fromKey): returns a sorted map whose keys are greater than or equal to fromKey. * Endpoint operations:+ firstKey(): returns the first (lowest) key currently in the map.+ lastKey(): returns the last (highest) key currently in the map. * Comparator access:+ comparator(): returns the comparator used to order the keys in the map, or returns null if this map uses the natural ordering of its keys.

2. SortedMap and TreeMap Examples

The following code example demonstrates how to work with these operations on a TreeMap:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
SortedMap<Integer, String> mapHttpStatus = new TreeMap<>();
mapHttpStatus.put(100, "Continue");
mapHttpStatus.put(200, "OK");
mapHttpStatus.put(300, "Multiple Choices");
mapHttpStatus.put(400, "Bad Request");
mapHttpStatus.put(401, "Unauthorized");
mapHttpStatus.put(402, "Payment Required");
mapHttpStatus.put(403, "Forbidden");
mapHttpStatus.put(404, "Not Found");
mapHttpStatus.put(500, "Internal Server Error");
mapHttpStatus.put(501, "Not Implemented");
mapHttpStatus.put(502, "Bad Gateway");
System.out.println("All key-value pairs: ");
for (Integer code : mapHttpStatus.keySet()) {
    System.out.println(code + " -> " + mapHttpStatus.get(code));
}
System.out.println();
Integer firstKey = mapHttpStatus.firstKey();
String firstValue = mapHttpStatus.get(firstKey);
System.out.println("First status: " + firstKey + " -> " + firstValue);
System.out.println();
Integer lastKey = mapHttpStatus.lastKey();
String lastValue = mapHttpStatus.get(lastKey);
System.out.println("Last status: " + lastKey + " -> " + lastValue);
System.out.println();
SortedMap<Integer, String> map4xxStatus = mapHttpStatus.subMap(400, 500);
System.out.println("4xx Statuses: ");
for (Integer code : map4xxStatus.keySet()) {
    System.out.println(code + " -> " + map4xxStatus.get(code));
}
System.out.println();
SortedMap<Integer, String> mapUnder300Status = mapHttpStatus.headMap(300);
System.out.println("Statuses < 300: ");
for (Integer code : mapUnder300Status.keySet()) {
    System.out.println(code + " -> " + mapUnder300Status.get(code));
}
System.out.println();
SortedMap<Integer, String> mapAbove500Status = mapHttpStatus.tailMap(500);
System.out.println("Statuses > 500: ");
for (Integer code : mapAbove500Status.keySet()) {
    System.out.println(code + " -> " + mapAbove500Status.get(code));
}
Comparator comparator = mapHttpStatus.comparator();
System.out.println("Sorted by natural ordering? " + (comparator == null));

Output:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
All key-value pairs:
100 -> Continue
200 -> OK
300 -> Multiple Choices
400 -> Bad Request
401 -> Unauthorized
402 -> Payment Required
403 -> Forbidden
404 -> Not Found
500 -> Internal Server Error
501 -> Not Implemented
502 -> Bad Gateway
First status: 100 -> Continue
Last status: 502 -> Bad Gateway
4xx Statuses:
400 -> Bad Request
401 -> Unauthorized
402 -> Payment Required
403 -> Forbidden
404 -> Not Found
Statuses < 300:
100 -> Continue
200 -> OK
Statuses > 500:
500 -> Internal Server Error
501 -> Not Implemented
502 -> Bad Gateway
Sorted by natural ordering? true   

And the following example shows how to use a comparator:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
SortedMap<Integer, String> mapHttpStatus = new TreeMap<>(new ReverseComparator());
mapHttpStatus.put(100, "Continue");
mapHttpStatus.put(200, "OK");
mapHttpStatus.put(300, "Multiple Choices");
mapHttpStatus.put(400, "Bad Request");
mapHttpStatus.put(401, "Unauthorized");
mapHttpStatus.put(402, "Payment Required");
mapHttpStatus.put(403, "Forbidden");
mapHttpStatus.put(404, "Not Found");
mapHttpStatus.put(500, "Internal Server Error");
mapHttpStatus.put(501, "Not Implemented");
mapHttpStatus.put(502, "Bad Gateway");
for (Integer code : mapHttpStatus.keySet()) {
    System.out.println(code + " -> " + mapHttpStatus.get(code));
}

Here’s the code of the comparator class:

1
2
3
4
5
class ReverseComparator implements Comparator<Integer> {
    public int compare(Integer num1, Integer num2) {
        return num2.compareTo(num1);
    }
}

Output:

1
2
3
4
5
6
7
8
9
10
11
502 -> Bad Gateway
501 -> Not Implemented
500 -> Internal Server Error
404 -> Not Found
403 -> Forbidden
402 -> Payment Required
401 -> Unauthorized
400 -> Bad Request
300 -> Multiple Choices
200 -> OK
100 -> Continue

As you can see, this comparator sorts the map by the descending order of its keys.In case you are working on Java 8, use Lambda expressions to shorten the comparator code like this:

1
SortedMap<Integer, String> mapHttpStatus = new TreeMap<>((i1, i2) -> i2.compareTo(i1));

Class diagram of Queue API

Queue API is the most complex API in the family of Java Collections Framework. Queue<E> is the base interface for all kind of queues.

  • Sub interfaces: BlockingQueue<E>, Deque<E> and BlockingDeque<E>.
  • Abstract classes: AbstractQueue<E>.
  • Implementation classes:
    • ArrayBlockingQueue<E>
    • ArrayDeque<E>
    • ConcurrentLinkedQueue<E>
    • DelayQueue<E extends Delayed>
    • LinkedBlockingDeque<E>
    • LinkedBlockingQueue<E>
    • LinkedList<E>
    • PriorityBlockingQueue<E>
    • PriorityQueue<E>
    • SynchronousQueue<E>

The following class diagram outlines the hierarchy of Queue API:

Queue API class diagram

Class diagram of Map API

A map stores elements in terms of key=value pair. Map<K, V> is the base interface for all kind of maps.

  • Sub interfaces: ConcurrentMap<K, V>, SortedMap<K, V>, NavigableMap<K, V> and ConcurrentNavigableMap<K, V>.
  • Abstract classes: AbstractMap<E>.
  • Implementation classes:
    • Hashtable<K, V>
    • HashMap<K, V>
    • LinkedHashMap<K, V>
    • IdentityHashMap<K, V>
    • ConcurrentHashMap<K, V>
    • ConcurrentSkipListMap<K, V>
    • EnumMap<K extends Enum<K>, V>
    • WeakHashMap<K, V>
    • TreeMap<K, V>

The following class diagram outlines the hierarchy of Map API:

Map API class diagram

Class diagram of Set API

In Set API:

  • Set<E> is the base interface for all kinds of set. This interface extends all methods from the Collection<E>interface and does not define any new methods.
  • Sub interfaces: SortedSet<E> and NavigableSet<E>.
  • Abstract subclasses: AbstractSet<E> and EnumSet<E extends enum<E>>.
  • Concrete implementation classes: HashSet<E>, LinkedHashSet<E>, TreeSet<E>,ConcurrentSkipListSet<E> and CopyOnWriteArraySet<E> (these two last classes are under java.util.concurrent package).
  • The JobStateReasons class extends HashSet<E> but it is not a member of Java Collections Framework.

The following class diagram outlines the Set API in Java Collections Framework:

Set API class diagram

Class diagram of List API

In List API:

  • List<E> is the base interface for all kinds of list. It defines general operations for a List type.
  • Abstract subclasses: AbstractList<E> and AbstractSequentialList<E>
  • Concrete implementation classes: ArrayList<E>, Vector<E>, LinkedList<E> and CopyOnWriteArrayList<E>(this class is under java.util.concurrent package).
  • Legacy collection: Vector<E>
  • Implementation classes in JDK which are not members of Java Collections Framework: AttributeList,RoleList, RoleUnresolvedList and Stack.

The following class diagram describes the hierarchy structure of List API in Java Collections Framework:

List API class diagram

Overview of Java Collections Framework API (UML diagram)

The Java collections framework has a very complex API hierarchy.

The following class diagram shows a brief overview of the Java Collections Framework which is divided into four groups: List, Set, Map and Queue. Only the principal, commonly-used interfaces and classes are listed.

collections framework overview

 Class diagram of Java Collections framework

There are also more detailed class diagrams for each group: