Chapter 4. Comparators and Collectors

Java 8 enhances the Comparator interface with several static and default methods that make sorting operations much simpler. It’s now possible to sort a collection of POJOs by one property, then break ties by a second property, then by a third, and so on, using just a series of library calls.

Java 8 also adds a new utility class called java.util.stream.Collectors, which provides static methods to convert from streams back into various types of collections. The collectors can also be applied “downstream,” meaning that they can postprocess a grouping or partitioning operation.

The recipes in this chapter illustrate all these concepts.

4.1 Sorting Using a Comparator

Problem

You want to sort objects.

Solution

Use the sorted method on Stream with a Comparator, either implemented with a lambda expression or generated by one of the static comparing methods on the Comparator interface.

Discussion

The no-argument sorted method on Stream produces a new, sorted stream using the natural ordering for the class. The natural ordering is specified by implementing the java.util.Comparable interface.

For example, consider sorting a collection of strings, as shown in Example 4-1.

Example 4-1. Sorting strings lexicographically

private List<String> sampleStrings =
    Arrays.asList("this", "is", "a", "list", "of", "strings");

public List<String> defaultSort() {
    Collections.sort(sampleStrings);  
    return sampleStrings;
}

public List<String> defaultSortUsingStreams() {
    return sampleStrings.stream()
        .sorted()                     
        .collect(Collectors.toList());
}

: Default sort from Java 7 and below
: Default sort from Java 8 and above

Java has had a utility class called Collections ever since the collections framework was added back in version 1.2. The static sort method on Collections takes a List as an argument, but returns void. The sort is destructive, modifying the supplied collection. This approach does not follow the functional principles supported by Java 8, which emphasize immutability.

Java 8 uses the sorted method on streams to do the same sorting, but produces a new stream rather than modifying the original collection. In this example, the returned list is sorted according to the natural ordering of the class. For strings, the natural ordering is lexicographical, which reduces to alphabetical when all the strings are lowercase, as in this example.
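
To make the difference between the destructive and nondestructive sorts concrete, here is a minimal sketch. The class name SortComparisonDemo and the method sortedCopy are hypothetical names chosen for illustration; only the library calls come from the discussion above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class SortComparisonDemo {
    // Nondestructive: returns a new sorted list, leaving the source as it was
    static List<String> sortedCopy(List<String> source) {
        return source.stream()
            .sorted()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(
            Arrays.asList("this", "is", "a", "list", "of", "strings"));

        List<String> sorted = sortedCopy(words);
        System.out.println(words);   // [this, is, a, list, of, strings]
        System.out.println(sorted);  // [a, is, list, of, strings, this]

        Collections.sort(words);     // destructive: reorders words in place
        System.out.println(words);   // [a, is, list, of, strings, this]
    }
}
```

After the stream-based sort the original list still has its initial order. Only the call to Collections.sort changes it.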

If you want to sort the strings in a different way, then there is an overloaded sorted method that takes a Comparator as an argument.

Example 4-2 shows a length sort for strings in two different ways.

Example 4-2. Sorting strings by length

public List<String> lengthSortUsingSorted() {
    return sampleStrings.stream()
        .sorted((s1, s2) -> s1.length() - s2.length()) 
        .collect(toList());
}

public List<String> lengthSortUsingComparator() {
    return sampleStrings.stream()
        .sorted(Comparator.comparingInt(String::length)) 
        .collect(toList());
}

: Using a lambda for the Comparator to sort by length
: Using a Comparator using the comparingInt method

The argument to the sorted method is a java.util.Comparator, which is a functional interface. In lengthSortUsingSorted, a lambda expression is provided to implement the compare method in Comparator. In Java 7 and earlier, the implementation would normally be provided by an anonymous inner class, but here a lambda expression is all that is required.
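
For comparison, the following sketch places a Java 7-style anonymous inner class next to the equivalent lambda. The class name LengthComparators and the helper sortWith are hypothetical. The sketch also uses Integer.compare in place of subtraction, a substitution made here for illustration; for string lengths either approach gives the same ordering.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class LengthComparators {
    // Java 7 style: anonymous inner class implementing compare
    static final Comparator<String> BY_LENGTH_ANONYMOUS = new Comparator<String>() {
        @Override
        public int compare(String s1, String s2) {
            return Integer.compare(s1.length(), s2.length());
        }
    };

    // Java 8 style: the same logic as a lambda expression
    static final Comparator<String> BY_LENGTH_LAMBDA =
        (s1, s2) -> Integer.compare(s1.length(), s2.length());

    // Hypothetical helper: sorts a copy of the list with the given comparator
    static List<String> sortWith(List<String> strings, Comparator<String> cmp) {
        return strings.stream()
            .sorted(cmp)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words =
            Arrays.asList("this", "is", "a", "list", "of", "strings");
        System.out.println(sortWith(words, BY_LENGTH_ANONYMOUS));
        System.out.println(sortWith(words, BY_LENGTH_LAMBDA));
        // both print [a, is, of, this, list, strings]
    }
}
```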

Note

Java 8 added sort(Comparator) as a default instance method on List, equivalent to the static void sort(List, Comparator) method on Collections. Both are destructive sorts that return void, so the sorted(Comparator) approach on streams discussed here (which returns a new, sorted stream) is still preferred.

The second method, lengthSortUsingComparator, takes advantage of one of the static methods added to the Comparator interface. The comparingInt method takes an argument of type ToIntFunction, called a keyExtractor in the docs, that transforms the string into an int. It generates a Comparator that sorts the collection using that key.

The added default methods in Comparator are extremely useful. While you can write a Comparator that sorts by length pretty easily, when you want to sort by more than one field that can get complicated. Consider sorting the strings by length, then equal-length strings alphabetically. Using the default and static methods in Comparator, that becomes almost trivial, as shown in Example 4-3.

Example 4-3. Sorting by length, then equal lengths lexicographically

public List<String> lengthSortThenAlphaSort() {
    return sampleStrings.stream()
        .sorted(comparing(String::length)            
                    .thenComparing(naturalOrder()))
        .collect(toList());
}

: Sort by length, then equal-length strings alphabetically

Comparator provides a default method called thenComparing. It is overloaded: one version takes another Comparator (as with naturalOrder in Example 4-3), and another, just like comparing, takes a Function known as a keyExtractor. Chaining this to the comparing method returns a Comparator that compares by the first quantity, then breaks ties by the second, and so on.
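
As a further sketch of chaining, the default reversed method can be combined with thenComparing. The class and method names below are hypothetical; the sample strings are the ones used earlier in this recipe.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class ThenComparingDemo {
    // Shortest first, ties broken by natural (lexicographical) order
    static List<String> byLengthThenAlpha(List<String> strings) {
        return strings.stream()
            .sorted(Comparator.comparingInt(String::length)
                        .thenComparing(Comparator.naturalOrder()))
            .collect(Collectors.toList());
    }

    // Longest first, ties still broken by natural order
    static List<String> byLengthDescThenAlpha(List<String> strings) {
        return strings.stream()
            .sorted(Comparator.comparingInt(String::length)
                        .reversed()
                        .thenComparing(Comparator.naturalOrder()))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words =
            Arrays.asList("this", "is", "a", "list", "of", "strings");
        System.out.println(byLengthThenAlpha(words));
        // [a, is, of, list, this, strings]
        System.out.println(byLengthDescThenAlpha(words));
        // [strings, list, this, is, of, a]
    }
}
```

Note that reversed applies only to the comparator built so far, so the tie-breaking comparator added afterward keeps its natural direction.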

Static imports often make the code easier to read. Once you get used to the static methods in both Comparator and Collectors, this becomes an easy way to simplify the code. In this case, the comparing and naturalOrder methods have been statically imported.

This approach works on any class, even if it does not implement Comparable. Consider the Golfer class shown in Example 4-4.

Example 4-4. A class for golfers

public class Golfer {
    private String first;
    private String last;
    private int score;

    // ... other methods ...
}

To create a leader board at a tournament, it makes sense to sort by score, then by last name, and then by first name. Example 4-5 shows how to do that.

Example 4-5. Sorting golfers

private List<Golfer> golfers = Arrays.asList(
    new Golfer("Jack", "Nicklaus", 68),
    new Golfer("Tiger", "Woods", 70),
    new Golfer("Tom", "Watson", 70),
    new Golfer("Ty", "Webb", 68),
    new Golfer("Bubba", "Watson", 70)
);

public List<Golfer> sortByScoreThenLastThenFirst() {
    return golfers.stream()
        .sorted(comparingInt(Golfer::getScore)
                    .thenComparing(Golfer::getLast)
                    .thenComparing(Golfer::getFirst))
        .collect(toList());
}

The output from calling sortByScoreThenLastThenFirst is shown in Example 4-6.

Example 4-6. Sorted golfers

Golfer{first='Jack', last='Nicklaus', score=68}
Golfer{first='Ty', last='Webb', score=68}
Golfer{first='Bubba', last='Watson', score=70}
Golfer{first='Tom', last='Watson', score=70}
Golfer{first='Tiger', last='Woods', score=70}

The golfers are sorted by score, so Nicklaus and Webb come before Woods and both Watsons.¹ Then equal scores are sorted by last name, putting Nicklaus before Webb and Watson before Woods. Finally, equal scores and last names are sorted by first name, putting Bubba Watson before Tom Watson.

The default and static methods in Comparator, along with the new sorted method on Stream, make generating complex sorts easy.

4.2 Converting a Stream into a Collection

Problem

After stream processing, you want to convert to a List, Set, or other linear collection.

Solution

Use the toList, toSet, or toCollection methods in the Collectors utility class.

Discussion

Idiomatic Java 8 often involves passing elements of a stream through a pipeline of intermediate operations, finishing with a terminal operation. One terminal operation is the collect method, which is used to convert a Stream into a collection.

The collect method in Stream has two overloaded versions, as shown in Example 4-7.

Example 4-7. The collect method in Stream<T>

<R,A> R collect(Collector<? super T,A,R> collector)
<R>   R collect(Supplier<R> supplier,
                BiConsumer<R,? super T> accumulator,
                BiConsumer<R,R> combiner)

This recipe deals with the first version, which takes a Collector as an argument. Collectors perform a “mutable reduction operation” that accumulates elements into a result container. Here the result will be a collection.
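
Although this recipe focuses on the Collector version, a short sketch of both overloads side by side may help show what a “mutable reduction” involves. The class and method names are hypothetical. The three-argument call supplies the result container, the way to add one element, and the way to merge two partial containers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CollectOverloadsDemo {
    // First overload: a ready-made Collector does the work
    static List<String> withCollector(Stream<String> stream) {
        return stream.collect(Collectors.toList());
    }

    // Second overload: supplier, accumulator, and combiner given explicitly
    static List<String> withThreeArgs(Stream<String> stream) {
        return stream.collect(
            ArrayList::new,      // supplier: creates the result container
            ArrayList::add,      // accumulator: adds one element
            ArrayList::addAll);  // combiner: merges two partial results
    }

    public static void main(String[] args) {
        System.out.println(withCollector(
            Stream.of("Mr. Furious", "The Blue Raja", "The Shoveler")));
        System.out.println(withThreeArgs(
            Stream.of("Mr. Furious", "The Blue Raja", "The Shoveler")));
        // both print [Mr. Furious, The Blue Raja, The Shoveler]
    }
}
```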

Collector is an interface, so it can’t be instantiated. The interface contains a static of method for producing them, but there is often a better, or at least easier, way.

Tip

The Java 8 API frequently uses a static method called of as a factory method.

Here, the static methods in the Collectors class will be used to produce Collector instances, which are used as the argument to Stream.collect to populate a collection.
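
For completeness, here is a minimal sketch of building a Collector by hand with the static of method mentioned above. The class name CollectorOfDemo and the method toArrayListCollector are hypothetical; in practice Collectors.toList is the easier choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

class CollectorOfDemo {
    // A hand-built collector that accumulates strings into an ArrayList
    static Collector<String, List<String>, List<String>> toArrayListCollector() {
        return Collector.of(
            ArrayList::new,                  // supplier for the container
            List::add,                       // accumulator for each element
            (left, right) -> {               // combiner for partial results
                left.addAll(right);
                return left;
            });
    }

    public static void main(String[] args) {
        List<String> heroes = Stream.of("The Bowler", "The Spleen")
            .collect(toArrayListCollector());
        System.out.println(heroes);  // [The Bowler, The Spleen]
    }
}
```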

A simple example that creates a List is shown in Example 4-8.²

Example 4-8. Creating a List

List<String> superHeroes =
    Stream.of("Mr. Furious", "The Blue Raja", "The Shoveler",
              "The Bowler", "Invisible Boy", "The Spleen", "The Sphinx")
          .collect(Collectors.toList());

This method creates and populates an ArrayList with the given stream elements. Creating a Set is just as easy, as in Example 4-9.

Example 4-9. Creating a Set

Set<String> villains =
    Stream.of("Casanova Frankenstein", "The Disco Boys",
              "The Not-So-Goodie Mob", "The Suits", "The Suzies",
              "The Furriers", "The Furriers")  
          .collect(Collectors.toSet());

: Duplicate name, removed when converting to a Set

This method creates an instance of HashSet and populates it, leaving out any duplicates.

Both of these examples used the default data structures—ArrayList for List, and HashSet for Set. If you wish to specify a particular data structure, you should use the Collectors.toCollection method, which takes a Supplier as an argument. Example 4-10 shows the sample code.

Example 4-10. Creating a linked list

List<String> actors =
    Stream.of("Hank Azaria", "Janeane Garofalo", "William H. Macy",
              "Paul Reubens", "Ben Stiller", "Kel Mitchell", "Wes Studi")
          .collect(Collectors.toCollection(LinkedList::new));

The argument to the toCollection method is a collection Supplier, so the constructor reference to LinkedList is provided here. The collect method instantiates a LinkedList and then populates it with the given names.
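
The same approach works for any collection with a no-argument constructor. As a sketch, the hypothetical method below collects into a TreeSet, which both removes duplicates and keeps the elements sorted.

```java
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToCollectionDemo {
    // Collects into a TreeSet: duplicates dropped, elements kept in sorted order
    static TreeSet<String> sortedUnique(Stream<String> names) {
        return names.collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        TreeSet<String> villains = sortedUnique(
            Stream.of("The Furriers", "The Suits",
                      "The Furriers", "The Disco Boys"));
        System.out.println(villains);
        // [The Disco Boys, The Furriers, The Suits]
    }
}
```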

The Stream interface also contains a method to create an array of objects. There are two overloads of the toArray method:

    Object[] toArray();
<A> A[]      toArray(IntFunction<A[]> generator);

The former returns an array containing the elements of this stream, but without specifying the type. The latter takes a function that produces a new array of desired type with length equal to the size of the stream, and is easiest to use with an array constructor reference as shown in Example 4-11.

Example 4-11. Creating an array

String[] wannabes =
    Stream.of("The Waffler", "Reverse Psychologist", "PMS Avenger")
          .toArray(String[]::new); 

: Array constructor reference as an IntFunction

The returned array is of the specified type, whose length matches the number of elements in the stream.

To transform into a Map, the Collectors.toMap method requires two Function instances—one for the keys and one for the values.

Consider an Actor POJO, which wraps a name and a role. If you have a Set of Actor instances from a given movie, the code in Example 4-12 creates a Map from them.

Example 4-12. Creating a Map

Set<Actor> actors = mysteryMen.getActors();

Map<String, String> actorMap = actors.stream()
    .collect(Collectors.toMap(Actor::getName, Actor::getRole)); 

actorMap.forEach((key,value) ->
    System.out.printf("%s played %s%n", key, value));

: Functions to produce keys and values

The output is

Janeane Garofalo played The Bowler
Greg Kinnear played Captain Amazing
William H. Macy played The Shoveler
Paul Reubens played The Spleen
Wes Studi played The Sphinx
Kel Mitchell played Invisible Boy
Geoffrey Rush played Casanova Frankenstein
Ben Stiller played Mr. Furious
Hank Azaria played The Blue Raja

Similar code works for ConcurrentMap using the toConcurrentMap method.
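
One detail worth knowing about the two-argument toMap is that it throws an IllegalStateException if two elements produce the same key. An overload with a third argument, a BinaryOperator that merges the values, handles that case. The sketch below uses plain strings keyed by their length; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapMergeDemo {
    // Three-argument toMap: values with the same key are merged
    static Map<Integer, String> byLength(List<String> words) {
        return words.stream()
            .collect(Collectors.toMap(
                String::length,                             // key: word length
                Function.identity(),                        // value: the word
                (first, second) -> first + "|" + second));  // merge duplicates
    }

    // Two-argument toMap: reports whether duplicate keys caused a failure
    static boolean failsOnDuplicateKeys(List<String> words) {
        try {
            words.stream()
                .collect(Collectors.toMap(String::length, Function.identity()));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("is", "of", "a", "this");
        System.out.println(byLength(words).get(2));       // is|of
        System.out.println(failsOnDuplicateKeys(words));  // true
    }
}
```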

4.3 Adding a Linear Collection to a Map

Problem

You want to add a collection of objects to a Map, where the key is one of the object properties and the value is the object itself.

Solution

Use the toMap method of Collectors, along with Function.identity.

Discussion

This is a short, very focused use case, but when it comes up in practice the solution here can be quite convenient.

Say you had a List of Book instances, where Book is a simple POJO that has an ID, a name, and a price. An abbreviated form of the Book class is shown in Example 4-13.

Example 4-13. A simple POJO representing a book

public class Book {
    private int id;
    private String name;
    private double price;

    // ... other methods ...
}

Now assume you have a collection of Book instances, as shown in Example 4-14.

Example 4-14. A collection of books

List<Book> books = Arrays.asList(
    new Book(1, "Modern Java Recipes", 49.99),
    new Book(2, "Java 8 in Action", 49.99),
    new Book(3, "Java SE8 for the Really Impatient", 39.99),
    new Book(4, "Functional Programming in Java", 27.64),
    new Book(5, "Making Java Groovy", 45.99),
    new Book(6, "Gradle Recipes for Android", 23.76)
);

In many situations, instead of a List you might want a Map, where the keys are the book IDs and the values are the books themselves. This is really easy to accomplish using the toMap method in Collectors, as shown two different ways in Example 4-15.

Example 4-15. Adding the books to a Map

Map<Integer, Book> bookMap = books.stream()
    .collect(Collectors.toMap(Book::getId, b -> b));              

bookMap = books.stream()
    .collect(Collectors.toMap(Book::getId, Function.identity()));

: Identity lambda: given an element, return it
: Static identity method in Function does the same thing

The toMap method in Collectors takes two Function instances as arguments, the first of which generates a key and the second of which generates the value from the provided object. In this case, the key is mapped by the getId method in Book, and the value is the book itself.

The first toMap in Example 4-15 uses the getId method to map to the key and an explicit lambda expression that simply returns its parameter. The second example uses the static identity method in Function to do the same thing.

The Two Static Identity Methods

The static identity method in Function has the signature

static <T> Function<T,T> identity()

The implementation in the standard library is shown in Example 4-16.

Example 4-16. The static identity method in Function

static <T> Function<T, T> identity() {
    return t -> t;
}

The UnaryOperator interface extends Function, but you can’t override a static method. In the Javadocs, it also declares a static identity method:

static <T> UnaryOperator<T> identity()

Its implementation in the standard library is essentially the same, as shown in Example 4-17.

Example 4-17. The static identity method in UnaryOperator

static <T> UnaryOperator<T> identity() {
    return t -> t;
}

The differences are only in the way you call them (from the two interface names) and the corresponding return types. In this case, it doesn’t matter which one you use, but it’s interesting to see that they’re both there.

Whether you decide to supply an explicit lambda or use the static method is merely a matter of style. Either way, it is easy to add collection values to a Map where the key is a property of the object and the value is the object itself.
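
Putting the pieces of this recipe together, a self-contained sketch might look like the following. The outer class BookMapDemo, the method indexById, and the constructor and getters on Book are assumptions added so the code compiles on its own; they stand in for the “other methods” elided from Example 4-13.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class BookMapDemo {
    // Assumed constructor and getters; the original class elides them
    static class Book {
        private final int id;
        private final String name;
        private final double price;

        Book(int id, String name, double price) {
            this.id = id;
            this.name = name;
            this.price = price;
        }

        int getId() { return id; }
        String getName() { return name; }
        double getPrice() { return price; }
    }

    // Keys are the book IDs, values are the books themselves
    static Map<Integer, Book> indexById(List<Book> books) {
        return books.stream()
            .collect(Collectors.toMap(Book::getId, Function.identity()));
    }

    public static void main(String[] args) {
        List<Book> books = Arrays.asList(
            new Book(1, "Modern Java Recipes", 49.99),
            new Book(2, "Java 8 in Action", 49.99));
        Map<Integer, Book> bookMap = indexById(books);
        System.out.println(bookMap.get(2).getName());  // Java 8 in Action
    }
}
```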

4.4 Sorting Maps

Problem

You want to sort a Map by key or by value.

Solution

Use the new static methods in the Map.Entry interface.

Discussion

The Map interface has always contained a public, static, inner interface called Map.Entry, which represents a key-value pair. The Map.entrySet method returns a Set of Map.Entry elements. Prior to Java 8, the primary methods used in this interface were getKey and getValue, which do what you’d expect.

In Java 8, the static methods in Table 4-1 have been added.

Table 4-1. Static methods in Map.Entry (from Java 8 docs)
Method	Description
`comparingByKey()`	Returns a comparator that compares `Map.Entry` in natural order on key
`comparingByKey(Comparator<? super K> cmp)`	Returns a comparator that compares `Map.Entry` by key using the given `Comparator`
`comparingByValue()`	Returns a comparator that compares `Map.Entry` in natural order on value
`comparingByValue(Comparator<? super V> cmp)`	Returns a comparator that compares `Map.Entry` by value using the given `Comparator`

To demonstrate how to use them, Example 4-18 generates a Map of word lengths to number of words in a dictionary. Many Unix systems contain a file at /usr/share/dict/words holding the contents of Webster’s 2nd edition dictionary, with one word per line. The Files.lines method can be used to read a file and produce a stream of strings containing those lines. In this case, the stream will contain each word from the dictionary.

Example 4-18. Reading the dictionary file into a Map

System.out.println("\nNumber of words of each length:");
try (Stream<String> lines = Files.lines(dictionary)) {
    lines.filter(s -> s.length() > 20)
        .collect(Collectors.groupingBy(
            String::length, Collectors.counting()))
        .forEach((len, num) -> System.out.printf("%d: %d%n", len, num));
} catch (IOException e) {
    e.printStackTrace();
}

This example is discussed in Recipe 7.1, but to summarize:

The file is read inside a try-with-resources block. Stream implements AutoCloseable, so when the try block exits, Java calls the close method on Stream, which in turn closes the underlying file.
The filter restricts further processing to words longer than 20 characters.
The groupingBy method of Collectors takes a Function as the first argument, representing the classifier. Here, the classifier is the length of each string. If you only provide one argument, the result is a Map where the keys are the values of the classifier and the values are lists of elements that match the classifier. In the case we’re currently examining, groupingBy(String::length) would have produced a Map<Integer,List<String>> where the keys are the word lengths and the values are lists of words of that length.
Here, the two-argument version of groupingBy lets you supply another Collector, called a downstream collector, that postprocesses the lists of words. With Collectors.counting as the downstream collector, the return type is Map<Integer,Long>, where the keys are the word lengths and the values are the number of words of that length in the dictionary.

The result is:

Number of words of each length:
21: 82
22: 41
23: 17
24: 5

In other words, there are 82 words of length 21, 41 words of length 22, 17 words of length 23, and 5 words of length 24.³

The results show that the map happens to be printed in ascending order of word length, although groupingBy makes no guarantee about the iteration order of the Map it returns. To see it in descending order, use Map.Entry.comparingByKey as in Example 4-19.

Example 4-19. Sorting the map by key

System.out.println("\nNumber of words of each length (desc order):");
try (Stream<String> lines = Files.lines(dictionary)) {
    Map<Integer, Long> map = lines.filter(s -> s.length() > 20)
        .collect(Collectors.groupingBy(
            String::length, Collectors.counting()));

    map.entrySet().stream()
        .sorted(Map.Entry.comparingByKey(Comparator.reverseOrder()))
        .forEach(e -> System.out.printf("Length %d: %2d words%n",
            e.getKey(), e.getValue()));
} catch (IOException e) {
    e.printStackTrace();
}

After computing the Map<Integer,Long>, this operation extracts the entrySet and produces a stream. The sorted method on Stream is used to produce a sorted stream using the provided comparator.

In this case, Map.Entry.comparingByKey generates a comparator that sorts by the keys, and using the overload that takes a comparator allows the code to specify that we want it in reverse order.

Note

The sorted method on Stream produces a new, sorted stream that does not modify the source. The original Map is unaffected.

The result is:

Number of words of each length (desc order):
Length 24:  5 words
Length 23: 17 words
Length 22: 41 words
Length 21: 82 words

The other sorting methods listed in Table 4-1 are used similarly.
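
As an illustration of one of those other methods, the sketch below sorts a map by value in descending order and collects the entries into a LinkedHashMap so the sorted order is preserved. The class and method names are hypothetical, and the four-argument toMap overload (with a merge function and a map supplier) is used only to choose the map implementation.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class SortByValueDemo {
    // Returns a new map whose iteration order is by descending value
    static Map<String, Integer> sortByValueDescending(Map<String, Integer> source) {
        return source.entrySet().stream()
            .sorted(Map.Entry.comparingByValue(Comparator.reverseOrder()))
            .collect(Collectors.toMap(
                Map.Entry::getKey,
                Map.Entry::getValue,
                (a, b) -> a,              // not expected to be called: keys are unique
                LinkedHashMap::new));     // preserves insertion (sorted) order
    }

    public static void main(String[] args) {
        Map<String, Integer> wordCounts = new HashMap<>();
        wordCounts.put("a", 1);
        wordCounts.put("b", 3);
        wordCounts.put("c", 2);
        System.out.println(sortByValueDescending(wordCounts));
        // {b=3, c=2, a=1}
    }
}
```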

4.5 Partitioning and Grouping

Problem

You want to divide a collection of elements into categories.

Solution

The Collectors.partitioningBy method splits elements into those that satisfy a Predicate and those that do not. The Collectors.groupingBy method produces a Map of categories, where the values are the elements in each category.

Discussion

Say you have a collection of strings. If you want to split them into those with even lengths and those with odd lengths, you can use Collectors.partitioningBy, as in Example 4-20.

Example 4-20. Partitioning strings by even or odd lengths

List<String> strings = Arrays.asList("this", "is", "a", "long", "list", "of",
        "strings", "to", "use", "as", "a", "demo");

Map<Boolean, List<String>> lengthMap = strings.stream()
    .collect(Collectors.partitioningBy(s -> s.length() % 2 == 0)); 

lengthMap.forEach((key,value) -> System.out.printf("%5s: %s%n", key, value));
//
// false: [a, strings, use, a]
//  true: [this, is, long, list, of, to, as, demo]

: Partitioning by even or odd length

The signatures of the two partitioningBy methods are:

static <T> Collector<T,?,Map<Boolean,List<T>>> partitioningBy(
    Predicate<? super T> predicate)
static <T,D,A> Collector<T,?,Map<Boolean,D>> partitioningBy(
    Predicate<? super T> predicate, Collector<? super T,A,D> downstream)

The return types look rather nasty due to the generics, but you rarely have to deal with them in practice. Instead, the result of either operation becomes the argument to the collect method, which uses the generated collector to create the output map defined by the third generic argument.

The first partitioningBy method takes a single Predicate as an argument. It divides the elements into those that satisfy the Predicate and those that do not. You will always get a Map as a result that has exactly two entries: a list of values that satisfy the Predicate, and a list of values that do not.
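
A quick way to see the “exactly two entries” behavior is to partition a list in which no element satisfies the predicate. In the sketch below (hypothetical class and method names), the true key is still present and maps to an empty list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionKeysDemo {
    static Map<Boolean, List<String>> partitionByEvenLength(List<String> strings) {
        return strings.stream()
            .collect(Collectors.partitioningBy(s -> s.length() % 2 == 0));
    }

    public static void main(String[] args) {
        // Every string here has an odd length
        Map<Boolean, List<String>> result =
            partitionByEvenLength(Arrays.asList("a", "use", "strings"));

        System.out.println(result.size());      // 2
        System.out.println(result.get(true));   // []
        System.out.println(result.get(false));  // [a, use, strings]
    }
}
```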

The overloaded version of the method takes a second argument of type Collector, called a downstream collector. This allows you to postprocess the lists returned by the partition, and is discussed in Recipe 4.6.

The groupingBy method performs an operation like a “group by” statement in SQL. It returns a Map where the keys are the groups and the values are lists of elements in each group.

Note

If you are getting your data from a database, by all means do any grouping operations there. The new API methods are convenience methods for data in memory.

The signature for the groupingBy method is:

static <T,K> Collector<T,?,Map<K,List<T>>> groupingBy(
    Function<? super T,? extends K> classifier)

The Function argument takes each element of the stream and extracts a property to group by. This time, rather than simply partition the strings into two categories, consider separating them by length, as in Example 4-21.

Example 4-21. Grouping strings by length

List<String> strings = Arrays.asList("this", "is", "a", "long", "list", "of",
        "strings", "to", "use", "as", "a", "demo");

Map<Integer, List<String>> lengthMap = strings.stream()
    .collect(Collectors.groupingBy(String::length)); 

lengthMap.forEach((k,v) -> System.out.printf("%d: %s%n", k, v));
//
// 1: [a, a]
// 2: [is, of, to, as]
// 3: [use]
// 4: [this, long, list, demo]
// 7: [strings]

: Grouping strings by length

The keys in the resulting map are the lengths of the strings (1, 2, 3, 4, and 7) and the values are lists of strings of each length.

4.6 Downstream Collectors

Problem

You want to postprocess the collections returned by a groupingBy or partitioningBy operation.

Solution

Use one of the static utility methods from the java.util.stream.Collectors class.

Discussion

In Recipe 4.5, we looked at how to separate elements into multiple categories. The partitioningBy and groupingBy methods return a Map where the keys are the categories (booleans true and false for partitioningBy, objects for groupingBy) and the values are lists of elements that fall into each category. Recall the example partitioning strings by even and odd lengths, shown in Example 4-20 and repeated in Example 4-22 for convenience.

Example 4-22. Partitioning strings by even or odd lengths

List<String> strings = Arrays.asList("this", "is", "a", "long", "list", "of",
        "strings", "to", "use", "as", "a", "demo");

Map<Boolean, List<String>> lengthMap = strings.stream()
    .collect(Collectors.partitioningBy(s -> s.length() % 2 == 0));

lengthMap.forEach((key,value) -> System.out.printf("%5s: %s%n", key, value));
//
// false: [a, strings, use, a]
//  true: [this, is, long, list, of, to, as, demo]

Rather than the actual lists, you may be interested in how many elements fall into each category. In other words, instead of producing a Map whose values are List<String>, you might want just the number of elements in each of the lists. The partitioningBy method has an overloaded version whose second argument is of type Collector:

static <T,D,A> Collector<T,?,Map<Boolean,D>> partitioningBy(
    Predicate<? super T> predicate, Collector<? super T,A,D> downstream)

This is where the static Collectors.counting method becomes useful. Example 4-23 shows how it works.

Example 4-23. Counting the partitioned strings

Map<Boolean, Long> numberLengthMap = strings.stream()
    .collect(Collectors.partitioningBy(s -> s.length() % 2 == 0,
                 Collectors.counting()));  

numberLengthMap.forEach((k,v) -> System.out.printf("%5s: %d%n", k, v));
//
// false: 4
//  true: 8

: Downstream collector

This is called a downstream collector, because it is postprocessing the resulting lists downstream (i.e., after the partitioning operation is completed).

The groupingBy method also has an overload that takes a downstream collector:

/**
* @param <T> the type of the input elements
* @param <K> the type of the keys
* @param <A> the intermediate accumulation type of the downstream collector
* @param <D> the result type of the downstream reduction
* @param classifier a classifier function mapping input elements to keys
* @param downstream a {@code Collector} implementing the downstream reduction
* @return a {@code Collector} implementing the cascaded group-by operation
*/
static <T,K,A,D> Collector<T,?,Map<K,D>> groupingBy(
    Function<? super T,? extends K> classifier,
    Collector<? super T,A,D> downstream)

A portion of the Javadoc comment from the source code is included with the signature, which shows that T is the type of the elements in the stream, K is the key type for the resulting map, A is the intermediate accumulation type of the downstream collector, and D is the result type of the downstream reduction. The ? represents “unknown.” See Appendix A for more details on generics in Java 8.

Several methods in Stream have analogs in the Collectors class. Table 4-2 shows how they align.

Table 4-2. Collectors methods similar to Stream methods
Stream	Collectors
`count`	`counting`
`map`	`mapping`
`min`	`minBy`
`max`	`maxBy`
`IntStream.sum`	`summingInt`
`DoubleStream.sum`	`summingDouble`
`LongStream.sum`	`summingLong`
`IntStream.summaryStatistics`	`summarizingInt`
`DoubleStream.summaryStatistics`	`summarizingDouble`
`LongStream.summaryStatistics`	`summarizingLong`

Again, the purpose of a downstream collector is to postprocess the collection of objects produced by an upstream operation, like partitioning or grouping.
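
To show a few of the collectors from Table 4-2 in a downstream role, here is a sketch that reuses the strings from Example 4-22. The class name, method names, and the EVEN_LENGTH constant are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class DownstreamDemo {
    static final Predicate<String> EVEN_LENGTH = s -> s.length() % 2 == 0;

    // counting: number of strings of each length
    static Map<Integer, Long> countByLength(List<String> strings) {
        return strings.stream()
            .collect(Collectors.groupingBy(String::length, Collectors.counting()));
    }

    // summingInt: total number of characters in each partition
    static Map<Boolean, Integer> totalCharsByParity(List<String> strings) {
        return strings.stream()
            .collect(Collectors.partitioningBy(EVEN_LENGTH,
                Collectors.summingInt(String::length)));
    }

    // mapping: transform each element, then collect the distinct lengths
    static Map<Boolean, Set<Integer>> lengthsByParity(List<String> strings) {
        return strings.stream()
            .collect(Collectors.partitioningBy(EVEN_LENGTH,
                Collectors.mapping(String::length, Collectors.toSet())));
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("this", "is", "a", "long", "list",
            "of", "strings", "to", "use", "as", "a", "demo");
        System.out.println(countByLength(strings).get(2));         // 4
        System.out.println(totalCharsByParity(strings).get(true)); // 24
        System.out.println(lengthsByParity(strings).get(false).size()); // 3
    }
}
```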

4.7 Finding Max and Min Values

Problem

You want to determine the maximum or minimum value in a stream.

Solution

You have several choices: the maxBy and minBy methods on BinaryOperator, the max and min methods on Stream, or the maxBy and minBy utility methods on Collectors.

Discussion

A BinaryOperator is one of the functional interfaces in the java.util.function package. It extends BiFunction and applies when both arguments to the function and the return value are all from the same class.

The BinaryOperator interface adds two static methods:

static <T> BinaryOperator<T> maxBy(Comparator<? super T> comparator)
static <T> BinaryOperator<T> minBy(Comparator<? super T> comparator)

Each of these returns a BinaryOperator that uses the supplied Comparator.
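
Before using these with reduce, it may help to see the returned operators applied directly to two values. The sketch below uses strings compared by length; the class name and the constants LONGER and SHORTER are hypothetical.

```java
import java.util.Comparator;
import java.util.function.BinaryOperator;

class BinaryOperatorDemo {
    // Returns whichever argument is greater according to the comparator
    static final BinaryOperator<String> LONGER =
        BinaryOperator.maxBy(Comparator.comparingInt(String::length));

    // Returns whichever argument is lesser according to the comparator
    static final BinaryOperator<String> SHORTER =
        BinaryOperator.minBy(Comparator.comparingInt(String::length));

    public static void main(String[] args) {
        System.out.println(LONGER.apply("is", "strings"));   // strings
        System.out.println(SHORTER.apply("is", "strings"));  // is
    }
}
```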

To demonstrate the various ways to get the maximum value from a stream, consider a POJO called Employee that holds three attributes: name, salary, and department, as in Example 4-24.

Example 4-24. Employee POJO

public class Employee {
    private String name;
    private Integer salary;
    private String department;

    // ... other methods ...
}

List<Employee> employees = Arrays.asList(                  
        new Employee("Cersei",     250_000, "Lannister"),
        new Employee("Jamie",      150_000, "Lannister"),
        new Employee("Tyrion",       1_000, "Lannister"),
        new Employee("Tywin",    1_000_000, "Lannister"),
        new Employee("Jon Snow",    75_000, "Stark"),
        new Employee("Robb",       120_000, "Stark"),
        new Employee("Eddard",     125_000, "Stark"),
        new Employee("Sansa",            0, "Stark"),
        new Employee("Arya",         1_000, "Stark"));

Employee defaultEmployee =                                
    new Employee("A man (or woman) has no name", 0, "Black and White");

: Collection of employees
: Default for when the stream is empty

Given a collection of employees, you can use the reduce method on Stream, which takes a BinaryOperator as an argument. The snippet in Example 4-25 shows how to get the employee with the largest salary.

Example 4-25. Using BinaryOperator.maxBy

Optional<Employee> optionalEmp = employees.stream()
    .reduce(BinaryOperator.maxBy(Comparator.comparingInt(Employee::getSalary)));

System.out.println("Emp with max salary: " +
    optionalEmp.orElse(defaultEmployee));

The reduce method requires a BinaryOperator. The static maxBy method produces that BinaryOperator based on the supplied Comparator, which in this case compares employees by salary.

This works, but there’s actually a convenience method called max that can be applied directly to the stream:

Optional<T> max(Comparator<? super T> comparator)

Using that method directly is shown in Example 4-26.

Example 4-26. Using Stream.max

optionalEmp = employees.stream()
        .max(Comparator.comparingInt(Employee::getSalary));

The result is the same.

Note that there is also a method called max on the primitive streams (IntStream, LongStream, and DoubleStream) that takes no arguments. Example 4-27 shows that method in action.

Example 4-27. Finding the highest salary

OptionalInt maxSalary = employees.stream()
        .mapToInt(Employee::getSalary)
        .max();
System.out.println("The max salary is " + maxSalary);

In this case, the mapToInt method is used to convert the stream of employees into a stream of integers by invoking the getSalary method, and the returned stream is an IntStream. The max method then returns an OptionalInt.
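
Printing an OptionalInt directly shows the wrapper rather than the bare number. A minimal sketch of unwrapping it, using a hypothetical helper that takes the salaries as a plain int array, follows.

```java
import java.util.OptionalInt;
import java.util.stream.IntStream;

class MaxSalaryDemo {
    // Returns the maximum, or the supplied default when there are no values
    static int maxOrDefault(int[] salaries, int defaultValue) {
        return IntStream.of(salaries)
            .max()
            .orElse(defaultValue);
    }

    public static void main(String[] args) {
        OptionalInt wrapped = IntStream.of(250_000, 1_000, 1_000_000).max();
        System.out.println(wrapped);             // OptionalInt[1000000]
        System.out.println(wrapped.getAsInt());  // 1000000

        System.out.println(maxOrDefault(new int[0], -1));  // -1
    }
}
```

The orElse method is the safer choice when the stream might be empty, since getAsInt throws a NoSuchElementException on an empty OptionalInt.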

There is also a static method called maxBy in the Collectors utility class. You can use it directly here, as in Example 4-28.

Example 4-28. Using Collectors.maxBy

optionalEmp = employees.stream()
    .collect(Collectors.maxBy(Comparator.comparingInt(Employee::getSalary)));

This is awkward, however, and can be replaced by the max method on Stream, as shown in Example 4-26. The maxBy method on Collectors is helpful when used as a downstream collector (i.e., when postprocessing a grouping or partitioning operation). The code in Example 4-29 uses Collectors.groupingBy to create a Map of departments to lists of employees, but then determines the employee with the greatest salary in each department.

Example 4-29. Using Collectors.maxBy as a downstream collector

Map<String, Optional<Employee>> map = employees.stream()
    .collect(Collectors.groupingBy(
                Employee::getDepartment,
                Collectors.maxBy(
                    Comparator.comparingInt(Employee::getSalary))));

map.forEach((dept, emp) ->
        System.out.println(dept + ": " + emp.orElse(defaultEmployee)));

The minBy method in each of these classes works the same way.
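
To make that concrete, here is a short sketch of the minimum-value counterparts. The Employee class below is a hypothetical stand-in with only the three properties the example needs, and the method names are made up for illustration.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class MinByDemo {
    // Hypothetical stand-in for the Employee POJO used in this recipe
    static class Employee {
        private final String name;
        private final int salary;
        private final String department;

        Employee(String name, int salary, String department) {
            this.name = name;
            this.salary = salary;
            this.department = department;
        }

        String getName() { return name; }
        int getSalary() { return salary; }
        String getDepartment() { return department; }
    }

    // Stream.min mirrors Stream.max
    static Optional<Employee> lowestPaid(List<Employee> employees) {
        return employees.stream()
                .min(Comparator.comparingInt(Employee::getSalary));
    }

    // Collectors.minBy mirrors Collectors.maxBy as a downstream collector
    static Map<String, Optional<Employee>> lowestPaidByDepartment(
            List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(
                        Employee::getDepartment,
                        Collectors.minBy(
                                Comparator.comparingInt(Employee::getSalary))));
    }
}
```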

4.8 Creating Immutable Collections

Problem

You want to create an immutable list, set, or map using the Stream API.

Solution

Use the new static method collectingAndThen in the Collectors class.

Discussion

With its focus on parallelization and clarity, functional programming favors using immutable objects wherever possible. The Collections framework, added in Java 1.2, has always had methods to create immutable collections from existing ones, though in a somewhat awkward fashion.

The Collections utility class has methods unmodifiableList, unmodifiableSet, and unmodifiableMap (along with a few other methods with the same unmodifiable prefix), as shown in Example 4-30.

Example 4-30. Unmodifiable methods in the Collections class

static <T> List<T>    unmodifiableList(List<? extends T> list)
static <T> Set<T>     unmodifiableSet(Set<? extends T> s)
static <K,V> Map<K,V> unmodifiableMap(Map<? extends K,? extends V> m)

In each case, the argument to the method is an existing list, set, or map, and the resulting list, set, or map has the same elements as the argument, but with an important difference: all the methods that could modify the collection, like add or remove, now throw an UnsupportedOperationException.
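
The following sketch illustrates that behavior. It also demonstrates a detail worth keeping in mind: the returned list is a read-only view of the original, so changes made through the original reference remain visible through the wrapper. The class and method names are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class UnmodifiableViewDemo {
    // Returns true if calling add on the given list is rejected
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("new");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> backing = new ArrayList<>(Arrays.asList("a", "b"));
        List<String> readOnly = Collections.unmodifiableList(backing);

        // Mutating methods on the wrapper throw
        System.out.println(rejectsAdd(readOnly));

        // The wrapper is a view: a change made to the backing list
        // shows up in the wrapper as well
        backing.add("c");
        System.out.println(readOnly.size());
    }
}
```

If nobody else holds a reference to the backing list, as in the examples that follow, the wrapper is effectively immutable.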

Prior to Java 8, if you received the individual values as an argument, using a variable argument list, you produced an unmodifiable list or set as shown in Example 4-31.

Example 4-31. Creating unmodifiable lists or sets prior to Java 8

@SafeVarargs  
public final <T> List<T> createImmutableListJava7(T... elements) {
    return Collections.unmodifiableList(Arrays.asList(elements));
}

@SafeVarargs  
public final <T> Set<T> createImmutableSetJava7(T... elements) {
    return Collections.unmodifiableSet(new HashSet<>(Arrays.asList(elements)));
}

: You promise not to corrupt the input array type. See Appendix A for details.

The idea in each case is to start by taking the incoming values and converting them into a List. You can wrap the resulting list using unmodifiableList, or, in the case of a Set, use the list as the argument to a set constructor before using unmodifiableSet.

In Java 8, with the new Stream API, you can instead take advantage of the static Collectors.collectingAndThen method, as in Example 4-32.

Example 4-32. Creating unmodifiable lists or sets in Java 8

import static java.util.stream.Collectors.collectingAndThen;
import static java.util.stream.Collectors.toList;
import static java.util.stream.Collectors.toSet;

// ... define a class with the following methods ...

@SafeVarargs
public final <T> List<T> createImmutableList(T... elements) {
    return Arrays.stream(elements)
        .collect(collectingAndThen(toList(),
                    Collections::unmodifiableList));  
}

@SafeVarargs
public final <T> Set<T> createImmutableSet(T... elements) {
    return Arrays.stream(elements)
        .collect(collectingAndThen(toSet(),
                    Collections::unmodifiableSet));   
}

: “Finisher” wraps the generated collections

The Collectors.collectingAndThen method takes two arguments: a downstream Collector and a Function called a finisher. The idea is to stream the input elements, collect them into a List or Set with the downstream collector, and then let the finisher (here, one of the unmodifiable methods) wrap the resulting collection.

Converting a series of input elements into an unmodifiable Map isn’t as clear, partly because it’s not obvious which of the input elements would be assumed to be keys and which would be values. The code shown in Example 4-33⁴ creates an immutable Map in a very awkward way, using an instance initializer.

Example 4-33. Creating an immutable Map

Map<String, Integer> map = Collections.unmodifiableMap(
  new HashMap<String, Integer>() {{
    put("have", 1);
    put("the", 2);
    put("high", 3);
    put("ground", 4);
}});
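
When the keys and values can both be derived from each stream element, the collectingAndThen approach extends to maps as well. The following is only a sketch under that assumption: the choice of each word as the key and its length as the value is arbitrary, and the class and method names are hypothetical.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ImmutableMapDemo {
    // Keys are the words themselves and values are their lengths
    // (an arbitrary mapping chosen for this illustration).
    // Note that toMap throws an IllegalStateException on duplicate keys.
    static Map<String, Integer> wordLengths(String... words) {
        return Stream.of(words)
                .collect(Collectors.collectingAndThen(
                        Collectors.toMap(Function.identity(), String::length),
                        Collections::unmodifiableMap));
    }
}
```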

Readers who are familiar with Java 9, however, already know that this entire recipe can be replaced with a very simple set of factory methods: List.of, Set.of, and Map.of.
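
For comparison, a brief sketch of those factory methods follows. It requires Java 9 or later, and the class and method names are made up for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Requires Java 9 or later
class Java9FactoriesDemo {
    static List<String> words() {
        return List.of("have", "the", "high", "ground");
    }

    static Set<Integer> numbers() {
        return Set.of(1, 2, 3, 4);
    }

    // Map.of takes alternating keys and values
    static Map<String, Integer> wordMap() {
        return Map.of("have", 1, "the", 2, "high", 3, "ground", 4);
    }
}
```

The collections returned by these methods reject attempts to modify them by throwing an UnsupportedOperationException, just like the wrapped collections shown earlier in this recipe.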

4.9 Implementing the Collector Interface

Problem

You need to implement java.util.stream.Collector manually, because none of the factory methods in the java.util.stream.Collectors class give you exactly what you need.

Solution

Provide lambda expressions or method references for the Supplier, accumulator, combiner, and finisher functions used by the Collector.of factory methods, along with any desired characteristics.

Discussion

The utility class java.util.stream.Collectors has several convenient static methods whose return type is Collector. Examples are toList, toSet, toMap, and even toCollection, each of which is illustrated elsewhere in this book. Instances of classes that implement Collector are sent as arguments to the collect method on Stream. For instance, in Example 4-34, the method accepts string arguments and returns a List containing only those whose length is even.

Example 4-34. Using collect to return a List

public List<String> evenLengthStrings(String... strings) {
    return Stream.of(strings)
        .filter(s -> s.length() % 2 == 0)
        .collect(Collectors.toList());  
}

: Collect even-length strings into a List

If you need to write your own collectors, however, the procedure is a bit more complicated. Collectors use five functions that work together to accumulate entries into a mutable container and optionally transform the result. The five functions are called supplier, accumulator, combiner, finisher, and characteristics.

Taking the characteristics function first, it represents an immutable Set of elements of the enum type Collector.Characteristics. The three possible values are CONCURRENT, IDENTITY_FINISH, and UNORDERED. CONCURRENT means that the accumulator function can safely be called on the same result container from multiple threads at once. UNORDERED says that the collection operation does not need to preserve the encounter order of the elements. IDENTITY_FINISH means that the finishing function returns its argument without any changes.

Note that you don’t have to provide any characteristics if the defaults are what you want.

The purpose of each of the required methods is:

supplier(): Create the accumulator container using a Supplier<A>
accumulator(): Add a single new data element to the accumulator container using a BiConsumer<A,T>
combiner(): Merge two accumulator containers using a BinaryOperator<A>
finisher(): Transform the accumulator container into the result container using a Function<A,R>
characteristics(): A Set<Collector.Characteristics> chosen from the enum values

As usual, an understanding of the functional interfaces defined in the java.util.function package makes everything clearer. A Supplier is used to create the container where temporary results are accumulated. A BiConsumer adds a single element to the accumulator. A BinaryOperator means that both input types and the output type are the same, so here the idea is to combine two accumulators into one. A Function finally transforms the accumulator into the desired result container.
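
One way to see how the five methods fit together is to implement the interface directly as a class. The following is a sketch rather than a recommended design: the class name SortedSetCollector and the helper method sortedSetOf are hypothetical, and the choice of UNORDERED as the only characteristic is an assumption that fits this particular container, since a sorted set does not depend on encounter order.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Stream;

// Hypothetical collector: gathers strings into an unmodifiable SortedSet
class SortedSetCollector
        implements Collector<String, TreeSet<String>, SortedSet<String>> {

    @Override
    public Supplier<TreeSet<String>> supplier() {
        return TreeSet::new;                      // create the container
    }

    @Override
    public BiConsumer<TreeSet<String>, String> accumulator() {
        return TreeSet::add;                      // add one element
    }

    @Override
    public BinaryOperator<TreeSet<String>> combiner() {
        return (left, right) -> {                 // merge two containers
            left.addAll(right);
            return left;
        };
    }

    @Override
    public Function<TreeSet<String>, SortedSet<String>> finisher() {
        return Collections::unmodifiableSortedSet; // wrap the result
    }

    @Override
    public Set<Collector.Characteristics> characteristics() {
        // The finisher is not an identity function and TreeSet is not
        // thread-safe, so only UNORDERED is declared here
        return Collections.unmodifiableSet(
                EnumSet.of(Collector.Characteristics.UNORDERED));
    }

    // Hypothetical convenience method showing the collector in use
    static SortedSet<String> sortedSetOf(String... strings) {
        return Stream.of(strings).collect(new SortedSetCollector());
    }
}
```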

Each of these methods is invoked during the collection process, which is triggered by (for example) the collect method on Stream. Conceptually, the collection process is equivalent to the (generic) code shown in Example 4-35, taken from the Javadocs.

Example 4-35. How the Collector methods are used

A container = collector.supplier().get();
for (T t : data) {
    collector.accumulator().accept(container, t); 
}
return collector.finisher().apply(container);

: Create the accumulator container
: Add each element to the accumulator container
: Convert the accumulator container to the result container using the finisher

Conspicuous by its absence is any mention of the combiner function. If your stream is sequential, you don’t need it—the algorithm proceeds as described. If, however, you are operating on a parallel stream, then the work is divided into multiple regions, each of which produces its own accumulator container. The combiner is then used during the join process to merge the accumulator containers together into a single one before applying the finisher function.
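
A small sketch can make the role of the combiner visible. The collector below, whose name CONCAT is invented for this example, concatenates strings using a StringBuilder as the accumulator container. Run sequentially, the combiner is not needed; run in parallel, it joins the partial results. Because the source list is ordered and no characteristics are declared, both runs should produce the same string.

```java
import java.util.List;
import java.util.stream.Collector;

class CombinerDemo {
    // Hypothetical collector that concatenates strings
    static final Collector<String, StringBuilder, String> CONCAT =
            Collector.of(
                    StringBuilder::new,                    // supplier
                    (sb, s) -> sb.append(s),               // accumulator
                    (left, right) -> left.append(right),   // combiner
                    StringBuilder::toString);              // finisher

    static String sequential(List<String> words) {
        return words.stream().collect(CONCAT);
    }

    // Each region of the parallel stream fills its own StringBuilder,
    // and the combiner merges them in encounter order
    static String parallel(List<String> words) {
        return words.parallelStream().collect(CONCAT);
    }
}
```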

A code sample, similar to that shown in Example 4-34, is given in Example 4-36.

Example 4-36. Using collect to return an unmodifiable SortedSet

public SortedSet<String> oddLengthStringSet(String... strings) {
        Collector<String, ?, SortedSet<String>> intoSet =
                Collector.of(TreeSet<String>::new,           
                        SortedSet::add,                      
                        (left, right) -> {                   
                              left.addAll(right);
                              return left;
                        },
                        Collections::unmodifiableSortedSet); 
        return Stream.of(strings)
                .filter(s -> s.length() % 2 != 0)
                .collect(intoSet);
    }

: Supplier to create a new TreeSet
: BiConsumer to add each string to the TreeSet
: BinaryOperator to combine two SortedSet instances into one
: finisher function to create an unmodifiable set

The result will be a sorted, unmodifiable set of strings, ordered lexicographically.

This example used one of the two overloaded versions of the static of method for producing collectors, whose signatures are:

static <T,A,R> Collector<T,A,R> of(Supplier<A> supplier,
    BiConsumer<A,T> accumulator,
    BinaryOperator<A> combiner,
    Function<A,R> finisher,
    Collector.Characteristics... characteristics)
static <T,R> Collector<T,R,R> of(Supplier<R> supplier,
    BiConsumer<R,T> accumulator,
    BinaryOperator<R> combiner,
    Collector.Characteristics... characteristics)
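
The version without a finisher is appropriate when the accumulator container is itself the result. As a sketch of the difference, the two hypothetical collectors below both gather strings into a TreeSet; the one built without a finisher reports the IDENTITY_FINISH characteristic, while the one built with an explicit finisher does not.

```java
import java.util.Collections;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.stream.Collector;
import java.util.stream.Stream;

class CollectorOfDemo {
    // No finisher: the TreeSet is both accumulator and result
    static final Collector<String, TreeSet<String>, TreeSet<String>> INTO_TREE_SET =
            Collector.of(
                    TreeSet::new,
                    TreeSet::add,
                    (left, right) -> { left.addAll(right); return left; });

    // Explicit finisher: the TreeSet is wrapped before being returned
    static final Collector<String, TreeSet<String>, SortedSet<String>> INTO_READ_ONLY_SET =
            Collector.of(
                    TreeSet::new,
                    TreeSet::add,
                    (left, right) -> { left.addAll(right); return left; },
                    Collections::unmodifiableSortedSet);

    static TreeSet<String> treeSetOf(String... strings) {
        return Stream.of(strings).collect(INTO_TREE_SET);
    }

    static boolean hasIdentityFinish(Collector<?, ?, ?> collector) {
        return collector.characteristics()
                .contains(Collector.Characteristics.IDENTITY_FINISH);
    }
}
```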

Given the convenience methods in the Collectors class that produce collectors for you, you rarely need to make one of your own this way. Still, it’s a useful skill to have, and once again illustrates how the functional interfaces in the java.util.function package come together to create interesting objects.
