Chapter 4. Comparators and Collectors
Java 8 enhances the Comparator interface with several static and default methods that make sorting operations much simpler. It’s now possible to sort a collection of POJOs by one property, then equal first properties by a second, then by a third, and so on, just with a series of library calls.
Java 8 also adds a new utility class called java.util.stream.Collectors, which provides static methods to convert from streams back into various types of collections. The collectors can also be applied “downstream,” meaning that they can postprocess a grouping or partitioning operation.
The recipes in this chapter illustrate all these concepts.
4.1 Sorting Using a Comparator
Discussion
The sorted method on Stream produces a new, sorted stream using the natural ordering for the class. The natural ordering is specified by implementing the java.util.Comparable interface.
For example, consider sorting a collection of strings, as shown in Example 4-1.
Example 4-1. Sorting strings lexicographically
private List<String> sampleStrings =
    Arrays.asList("this", "is", "a", "list", "of", "strings");

public List<String> defaultSort() {
    Collections.sort(sampleStrings);
    return sampleStrings;
}

public List<String> defaultSortUsingStreams() {
    return sampleStrings.stream()
        .sorted()
        .collect(Collectors.toList());
}
Java has had a utility class called Collections ever since the collections framework was added back in version 1.2. The static sort method on Collections takes a List as an argument, but returns void. The sort is destructive, modifying the supplied collection. This approach does not follow the functional principles supported by Java 8, which emphasize immutability.
Java 8 uses the sorted method on streams to do the same sorting, but produces a new stream rather than modifying the original collection. In this example, after sorting the collection, the returned list is sorted according to the natural ordering of the class. For strings, the natural ordering is lexicographical, which reduces to alphabetical when all the strings are lowercase, as in this example.
If you want to sort the strings in a different way, then there is an overloaded sorted method that takes a Comparator as an argument.
Example 4-2 shows a length sort for strings in two different ways.
Example 4-2. Sorting strings by length
public List<String> lengthSortUsingSorted() {
    return sampleStrings.stream()
        .sorted((s1, s2) -> s1.length() - s2.length())
        .collect(toList());
}

public List<String> lengthSortUsingComparator() {
    return sampleStrings.stream()
        .sorted(Comparator.comparingInt(String::length))
        .collect(toList());
}
The argument to the sorted method is a java.util.Comparator, which is a functional interface. In lengthSortUsingSorted, a lambda expression is provided to implement the compare method in Comparator. In Java 7 and earlier, the implementation would normally be provided by an anonymous inner class, but here a lambda expression is all that is required.
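To make the contrast concrete, the following sketch sorts the same strings by length twice, once with a Java 7 style anonymous inner class and once with a lambda. The class and method names are invented for illustration, and Integer.compare is used here in place of subtraction; both approaches should give the same result for string lengths.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class LengthSortComparison {
    // Java 7 style: an anonymous inner class implements Comparator.compare
    static List<String> sortWithAnonymousClass(List<String> strings) {
        return strings.stream()
            .sorted(new Comparator<String>() {
                @Override
                public int compare(String s1, String s2) {
                    return Integer.compare(s1.length(), s2.length());
                }
            })
            .collect(Collectors.toList());
    }

    // Java 8 style: the lambda supplies the body of compare directly
    static List<String> sortWithLambda(List<String> strings) {
        return strings.stream()
            .sorted((s1, s2) -> Integer.compare(s1.length(), s2.length()))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> sample =
            Arrays.asList("this", "is", "a", "list", "of", "strings");
        System.out.println(sortWithAnonymousClass(sample));
        System.out.println(sortWithLambda(sample));
    }
}
```

Neither method modifies the list passed in; each returns a new list.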
Note
Java 8 added sort(Comparator) as a default instance method on List, equivalent to the static void sort(List, Comparator) method on Collections. Both are destructive sorts that return void, so the sorted(Comparator) approach on streams discussed here (which returns a new, sorted stream) is still preferred.
The second method, lengthSortUsingComparator, takes advantage of one of the static methods added to the Comparator interface. The comparingInt method takes an argument of type ToIntFunction that transforms the string into an int, called a keyExtractor in the docs, and generates a Comparator that sorts the collection using that key.
The added default methods in Comparator are extremely useful. While you can write a Comparator that sorts by length pretty easily, when you want to sort by more than one field that can get complicated. Consider sorting the strings by length, then equal-length strings alphabetically. Using the default and static methods in Comparator, that becomes almost trivial, as shown in Example 4-3.
Example 4-3. Sorting by length, then equal lengths lexicographically
public List<String> lengthSortThenAlphaSort() {
    return sampleStrings.stream()
        .sorted(comparing(String::length)
            .thenComparing(naturalOrder()))
        .collect(toList());
}
Comparator provides a default method called thenComparing. Just like comparing, it has an overload that takes a Function as an argument, again known as a keyExtractor; the overload used in Example 4-3 takes another Comparator instead. Chaining this to the comparing method returns a Comparator that compares by the first quantity, then equal first by the second, and so on.
Static imports often make the code easier to read. Once you get used to the static methods in both Comparator and Collectors, this becomes an easy way to simplify the code. In this case, the comparing and naturalOrder methods have been statically imported.
This approach works on any class, even if it does not implement Comparable. Consider the Golfer class shown in Example 4-4.
Example 4-4. A class for golfers
public class Golfer {
    private String first;
    private String last;
    private int score;

    // ... other methods ...
}
To create a leader board at a tournament, it makes sense to sort by score, then by last name, and then by first name. Example 4-5 shows how to do that.
Example 4-5. Sorting golfers
private List<Golfer> golfers = Arrays.asList(
    new Golfer("Jack", "Nicklaus", 68),
    new Golfer("Tiger", "Woods", 70),
    new Golfer("Tom", "Watson", 70),
    new Golfer("Ty", "Webb", 68),
    new Golfer("Bubba", "Watson", 70)
);

public List<Golfer> sortByScoreThenLastThenFirst() {
    return golfers.stream()
        .sorted(comparingInt(Golfer::getScore)
            .thenComparing(Golfer::getLast)
            .thenComparing(Golfer::getFirst))
        .collect(toList());
}
The output from calling sortByScoreThenLastThenFirst is shown in Example 4-6.
Example 4-6. Sorted golfers
Golfer{first='Jack', last='Nicklaus', score=68}
Golfer{first='Ty', last='Webb', score=68}
Golfer{first='Bubba', last='Watson', score=70}
Golfer{first='Tom', last='Watson', score=70}
Golfer{first='Tiger', last='Woods', score=70}
The golfers are sorted by score, so Nicklaus and Webb come before Woods and both Watsons. Then equal scores are sorted by last name, putting Nicklaus before Webb and Watson before Woods. Finally, equal scores and last names are sorted by first name, putting Bubba Watson before Tom Watson.
The default and static methods in Comparator, along with the new sorted method on Stream, make generating complex sorts easy.
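Because Example 4-4 elides the constructor and getters, here is a self-contained sketch of the leader board sort. The Golfer constructor, the getter names, and the helper methods are assumptions filled in for illustration; only the comparator chain comes from Example 4-5.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class LeaderBoardSketch {
    // Assumed shape of Golfer: constructor and getters are not shown in Example 4-4
    static final class Golfer {
        private final String first;
        private final String last;
        private final int score;

        Golfer(String first, String last, int score) {
            this.first = first;
            this.last = last;
            this.score = score;
        }

        String getFirst() { return first; }
        String getLast() { return last; }
        int getScore() { return score; }
    }

    static List<Golfer> sampleGolfers() {
        return Arrays.asList(
            new Golfer("Jack", "Nicklaus", 68),
            new Golfer("Tiger", "Woods", 70),
            new Golfer("Tom", "Watson", 70),
            new Golfer("Ty", "Webb", 68),
            new Golfer("Bubba", "Watson", 70));
    }

    // Sort by score, then last name, then first name
    static List<Golfer> sortByScoreThenLastThenFirst(List<Golfer> golfers) {
        return golfers.stream()
            .sorted(Comparator.comparingInt(Golfer::getScore)
                .thenComparing(Golfer::getLast)
                .thenComparing(Golfer::getFirst))
            .collect(Collectors.toList());
    }

    // Hypothetical helper that renders "first last" for each golfer
    static List<String> fullNames(List<Golfer> golfers) {
        return golfers.stream()
            .map(g -> g.getFirst() + " " + g.getLast())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(fullNames(sortByScoreThenLastThenFirst(sampleGolfers())));
    }
}
```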
4.2 Converting a Stream into a Collection
Solution
Use the toList, toSet, or toCollection methods in the Collectors utility class.
Discussion
Idiomatic Java 8 often involves passing elements of a stream through a pipeline of intermediate operations, finishing with a terminal operation. One terminal operation is the collect method, which is used to convert a Stream into a collection.
The collect method in Stream has two overloaded versions, as shown in Example 4-7.
Example 4-7. The collect method in Stream<T>
<R, A> R collect(Collector<? super T, A, R> collector)

<R> R collect(Supplier<R> supplier,
              BiConsumer<R, ? super T> accumulator,
              BiConsumer<R, R> combiner)
This recipe deals with the first version, which takes a Collector as an argument. Collectors perform a “mutable reduction operation” that accumulates elements into a result container. Here the result will be a collection.
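For comparison, the second overload in Example 4-7 can be called directly with a supplier, an accumulator, and a combiner. The sketch below follows the pattern given in the Stream Javadoc; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class ThreeArgCollectSketch {
    static List<String> toArrayList(Stream<String> stream) {
        List<String> result = stream.collect(
            ArrayList::new,     // supplier: creates the result container
            ArrayList::add,     // accumulator: adds one element to a container
            ArrayList::addAll); // combiner: merges two partial containers
        return result;
    }

    public static void main(String[] args) {
        System.out.println(toArrayList(Stream.of("a", "b", "c")));
    }
}
```

The combiner only comes into play when partial results have to be merged, as can happen with parallel streams.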
Collector is an interface, so it can’t be instantiated. The interface contains a static of method for producing instances, but there is often a better, or at least easier, way.
Here, the static methods in the Collectors class will be used to produce Collector instances, which are used as the argument to Stream.collect to populate a collection.
A simple example that creates a List is shown in Example 4-8.
Example 4-8. Creating a List
List<String> superHeroes =
    Stream.of("Mr. Furious", "The Blue Raja", "The Shoveler",
              "The Bowler", "Invisible Boy", "The Spleen", "The Sphinx")
        .collect(Collectors.toList());
This method creates and populates an ArrayList with the given stream elements. Creating a Set is just as easy, as in Example 4-9.
Example 4-9. Creating a Set
Set<String> villains =
    Stream.of("Casanova Frankenstein", "The Disco Boys",
              "The Not-So-Goodie Mob", "The Suits", "The Suzies",
              "The Furriers", "The Furriers")
        .collect(Collectors.toSet());
This method creates an instance of HashSet and populates it, leaving out any duplicates.
Both of these examples used the default data structures: ArrayList for List, and HashSet for Set. If you wish to specify a particular data structure, you should use the Collectors.toCollection method, which takes a Supplier as an argument. Example 4-10 shows the sample code.
Example 4-10. Creating a linked list
List<String> actors =
    Stream.of("Hank Azaria", "Janeane Garofalo", "William H. Macy",
              "Paul Reubens", "Ben Stiller", "Kel Mitchell", "Wes Studi")
        .collect(Collectors.toCollection(LinkedList::new));
The argument to the toCollection method is a collection Supplier, so the constructor reference to LinkedList is provided here. The collect method instantiates a LinkedList and then populates it with the given names.
The Stream interface also contains a method to create an array of objects. There are two overloads of the toArray method:
Object[] toArray();
<A> A[] toArray(IntFunction<A[]> generator);
The former returns an array containing the elements of this stream, but without specifying the type. The latter takes a function that produces a new array of desired type with length equal to the size of the stream, and is easiest to use with an array constructor reference as shown in Example 4-11.
Example 4-11. Creating an array
String[] wannabes =
    Stream.of("The Waffler", "Reverse Psychologist", "PMS Avenger")
        .toArray(String[]::new);
The returned array is of the specified type, whose length matches the number of elements in the stream.
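As a small sketch of the difference between the two overloads, the following hypothetical class calls both on the same elements. The no-argument form is declared to return Object[], so the caller cannot treat the result as a String[] without copying, while the generator form yields the requested array type.

```java
import java.util.Arrays;
import java.util.stream.Stream;

class ToArraySketch {
    // Declared type is Object[], regardless of the element type of the stream
    static Object[] untyped() {
        return Stream.of("The Waffler", "Reverse Psychologist", "PMS Avenger")
            .toArray();
    }

    // The array constructor reference supplies a String[] of the needed length
    static String[] typed() {
        return Stream.of("The Waffler", "Reverse Psychologist", "PMS Avenger")
            .toArray(String[]::new);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(untyped()));
        System.out.println(Arrays.toString(typed()));
    }
}
```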
To transform into a Map, the Collectors.toMap method requires two Function instances: one for the keys and one for the values.
Consider an Actor POJO, which wraps a name and a role. If you have a Set of Actor instances from a given movie, the code in Example 4-12 creates a Map from them.
Example 4-12. Creating a Map
Set<Actor> actors = mysteryMen.getActors();

Map<String, String> actorMap = actors.stream()
    .collect(Collectors.toMap(Actor::getName, Actor::getRole));

actorMap.forEach((key, value) ->
    System.out.printf("%s played %s%n", key, value));
The output is
Janeane Garofalo played The Bowler
Greg Kinnear played Captain Amazing
William H. Macy played The Shoveler
Paul Reubens played The Spleen
Wes Studi played The Sphinx
Kel Mitchell played Invisible Boy
Geoffrey Rush played Casanova Frankenstein
Ben Stiller played Mr. Furious
Hank Azaria played The Blue Raja
Similar code works for ConcurrentMap using the toConcurrentMap method.
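A minimal sketch of the ConcurrentMap variant follows. The Actor class here is a stand-in written for this example (the text only says it wraps a name and a role), and the class and method names are assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

class ConcurrentActorMapSketch {
    // Hypothetical Actor POJO wrapping a name and a role
    static final class Actor {
        private final String name;
        private final String role;

        Actor(String name, String role) {
            this.name = name;
            this.role = role;
        }

        String getName() { return name; }
        String getRole() { return role; }
    }

    // Same two mapping functions as toMap, but the result is a ConcurrentMap
    static ConcurrentMap<String, String> rolesByName(List<Actor> actors) {
        return actors.stream()
            .collect(Collectors.toConcurrentMap(Actor::getName, Actor::getRole));
    }

    public static void main(String[] args) {
        List<Actor> actors = Arrays.asList(
            new Actor("Hank Azaria", "The Blue Raja"),
            new Actor("Ben Stiller", "Mr. Furious"));
        rolesByName(actors).forEach((name, role) ->
            System.out.printf("%s played %s%n", name, role));
    }
}
```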
See Also
Suppliers are discussed in Recipe 2.2. Constructor references are in Recipe 1.3. The toMap method is also demonstrated in Recipe 4.3.
4.3 Adding a Linear Collection to a Map
Discussion
This is a short, very focused use case, but when it comes up in practice the solution here can be quite convenient.
Say you had a List of Book instances, where Book is a simple POJO that has an ID, a name, and a price. An abbreviated form of the Book class is shown in Example 4-13.
Example 4-13. A simple POJO representing a book
public class Book {
    private int id;
    private String name;
    private double price;

    // ... other methods ...
}
Now assume you have a collection of Book instances, as shown in Example 4-14.
Example 4-14. A collection of books
List<Book> books = Arrays.asList(
    new Book(1, "Modern Java Recipes", 49.99),
    new Book(2, "Java 8 in Action", 49.99),
    new Book(3, "Java SE8 for the Really Impatient", 39.99),
    new Book(4, "Functional Programming in Java", 27.64),
    new Book(5, "Making Java Groovy", 45.99),
    new Book(6, "Gradle Recipes for Android", 23.76)
);
In many situations, instead of a List you might want a Map, where the keys are the book IDs and the values are the books themselves. This is really easy to accomplish using the toMap method in Collectors, as shown two different ways in Example 4-15.
Example 4-15. Adding the books to a Map
Map<Integer, Book> bookMap = books.stream()
    .collect(Collectors.toMap(Book::getId, b -> b));

bookMap = books.stream()
    .collect(Collectors.toMap(Book::getId, Function.identity()));
The toMap method in Collectors takes two Function instances as arguments, the first of which generates a key and the second of which generates the value from the provided object. In this case, the key is mapped by the getId method in Book, and the value is the book itself.
The first toMap in Example 4-15 uses the getId method to map to the key and an explicit lambda expression that simply returns its parameter. The second example uses the static identity method in Function to do the same thing.
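The following sketch puts the two forms side by side so they can be run. The Book constructor and getters are assumptions, since Example 4-13 elides them, and the class and method names are invented for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class BookMapSketch {
    // Assumed shape of Book: constructor and getters are not shown in Example 4-13
    static final class Book {
        private final int id;
        private final String name;
        private final double price;

        Book(int id, String name, double price) {
            this.id = id;
            this.name = name;
            this.price = price;
        }

        int getId() { return id; }
        String getName() { return name; }
        double getPrice() { return price; }
    }

    // Value function written as an explicit lambda
    static Map<Integer, Book> byIdWithLambda(List<Book> books) {
        return books.stream()
            .collect(Collectors.toMap(Book::getId, b -> b));
    }

    // Value function supplied by Function.identity()
    static Map<Integer, Book> byIdWithIdentity(List<Book> books) {
        return books.stream()
            .collect(Collectors.toMap(Book::getId, Function.identity()));
    }

    public static void main(String[] args) {
        List<Book> books = Arrays.asList(
            new Book(1, "Modern Java Recipes", 49.99),
            new Book(5, "Making Java Groovy", 45.99));
        System.out.println(byIdWithLambda(books).get(5).getName());
        System.out.println(byIdWithIdentity(books).get(1).getName());
    }
}
```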
See Also
Functions are covered in Recipe 2.4, which also discusses unary and binary operators.
4.4 Sorting Maps
Discussion
The Map interface has always contained a public, static, inner interface called Map.Entry, which represents a key-value pair. The Map.entrySet method returns a Set of Map.Entry elements. Prior to Java 8, the primary methods used in this interface were getKey and getValue, which do what you’d expect.
In Java 8, the static methods in Table 4-1 have been added.
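Method | Description |
---|---|
comparingByKey() | Returns a comparator that compares Map.Entry in natural order on key |
comparingByKey(Comparator<? super K> cmp) | Returns a comparator that compares Map.Entry by key using the given Comparator |
comparingByValue() | Returns a comparator that compares Map.Entry in natural order on value |
comparingByValue(Comparator<? super V> cmp) | Returns a comparator that compares Map.Entry by value using the given Comparator |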
To demonstrate how to use them, Example 4-18 generates a Map of word lengths to number of words in a dictionary. Many Unix systems contain a file at /usr/share/dict/words holding the contents of Webster’s 2nd edition dictionary, with one word per line. The Files.lines method can be used to read a file and produce a stream of strings containing those lines. In this case, the stream will contain each word from the dictionary.
Example 4-18. Reading the dictionary file into a Map
System.out.println("\nNumber of words of each length:");
try (Stream<String> lines = Files.lines(dictionary)) {
    lines.filter(s -> s.length() > 20)
        .collect(Collectors.groupingBy(
            String::length, Collectors.counting()))
        .forEach((len, num) -> System.out.printf("%d: %d%n", len, num));
} catch (IOException e) {
    e.printStackTrace();
}
This example is discussed in Recipe 7.1, but to summarize:
- The file is read inside a try-with-resources block. Stream implements AutoCloseable, so when the try block exits, Java calls the close method on Stream, which then closes the underlying file.
- The filter restricts further processing to only words longer than 20 characters.
- The groupingBy method of Collectors takes a Function as the first argument, representing the classifier. Here, the classifier is the length of each string. If you only provide one argument, the result is a Map where the keys are the values of the classifier and the values are lists of elements that match the classifier. In the case we’re currently examining, groupingBy(String::length) would have produced a Map<Integer,List<String>> where the keys are the word lengths and the values are lists of words of that length.
- In this case, the two-argument version of groupingBy lets you supply another Collector, called a downstream collector, that postprocesses the lists of words. In this case, the return type is Map<Integer,Long>, where the keys are the word lengths and the values are the number of words of that length in the dictionary.
The result is:
Number of words of each length:
21: 82
22: 41
23: 17
24: 5
In other words, there are 82 words of length 21, 41 words of length 22, 17 words of length 23, and 5 words of length 24.
The results show that the map is printed in ascending order of word length. In order to see it in descending order, use Map.Entry.comparingByKey as in Example 4-19.
Example 4-19. Sorting the map by key
System.out.println("\nNumber of words of each length (desc order):");
try (Stream<String> lines = Files.lines(dictionary)) {
    Map<Integer, Long> map = lines.filter(s -> s.length() > 20)
        .collect(Collectors.groupingBy(
            String::length, Collectors.counting()));

    map.entrySet().stream()
        .sorted(Map.Entry.comparingByKey(Comparator.reverseOrder()))
        .forEach(e -> System.out.printf("Length %d: %2d words%n",
            e.getKey(), e.getValue()));
} catch (IOException e) {
    e.printStackTrace();
}
After computing the Map<Integer,Long>, this operation extracts the entrySet and produces a stream. The sorted method on Stream is used to produce a sorted stream using the provided comparator.
In this case, Map.Entry.comparingByKey generates a comparator that sorts by the keys, and using the overload that takes a comparator allows the code to specify that we want it in reverse order.
Note
The sorted method on Stream produces a new, sorted stream that does not modify the source. The original Map is unaffected.
The result is:
Number of words of each length (desc order):
Length 24:  5 words
Length 23: 17 words
Length 22: 41 words
Length 21: 82 words
The other sorting methods listed in Table 4-1 are used similarly.
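As one illustration of that, the sketch below sorts entries by value instead of by key, using Map.Entry.comparingByValue with a reversed comparator. It works on an in-memory word list rather than the dictionary file, and the class name, method name, and sample words are made up for this example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SortByValueSketch {
    // Count words per length, then order the entries by count, largest first
    static List<Map.Entry<Integer, Long>> lengthsByCountDescending(List<String> words) {
        Map<Integer, Long> counts = words.stream()
            .collect(Collectors.groupingBy(String::length, Collectors.counting()));

        return counts.entrySet().stream()
            .sorted(Map.Entry.comparingByValue(Comparator.reverseOrder()))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "c", "to", "of", "cat");
        lengthsByCountDescending(words).forEach(e ->
            System.out.printf("Length %d: %d words%n", e.getKey(), e.getValue()));
    }
}
```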
See Also
An additional example of sorting a Map by keys or values is shown in Appendix A. Downstream collectors are discussed in Recipe 4.6. File operations on the dictionary are part of Recipe 7.1.
4.5 Partitioning and Grouping
Discussion
Say you have a collection of strings. If you want to split them into those with even lengths and those with odd lengths, you can use Collectors.partitioningBy, as in Example 4-20.
Example 4-20. Partitioning strings by even or odd lengths
List<String> strings = Arrays.asList("this", "is", "a", "long", "list", "of",
    "strings", "to", "use", "as", "a", "demo");

Map<Boolean, List<String>> lengthMap = strings.stream()
    .collect(Collectors.partitioningBy(s -> s.length() % 2 == 0));

lengthMap.forEach((key, value) -> System.out.printf("%5s: %s%n", key, value));
//
// false: [a, strings, use, a]
// true: [this, is, long, list, of, to, as, demo]
The signatures of the two partitioningBy methods are:
static <T> Collector<T,?,Map<Boolean,List<T>>> partitioningBy(
    Predicate<? super T> predicate)

static <T,D,A> Collector<T,?,Map<Boolean,D>> partitioningBy(
    Predicate<? super T> predicate,
    Collector<? super T,A,D> downstream)
The return types look rather nasty due to the generics, but you rarely have to deal with them in practice. Instead, the result of either operation becomes the argument to the collect method, which uses the generated collector to create the output map defined by the third generic argument.
The first partitioningBy method takes a single Predicate as an argument. It divides the elements into those that satisfy the Predicate and those that do not. You will always get a Map as a result that has exactly two entries: a list of values that satisfy the Predicate, and a list of values that do not.
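The "exactly two entries" point can be checked with a short sketch. Here every sample string has odd length, so nothing satisfies the predicate, yet the true key should still be present with an empty list. The class and method names are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionKeysSketch {
    static Map<Boolean, List<String>> partitionByEvenLength(List<String> strings) {
        return strings.stream()
            .collect(Collectors.partitioningBy(s -> s.length() % 2 == 0));
    }

    public static void main(String[] args) {
        // Only odd-length strings: the "true" partition has no members
        Map<Boolean, List<String>> result =
            partitionByEvenLength(Arrays.asList("a", "use", "strings"));
        System.out.println("entries: " + result.size());
        System.out.println("true:  " + result.get(true));
        System.out.println("false: " + result.get(false));
    }
}
```

This differs from groupingBy, where a key appears only if at least one element maps to it.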
The overloaded version of the method takes a second argument of type Collector, called a downstream collector. This allows you to postprocess the lists returned by the partition, and is discussed in Recipe 4.6.
The groupingBy method performs an operation like a “group by” statement in SQL. It returns a Map where the keys are the groups and the values are lists of elements in each group.
Note
If you are getting your data from a database, by all means do any grouping operations there. The new API methods are convenience methods for data in memory.
The signature for the groupingBy method is:
static <T,K> Collector<T,?,Map<K,List<T>>> groupingBy(
    Function<? super T,? extends K> classifier)
The Function argument takes each element of the stream and extracts a property to group by. This time, rather than simply partition the strings into two categories, consider separating them by length, as in Example 4-21.
Example 4-21. Grouping strings by length
List<String> strings = Arrays.asList("this", "is", "a", "long", "list", "of",
    "strings", "to", "use", "as", "a", "demo");

Map<Integer, List<String>> lengthMap = strings.stream()
    .collect(Collectors.groupingBy(String::length));

lengthMap.forEach((k, v) -> System.out.printf("%d: %s%n", k, v));
//
// 1: [a, a]
// 2: [is, of, to, as]
// 3: [use]
// 4: [this, long, list, demo]
// 7: [strings]
The keys in the resulting map are the lengths of the strings (1, 2, 3, 4, and 7) and the values are lists of strings of each length.
See Also
An extension of the recipe we just looked at, Recipe 4.6 shows how to postprocess the lists returned by a groupingBy or partitioningBy operation.
4.6 Downstream Collectors
Discussion
In Recipe 4.5, we looked at how to separate elements into multiple categories. The partitioningBy and groupingBy methods return a Map where the keys are the categories (booleans true and false for partitioningBy, objects for groupingBy) and the values are lists of elements that satisfy each category. Recall the example partitioning strings by even and odd lengths, shown in Example 4-20 but repeated in Example 4-22 for convenience.
Example 4-22. Partitioning strings by even or odd lengths
List<String> strings = Arrays.asList("this", "is", "a", "long", "list", "of",
    "strings", "to", "use", "as", "a", "demo");

Map<Boolean, List<String>> lengthMap = strings.stream()
    .collect(Collectors.partitioningBy(s -> s.length() % 2 == 0));

lengthMap.forEach((key, value) -> System.out.printf("%5s: %s%n", key, value));
//
// false: [a, strings, use, a]
// true: [this, is, long, list, of, to, as, demo]
Rather than the actual lists, you may be interested in how many elements fall into each category. In other words, instead of producing a Map whose values are List<String>, you might want just the number of elements in each of the lists. The partitioningBy method has an overloaded version whose second argument is of type Collector:
static <T,D,A> Collector<T,?,Map<Boolean,D>> partitioningBy(
    Predicate<? super T> predicate,
    Collector<? super T,A,D> downstream)
This is where the static Collectors.counting method becomes useful. Example 4-23 shows how it works.
Example 4-23. Counting the partitioned strings
Map<Boolean, Long> numberLengthMap = strings.stream()
    .collect(Collectors.partitioningBy(s -> s.length() % 2 == 0,
        Collectors.counting()));

numberLengthMap.forEach((k, v) -> System.out.printf("%5s: %d%n", k, v));
//
// false: 4
// true: 8
This is called a downstream collector, because it is postprocessing the resulting lists downstream (i.e., after the partitioning operation is completed).
The groupingBy method also has an overload that takes a downstream collector:
/**
* @param <T> the type of the input elements
* @param <K> the type of the keys
* @param <A> the intermediate accumulation type of the downstream collector
* @param <D> the result type of the downstream reduction
* @param classifier a classifier function mapping input elements to keys
* @param downstream a {@code Collector} implementing the downstream reduction
* @return a {@code Collector} implementing the cascaded group-by operation
*/
static <T,K,A,D> Collector<T,?,Map<K,D>> groupingBy(
    Function<? super T,? extends K> classifier,
    Collector<? super T,A,D> downstream)
A portion of the Javadoc comment from the source code is included with the signature, which shows that T is the type of the elements in the stream, K is the key type for the resulting map, A is the intermediate accumulation type of the downstream collector, and D is the result type of the downstream reduction. The ? represents “unknown.” See Appendix A for more details on generics in Java 8.
Several methods in Stream have analogs in the Collectors class. Table 4-2 shows how they align.
Stream | Collectors |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Again, the purpose of a downstream collector is to postprocess the collection of objects produced by an upstream operation, like partitioning or grouping.
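To illustrate two more downstream collectors, the sketch below reuses the strings from Example 4-22. It applies Collectors.mapping (the analog of Stream.map) after a grouping, and Collectors.summingInt (the analog of IntStream.sum) after a partitioning. The class and method names are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class DownstreamAnalogsSketch {
    static final List<String> STRINGS = Arrays.asList("this", "is", "a", "long",
        "list", "of", "strings", "to", "use", "as", "a", "demo");

    // Group by length, then map each grouped string to uppercase
    static Map<Integer, List<String>> upperCaseByLength() {
        return STRINGS.stream()
            .collect(Collectors.groupingBy(String::length,
                Collectors.mapping(s -> s.toUpperCase(), Collectors.toList())));
    }

    // Partition by even or odd length, then total the characters in each partition
    static Map<Boolean, Integer> totalCharsByParity() {
        return STRINGS.stream()
            .collect(Collectors.partitioningBy(s -> s.length() % 2 == 0,
                Collectors.summingInt(String::length)));
    }

    public static void main(String[] args) {
        System.out.println(upperCaseByLength());
        System.out.println(totalCharsByParity());
    }
}
```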
See Also
Recipe 7.1 shows an example of a downstream collector when determining the longest words in a dictionary. Recipe 4.5 discusses the partitioningBy and groupingBy methods in more detail. The whole issue of generics is covered in Appendix A.
4.7 Finding Max and Min Values
Solution
You have several choices: the maxBy and minBy methods on BinaryOperator, the max and min methods on Stream, or the maxBy and minBy utility methods on Collectors.
Discussion
A BinaryOperator is one of the functional interfaces in the java.util.function package. It extends BiFunction and applies when both arguments to the function and the return value are all from the same class.
The BinaryOperator interface adds two static methods:
static <T> BinaryOperator<T> maxBy(Comparator<? super T> comparator)
static <T> BinaryOperator<T> minBy(Comparator<? super T> comparator)
Each of these returns a BinaryOperator that uses the supplied Comparator.
To demonstrate the various ways to get the maximum value from a stream, consider a POJO called Employee that holds three attributes: name, salary, and department, as in Example 4-24.
Example 4-24. Employee POJO
public class Employee {
    private String name;
    private Integer salary;
    private String department;

    // ... other methods ...
}

List<Employee> employees = Arrays.asList(
    new Employee("Cersei", 250_000, "Lannister"),
    new Employee("Jamie", 150_000, "Lannister"),
    new Employee("Tyrion", 1_000, "Lannister"),
    new Employee("Tywin", 1_000_000, "Lannister"),
    new Employee("Jon Snow", 75_000, "Stark"),
    new Employee("Robb", 120_000, "Stark"),
    new Employee("Eddard", 125_000, "Stark"),
    new Employee("Sansa", 0, "Stark"),
    new Employee("Arya", 1_000, "Stark"));

Employee defaultEmployee =
    new Employee("A man (or woman) has no name", 0, "Black and White");
Given a collection of employees, you can use the reduce method on Stream, which takes a BinaryOperator as an argument. The snippet in Example 4-25 shows how to get the employee with the largest salary.
Example 4-25. Using BinaryOperator.maxBy
Optional<Employee> optionalEmp = employees.stream()
    .reduce(BinaryOperator.maxBy(
        Comparator.comparingInt(Employee::getSalary)));

System.out.println("Emp with max salary: " +
    optionalEmp.orElse(defaultEmployee));
The reduce method requires a BinaryOperator. The static maxBy method produces that BinaryOperator based on the supplied Comparator, which in this case compares employees by salary.
This works, but there’s actually a convenience method called max that can be applied directly to the stream:
Optional<T> max(Comparator<? super T> comparator)
Using that method directly is shown in Example 4-26.
Example 4-26. Using Stream.max
optionalEmp = employees.stream()
    .max(Comparator.comparingInt(Employee::getSalary));
The result is the same.
Note that there is also a method called max on the primitive streams (IntStream, LongStream, and DoubleStream) that takes no arguments. Example 4-27 shows that method in action.
Example 4-27. Finding the highest salary
OptionalInt maxSalary = employees.stream()
    .mapToInt(Employee::getSalary)
    .max();

System.out.println("The max salary is " + maxSalary);
In this case, the mapToInt method is used to convert the stream of employees into a stream of integers by invoking the getSalary method, and the returned stream is an IntStream. The max method then returns an OptionalInt.
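Because the result is an OptionalInt rather than an int, the caller usually has to unwrap it. The sketch below, using a plain list of salaries as stand-in data and an invented method name, shows one way to do that with orElse so that an empty stream yields a chosen default.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MaxSalarySketch {
    // Returns the largest salary, or defaultValue when the list is empty
    static int maxOrDefault(List<Integer> salaries, int defaultValue) {
        return salaries.stream()
            .mapToInt(Integer::intValue) // IntStream of salaries
            .max()                       // OptionalInt, empty for an empty stream
            .orElse(defaultValue);
    }

    public static void main(String[] args) {
        System.out.println(maxOrDefault(Arrays.asList(250_000, 1_000_000, 75_000), 0));
        System.out.println(maxOrDefault(Collections.<Integer>emptyList(), -1));
    }
}
```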
There is also a static method called maxBy in the Collectors utility class. You can use it directly here, as in Example 4-28.
Example 4-28. Using Collectors.maxBy
optionalEmp = employees.stream()
    .collect(Collectors.maxBy(
        Comparator.comparingInt(Employee::getSalary)));
This is awkward, however, and can be replaced by the max method on Stream, as shown in Example 4-26. The maxBy method on Collectors is helpful when used as a downstream collector (i.e., when postprocessing a grouping or partitioning operation). The code in Example 4-29 uses groupingBy on Collectors to group the employees by department, and then uses maxBy to determine the employee with the greatest salary in each department.
Example 4-29. Using Collectors.maxBy as a downstream collector
Map<String, Optional<Employee>> map = employees.stream()
    .collect(Collectors.groupingBy(
        Employee::getDepartment,
        Collectors.maxBy(
            Comparator.comparingInt(Employee::getSalary))));

map.forEach((house, emp) ->
    System.out.println(house + ": " + emp.orElse(defaultEmployee)));
The minBy method in each of these classes works the same way.
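As a sketch of the minimum-value counterparts, the following class finds the lowest-paid employee three ways. The Employee class here is a reduced stand-in (constructor, getters, and an int salary are assumptions), and the method names are invented for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

class MinSalarySketch {
    // Assumed stand-in for the Employee POJO in Example 4-24
    static final class Employee {
        private final String name;
        private final int salary;

        Employee(String name, int salary) {
            this.name = name;
            this.salary = salary;
        }

        String getName() { return name; }
        int getSalary() { return salary; }
    }

    static final Comparator<Employee> BY_SALARY =
        Comparator.comparingInt(Employee::getSalary);

    // BinaryOperator.minBy passed to reduce
    static Optional<Employee> minWithReduce(List<Employee> employees) {
        return employees.stream().reduce(BinaryOperator.minBy(BY_SALARY));
    }

    // Stream.min applied directly
    static Optional<Employee> minWithStream(List<Employee> employees) {
        return employees.stream().min(BY_SALARY);
    }

    // Collectors.minBy, most useful as a downstream collector
    static Optional<Employee> minWithCollector(List<Employee> employees) {
        return employees.stream().collect(Collectors.minBy(BY_SALARY));
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
            new Employee("Cersei", 250_000),
            new Employee("Tyrion", 1_000),
            new Employee("Sansa", 0));
        System.out.println(minWithReduce(employees).map(Employee::getName).orElse("none"));
        System.out.println(minWithStream(employees).map(Employee::getName).orElse("none"));
        System.out.println(minWithCollector(employees).map(Employee::getName).orElse("none"));
    }
}
```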
See Also
Functions are discussed in Recipe 2.4. Downstream collectors are in Recipe 4.6.
4.8 Creating Immutable Collections
Discussion
With its focus on parallelization and clarity, functional programming favors using immutable objects wherever possible. The Collections framework, added in Java 1.2, has always had methods to create immutable collections from existing ones, though in a somewhat awkward fashion.
The Collections utility class has methods unmodifiableList, unmodifiableSet, and unmodifiableMap (along with a few other methods with the same unmodifiable prefix), as shown in Example 4-30.
Example 4-30. Unmodifiable methods in the Collections class
static <T> List<T> unmodifiableList(List<? extends T> list)
static <T> Set<T> unmodifiableSet(Set<? extends T> s)
static <K,V> Map<K,V> unmodifiableMap(Map<? extends K,? extends V> m)
In each case, the argument to the method is an existing list, set, or map, and the resulting list, set, or map has the same elements as the argument, but with an important difference: all the methods that could modify the collection, like add or remove, now throw an UnsupportedOperationException.
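A brief sketch of that behavior follows. It also shows a detail worth keeping in mind: the wrapper returned by unmodifiableList is a view of the original list, so changes made through the original reference remain visible. The class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class UnmodifiableViewSketch {
    // Returns true when add is rejected with UnsupportedOperationException
    static boolean addIsRejected(List<String> list) {
        try {
            list.add("extra");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> backing = new ArrayList<>(Arrays.asList("a", "b"));
        List<String> view = Collections.unmodifiableList(backing);

        System.out.println("add rejected: " + addIsRejected(view));

        // Modifying the backing list still changes what the view shows
        backing.add("c");
        System.out.println("view size after backing.add: " + view.size());
    }
}
```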
Prior to Java 8, if you received the individual values as an argument, using a variable argument list, you produced an unmodifiable list or set as shown in Example 4-31.
Example 4-31. Creating unmodifiable lists or sets prior to Java 8
@SafeVarargs
public final <T> List<T> createImmutableListJava7(T... elements) {
    return Collections.unmodifiableList(Arrays.asList(elements));
}

@SafeVarargs
public final <T> Set<T> createImmutableSetJava7(T... elements) {
    return Collections.unmodifiableSet(new HashSet<>(Arrays.asList(elements)));
}
The @SafeVarargs annotation is your promise not to corrupt the input array type. See Appendix A for details.
The idea in each case is to start by taking the incoming values and converting them into a List. You can wrap the resulting list using unmodifiableList, or, in the case of a Set, use the list as the argument to a set constructor before using unmodifiableSet.
In Java 8, with the new Stream API, you can instead take advantage of the static Collectors.collectingAndThen method, as in Example 4-32.
Example 4-32. Creating unmodifiable lists or sets in Java 8
import static java.util.stream.Collectors.collectingAndThen;
import static java.util.stream.Collectors.toList;
import static java.util.stream.Collectors.toSet;

// ... define a class with the following methods ...

@SafeVarargs
public final <T> List<T> createImmutableList(T... elements) {
    return Arrays.stream(elements)
        .collect(collectingAndThen(toList(),
            Collections::unmodifiableList));
}

@SafeVarargs
public final <T> Set<T> createImmutableSet(T... elements) {
    return Arrays.stream(elements)
        .collect(collectingAndThen(toSet(),
            Collections::unmodifiableSet));
}
The Collectors.collectingAndThen method takes two arguments: a downstream Collector and a Function called a finisher. The idea is to stream the input elements, collect them into a List or Set, and then have the unmodifiable function wrap the resulting collection.
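The finisher can be any function applied to the collected result, not only an unmodifiable wrapper. The sketch below (class and method names are hypothetical) pairs toSet with Set::size, and toList with Collections::unmodifiableList, to show the downstream collector and the finisher as separate pieces:

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CollectingAndThenDemo {
    // Downstream collector: toSet(); finisher: Set::size
    static Integer countDistinct(String... words) {
        return Stream.of(words)
                .collect(Collectors.collectingAndThen(
                        Collectors.toSet(), Set::size));
    }

    // Downstream collector: toList(); finisher: Collections::unmodifiableList
    static List<String> unmodifiableCopy(String... words) {
        return Stream.of(words)
                .collect(Collectors.collectingAndThen(
                        Collectors.toList(), Collections::unmodifiableList));
    }

    public static void main(String[] args) {
        System.out.println(countDistinct("a", "b", "a"));
        System.out.println(unmodifiableCopy("x", "y"));
    }
}
```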
Converting a series of input elements into an unmodifiable Map isn’t as clear, partly because it’s not obvious which of the input elements would be assumed to be keys and which would be values. The code shown in Example 4-33 (see footnote 4) creates an immutable Map in a very awkward way, using an instance initializer.
Example 4-33. Creating an immutable Map
Map<String, Integer> map = Collections.unmodifiableMap(
    new HashMap<String, Integer>() {{
        put("have", 1);
        put("the", 2);
        put("high", 3);
        put("ground", 4);
    }});
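One way to sidestep the key/value ambiguity, offered here as an assumption rather than as the recipe's own approach, is to accept Map.Entry arguments and collect them with toMap followed by Collections::unmodifiableMap. The method name createImmutableMap is hypothetical, and this sketch assumes the keys are distinct, since toMap throws an IllegalStateException on duplicates.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Collections;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ImmutableMapDemo {
    // Hypothetical helper: each entry states explicitly which value is the key
    @SafeVarargs
    static <K, V> Map<K, V> createImmutableMap(Map.Entry<K, V>... entries) {
        return Stream.of(entries)
                .collect(Collectors.collectingAndThen(
                        Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue),
                        Collections::unmodifiableMap));
    }

    public static void main(String[] args) {
        Map<String, Integer> map = createImmutableMap(
                new SimpleEntry<>("have", 1),
                new SimpleEntry<>("the", 2),
                new SimpleEntry<>("high", 3),
                new SimpleEntry<>("ground", 4));
        System.out.println(map.get("high"));
    }
}
```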
Readers who are familiar with Java 9, however, already know that this entire recipe can be replaced with a very simple set of factory methods: List.of, Set.of, and Map.of.
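For comparison, a brief sketch of those factory methods follows. It requires Java 9 or later, and the class and method names are illustrative only.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

class FactoryMethodsDemo {
    static List<String> words() {
        return List.of("have", "the", "high", "ground");
    }

    static Set<Integer> numbers() {
        return Set.of(1, 2, 3);
    }

    // Map.of takes alternating keys and values
    static Map<String, Integer> counts() {
        return Map.of("have", 1, "the", 2, "high", 3, "ground", 4);
    }

    public static void main(String[] args) {
        System.out.println(words());
        System.out.println(counts().get("high"));
    }
}
```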
See Also
Recipe 10.3 shows the new factory methods in Java 9 that automatically create immutable collections.
4.9 Implementing the Collector Interface
Solution
Provide lambda expressions or method references for the Supplier, accumulator, combiner, and finisher functions used by the Collector.of factory methods, along with any desired characteristics.
Discussion
The utility class java.util.stream.Collectors has several convenient static methods whose return type is Collector. Examples are toList, toSet, toMap, and even toCollection, each of which is illustrated elsewhere in this book. Instances of classes that implement Collector are sent as arguments to the collect method on Stream. For instance, in Example 4-34, the method accepts string arguments and returns a List containing only those whose length is even.
Example 4-34. Using collect to return a List
public List<String> evenLengthStrings(String... strings) {
    return Stream.of(strings)
        .filter(s -> s.length() % 2 == 0)
        .collect(Collectors.toList());
}
If you need to write your own collectors, however, the procedure is a bit more complicated. Collectors use five functions that work together to accumulate entries into a mutable container and optionally transform the result. The five functions are called supplier, accumulator, combiner, finisher, and characteristics.
Consider the characteristics function first. It returns an immutable Set of elements of the enum type Collector.Characteristics. The three possible values are CONCURRENT, IDENTITY_FINISH, and UNORDERED. CONCURRENT means that the result container can support the accumulator function being called concurrently on it from multiple threads. UNORDERED says that the collection operation does not need to preserve the encounter order of the elements. IDENTITY_FINISH means that the finishing function returns its argument without any changes.
Note that you don’t have to provide any characteristics if the defaults are what you want.
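To see characteristics on a concrete collector, the sketch below builds one with Collector.of and declares it UNORDERED. The class and method names are hypothetical. Per the Javadoc, the overload of Collector.of that takes no finisher adds IDENTITY_FINISH on its own. The sets printed for the built-in collectors are an implementation detail of the JDK, so no output is claimed for them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CharacteristicsDemo {
    // Hypothetical collector that declares it does not care about encounter order
    static Collector<String, ?, List<String>> unorderedListCollector() {
        return Collector.of(
                ArrayList::new,
                List::add,
                (left, right) -> { left.addAll(right); return left; },
                Collector.Characteristics.UNORDERED);
    }

    public static void main(String[] args) {
        System.out.println(unorderedListCollector().characteristics());
        // Built-in collectors report their own sets (JDK implementation detail)
        System.out.println(Collectors.toList().characteristics());
        System.out.println(Collectors.toSet().characteristics());
    }
}
```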
The purpose of each of the required methods is:
supplier() - Create the accumulator container using a Supplier<A>
accumulator() - Add a single new data element to the accumulator container using a BiConsumer<A,T>
combiner() - Merge two accumulator containers using a BinaryOperator<A>
finisher() - Transform the accumulator container into the result container using a Function<A,R>
characteristics() - A Set<Collector.Characteristics> chosen from the enum values
As usual, an understanding of the functional interfaces defined in the java.util.function package makes everything clearer. A Supplier is used to create the container where temporary results are accumulated. A BiConsumer adds a single element to the accumulator. A BinaryOperator means that both input types and the output type are the same, so here the idea is to combine two accumulators into one. A Function finally transforms the accumulator into the desired result container.
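To make the five methods concrete, here is a sketch of a class that implements the Collector interface directly. The class name SortedSetCollector and its collect helper are hypothetical, and returning an empty characteristics set is just one reasonable choice.

```java
import java.util.Collections;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Stream;

// Hypothetical collector: gathers strings into an unmodifiable SortedSet
class SortedSetCollector
        implements Collector<String, TreeSet<String>, SortedSet<String>> {

    @Override
    public Supplier<TreeSet<String>> supplier() {
        return TreeSet::new;            // create an empty accumulator container
    }

    @Override
    public BiConsumer<TreeSet<String>, String> accumulator() {
        return TreeSet::add;            // add one element to the container
    }

    @Override
    public BinaryOperator<TreeSet<String>> combiner() {
        // merge two containers into one
        return (left, right) -> { left.addAll(right); return left; };
    }

    @Override
    public Function<TreeSet<String>, SortedSet<String>> finisher() {
        return Collections::unmodifiableSortedSet;  // wrap the final container
    }

    @Override
    public Set<Characteristics> characteristics() {
        return Collections.emptySet();  // no special characteristics declared
    }

    // Convenience method showing the collector in use
    static SortedSet<String> collect(String... strings) {
        return Stream.of(strings).collect(new SortedSetCollector());
    }
}
```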
Each of these methods is invoked during the collection process, which is triggered by (for example) the collect method on Stream. Conceptually, the collection process is equivalent to the (generic) code shown in Example 4-35, taken from the Javadocs.
Example 4-35. How the Collector methods are used
A container = collector.supplier().get();
for (T t : data) {
    collector.accumulator().accept(container, t);
}
return collector.finisher().apply(container);
Conspicuous by its absence is any mention of the combiner function. If your stream is sequential, you don’t need it; the algorithm proceeds as described. If, however, you are operating on a parallel stream, then the work is divided into multiple regions, each of which produces its own accumulator container. The combiner is then used during the join process to merge the accumulator containers together into a single one before applying the finisher function.
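The role of the combiner can be imitated by hand. The sketch below is a simplification under stated assumptions: it splits the data into exactly two regions and runs them one after the other, whereas a real parallel stream decides the number of regions itself and processes them on multiple threads. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CombinerDemo {
    // Accumulate each region into its own container, merge, then finish
    static <T, A, R> R collectInTwoParts(Collector<T, A, R> collector,
                                         List<T> firstRegion,
                                         List<T> secondRegion) {
        A left = collector.supplier().get();
        for (T t : firstRegion) {
            collector.accumulator().accept(left, t);
        }
        A right = collector.supplier().get();
        for (T t : secondRegion) {
            collector.accumulator().accept(right, t);
        }
        A merged = collector.combiner().apply(left, right);
        return collector.finisher().apply(merged);
    }

    static List<String> demo() {
        Collector<String, ?, List<String>> toList = Collectors.toList();
        return collectInTwoParts(toList,
                Arrays.asList("a", "b"),
                Arrays.asList("c", "d"));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```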
A code sample, similar to that shown in Example 4-34, is given in Example 4-36.
Example 4-36. Using collect to return an unmodifiable SortedSet
public SortedSet<String> oddLengthStringSet(String... strings) {
    Collector<String, ?, SortedSet<String>> intoSet =
        Collector.of(TreeSet<String>::new,
                     SortedSet::add,
                     (left, right) -> {
                         left.addAll(right);
                         return left;
                     },
                     Collections::unmodifiableSortedSet);
    return Stream.of(strings)
        .filter(s -> s.length() % 2 != 0)
        .collect(intoSet);
}
The result will be a sorted, unmodifiable set of strings, ordered lexicographically.
This example used one of the two overloaded versions of the static of method for producing collectors, whose signatures are:
static <T,A,R> Collector<T,A,R> of(Supplier<A> supplier,
                                   BiConsumer<A,T> accumulator,
                                   BinaryOperator<A> combiner,
                                   Function<A,R> finisher,
                                   Collector.Characteristics... characteristics)

static <T,R> Collector<T,R,R> of(Supplier<R> supplier,
                                 BiConsumer<R,T> accumulator,
                                 BinaryOperator<R> combiner,
                                 Collector.Characteristics... characteristics)
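The second overload omits the finisher, so the accumulator container is itself the result. A minimal sketch follows, with hypothetical class and method names:

```java
import java.util.TreeSet;
import java.util.stream.Collector;
import java.util.stream.Stream;

class ThreeArgCollectorDemo {
    static TreeSet<String> toTreeSet(String... strings) {
        // No finisher: the TreeSet used for accumulation is returned as is
        Collector<String, TreeSet<String>, TreeSet<String>> intoTreeSet =
                Collector.of(
                        TreeSet::new,
                        TreeSet::add,
                        (left, right) -> { left.addAll(right); return left; });
        return Stream.of(strings).collect(intoTreeSet);
    }

    public static void main(String[] args) {
        System.out.println(toTreeSet("this", "is", "a", "list"));
    }
}
```

Unlike Example 4-36, the set returned here is still modifiable, because nothing wraps it after collection.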
Given the convenience methods in the Collectors class that produce collectors for you, you rarely need to make one of your own this way. Still, it’s a useful skill to have, and once again illustrates how the functional interfaces in the java.util.function package come together to create interesting objects.
See Also
The finisher function is an example of a downstream collector, discussed further in Recipe 4.6. The Supplier, Function, and BinaryOperator functional interfaces are discussed in various recipes in Chapter 2. The static utility methods in Collectors are discussed in Recipe 4.2.
1 Ty Webb, of course, is from the movie Caddyshack. Judge Smails: “Ty, what did you shoot today?” Ty Webb: “Oh, Judge, I don’t keep score.” Smails: “Then how do you measure yourself with other golfers?” Webb: “By height.” Adding a sort by height is left to the reader as an easy exercise.
2 The names in this recipe come from Mystery Men, one of the great overlooked movies of the ’90s. (Mr. Furious: “Lance Hunt is Captain Amazing.” The Shoveler: “Lance Hunt wears glasses. Captain Amazing doesn’t wear glasses.” Mr. Furious: “He takes them off when he transforms.” The Shoveler: “That doesn’t make any sense! He wouldn’t be able to see!”)
3 For the record, those five longest words are formaldehydesulphoxylate, pathologicopsychological, scientificophilosophical, tetraiodophenolphthalein, and thyroparathyroidectomize. Good luck with that, spell checker.
4 From Carl Martensen’s blog post “Java 9’s Immutable Collections Are Easier To Create But Use With Caution”.