Chapter 4. Pattern Matching
Scala’s pattern matching provides deep inspection and decomposition of objects in a variety of ways. It’s one of my favorite features in Scala. For your own types, you can follow a protocol that allows you to control the visibility of internal state and how to expose it to users. The terms extraction and destructuring are sometimes used for this capability.
Pattern matching can be used in several code contexts, as we’ve already seen in “A Sample Application” and “Partial Functions”. We’ll start with a change in Scala 3 for better type safety, followed by a quick tour of common and straightforward usage examples, then explore more advanced scenarios. We’ll cover a few more pattern-matching features in later chapters, once we have the background to understand them.
Safer Pattern Matching with Matchable
Let’s begin with an important change in Scala 3’s type system that is designed to make compile-time checking of pattern-matching expressions more robust.
Scala 3 introduced an immutable wrapper around Arrays called scala.IArray. Arrays in Java are mutable, so this is intended as a safer way to work with them. In fact, IArray is a type alias for Array to avoid the overhead of wrapping arrays, which means that pattern matching introduces a hole in the abstraction. Using the Scala 3.0 REPL without the -source:future setting, observe the following:
// src/script/scala/progscala3/patternmatching/Matchable.scala
scala> val iarray = IArray(1,2,3,4,5)
     | iarray match
     |   case a: Array[Int] => a(2) = 300   // Scala 3 warning!!
     | println(iarray)
val iarray: opaques.IArray[Int] = Array(1, 2, 300, 4, 5)
There are other examples where this can occur. To close this loophole, the Scala type system now has a trait called Matchable. It fits into the type hierarchy as follows:
abstract class Any:
  def isInstanceOf
  def getClass
  def asInstanceOf      // Cast to a new type: myAny.asInstanceOf[String]
  def ==
  def !=
  def ##                // Alias for hashCode
  def equals
  def hashCode
  def toString

trait Matchable extends Any

class AnyVal extends Any, Matchable
class AnyRef extends Any, Matchable
Note that Matchable is a marker trait, as it currently has no members. However, a future release of Scala may move getClass and isInstanceOf to Matchable, as they are closely associated with pattern matching.
The intent is that pattern matching can only occur on values of type Matchable, not Any. Since almost all types are subtypes of AnyRef and AnyVal, they already satisfy this constraint, but attempting to pattern match on the following types will trigger warnings in future Scala 3 releases or when using -source:future with Scala 3.0:
- Type Any. Use Matchable instead, when possible.
- Type parameters and abstract types without bounds. Add <: Matchable.
- Type parameters and abstract types bounded only by universal traits. Add <: Matchable.
We’ll discuss universal traits in “Value Classes”. We can ignore them for now. As an example of the second bullet, consider the following method definition in a REPL session with the -source:future flag restored:
scala> def examine[T](seq: Seq[T]): Seq[String] = seq map {
     |   case i: Int => s"Int: $i"
     |   case other => s"Other: $other"
     | }
2 |  case i: Int => s"Int: $i"
  |          ^^^
  |          pattern selector should be an instance of Matchable,
  |          but it has unmatchable type T instead
Now the type parameter T needs a bound:
scala> def examine[T <: Matchable](seq: Seq[T]): Seq[String] = seq map {
     |   case i: Int => s"Int: $i"
     |   case other => s"Other: $other"
     | }
def examine[T <: Matchable](seq: Seq[T]): Seq[String]

scala> val seq = Seq(1, "two", 3, 4.4)
     | examine(seq)
val seq: Seq[Matchable] = List(1, two, 3, 4.4)
val res0: Seq[String] = List(Int: 1, Other: two, Int: 3, Other: 4.4)
Notice the inferred common supertype of the values in the sequence, seq. In Scala 2, it would be Any.
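The first bullet in the list above (prefer Matchable over Any) can be sketched the same way. The method name describe and its clauses are hypothetical, chosen only for illustration. Declaring the parameter as Matchable instead of Any lets it be used as a pattern selector without a warning:

```scala
// Hypothetical example: a parameter typed Matchable (not Any) is a legal pattern selector.
def describe(x: Matchable): String = x match
  case i: Int    => s"Int: $i"
  case s: String => s"String: $s"
  case other     => s"Other: $other"

val described = Seq(describe(1), describe("two"), describe(4.4))
```

Values of the usual types, which are subtypes of AnyVal or AnyRef, can still be passed as arguments because they are all Matchable.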
Back to IArray, the example at the beginning now triggers a warning because the IArray alias is not bounded by Matchable:
scala> val iarray = IArray(1,2,3,4,5)
     | iarray match
     |   case a: Array[Int] => a(2) = 300
     |
3 |  case a: Array[Int] => a(2) = 300
  |       ^^^^^^^^^^
  |       pattern selector should be an instance of Matchable,
  |       but it has unmatchable type opaques.IArray[Int] instead
IArray is considered an abstract type by the compiler. Abstract types are not bounded by Matchable, which is why we now get the warning we want.
This is a significant change that will break a lot of existing code. Hence, warnings will only be issued starting in a future Scala 3 release or when compiling with -source:future.
Values, Variables, and Types in Matches
Let’s cover several kinds of matches. The following example matches on specific values and on all values of specific types, and it shows one way of writing a default clause that matches anything:
// src/script/scala/progscala3/patternmatching/MatchVariable.scala
val seq = Seq(1, 2, 3.14, 5.5F, "one", "four", true, (6, 7))
val result = seq.map {
  case 1 => "int 1"
  case i: Int => s"other int: $i"
  case d: (Double | Float) => s"a double or float: $d"
  case "one" => "string one"
  case s: String => s"other string: $s"
  case (x, y) => s"tuple: ($x, $y)"
  case unexpected => s"unexpected value: $unexpected"
}
assert(result == Seq(
  "int 1", "other int: 2",
  "a double or float: 3.14", "a double or float: 5.5",
  "string one", "other string: four",
  "unexpected value: true",
  "tuple: (6, 7)"))
Because of the mix of values, seq is of type Seq[Matchable].
If one or more case clauses specify particular values of a type, they need to occur before more general clauses that just match on the type. So we first check if the anonymous value is an Int equal to 1. If so, we simply return the string "int 1". If the value is another Int value, the next clause matches. In this case, the value is cast to Int and assigned to the variable i, which is used to construct a string.
Match on any Double or Float value. Using | is convenient when two or more cases are handled the same way. However, for this to work, the logic after the => must be type compatible for all matched types. In this case, the interpolated string works fine.
Two case clauses for strings.
Match on a two-element tuple where the elements are of any type, and extract the elements into the variables x and y.
Match all other inputs. The variable unexpected has an arbitrary name. Because no type declaration is given, Matchable is inferred. This functions as the default clause. The Boolean value from the sequence seq is assigned to unexpected.
We passed a partial function to Seq.map(). Recall that the literal syntax requires case statements, and we have put the partial function inside parentheses or braces to pass it to map. However, this function is effectively total, because the last clause matches any Matchable. (It would be Any in Scala 2.) This means it wouldn’t match instances of the few other types that aren’t Matchables, like IArray, but these types are no longer candidates for pattern matching. From now on, I’ll just call partial functions like this total.
Avoid clauses that match on specific floating-point literal values. Rounding errors mean two values that you might expect to be the same may actually differ.
Matches are eager, so more specific clauses must appear before less specific clauses. Otherwise, the more specific clauses will never get the chance to match. So the clauses matching on particular values of types must come before clauses matching on the type (i.e., on any value of the type). The default clause shown must be the last one. Fortunately, the compiler will issue an “Unreachable case” warning if you make this mistake. Try switching the two Int clauses to see what happens.
Match clauses are expressions, so they return a value. In this example, all clauses return strings, so the return type of the match expression (and the partial function) is String. Hence, the map call returns a Seq[String] (a List[String] at runtime). The compiler infers the least upper bound, the closest supertype, for the types of values returned by all the case clauses.
This is a contrived example, of course. When designing pattern-matching expressions, be wary of relying on a default case clause. Under what circumstances would “none of the above” be the correct answer? It may indicate that your design could be refined so you know more precisely all the possible matches that might occur, like a sealed type hierarchy or enum, which we’ll discuss further. In fact, as we go through this chapter, you’ll see more realistic scenarios and no default clauses.
Here is a similar example that passes an anonymous function to map, rather than a partial function, plus some other changes:
// src/script/scala/progscala3/patternmatching/MatchVariable2.scala
val seq2 = Seq(1, 2, 3.14, "one", (6, 7))
val result2 = seq2.map { x => x match
  case _: Int => s"int: $x"
  case _ => s"unexpected value: $x"
}
assert(result2 == Seq(
  "int: 1", "int: 2", "unexpected value: 3.14",
  "unexpected value: one", "unexpected value: (6,7)"))
Use _ for the variable name, meaning we don’t capture it.
Catch-all clause that also uses x instead of capturing to a new variable.
The first case clause doesn’t need to capture the variable because it doesn’t exploit the fact that the value is an Int. For example, it doesn’t call Int methods. Otherwise, just using x wouldn’t be sufficient, as it has type Matchable.
Once again, braces are used around the whole anonymous function, but the optional braces syntax is used inside the function for the match expression. In general, using a partial function is more concise because we eliminate the need for x => x match.
Tip
When you use pattern matching with any of the collection methods, like map and foreach, use a partial function.
There are a few rules and gotchas to keep in mind for case clauses. The compiler assumes that a term that begins with a lowercase letter is the name of a variable that will hold a matched value. If the term starts with a capital letter, it will expect to find a definition already in scope.
This lowercase rule can cause surprises, as shown in the following example. The intention is to pass some value to a method, then see if that value matches an element in the collection:
// src/script/scala/progscala3/patternmatching/MatchSurprise.scala
def checkYBad(y: Int): Seq[String] =
  for x <- Seq(99, 100, 101)
  yield x match
    case y => "found y!"
    case i: Int => "int: "+i    // Unreachable case!
The first case clause is supposed to match on the value passed in as y, but this is what we actually get:
def checkYBad(y: Int): Seq[String]
10 |    case i: Int => "int: "+i    // Unreachable case!
   |         ^^^^^^
   |         Unreachable case
We treat warnings as errors in our build.sbt settings, but if we didn’t, then calling checkYBad(100) would return found y! for all three numbers.
The case y clause means “match anything because there is no type declaration, and assign it to this new variable named y.” The y in the clause is not interpreted as a reference to the method parameter y. Rather, it shadows that definition. Hence, this clause is actually a default, match-all clause and we will never reach the second case clause.
There are two solutions. First, we could use capital Y, although it looks odd to have a method parameter start with a capital letter:
def checkYGood1(Y: Int): Seq[String] =
  for x <- Seq(99, 100, 101)
  yield x match
    case Y => "found y!"
    case i: Int => "int: "+i
Calling checkYGood1(100) returns List(int: 99, found y!, int: 101).
The second solution is to use backticks to indicate we really want to match against the value held by y:
def checkYGood2(y: Int): Seq[String] =
  for x <- Seq(99, 100, 101)
  yield x match
    case `y` => "found y!"
    case i: Int => "int: "+i
Warning
In case clauses, a term that begins with a lowercase letter is assumed to be the name of a new variable that will hold an extracted value. To refer to a previously defined variable, enclose it in backticks or start the name with a capital letter.
Finally, most match expressions should be exhaustive:
// src/script/scala/progscala3/patternmatching/MatchExhaustive.scala
scala> val seq3 = Seq(Some(1), None, Some(2), None)
val seq3: Seq[Option[Int]] = List(Some(1), None, Some(2), None)

scala> val result3 = seq3.map {
     |   case Some(i) => s"int $i"
     | }
5 |  case Some(i) => s"int $i"
  |  ^
  |  match may not be exhaustive.
  |
  |  It would fail on pattern case: None
The compiler knows that the elements of seq3 are of type Option[Int], which could include None elements. At runtime, a MatchError will be thrown if a None is encountered. The fix is straightforward:
// src/script/scala/progscala3/patternmatching/MatchExhaustiveFix.scala
scala> val result3 = seq3.map {
     |   case Some(i) => s"int $i"
     |   case None => ""
     | }
val result3: Seq[String] = List(int 1, "", int 2, "")
“Problems in Pattern Bindings” will discuss additional points about exhaustive matching.
Matching on Sequences
Let’s examine the classic idiom for iterating through a Seq using pattern matching and recursion and, along the way, learn some useful fundamentals about sequences:
// src/script/scala/progscala3/patternmatching/MatchSeq.scala
def seqToString[T](seq: Seq[T]): String = seq match
  case head +: tail => s"($head +: ${seqToString(tail)})"
  case Nil => "Nil"
Define a recursive method that constructs a String from a Seq[T] for some type T, which will be inferred from the sequence passed in. The body is a single match expression.
There are two match clauses and they are exhaustive. The first matches on any nonempty Seq, extracting the first element as head and the rest of the Seq as tail. These are common names for the parts of a Seq, which has head and tail methods. However, here these terms are used as variable names. The body of the clause constructs a String with the head followed by +: followed by the result of calling seqToString on the tail, all surrounded by parentheses, (). Note this method is recursive, but not tail recursive.
The only other possible case is an empty Seq. We can use the special case object for an empty List, Nil, to match all the empty cases. This clause terminates the recursion. Note that any type of Seq can always be interpreted as terminating with a Nil, or we could use an empty instance of the actual type (examples follow).
The operator +: is the cons (construction) operator for sequences. Recall that methods that end with a colon (:) bind to the right, toward the Seq tail. However, +: in this case clause is actually an object named +:, so we have a nice syntax symmetry between construction of sequences, like val seq = 1 +: 2 +: Nil, and deconstruction, like case 1 +: 2 +: Nil => …. We’ll see later in this chapter how an object is used to implement deconstruction.
These two clauses are mutually exclusive, so they could be written with the Nil clause first.
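Since seqToString is recursive but not tail recursive, very long sequences could exhaust the stack. The following is only a sketch of one possible tail-recursive variant. The names seqToStringTR and loop, and the use of an accumulator with a depth counter, are assumptions made for illustration and are not part of the original example. The empty case uses a wildcard rather than Nil, which is equivalent here because the first clause already handles every nonempty sequence:

```scala
import scala.annotation.tailrec

// Hypothetical tail-recursive variant of seqToString (names are illustrative only).
def seqToStringTR[T](seq: Seq[T]): String =
  @tailrec
  def loop(remaining: Seq[T], acc: StringBuilder, depth: Int): String =
    remaining match
      case head +: tail =>
        // Emit the opening part now and remember how many ")" to close later.
        loop(tail, acc.append(s"($head +: "), depth + 1)
      case _ =>
        // Empty sequence: terminate with Nil and close all open parentheses.
        acc.append("Nil").append(")" * depth).toString
  loop(seq, new StringBuilder, 0)

val viaLoop = seqToStringTR(Seq(1, 2, 3))
```

The @tailrec annotation asks the compiler to reject the method if the recursive call is not in tail position, so the sketch fails to compile rather than silently losing the optimization.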
Now let’s try it with various empty and nonempty sequences:
scala> seqToString(Seq(1, 2, 3))
     | seqToString(Seq.empty[Int])
val res0: String = (1 +: (2 +: (3 +: Nil)))
val res1: String = Nil

scala> seqToString(Vector(1, 2, 3))
     | seqToString(Vector.empty[Int])
val res2: String = (1 +: (2 +: (3 +: Nil)))
val res3: String = Nil

scala> seqToString(Map("one" -> 1, "two" -> 2, "three" -> 3).toSeq)
     | seqToString(Map.empty[String,Int].toSeq)
val res4: String = ((one,1) +: ((two,2) +: ((three,3) +: Nil)))
val res5: String = Nil
Note the common idiom for constructing an empty collection, like Vector.empty[Int]. The empty methods are in the companion objects.
Map is not a subtype of Seq because it doesn’t guarantee a particular order when you iterate over it. Calling Map.toSeq creates a sequence of key-value tuples that happen to be in insertion order, which is a side effect of the implementation for small Maps and not true for arbitrary maps. The nonempty Map output shows parentheses from the tuples as well as the parentheses added by seqToString.
Note the output for the nonempty Seq (actually List) and Vector. They show the hierarchical structure implied by a linked list, with a head and a tail:
(1 +: (2 +: (3 +: Nil)))
So we process sequences with just two case clauses and recursion. This implies something fundamental about all sequences: they are either empty or not. That sounds trite, but once you recognize fundamental structural patterns like this, it gives you a surprisingly general tool for “divide and conquer.” The idiom used by seqToString is widely reusable.
To demonstrate the construction versus destruction symmetry, we can copy and paste the output of the previous examples to reconstruct the original objects. However, we have to add quotes around strings:
scala> val is = (1 +: (2 +: (3 +: Nil)))
val is: List[Int] = List(1, 2, 3)

scala> val kvs = (("one",1) +: (("two",2) +: (("three",3) +: Nil)))
val kvs: List[(String, Int)] = List((one,1), (two,2), (three,3))

scala> val map = Map(kvs*)
val map: Map[String, Int] = Map(one -> 1, two -> 2, three -> 3)
The Map.apply method expects a repeated parameter list of two-element tuples. In order to use the sequence kvs, we use the * idiom so the compiler converts the sequence to a repeated parameter list.
Try removing the parentheses that we added in the preceding string output.
For completeness, there is an analog of +: that can be used to process the sequence elements in reverse, :+:
// src/script/scala/progscala3/patternmatching/MatchReverseSeq.scala
scala> def reverseSeqToString[T](l: Seq[T]): String = l match
     |   case prefix :+ end => s"(${reverseSeqToString(prefix)} :+ $end)"
     |   case Nil => "Nil"

scala> reverseSeqToString(Vector(1, 2, 3, 4, 5))
val res6: String = (((((Nil :+ 1) :+ 2) :+ 3) :+ 4) :+ 5)
Note that Nil comes first this time in the output. A Vector is used for the input sequence to remind you that accessing a nonhead element is effectively O(1) for a Vector, but O(N) for a List of size N. Hence, reverseSeqToString is O(N) for a Vector of size N and O(N²) for a List of size N.
As before, you could use this output to reconstruct the collection:
scala> val revList1 = (((((Nil :+ 1) :+ 2) :+ 3) :+ 4) :+ 5)
val revList1: List[Int] = List(1, 2, 3, 4, 5)   // but List is returned!

scala> val revList2 = Nil :+ 1 :+ 2 :+ 3 :+ 4 :+ 5   // unnecessary () removed
val revList2: List[Int] = List(1, 2, 3, 4, 5)

scala> val revList3 = Vector.empty[Int] :+ 1 :+ 2 :+ 3 :+ 4 :+ 5
val revList3: Vector[Int] = Vector(1, 2, 3, 4, 5)   // how to get a Vector
Pattern Matching on Repeated Parameters
Speaking of repeated parameter lists, you can also use them in pattern matching:
// src/script/scala/progscala3/patternmatching/MatchRepeatedParams.scala
scala> def matchThree(seq: Seq[Int]) = seq match
     |   case Seq(h1, h2, rest*) =>   // same as h1 +: h2 +: rest => ...
     |     println(s"head 1 = $h1, head 2 = $h2, the rest = $rest")
     |   case _ => println(s"Other! $seq")

scala> matchThree(Seq(1,2,3,4))
     | matchThree(Seq(1,2,3))
     | matchThree(Seq(1,2))
     | matchThree(Seq(1))
head 1 = 1, head 2 = 2, the rest = List(3, 4)
head 1 = 1, head 2 = 2, the rest = List(3)
head 1 = 1, head 2 = 2, the rest = List()
Other! List(1)
We see another way to match on sequences. If we don’t need rest, we can use the placeholder, _, that is case Seq(h1, h2, _*). In Scala 2, rest* was written rest @ _*. The Scala 3 syntax is more consistent with other uses of repeated parameters.
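As a small sketch of the placeholder form, the following hypothetical method firstTwo (a name invented here for illustration) returns the first two elements when they exist and ignores whatever follows:

```scala
// Hypothetical helper: keep the first two elements, ignore the rest with _*.
def firstTwo(seq: Seq[Int]): Option[(Int, Int)] = seq match
  case Seq(h1, h2, _*) => Some((h1, h2))  // two or more elements
  case _               => None            // zero or one element

val pair  = firstTwo(Seq(1, 2, 3, 4))
val empty = firstTwo(Seq(1))
```

Returning an Option avoids the println side effects used in matchThree, which makes the result easier to check.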
Matching on Tuples
Tuples are also easy to match on, using their literal syntax:
// src/script/scala/progscala3/patternmatching/MatchTuple.scala
val langs = Seq(
  ("Scala",   "Martin", "Odersky"),
  ("Clojure", "Rich",   "Hickey"),
  ("Lisp",    "John",   "McCarthy"))

val results = langs.map {
  case ("Scala", _, _) => "Scala"
  case (lang, first, last) => s"$lang, creator $first $last"
}
Match a three-element tuple where the first element is the string “Scala” and we ignore the second and third arguments.
Match any three-element tuple, where the elements could be any type, but they are inferred to be Strings due to the input langs. Extract the elements into variables lang, first, and last.
A tuple can be taken apart into its constituent elements. We can match on literal values within the tuple, at any positions we want, and we can ignore elements we don’t care about.
In Scala 3, tuples have enhanced features to make them more like linked lists, but where the specific type of each element is preserved. Compare the following example to the preceding implementation of seqToString, where *: replaces +: as the operator:
scala> langs.map {
     |   case "Scala" *: first *: last *: EmptyTuple =>
     |     s"Scala -> $first -> $last"
     |   case lang *: rest => s"$lang -> $rest"
     | }
val res0: Seq[String] = List(Scala -> Martin -> Odersky,
  Clojure -> (Rich,Hickey), Lisp -> (John,McCarthy))
The analog of Nil for tuples is EmptyTuple. The second case clause can handle any tuple with one or more elements. Let’s create a new list by prepending EmptyTuple itself and a one-element tuple:
scala> val l2 = EmptyTuple +: ("Indo-European" *: EmptyTuple) +: langs
val l2: Seq[Tuple] = List((), (Indo-European,),
  (Scala,Martin,Odersky), (Clojure,Rich,Hickey), (Lisp,John,McCarthy))

scala> l2.map {
     |   case "Scala" *: first *: last *: EmptyTuple =>
     |     s"Scala -> $first -> $last"
     |   case lang *: rest => s"$lang -> $rest"
     |   case EmptyTuple => EmptyTuple.toString
     | }
val res1: Seq[String] = List((), Indo-European -> (),
  Scala -> Martin -> Odersky, Clojure -> (Rich,Hickey), Lisp -> (John,McCarthy))
You might think that ("Indo-European") would be enough to construct a one-element tuple, but the compiler just interprets the parentheses as unnecessary wrappers around the string! ("Indo-European" *: EmptyTuple) does the trick.
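A short sketch of the difference follows. The value names are invented for illustration, and Tuple1 from the standard library is shown as another way to build the same one-element tuple:

```scala
// Parentheses alone do not make a tuple; this is just a String.
val notATuple = ("Indo-European")

// Two ways to build a one-element tuple.
val viaCons   = "Indo-European" *: EmptyTuple
val viaTuple1 = Tuple1("Indo-European")

val sameTuple = viaCons == viaTuple1
```

Both constructions are expected to produce equal values, because a one-element tuple built with *: is represented as a Tuple1.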
Just as we can construct pairs (two-element tuples) with ->, we can deconstruct them that way too:
// src/script/scala/progscala3/patternmatching/MatchPair.scala
val langs2 = Seq("Scala" -> "Odersky", "Clojure" -> "Hickey")

val results = langs2.map {
  case "Scala" -> _ => "Scala"
  case lang -> last => s"$lang: $last"
}
assert(results == Seq("Scala", "Clojure: Hickey"))
Match on a tuple with the string “Scala” as the first element and anything as the second element.
Match on any other, two-element tuple.
Recall that I said +: in patterns is actually an object in the scala.collection package. Similarly, there is an *: object and a type alias for -> to Tuple2.type (effectively the companion object for the Tuple2 case class) in the scala package.
Parameter Untupling
Consider this example using tuples:
// src/script/scala/progscala3/patternmatching/ParameterUntupling.scala
val tuples = Seq((1,2,3), (4,5,6), (7,8,9))
val counts1 = tuples.map {    // result: List(6, 15, 24)
  case (x, y, z) => x + y + z
}
A disadvantage of the case syntax inside the anonymous function is the implication that it’s not exhaustive, when we know it is for the tuples sequence. It is also a bit inconvenient to add case. Scala 3 introduces parameter untupling that simplifies special cases like this. We can drop the case keyword:
val counts2 = tuples.map {
  (x, y, z) => x + y + z
}
We can even use anonymous variables:
val counts3 = tuples.map(_+_+_)
However, this untupling only works for one level of decomposition:
scala> val tuples2 = Seq((1,(2,3)), (4,(5,6)), (7,(8,9)))
     | val counts2b = tuples2.map {
     |   (x, (y, z)) => x + y + z
     | }
     |
3 |    (x, (y, z)) => x + y + z
  |        ^^^^^^
  |        not a legal formal parameter
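For nested tuples like these, two workarounds can be sketched. The names counts2c and counts2d are hypothetical. The first falls back to the case syntax, which supports nested patterns. The second untuples only the outer level and reads the inner pair through its _1 and _2 fields:

```scala
val tuples2 = Seq((1,(2,3)), (4,(5,6)), (7,(8,9)))

// Option 1: a partial function literal can destructure at any depth.
val counts2c = tuples2.map { case (x, (y, z)) => x + y + z }

// Option 2: untuple one level only, then access the inner pair's fields.
val counts2d = tuples2.map { (x, yz) => x + yz._1 + yz._2 }
```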
Guards in Case Clauses
Matching on literal values is very useful, but sometimes you need a little additional logic:
// src/script/scala/progscala3/patternmatching/MatchGuard.scala
val results = Seq(1,2,3,4).map {
  case e if e%2 == 0 => s"even: $e"
  case o => s"odd: $o"
}
assert(results == Seq("odd: 1", "even: 2", "odd: 3", "even: 4"))
Note that we didn’t need parentheses around the condition in the if expression, just as we don’t need them in for comprehensions. In Scala 2, this was true for guard clause syntax too.
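Guards can also refer to variables extracted by the pattern itself. Here is a minimal sketch with made-up data (the names points and described are illustrative) that combines literal matching, tuple extraction, and a guard:

```scala
val points = Seq((0, 0), (3, 3), (1, 5))

val described = points.map {
  case (0, 0)           => "origin"                    // literal match, checked first
  case (x, y) if x == y => s"on the diagonal at $x"    // guard uses extracted values
  case (x, y)           => s"at ($x, $y)"              // everything else
}
```

As with the earlier examples, clause order matters: the literal (0, 0) clause must come before the guarded clause, which would otherwise match the origin too.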
Matching on Case Classes and Enums
It’s no coincidence that the same case keyword is used for declaring special classes and for case expressions in match expressions. The features of case classes were designed to enable convenient pattern matching. The compiler implements pattern matching and extraction for us. We can use it with nested objects, and we can bind variables at any level of the extraction, which we are seeing for the first time now:
// src/script/scala/progscala3/patternmatching/MatchDeep.scala
case class Address(street: String, city: String)
case class Person(name: String, age: Int, address: Address)

val alice   = Person("Alice",   25, Address("1 Scala Lane", "Chicago"))
val bob     = Person("Bob",     29, Address("2 Java Ave.",  "Miami"))
val charlie = Person("Charlie", 32, Address("3 Python Ct.", "Boston"))

val results = Seq(alice, bob, charlie).map {
  case p @ Person("Alice", age, a @ Address(_, "Chicago")) =>
    s"Hi Alice! $p"
  case p @ Person("Bob", 29, a @ Address(street, city)) =>
    s"Hi ${p.name}! age ${p.age}, in ${a}"
  case p @ Person(name, age, Address(street, city)) =>
    s"Who are you, $name (age: $age, city = $city)?"
}
assert(results == Seq(
  "Hi Alice! Person(Alice,25,Address(1 Scala Lane,Chicago))",
  "Hi Bob! age 29, in Address(2 Java Ave.,Miami)",
  "Who are you, Charlie (age: 32, city = Boston)?"))
Match on any person named “Alice”, of any age at any street address in Chicago. Use p @ to bind variable p to the whole Person, while also extracting fields inside the instance, in this case age. Similarly, use a @ to bind a to the whole Address while also matching on the city inside the Address.
Match on any person named “Bob”, age 29 at any street and city. Bind p to the whole Person instance and a to the nested Address instance.
Match on any person, binding p to the Person instance and name, age, street, and city to the nested fields.
If you aren’t extracting fields from the Person instance, you can just write p: Person => …
This nested matching can go arbitrarily deep. Consider this example that revisits the enum Tree[T] algebraic data type from “Enumerations and Algebraic Data Types”. Recall the enum definition, which also supports “automatic” pattern matching:
// src/main/scala/progscala3/patternmatching/MatchTreeADTEnum.scala
package progscala3.patternmatching

enum Tree[T]:
  case Branch(left: Tree[T], right: Tree[T])
  case Leaf(elem: T)
Here we do deep matching on particular structures:
// src/script/scala/progscala3/patternmatching/MatchTreeADTDeep.scala
import progscala3.patternmatching.Tree
import Tree.{Branch, Leaf}

val tree1 = Branch(
  Branch(Leaf(1), Leaf(2)),
  Branch(Leaf(3), Branch(Leaf(4), Leaf(5))))
val tree2 = Branch(Leaf(6), Leaf(7))

for t <- Seq(tree1, tree2, Leaf(8))
yield t match
  case Branch(
    l @ Branch(_,_),
    r @ Branch(rl @ Leaf(rli), rr @ Branch(_,_))) =>
      s"l=$l, r=$r, rl=$rl, rli=$rli, rr=$rr"
  case Branch(l, r) => s"Other Branch($l, $r)"
  case Leaf(x) => s"Other Leaf($x)"
The same extraction could be done for the alternative version we defined using a sealed class hierarchy in the original example. We’ll try it in “Sealed Hierarchies and Exhaustive Matches”.
The last two case clauses are relatively easy to understand. The first one is highly tuned to match tree1, although it uses _ to ignore some parts of the tree. In particular, note that it isn’t sufficient to write l @ Branch. We need to write l @ Branch(_,_). Try removing the (_,_) here and you’ll notice the first case no longer matches tree1, without any obvious explanation.
Warning
If a nested pattern match expression doesn’t match when you think it should, make sure that you capture the full structure, like l @ Branch(_,_) instead of l @ Branch.
It’s worth experimenting with this example to capture different parts of the trees, so you develop an intuition about what works, what doesn’t, and how to debug match expressions.
Here’s an example using tuples. Imagine we have a sequence of (String,Double) tuples for the names and prices of items in a store, and we want to print them with their index. The Seq.zipWithIndex method is handy here:
// src/script/scala/progscala3/patternmatching/MatchDeepTuple.scala
val itemsCosts = Seq(("Pencil", 0.52), ("Paper", 1.35), ("Notebook", 2.43))

val results = itemsCosts.zipWithIndex.map {
  case ((item, cost), index) => s"$index: $item costs $cost each"
}
assert(results == Seq(
  "0: Pencil costs 0.52 each",
  "1: Paper costs 1.35 each",
  "2: Notebook costs 2.43 each"))
Note that zipWithIndex returns a sequence of tuples of the form (element, index), or ((name, cost), index) in this case. We matched on this form to extract the three elements and construct a string with them. I write code like this a lot.
Matching on Regular Expressions
Regular expressions (or regexes) are convenient for extracting data from strings that have a particular structure. Here is an example:
// src/script/scala/progscala3/patternmatching/MatchRegex.scala
val BookExtractorRE = """Book: title=([^,]+),\s+author=(.+)""".r
val MagazineExtractorRE = """Magazine: title=([^,]+),\s+issue=(.+)""".r

val catalog = Seq(
  "Book: title=Programming Scala Third Edition, author=Dean Wampler",
  "Magazine: title=The New Yorker, issue=January 2021",
  "Unknown: text=Who put this here??"
)

val results = catalog.map {
  case BookExtractorRE(title, author) =>
    s"""Book "$title", written by $author"""
  case MagazineExtractorRE(title, issue) =>
    s"""Magazine "$title", issue $issue"""
  case entry => s"Unrecognized entry: $entry"
}
assert(results == Seq(
  """Book "Programming Scala Third Edition", written by Dean Wampler""",
  """Magazine "The New Yorker", issue January 2021""",
  "Unrecognized entry: Unknown: text=Who put this here??"))
Match a book string, with two capture groups (note the parentheses), one for the title and one for the author. Calling the r method on a string creates a regex from it. Also match a magazine string, with capture groups for the title and issue (date).
Use the regular expressions much like using case classes, where the string matched by each capture group is assigned to a variable.
Because regexes use backslashes for constructs beyond the normal ASCII control characters, you should either use triple-quoted strings for them, as shown, or use raw interpolated strings, such as raw"foo\sbar".r. Otherwise, you must escape these backslashes; for example "foo\\sbar".r. You can also define regular expressions by creating new instances of the Regex class, as in new Regex("""\W+""").
Warning
Using interpolation in triple-quoted strings doesn’t work cleanly for the regex escape sequences. You still need to escape these sequences (e.g., s"""$first\\s+$second""".r instead of s"""$first\s+$second""".r). If you aren’t using interpolation, escaping isn’t necessary.
scala.util.matching.Regex defines several methods for other manipulations, such as finding and replacing matches.
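As a brief sketch of two of those methods, the following uses findFirstMatchIn and replaceAllIn from scala.util.matching.Regex. The regex IssueRE and the sample entry are assumptions made up for this illustration:

```scala
import scala.util.matching.Regex

// Hypothetical regex: a month name followed by a four-digit year.
val IssueRE: Regex = """issue=(\w+)\s+(\d{4})""".r
val entry = "Magazine: title=The New Yorker, issue=January 2021"

// Find the first match and pull out the second capture group (the year).
val year: Option[String] = IssueRE.findFirstMatchIn(entry).map(_.group(2))

// Replace every match with fixed text.
val redacted: String = IssueRE.replaceAllIn(entry, "issue=<redacted>")
```

findFirstMatchIn returns an Option, so a string with no match yields None instead of throwing an exception.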
Matching on Interpolated Strings
If you know the strings have an exact format, such as a precise number of spaces, you can even use interpolated strings for pattern matching. Let’s reuse the catalog:
// src/script/scala/progscala3/patternmatching/MatchInterpolatedString.scala
val results = catalog.map {
  case s"""Book: title=$t, author=$a""" => ("Book" -> (t -> a))
  case s"""Magazine: title=$t, issue=$d""" => ("Magazine" -> (t -> d))
  case item => ("Unrecognized", item)
}
assert(results == Seq(
  ("Book", ("Programming Scala Third Edition", "Dean Wampler")),
  ("Magazine", ("The New Yorker", "January 2021")),
  ("Unrecognized", "Unknown: text=Who put this here??")))
Sealed Hierarchies and Exhaustive Matches
Let’s revisit the need for exhaustive matches and consider the situation where we have an enum or the equivalent sealed class hierarchy.
First, let’s use the enum Tree[T] definition from earlier. We can pattern match on the leaves and branches knowing we’ll never be surprised to see something else:
// src/script/scala/progscala3/patternmatching/MatchTreeADTExhaustive.scala
import progscala3.patternmatching.Tree
import Tree.{Branch, Leaf}

val enumSeq: Seq[Tree[Int]] = Seq(Leaf(0), Branch(Leaf(6), Leaf(7)))
val tree1 = for t <- enumSeq yield t match
  case Branch(left, right) => (left, right)
  case Leaf(value) => value
assert(tree1 == List(0, (Leaf(6), Leaf(7))))
Because it’s not possible for a user of Tree to add another case to the enum, these match expressions can never break. They will always remain exhaustive.
As an exercise, change the case Branch to recurse on left and right (you’ll need to define a method), then use a deeper tree example.
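One possible shape for that exercise is sketched here. It assumes a stand-in enum named DemoTree, defined locally so the snippet is self-contained; the method name leaves and the sample tree are also hypothetical.

```scala
// DemoTree is a hypothetical stand-in for the chapter's Tree enum.
enum DemoTree[T]:
  case Branch(left: DemoTree[T], right: DemoTree[T])
  case Leaf(elem: T)

import DemoTree.{Branch, Leaf}

// Recurse on both sides of a Branch and collect leaf values, left to right.
def leaves[T](tree: DemoTree[T]): List[T] = tree match
  case Branch(left, right) => leaves(left) ++ leaves(right)
  case Leaf(value) => List(value)

// A deeper tree than the two-level example.
val deep: DemoTree[Int] =
  Branch(Branch(Leaf(1), Leaf(2)), Branch(Leaf(3), Branch(Leaf(4), Leaf(5))))
```

Because the enum is closed, the two case clauses stay exhaustive no matter how deep the tree grows.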
Let’s try a corresponding sealed hierarchy:
// src/main/scala/progscala3/patternmatching/MatchTreeADTSealed.scala
package progscala3.patternmatching

sealed trait STree[T]  // "S" for "sealed"
case class SBranch[T](left: STree[T], right: STree[T]) extends STree[T]
case class SLeaf[T](elem: T) extends STree[T]
The match code is essentially identical:
import progscala3.patternmatching.{STree, SBranch, SLeaf}

val sealedSeq: Seq[STree[Int]] = Seq(SLeaf(0), SBranch(SLeaf(6), SLeaf(7)))
val tree2 = for t <- sealedSeq yield t match
  case SBranch(left, right) => (left, right)
  case SLeaf(value) => value
assert(tree2 == List(0, (SLeaf(6), SLeaf(7))))
A corollary is to avoid using sealed hierarchies and enums when the type hierarchy needs to evolve. In that case, use an “open” object-oriented type hierarchy with polymorphic methods rather than match expressions. We discussed this trade-off in “A Sample Application”.
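To make the open-hierarchy alternative concrete, here is a minimal sketch. The Shape, Circle, and Rectangle types and the totalArea method are hypothetical and are not taken from the sample application.

```scala
// Hypothetical open hierarchy: new shapes can be added later
// without touching the existing code.
trait Shape:
  def area: Double

class Circle(radius: Double) extends Shape:
  def area: Double = math.Pi * radius * radius

class Rectangle(width: Double, height: Double) extends Shape:
  def area: Double = width * height

// Client code calls the polymorphic method, so there is no
// match expression that must be kept exhaustive.
def totalArea(shapes: Seq[Shape]): Double = shapes.map(_.area).sum
```

Adding a new Shape subtype only requires implementing area; totalArea needs no changes.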
Chaining Match Expressions
Scala 3 changed the parsing rules for match expressions to allow chaining, as in this contrived example:
// src/script/scala/progscala3/patternmatching/MatchChaining.scala
scala> for opt <- Seq(Some(1), None)
     | yield opt match {
     |   case None => ""
     |   case Some(i) => i.toString
     | } match {  // matches on the String returned from the previous match
     |   case "" => false
     |   case _ => true
     | }
val res10: Seq[Boolean] = List(true, false)
Pattern Matching Outside Match Expressions
Pattern matching is not restricted to match expressions. You can use it in assignment statements, called pattern bindings:
// src/script/scala/progscala3/patternmatching/Assignments.scala
scala> case class Address(street: String, city: String, country: String)
scala> case class Person(name: String, age: Int, address: Address)

scala> val addr = Address("1 Scala Way", "CA", "USA")
scala> val dean = Person("Dean", 29, addr)
val addr: Address = Address(1 Scala Way,CA,USA)
val dean: Person = Person(Dean,29,Address(1 Scala Way,CA,USA))

scala> val Person(name, age, Address(_, state, _)) = dean
val name: String = Dean
val age: Int = 29
val state: String = CA
They work in for comprehensions:
scala> val people = (0 to 4).map {
     |   i => Person(s"Name$i", 10+i, Address(s"$i Main Street", "CA", "USA"))
     | }
val people: IndexedSeq[Person] = Vector(Person(Name0,10,Address(...)), ...)

scala> val nas = for
     |   Person(name, age, Address(_, state, _)) <- people
     | yield (name, age, state)
val nas: IndexedSeq[(String, Int, String)] =
  Vector((Name0,10,CA), (Name1,11,CA), ...)
Suppose we have a function that takes a sequence of doubles and returns the count, sum, average, minimum value, and maximum value in a tuple:
// src/script/scala/progscala3/patternmatching/AssignmentsTuples.scala
/** Return the count, sum, average, minimum value, and maximum value. */
def stats(seq: Seq[Double]): (Int, Double, Double, Double, Double) =
  assert(seq.size > 0)
  val sum = seq.sum
  (seq.size, sum, sum/seq.size, seq.min, seq.max)

val (count, sum, avg, min, max) = stats((0 until 100).map(_.toDouble))
Pattern bindings can be used with interpolated strings:
// src/script/scala/progscala3/patternmatching/AssignmentsInterpStrs.scala
val str = """Book: "Programming Scala", by Dean Wampler"""
val s"""Book: "$title", by $author""" = str : @unchecked
assert(title == "Programming Scala" && author == "Dean Wampler")
I’ll explain the need for @unchecked in a moment.
Finally, we can use pattern bindings with a regular expression to decompose a string. Here’s an example for parsing (simple!) SQL strings:
// src/script/scala/progscala3/patternmatching/AssignmentsRegex.scala
scala> val c = """\*|[\w, ]+"""  // cols
     | val t = """\w+"""         // table
     | val o = """.*"""          // other substrings
     | val selectRE =
     |   s"""SELECT\\s*(DISTINCT)?\\s+($c)\\s*FROM\\s+($t)\\s*($o)?;""".r

scala> val selectRE(distinct, cols, table, otherClauses) =
     |   "SELECT DISTINCT col1 FROM atable WHERE col1 = 'foo';" : @unchecked
val distinct: String = DISTINCT
val cols: String = "col1 "
val table: String = atable
val otherClauses: String = WHERE col1 = 'foo'
See the source file for other examples. Because I used string interpolation, I had to add extra backslashes (e.g., \\s instead of \s) in the last regular expression.
Next I’ll explain why the @unchecked type annotation was used.
Problems in Pattern Bindings
In general, keep in mind that pattern matching will throw MatchError exceptions when the match fails. This can make your code fragile when used in assignments because it’s harder to make them exhaustive. In the previous interpolated string and regex examples, the String type for the righthand side values can’t ensure that the matches will succeed.
Assume I didn’t have the : @unchecked type declaration. In Scala 2 and 3.0, both examples would compile and work without MatchErrors. Starting in a future Scala 3 release or when compiling with -source:future, the examples fail to compile, for example:
scala> val selectRE(distinct, cols, table, otherClauses) =
     |   "SELECT DISTINCT col1 FROM atable WHERE col1 = 'foo';"
2 |  "SELECT DISTINCT col1 FROM atable WHERE col1 = 'foo';"
  |  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |pattern's type String is more specialized than the righthand side
  |expression's type String
  |
  |If the narrowing is intentional, this can be communicated by adding
  |`: @unchecked` after the expression.
This compile-time enforcement makes your code more robust, but if you know the declaration is safe, you can add the @unchecked type declaration, as we did earlier, and the compiler will not complain.
However, if we silence these warnings, we may get runtime MatchErrors. Consider the following examples with sequences:
// src/script/scala/progscala3/patternmatching/AssignmentsFragile.scala
scala> val h4a +: h4b +: t4 = Seq(1,2,3,4) : @unchecked
val h4a: Int = 1
val h4b: Int = 2
val t4: Seq[Int] = List(3, 4)

scala> val h2a +: h2b +: t2 = Seq(1,2) : @unchecked
val h2a: Int = 1
val h2b: Int = 2
val t2: Seq[Int] = List()

scala> val h1a +: h1b +: t1 = Seq(1) : @unchecked  // MatchError!
scala.MatchError: List(1) (of class scala.collection.immutable.$colon$colon)
  ...
Seq doesn’t constrain the number of elements, so the lefthand matches may work or fail. The compiler can’t verify at compile time if the match will succeed or throw a MatchError, so it will report a warning unless the @unchecked type annotation is added as shown. Sure enough, while the first two cases succeed, the last one raises a MatchError.
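When a sequence might be too short, one way to avoid the MatchError is to use a full match expression with a fallback clause and return an Option. The following is only a sketch; firstTwo is a hypothetical helper, not part of the book’s examples.

```scala
// firstTwo is a hypothetical helper: it returns None instead of
// throwing a MatchError when the sequence has fewer than two elements.
def firstTwo[T](seq: Seq[T]): Option[(T, T, Seq[T])] = seq match
  case a +: b +: rest => Some((a, b, rest))
  case _ => None
```

The caller then has to handle the None case explicitly, which moves the failure mode from a runtime exception to the type signature.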
Pattern Matching as Filtering in for Comprehensions
However, in a for comprehension, matching that isn’t exhaustive functions as a filter instead:
// src/script/scala/progscala3/patternmatching/MatchForFiltering.scala
scala> val elems = Seq((1, 2), "hello", (3, 4), 1, 2.2, (5, 6))
val elems: Seq[Matchable] = List((1,2), hello, (3,4), 1, 2.2, (5,6))

scala> val what1 = for (case (x, y) <- elems) yield (y, x)
     | val what2 = for case (x, y) <- elems yield (y, x)
val what1: Seq[(Any, Any)] = List((2,1), (4,3), (6,5))
val what2: Seq[(Any, Any)] = List((2,1), (4,3), (6,5))
Note that the inferred common supertype for the elements in elems is Matchable, not Any. For what1 and what2, the inferred element type is a tuple, which is a subtype of Matchable. The tuple members can be Any.
The case keyword was not required in Scala 2 or 3.0. In a future Scala 3 release, or when compiling with -source:future, omitting the case keyword triggers the “narrowing” warning:
scala> val nope = for (x, y) <- elems yield (y, x)
1 |val nope = for (x, y) <- elems yield (y, x)
  |               ^^^^^^
  |pattern's type (Any, Any) is more specialized than the right hand side
  |expression's type Matchable
  |
  |If the narrowing is intentional, this can be communicated by writing `case`
  |before the full pattern.
When we discussed exhaustive matching previously, we used an example of a sequence of Option values. We can filter out values in a sequence using pattern matching:
scala> val seq = Seq(None, Some(1), None, Some(2.2), None, None, Some("three"))
scala> val filtered = for case Some(x) <- seq yield x
val filtered: Seq[Matchable] = List(1, 2.2, three)
Pattern Matching and Erasure
Consider the following example, where we attempt to discriminate between the inputs List[Double] and List[String]:
// src/script/scala/progscala3/patternmatching/MatchTypesErasure.scala
scala> val results = Seq(Seq(5.5,5.6,5.7), Seq("a", "b")).map {
     |   case seqd: Seq[Double] => ("seq double", seqd)  // Erasure warning
     |   case seqs: Seq[String] => ("seq string", seqs)  // Erasure warning
     |   case other             => ("unknown!", other)
     | }
2 |  case seqd: Seq[Double] => ("seq double", seqd)  // Erasure warning
  |       ^^^^^^^^^^^^^^^^^
  |       the type test for Seq[Double] cannot be checked at runtime
3 |  case seqs: Seq[String] => ("seq string", seqs)  // Erasure warning
  |       ^^^^^^^^^^^^^^^^^
  |       the type test for Seq[String] cannot be checked at runtime
These warnings result from type erasure, where the information about the actual types used for the type parameters is not retained in the compiler output. Hence, while we can tell at runtime that the object is a Seq, we can’t check that it is a Seq[Double] or a Seq[String]. In fact, if we ignore the warning, the second case clause for Seq[String] is unreachable. The first clause matches for all Seqs.
One ugly workaround is to match on the collection first, then use a nested match on the head element to determine the type. We now have to handle an empty sequence too:
// src/script/scala/progscala3/patternmatching/MatchTypesFix.scala
def doSeqMatch[T <: Matchable](seq: Seq[T]): String = seq match
  case Nil => ""
  case head +: _ => head match
    case _: Double => "Double"
    case _: String => "String"
    case _ => "Unmatched seq element"

val results = Seq(Seq(5.5,5.6), Nil, Seq("a","b")).map(seq => doSeqMatch(seq))
assert(results == Seq("Double", "", "String"))
Extractors
So how do pattern matching and destructuring or extraction work? Scala defines a pair of object methods that are implemented automatically for case classes and for many types in the Scala library. You can implement these extractors yourself to customize the behavior for your types. When those methods are available on suitable types, they can be used in pattern-matching clauses.
However, you will rarely need to implement your own extractors. You also don’t need to understand the implementation details to use pattern matching effectively. Therefore, you can safely skip the rest of this chapter now and return to this discussion later, when needed.
unapply Method
Recall that the companion object for a case class has at least one factory method named apply, which is used for construction. Using symmetry arguments, we might infer that there must be another method generated called unapply, which is used for deconstruction or extraction. Indeed, there is an unapply method, and it is invoked in pattern-match expressions for most types.
There are several ways to implement unapply, specifically what is returned from it. We’ll start with the return type used most often: an Option wrapping a tuple. Then we’ll discuss other options for return types.
Consider again Person and Address from before:
person match
  case Person(name, age, Address(street, city)) => ...
  ...
Scala looks for Person.unapply(…) and Address.unapply(…) and calls them. They return an Option[(…)], where the tuple type corresponds to the number of values and their types that can be extracted from the instance.
By default for case classes, the compiler implements unapply to return all the fields declared in the constructor argument list. That will be three fields for Person, of types String, Int, and Address, and two fields for Address, both of type String. So the Person companion object has methods that would look like this:
object Person:
  def apply(name: String, age: Int, address: Address) =
    new Person(name, age, address)
  def unapply(p: Person): Some[(String, Int, Address)] =
    Some((p.name, p.age, p.address))
Why is an Option used if the compiler already knows that the object is a Person? Scala allows an implementation of unapply to veto the match for some reason and return None, in which case Scala will attempt to use the next case clause. Also, we don’t have to expose all fields of the instance if we don’t want to. We could suppress our age, if we’re embarrassed by it. We could even add additional values to the returned tuples.
When a Some wrapping a tuple is returned by an unapply, the compiler extracts the tuple elements for use in the case clause or assignment, such as comparison with literal values, binding to variables, or dropping them for _ placeholders.
However, note that the simple compiler-generated Person.unapply never fails, so Some[…] is used as the return type, rather than Option[…].
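Here is a minimal sketch of an unapply that vetoes some matches and hides a field. The Member case class, the Adult extractor, and the greet method are all hypothetical names used only for illustration.

```scala
case class Member(name: String, age: Int)

// Adult is a hypothetical extractor: it vetoes the match for minors
// by returning None, and it exposes only the name, hiding the age.
object Adult:
  def unapply(m: Member): Option[String] =
    if m.age >= 18 then Some(m.name) else None

def greet(m: Member): String = m match
  case Adult(name) => s"Welcome, $name"
  // Reached when Adult.unapply returns None.
  case Member(name, _) => s"Sorry, $name, adults only"
```

Because Adult.unapply can return None, the match falls through to the next case clause for minors.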
The unapply methods are invoked recursively when necessary. Conceptually, Person.unapply runs first to extract the fields, then Address.unapply is applied to the extracted address value.
Recall the head +: tail expression we used previously. Now let’s understand how it actually works. We’ve seen that the +: (cons) operator can be used to construct a new sequence by prepending an element to an existing sequence, and we can construct an entire sequence from scratch this way:
val list = 1 +: 2 +: 3 +: 4 +: Nil
Because +: is a method that binds to the right, we first prepend 4 to Nil, then prepend 3 to that list, and so forth.
If the construction of sequences is done with a method named +:, how can extraction be done with the same syntax, so that we have uniform syntax for construction and deconstruction/extraction?
To do that, the Scala library defines a special singleton object named +:. Yes, that’s the name. Like methods, types can have names with a wide variety of characters.
It has just one method, the unapply method the compiler needs for our extraction case statement. The declaration of unapply is conceptually as follows (some details removed):
def unapply[H, Coll](collection: Coll): Option[(H, Coll)]
The head has type H, which is inferred, and the tail has some collection type Coll. So an Option of a two-element tuple with the head and tail is returned.
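To see the idea in isolation, the following sketch defines a hand-written extractor that behaves roughly like +: for Seq. HeadTail and sumAll are hypothetical names, and the real library object is more general than this simplified version.

```scala
// HeadTail is a hypothetical extractor that mimics what +: does for Seq:
// it returns the head and tail, or None for an empty sequence.
object HeadTail:
  def unapply[A](seq: Seq[A]): Option[(A, Seq[A])] =
    if seq.isEmpty then None else Some((seq.head, seq.tail))

// Recursively sum the elements by peeling off the head each time.
def sumAll(seq: Seq[Int]): Int = seq match
  case HeadTail(head, tail) => head + sumAll(tail)
  case _ => 0
```

The empty sequence is rejected by returning None, which is what lets the second clause terminate the recursion.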
We learned in “Defining Operators” that types can be used with infix notation, so head +: tail is valid syntax, equivalent to +:(head, tail). In fact, we can use the normal notation in a case clause:
scala> def seqToString2[T](seq: Seq[T]): String = seq match
     |   case +:(head, tail) => s"($head +: ${seqToString2(tail)})"
     |   case Nil => "Nil"

scala> seqToString2(Seq(1,2,3,4))
val res0: String = (1 +: (2 +: (3 +: (4 +: Nil))))
Here’s another example, just to drive home the point:
// src/script/scala/progscala3/patternmatching/Infix.scala
infix case class And[A, B](a: A, b: B)

val and1: And[String, Int] = And("Foo", 1)
val and2: String And Int   = And("Bar", 2)
// val and3: String And Int = "Baz" And 3  // ERROR

val results = Seq(and1, and2).map {
  case s And i => s"$s and $i"
}
assert(results == Seq("Foo and 1", "Bar and 2"))
We mentioned earlier that you can pattern match pairs with ->. This feature is implemented with a val defined in Predef, ->. This is an alias for Tuple2.type, which subtypes Product2, which defines an unapply method that is used for these pattern-matching expressions.
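A short sketch of matching pairs with ->, using made-up sample data (pairs and described are hypothetical names):

```scala
// Pairs constructed with -> ...
val pairs = Seq("one" -> 1, "two" -> 2)

// ... can be deconstructed with the same -> syntax in a pattern.
val described = pairs.map { case name -> number => s"$name is $number" }
```

The pattern name -> number is equivalent to writing (name, number).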
Alternatives to Option Return Values
While it is common to return an Option from unapply, any type with the following signature is allowed, which Option also implements:
def isEmpty: Boolean
def get: T
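As a sketch of this protocol, the following extractor returns a custom result type instead of an Option. PositiveResult, Positive, and describe are hypothetical names chosen for this example.

```scala
// PositiveResult is a hypothetical result type. It is not an Option,
// but it has the isEmpty and get members the compiler looks for.
class PositiveResult(n: Int):
  def isEmpty: Boolean = n <= 0
  def get: Int = n

object Positive:
  def unapply(n: Int): PositiveResult = PositiveResult(n)

def describe(n: Int): String = n match
  case Positive(p) => s"$p is positive"   // used when isEmpty is false
  case _ => s"$n is not positive"
```

The match succeeds when isEmpty returns false, and the value bound in the pattern comes from get.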
A Boolean can also be returned, or a Product type, which is a supertype of tuples, for example. Here’s an example using Boolean where we want to discriminate between two kinds of strings and the match is really implementing a true versus false analysis:
// src/script/scala/progscala3/patternmatching/UnapplyBoolean.scala
object ScalaSearch:
  def unapply(s: String): Boolean = s.toLowerCase.contains("scala")

val books = Seq(
  "Programming Scala",
  "JavaScript: The Good Parts",
  "Scala Cookbook").zipWithIndex   // add an "index"

val result = for s <- books yield s match
  case (ScalaSearch(), index) => s"$index: found Scala"
  case (_, index) => s"$index: no Scala"
assert(result == Seq("0: found Scala", "1: no Scala", "2: found Scala"))
Define an object with an unapply method that takes a string, converts to lowercase, and returns the result of a predicate; does it contain “scala”?
Try it on a list of strings, where the first case match succeeds only when the string contains “scala.”
Empty parentheses required.
Other single values can be returned. Here is an example that converts a Scala Map to a Java HashMap:
// src/script/scala/progscala3/patternmatching/UnapplySingleValue.scala
import java.util.{HashMap as JHashMap}

case class JHashMapWrapper[K,V](jmap: JHashMap[K,V])
object JHashMapWrapper:
  def unapply[K,V](map: Map[K,V]): JHashMapWrapper[K,V] =
    val jmap = new JHashMap[K,V]()
    for (k,v) <- map do jmap.put(k, v)
    new JHashMapWrapper(jmap)
In action:
scala> val map = Map("one" -> 1, "two" -> 2)
val map: Map[String, Int] = Map(one -> 1, two -> 2)

scala> map match
     |   case JHashMapWrapper(jmap) => jmap
val res0: java.util.HashMap[String, Int] = {one=1, two=2}
However, it’s not possible to implement a similar extractor for Java’s HashSet and combine them into one match expression (because there are two possible return values, not one):
// src/script/scala/progscala3/patternmatching/UnapplySingleValue2.scala
scala> ...
scala> val map = Map("one" -> 1, "two" -> 2)
scala> val set = map.keySet
scala> for x <- Seq(map, set) yield x match
     |   case JHashMapWrapper(jmap) => jmap
     |   case JHashSetWrapper(jset) => jset
... errors ...
See the source file for the full details. The Scala collections already have tools for converting between Scala and Java collections. See “Conversions Between Scala and Java Collections” for details.
Another option for unapply is to return a Product, or more specifically an object that mixes in this trait, which is an abstraction for types when it is useful to treat the member fields uniformly, such as retrieving them by index or iterating over them. Tuples implement Product. We can use it as a way to provide several return values extracted by unapply:
// src/script/scala/progscala3/patternmatching/UnapplyProduct.scala
class Words(words: Seq[String], index: Int) extends Product:
  def _1 = words
  def _2 = index

  def canEqual(that: Any): Boolean = ???
  def productArity: Int = ???
  def productElement(n: Int): Any = ???

object Words:
  def unapply(si: (String, Int)): Words =
    val words = si._1.split("""\W+""").toSeq
    new Words(words, si._2)

val books = Seq(
  "Programming Scala",
  "JavaScript: The Good Parts",
  "Scala Cookbook").zipWithIndex   // add an "index"

val result = books.map {
  case Words(words, index) => s"$index: count = ${words.size}"
}
assert(result == Seq("0: count = 2", "1: count = 4", "2: count = 2"))
Now we need a class Words to hold the results when a match succeeds. Words implements Product.
Define two methods for retrieving the first and second items. Note the method names are the same as for two-element tuples.
The Product trait declares these methods too, so we have to provide definitions, but we don’t need working implementations. This is because Product is actually a marker trait for our purposes. All we really need is for Words to mix in this type. So we simply invoke the ??? method defined in Predef, which always throws NotImplementedError.
Matches on a tuple of String and Int.
unapplySeq Method
When you want to return a sequence of extracted items, rather than a fixed number of them, use unapplySeq. It turns out the Seq companion object implements apply and unapplySeq, but not unapply:
def apply[A](elems: A*): Seq[A]
final def unapplySeq[A](x: Seq[A]): UnapplySeqWrapper[A]
UnapplySeqWrapper is a helper class.
Matching with unapplySeq is invoked in this variation of our previous example for +:, where we examine a sliding window of pairs of elements at a time:
// src/script/scala/progscala3/patternmatching/MatchUnapplySeq.scala
// Process pairs
def windows[T](seq: Seq[T]): String = seq match
  case Seq(head1, head2, tail*) =>
    s"($head1, $head2), " + windows(seq.tail)
  case Seq(head, tail*) =>
    s"($head, _), " + windows(tail)
  case Nil => "Nil"

val nonEmptyList = List(1, 2, 3, 4, 5)
val emptyList    = Nil
val nonEmptyMap  = Map("one" -> 1, "two" -> 2, "three" -> 3)

val results = Seq(nonEmptyList, emptyList, nonEmptyMap.toSeq).map {
  seq => windows(seq)
}
assert(results == Seq(
  "(1, 2), (2, 3), (3, 4), (4, 5), (5, _), Nil",
  "Nil",
  "((one,1), (two,2)), ((two,2), (three,3)), ((three,3), _), Nil"))
It looks like we’re calling Seq.apply(…), but in a match clause, we’re actually calling Seq.unapplySeq. We grab the first two elements separately, and the rest of the repeated parameters list as the tail.
Format a string with the first two elements, then move the window by one (not two) by calling seq.tail, which is also equivalent to head2 +: tail.
We also need a match for a one-element sequence, such as near the end, or we won’t have exhaustive matching. This time we use the tail in the recursive call, although we actually know that this call to windows(tail) will simply return Nil.
The Nil case terminates the recursion.
We could rewrite the second case statement to skip the final invocation of windows(tail), but I left it as is for simplicity.
We could still use the +: matching we saw before, which is more elegant and what I would do:
// src/script/scala/progscala3/patternmatching/MatchWithoutUnapplySeq.scala
val nonEmptyList = List(1, 2, 3, 4, 5)
val emptyList    = Nil
val nonEmptyMap  = Map("one" -> 1, "two" -> 2, "three" -> 3)

// Process pairs
def windows2[T](seq: Seq[T]): String = seq match
  case head1 +: head2 +: _ => s"($head1, $head2), " + windows2(seq.tail)
  case head +: tail => s"($head, _), " + windows2(tail)
  case Nil => "Nil"

val results = Seq(nonEmptyList, emptyList, nonEmptyMap.toSeq).map {
  seq => windows2(seq)
}
assert(results == Seq(
  "(1, 2), (2, 3), (3, 4), (4, 5), (5, _), Nil",
  "Nil",
  "((one,1), (two,2)), ((two,2), (three,3)), ((three,3), _), Nil"))
Working with sliding windows is actually so useful that Seq gives us two methods to create them:
scala> val seq = 0 until 5
val seq: Range = Range 0 until 5

scala> seq.sliding(2).foreach(println)
ArraySeq(0, 1)
ArraySeq(1, 2)
ArraySeq(2, 3)
ArraySeq(3, 4)

scala> seq.sliding(3,2).foreach(println)
ArraySeq(0, 1, 2)
ArraySeq(2, 3, 4)
Both sliding methods return an iterator, meaning they are lazy and don’t immediately make a copy of the collection, which is desirable for large collections. The second method takes a stride argument, which is how many steps to go for the next sliding window. The default is one step. Note that 5 never appears in any window because until excludes the upper bound.
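Sliding windows combine naturally with pattern matching, because each window is itself a sequence. The following sketch computes differences between neighboring elements; readings and deltas are hypothetical names with made-up data.

```scala
// Hypothetical sample data.
val readings = Seq(3, 5, 4, 8, 8)

// Each window is a Seq, so the pattern Seq(previous, current)
// destructures a window of exactly two elements.
val deltas = readings.sliding(2).collect {
  case Seq(previous, current) => current - previous
}.toList
```

Using collect with a partial function skips any window that doesn’t have exactly two elements, such as the single short window produced when the input has only one element.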
Implementing unapplySeq
Let’s implement an unapplySeq method adapted from the preceding Words example. We’ll tokenize the words as before but also remove all words shorter than a specified value:
// src/script/scala/progscala3/patternmatching/UnapplySeq.scala
object Tokenize:
  // def unapplySeq(s: String): Option[Seq[String]] = Some(tokenize(s))
  def unapplySeq(lim_s: (Int,String)): Option[Seq[String]] =
    val (limit, s) = lim_s
    if limit > s.length then None
    else
      val seq = tokenize(s).filter(_.length >= limit)
      Some(seq)

  def tokenize(s: String): Seq[String] = s.split("""\W+""").toSeq

val message = "This is Programming Scala v3"
val limits = Seq(1, 3, 20, 100)
val results = for limit <- limits yield (limit, message) match
  case Tokenize() => s"No words of length >= $limit!"
  case Tokenize(a, b, c, d*) => s"limit: $limit => $a, $b, $c, d=$d"
  case x => s"limit: $limit => Tokenize refused! x=$x"
assert(results == Seq(
  "limit: 1 => This, is, Programming, d=ArraySeq(Scala, v3)",
  "limit: 3 => This, Programming, Scala, d=ArraySeq()",
  "No words of length >= 20!",
  "limit: 100 => Tokenize refused! x=(100,This is Programming Scala v3)"))
If we didn’t match on the limit value, this is what the declaration would be.
We match on a tuple with the limit for word size and the string of words. If successful, we return Some(Seq(words)), where the words are filtered for those with a length of at least limit. We consider it unsuccessful and return None when the input limit is greater than the length of the input string.
Split on runs of nonword characters (the \W+ regex), which includes whitespace.
Capture the first three words returned and the rest of them as a repeated parameters list (d).
Try simplifying this example to not do length filtering. Uncomment the line for comment 1 and work from there.
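One possible starting point for that exercise is sketched here. SimpleTokenize and summarize are hypothetical names; the extractor takes only a string and does no length filtering.

```scala
// SimpleTokenize is a hypothetical, simplified variant of Tokenize:
// it takes just the string and always succeeds.
object SimpleTokenize:
  def unapplySeq(s: String): Option[Seq[String]] =
    Some(s.split("""\W+""").toSeq)

def summarize(s: String): String = s match
  // Two or more words: bind the first two, and the rest as a sequence.
  case SimpleTokenize(first, second, rest*) =>
    s"first=$first, second=$second, remaining=${rest.size}"
  // Exactly one word.
  case SimpleTokenize(only) => s"only=$only"
  case _ => "no match"
```

Since this unapplySeq never returns None, the final clause is reached only when the token count fits neither of the first two patterns.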
Recap and What’s Next
Along with for comprehensions, pattern matching makes idiomatic Scala code concise, yet powerful. It provides a protocol for extracting data inside data structures in a principled way, one you can control by implementing custom unapply and unapplySeq methods. These methods let you extract that information while hiding other details. In fact, the information returned by unapply might be a transformation of the actual fields in the instance.
Pattern matching is a hallmark of many functional languages. It is a flexible and concise technique for extracting data from data structures. We saw examples of pattern matching in case clauses and how to use pattern matching in other expressions too.
The next chapter discusses a unique, powerful, but controversial feature in Scala: context abstractions, formerly known as implicits, which are a set of tools for building intuitive DSLs, reducing boilerplate, and making APIs both easier to use and more amenable to customization.