Chapter 1. (Avoiding) Flow Control
In typical imperative Python programs—including those that make use of
classes and methods to hold their imperative code—a block of code
generally consists of some outside loops (for
or while
), assignment
of state variables within those loops, modification of data structures
like dicts, lists, and sets (or various other structures, either from
the standard library or from third-party packages), and some branch
statements (if
/elif
/else
or try
/except
/finally
). All of
this is both natural and seems at first easy to reason about. The
problems often arise, however, precisely with those side effects that
come with state variables and mutable data structures; they often model
our concepts from the physical world of containers fairly well, but it
is also difficult to reason accurately about what state data is in at a
given point in a program.
One solution is to focus not on constructing a data collection but rather on describing “what” that data collection consists of. When one simply thinks, “Here’s some data, what do I need to do with it?” rather than the mechanism of constructing the data, more direct reasoning is often possible. The imperative flow control described in the last paragraph is much more about the “how” than the “what” and we can often shift the question.
Encapsulation
One obvious way of focusing more on “what” than “how” is simply to refactor code, and to put the data construction in a more isolated place—i.e., in a function or method. For example, consider an existing snippet of imperative code that looks like this:
# configure the data to start with
collection
=
get_initial_state
()
state_var
=
None
for
datum
in
data_set
:
if
condition
(
state_var
):
state_var
=
calculate_from
(
datum
)
new
=
modify
(
datum
,
state_var
)
collection
.
add_to
(
new
)
else
:
new
=
modify_differently
(
datum
)
collection
.
add_to
(
new
)
# Now actually work with the data
for
thing
in
collection
:
process
(
thing
)
We might simply remove the “how” of the data construction from the current scope, and tuck it away in a function that we can think about in isolation (or not think about at all once it is sufficiently abstracted). For example:
# tuck away construction of data
def
make_collection
(
data_set
):
collection
=
get_initial_state
()
state_var
=
None
for
datum
in
data_set
:
if
condition
(
state_var
):
state_var
=
calculate_from
(
datum
,
state_var
)
new
=
modify
(
datum
,
state_var
)
collection
.
add_to
(
new
)
else
:
new
=
modify_differently
(
datum
)
collection
.
add_to
(
new
)
return
collection
# Now actually work with the data
for
thing
in
make_collection
(
data_set
):
process
(
thing
)
We haven’t changed the programming logic, nor even the lines of code, at
all, but we have still shifted the focus from “How do we construct
collection
?” to “What does make_collection()
create?”
Comprehensions
Using comprehensions is often a way both to make code more compact and to shift our focus from the “how” to the “what.” A comprehension is an expression that uses the same keywords as loop and conditional blocks, but inverts their order to focus on the data rather than on the procedure. Simply changing the form of expression can often make a surprisingly large difference in how we reason about code and how easy it is to understand. The ternary operator also performs a similar restructuring of our focus, using the same keywords in a different order. For example, if our original code was:
collection
=
list
()
for
datum
in
data_set
:
if
condition
(
datum
):
collection
.
append
(
datum
)
else
:
new
=
modify
(
datum
)
collection
.
append
(
new
)
Somewhat more compactly we could write this as:
collection
=
[
d
if
condition
(
d
)
else
modify
(
d
)
for
d
in
data_set
]
Far more important than simply saving a few characters and lines is the
mental shift enacted by thinking of what collection
is, and by
avoiding needing to think about or debug “What is the state of
collection
at this point in the loop?”
List comprehensions have been in Python the longest, and are in some ways the simplest. We now also have generator comprehensions, set comprehensions, and dict comprehensions available in Python syntax. As a caveat though, while you can nest comprehensions to arbitrary depth, past a fairly simple level they tend to stop clarifying and start obscuring. For genuinely complex construction of a data collection, refactoring into functions remains more readable.
Generators
Generator comprehensions have the same syntax as list
comprehensions—other than that there are no square brackets around them
(but parentheses are needed syntactically in some contexts, in place of
brackets)—but they are also lazy. That is to say that they are merely
a description of “how to get the data” that is not realized until one
explicitly asks for it, either by calling .next()
on the object, or by
looping over it. This often saves memory for large sequences and defers
computation until it is actually needed. For example:
log_lines
=
(
line
for
line
in
read_line
(
huge_log_file
)
if
complex_condition
(
line
))
For typical uses, the behavior is the same as if you had constructed a list, but runtime behavior is nicer. Obviously, this generator comprehension also has imperative versions, for example:
def
get_log_lines
(
log_file
):
line
=
read_line
(
log_file
)
while
True
:
try
:
if
complex_condition
(
line
):
yield
line
line
=
read_line
(
log_file
)
except
StopIteration
:
raise
log_lines
=
get_log_lines
(
huge_log_file
)
Yes, the imperative version could be simplified too, but the version
shown is meant to illustrate the behind-the-scenes “how” of a for
loop
over an iteratable—more details we also want to abstract from in our
thinking. In fact, even using yield
is somewhat of an abstraction from
the underlying “iterator protocol.” We could do this with a class that
had .__next__()
and .__iter__()
methods. For example:
class
GetLogLines
(
object
):
def
__init__
(
self
,
log_file
):
self
.
log_file
=
log_file
self
.
line
=
None
def
__iter__
(
self
):
return
self
def
__next__
(
self
):
if
self
.
line
is
None
:
self
.
line
=
read_line
(
log_file
)
while
not
complex_condition
(
self
.
line
):
self
.
line
=
read_line
(
self
.
log_file
)
return
self
.
line
log_lines
=
GetLogLines
(
huge_log_file
)
Aside from the digression into the iterator protocol and laziness more generally, the reader should see that the comprehension focuses attention much better on the “what,” whereas the imperative version—although successful as refactorings perhaps—retains the focus on the “how.”
Dicts and Sets
In the same fashion that lists can be created in comprehensions rather
than by creating an empty list, looping, and repeatedly calling
.append()
, dictionaries and sets can be created “all at once” rather
than by repeatedly calling .update()
or .add()
in a loop. For
example:
>>>
{
i
:
chr
(
65
+
i
)
for
i
in
range
(
6
)}
{
0
:
'A'
,
1
:
'B'
,
2
:
'C'
,
3
:
'D'
,
4
:
'E'
,
5
:
'F'
}
>>>
{
chr
(
65
+
i
)
for
i
in
range
(
6
)}
{
'A'
,
'B'
,
'C'
,
'D'
,
'E'
,
'F'
}
The imperative versions of these comprehensions would look very similar to the examples shown earlier for other built-in datatypes.
Recursion
Functional programmers often put weight in expressing flow control through recursion rather than through loops. Done this way, we can avoid altering the state of any variables or data structures within an algorithm, and more importantly get more at the “what” than the “how” of a computation. However, in considering using recursive styles we should distinguish between the cases where recursion is just “iteration by another name” and those where a problem can readily be partitioned into smaller problems, each approached in a similar way.
There are two reasons why we should make the distinction mentioned. On
the one hand, using recursion effectively as a way of marching through a
sequence of elements is, while possible, really not “Pythonic.” It
matches the style of other languages like Lisp, definitely, but it often feels contrived in Python. On the other hand, Python is simply
comparatively slow at recursion, and has a limited stack depth limit.
Yes, you can change this with sys.setrecursionlimit()
to more than the
default 1000; but if you find yourself doing so it is probably a
mistake. Python lacks an internal feature called tail call elimination
that makes deep recursion computationally efficient in some languages.
Let us find a trivial example where recursion is really just a kind of
iteration:
def
running_sum
(
numbers
,
start
=
0
):
if
len
(
numbers
)
==
0
:
()
return
total
=
numbers
[
0
]
+
start
(
total
,
end
=
" "
)
running_sum
(
numbers
[
1
:],
total
)
There is little to recommend this approach, however; an iteration that
simply repeatedly modified the total
state variable would be more
readable, and moreover this function is perfectly reasonable to want to
call against sequences of much larger length than 1000. However, in
other cases, recursive style, even over sequential operations, still
expresses algorithms more intuitively and in a way that is easier to
reason about. A slightly less trivial example, factorial in recursive
and iterative style:
def
factorialR
(
N
):
"Recursive factorial function"
assert
isinstance
(
N
,
int
)
and
N
>=
1
return
1
if
N
<=
1
else
N
*
factorialR
(
N
-
1
)
def
factorialI
(
N
):
"Iterative factorial function"
assert
isinstance
(
N
,
int
)
and
N
>=
1
product
=
1
while
N
>=
1
:
product
*=
N
N
-=
1
return
product
Although this algorithm can also be expressed easily enough with a
running product variable, the recursive expression still comes closer to
the “what” than the “how” of the algorithm. The details of repeatedly
changing the values of product
and N
in the iterative version feels like it’s just bookkeeping, not the nature of the computation itself (but the iterative version is probably faster, and it is easy to reach the
recursion limit if it is not adjusted).
As a footnote, the fastest version I know of for factorial()
in Python
is in a functional programming style, and also expresses the “what” of
the algorithm well once some higher-order functions are familiar:
from
functools
import
reduce
from
operator
import
mul
def
factorialHOF
(
n
):
return
reduce
(
mul
,
range
(
1
,
n
+
1
),
1
)
Where recursion is compelling, and sometimes even the only really obvious way to express a solution, is when a problem offers itself to a “divide and conquer” approach. That is, if we can do a similar computation on two halves (or anyway, several similarly sized chunks) of a larger collection. In that case, the recursion depth is only O(log N) of the size of the collection, which is unlikely to be overly deep. For example, the quicksort algorithm is very elegantly expressed without any state variables or loops, but wholly through recursion:
def
quicksort
(
lst
):
"Quicksort over a list-like sequence"
if
len
(
lst
)
==
0
:
return
lst
pivot
=
lst
[
0
]
pivots
=
[
x
for
x
in
lst
if
x
==
pivot
]
small
=
quicksort
([
x
for
x
in
lst
if
x
<
pivot
])
large
=
quicksort
([
x
for
x
in
lst
if
x
>
pivot
])
return
small
+
pivots
+
large
Some names are used in the function body to hold convenient values, but they are never mutated. It would not be as readable, but the definition could be written as a single expression if we wanted to do so. In fact, it is somewhat difficult, and certainly less intuitive, to transform this into a stateful iterative version.
As general advice, it is good practice to look for possibilities of recursive expression—and especially for versions that avoid the need for state variables or mutable data collections—whenever a problem looks partitionable into smaller problems. It is not a good idea in Python—most of the time—to use recursion merely for “iteration by other means.”
Eliminating Loops
Just for fun, let us take a quick look at how we could take out all loops from any Python program. Most of the time this is a bad idea, both for readability and performance, but it is worth looking at how simple it is to do in a systematic fashion as background to contemplate those cases where it is actually a good idea.
If we simply call a function inside a for
loop, the built-in higher-order function map()
comes to our aid:
for
e
in
it
:
# statement-based loop
func
(
e
)
The following code is entirely equivalent to the functional version,
except there is no repeated rebinding of the variable e
involved, and
hence no state:
map
(
func
,
it
)
# map()-based "loop"
A similar technique is available for a functional approach to sequential
program flow. Most imperative programming consists of statements that
amount to “do this, then do that, then do the other thing.” If those
individual actions are wrapped in functions, map()
lets us do just
this:
# let f1, f2, f3 (etc) be functions that perform actions
# an execution utility function
do_it
=
lambda
f
,
*
args
:
f
(
*
args
)
# map()-based action sequence
map
(
do_it
,
[
f1
,
f2
,
f3
])
We can combine the sequencing of function calls with passing arguments from iterables:
>>>
hello
=
lambda
first
,
last
:
(
"Hello"
,
first
,
last
)
>>>
bye
=
lambda
first
,
last
:
(
"Bye"
,
first
,
last
)
>>>
_
=
list
(
map
(
do_it
,
[
hello
,
bye
],
>>>
[
'David'
,
'Jane'
],
[
'Mertz'
,
'Doe'
]))
Hello
David
Mertz
Bye
Jane
Doe
Of course, looking at the example, one suspects the result one really wants is actually to pass all the arguments to each of the functions rather than one argument from each list to each function. Expressing that is difficult without using a list comprehension, but easy enough using one:
>>>
do_all_funcs
=
lambda
fns
,
*
args
:
[
list
(
map
(
fn
,
*
args
))
for
fn
in
fns
]
>>>
_
=
do_all_funcs
([
hello
,
bye
],
[
'David'
,
'Jane'
],
[
'Mertz'
,
'Doe'
])
Hello
David
Mertz
Hello
Jane
Doe
Bye
David
Mertz
Bye
Jane
Doe
In general, the whole of our main program could, in principle, be a
map()
expression with an iterable of functions to execute to complete
the program.
Translating while
is slightly more complicated, but is possible to do
directly using recursion:
# statement-based while loop
while
<
cond
>
:
<
pre
-
suite
>
if
<
break_condition
>
:
break
else
:
<
suite
>
# FP-style recursive while loop
def
while_block
():
<
pre
-
suite
>
if
<
break_condition
>
:
return
1
else
:
<
suite
>
return
0
while_FP
=
lambda
:
(
<
cond
>
and
while_block
())
or
while_FP
()
while_FP
()
Our translation of while
still requires a while_block()
function
that may itself contain statements rather than just expressions. We
could go further in turning suites into function sequences, using
map()
as above. If we did this, we could, moreover, also return a
single ternary expression. The details of this further purely functional
refactoring is left to readers as an exercise (hint: it will be ugly;
fun to play with, but not good production code).
It is hard for <cond>
to be useful with the usual tests, such as
while myvar==7
, since the loop body (by design) cannot change any
variable values. One way to add a more useful condition is to let
while_block()
return a more interesting value, and compare that return
value for a termination condition. Here is a concrete example of
eliminating statements:
# imperative version of "echo()"
def
echo_IMP
():
while
1
:
x
=
input
(
"IMP -- "
)
if
x
==
'quit'
:
break
else
:
(
x
)
echo_IMP
()
Now let’s remove the while
loop for the functional version:
# FP version of "echo()"
def
identity_print
(
x
):
# "identity with side-effect"
(
x
)
return
x
echo_FP
=
lambda
:
identity_print
(
input
(
"FP -- "
))
==
'quit'
or
echo_FP
()
echo_FP
()
What we have accomplished is that we have managed to express a little
program that involves I/O, looping, and conditional statements as a pure
expression with recursion (in fact, as a function object that can be
passed elsewhere if desired). We do still utilize the utility function
identity_print()
, but this function is completely general, and can be
reused in every functional program expression we might create later
(it’s a one-time cost). Notice that any expression containing
identity_print(x)
evaluates to the same thing as if it had simply
contained x
; it is only called for its I/O side effect.
Eliminating Recursion
As with the simple factorial example given above, sometimes we can
perform “recursion without recursion” by using functools.reduce()
or
other folding operations (other “folds” are not in the Python standard
library, but can easily be constructed and/or occur in third-party
libraries). A recursion is often simply a way of combining something
simpler with an accumulated intermediate result, and that is exactly
what reduce()
does at heart. A slightly longer discussion of
functools.reduce()
occurs in the chapter on higher-order functions.
Get Functional Programming in Python now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.