Functional programming in Python
Examine the functional aspects of Python: which options work well and which ones you should avoid
Preface
What Is Functional Programming?
We’d better start with the hardest question: “What is functional
programming (FP), anyway?”
One answer would be to say that functional programming is what you do
when you program in languages like Lisp, Scheme, Clojure, Scala,
Haskell, ML, OCaml, Erlang, or a few others. That is a safe answer, but
not one that clarifies very much. Unfortunately, it is hard to get a
consistent opinion on just what functional programming is, even from
functional programmers themselves. A story about elephants and blind men
seems apropos here. It is also safe to contrast functional programming
with “imperative programming” (what you do in languages like C, Pascal,
C++, Java, Perl, Awk, Tcl, and most others, at least for the most part).
Functional programming is also not object-oriented programming (OOP),
although some languages are both. And it is not logic programming (e.g.,
Prolog), but again some languages are multiparadigm.
Personally, I would roughly characterize functional programming as
having at least several of the following characteristics. Languages that
get called functional make these things easy, and make other things
either hard or impossible:
- Functions are first class (objects). That is, everything you can do
  with “data” can be done with functions themselves (such as passing a
  function to another function).
- Recursion is used as a primary control structure. In some languages,
  no other “loop” construct exists.
- There is a focus on list processing (for example, it is the source of
  the name Lisp). Lists are often used with recursion on sublists as a
  substitute for loops.
- “Pure” functional languages eschew side effects. This excludes the
  almost ubiquitous pattern in imperative languages of assigning first
  one, then another value to the same variable to track the program
  state.
- Functional programming either discourages or outright disallows
  statements, and instead works with the evaluation of expressions (in
  other words, functions plus arguments). In the pure case, one program
  is one expression (plus supporting definitions).
- Functional programming worries about what is to be computed rather
  than how it is to be computed.
- Much functional programming utilizes “higher-order” functions (in
  other words, functions that operate on functions that operate on
  functions).
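Several of these characteristics can be seen in a few lines of ordinary Python. The following is only a minimal sketch using the standard library; the names compose, total, double, and increment are hypothetical helpers chosen for illustration, not anything defined elsewhere in this report.

```python
from functools import reduce


def compose(*funcs):
    """Higher-order function: build a new function that applies
    funcs from right to left, so compose(f, g)(x) == f(g(x))."""
    def composed(x):
        return reduce(lambda acc, f: f(acc), reversed(funcs), x)
    return composed


def total(numbers):
    """Sum a list by recursion on sublists rather than with a loop."""
    if not numbers:
        return 0
    head, *tail = numbers
    return head + total(tail)


def double(n):
    return 2 * n


def increment(n):
    return n + 1


# Functions are first class: they are passed as arguments and
# returned as results just like any other value.
double_then_increment = compose(increment, double)

# map() takes a function as data and applies it to each element.
squares = list(map(lambda n: n * n, [1, 2, 3]))
```

Here double_then_increment(5) doubles first and then increments, and total([1, 2, 3, 4]) reduces the list without rebinding any loop variable. Deep recursion of this kind is limited by Python's recursion depth, so it is shown for illustration rather than as a recommended replacement for loops over long lists.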
Advocates of functional programming argue that all these characteristics
make for more rapidly developed, shorter, and less bug-prone code.
Moreover, high theorists of computer science, logic, and math find it a
lot easier to prove formal properties of functional languages and
programs than of imperative languages and programs. One crucial concept
in functional programming is that of a “pure function”—one that always returns the same result given the same arguments—which is more closely akin to the meaning of “function” in mathematics than that in imperative programming.
Python is most definitely not a “pure functional programming
language”; side effects are widespread in most Python programs. That is,
variables are frequently rebound, mutable data collections often change
contents, and I/O is freely interleaved with computation. It is also not
even a “functional programming language” more generally. However, Python
is a multiparadigm language that makes functional programming easy to
do when desired, and easy to mix with other programming styles.
Beyond the Standard Library
While they will not be discussed within the limited space of this
report, a large number of useful third-party Python libraries for
functional programming are available. The one exception here is that I
will discuss Matthew Rocklin’s multipledispatch as the best current
implementation of multiple dispatch.
Most third-party libraries around functional programming are collections
of higher-order functions, and sometimes enhancements to the tools for
working lazily with iterators contained in itertools. Some notable
examples include the following, but this list should not be taken as
exhaustive:
- pyrsistent contains a number of immutable collections. All methods on
  a data structure that would normally mutate it instead return a new
  copy of the structure containing the requested updates. The original
  structure is left untouched.
- toolz provides a set of utility functions for iterators, functions,
  and dictionaries. These functions interoperate well and form the
  building blocks of common data analytic operations. They extend the
  standard libraries itertools and functools and borrow heavily from the
  standard libraries of contemporary functional languages.
- hypothesis is a library for creating unit tests for finding edge cases
  in your code you wouldn’t have thought to look for. It works by
  generating random data matching your specification and checking that
  your guarantee still holds in that case. This is often called
  property-based testing, and was popularized by the Haskell library
  QuickCheck.
- more_itertools tries to collect useful compositions of iterators that
  neither itertools nor the recipes included in its docs address. These
  compositions are deceptively tricky to get right and this well-crafted
  library helps users avoid pitfalls of rolling them themselves.
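The idea behind immutable collections, where an “update” returns a new value and leaves the original untouched, can be approximated with the standard library alone. The sketch below does not use or reproduce the pyrsistent API; it relies only on tuples and collections.namedtuple, and the names Point and with_item are hypothetical.

```python
from collections import namedtuple

# A small immutable record type; fields cannot be reassigned in place.
Point = namedtuple("Point", ["x", "y"])


def with_item(items, item):
    """Return a new tuple with item appended; the input tuple is
    not modified (tuples have no mutating methods)."""
    return items + (item,)


origin = Point(0, 0)
# _replace builds a new Point rather than changing origin.
moved = origin._replace(x=5)

base = (1, 2, 3)
extended = with_item(base, 4)
```

After this runs, origin and base still hold their original contents, while moved and extended carry the changes. A dedicated library can share structure between the old and new versions, whereas this naive approach copies the whole tuple on every update.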
Resources
There are a large number of other papers, articles, and books written
about functional programming, in Python and otherwise. The Python
standard documentation itself contains an excellent introduction called
“Functional Programming HOWTO,” by Andrew Kuchling, that discusses some of the
motivation for functional programming styles, as well as particular
capabilities in Python.
Mentioned in Kuchling’s introduction are several very old public domain
articles this author wrote in the 2000s, on which portions of this report
are based. These include:
- The first chapter of my book Text Processing in Python, which
  discusses functional programming for text processing, in the section
  titled “Utilizing Higher-Order Functions in Text Processing.”
I also wrote several articles, mentioned by Kuchling, for IBM’s
developerWorks site that discussed using functional programming in an early version of Python 2.x:
- Charming Python: Functional programming in Python, Part 1: Making more
  out of your favorite scripting language
- Charming Python: Functional programming in Python, Part 2: Wading into
  functional programming?
- Charming Python: Functional programming in Python, Part 3: Currying
  and other higher-order functions
Not mentioned by Kuchling, and also for an older version of Python, I
discussed multiple dispatch in another article for the same column. The
implementation I created there has no advantages over the more recent
multipledispatch library, but the article provides a longer conceptual
explanation than this report can:
- Charming Python: Multiple dispatch: Generalizing polymorphism with
  multimethods
A Stylistic Note
As in most programming texts, a fixed font will be
used both for inline and block samples of code, including simple command
or function names. Within code blocks, a notional segment of pseudo-code
is indicated with a word surrounded by angle brackets (i.e., not valid
Python), such as <code-block>. In other cases, syntactically valid but
undefined functions are used with descriptive names, such as
get_the_data().