Credit: Luther Blissett
The + operator concatenates strings and therefore offers seemingly
obvious solutions for putting small strings together into a larger
one. For example, when you have all the pieces at once, in a few
variables:
largeString = small1 + small2 + ' something ' + small3 + ' yet more'
Or when you have a sequence of small string pieces:
largeString = ''
for piece in pieces:
    largeString += piece
Or, equivalently, but a bit more compactly:
import operator
from functools import reduce  # built in on Python 2; this import is needed on Python 3
largeString = reduce(operator.add, pieces, '')
However, none of these solutions is generally optimal. To put
together pieces stored in a few variables, the string-formatting
operator % is often best:
largeString = '%s%s something %s yet more' % (small1, small2, small3)
To join a sequence of small strings into one large string, the string
method join is invariably best:
largeString = ''.join(pieces)
In Python, string objects are immutable. Therefore, any operation on
a string, including string concatenation, produces a new string
object, rather than modifying an existing one. Concatenating N
strings thus involves building and then immediately throwing away
each of N-1 intermediate results. Performance is therefore quite a
bit better for operations that build no intermediate results, but
rather produce the desired end result at once. The string-formatting
operator % is one such operation, particularly suitable when you have
a few pieces (for example, each bound to a different variable) that
you want to put together, perhaps with some constant text in
addition. In addition to performance, which is never a major issue
for this kind of task, the % operator has several potential
advantages when compared to an expression that uses multiple +
operations on strings, including readability, once you get used to
it. Also, you don’t have to call str on pieces that aren’t already
strings (e.g., numbers) because the format specifier %s does so
implicitly. Another advantage is that you can use format specifiers
other than %s, so that, for example, you can control how many
significant digits the string form of a floating-point number should
display.
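As a small illustration of these points, the following sketch compares + with %. The names name and value are invented for this example and do not come from the recipe. It shows %s converting a number implicitly, and %.3g limiting a float to three significant digits:

```python
# Hypothetical values, chosen only for illustration.
name = 'pi'
value = 3.14159265

# With +, a non-string piece needs an explicit str() call.
with_plus = name + ' is about ' + str(value)

# With %, the %s specifier converts each piece to a string implicitly.
with_percent = '%s is about %s' % (name, value)

# Other specifiers give finer control: %.3g keeps three significant digits.
rounded = '%s is about %.3g' % (name, value)

print(with_plus)
print(with_percent)
print(rounded)      # pi is about 3.14
```

The first two results are identical; only the third differs, because its specifier asks for fewer digits.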
When you have many small string pieces in a sequence, performance can
become a truly important issue. The time needed for a loop using + or
+= (or a fancier but equivalent approach using reduce, which is a
built-in function in Python 2 and lives in the functools module in
Python 3) tends to grow with the square of the number of characters
you are accumulating, since the time to allocate and fill a large
string is roughly proportional to the length of that string.
Fortunately, Python offers an excellent alternative. The join method
of a string object s takes as its only argument a sequence of strings
and produces a string result obtained by joining all items in the
sequence, with a copy of s separating each item from its neighbors.
For example, ''.join(pieces) concatenates all the items of pieces in
a single gulp, without interposing anything between them. It’s the
fastest, neatest, and most elegant and readable way to put a large
string together.
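The sketch below illustrates both halves of this argument. The helper chars_copied_by_loop is hypothetical and rests on a deliberately simplified cost model, namely that every += fills a brand-new string from scratch. A real interpreter may optimize some cases, so the numbers show the growth pattern rather than measured timings:

```python
pieces = ['alpha', 'beta', 'gamma']

# The string on which join is called is placed between neighboring items.
print(', '.join(pieces))   # alpha, beta, gamma
print(''.join(pieces))     # alphabetagamma


def chars_copied_by_loop(pieces):
    """Count characters written if each += builds a whole new string.

    Simplified cost model, assumed only for illustration.
    """
    total = 0
    length = 0
    for piece in pieces:
        length += len(piece)   # length of the new intermediate result
        total += length        # that whole result must be filled in
    return total


many = ['x' * 10] * 1000                 # 1000 pieces of 10 characters each
loop_cost = chars_copied_by_loop(many)   # grows with the square of the size
join_cost = len(''.join(many))           # join fills the result only once

print(loop_cost)   # 5005000
print(join_cost)   # 10000
```

Under this model the loop writes about 500 times as many characters as join for the same 10,000-character result, and the gap widens as the number of pieces grows.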
Even when your pieces come in sequentially from input or computation,
and are not already available as a sequence, you should use a list to
hold the pieces. You can prepare that list with a list comprehension
or by calling the append or extend methods. At the end, when the list
of pieces is complete, you can build the string you want, typically
with ''.join(pieces). Of all the handy tips and tricks I could give
you about Python strings, I would call this one the most significant.
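A minimal sketch of this accumulate-then-join pattern follows. The function squares_report and its output format are made up for the example; only append, extend, list comprehensions, and join come from the discussion above:

```python
def squares_report(n):
    """Build one string from pieces that are computed one at a time."""
    pieces = []                              # accumulate small strings here
    for i in range(n):
        pieces.append('%d squared is %d\n' % (i, i * i))
    pieces.extend(['-- ', 'done', '\n'])     # extend adds several pieces at once
    return ''.join(pieces)                   # build the big string once, at the end


report = squares_report(3)
print(report)

# When every piece can be computed up front, a list comprehension
# prepares the list in one expression.
line = ' '.join([str(i * i) for i in range(5)])
print(line)   # 0 1 4 9 16
```

No intermediate string is ever built inside the loop; the only large string created is the final result.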