Credit: Sami Hangaslammi
You need to operate on iterators (including normal sequences) with the same semantics as normal sequence operations, except that lazy evaluation is a must, because some of the iterators involved could represent unbounded sequences.
Python 2.2 iterators are easy to handle via higher-order functions, and lazy evaluation (such as that performed by the xrange built-in function) can be generalized. Here are some elementary operations that include concatenating several iterators, terminating iteration when a function becomes false, terminating iteration after the first n values, and returning every nth result of an iterator:
from __future__ import generators

def itercat(*iterators):
    """ Concatenate several iterators into one. """
    for i in iterators:
        i = iter(i)
        for x in i:
            yield x

def iterwhile(func, iterator):
    """ Iterate for as long as func(value) returns true. """
    iterator = iter(iterator)
    while 1:
        next = iterator.next()
        if not func(next):
            raise StopIteration  # or: return
        yield next

def iterfirst(iterator, count=1):
    """ Iterate through 'count' first values. """
    iterator = iter(iterator)
    for i in xrange(count):
        yield iterator.next()

def iterstep(iterator, n):
    """ Iterate every nth value. """
    iterator = iter(iterator)
    while 1:
        yield iterator.next()
        # Skip n-1 values
        for dummy in range(n-1):
            iterator.next()
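In modern Python 3, the same iterfirst idea can be sketched with the next() built-in in place of the .next() method and range in place of xrange. This rendition, and the unbounded naturals() helper used to exercise it, are illustrative additions rather than part of the original recipe:

```python
def iterfirst(iterator, count=1):
    # Python 3 spelling: next(it) instead of it.next(), range
    # instead of xrange. Stops cleanly if the source runs dry.
    iterator = iter(iterator)
    for _ in range(count):
        try:
            yield next(iterator)
        except StopIteration:
            return  # PEP 479: return, don't let StopIteration escape

def naturals():
    # An unbounded sequence: 0, 1, 2, ...
    n = 0
    while True:
        yield n
        n += 1

print(list(iterfirst(naturals(), 5)))  # -> [0, 1, 2, 3, 4]
```

Because iterfirst pulls only count values, it is safe to apply to an unbounded source such as naturals().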
A bit less elementary, but still generally useful, are functions that transform an iterator’s output, not just selecting which values to return and which to skip, but actually changing the structure. For example, here is a function that bunches up an iterator’s results into a sequence of tuples, each of length count:
from __future__ import generators

def itergroup(iterator, count, keep_partial=1):
    """ Iterate in groups of 'count' values. If there aren't
        enough values for the last group, it's padded with None's,
        or discarded if keep_partial is passed as false. """
    iterator = iter(iterator)
    while 1:
        result = [None]*count
        for x in range(count):
            try:
                result[x] = iterator.next()
            except StopIteration:
                if x and keep_partial:
                    break
                else:
                    raise
        yield tuple(result)
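A Python 3 rendition of itergroup can be sketched as follows; this is an illustrative adaptation, and the only substantive change is that PEP 479 forbids raising StopIteration inside a generator, so the exhausted case must use return instead of re-raising:

```python
def itergroup(iterator, count, keep_partial=True):
    # Yield tuples of 'count' values; the last tuple is padded
    # with None, or dropped entirely if keep_partial is false.
    iterator = iter(iterator)
    while True:
        result = [None] * count
        for x in range(count):
            try:
                result[x] = next(iterator)
            except StopIteration:
                if x and keep_partial:
                    break       # yield the padded partial group
                return          # no (wanted) partial group: stop
        yield tuple(result)

print(list(itergroup(range(7), 3)))
# -> [(0, 1, 2), (3, 4, 5), (6, None, None)]
```

Passing keep_partial=False drops the trailing (6, None, None) group instead of padding it.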
And here are generalizations to lazy evaluation of the non-lazy existing built-in Python functions zip, map, filter, and reduce:
from __future__ import generators

def xzip(*iterators):
    """ Iterative (lazy) version of built-in 'zip'. """
    iterators = map(iter, iterators)
    while 1:
        yield tuple([x.next() for x in iterators])

def xmap(func, *iterators):
    """ Iterative (lazy) version of built-in 'map'. """
    iterators = map(iter, iterators)
    count = len(iterators)
    def values():
        # map pads shorter sequences with None when they run out of values
        result = [None]*count
        some_ok = 0
        for i in range(count):
            if iterators[i] is not None:
                try:
                    result[i] = iterators[i].next()
                except StopIteration:
                    iterators[i] = None
                else:
                    some_ok = 1
        if some_ok:
            return tuple(result)
        else:
            raise StopIteration
    while 1:
        args = values()
        if func is None:
            yield args
        else:
            yield func(*args)

def xfilter(func, iterator):
    """ Iterative (lazy) version of built-in 'filter'. """
    iterator = iter(iterator)
    while 1:
        next = iterator.next()
        if func(next):
            yield next

def xreduce(func, iterator, default=None):
    """ Iterative version of built-in 'reduce'. """
    iterator = iter(iterator)
    try:
        prev = iterator.next()
    except StopIteration:
        return default
    single = 1
    for next in iterator:
        single = 0
        prev = func(prev, next)
    if single:
        return func(prev, default)
    return prev
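Under Python 3 (where map returns an iterator and .next() became next()), xzip can be sketched as below. This is an illustrative rendition of the same pattern; note that in Python 3 the built-in zip is itself already lazy, so the rewrite is mainly for demonstration:

```python
def xzip(*iterables):
    # Lazy zip: stop as soon as any input is exhausted.
    iterators = [iter(i) for i in iterables]
    while True:
        result = []
        for it in iterators:
            try:
                result.append(next(it))
            except StopIteration:
                return  # PEP 479: return, don't raise StopIteration
        yield tuple(result)

def naturals():
    # Unbounded sequence 0, 1, 2, ..., used to show laziness.
    n = 0
    while True:
        yield n
        n += 1

print(list(xzip(naturals(), "abc")))  # -> [(0, 'a'), (1, 'b'), (2, 'c')]
```

Pairing an unbounded source with a three-element string terminates after three tuples, exactly because the zipping is lazy.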
This recipe is a collection of small utility functions for iterators (all functions can also be used with normal sequences). Among other things, the module presented in this recipe provides generator (lazy) versions of the built-in sequence-manipulation functions. The generators can be combined to produce a more specialized iterator. This recipe requires Python 2.2 or later, of course.
The built-in sequence-manipulation functions zip, map, and filter are specified to return sequences (and the specifications cannot be changed, for backward compatibility with versions of Python before 2.2, which lacked iterators); therefore, they cannot become lazy. However, it’s easy to write lazy iterator-based versions of these useful functions, as well as other iterator-manipulation functions, as exemplified in this recipe.
Of course, lazy evaluation is not terribly useful in certain cases. The semantics of reduce, for example, require that all of the sequence is evaluated anyway. While in some cases one could save some memory by looping through the sequence that the iterator yields, rather than expanding it, most often it will be more practical to use reduce(func, iterator) instead of the xreduce function presented in this recipe.
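To illustrate the point, reduce (a built-in in Python 2, moved to functools in Python 3) already accepts any iterable, including a generator, so the iterator can simply be passed in directly. A small check, using example values of my own choosing:

```python
from functools import reduce  # built-in in Python 2; functools in Python 3
import operator

# reduce consumes an iterator just as readily as a list; the whole
# stream is evaluated either way, so laziness buys nothing here.
squares = (x * x for x in range(1, 5))   # 1, 4, 9, 16
total = reduce(operator.add, squares)
print(total)  # -> 30
```

The generator is fully drained by reduce, which is exactly why a lazy xreduce offers little advantage over the built-in.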
Lazy evaluation is most useful when the resulting iterator-represented sequence is used in contexts that may be able to use just a reasonably short prefix of the sequence, such as the xzip function and the iterwhile and iterfirst functions in this recipe. In such cases, lazy evaluation enables free use of unbounded sequences (of course, the resulting program will terminate only if each unbounded sequence is used only in a context in which only a finite prefix of it is taken) and sequences of potentially humongous length.
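For instance, a Python 3 sketch of iterwhile applied to an unbounded sequence terminates precisely because only a finite prefix satisfies the predicate; the powers_of_two helper is an illustrative assumption, not part of the original recipe:

```python
def iterwhile(func, iterator):
    # Yield values while func(value) is true (cf. itertools.takewhile).
    for value in iterator:
        if not func(value):
            return
        yield value

def powers_of_two():
    # Unbounded sequence 1, 2, 4, 8, ...
    p = 1
    while True:
        yield p
        p *= 2

print(list(iterwhile(lambda x: x < 100, powers_of_two())))
# -> [1, 2, 4, 8, 16, 32, 64]
```

The loop stops at the first value of 128 or more, even though powers_of_two never terminates on its own.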
See also: Recipe 17.11 and Recipe 17.12 for other uses of iterators.