Looping in Parallel over Index and Sequence Items

Credit: Alex Martelli

Problem

You need to loop on a sequence, but at each step you also need to know what index into the sequence you have reached.

Solution

Together, the built-in functions xrange and zip make this easy. You need only this one instance of xrange, as it is fully reusable:

indices = xrange(sys.maxint)

Here’s how you use the indices instance:

for item, index in zip(sequence, indices):
    something(item, index)

This gives the same semantics as:

for index in range(len(sequence)):
    something(sequence[index], index)

but the change of emphasis allows greater clarity in many usage contexts.

Another alternative is to use class wrappers:

class Indexed:
    def _ _init_ _(self, seq):
        self.seq = seq
    def _ _getitem_ _(self, i):
        return self.seq[i], i

For example:

for item, index in Indexed(sequence):
    something(item, index)

In Python 2.2, with from _ _future_ _ import generators, you can also use:

def Indexed(sequence):
    iterator = iter(sequence)
    for index in indices:
        yield iterator.next(  ), index
    # Note that we exit by propagating StopIteration when .next raises it!

However, the simplest roughly equivalent way remains the good old:

def Indexed(sequence):
    return zip(sequence, indices)

Discussion

We often want to loop on a sequence but also need the current index in the loop body. The canonical Pydiom for this is:

for i in range(len(sequence)):

using sequence[i] as the item reference in the loop’s body. However, in many contexts, it is clearer to emphasize the loop on the sequence items rather than on the indexes. zip provides an easy alternative, looping on indexes and items in parallel, since it truncates at the shortest of its arguments. Thus, it’s okay for some arguments to be unbounded sequences, as long as not all the arguments are unbounded. An unbounded sequence of indexes is trivial to write (xrange is handy for this), and a reusable instance of that sequence can be passed to zip, in parallel to the sequence being indexed.

The same zip usage also affords a client code-transparent alternative to the use of a wrapper class Indexed, as demonstrated by the Indexed class, generator, and function shown in the solution. Of these, when applicable, zip is simplest.

The performance of each of these solutions is roughly equivalent. They’re all O(N) (i.e., they execute in time proportional to the number of elements in the sequence), they all take O(1) extra memory, and none is anything close to twice as fast or as slow as another.

Note that zip is not lazy (i.e., it cannot accept all argument sequences being unbounded). Therefore, in certain cases in which zip cannot be used (albeit not the typical one in which range(len(sequence)) is the alternative), other kinds of loop might be usable. See Recipe 17.13 for lazy, iterator-based alternatives, including an xzip function (Python 2.2 only).

See Also

Recipe 17.13; the Library Reference section on sequence types.

Get Python Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.