8. Iterators and Generators

In the last section of the previous chapter, the central role sequences play in functional programming and the need for their efficient representation were mentioned. The idea of representing a sequence as an object that computes and returns the next value of the sequence only at the moment that value is needed was also introduced. This may seem hard to grasp at first, but this chapter is dedicated to explaining this wonderful idea. It begins, however, with a description of a profound construct that has been left out of the discussion until now: iterators.

8.1 Iterators

An iterable in Python is any object that implements the __iter__ special method which, when called, returns an iterator (the __iter__ special method is invoked by a call to iter(obj)). Simply put, a Python iterable is any type that can be used with a for...in loop. Python lists, tuples, dicts and sets are all examples of built-in iterables. Iterators are objects that implement the iterator protocol. The iterator protocol defines the following set of methods that must be implemented by any object that wants to be used as an iterator.

  • __iter__ method that is called on initialization of an iterator. This should return an object that has a __next__ method.
  • __next__ method that is called whenever the next() global function is invoked with the iterator as argument. The iterator’s __next__ method should return the next value of the iterable. When an iterator is used with a for...in loop, the for loop implicitly calls next() on the iterator object. This method should raise a StopIteration exception when there is no longer any new value to return to signal the end of the iteration.

Care should be taken to distinguish between an iterable and an iterator, because an iterable is not necessarily an iterator. The following snippet shows the difference.

    >>> x = [1, 2, 3]
    >>> type(x) 
    <class 'list'>
    >>> x_iter = iter(x) 
    >>> type(x_iter)
    <class 'list_iterator'>
    # x is iterable & can be used in a for loop but is not an iterator as it
    # does not have the __next__ method
    >>> dir(x)
    ['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__',\
 '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__', \
'__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__re\
duce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__',\
 '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 're\
verse', 'sort']
    # x_iter is an iterator as it has the __iter__ and __next__ methods
    >>> dir(x_iter)
    ['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__\
', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__length_hint__', '__lt__', '__ne__', '__new__\
', '__next__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__\
str__', '__subclasshook__']

Something worth noting is that an iterable object is quite often also its own iterator, in which case a call to the object's __iter__ special method returns the object itself. This will be seen later on in this section.
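
For example, calling iter() on an object that is already an iterator simply returns the same object:

    >>> x_iter = iter([1, 2, 3])
    >>> iter(x_iter) is x_iter
    True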

Any class that fully implements the iterator protocol can be used as an iterator. This is illustrated in the following by implementing a simple iterator that returns Fibonacci numbers up to a given maximum value.

    class Fib:                                        
        def __init__(self, max):                      
            self.max = max

        def __iter__(self):
            self.a = 0
            self.b = 1
            return self  # the object is both an iterable and an iterator

        def __next__(self):                          
            fib = self.a
            if fib > self.max:
                raise StopIteration                  
            self.a, self.b = self.b, self.a + self.b
            return fib           

    >>> for i in Fib(10):
    ...     print(i)
    ...

    0
    1
    1
    2
    3
    5
    8      

A custom range function for looping through numbers can also be modelled as an iterator. The following is a simple implementation of a range function that loops from 0 upwards.

    class CustomRange:
        def __init__(self, max):
            self.max = max

        def __iter__(self):
            self.curr = 0
            return self

        def __next__(self):
            numb = self.curr
            if self.curr >= self.max:
                raise StopIteration
            self.curr += 1
            return numb

    for i in CustomRange(10):
        print(i)
    0
    1
    2
    3
    4
    5
    6
    7
    8
    9

Before attempting to move on, stop for a second and study both examples carefully. The essence of an iterator is that it knows how to calculate and return the elements of a sequence as they are needed, not all at once. CustomRange does not produce all the elements in the range when it is initialized; rather, its __iter__ method returns an iterator object that can calculate the next element of the range using the steps defined in the __next__ method. It is even possible to define a range that yields all positive whole numbers (an infinite sequence) simply by removing the upper bound in the __next__ method. The same idea applies to the Fib iterator.

This basic idea can be seen in built-in functions that return sequences. For example, the built-in range function does not return a list as one might intuitively expect; it returns a range object that produces a range_iterator object when its __iter__ method is called. To get the full sequence at once, the range object is passed to the list constructor as shown in the following example.

    >>> ran = range(0, 10)
    >>> type(ran)
    <class 'range'>
    >>> dir(ran)
    ['__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '\
__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt\
__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__setattr__', '__siz\
eof__', '__str__', '__subclasshook__', 'count', 'index', 'start', 'step', 'stop']
    >>> iter = ran.__iter__()
    >>> iter
    <range_iterator object at 0x1012a4090>
    >>> type(iter)
    <class 'range_iterator'>
    >>> iter.__next__()
    0
    >>> iter.__next__()
    1
    >>> iter.__next__()
    2
    >>> iter.__next__()
    3
    >>> iter.__next__()
    4
    >>> iter.__next__()
    5
    >>> iter.__next__()
    6
    >>> iter.__next__()
    7
    >>> iter.__next__()
    8
    >>> iter.__next__()
    9
    >>> iter.__next__()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
    >>> ran=range(10)
    >>> ran
    range(0, 10)
    >>> list(ran) # use list to calculate all values in the sequence at once
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> 

The iterator protocol implements a form of computing referred to as lazy computation: it does no more work than it has to at any given time.
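
To make this laziness concrete, the following is a minimal sketch (under the hypothetical name InfiniteRange) of what dropping the upper bound from CustomRange looks like; the iterator then represents an infinite sequence, yet each value is computed only on demand.

    class InfiniteRange:
        def __iter__(self):
            self.curr = 0
            return self

        def __next__(self):
            # never raises StopIteration; the sequence is unbounded
            numb = self.curr
            self.curr += 1
            return numb

    # consume only as many values as are actually needed
    inf = iter(InfiniteRange())
    print([next(inf) for _ in range(5)])  # prints [0, 1, 2, 3, 4]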

The Itertools Module

The concept of iterators is so important that Python comes with a module, the itertools module, that provides some useful general purpose functions that return iterators. The results of these functions can be obtained eagerly by passing the returned iterator to the list() constructor. A few of these functions are described below.

  1. accumulate(iterable[, func]): This takes an iterable and an optional func argument that defaults to the operator.add function. The supplied function should take two arguments and return a single value. The elements of the iterable must be a type that is acceptable to the supplied function. A call to accumulate returns an iterator that represents the result of applying the supplied function to the elements of the iterable. The accumulated result for the first element of an iterable is the element itself while the accumulated result for the nth element is func(nth element, accumulated result of (n-1)th element). Examples of the usage of this function are shown in the following snippet.
         >>> from itertools import *
         >>> accumulate([1,2,3,4,5])
         <itertools.accumulate object at 0x101c67c08>
         >>> list(accumulate([1,2,3,4,5]))
         [1, 3, 6, 10, 15]
         >>> import operator
         >>> accumulate(range(1, 10), operator.mul)
         <itertools.accumulate object at 0x101c6d0c8>
         >>> list(accumulate(range(1, 10), operator.mul))
         [1, 2, 6, 24, 120, 720, 5040, 40320, 362880]
         >>> 
    
  2. chain(*iterables): This takes a variable number of iterables as arguments and returns an iterator that yields the elements of the first iterable, then those of the second, and so on, as though they were one continuous sequence. The related class method chain.from_iterable takes a single iterable containing other iterables and chains them in the same way, as shown below.
         >>> x = [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
         >>> chain.from_iterable(x)
         <itertools.chain object at 0x101c6a208>
         >>> list(chain.from_iterable(x))
         ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
         >>> 
    
  3. combinations(iterable, r): This returns an iterator over the r-length sub-sequences of elements from the input iterable. Elements are treated as unique based on their position, not on their value, so if the input elements are unique there will be no repeated values within each combination.
         >>> combinations('ABCDE', 3)
         <itertools.combinations object at 0x101c71138>
         >>> list(combinations('ABCDE', 3))
         [('A', 'B', 'C'), ('A', 'B', 'D'), ('A', 'B', 'E'), ('A', 'C', 'D'), ('A', 'C', 'E'), ('A', 'D', 'E')\
    , ('B', 'C', 'D'), ('B', 'C', 'E'), ('B', 'D', 'E'), ('C', 'D', 'E')]
         >>> 
    
  4. filterfalse(predicate, iterable): This returns an iterator that filters elements from the iterable argument, returning only those for which applying the predicate to the element gives False. If predicate is None, the function returns the items that are false. (A combined example of this and the following functions is given after this list.)
  5. groupby(iterable, key=None): This returns an iterator that returns consecutive keys and corresponding groups for these keys from the iterable argument. The key argument is a function computing a key value for each element. If a key function is not specified or is None, the key defaults to an identity function that returns the element unchanged. Generally, the iterable needs to already be sorted on the same key function. The returned group is itself an iterator that shares the underlying iterable with groupby(). An example usage of this is shown in the following snippet.
     >>> from itertools import groupby
     >>> {k:list(g) for k, g in groupby('AAAABBBCCD')}
     {'D': ['D'], 'B': ['B', 'B', 'B'], 'A': ['A', 'A', 'A', 'A'], 'C': ['C', 'C']}
     >>> [k for k, g in groupby('AAAABBBCCDAABBB')]
     ['A', 'B', 'C', 'D', 'A', 'B']
    
  6. islice(iterable, start, stop[, step]): This returns an iterator that returns selected elements from the iterable. If start is non-zero, elements from the iterable are skipped until start is reached. Afterwards, elements are returned consecutively, with step - 1 elements skipped between returns if step is greater than one, just as in conventional slicing, until the stop position or the iterable itself is exhausted. Unlike conventional slicing, islice() does not support negative values for start, stop, or step.
  7. permutations(iterable, r=None): This returns successive r-length permutations of elements in the iterable. If r is not specified or is None, it defaults to the length of the iterable. Elements are treated as unique based on their position, not on their value, so if the input elements are unique there will be no repeated values within each permutation.
  8. product(*iterables, repeat=1): This returns an iterator that yields successive Cartesian products of the input iterables. It is equivalent to nested for-loops in a generator expression; for example, product(A, B) returns an iterator producing the same values as ((x, y) for x in A for y in B). The function can compute the product of an iterable with itself by specifying the number of repetitions with the optional repeat keyword argument; for example, product(A, repeat=4) means the same as product(A, A, A, A).
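
The following short REPL session illustrates chain, filterfalse, islice, permutations and product as just described.

    >>> from itertools import chain, filterfalse, islice, permutations, product
    >>> list(chain(['a', 'b'], ['c', 'd']))  # the iterables are passed as separate arguments
    ['a', 'b', 'c', 'd']
    >>> list(filterfalse(lambda x: x % 2, range(10)))  # keep elements for which the predicate is false
    [0, 2, 4, 6, 8]
    >>> list(islice(range(100), 2, 10, 2))  # lazily slice an iterable
    [2, 4, 6, 8]
    >>> list(permutations('AB'))
    [('A', 'B'), ('B', 'A')]
    >>> list(product('AB', repeat=2))
    [('A', 'A'), ('A', 'B'), ('B', 'A'), ('B', 'B')]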

8.2 Generators

Generators and iterators have a very intimate relationship. In short, Python generators are iterators, and understanding generators gives one an idea of how iterators can be implemented. This may sound circular, but after going through an explanation of generators it will become clearer. PEP 255, which describes simple generators, refers to generators by their full name: generator-iterators. Generators, just as the name suggests, generate (or consume) values when their __next__ method is called. Generators are used either by explicitly calling the __next__ method on the generator object or by using the generator object in a for...in loop. Generators are of two types:

  1. Generator Functions
  2. Generator Expressions

Generator Functions

Generator functions are functions that contain the yield expression. Calling a function that contains a yield expression returns a generator object. For example, the Fibonacci iterator can be recast as a generator using the yield keyword as shown in the following example.

    def fib(max):
        a, b = 0, 1
        while a < max:
            yield a
            a, b = b, a + b
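
The generator can be consumed just like the class-based iterator; for instance, passing it to the list constructor drains it eagerly.

    >>> list(fib(10))
    [0, 1, 1, 2, 3, 5, 8]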

The yield keyword

The yield keyword has the following syntax.

    yield expression_list

The yield expression is central to generator functions, but what does it really do? To understand it, contrast it with the return keyword. When the return keyword is encountered, control returns to the caller of the function, effectively ending the function's execution. This is shown in the following example by calling a normal Fibonacci function that returns all Fibonacci numbers less than 10.

    >>> def fib(max):
    ...     numbers = []
    ...     a, b = 0, 1
    ...     while a < max:
    ...         numbers.append(a)
    ...         a, b = b, a+b
    ...     return numbers
    ...
    >>> fib(10) # all values are returned at once
    [0, 1, 1, 2, 3, 5, 8]

On the other hand, the presence of a yield expression in a function complicates things a bit. When a function with a yield expression is called, the function body does not run as a normal function's would; instead the call returns a generator object. This is illustrated by a call to the fib function in the following snippet.

    >>> f = fib(10)
    >>> f
    <generator object fib at 0x1013d8828>

The generator object executes when its __next__ method is invoked; on each call, it runs the statements in the function definition until the yield keyword is encountered.

    >>> f.__next__()
    0
    >>> f.__next__()
    1
    >>> f.__next__()
    1
    >>> f.__next__()
    2

The generator suspends execution at that point, saves its context and returns any value in the expression_list to the caller. When the caller invokes the __next__() method of the generator object, execution of the function continues until another yield or return expression is encountered or the end of the function is reached. This continues until the loop condition is false and a StopIteration exception is raised to signal that there is no more data to generate. To quote PEP 255:

If a yield statement is encountered, the state of the function is frozen, and the value of expression_list is returned to .__next__()'s caller. By “frozen” we mean that all local state is retained, including the current bindings of local variables, the instruction pointer, and the internal evaluation stack: enough information is saved so that the next time .next() is invoked, the function can proceed exactly as if the yield statement were just another external call.

On the other hand, when a function encounters a return statement, it returns to the caller along with any value following the return statement, and the execution of the function is complete for all intents and purposes. One can think of yield as causing only a temporary interruption in the execution of a function.

With a better understanding of generators, it is not difficult to see how generators can be used to implement iterators. Generators know how to calculate the next value in a sequence, so functions that return iterators can be rewritten using the yield statement. To illustrate this, the accumulate function from the itertools module can be rewritten using a generator as in the following snippet.

    import operator

    def accumulate(iterable, func=operator.add):
        'Return running totals'
        # accumulate([1,2,3,4,5]) --> 1 3 6 10 15
        # accumulate([1,2,3,4,5], operator.mul) --> 1 2 6 24 120
        it = iter(iterable)
        try:
            total = next(it)
        except StopIteration:
            return
        yield total
        for element in it:
            total = func(total, element)
            yield total
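
A quick check shows that this generator-based version behaves like itertools.accumulate.

    >>> list(accumulate([1, 2, 3, 4, 5]))
    [1, 3, 6, 10, 15]
    >>> list(accumulate([1, 2, 3, 4, 5], operator.mul))
    [1, 2, 6, 24, 120]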

Similarly, one can emulate a generator object by implementing the iterator protocol discussed at the start of this chapter. However, the yield keyword provides a more succinct and elegant method for creating generators.

Generator Expressions

In the previous chapter, list comprehensions were discussed. One drawback of list comprehensions is that all values are calculated at once, whether or not they are needed at that time (eager calculation); this may consume an inordinate amount of memory. PEP 289 proposed the generator expression to resolve this, and the proposal was accepted and added to the language. Generator expressions are like list comprehensions; the only difference is that the square brackets of a list comprehension are replaced by parentheses, and the expression evaluates to a generator object rather than a list.

To generate a list of the squares of the numbers from 0 to 9 using a list comprehension, the following is done.

    >>> squares = [i**2 for i in range(10)]
    >>> squares
    [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

A generator expression could be used in place of a list comprehension as shown in the following snippet.

    >>> squares = (i**2 for i in range(10))
    >>> squares
    <generator object <genexpr> at 0x1069a6d70>

The values of the generator can then be accessed using for...in loops or via a call to the __next__() method of the generator object as shown below.

    >>> squares = (i**2 for i in range(10))
    >>> for square in squares:
    ...     print(square)
    ...
    0
    1
    4
    9
    16
    25
    36
    49
    64
    81

Generator expressions create generator objects without using the yield expression.
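
Because a generator expression never materializes the whole list, it can be fed directly to a consuming function such as the built-in sum(), with only one value in existence at a time.

    >>> sum(i**2 for i in range(10))  # no intermediate list is built
    285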

The Beauty of Generators and Iterators

Generators really shine when working with massive data streams. Consider the representative but rather simple example of generating a stream of prime numbers. A classic method for calculating this set is the Sieve of Eratosthenes. The following algorithm finds all the prime numbers less than or equal to a given integer, n, by the sieve of Eratosthenes' method:

    1. Create a list of consecutive integers from 2 to n: (2, 3, 4, ..., n).
    2. Initially, let p equal 2, the first prime number.
    3. Starting from p, enumerate its multiples by counting to n in increments of p, and mark them in the list (these will be 2p, 3p, 4p, ...; p itself should not be marked).
    4. Find the first number greater than p in the list that is not marked. If there was no such number, stop. Otherwise, let p now equal this new number (which is the next prime), and repeat from step 3.

When the algorithm terminates, the remaining unmarked numbers in the list are all the primes below n. A stream-based variant of this idea is implemented using generators in the following snippet.

        from itertools import count

        def primes_to(gen, max_val):
            for i in range(max_val):
                print(gen.__next__())

        def filter_multiples_of_n(n, ints):
            # ints is a generator that can have other generators within it
            for i in ints:
                if (i % n) != 0:
                    yield i

        def sieve(ints):
            while True:
                prime = ints.__next__()
                yield prime
                # ints is now a generator that produces integers that are not
                # multiples of prime
                ints = filter_multiples_of_n(prime, ints)

        # the first 10 prime numbers
        primes_to(sieve(count(2)), 10)
        2
        3
        5
        7
        11
        13
        17
        19
        23
        29

The above example, though simple, shows the beauty of how generators can be chained together, with the output of one acting as the input of another; think of this stacking of generators as a kind of processing pipeline. The filter_multiples_of_n function is worth discussing a bit here because it may be confusing at first. count(2), when initialized, returns a generator that yields consecutive numbers starting from 2, so the line prime = ints.__next__() returns 2 on the first iteration. After the yield expression, ints = filter_multiples_of_n(2, ints) is invoked, creating a generator that yields a stream of numbers that are not multiples of 2; note that the original sequence generator is captured within this new generator (this is very important).

On the next iteration of the loop within the sieve function, the ints generator is invoked. It loops through the original sequence, now [3, 4, 5, 6, 7, ...], yielding the first number that is not divisible by 2, which is 3 in this case. This part of the pipeline is easy to understand. The prime, 3, is yielded from the sieve function, then another generator that yields non-multiples of 3 is created and assigned to ints. This generator captures the previous generator that produces non-multiples of 2, and that generator in turn captured the original generator that produces the infinite sequence of consecutive numbers. A call to the __next__() method of this new generator loops through the previous generator, and every non-multiple of 2 returned is checked for divisibility by 3; if the number is not divisible by 3, it is yielded. This chaining of generators goes on and on: the next prime is 5, so the generator excluding multiples of 5 loops through the generator that yields non-multiples of 3, which in turn loops through the generator that yields non-multiples of 2.

This streaming of data through multiple generators can be applied to the space-efficient (and sometimes time-efficient) processing of any other massive data stream, such as log files and databases. Generators, however, have other nifty and mind-blowing use cases, as will be seen in the following sections.
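
As a small sketch of such a pipeline (the file name access.log and the helper functions here are hypothetical, for illustration only), generators can be chained so that only one log line is ever held in memory at a time.

    def read_lines(path):
        # lazily yield lines from a file, one at a time
        with open(path) as f:
            for line in f:
                yield line

    def grep(pattern, lines):
        # pass through only the lines that contain pattern
        for line in lines:
            if pattern in line:
                yield line

    # nothing is read from disk until the pipeline is consumed
    errors = grep('ERROR', read_lines('access.log'))
    for line in errors:
        print(line, end='')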

8.3 From Generators To Coroutines

“Subroutines are special cases of … coroutines.”

– Donald Knuth

A subroutine is a set of program instructions bundled together to perform a specific task. Functions and methods are examples of subroutines in Python. Subroutines have a single point of entry; ordinary functions and methods, once called, execute until they exit and cannot be suspended midway. Coroutines, however, are a more general program construct that allows multiple entry points for suspending and resuming execution. Multiple entry points for suspending and resuming execution sound exactly like what the yield expression provides to generator functions, and in fact one could argue that Python generators are coroutines, because they allow the production and consumption of values at their suspension and resumption points. The send() method of generators, added in Python version 2.5, gives generators the ability to consume data when they resume execution. The Python documentation describes the send() method as follows.

generator.send(value): Resumes the execution and “sends” a value into the generator function. The value argument becomes the result of the current yield expression. The send() method returns the next value yielded by the generator, or raises StopIteration if the generator exits without yielding another value. When send() is called to start the generator, it must be called with None as the argument, because there is no yield expression that could receive the value.

The above explanation may be a little ambiguous, so the use of the send() method is illustrated with the following example.

    >>> def find_pattern(pattern):
    ...     print("looking for {}".format(pattern))
    ...     while True:
    ...         line = yield
    ...         if pattern in line:
    ...             print(line)
    ... 

The generator is initialized and run as shown in the following snippet.

    >>> g = find_pattern('python')
    # coroutines must be primed with a call to next before sending in values
    >>> g.__next__() 
    looking for python
    >>> g.send("Yeah, but no, but year, but no")
    >>> g.send("python generators rock!")
    python generators rock!

To fully grasp the send() method, observe that the argument passed to send() becomes the result of the yield expression at which the generator is suspended; in the above example, the value that send() is called with is assigned to the variable line. The rest of the function is straightforward. Note that calling send(None) is equivalent to calling the generator's __next__() method.

Simulating Multitasking with Coroutines

The ability to send data into generators using the send() method truly expands the frontier for generators. Now think carefully about the tools in our arsenal at this point:

1. Functions that can run independently of one another while maintaining state.
2. Functions that can suspend and resume execution multiple times.
3. Functions that can receive data when they resume.

A little thinking shows that multiple generators or coroutines can be scheduled to run in an interleaved manner, giving the appearance that they are executing simultaneously. With that, we have multitasking, or something very like it. In this section, rudimentary multitasking is simulated to illustrate the versatility of generators and coroutines.

In reality, even a full-blown multitasking operating system is only ever executing a single task at any instant on a given processor core. A task is any set of instructions with its own internal state. For example, a simple task may take a stream of text and print a running count of the occurrences of some words in the stream, or may simply print any text it receives. What is important here is that tasks are totally independent of each other. The illusion of multitasking is achieved by giving each task a slice of time to run until it encounters a trap that forces it to stop so that other tasks can run. This happens so fast that human users cannot perceive the switching. It can easily be simulated with a collection of coroutines that run independently of each other, as shown in this section. A very simple example of this multitasking is shown in the following snippet, in which text is read from a source and then sent for processing to multiple coroutines. In the snippet, a task is modelled as a thin wrapper around a coroutine.

    from collections import defaultdict

    class Task():

        def __init__(self, coroutine):
            self.coroutine = coroutine
            next(self.coroutine)

        def run(self, value):
            self.coroutine.send(value)

    def read(source):
        for line in source:
            yield line

    def print_line():
        while True:
            text = yield
            print(text)

    def word_count():
        word_counts = defaultdict(int)
        while True:
            text = yield
            for word in text.split():
                word_counts[word] += 1
            print("Word distribution so far is ", word_counts)


    def run():
        f = open("data.txt")
        source = read(f)
        tasks = [ Task(print_line()), Task(word_count()) ]
        for line in source:
            for task in tasks:
                try:
                    task.run(line)
                except StopIteration:
                    tasks.remove(task)


    if __name__ == '__main__':
        run()
    
    We love python don't we?
    Word distribution so far is  defaultdict(<class 'int'>, {"don't": 1, 'we?': 1, 'python': 1, 'love': 1,\
 'We': 1})
    No we don't love python
    Word distribution so far is  defaultdict(<class 'int'>, {"don't": 2, 'we?': 1, 'python': 2, 'we': 1, '\
We': 1, 'No': 1, 'love': 2})

Observe how the outputs are interleaved: each coroutine executes for a limited time, then another coroutine executes. The above example is instructive in showing the power of generators and coroutines, though it is provided for illustration purposes only. The asyncio module, part of the standard library since Python 3.4, provides full support for concurrent programming using coroutines.

8.4 The yield from keyword

Sometimes, refactoring a block of code may involve moving some functionality into another generator. Using just the yield keyword may prove cumbersome in some of these cases and impossible in others, such as when a value sent to the delegating generator needs to be forwarded to the delegated generator. Scenarios like these led to the introduction of the yield from keyword. It allows a section of code containing yield to be moved into another generator. Furthermore, the delegated generator can return a value that is made available to the delegating generator. An instructive example of how the yield from keyword works is given in the following snippet (note that range returns a lazy iterable, not a list, so it can be delegated to with yield from).

    >>> def g(x):
    ...     yield from range(x, 0, -1)
    ...     yield from range(x)
    ...
    >>> list(g(5))
    [5, 4, 3, 2, 1, 0, 1, 2, 3, 4]

As previously mentioned, merely yielding values from a delegated generator was not the only reason for introducing the yield from keyword, because the effect of the previous yield from snippet can be replicated without it, as shown in the following example.

    >>> def g(x):
    ...     r = []
    ...     for i in range(x, 0, -1):
    ...         r.append(i)
    ...     for j in range(x):
    ...         r.append(j)
    ...     return r
    ... 
    >>> x = g(5)
    >>> x
    [5, 4, 3, 2, 1, 0, 1, 2, 3, 4]

The real benefit of the yield from keyword comes from the ability of a calling generator to transparently send values into the delegated generator, as shown in the following example. If a value is sent into the delegating generator, yield from implicitly sends that same value on to the delegated generator.

    >>> def accumulate():
    ...     tally = 0
    ...     while 1:
    ...         next = yield
    ...         if next is None:
    ...             return tally
    ...         tally += next
    ...
    >>> def gather_tallies(tallies):
    ...     while 1:
    ...         tally = yield from accumulate()
    ...         tallies.append(tally)
    ...
    >>> tallies = []
    >>> acc = gather_tallies(tallies)
    >>> next(acc) # Ensure the accumulator is ready to accept values
    >>> for i in range(4):
    ...     acc.send(i)
    ...
    >>> acc.send(None) # Finish the first tally
    >>> for i in range(5):
    ...     acc.send(i)
    ...
    >>> acc.send(None) # Finish the second tally
    >>> tallies
    [6, 10]

The complete semantics for yield from is explained in PEP 380 and given below.

  1. Any values that the iterator yields are passed directly to the caller.
  2. Any values sent to the delegating generator using send() are passed directly to the iterator. If the sent value is None, the iterator’s __next__() method is called. If the sent value is not None, the iterator’s send() method is called. If the call raises StopIteration, the delegating generator is resumed. Any other exception is propagated to the delegating generator.
  3. Exceptions other than GeneratorExit thrown into the delegating generator are passed to the throw() method of the iterator. If the call raises StopIteration, the delegating generator is resumed. Any other exception is propagated to the delegating generator.
  4. If a GeneratorExit exception is thrown into the delegating generator, or the close() method of the delegating generator is called, then the close() method of the iterator is called if it has one. If this call results in an exception, it is propagated to the delegating generator. Otherwise, GeneratorExit is raised in the delegating generator.
  5. The value of the yield from expression is the first argument to the StopIteration exception raised by the iterator when it terminates.
  6. return expr in a generator causes StopIteration(expr) to be raised upon exit from the generator.
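
Point 6 can be observed directly in the REPL: calling __next__() on a generator whose body executes a return statement raises a StopIteration carrying the returned value.

    >>> def f():
    ...     return 42
    ...     yield  # never reached; its presence makes f a generator function
    ...
    >>> gen = f()
    >>> gen.__next__()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration: 42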

8.5 A Game of Life

To end the chapter, a very simple game, Conway’s Game of Life, is implemented using generators and coroutines to simulate the basics of the game; a thorough understanding of this example will prove further enlightening. The example is inspired by the chapter of Brett Slatkin’s Effective Python on using coroutines to run thousands of functions, and all credit is due to him.

An explanation of the Game of Life as given by Wikipedia follows. The Game of Life is a simulation that takes place on a two-dimensional orthogonal grid of cells each of which is either in an alive or dead state. Each cell transitions to a new state in a step of time by its interaction with its neighbours, which are the cells that are horizontally, vertically, or diagonally adjacent. At each step of time, the following transitions occur:

1. Any live cell with fewer than two live neighbours dies.
2. Any live cell with two or three live neighbours lives on to the next generation.
3. Any live cell with more than three live neighbours dies.
4. Any dead cell with exactly three live neighbours becomes a live cell.

The initial pattern of cells on the grid constitutes the seed of the system. The first generation is created by applying the above rules simultaneously to every cell and the discrete moment at which this happens is sometimes called a tick. The rules continue to be applied repeatedly to create further generations.

In the following implementation, each cell’s simulation is carried out using a coroutine with the state of the cells stored in a Grid object from generation to generation.

    # Copyright 2014 Brett Slatkin, Pearson Education Inc.
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.

    from collections import namedtuple

    ALIVE = '*'
    EMPTY = '-'
    TICK = object()

    Position = namedtuple('Position', 'y x')

    Transition = namedtuple('Transition', 'y x state')

    def count_neighbors(y, x):
        n_ = yield Position(y + 1, x + 0)  # North
        ne = yield Position(y + 1, x + 1)  # Northeast
        e_ = yield Position(y + 0, x + 1)  # East
        se = yield Position(y - 1, x + 1)  # Southeast
        s_ = yield Position(y - 1, x + 0)  # South
        sw = yield Position(y - 1, x - 1)  # Southwest
        w_ = yield Position(y + 0, x - 1)  # West
        nw = yield Position(y + 1, x - 1)  # Northwest
        neighbor_states = [n_, ne, e_, se, s_, sw, w_, nw]
        return sum([1 for state in neighbor_states if state == ALIVE])

    def game_logic(state, neighbors):
        if state == ALIVE:
            if neighbors < 2 or neighbors > 3:
                return EMPTY     # Die: Too few
        else:
            if neighbors == 3:
                return ALIVE     # Regenerate
        return state

    def transition_cell(y, x):
        # request info on the state of the cell at y, x
        state = yield Position(y, x)
        neighbors = yield from count_neighbors(y, x)
        next_state = game_logic(state, neighbors)
        yield Transition(y, x, next_state)

    def simulate(height, width):
        while True:
            for y in range(height):
                for x in range(width):
                    yield from transition_cell(y, x)
            yield TICK

    def live_a_generation(grid, sim):
        progeny = Grid(grid.height, grid.width)
        item = next(sim)
        while item is not TICK:
            if isinstance(item, Position):
                state = grid[item.y, item.x]
                item = sim.send(state)
            else:  # Must be a Transition
                progeny[item.y, item.x] = item.state
                item = next(sim)
        return progeny

    class Grid(object):
        def __init__(self, height, width):
            self.height = height
            self.width = width
            self.rows = []
            for _ in range(self.height):
                self.rows.append([EMPTY] * self.width)

        def __str__(self):
            output = ''
            for row in self.rows:
                for cell in row:
                    output += cell
                output += '\n'
            return output

        def __getitem__(self, position):
            y, x = position
            return self.rows[y % self.height][x % self.width]

        def __setitem__(self, position, state):
            y, x = position
            self.rows[y % self.height][x % self.width] = state

    class ColumnPrinter(object):
        def __init__(self):
            self.columns = []

        def append(self, data):
            self.columns.append(data)

        def __str__(self):
            row_count = 1
            for data in self.columns:
                row_count = max(row_count, len(data.splitlines()) + 1)
            rows = [''] * row_count
            for j in range(row_count):
                for i, data in enumerate(self.columns):
                    line = data.splitlines()[max(0, j - 1)]
                    if j == 0:
                        rows[j] += str(i).center(len(line))
                    else:
                        rows[j] += line
                    if (i + 1) < len(self.columns):
                        rows[j] += ' | '
            return '\n'.join(rows)

    grid = Grid(5, 5)
    grid[1, 1] = ALIVE
    grid[2, 2] = ALIVE
    grid[2, 3] = ALIVE
    grid[3, 3] = ALIVE
    columns = ColumnPrinter()
    sim = simulate(grid.height, grid.width)
    for i in range(6):
        columns.append(str(grid))
        grid = live_a_generation(grid, sim)
    print(columns)


      0   |   1   |   2   |   3   |   4   |   5  
    ----- | ----- | ----- | ----- | ----- | -----
    -*--- | --*-- | --**- | --*-- | ----- | -----
    --**- | --**- | -*--- | -*--- | -**-- | -----
    ---*- | --**- | --**- | --*-- | ----- | -----
    ----- | ----- | ----- | ----- | ----- | -----

Generators are a fascinating topic and this chapter has barely scratched the surface of what is possible. David Beazley has given a series of excellent talks that go into great detail about very advanced usage of generators.