Returning generators from with statements

Recently, an interesting issue came up at work that involved a subtle interaction between context managers and generator functions. Here is some example code demonstrating the problem:

@contextlib.contextmanager
def resource():
    """Context manager for some resource"""

    print("Resource setup")
    yield
    print("Resource teardown")


def _load_values():
    """Load a list of values (requires resource to be held)"""

    for i in range(3):
        print("Generating value %d" % i)
        yield i


def load_values():
    """Load values while holding the required resource"""

    with resource():
        return _load_values()

This is the output when run:

>>> for val in load_values(): pass
Resource setup
Resource teardown
Generating value 0
Generating value 1
Generating value 2

Whoops. The resource is destroyed before the values are actually generated. This is obviously a problem if the generator depends on the existence of the resource.

When you think about it, it's pretty clear what's going on. Calling _load_values() produces a generator object, whose code is only executed when values are requested. load_values() returns that generator, exiting the with statement and leading to the destruction of the resource. When the outer for loop (for val) comes around to iterating over the generator, the resource is long gone.

How do you solve this problem? In Python 3.3 and newer, you can use the yield from syntax to turn load_values() into a generator as well. The execution of load_values() is halted at the yield from point until the child generator is exhausted, at which point it is safe to dispose of the resource:

def load_values():
    """Load values while holding the required resource"""

    with resource():
        yield from _load_values()

In older Python versions, an explicit for loop over the child generator is required:

def load_values():
    """Load values while holding the required resource"""

    with resource():
        for val in _load_values():
            yield val

Still another method would be to turn the result of _load_values() into a list and returning that instead. This incurs higher memory overhead since all values have to be held in memory at the same time, so it's only appropriate for relatively short lists.

To sum up, it's a bad idea to return generators from under with statements. While it's not terribly confusing what's going on, it's a whee bit subtle and not many people think about this until they ran into the issue. Hope this heads-up helps.

published November 12, 2015
tags python

Comments