Lecture 44

Generators

MCS 275 Spring 2021
David Dumas

Course bulletins:

  • Please complete your course evaluations. The deadline is 11:55pm Sunday.
  • Project 4 due today at 6pm CDT.
  • I'll announce the course material archive site by email when ready
  • Today's generators notebook

Sequences

In Python, a sequence is an object containing elements that can be accessed by a nonnegative integer index.

e.g. list, tuple, str

Iterables

An iterable is a more general concept for an object that can provide items one by one when used in a for loop.

Sequences can do this, but there are other examples:

iterable             value
file                 line of text
sqlite3.Cursor       row
dict, dict.keys()    key
range                integer

Unlike a sequence, an iterable may not store (or know) the next item until it is requested.

This is called laziness and can provide significant advantages.
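
As a rough illustration of laziness, the sketch below compares a range over a trillion integers with a list of a million integers. The variable names are made up for this example, and the exact sizes reported by sys.getsizeof depend on the Python implementation.

```python
import sys

# A range covering a trillion integers is created instantly and stays small,
# because it computes each integer on request instead of storing them all.
big = range(10**12)
range_size = sys.getsizeof(big)

# A list must store a reference to every item, so even a million entries
# take far more memory than the range above.
small_list = list(range(10**6))
list_size = sys.getsizeof(small_list)

# Iteration can stop early; the remaining items are never produced.
first_three = []
for n in big:
    if n >= 3:
        break
    first_three.append(n)
```

Only the items actually requested by the loop are ever computed.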

Pitch

Generators are do-it-yourself lazy iterables.

The return statement

In a function, return x will:

  • Discard the function's local variables (the objects they refer to survive only if something else still references them)
  • Return execution to wherever it was when the function was called
  • Replace function call with x for the purposes of evaluation

The yield statement

When a function call is used as an iterable, the statement yield x will:

  • Pause the function
  • Make x the next value given by the iterable

The next time a value is needed, execution of the function will continue from where it left off.
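
A minimal sketch of this pause-and-resume behavior, using a hypothetical function named countdown that is not part of the lecture materials:

```python
def countdown(n):
    """Yield n, n-1, ..., 1, one value per request."""
    while n > 0:
        yield n   # pause here and hand n to whoever is iterating
        n -= 1    # execution resumes here when the next value is requested

values = []
for k in countdown(3):
    values.append(k)
```

The local variable n keeps its value between requests, which is what lets the loop pick up where it left off.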

Comparison with print

Suppose you have written a function that prints a series of values (perhaps doing calculations along the way).

If you change print(x) to yield x, then you get a function that can be used as an iterable, lazily producing the same values.
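
The pair of functions below sketches this conversion. Both names (print_squares and squares) are invented for illustration.

```python
def print_squares(n):
    """Print the squares 1, 4, 9, ..., n**2."""
    for i in range(1, n + 1):
        print(i * i)

def squares(n):
    """Same loop, but each square is yielded instead of printed."""
    for i in range(1, n + 1):
        yield i * i

# The caller now decides what to do with each value, e.g. add them up.
total = 0
for s in squares(4):
    total += s
```

The yielding version is more flexible: the caller can print the values, sum them, or stop early.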

Generator objects

Behind the scenes, calling a function that contains yield does not run its body right away. The call returns immediately (just once) with a generator object, which is an iterable.

It contains the local state of the function, and to provide a value it runs the function until the next yield.
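
One way to watch this happen is to request values by hand with the built-in next() function, which this lecture does not otherwise cover. In the sketch below, two_values and log are hypothetical names; the log list records when each part of the function body actually runs.

```python
log = []

def two_values():
    log.append("started")   # runs only when the first value is requested
    yield "a"
    log.append("resumed")   # runs only when the second value is requested
    yield "b"

g = two_values()         # returns a generator object; the body has not run yet
log_before = list(log)   # snapshot of the log before any value is requested

first = next(g)          # runs the body up to the first yield
second = next(g)         # resumes and runs up to the second yield
```

A for loop does the same thing automatically, calling next until the function finishes.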

Applications

  • Efficient iterables when items are expensive
  • Representing infinite sequences
  • Retain laziness despite complex logic to determine next element (e.g. nested loops)
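
Two of these applications can be sketched briefly. The function names fibonacci and pairs are made up for this example, and itertools.islice (from the standard library) is used here to take a finite number of items from an infinite generator.

```python
from itertools import islice

def fibonacci():
    """Yield the Fibonacci numbers forever (an infinite sequence)."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

def pairs(n):
    """Lazily yield all (i, j) with 0 <= i < j < n using nested loops."""
    for i in range(n):
        for j in range(i + 1, n):
            yield (i, j)

# Take only the first ten values; the generator never runs further than that.
first_ten = list(islice(fibonacci(), 10))

small_pairs = list(pairs(3))
```

Writing pairs as a class with explicit state would require tracking both loop indices by hand; the generator keeps them as ordinary local variables.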

Conversion to a sequence

The list and tuple constructors accept an iterable.

So if g is a generator object, list(g) will pull all of its items and put them in a list.
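
For instance, with a hypothetical generator function evens_up_to:

```python
def evens_up_to(limit):
    """Yield the even integers 0, 2, 4, ... up to and including limit."""
    for n in range(0, limit + 1, 2):
        yield n

as_list = list(evens_up_to(8))    # pulls every item into a list
as_tuple = tuple(evens_up_to(4))  # same idea, but builds a tuple

# The results are sequences, so indexing and len() now work.
last_item = as_list[-1]
count = len(as_list)
```

This gives up laziness, so it should be avoided for generators that yield very many items, and it never finishes for an infinite one.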

One-shot

Generator objects are "one-shot" iterables, i.e. you can only iterate over them once.

Since generator objects are usually return values of functions, it is typical to have the function call in the loop that performs iteration.
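
The sketch below shows both points, using a hypothetical generator function letters: a second pass over the same generator object yields nothing, while calling the function again in each loop gives a fresh generator every time.

```python
def letters():
    yield "x"
    yield "y"

g = letters()
first_pass = list(g)    # consumes every item
second_pass = list(g)   # g is already exhausted, so nothing is left

# Typical pattern: put the function call in the loop header, so each
# loop gets its own new generator object.
collected = []
for _ in range(2):
    for ch in letters():
        collected.append(ch)
```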

yield from

A generator can delegate to another generator, i.e. say "take values from this other generator until it is exhausted".

The syntax is

yield from GENERATOR

which is approximately equivalent to:


        for x in GENERATOR:
            yield x
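
Delegation is especially handy in recursive generators. As one possible illustration, the hypothetical function flatten below yields the items of an arbitrarily nested list by delegating to itself for each sublist.

```python
def flatten(nested):
    """Yield the non-list items of a nested list, in order."""
    for item in nested:
        if isinstance(item, list):
            # Delegate to a new generator for the sublist until it is exhausted
            yield from flatten(item)
        else:
            yield item

flat = list(flatten([1, [2, [3, 4]], 5]))
```

Without yield from, the recursive call would need its own inner for loop that re-yields every value.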
    

Generator expressions

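You can often replace the square brackets of a list comprehension with parentheses to get a generator expression; it behaves similarly but evaluates lazily. When the generator expression is the only argument of a function call, the extra parentheses can be omitted.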


        # Create a list, then sum it
        # Uses memory proportional to N
        sum([ x**2 for x in range(1,N+1) ])
        
        # Create a generator, then sum values
        # it yields.  Memory usage independent
        # of N.
        sum( x**2 for x in range(1,N+1) )
    

A generator expression won't work in a context that needs a sequence (e.g. in len(), random.choice(), ...).
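
A small sketch of that limitation, with N chosen arbitrarily for the example: calling len() on a generator expression raises TypeError, while sum() is happy to consume it.

```python
N = 1000

# Parentheses instead of square brackets: nothing is computed yet.
squares = (x**2 for x in range(1, N + 1))

# A generator object has no length, so len() raises TypeError.
try:
    len(squares)
    has_len = True
except TypeError:
    has_len = False

# sum() only needs an iterable, so it works and uses the values lazily.
total = sum(squares)
```

If a length or random access is really needed, converting with list() first is the usual workaround, at the cost of storing every item.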

Epilogue

Assorted things we didn't talk about

  • Python has type annotations! [1]
  • Python is getting a match statement (structural pattern matching, similar to switch in other languages) in 3.10! [2]
  • Important CS concepts with implementations in the standard library (e.g. heapq, hashlib) [3, 4]
  • Logging [5]
  • Coroutines with async/await [6]
  • Machine learning with scikit-learn [7]
  • Scientific Python (scipy) [8]

Advice

To finish off MCS 275, four pieces of advice:

  • If you like Python, keep writing programs with it. Developing depth in one language has value.
  • Learn a statically typed, compiled, imperative language like Java, C++, C, C#, or Go.
  • Study some interesting algorithms (e.g. Dijkstra's algorithm, Fortune's algorithm).
  • Learn about version control, e.g. git.

The end

References

  • Chapter 20 of Lutz
  • Chapter 4 of Beazley and Jones
  • MCS 260 Fall 2020 Lecture 7 discussed for loops and promised future discussion of generators.

Revision history

  • 2021-04-30 Initial publication