Python yield

The `yield` keyword in Python is used to create generator functions, which produce their values one at a time through an iterator. Unlike `return`, which ends the function’s execution and sends back a single value, `yield` hands a value to the caller and pauses the function, preserving its local state. When the next value is requested from the generator, the function resumes from where it left off, which makes generators well suited to long sequences or large data sets.

How yield Works

When a function contains `yield`, calling it does not run the body and return a single value; instead, the call returns a generator object that can be iterated over. Each time the generator’s `__next__()` method is called (directly, through `next()`, or implicitly by a `for` loop), the function runs until it reaches the next `yield`, which provides a value and pauses the function. When the function body finishes, the generator raises `StopIteration` to signal that it is exhausted.
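The pause-and-resume behaviour described above can be observed by driving a generator by hand with `next()`. The following is a minimal sketch; `two_steps` and the `events` list are illustrative names chosen for this example, not part of any library.

```python
events = []  # records when the body of the function actually runs


def two_steps():
    events.append("started")
    yield "first"
    events.append("resumed")
    yield "second"


gen = two_steps()      # the body has not started running yet
body_ran_on_call = bool(events)

first = next(gen)      # runs up to the first yield
second = next(gen)     # resumes after the first yield, runs to the second

try:
    next(gen)          # the body finishes, so the generator is exhausted
    finished = False
except StopIteration:
    finished = True

print(body_ran_on_call, first, second, events, finished)
```

A `for` loop performs the same `next()` calls and handles `StopIteration` automatically, which is why generators are usually consumed that way.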

Benefits of Using yield

Memory Efficiency: Only one value is generated at a time, reducing memory consumption compared to creating and storing a large list.

Lazy Evaluation: Values are computed only when needed, which can improve performance, especially with large datasets.

Simplified Code: Generators can simplify the implementation of iterators, making the code cleaner and more readable.

Use Cases

Processing Streams of Data: Useful for reading large files or streaming data where you don't want to load everything into memory at once.

Infinite Sequences: You can model sequences with no end, because values are produced one at a time and are never all held in memory together.
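As a rough sketch of the stream-processing use case, the generator below walks a file-like object line by line and yields only the non-empty lines. The name `non_empty_lines` is hypothetical, and `io.StringIO` stands in for a large file that would normally be opened with `open()`.

```python
import io


def non_empty_lines(stream):
    """Yield stripped, non-empty lines from a file-like object, one at a time."""
    for line in stream:
        line = line.strip()
        if line:
            yield line


# io.StringIO is used here only as a stand-in for a real (possibly huge) file.
fake_file = io.StringIO("alpha\n\nbeta\n   \ngamma\n")

lines = list(non_empty_lines(fake_file))
print(lines)  # ['alpha', 'beta', 'gamma']
```

Only one line is held by the generator at any moment; the call to `list()` here exists purely to display the result and would be avoided for genuinely large inputs.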

1. Basics of `yield`

To use `yield`, define a function as you normally would but use `yield` where you would otherwise use `return`. Calling the function does not run its body immediately; it returns a generator object instead.
# A simple generator function
def generate_numbers():
    yield 1
    yield 2
    yield 3

# Creating a generator
gen = generate_numbers()
print("Generator:", gen)

# Accessing values in the generator
for value in gen:
    print("Generated:", value)

Output:
Generator: <generator object generate_numbers at 0x7fdce8a50ac0>
Generated: 1
Generated: 2
Generated: 3

Explanation: Here, `generate_numbers()` is a generator function. Each `yield` provides one value, and the function stays paused until the next value is requested.

2. Difference Between `yield` and `return`

The main difference between `yield` and `return` is that `return` exits the function entirely and sends back a single value. `yield`, however, pauses the function and lets it produce multiple values over time.
# Using return
def return_example():
    return 1
    return 2  # This line is never reached

print("Return Example:", return_example())

# Using yield
def yield_example():
    yield 1
    yield 2

# Output from yield example
for value in yield_example():
    print("Yield Example:", value)

Output:
Return Example: 1
Yield Example: 1
Yield Example: 2

Explanation: `return_example()` only returns 1 and stops. However, `yield_example()` provides each value sequentially using `yield`, allowing access to both values.
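The two keywords can also appear in the same function. In a generator, `return` ends the iteration rather than yielding a value, and any returned value is attached to the resulting `StopIteration` exception. The sketch below illustrates this; `yield_then_return` is a made-up name for the example.

```python
def yield_then_return():
    yield 1
    return "done"  # ends the generator; this value is not yielded


# Iteration only sees the yielded value.
values = list(yield_then_return())

# The return value travels on the StopIteration exception instead.
gen = yield_then_return()
next(gen)
try:
    next(gen)
    return_value = None
except StopIteration as stop:
    return_value = stop.value

print(values, return_value)  # [1] done
```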

3. Using `yield` in a Loop

A common use case for `yield` is within loops, where each iteration yields the next value. This is especially useful for processing or generating large sequences on-the-fly without consuming excessive memory.
def countdown(n):
    while n > 0:
        yield n
        n -= 1

# Output countdown from 5
for value in countdown(5):
    print("Counting down:", value)

Output:
Counting down: 5
Counting down: 4
Counting down: 3
Counting down: 2
Counting down: 1

Explanation: The `countdown()` function yields the value of `n` on each loop iteration, reducing memory usage by providing values on-demand without storing the whole sequence.

4. `yield` in Recursive Generators

You can use `yield` in recursive functions for generating items from a nested structure:
# Recursive generator function
def flatten(nested_list):
    for item in nested_list:
        if isinstance(item, list):
            yield from flatten(item)  # Recursively yield items from nested lists
        else:
            yield item

# Flattening a nested list
nested = [1, [2, [3, 4]], 5]
for value in flatten(nested):
    print("Flattened value:", value)

Output:
Flattened value: 1
Flattened value: 2
Flattened value: 3
Flattened value: 4
Flattened value: 5

Explanation: The `flatten()` function uses recursion to unpack nested lists and `yield from` to yield items from sub-lists, simplifying nested data traversal.
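For plain iteration like this, `yield from` behaves the same as looping over the sub-generator and yielding each item. The variant below, with the hypothetical name `flatten_explicit`, spells that loop out for comparison. Note that `yield from` additionally forwards `send()` values and the sub-generator's return value, which this explicit loop does not.

```python
def flatten_explicit(nested_list):
    """Same traversal as flatten(), written without `yield from`."""
    for item in nested_list:
        if isinstance(item, list):
            # Equivalent, for simple iteration, to: yield from flatten_explicit(item)
            for sub_item in flatten_explicit(item):
                yield sub_item
        else:
            yield item


result = list(flatten_explicit([1, [2, [3, 4]], 5]))
print(result)  # [1, 2, 3, 4, 5]
```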

5. Stateful Generators with `yield`

Because `yield` pauses execution rather than ending it, a generator’s local variables survive between successive `next()` calls. This lets the generator keep track of previous values or conditions without any external storage.
def fibonacci_sequence():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Generating first 5 Fibonacci numbers
fib = fibonacci_sequence()
for _ in range(5):
    print("Fibonacci number:", next(fib))

Output:
Fibonacci number: 0
Fibonacci number: 1
Fibonacci number: 1
Fibonacci number: 2
Fibonacci number: 3

Explanation: The `fibonacci_sequence()` generator keeps its own state (the values of `a` and `b`) between `next()` calls, so it can keep producing new Fibonacci numbers on demand.
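Since this generator never finishes, consuming it with `list()` or an unbounded `for` loop would never terminate. One common way to take a bounded slice is `itertools.islice` from the standard library, sketched here with the same generator repeated so the example is self-contained.

```python
from itertools import islice


def fibonacci_sequence():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# islice stops after 10 items, so the infinite generator is never exhausted.
first_ten = list(islice(fibonacci_sequence(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```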

6. Passing Values to a Generator with `send()`

Generators can receive external values via `send()`, which passes a value back to the `yield` expression, allowing communication with the generator from outside.
def running_total():
    total = 0
    while True:
        num = yield total
        total += num

# Using the generator
gen = running_total()
print("Initial Total:", next(gen))  # Initialize the generator

print("Running Total:", gen.send(10))  # Add 10
print("Running Total:", gen.send(20))  # Add 20
print("Running Total:", gen.send(5))   # Add 5

Output:
Initial Total: 0
Running Total: 10
Running Total: 30
Running Total: 35

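Explanation: The `running_total()` generator keeps a cumulative total. Each value passed to `send()` becomes the result of the paused `yield` expression and is assigned to `num`; the generator adds it to `total`, and the next `yield` hands the updated total back as the return value of `send()`.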
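The initial `next(gen)` call matters: a generator must be advanced to its first `yield` before it can receive a value. The sketch below shows what happens when that step is skipped, and that `send(None)` is an equivalent way to prime the generator. The variable names `unprimed` and `primed` are illustrative.

```python
def running_total():
    total = 0
    while True:
        num = yield total
        total += num


# Sending a real value before the first yield is reached raises TypeError.
unprimed = running_total()
try:
    unprimed.send(10)
    send_failed = False
except TypeError:
    send_failed = True

# send(None) advances to the first yield, just like next() does.
primed = running_total()
start = primed.send(None)
after_ten = primed.send(10)

print(send_failed, start, after_ten)  # True 0 10
```

Once this particular generator is primed, calling plain `next()` on it would send `None` into `total += num` and raise a `TypeError`, so callers are expected to use `send()` with a number from then on.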

7. `yield` and Generator Expressions

Generator functions written with `yield` and generator expressions both offer memory-efficient ways to create iterators on the fly. A generator expression does not use the `yield` keyword itself; it is written like a list comprehension but with parentheses, and it produces items lazily:
# List comprehension (eager evaluation)
squares_list = [x ** 2 for x in range(5)]
print("List comprehension squares:", squares_list)

# Generator expression (lazy evaluation)
squares_gen = (x ** 2 for x in range(5))
print("Generator expression squares:", list(squares_gen))

Output:
List comprehension squares: [0, 1, 4, 9, 16]
Generator expression squares: [0, 1, 4, 9, 16]

Explanation: The generator expression `(x ** 2 for x in range(5))` is similar to `[x ** 2 for x in range(5)]`, but it does not store all values in memory at once, making it more memory-efficient for large datasets.
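One practical consequence of lazy evaluation is that a generator expression can be consumed only once. A small sketch of that behaviour, reusing the same expression:

```python
squares_gen = (x ** 2 for x in range(5))

# Aggregating functions such as sum() can consume the generator directly,
# without building an intermediate list.
total = sum(squares_gen)

# The generator is now exhausted, so a second pass yields nothing.
leftover = list(squares_gen)

print(total, leftover)  # 30 []
```

If the values are needed more than once, either recreate the generator expression or store the results in a list.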

8. Memory Efficiency of `yield`

The memory efficiency of `yield` is especially noticeable when working with large datasets. Instead of storing all values in memory, `yield` generates items only when needed.
def large_range():
    for i in range(1_000_000):
        yield i

# Generator for large range
large_gen = large_range()
print("First value:", next(large_gen))
print("Second value:", next(large_gen))

Output:
First value: 0
Second value: 1

Explanation: Instead of creating a list with 1 million items, `large_range()` provides items on-demand using `yield`, significantly reducing memory usage.
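A rough way to see the difference is to compare container sizes with `sys.getsizeof`. This is only an approximation: `getsizeof` reports the size of the list object itself, not the integers it references, and the exact numbers vary by Python version and platform, so no specific figures are shown here.

```python
import sys


def large_range():
    for i in range(1_000_000):
        yield i


as_list = list(large_range())  # materialises all one million items
as_gen = large_range()         # holds only the paused function state

list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)

print("List object size (bytes):", list_size)
print("Generator object size (bytes):", gen_size)
```

On typical CPython builds the list is several megabytes while the generator is a few hundred bytes at most, and the generator's size does not grow with the length of the range.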

Summary

Python’s `yield` provides an efficient and powerful way to manage sequences, allowing for memory-friendly, lazy evaluation and supporting complex use cases like recursive generators, stateful processing, and on-the-fly computations. The `yield` statement is key to generator functions, making Python well-suited for handling large data streams and building concise, flexible, and maintainable code.
