Implementing Advanced Iteration Patterns in Python

Implementing Custom Iterators and Iterables

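When processing large datasets or fetching data from remote APIs, loading everything into memory up front wastes memory and delays the first result. A lazy approach that fetches one item at a time avoids both costs. In Python this is done by implementing the iterator protocol: an iterable provides __iter__, which returns an iterator, and the iterator provides __next__, which returns the next item or raises StopIteration when nothing is left. The following example defines a custom iterator that fetches each data object from a remote source only when it is requested.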

import requests
from collections.abc import Iterable, Iterator

class DataStreamIterator(Iterator):
    '''Iterator that fetches data items one by one.'''

    def __init__(self, identifiers):
        self.identifiers = identifiers
        self.cursor = 0

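    def fetch_item(self, item_id):
        '''Simulates fetching a single item from an API.'''
        response = requests.get(f'https://api.service.com/data/{item_id}', timeout=10)
        response.raise_for_status()  # fail early on HTTP errors
        data = response.json()       # decode the JSON body into a dict
        return f"{item_id}: {data['status']}"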

    def __next__(self):
        if self.cursor >= len(self.identifiers):
            raise StopIteration
        
        current_id = self.identifiers[self.cursor]
        self.cursor += 1
        return self.fetch_item(current_id)

class DataStreamIterable(Iterable):
    '''Iterable container that returns an iterator instance.'''

    def __init__(self, identifiers):
        self.identifiers = identifiers

    def __iter__(self):
        return DataStreamIterator(self.identifiers)

if __name__ == '__main__':
    ids = ['user-01', 'user-02', 'user-03']
    for item in DataStreamIterable(ids):
        print(item)
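
To make the protocol visible, the following sketch drives it by hand with iter() and next(), which is roughly what the for loop above does internally. It replaces the network call with a hypothetical in-memory pair, SquareIterator and Squares (illustrative names, not part of the original example), so it can run offline.

```python
from collections.abc import Iterable, Iterator


class SquareIterator(Iterator):
    """Hypothetical stand-in for DataStreamIterator: no network access."""

    def __init__(self, values):
        self.values = values
        self.cursor = 0

    def __next__(self):
        if self.cursor >= len(self.values):
            raise StopIteration
        value = self.values[self.cursor]
        self.cursor += 1
        return value * value


class Squares(Iterable):
    """Hypothetical stand-in for DataStreamIterable."""

    def __init__(self, values):
        self.values = values

    def __iter__(self):
        # A fresh iterator per call, so the container can be looped over repeatedly
        return SquareIterator(self.values)


squares = Squares([1, 2, 3])

# Roughly what `for item in squares:` does internally
iterator = iter(squares)          # calls squares.__iter__()
collected = []
while True:
    try:
        item = next(iterator)     # calls iterator.__next__()
    except StopIteration:
        break                     # the for loop ends silently at this point
    collected.append(item)

print(collected)                  # [1, 4, 9]

# The exhausted iterator stays exhausted; the iterable hands out a new one
print(list(iterator), list(squares))
```

Keeping the iterable and the iterator as separate classes is what allows the same container to be traversed more than once.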

Creating Iterable Classes with Generator Functions

Python allows the __iter__ method to be implemented as a generator function: each call returns a new generator object, which serves as the iterator. This removes the need for a separate iterator class and for manual tracking of the iteration state. The following class uses the technique to produce the even numbers in an inclusive range.

class EvenNumbers:
    def __init__(self, min_val, max_val):
        self.min_val = min_val
        self.max_val = max_val

    def is_even(self, val):
        return val % 2 == 0

    def __iter__(self):
        for num in range(self.min_val, self.max_val + 1):
            if self.is_even(num):
                yield num

if __name__ == '__main__':
    for n in EvenNumbers(1, 10):
        print(n)
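
A short check of how such a class behaves in practice, as a sketch: because every call to __iter__ builds a new generator, the object can be traversed repeatedly and passed to anything that accepts an iterable. The class is repeated here in condensed form (the is_even helper is inlined) so the snippet runs on its own.

```python
class EvenNumbers:
    def __init__(self, min_val, max_val):
        self.min_val = min_val
        self.max_val = max_val

    def __iter__(self):
        for num in range(self.min_val, self.max_val + 1):
            if num % 2 == 0:
                yield num


evens = EvenNumbers(1, 10)

first_pass = list(evens)
second_pass = list(evens)      # works again: __iter__ builds a new generator
total = sum(evens)             # any consumer of iterables can use the object

gen = iter(evens)
print(type(gen).__name__)      # generator
print(first_pass, second_pass, total)
```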

Implementing Bidirectional Iteration

Custom sequence types often need to be traversed in both forward and reverse order. The __reversed__ special method defines what the reversed() built-in returns for an object; without it, reversed() falls back to the sequence protocol and requires both __len__ and __getitem__. The following class generates a range of floating-point numbers and supports both traversal directions.

class FloatInterval:
    def __init__(self, start, end, step=0.1):
        self.start = start
        self.end = end
        self.step = step

    def __iter__(self):
        current = self.start
        while current <= self.end:
            yield current
            current += self.step

    def __reversed__(self):
        current = self.end
        while current >= self.start:
            yield current
            current -= self.step

if __name__ == '__main__':
    print("Forward:")
    for x in FloatInterval(1.0, 3.0, 0.5):
        print(x)

    print("Reverse:")
    for x in reversed(FloatInterval(1.0, 3.0, 0.5)):
        print(x)
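
One caveat with the class above: adding a floating-point step repeatedly accumulates rounding error, so with a step such as 0.1 the two directions can end up visiting slightly different values. A possible alternative, sketched below under the assumption that the step is positive, computes each value from an integer step index instead. SteppedInterval and its 1e-9 tolerance are illustrative choices, not part of the original example.

```python
import math


class SteppedInterval:
    """Hypothetical variant of FloatInterval that derives values from a step index."""

    def __init__(self, start, end, step=0.1):
        self.start = start
        self.end = end
        self.step = step
        # Number of values in the interval; the small tolerance is an assumption
        # meant to absorb rounding error in the division
        self.count = math.floor((end - start) / step + 1e-9) + 1

    def __iter__(self):
        for i in range(self.count):
            yield self.start + i * self.step

    def __reversed__(self):
        # Walk the same indices backwards, so both directions visit the same values
        for i in reversed(range(self.count)):
            yield self.start + i * self.step


interval = SteppedInterval(1.0, 3.0, 0.5)
forward = list(interval)
backward = list(reversed(interval))
print(forward)     # [1.0, 1.5, 2.0, 2.5, 3.0]
print(backward)    # [3.0, 2.5, 2.0, 1.5, 1.0]
```

Because both methods share one index range, the reverse traversal is always the exact mirror of the forward one.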

Slicing Iterators

Iterators do not support indexing or the standard slicing syntax, so itertools.islice is used to select a range of elements. It returns a new iterator that reads from the original lazily, which keeps memory use low for large streams. Items before the start position are read and discarded, and a one-shot iterator cannot be rewound afterwards.

from itertools import islice

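# Simulate a file stream or large list. A range can be iterated repeatedly,
# so each islice call below starts again from the first element.
log_stream = range(1000)

# Get lines 50 through 59 (the stop index 60 is exclusive)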
lines = islice(log_stream, 50, 60)
for line in lines:
    print(line)

# Get the first 30 lines
head = islice(log_stream, 30)

# Skip 100 and get the rest
tail = islice(log_stream, 100, None)
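
The behavior differs once the source is a true one-shot iterator, such as an open file or a generator. The sketch below wraps a small range in iter() to stand in for such a stream (the variable names are illustrative) and shows that each slice continues from wherever the previous one stopped.

```python
from itertools import islice

stream = iter(range(20))           # a true one-shot iterator

window = list(islice(stream, 5, 8))
print(window)                      # [5, 6, 7]

# Items 0 through 7 are gone: the next slice counts from the current position
following = list(islice(stream, 3))
print(following)                   # [8, 9, 10]

rest = list(stream)
print(len(rest))                   # 9 items remain (11 through 19)
```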

Iterating Over Multiple Iterables

When working with related datasets stored in separate containers, you may need to iterate over them simultaneously (parallel) or sequentially (serial). Python provides zip for parallel iteration and itertools.chain for serial iteration.

Parallel Iteration with zip

from random import randint

# Student grades for different subjects
math_scores = [randint(60, 100) for _ in range(5)]
physics_scores = [randint(60, 100) for _ in range(5)]
chem_scores = [randint(60, 100) for _ in range(5)]

# Calculate the average score for each student
averages = []
for m, p, c in zip(math_scores, physics_scores, chem_scores):
    averages.append((m + p + c) / 3)

print(averages)
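
The lists above all have the same length. When they do not, zip stops at the shortest input without raising an error. The following sketch, using made-up scores, contrasts that with itertools.zip_longest, which pads the missing positions with a fill value.

```python
from itertools import zip_longest

math_scores = [90, 75, 82]
physics_scores = [88, 64]          # one score missing

# zip silently drops the third student
pairs = list(zip(math_scores, physics_scores))

# zip_longest keeps every position and fills gaps with the chosen value
padded = list(zip_longest(math_scores, physics_scores, fillvalue=None))

print(pairs)     # [(90, 88), (75, 64)]
print(padded)    # [(90, 88), (75, 64), (82, None)]
```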

Serial Iteration with itertools.chain

from random import randint
from itertools import chain

# Scores from different class groups
group_a = [randint(50, 100) for _ in range(30)]
group_b = [randint(50, 100) for _ in range(32)]
group_c = [randint(50, 100) for _ in range(28)]

# Count high scores across all groups
high_score_count = 0

for score in chain(group_a, group_b, group_c):
    if score >= 90:
        high_score_count += 1

print(high_score_count)
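
When the groups are already collected in a single list, chain.from_iterable can be used in place of passing each group as a separate argument. The sketch below uses made-up scores and a generator expression for the count, so no combined list is ever built.

```python
from itertools import chain

groups = [
    [95, 72, 88],
    [91, 60],
    [90, 49, 100],
]

# chain.from_iterable takes one iterable of iterables instead of separate arguments
all_scores = chain.from_iterable(groups)

# Counting with a generator expression keeps the whole pipeline lazy
high_score_count = sum(1 for score in all_scores if score >= 90)
print(high_score_count)    # 4
```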

Tags: python iterators generators Data Structures programming

Posted on Wed, 20 May 2026 17:12:30 +0000 by reloj_alfred