Python's Powerful Triad: Iterators, Generators, and Decorators
Containers
A container is a data structure that organizes multiple elements. Elements in a container can be retrieved one by one, and the 'in' and 'not in' keywords can be used to check if an element is contained within. Typically, these data structures store all elements in memory (though there are exceptions, like iterators and generator objects which don't store all elements in memory). Common container objects in Python include:
- list, deque...
- set, frozenset...
- dict, defaultdict, OrderedDict, Counter...
- tuple, namedtuple...
- str
The concept of a container is like a box where you can put things. If an object can be asked whether it contains a particular element, it can be considered a container. For example, lists, sets, and tuples are all container objects:
>>> assert 1 in [1, 2, 3] # lists
>>> assert 4 not in [1, 2, 3]
>>> assert 1 in {1, 2, 3} # sets
>>> assert 4 not in {1, 2, 3}
>>> assert 1 in (1, 2, 3) # tuples
>>> assert 4 not in (1, 2, 3)
To check if an element is in a dict, we use its keys:
>>> d = {1: 'foo', 2: 'bar', 3: 'qux'}
>>> assert 1 in d
>>> assert 'foo' not in d # 'foo' is a value, not a key, so the membership test fails
To check if a substring is in a string:
>>> s = 'foobar'
>>> assert 'b' in s
>>> assert 'x' not in s
>>> assert 'foo' in s
Although most containers provide some way to get each element, that capability does not come from being a container; it comes from the object also being iterable. Not all containers are iterable, however. For example, a Bloom filter can be used to check whether an element has been added to it, but you cannot retrieve each value from it, because a Bloom filter does not actually store the elements; it maps them to positions in an array through hash functions.
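As a rough illustration of a container that is not iterable, here is a minimal Bloom filter sketch. The class name TinyBloomFilter, its size and num_hashes parameters, and the use of salted SHA-256 digests are all assumptions made for this example, not a reference implementation.

```python
import hashlib


class TinyBloomFilter:
    """Hypothetical minimal Bloom filter: supports `in`, but not iteration."""

    def __init__(self, size=64, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size  # only bits are stored, never the items

    def _positions(self, item):
        # Derive several array positions from salted SHA-256 digests.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def __contains__(self, item):
        # May report false positives, but never false negatives.
        return all(self.bits[pos] for pos in self._positions(item))


bloom = TinyBloomFilter()
bloom.add("foo")
print("foo" in bloom)  # True: an added item is always reported as present

try:
    iter(bloom)
except TypeError:
    print("TinyBloomFilter is a container, but it is not iterable")
```

Because the class defines __contains__ but neither __iter__ nor __getitem__, the in operator works while iter() raises TypeError.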
Iterable Objects
An object is iterable if it implements the __iter__ method, which returns an iterator. For example:
>>> lst = [1, 2, 3]
>>> lst.__iter__()
<list_iterator object at 0x7f97c549aa50>
Python provides statements and keywords for accessing the elements of iterable objects, such as for loops, list comprehensions, and membership tests with in.
To determine if an object is iterable:
>>> from collections.abc import Iterable
>>> isinstance('abc', Iterable)
True
>>> isinstance(1, Iterable)
False
>>> isinstance([], Iterable)
True
The isinstance() function checks whether an object is an instance of a given type. Iterable objects are generally traversed with for loops, so any object that can be used in a for loop can be called an iterable object. For example, traversing a list:
>>> lst = [1, 2, 3]
>>> for item in lst:
...     print(item)
...
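To make the role of __iter__ concrete, the following sketch defines a user-written iterable. The class name Countdown is hypothetical and exists only for this example; it delegates to the iterator of a built-in range.

```python
from collections.abc import Iterable


class Countdown:
    """Hypothetical iterable: counts down from `start` to 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Hand back a fresh iterator on every call, so the object
        # can be traversed more than once.
        return iter(range(self.start, 0, -1))


countdown = Countdown(3)
print(isinstance(countdown, Iterable))  # True
print(list(countdown))                  # [3, 2, 1]
print([n * 2 for n in countdown])       # [6, 4, 2]
```

Defining __iter__ is enough for isinstance(..., Iterable) to return True and for the object to work in for loops and comprehensions.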
Iterators
The iterator protocol states: an object must provide a __next__ method, which either returns the next item in the iteration or raises a StopIteration exception to terminate the iteration (it can only move forward, not backward).
An iterator is an object that implements the iterator protocol: it defines a __next__() method, and an __iter__() method that returns the object itself.
Python's internal tools (such as for loops, sum, min, max functions, etc.) access objects based on the iterator protocol.
Benefits of using iterators:
- A list computes and stores all of its values up front, which consumes more memory. An iterator computes values one at a time, only when they are requested.
- Makes code more generic and simpler.
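The protocol and the lazy, forward-only behaviour described above can be sketched with a small hand-written iterator. The class name SquaresUpTo is hypothetical and chosen only for this illustration.

```python
class SquaresUpTo:
    """Hypothetical iterator yielding 0**2, 1**2, ... below `limit`."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration  # signals that the iteration is over
        value = self.current ** 2  # computed only when requested
        self.current += 1
        return value


squares = SquaresUpTo(4)
print(next(squares))  # 0
print(list(squares))  # [1, 4, 9]: continues from where next() stopped
print(list(squares))  # []: an exhausted iterator cannot be rewound
```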
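To check if something is an iterator:
>>> from collections.abc import Iterator
>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> isinstance(d, Iterator)
False
>>> isinstance(d.items(), Iterator) # a view object is iterable, but it is not an iterator
False
>>> isinstance(iter(d.items()), Iterator)
True
Using the next method:
>>> iter_items = iter(d.items())
>>> next(iter_items)
('a', 1)
>>> next(iter_items)
('b', 2)
>>> next(iter_items)
('c', 3)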
Iterator principles:
# Based on iterator protocol
my_list = [1,2,3]
my_iter = my_list.__iter__()
print(next(my_iter))
print(next(my_iter))
print(next(my_iter))
# Using indexes
print(my_list[0])
print(my_list[1])
print(my_list[2])
# Simulating for loop mechanism with while
my_iter = my_list.__iter__()
while True:
    try:
        print(next(my_iter))
    except StopIteration:
        print("Iteration complete, loop terminated")
        break
# For loop access method
# Every for loop works the same way, whatever the object: it follows the iterator protocol
# It first calls my_iter = my_list.__iter__()
# (or, equivalently, my_iter = iter(my_list)), then calls my_iter.__next__() repeatedly
# until StopIteration is raised, which ends the loop
Generators
Generators can be understood as a data type that automatically implements the iterator protocol.
When a generator is running, each time it encounters yield, the function pauses, saves all of its current running state, and returns the yielded value. The next time next() is called on it, it resumes from the point where it paused.
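The pause-and-resume behaviour can be seen in a short sketch. The function name count_up_to is hypothetical; the point is that the local variable n survives between calls to next(), and that the generator object satisfies the iterator protocol without any hand-written __iter__ or __next__.

```python
from collections.abc import Iterator


def count_up_to(limit):
    """Hypothetical generator function yielding 1, 2, ..., limit."""
    n = 1
    while n <= limit:
        yield n  # pause here; n is preserved until the next call
        n += 1


counter = count_up_to(3)
print(isinstance(counter, Iterator))  # True
print(iter(counter) is counter)       # True: it is its own iterator
print(next(counter))                  # 1
print(next(counter))                  # 2
print(list(counter))                  # [3]: resumes after the second yield
```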
Generator Functions
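Generator functions are functions that contain yield. Compared with an ordinary function, they (1) hand values back to the caller and (2) preserve the running state of the function between those values.
A generator is advanced with next(generator), generator.__next__(), or generator.send(value); send() also delivers a value to the yield expression at which the generator is paused.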
# Using generator functions
# yield is similar to return in that it hands a value back to the caller, but the function is paused rather than finished
# Another characteristic of x = yield is that it can receive a value sent by send() and assign it to x
def sample_generator():
    print("Starting")
    first = yield
    print("First", first)
    yield 2
    print("Second")

gen = sample_generator()
result = next(gen)  # runs up to the first yield, which yields None
print(result)
result = gen.send("Value assigned to first when paused at yield")
print(result)
Output:
Starting
None
First Value assigned to first when paused at yield
2
Generator Expressions
print(sum(i for i in range(10000))) # A generator expression such as (i for i in range(10000)) is typically consumed by a for loop or by a function like sum
# Purpose: saves memory; the generator object already implements the __iter__ and __next__ methods
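A rough way to see the memory saving is to compare a list comprehension with the equivalent generator expression. This is only a sketch: sys.getsizeof reports the shallow size of the object itself, and the exact byte counts vary between Python versions, so no specific numbers are shown.

```python
import sys

squares_list = [i * i for i in range(10000)]  # all values are built immediately
squares_gen = (i * i for i in range(10000))   # values are produced on demand

# The generator object stays small no matter how many values it will yield.
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True

# Both produce the same values...
print(sum(squares_gen) == sum(squares_list))  # True

# ...but the generator can be consumed only once.
print(sum(squares_gen))  # 0
```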
Decorators
Definition
Decorators add additional functionality to a function without modifying its source code or how it's called.
Decorator = Higher-order function + Function nesting + Closure
import time

def timer_decorator(func):
    def wrapper(*args, **kwargs):
        start_time = time.time()
        result = func(*args, **kwargs)
        end_time = time.time()
        print(f"Execution time: {end_time - start_time} seconds")
        return result
    return wrapper

def sample_function(name, age):
    time.sleep(1)
    print("Processing")
    return "Completed"

# Decorate the function
sample_function = timer_decorator(sample_function)
# Call the decorated function
result = sample_function("Alice", age=25)
print(result)
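The manual reassignment above is what the @ syntax does automatically. The variant below is a sketch that adds two things not present in the original example: functools.wraps, which copies the decorated function's name and docstring onto the wrapper, and time.perf_counter, a clock better suited to measuring short intervals. The function add is a hypothetical stand-in.

```python
import functools
import time


def timer_decorator(func):
    @functools.wraps(func)  # keep func's __name__ and __doc__ on the wrapper
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.6f} seconds")
        return result
    return wrapper


@timer_decorator  # equivalent to: add = timer_decorator(add)
def add(a, b):
    """Return the sum of a and b."""
    return a + b


print(add(2, 3))     # prints the timing line, then 5
print(add.__name__)  # add
```

Without functools.wraps, add.__name__ would report wrapper, which makes debugging and introspection harder.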
Closures
In Python, a closure is formed when three conditions are met, all of which are necessary:
- There must be an inner function (a function defined within another function) - corresponding to nesting between functions
- The inner function must reference a variable defined in the enclosing scope (within the outer function) - the inner function references an external variable
- The outer function must return the inner function
def outer_function():
    x = 5
    def inner_function():
        nonlocal x
        x += 1
        return x
    return inner_function

# Create a closure
closure = outer_function()
print(closure()) # Output: 6
print(closure()) # Output: 7
Advantages of closures in Python:
- Avoids the use of global variables
- Can provide partial data hiding
- Can provide a more elegant alternative to a small class when only a single piece of behavior with some private state is needed
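These advantages can be illustrated with a small counter factory. The names make_counter and increment are hypothetical; the sketch shows state kept without a global variable and separate closures holding independent state.

```python
def make_counter(start=0):
    """Hypothetical factory: each call returns an independent counter."""
    count = start  # lives in the enclosing scope, not in a global

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


counter_a = make_counter()
counter_b = make_counter(10)
print(counter_a())  # 1
print(counter_a())  # 2
print(counter_b())  # 11: counter_b has its own state

# The captured variable is reachable only through the closure's cells,
# which is why the hiding is described as partial.
print(counter_a.__closure__[0].cell_contents)  # 2
```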