Dictionary's Abstract Base Class Inheritance
In Python, dictionaries belong to the mapping type. Let's examine their inheritance relationship by examining the source code.from collections.abc import Mapping, MutableMappingExamining the MutableMapping source code:
class MutableMapping(Mapping):
__slots__ = ()
"""A MutableMapping is a generic container for associating
key/value pairs.
This class provides concrete generic implementations of all
methods except for __getitem__, __setitem__, __delitem__,
__iter__, and __len__.
"""
@abstractmethod
def __setitem__(self, key, value):
raise KeyError
@abstractmethod
def __delitem__(self, key):
raise KeyError
__marker = object()
def pop(self, key, default=__marker):
'''D.pop(k[,d]) -> v, remove specified key and return the corresponding value.
If key is not found, d is returned if given, otherwise KeyError is raised.
'''
try:
value = self[key]
except KeyError:
if default is self.__marker:
raise
return default
else:
del self[key]
return value
def popitem(self):
'''D.popitem() -> (k, v), remove and return some (key, value) pair
as a 2-tuple; but raise KeyError if D is empty.
'''
try:
key = next(iter(self))
except StopIteration:
raise KeyError
value = self[key]
del self[key]
return key, value
def clear(self):
'D.clear() -> None. Remove all items from D.'
try:
while True:
self.popitem()
except KeyError:
pass
def update(*args, **kwds):
''' D.update([E, ]**F) -> None. Update D from mapping/iterable E and F.
If E present and has a .keys() method, does: for k in E: D[k] = E[k]
If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v
In either case, this is followed by: for k, v in F.items(): D[k] = v
'''
if not args:
raise TypeError("descriptor 'update' of 'MutableMapping' object "
"needs an argument")
self, *args = args
if len(args) > 1:
raise TypeError('update expected at most 1 arguments, got %d' %
len(args))
if args:
other = args[0]
if isinstance(other, Mapping):
for key in other:
self[key] = other[key]
elif hasattr(other, "keys"):
for key in other.keys():
self[key] = other[key]
else:
for key, value in other:
self[key] = value
for key, value in kwds.items():
self[key] = value
def setdefault(self, key, default=None):
'D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if k not in D'
try:
return self[key]
except KeyError:
self[key] = default
return default
MutableMapping.register(dict)
We can see that MutableMapping inherits from Mapping (immutable), and finally registers dict with MutableMapping. We can verify a dictionary's type using isinstance:
from collections.abc import Mapping, MutableMapping
# dict is a mapping type
sample_dict = {}
print(isinstance(sample_dict, MutableMapping))
Common Dictionary Methods
Shallow Copy
data = {
"person1":{"organization":"tech"},
"person2": {"organization": "tech2"}
}
# copy returns a shallow copy
copied_dict = data.copy()
copied_dict["person1"]["organization"] = "tech3"
fromkeys
Converts an iterable object to a dictionary:
name_list = ["alice", "bob"]
new_dict = dict.fromkeys(name_list, {"company":"tech"})
get
Retrieves values with a default option to avoid KeyError exceptions:
data = {"user1": "john", "user2": "jane"}
value = data.get("user3", "not found")
setdefault
Similar to get(), but if a key doesn't exist, it adds the key with the default value:
data = {"user1": "john", "user2": "jane"}
# When key doesn't exist
data.setdefault("user3", "not found")
update
Used to add dictionary elements:
data = {"user1": "john", "user2": "jane"}
# Direct dictionary addition
data.update({"user3": "hong"})
# Using keyword arguments
data.update(user4="david", user5="emma")
Dictionary Subclasses
When we want to inherit from dict to override logic, it's not recommended to directly inherit from built-in types like dict and list because they're implemented in C. Our overridden methods might not work as expected:
# Not recommended to inherit from dict
class MyDict(dict):
def __setitem__(self, key, value):
super().__setitem__(key, value*2)
custom_dict = MyDict(one=1) # Doesn't take effect here
print(custom_dict)
custom_dict["one"] = 1 # Takes effect here
print(custom_dict)
Instead, we should use UserDict provided by Python:
from collections import UserDict
class MyDict(UserDict):
def __setitem__(self, key, value):
super().__setitem__(key, value*2)
custom_dict = MyDict(one=1)
Default Dictionary
When accessing a non-existent value, it returns a default value instead of raising a KeyError:from collections import defaultdict custom_dict = defaultdict(dict) custom_value = custom_dict["new_key"]
Set and Frozenset
Initializing a Set
# Using set keyword
set1 = set('abc')
# Using {}
set2 = {'a', 'b'}
print(type(set1), type(set2))
Adding Elements to a Set
set1 = set('abc')
set1.add('d')
print(set1)
Updating a Set
set1 = set('abc')
set2 = set('xy')
set1.update(set2)
print(set1)
Set Difference
set1 = set('abc')
set2 = set('cd')
# Equivalent to set1 - set2
result_set = set1.difference(set2)
print(result_set)
Set Mathematical Operations
set1 = set('abc')
set2 = set('cd')
# Difference
print(set1-set2)
# Intersection
print(set1 & set2)
# Union
print(set1 | set2)
Subset Check
set1 = set('abc')
set2 = set('c')
print(set2.issubset(set1))
Implementation Principles of Dict and Set
List vs. Dictionary Performance
- Dictionary lookup performance is significantly greater than lists
- In lists, as the data size increases, lookup time for the same amount of data also increases
- In dictionaries, as the data size increases, lookup time is relatively unaffected
Dictionary Principle Summary
Dict values are found quickly because they use the key to calculate a hash offset and directly access the value, resulting in O(1) time complexity.
- Dictionary keys or set values must be hashable. Their implementation principles are the same. Immutable objects are hashable, such as strings, tuples, and frozensets.
- Dictionaries have high memory overhead but fast query speed. In custom classes, as long as the magic function __hash__ is added, the class becomes hashable.
- The storage order of a dictionary is related to the order in which elements are added, but may be affected by hash value conflicts.
- When adding data, the order of existing data might change: when storing a dictionary, Python pre-allocates a continuous memory space larger than the dictionary's requirements to reduce hash conflicts. When the amount of added data exceeds 1/3 of the allocated memory space, Python will apply for a larger memory space, migrate the original data, and recalculate hash values.