When debugging Python programs, exceptions are often raised. These can be due to mistakes made during programming or due to unavoidable conditions. In the former case, it's necessary to trace back to the error point using the exception's traceback and make corrections. For the latter, we can catch and handle the exceptions to prevent the program from terminating.
- Exception Types
1.1 Built-in Exceptions in Python
Python has a powerful exception handling system with many built-in exceptions that provide accurate feedback when an error occurs. All exceptions in Python are objects, and they can be manipulated like any other object. BaseException is the base class for all built-in exceptions, but user-defined classes should not inherit from it directly; they should inherit from Exception or one of its subclasses. In Python 3 the built-in exceptions are defined in the builtins module (the separate exceptions module existed only in Python 2), and their names are always available in the built-in namespace, so no import is needed. If a SystemExit exception is not caught, the program terminates; in an interactive session, an uncaught SystemExit ends the session as well.
The hierarchy of built-in exception classes is as follows:
BaseException # Base class for all exceptions
 +-- SystemExit # Request for the interpreter to exit
 +-- KeyboardInterrupt # User interruption (typically ^C)
 +-- GeneratorExit # Exception raised to notify a generator to exit
 +-- Exception # Base class for regular exceptions
      +-- StopIteration # No more values in an iterator
      +-- StopAsyncIteration # Must be raised by the __anext__() method of an asynchronous iterator to stop iteration
      +-- ArithmeticError # Base class for various arithmetic errors
      |    +-- FloatingPointError # Floating point calculation error
      |    +-- OverflowError # Result of a numerical operation too large to represent
      |    +-- ZeroDivisionError # Division (or modulo) by zero (for all data types)
      +-- AssertionError # Raised when an assert statement fails
      +-- AttributeError # Failure to reference or assign an attribute
      +-- BufferError # Raised when an operation cannot be performed on a buffer
      +-- EOFError # Raised when input() reaches end-of-file without reading any data
      +-- ImportError # Failure to import a module/object
      |    +-- ModuleNotFoundError # Module not found or None in sys.modules
      +-- LookupError # Base class for exceptions raised when a key or index is invalid
      |    +-- IndexError # Index not present in a sequence
      |    +-- KeyError # Key not present in a mapping
      +-- MemoryError # Out of memory error (not fatal to the Python interpreter)
      +-- NameError # Local or global name not found (name not declared/initialized)
      |    +-- UnboundLocalError # Accessing an uninitialized local variable
      +-- OSError # Operating system error; EnvironmentError, IOError, WindowsError, socket.error, select.error, and mmap.error have been merged into OSError
      |    +-- BlockingIOError # Operation would block on an object (e.g., socket) set to non-blocking
      |    +-- ChildProcessError # Operation on a child process failed
      |    +-- ConnectionError # Base class for connection-related exceptions
      |    |    +-- BrokenPipeError # Writing to a pipe or socket that has been closed
      |    |    +-- ConnectionAbortedError # Connection attempt was aborted by the peer
      |    |    +-- ConnectionRefusedError # Connection attempt was refused by the peer
      |    |    +-- ConnectionResetError # Connection was reset by the peer
      |    +-- FileExistsError # Attempting to create an existing file or directory
      |    +-- FileNotFoundError # Requesting a non-existent file or directory
      |    +-- InterruptedError # System call interrupted by a signal
      |    +-- IsADirectoryError # Requesting a file operation on a directory (e.g., os.remove())
      |    +-- NotADirectoryError # Requesting a directory operation on something that is not a directory (e.g., os.listdir())
      |    +-- PermissionError # Attempting to perform an operation without sufficient access rights
      |    +-- ProcessLookupError # Given process does not exist
      |    +-- TimeoutError # System-level timeout occurred
      +-- ReferenceError # Weak reference attempts to access an object that has been garbage collected
      +-- RuntimeError # Triggered when an error is detected that does not fit into any other category
      |    +-- NotImplementedError # Indicates that an abstract method in a user-defined base class needs to be overridden
      |    +-- RecursionError # Interpreter detects maximum recursion depth exceeded
      +-- SyntaxError # Python syntax error
      |    +-- IndentationError # Indentation error
      |         +-- TabError # Mixed use of tabs and spaces
      +-- SystemError # Interpreter discovers an internal error
      +-- TypeError # Operation or function applied to an inappropriate type of object
      +-- ValueError # Operation or function receives a parameter of correct type but unsuitable value
      |    +-- UnicodeError # Unicode-related encoding or decoding error
      |         +-- UnicodeDecodeError # Unicode decoding error
      |         +-- UnicodeEncodeError # Unicode encoding error
      |         +-- UnicodeTranslateError # Unicode translation error
      +-- Warning # Base class for warnings
           +-- DeprecationWarning # Warning about deprecated features
           +-- PendingDeprecationWarning # Warning about features that are to be deprecated
           +-- RuntimeWarning # Warning about suspicious runtime behavior
           +-- SyntaxWarning # Warning about questionable syntax
           +-- UserWarning # Warning generated by user code
           +-- FutureWarning # Warning about constructs whose behavior will change in the future
           +-- ImportWarning # Warning about potential issues during module import
           +-- UnicodeWarning # Warning related to Unicode
           +-- BytesWarning # Warning related to bytes and bytearray
           +-- ResourceWarning # Warning related to resource usage. Ignored by default.
For detailed information, refer to: https://docs.python.org/3/library/exceptions.html#base-classes
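The parent/child relationships in the hierarchy can be checked directly with issubclass() and the class's method resolution order. The following is only a minimal sketch; the helper name ancestry is made up for this illustration and is not part of the standard library.

```python
def ancestry(exc_type):
    """Return the names of exc_type and its base classes, up to BaseException."""
    return [cls.__name__ for cls in exc_type.__mro__ if cls is not object]


# ZeroDivisionError -> ArithmeticError -> Exception -> BaseException
print(ancestry(ZeroDivisionError))

# KeyboardInterrupt sits beside Exception, not under it,
# so "except Exception" does not catch it.
print(issubclass(KeyboardInterrupt, Exception))  # False
print(issubclass(FileNotFoundError, OSError))    # True

# Catching a parent class also catches its children.
try:
    {}["missing"]
except LookupError as e:
    caught = type(e).__name__
print(caught)  # KeyError
```

Because a handler for a parent class matches every subclass, catching LookupError here handles both KeyError and IndexError.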
1.2 Exceptions from the requests Module
When working with web scraping, the requests module is very useful. Here, we'll discuss the exceptions specific to this module.
To use the exceptions defined by the requests module, import them from requests.exceptions like this:
from requests.exceptions import ConnectionError, ReadTimeout
Or import them from the top-level package:
from requests import ConnectionError, ReadTimeout
The hierarchy of exceptions in the requests module is as follows:
IOError
 +-- RequestException # Ambiguous exception that occurred while handling a request
      +-- HTTPError # HTTP error
      +-- ConnectionError # Connection error
      |    +-- ProxyError # Proxy error
      |    +-- SSLError # SSL error
      |    +-- ConnectTimeout (also inherits from Timeout) # Request timed out while connecting to a remote server
      +-- Timeout # Request timeout
      |    +-- ReadTimeout # Server did not send any data within the specified time
      +-- URLRequired # A valid URL is required to make a request
      +-- TooManyRedirects # Too many redirects
      +-- MissingSchema (also inherits from ValueError) # Missing URL schema (e.g., http or https)
      +-- InvalidSchema (also inherits from ValueError) # Invalid schema, see defaults.py for valid schemas
      +-- InvalidURL (also inherits from ValueError) # Invalid URL
      |    +-- InvalidProxyURL # Invalid proxy URL
      +-- InvalidHeader (also inherits from ValueError) # Invalid header
      +-- ChunkedEncodingError # Server declared chunked encoding but sent an invalid chunk
      +-- ContentDecodingError (also inherits from BaseHTTPError) # Unable to decode response content
      +-- StreamConsumedError (also inherits from TypeError) # The response content has already been used
      +-- RetryError # Custom retry logic failed
      +-- UnrewindableBodyError # An error occurred while trying to rewind the body
Warning
 +-- RequestsWarning # Base warning for requests
      +-- FileModeWarning (also inherits from DeprecationWarning) # File opened in text mode but Requests determined its binary length
      +-- RequestsDependencyWarning # Dependency version mismatch
For more details and source code, refer to: http://www.python-requests.org/en/master/_modules/requests/exceptions/#RequestException
Here is a simple example. Note that the ConnectionError raised by requests is not a subclass of Python's built-in ConnectionError (both derive from OSError independently), so it must be imported from the requests module for the except clause to match:
import requests
from requests import ConnectionError, ReadTimeout


def get_page(url):
    try:
        response = requests.get(url, timeout=1)
        if response.status_code == 200:
            return response.text
        else:
            print('Get Page Failed', response.status_code)
            return None
    except (ConnectionError, ReadTimeout):
        # Network-level failure or the server did not answer in time
        print('Crawling Failed', url)
        return None


def main():
    url = 'https://www.baidu.com'
    print(get_page(url))


if __name__ == '__main__':
    main()
1.3 Custom Exceptions
You can also define your own exceptions by creating a new class that inherits from Exception. Below is an example of a MyError class, which is based on Exception and carries an additional message when it is raised.
In the try block, after the custom exception is raised, the matching except block is executed. The variable e is bound to the MyError instance that was raised.
class MyError(Exception):
    def __init__(self, msg):
        self.msg = msg

    def __str__(self):
        return self.msg


try:
    raise MyError('Type Error')
except MyError as e:
    print('My exception occurred', e.msg)
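In larger programs it is common to give custom exceptions a shared base class so that callers can catch the whole family at once. The sketch below illustrates this; the names CrawlerError, PageFetchError, and describe_failure are hypothetical and chosen only for this example.

```python
class CrawlerError(Exception):
    """Hypothetical base class for all errors raised by an example crawler."""


class PageFetchError(CrawlerError):
    """Raised when a page cannot be fetched; carries the URL and status code."""

    def __init__(self, url, status):
        # Passing the message to Exception.__init__ keeps str(e) and e.args useful.
        super().__init__(f"failed to fetch {url} (status {status})")
        self.url = url
        self.status = status


def describe_failure(url, status):
    """Raise PageFetchError and report it through the base-class handler."""
    try:
        raise PageFetchError(url, status)
    except CrawlerError as e:  # the parent class also matches the subclass
        return f"{type(e).__name__}: {e}"


print(describe_failure("https://example.com", 503))
# PageFetchError: failed to fetch https://example.com (status 503)
```

Calling super().__init__() with the message means no separate __str__ method is needed, while the extra attributes remain available to the handler.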
- Exception Handling
When an exception occurs, it should be caught and handled appropriately. Python uses the try...except structure for this: place the code that may fail inside the try block, and use except clauses to handle the exception. Every try must be followed by at least one except clause or a finally clause. Other keywords related to exception handling include else, finally, and raise, which are covered in the sections below.
2.1 Catch All Exceptions
A bare except catches every exception, including keyboard interrupts and exit requests (a call to sys.exit() will not exit the program, because the SystemExit exception it raises is caught). Use it with care.
try:
    <statement>
except:
    print('Exception message')
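The effect on sys.exit() can be demonstrated with a small sketch that compares a bare except with except Exception. The function names bare_except and except_exception are invented for this illustration.

```python
import sys


def bare_except():
    try:
        sys.exit(1)
    except:  # bare except: also catches SystemExit and KeyboardInterrupt
        return "exit request swallowed"


def except_exception():
    try:
        sys.exit(1)
    except Exception:  # SystemExit is not a subclass of Exception
        return "never reached"


print(bare_except())  # exit request swallowed

try:
    except_exception()
except SystemExit as e:
    exit_code = e.code
    print("SystemExit propagated with code", exit_code)
```

The second function lets the exit request pass through, which is usually what you want from a catch-all handler.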
2.2 Catch Specific Exceptions
try:
    <statement>
except <exception name>:
    print('Exception message')
To catch all regular exceptions (everything derived from Exception):
try:
    <statement>
except Exception:
    print('Exception message')
An example:
try:
    f = open("file-not-exists", "r")
except IOError as e:
    print("open exception: %s: %s" % (e.errno, e.strerror))
2.3 Catch Multiple Exceptions
There are two ways to catch multiple exceptions. The first way is to handle multiple exceptions in a single except clause without prioritizing:
try:
    <statement>
except (<exception name1>, <exception name2>, ...):
    print('Exception message')
The second way is to prioritize:
try:
    <statement>
except <exception name1>:
    print('Exception message1')
except <exception name2>:
    print('Exception message2')
except <exception name3>:
    print('Exception message3')
The rules for this syntax are:
The try block executes its statements. If an exception is raised, the rest of the try block is skipped and execution jumps to the first except clause.
If the exception matches the class named in the first except clause (or is an instance of one of its subclasses), the code in that except block is executed.
If the exception does not match the first except, the next except clause is checked, and so on; there is no limit to the number of except clauses.
If none of the except clauses match, the exception is passed on to the next enclosing try block, if there is one. If no handler is found anywhere, the program terminates with a traceback.
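The matching rules above can be sketched with a nested try statement. This is only an illustration; the function names classify, missing_file, denied, and bad_number are made up for the example.

```python
def classify(action):
    """Run action() and report which except clause handled its exception."""
    try:
        try:
            action()
        except FileNotFoundError:   # most specific class first
            return "inner: FileNotFoundError"
        except OSError:             # parent class after its subclasses
            return "inner: OSError"
    except ValueError:              # reached only if no inner clause matched
        return "outer: ValueError"
    return "no exception"


def missing_file():
    raise FileNotFoundError("file-not-exists")


def denied():
    raise PermissionError("no access")


def bad_number():
    int("hello world")


print(classify(missing_file))  # inner: FileNotFoundError
print(classify(denied))        # inner: OSError
print(classify(bad_number))    # outer: ValueError
print(classify(lambda: None))  # no exception
```

Since the clauses are checked in order, a handler for a parent class such as OSError should come after the handlers for its subclasses; otherwise the more specific handlers would never run.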
2.4 Else Clause in Exceptions
If some code should run only when the try block completes without raising an exception, put it in an else clause.
try:
    <statement>
except <exception name1>:
    print('Exception message1')
except <exception name2>:
    print('Exception message2')
else:
    <statement> # This code is executed if no exception occurs in the try block
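A typical use of else is to keep the try block limited to the one statement that is expected to fail. The following sketch assumes a hypothetical helper named parse_port, introduced only for illustration.

```python
def parse_port(text):
    """Convert text to a port number, keeping the try block as small as possible."""
    try:
        port = int(text)
    except ValueError:
        return None  # conversion failed; the else clause is skipped
    else:
        # Runs only when int() succeeded. An exception raised here would
        # not be caught by the except clause above.
        return port if 0 < port < 65536 else None


print(parse_port("8080"))   # 8080
print(parse_port("http"))   # None
print(parse_port("70000"))  # None
```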
2.5 Finally Clause in Exceptions
The finally clause is always executed regardless of whether an exception occurs.
try:
    <statement>
finally:
    <statement>
Here is an example:
str1 = 'hello world'
try:
    int(str1)
except IndexError as e:
    print(e)
except KeyError as e:
    print(e)
except ValueError as e:
    print(e)
else:
    print('No exception in try block')
finally:
    print('I will execute regardless of exceptions')
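The finally clause runs even when the try or except block returns. The sketch below records the order in which the clauses execute; the function names run and body are arbitrary and exist only for this illustration.

```python
def run(fail):
    """Record the order in which the clauses execute."""
    events = []

    def body():
        try:
            events.append("try")
            if fail:
                raise ValueError("boom")
            return "returned from try"
        except ValueError:
            events.append("except")
            return "returned from except"
        finally:
            # Executes just before either return above takes effect.
            events.append("finally")

    result = body()
    return result, events


print(run(fail=False))  # ('returned from try', ['try', 'finally'])
print(run(fail=True))   # ('returned from except', ['try', 'except', 'finally'])
```

This is why finally is the usual place for cleanup such as closing files or releasing locks.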
2.6 Raising Exceptions Manually
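You can manually trigger an exception using the raise statement. In Python 3 the syntax is as follows:
raise [Exception[(args)] [from cause]]
In this statement, Exception is the exception class (e.g., ValueError) and args are the arguments passed to it. The arguments are optional; if they are omitted, the exception is created with no arguments. The optional from clause records another exception as the cause of the new one. The older three-part form raise Exception, args, traceback is Python 2 syntax and is rejected by Python 3; in the rare cases where a traceback object has to be attached explicitly, use the exception's with_traceback() method.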
An example:
def not_zero(num):
    try:
        if num == 0:
            raise ValueError('Parameter error')
        return num
    except Exception as e:
        print(e)


not_zero(0)
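Two other common forms of raise are a bare raise, which re-raises the exception currently being handled, and raise ... from ..., which chains a new exception to the original one. The following is a minimal sketch; ConfigError, get_setting, and checked_divide are hypothetical names used only here.

```python
class ConfigError(Exception):
    """Hypothetical application-level exception used for this example."""


def get_setting(settings, key):
    try:
        return settings[key]
    except KeyError as e:
        # "from e" stores the original exception in __cause__.
        raise ConfigError(f"missing setting: {key}") from e


def checked_divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        print("logging the failure, then re-raising")
        raise  # a bare raise re-raises the exception being handled


try:
    get_setting({}, "timeout")
except ConfigError as e:
    message = str(e)
    cause = type(e.__cause__).__name__
    print(message, "| caused by", cause)
# missing setting: timeout | caused by KeyError
```

Chaining keeps the original error visible in the traceback, so the low-level reason for the failure is not lost when it is translated into an application-specific exception.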
2.7 Using the traceback Module to View Exceptions
When an exception occurs, Python remembers the exception and the current state of the program. Python also maintains a traceback object, which contains information about the function call stack at the time of the exception. Remember, exceptions can occur deep within nested function calls. When a function is called, Python inserts the function name at the beginning of the function call stack. Once an exception is raised, Python searches for a matching exception handler. If no handler is found in the current function, the current function terminates, and Python searches the calling function, and so on, until a matching handler is found or Python reaches the main program. This process of finding a matching exception handler is called "stack unwinding." The interpreter maintains information about functions placed on the stack and also about functions that have been "unwound" from the stack.
The format is as follows:
try:
    block
except:
    traceback.print_exc()
An example:
try:
    1/0
except Exception as e:
    print(e)
Written this way, the program only reports "division by zero"; it does not tell you in which file, function, or line the error occurred.
Using the traceback module, as per the official documentation: https://docs.python.org/2/library/traceback.html
import traceback

try:
    1/0
except Exception as e:
    traceback.print_exc()
This will help trace back to the exact location of the error:
Traceback (most recent call last):
  File "E:/PycharmProjects/ProxyPool-master/proxypool/test.py", line 4, in <module>
    1/0
ZeroDivisionError: division by zero
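The call chain that is unwound while Python searches for a handler is recorded in the exception's traceback object and can be inspected with traceback.extract_tb(). The sketch below uses made-up function names (inner, middle, outer, call_chain) purely for illustration.

```python
import traceback


def inner():
    return 1 / 0      # the exception is raised here


def middle():
    return inner()    # no handler: unwinding continues


def outer():
    return middle()   # no handler: unwinding continues


def call_chain():
    """Return the function names recorded in the traceback, outermost first."""
    try:
        outer()
    except ZeroDivisionError as e:
        frames = traceback.extract_tb(e.__traceback__)
        return [frame.name for frame in frames]


print(call_chain())  # ['call_chain', 'outer', 'middle', 'inner']
```

The list starts at the function that caught the exception and ends at the function that raised it, which is the same order in which the frames appear in a printed traceback.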
What is the difference between traceback.print_exc() and traceback.format_exc()?
The difference is that format_exc() returns the traceback as a string, while print_exc() prints it directly, to sys.stderr by default. That is, traceback.print_exc() is roughly equivalent to print(traceback.format_exc(), file=sys.stderr). The print_exc() function also accepts a file parameter to write to a file instead. For example, inside an except block you can write the traceback to a file named tb.txt as follows:
with open('tb.txt', 'w') as f:
    traceback.print_exc(file=f)
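The relationship between the two functions can be checked with an in-memory buffer in place of a real file. This is a small sketch under that assumption; the function name capture_both is invented for the example.

```python
import io
import traceback


def capture_both():
    """Return the traceback text from format_exc() and from print_exc(file=...)."""
    try:
        1 / 0
    except ZeroDivisionError:
        formatted = traceback.format_exc()  # returns a str
        buffer = io.StringIO()
        traceback.print_exc(file=buffer)    # writes to the file-like object
        return formatted, buffer.getvalue()


formatted, printed = capture_both()
print(formatted == printed)        # True
print(formatted.splitlines()[-1])  # ZeroDivisionError: division by zero
```

Because format_exc() returns a string, it is convenient when the traceback has to be passed on, for example to a logger, rather than printed.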
Original source: https://blog.csdn.net/polyhedronx/article/details/81589196