I/O-bound applications spend significantly more time waiting for input/output operations than executing CPU instructions. Common scenarios include network requests, disk access, web scraping, and database queries.
Python offers three approaches to concurrency for I/O-bound tasks: multiprocessing, multithreading, and asynchronous I/O (asyncio). In theory, asyncio should perform best of the three, for the following reasons:
- No context-switching overhead between processes or threads: coroutines are scheduled cooperatively inside a single thread.
- Less kernel involvement: switching between coroutines happens entirely in user space, although the I/O itself still goes through the kernel.
- Better scalability: process and thread pools are bounded by their worker count, which by default is derived from the CPU core count, whereas asyncio is bounded mainly by system limits such as the number of file descriptors the I/O multiplexer (e.g., epoll on Linux) can watch.
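The single-thread argument can be illustrated without any database. The following is a minimal sketch, assuming `asyncio.sleep` as a stand-in for a network wait; the names `fake_io` and `run_many` are illustrative and are not part of the tests described in this article.

```python
import asyncio
import threading
import time


async def fake_io(delay):
    """Stand-in for an I/O wait; returns the id of the thread it ran on."""
    await asyncio.sleep(delay)  # yields to the event loop instead of blocking
    return threading.get_ident()


async def run_many(count, delay):
    """Run `count` simulated waits concurrently on one event loop."""
    start = time.perf_counter()
    thread_ids = await asyncio.gather(*(fake_io(delay) for _ in range(count)))
    elapsed = time.perf_counter() - start
    return elapsed, set(thread_ids)


if __name__ == "__main__":
    elapsed, threads = asyncio.run(run_many(500, 0.1))
    print(f"500 waits took {elapsed:.2f}s on {len(threads)} thread(s)")
```

All waits overlap on one thread, so the total time should stay close to a single delay rather than growing with the number of tasks.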
To see whether these theoretical advantages translate into real-world gains, we ran the same workload with five approaches. The workload queries 500 database instances, with a 100 ms delay on each to simulate query execution. The approaches are:
- Sequential execution
- Multiprocessing
- Multithreading
- Asyncio
- Asyncio with uvloop
The uvloop variant replaces the default asyncio event loop with a Cython implementation built on libuv; its authors report performance comparable to Node.js and Go.
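To make the idea of replacing the event loop concrete, here is a small sketch that enables uvloop only when it is importable and then reports which loop class actually runs the coroutine. The helper names `install_uvloop_if_available` and `running_loop_name` are hypothetical, and uvloop is assumed to be installed separately as a third-party package. Recent uvloop releases also provide `uvloop.run()` as an alternative entry point.

```python
import asyncio


def install_uvloop_if_available():
    """Switch asyncio to uvloop when the package is importable; otherwise do nothing."""
    try:
        import uvloop  # third-party; assumed to be installed separately
    except ImportError:
        return False
    uvloop.install()  # same call the uvloop test script in this article uses
    return True


async def running_loop_name():
    """Return the module and class name of the loop executing this coroutine."""
    loop = asyncio.get_running_loop()
    return f"{type(loop).__module__}.{type(loop).__qualname__}"


if __name__ == "__main__":
    using_uvloop = install_uvloop_if_available()
    print("uvloop enabled:", using_uvloop)
    print("event loop class:", asyncio.run(running_loop_name()))
```

Because the rest of the program only talks to the asyncio API, nothing else has to change when the loop is swapped.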
Below are the test implementations using Python 3.7+:
Sequential Execution
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import records

# Placeholders: replace xx with real credentials and port.
user = xx
passw = xx
port = xx
hosts = [...]  # List of 500 database hosts


def query(host):
    # Connect to one host and run a query that sleeps for 100 ms on the server.
    conn = records.Database(f'mysql+pymysql://{user}:{passw}@{host}:{port}/mysql?charset=utf8mb4')
    rows = conn.query('select sleep(0.1);')
    print(rows[0])


def main():
    # Query the hosts one after another.
    for h in hosts:
        query(h)


if __name__ == '__main__':
    main()
Multiprocessing
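#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from concurrent import futures

import records

# Placeholders: replace xx with real credentials and port.
user = xx
passw = xx
port = xx
hosts = [...]  # List of 500 database hosts


def query(host):
    # Connect to one host and run a query that sleeps for 100 ms on the server.
    conn = records.Database(f'mysql+pymysql://{user}:{passw}@{host}:{port}/mysql?charset=utf8mb4')
    rows = conn.query('select sleep(0.1);')
    print(rows[0])


def main():
    # Default pool size: one worker process per available CPU.
    with futures.ProcessPoolExecutor() as executor:
        # Consume the iterator so exceptions raised in workers surface here.
        for _ in executor.map(query, hosts):
            pass


if __name__ == '__main__':
    main()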
Multithreading
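#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from concurrent import futures

import records

# Placeholders: replace xx with real credentials and port.
user = xx
passw = xx
port = xx
hosts = [...]  # List of 500 database hosts


def query(host):
    # Connect to one host and run a query that sleeps for 100 ms on the server.
    conn = records.Database(f'mysql+pymysql://{user}:{passw}@{host}:{port}/mysql?charset=utf8mb4')
    rows = conn.query('select sleep(0.1);')
    print(rows[0])


def main():
    # Default pool size depends on the Python version and the CPU count.
    with futures.ThreadPoolExecutor() as executor:
        # Consume the iterator so exceptions raised in workers surface here.
        for _ in executor.map(query, hosts):
            pass


if __name__ == '__main__':
    main()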
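Both executor versions above use the default pool size, which caps how many queries are in flight at once. `ProcessPoolExecutor` defaults to the number of processors, and `ThreadPoolExecutor` defaults to `min(32, os.cpu_count() + 4)` on Python 3.8+ (five times the processor count on 3.7). The sketch below shows the effect of passing `max_workers` explicitly. It is an illustration under stated assumptions, not the configuration used for the measurements: `fake_query` sleeps instead of touching a database, and the host names are made up.

```python
import time
from concurrent import futures


def fake_query(host):
    """Stand-in for the blocking 100 ms query; sleeps instead of using a database."""
    time.sleep(0.1)
    return host


def run_threaded(hosts, max_workers=None):
    """Run fake_query over hosts; max_workers=None keeps the executor default."""
    start = time.perf_counter()
    with futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        results = list(executor.map(fake_query, hosts))
    return time.perf_counter() - start, results


if __name__ == "__main__":
    hosts = [f"host-{i}" for i in range(100)]  # hypothetical host names
    default_time, _ = run_threaded(hosts)
    wide_time, _ = run_threaded(hosts, max_workers=len(hosts))
    print(f"default pool: {default_time:.2f}s, one thread per host: {wide_time:.2f}s")
```

With one thread per host every simulated query overlaps, at the cost of creating that many threads.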
Asyncio
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import asyncio

from databases import Database

# Placeholders: replace xx with real credentials and port.
user = xx
passw = xx
port = xx
hosts = [...]  # List of 500 database hosts


async def query(host):
    DATABASE_URL = f'mysql+pymysql://{user}:{passw}@{host}:{port}/mysql?charset=utf8mb4'
    # The context manager connects on entry and disconnects on exit.
    async with Database(DATABASE_URL) as database:
        query_str = 'select sleep(0.1);'
        rows = await database.fetch_all(query=query_str)
        print(rows[0])


async def main():
    # Schedule one task per host, then wait for all of them to finish.
    tasks = [asyncio.create_task(query(host)) for host in hosts]
    await asyncio.gather(*tasks)


if __name__ == '__main__':
    asyncio.run(main())
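The asyncio version above starts all 500 queries at once. If that is more simultaneous connections than the client or the servers should carry, concurrency can be capped with `asyncio.Semaphore`. This is a sketch of that option and not something the measured script does: `asyncio.sleep` stands in for the query, and the limit of 100 and the host names are arbitrary.

```python
import asyncio


async def bounded_query(host, semaphore, stats, delay):
    """Simulated query that runs only while holding a semaphore slot."""
    async with semaphore:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(delay)  # stand-in for `select sleep(0.1);`
        stats["active"] -= 1
    return host


async def run_bounded(hosts, limit, delay=0.1):
    """Run one simulated query per host with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)
    stats = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(bounded_query(h, semaphore, stats, delay) for h in hosts)
    )
    return results, stats["peak"]


if __name__ == "__main__":
    hosts = [f"host-{i}" for i in range(500)]  # hypothetical host names
    _, peak = asyncio.run(run_bounded(hosts, limit=100))
    print("peak concurrent queries:", peak)
```

The trade-off is throughput: a lower limit means more batches, so total time grows as the limit shrinks.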
Asyncio with uvloop
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import asyncio

import uvloop
from databases import Database

# Placeholders: replace xx with real credentials and port.
user = xx
passw = xx
port = xx
hosts = [...]  # List of 500 database hosts


async def query(host):
    DATABASE_URL = f'mysql+pymysql://{user}:{passw}@{host}:{port}/mysql?charset=utf8mb4'
    # The context manager connects on entry and disconnects on exit.
    async with Database(DATABASE_URL) as database:
        query_str = 'select sleep(0.1);'
        rows = await database.fetch_all(query=query_str)
        print(rows[0])


async def main():
    # Schedule one task per host, then wait for all of them to finish.
    tasks = [asyncio.create_task(query(host)) for host in hosts]
    await asyncio.gather(*tasks)


if __name__ == '__main__':
    # Make asyncio.run() create a uvloop event loop instead of the default one.
    uvloop.install()
    asyncio.run(main())
Execution Time Comparison
| Method | Time Taken |
|---|---|
| Sequential | 1m7.745s |
| Multiprocessing | 2.932s |
| Multithreading | 4.813s |
| Asyncio | 1.068s |
| Asyncio + uvloop | 0.750s |
All four concurrent methods are much faster than sequential execution. Asyncio with uvloop is the fastest, finishing in approximately 1/90th of the sequential time.
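The speedup factors can be recomputed directly from the table. The short script below does only that arithmetic; the helper `to_seconds` is an illustrative name, and the durations are copied verbatim from the table above.

```python
import re

# Times copied from the comparison table.
timings = {
    "Sequential": "1m7.745s",
    "Multiprocessing": "2.932s",
    "Multithreading": "4.813s",
    "Asyncio": "1.068s",
    "Asyncio + uvloop": "0.750s",
}


def to_seconds(text):
    """Convert strings such as '1m7.745s' or '0.750s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes or 0) * 60 + float(seconds)


baseline = to_seconds(timings["Sequential"])
speedups = {name: baseline / to_seconds(value) for name, value in timings.items()}

for name, factor in speedups.items():
    print(f"{name:<18} {factor:5.1f}x")
```

This gives roughly 23x for multiprocessing, 14x for multithreading, 63x for asyncio, and 90x for asyncio with uvloop.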
Memory Usage Comparison
Asyncio with uvloop has the lowest memory footprint, at about 60 MB, compared with about 1.4 GB for multiprocessing. This is consistent with process creation being resource-intensive.
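The article does not say how these memory figures were collected. One possible way to read a process's peak memory on Unix is the standard-library `resource` module, sketched below. The helper name `peak_rss_mib` is hypothetical. For child processes, `RUSAGE_CHILDREN` reports the largest single child rather than the sum, so it would not reproduce a pool-wide total by itself.

```python
import resource
import sys


def peak_rss_mib(who=resource.RUSAGE_SELF):
    """Peak resident set size in MiB for this process (or its reaped children)."""
    peak = resource.getrusage(who).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


if __name__ == "__main__":
    data = [bytes(1024) for _ in range(10_000)]  # allocate about 10 MiB
    print(f"peak RSS: {peak_rss_mib():.1f} MiB")
```

Calling it at the end of each test script would give one comparable number per method for the single-process variants.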
In summary, asyncio outperformed both multiprocessing and multithreading in this test in terms of execution speed and memory efficiency. Combined with uvloop it performed better still, which makes it a strong choice for I/O-bound applications.