Python asyncio Fundamentals
Asyncio is Python’s standard library for writing concurrent code using async/await syntax. It’s built around an event loop that manages cooperative multitasking.
Core Concepts
- Event Loop: The central execution mechanism that runs async tasks and handles I/O operations. When a task awaits something, the loop switches to another task instead of blocking.
- Coroutines: Functions defined with `async def`. They're suspendable execution units that can yield control back to the event loop using `await`.
- Tasks: Wrapped coroutines scheduled to run on the event loop. They enable concurrent execution.
Basic Usage
```python
import asyncio

async def fetch_data(id):
    await asyncio.sleep(1)  # Simulates I/O
    return f"Data {id}"

async def main():
    # Sequential - takes 3 seconds
    result1 = await fetch_data(1)
    result2 = await fetch_data(2)
    result3 = await fetch_data(3)

    # Concurrent - takes 1 second
    results = await asyncio.gather(
        fetch_data(1),
        fetch_data(2),
        fetch_data(3)
    )

asyncio.run(main())
```

Key Functions:
- `asyncio.run()`: Entry point that creates and closes the event loop
- `await`: Suspends coroutine execution until the awaited operation completes
- `asyncio.gather()`: Runs multiple coroutines concurrently, returns results in order
- `asyncio.create_task()`: Schedules a coroutine to run as a Task
- `asyncio.sleep()`: Non-blocking sleep
When to Use
Asyncio excels at I/O-bound operations where you’re waiting on external resources (network requests, file operations, database queries). It’s ineffective for CPU-bound tasks—those need multiprocessing.
Common Pitfalls
- Blocking calls kill performance: Regular `time.sleep()` or synchronous I/O blocks the entire event loop. Use async equivalents or run blocking code in executors (see the sketch after this list).
- Can't await non-async functions: You can't make synchronous code asynchronous just by adding `await`.
- Exception handling: Exceptions in tasks won't propagate unless you await them or use proper exception handling with `gather()`.
- Not a magic bullet: Adds complexity. Only worth it when you have significant I/O wait times and many concurrent operations.
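For the blocking-calls pitfall, here's a minimal sketch of moving synchronous work off the event loop with `asyncio.to_thread()` (Python 3.9+); `blocking_io` is a stand-in for any synchronous call:

```python
import asyncio
import time

def blocking_io():
    # A synchronous call that would otherwise freeze the event loop
    time.sleep(1)
    return "done"

async def main():
    # asyncio.to_thread() runs the function in a worker thread,
    # so other tasks keep running while it blocks
    result = await asyncio.to_thread(blocking_io)
    print(result)

asyncio.run(main())
```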
The syntax is straightforward, but debugging async code and understanding execution flow takes practice.
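When the execution flow gets confusing, asyncio's debug mode is worth knowing about; a minimal sketch, assuming your entry point is `main()`:

```python
# Debug mode logs coroutines that were never awaited and
# callbacks that block the event loop for too long
asyncio.run(main(), debug=True)
```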
Creating Coroutines
Basic method: use `async def`:
```python
async def my_coro():
    return "result"
```

NB: calling a coroutine doesn't run it - it returns a coroutine object that needs to be awaited:
```python
async def fetch_data():
    return "data"

# This creates a coroutine OBJECT, doesn't execute anything
coro = fetch_data()

# Must await it or schedule it to actually run
result = await coro  # Inside another async function
# OR
result = asyncio.run(coro)  # From sync code
```

Tasks
A Task wraps a coroutine and schedules it to run on the event loop. Unlike bare coroutines, Tasks start executing immediately (as soon as the event loop gets control).
```python
async def fetch(id):
    await asyncio.sleep(1)
    return f"Result {id}"

async def main():
    # Create a task - starts running immediately
    task = asyncio.create_task(fetch(1))

    # Do other work while task runs in background
    print("Task is running...")

    # Wait for it to complete
    result = await task
```

Tasks run concurrently with other code. Creating a task doesn't block—it schedules the coroutine and returns immediately.
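A quick timing sketch of that claim: the task makes progress while the parent coroutine awaits something else, so the total is about 1 second, not 1.5:

```python
import asyncio

async def background():
    await asyncio.sleep(1)
    return "done"

async def main():
    task = asyncio.create_task(background())
    await asyncio.sleep(0.5)  # the task keeps running during this await
    print(task.done())        # False: it still needs ~0.5s more
    print(await task)         # "done" after ~1s total, not 1.5s

asyncio.run(main())
```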
Running multiple coroutines
Method 1: asyncio.gather() - Run all, collect all results
```python
async def main():
    results = await asyncio.gather(
        fetch(1),
        fetch(2),
        fetch(3)
    )
    # results = ["Result 1", "Result 2", "Result 3"]
```

Returns results in the order you passed coroutines. If one fails, by default it raises the exception (use `return_exceptions=True` to collect exceptions as values).
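A minimal sketch of the `return_exceptions=True` behavior; `might_fail` is a made-up helper:

```python
import asyncio

async def might_fail(i):
    if i == 1:
        raise ValueError(f"task {i} failed")
    await asyncio.sleep(0.1)
    return i

async def main():
    # Failures are returned as exception objects alongside the
    # successful results instead of aborting the whole gather
    results = await asyncio.gather(
        *(might_fail(i) for i in range(3)),
        return_exceptions=True,
    )
    print(results)  # [0, ValueError('task 1 failed'), 2]

asyncio.run(main())
```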
Method 2: Create tasks explicitly
```python
async def main():
    task1 = asyncio.create_task(fetch(1))
    task2 = asyncio.create_task(fetch(2))
    task3 = asyncio.create_task(fetch(3))

    # All three are now running concurrently

    result1 = await task1
    result2 = await task2
    result3 = await task3
```

More control over individual tasks. You can cancel them, check their status, etc.
Method 3: asyncio.wait() - More control over completion
```python
async def main():
    tasks = [asyncio.create_task(fetch(i)) for i in range(3)]

    # Wait for first one to complete
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)

    # Or wait for all
    done, pending = await asyncio.wait(tasks)
```

Returns sets of done and pending tasks. Useful when you don't care about order or want partial results.
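A common pattern built on this: take the first result and cancel the stragglers. A self-contained sketch (the staggered `fetch` here is a stand-in so task 0 finishes first):

```python
import asyncio

async def fetch(id):
    await asyncio.sleep(1 + id)  # stagger so task 0 completes first
    return f"Result {id}"

async def main():
    tasks = [asyncio.create_task(fetch(i)) for i in range(3)]

    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()             # we only wanted the fastest one
    for task in done:
        print(task.result())      # safe: these tasks have completed

asyncio.run(main())
```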
Method 4: asyncio.as_completed() - Process as they finish
```python
async def main():
    coros = [fetch(i) for i in range(3)]

    for coro in asyncio.as_completed(coros):
        result = await coro
        print(f"Got {result}")  # Prints in completion order, not submission order
```

Key Differences
- `gather()`: Simple, maintains order, good for "run all and get all results"
- `create_task()`: Maximum control, can cancel/monitor individual tasks
- `wait()`: Control over completion conditions (first, all, or some)
- `as_completed()`: Process results as they arrive
Task Methods
```python
task = asyncio.create_task(fetch(1))

task.cancel()     # Request cancellation
task.done()       # Check if finished
task.cancelled()  # Check if cancelled
task.result()     # Get result (raises InvalidStateError if not done yet)
task.exception()  # Get exception if failed
```

Most common pattern: use `gather()` for simple concurrent execution, use explicit tasks when you need control.
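A sketch of how these fit together when cancelling; `worker` is a made-up coroutine, and cancellation surfaces inside it as `asyncio.CancelledError`:

```python
import asyncio

async def worker():
    try:
        await asyncio.sleep(10)
    except asyncio.CancelledError:
        print("cleaning up")
        raise  # re-raise so the task is marked as cancelled

async def main():
    task = asyncio.create_task(worker())
    await asyncio.sleep(0.1)  # let the task start
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        print(task.cancelled())  # True

asyncio.run(main())
```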
HTTP Requests
Asyncio doesn’t include HTTP functionality. You need an async HTTP library. aiohttp is the standard choice.
```python
import asyncio
import aiohttp

# basic request
async def fetch(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return await response.text()

async def main():
    html = await fetch('https://example.com')
    print(html)

asyncio.run(main())

# multiple requests
async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main():
    urls = [
        'https://example.com',
        'https://httpbin.org/delay/1',
        'https://api.github.com'
    ]

    # Reuse same session for all requests
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, url) for url in urls]
        results = await asyncio.gather(*tasks)

    print(f"Fetched {len(results)} pages")

asyncio.run(main())
```

Common operations:
JSON response:
```python
async with session.get(url) as response:
    data = await response.json()
```

POST request:
```python
async with session.post(url, json={'key': 'value'}) as response:
    result = await response.json()
```

Headers and parameters:
```python
headers = {'Authorization': 'Bearer token'}
params = {'q': 'search term'}

async with session.get(url, headers=headers, params=params) as response:
    data = await response.text()
```

Status and error handling:
```python
async with session.get(url) as response:
    if response.status == 200:
        data = await response.json()
    else:
        print(f"Error: {response.status}")
```

(You can also call `response.raise_for_status()` to turn 4xx/5xx statuses into exceptions.)

httpx is a newer library with a requests-like API that supports both sync and async:
```python
import httpx

async def fetch(url):
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        return response.text

# Or multiple requests
async def main():
    async with httpx.AsyncClient() as client:
        responses = await asyncio.gather(
            client.get(url1),
            client.get(url2)
        )
```

Semaphores
A semaphore limits how many coroutines can execute a section of code simultaneously.
```python
import asyncio
import aiohttp

async def fetch(session, url, semaphore):
    async with semaphore:
        # Only 3 requests will be here at once
        async with session.get(url) as response:
            return await response.text()

async def main():
    urls = [f'https://httpbin.org/delay/1?id={i}' for i in range(10)]

    semaphore = asyncio.Semaphore(3)  # Max 3 concurrent

    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, url, semaphore) for url in urls]
        results = await asyncio.gather(*tasks)

    print(f"Fetched {len(results)} pages")

asyncio.run(main())
```

The semaphore acts as a gatekeeper. When 3 requests are running, the 4th waits at `async with semaphore` until one completes.
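To see the gatekeeping without any network, here's a self-contained sketch: ten 1-second tasks pushed through a 3-slot semaphore finish in waves, about 4 seconds total:

```python
import asyncio
import time

async def limited(sem, i, start):
    async with sem:  # wait here if 3 tasks already hold the semaphore
        await asyncio.sleep(1)
        print(f"task {i} done at {time.perf_counter() - start:.1f}s")

async def main():
    sem = asyncio.Semaphore(3)
    start = time.perf_counter()
    await asyncio.gather(*(limited(sem, i, start) for i in range(10)))

asyncio.run(main())
```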
Key Differences from the threading library
Concurrency Model
- Threading: Preemptive multitasking. The OS decides when to switch between threads, can happen anytime.
- Asyncio: Cooperative multitasking. You explicitly yield control with await. Context switches only happen at await points.
Parallelism
- Threading: Threads can overlap I/O because the GIL (Global Interpreter Lock) is released during blocking I/O calls, but the GIL prevents true parallel CPU execution. Only one thread executes Python bytecode at a time.
- Asyncio: Single-threaded, no parallelism at all. Everything runs on one thread via the event loop.
Overhead
- Threading: Each thread has its own stack (typically 1-8 MB). Creating 10,000 threads will exhaust memory.
- Asyncio: Coroutines are extremely lightweight (few KB). You can easily run 10,000+ concurrent tasks.
Complexity & Bugs
- Threading: Race conditions, deadlocks, and data corruption from shared state are real problems. Requires locks, semaphores, careful synchronization.
- Asyncio: Much safer. Since execution is cooperative and single-threaded, you know exactly when context switches occur. Shared state is less problematic, though not race-free if you await mid-update (see the sketch below).
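One caveat on "less problematic": a read-modify-write that awaits in the middle can still lose updates, because other tasks run during the await. A contrived sketch:

```python
import asyncio

counter = 0

async def increment():
    global counter
    current = counter
    await asyncio.sleep(0)  # yields: every other task runs here
    counter = current + 1   # writes based on the stale read

async def main():
    await asyncio.gather(*(increment() for _ in range(100)))
    print(counter)  # 1, not 100: every task read counter before any wrote

asyncio.run(main())
```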
Performance
- Threading: Better for I/O when you have moderate concurrency (dozens to hundreds of operations). Lower overhead per operation.
- Asyncio: Better for high concurrency scenarios (thousands of connections). Network servers, web scrapers, etc.
Quick comparison
```python
# Threading
import threading
import time

def task(n):
    time.sleep(1)  # Blocking call is fine
    return n * 2   # (return value is discarded; collect results via a shared structure)

threads = [threading.Thread(target=task, args=(i,)) for i in range(5)]
for t in threads: t.start()
for t in threads: t.join()

# Asyncio
import asyncio

async def task(n):
    await asyncio.sleep(1)  # Must use async version
    return n * 2

async def main():
    return await asyncio.gather(*[task(i) for i in range(5)])

asyncio.run(main())  # asyncio.run() needs a coroutine, so wrap gather() in main()
```

Relationship with generators
Coroutines in Python were originally implemented using generators. Before async/await syntax (introduced in Python 3.5), coroutines were just generators that followed certain conventions.
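For reference, the old spelling used a decorator and `yield from` (deprecated in Python 3.8, removed in 3.11):

```python
import asyncio

# Pre-3.5 style: a generator-based coroutine
# (only runs on Python < 3.11, where the decorator still exists)
@asyncio.coroutine
def old_style():
    yield from asyncio.sleep(1)
    return "done"
```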
Both coroutines and generators are suspendable functions:
```python
# Generator - suspends at yield
def gen():
    print("Start")
    yield 1
    print("Middle")
    yield 2
    print("End")

g = gen()
next(g)  # Prints "Start", returns 1
next(g)  # Prints "Middle", returns 2

# Coroutine - suspends at await
async def coro():
    print("Start")
    await asyncio.sleep(0)
    print("Middle")
    await asyncio.sleep(0)
    print("End")
```

Both maintain state between suspension points. The difference is what controls resumption:
- Generators: You control resumption with `next()` or `send()`
- Coroutines: The event loop controls resumption
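In fact, you can drive a coroutine by hand the same way the event loop does; the return value arrives via `StopIteration`, exactly as with generators:

```python
async def coro():
    return 42

c = coro()
try:
    c.send(None)  # advance the coroutine manually, like a generator
except StopIteration as e:
    print(e.value)  # 42: the return value rides on StopIteration
```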
However, they differ in:
- Type: a generator function returns a `generator` object, while an `async def` function returns a `coroutine` object; they are distinct types
- Protocol: generators implement the iterator protocol, coroutines implement the awaitable protocol:
```python
# iterator
def gen():
    yield 1

g = gen()
next(g)  # you pull values out

# awaitable
async def coro():
    return 1

c = coro()
await c  # Event loop drives execution
```

#programming #python #software engineering #computer-science #concurrency #preemptive #cooperative #asyncio #await #yield #generators