[2026] Asio Deadlock Debugging: Async Callbacks, Locks, and Strands [#49-3]
Key takeaways
Hidden deadlocks in Boost.Asio: mutex + condition_variable with async completion, lock ordering, and fixes with strands, std::lock, and thread dumps.
Introduction: “The Asio server sometimes hangs”
Asio servers can deadlock when a thread blocks waiting for another async operation whose completion handler needs something that blocked thread holds: a mutex, or simply the io_context thread itself. With multi-threaded io_context::run() the hang is timing-dependent and “hidden”: it only appears when every run() thread happens to be blocked at once. Topics:
- Pattern: lock → *async_ → cv.wait while handler needs the lock
- Different lock orders across threads
- Strands, minimal lock scope, uniform ordering
- gdb: thread apply all bt, logging, TSan

See also: Multithreaded Asio, Strand.
Scenarios (short)
- Chat server: sync wait for async_write completion under a lock.
- Session pool: lock A then B vs B then A.
- HTTP proxy: blocking wait for upstream under lock.
- Async logging flush under lock.
- Timer vs I/O callbacks taking locks in opposite order.
Pattern 1: Lock held while waiting for completion
The C++ snippet below shows the dangerous pattern: a thread parks itself waiting for a completion handler that needs the same mutex and, more importantly, a free io_context thread to run on.
// Dangerous (deadlock)
void on_send() {
    std::unique_lock<std::mutex> lock(mtx);
    boost::asio::async_write(socket, ...,
        [&](boost::system::error_code ec, std::size_t n) {
            std::lock_guard<std::mutex> lk(mtx);
            done = true;
            cv.notify_one();
        });
    // cv.wait releases mtx, but it parks this thread; if every
    // io_context::run() thread is parked like this, no thread is
    // left to execute the handler above → deadlock.
    cv.wait(lock, [&] { return done; });
}
The Mermaid sequence diagram below shows how the run() threads end up stuck:

sequenceDiagram
    participant T1 as Thread 1 (io.run, on_send)
    participant T2 as Thread 2 (io.run, on_send)
    participant IO as io_context queue
    T1->>IO: async_write A scheduled
    T2->>IO: async_write B scheduled
    T1->>T1: cv.wait (thread parked)
    T2->>T2: cv.wait (thread parked)
    Note over T1,T2: All run() threads are blocked, so no handler can execute → deadlock
Fix: Continue work in the completion handler, or post to a strand; do not wait on async completion while holding the mutex the handler needs.
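A library-free sketch of the async-chaining idea (async_write_stub and next_step are hypothetical names, standing in for the Asio call and the next handler in the chain): the caller never waits, the lock is taken only inside the handler, and the remaining work runs as the next step.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>

std::mutex mtx;
bool done = false;

// Hypothetical stand-in for an Asio async call: it invokes the completion
// handler later, on another thread, the way an io_context worker would.
void async_write_stub(std::function<void()> handler) {
    std::thread(std::move(handler)).detach();
}

// Fixed shape: no condition_variable wait in the caller. The mutex is
// taken only inside the handler, for a short critical section, and the
// rest of the work continues as the next step of the async chain.
void on_send(std::function<void()> next_step) {
    async_write_stub([next_step] {
        {
            std::lock_guard<std::mutex> lk(mtx); // brief, handler-side lock
            done = true;
        }
        next_step(); // continue the chain; no thread was parked waiting
    });
}
```

The key difference from the dangerous version: because no thread is parked, every io_context worker stays available to run handlers.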
Pattern 2: Lock order inversion
Thread A: mtx1 then mtx2. Thread B: mtx2 then mtx1. → Cycle.
Fix: Global order for all mutexes, or std::lock / std::scoped_lock to acquire both atomically.
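A minimal sketch of the atomic-acquisition fix (C++17): both threads name the mutexes in different orders, but std::scoped_lock acquires them with a deadlock-avoidance algorithm, so the written order no longer matters.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

std::mutex mtx1, mtx2;
int shared_a = 0, shared_b = 0;

// Thread A writes the locks in one order...
void thread_a() {
    std::scoped_lock lk(mtx1, mtx2); // acquires both atomically
    ++shared_a;
    ++shared_b;
}

// ...Thread B in the reverse order: still safe with scoped_lock.
void thread_b() {
    std::scoped_lock lk(mtx2, mtx1);
    ++shared_a;
    ++shared_b;
}
```

With plain lock_guard calls in these two orders, the same code could deadlock under the cycle described above.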
Solutions: strand, small critical sections, ordering
A per-connection strand serializes all handlers for that connection, so per-session state often needs no mutex at all.
boost::asio::async_write(socket_, ...,
    boost::asio::bind_executor(strand_,
        [self = shared_from_this()](boost::system::error_code ec, std::size_t n) {
            self->on_write_done();
        }));
Rules:
- Do not wait for async completion while holding locks the handler needs.
- If multiple mutexes: fixed order or std::scoped_lock.
Debugging
- Hang + ~0% CPU → suspect deadlock.
- Attach gdb and dump every thread: gdb -p <pid>, then thread apply all bt full.
- Look for pthread_mutex_lock, pthread_cond_wait, and cross-thread cycles in the backtraces.
- One-shot from the shell: gdb -p <pid> -batch -ex "thread apply all bt"
TSan (-fsanitize=thread) may report lock-order inversion.
Production patterns
- Document lock hierarchy (e.g. session → cache → log).
- Strand-first design for connection state.
- Optional watchdog timer if progress stalls.
- SIGUSR1 handler for stack dump (limited; use gdb for all threads).
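The watchdog idea from the list above can be sketched without Asio (library-free; names like mark_progress and stalled are invented for illustration): workers stamp a shared atomic on each unit of progress, and a monitor checks whether the stamp has gone stale.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Last time any worker reported progress, stored as a raw tick count
// so it fits in a lock-free atomic.
std::atomic<Clock::rep> last_progress{Clock::now().time_since_epoch().count()};

// Worker threads call this whenever they complete a unit of work.
void mark_progress() {
    last_progress.store(Clock::now().time_since_epoch().count());
}

// The watchdog calls this periodically: true if no progress was
// recorded within `limit` (time to dump stacks / alert).
bool stalled(std::chrono::milliseconds limit) {
    auto last = Clock::time_point(Clock::duration(last_progress.load()));
    return Clock::now() - last > limit;
}
```

In an Asio server the periodic check itself would typically run on a steady_timer; the detection logic stays the same.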
Checklist
- No cv.wait under mutex also taken in async handlers for the same operation.
- Consistent lock order or std::scoped_lock
- Per-session strand where possible
- Small lock scope; no unknown callbacks under lock
FAQ
Q. When to use this?
A. Any multi-threaded Asio app with mutexes + condition variables + async I/O.
Q. Next reads?
A. Series index, strand and executor docs.
Summary
- Deadlock: lock + wait for async completion whose handler needs the same lock.
- Fix: async chaining, strands, lock ordering, std::scoped_lock.
- Debug: all-thread backtraces, logging, TSan.

Previous: CMake link errors (#49-2)
Related: High-performance networking guide index