Detect stalls and deadlocks in the tokio async runtime.
- Monitors all spawned async tasks in a tokio runtime.
- Tracks task progress and polls, reporting tasks that have not made progress for a configurable threshold.
- Provides a
WatchdogAPI to register callbacks for stalled tasks. - Integrates with procedural macro
#[monitored]to automatically rewritetokio::spawnandtokio::task::spawn_blockingcalls to monitored versions.
- Monitors lock acquisition and release for
MonitoredMutex(use this type for lock tracking). - Detects cycles in the lock/task graph, reporting deadlocks and the involved tasks.
- Provides a
WatchdogAPI to register callbacks for deadlock events. - Integrates with procedural macro
#[monitored]to track lock ownership and task relationships.
use detector_macros::monitored;
#[monitored]
async fn my_task() -> String {
tokio::spawn(async {
tokio::time::sleep(std::time::Duration::from_secs(1)).await;
"ok".to_string()
}).await
}
fn main() {
let _w = watchdog::WatchdogBuilder::new()
.stall_threshold_ms(200)
.sample_interval_ms(50)
.on_stall(move |stalled| {
// my_task was stalled
if let Some(handle) = handles.get(&stalled.id) { // in case you have access to the handle
handle.abort();
}
})
.start();
my_task().await;
}use detector::MonitoredMutex;
use detector_macros::monitored;
use std::sync::Arc;
#[monitored]
async fn lock_tasks(m1: Arc<MonitoredMutex<()>>, m2: Arc<MonitoredMutex<()>>) {
let m1_clone = m1.clone();
let m2_clone = m2.clone();
let handle1 = tokio::spawn(async move {
let _g1 = m1_clone.lock().await;
sleep(Duration::from_millis(20)).await;
let _g2 = m2_clone.lock().await;
});
let handle2 = tokio::spawn(async move {
let _g1 = m2.lock().await;
sleep(Duration::from_millis(20)).await;
let _g2 = m1.lock().await;
});
handle1.await;
handle2.await;
}
fn main() {
let _w = watchdog::WatchdogBuilder::new()
.sample_interval_ms(50)
.on_deadlock(move |tasks, cycle| {
for info in tasks {
println!(
"Deadlock detected! Task: {}, Location: {:?}, Polls: {}, Cycle: {:?}",
info.name,
info.location,
info.polls.load(std::sync::atomic::Ordering::Acquire),
cycle
);
// abort the task
if let Some(handle) = handles.get(&info.id) {
handle.abort();
}
}
})
.start();
let m1 = Arc::new(MonitoredMutex::new(tokio::sync::Mutex::new(()), "m1"));
let m2 = Arc::new(MonitoredMutex::new(tokio::sync::Mutex::new(()), "m2"));
lock_tasks(m1, m2).await;
}- Procedural macro limitations:
- The macro cannot rewrite function signatures; you must update return/argument types manually if you use
MonitoredMutex. - Aliased imports are only detected inside the function body, not at module scope.
- The macro cannot rewrite function signatures; you must update return/argument types manually if you use
- Task naming:
- By default, the macro uses the call name or function name for task identification.
- Single runtime:
- The monitoring is designed for a single tokio runtime per process. Custom tokio-based threadpools are not captured by this crate.
- Performance:
- Monitoring and graph tracking add some overhead, but are optimized for production use.
-
Add
detectoranddetector-macrosto your dependencies. -
Use the
#[monitored]macro on async functions or blocks that spawn tasks or use mutexes. -
Set up a
Watchdogto monitor stalled tasks and deadlocks. -
Enable the
deadlockfeature if you want deadlock detection:cargo run --example deadlock --features deadlock
Apache 2.0