name: vllm-async-loading-debug
description: Debug vLLM async KV loading and PegaFlow connector behavior. Use when investigating async KV loads, WAITING_FOR_REMOTE_KVS states, load/save intents, connector metadata flow, prefetch behavior, or preemption interactions in vLLM scheduler/worker code.
vLLM Async Loading Debug
Overview
Trace the async KV loading path between vLLM scheduler and worker connectors, including prefetch and preemption edges.
Workflow
1) Confirm async-load symptom
- Record request IDs and statuses: `WAITING`, `RUNNING`, `WAITING_FOR_REMOTE_KVS`, `PREEMPTED`.
- Check async-load logs and whether prefetch is in progress.
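Recording statuses per request can be sketched as a small log scan. The `req_id=... status=...` log format below is hypothetical, as are the helper names `status_history` and `stuck_in_remote_kvs`; adjust the pattern to whatever your logs actually emit.

```python
import re
from collections import defaultdict

# Hypothetical log format; change the pattern to match your real log lines.
STATUS_RE = re.compile(r"req_id=(?P<rid>\S+)\s+status=(?P<status>[A-Z_]+)")


def status_history(lines):
    """Map each request ID to its ordered list of status changes."""
    history = defaultdict(list)
    for line in lines:
        m = STATUS_RE.search(line)
        if m is None:
            continue  # not a status line
        rid, status = m["rid"], m["status"]
        # Record only transitions, not repeated reports of the same status.
        if not history[rid] or history[rid][-1] != status:
            history[rid].append(status)
    return dict(history)


def stuck_in_remote_kvs(history):
    """Request IDs whose last observed status is WAITING_FOR_REMOTE_KVS."""
    return sorted(
        rid for rid, seen in history.items()
        if seen[-1] == "WAITING_FOR_REMOTE_KVS"
    )
```

Requests returned by `stuck_in_remote_kvs` are the ones worth tracing through the remaining steps.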
2) Follow scheduler-side decisions
- `python/pegaflow/connector/scheduler.py`: prefetch query + `LoadIntent` creation.
- `.project-plans/scheduler.py` (private): `load_kv_async` gating and `WAITING_FOR_REMOTE_KVS` transitions.
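As a mental model for the gating, here is a toy state machine. It assumes a request with `load_kv_async` set parks in `WAITING_FOR_REMOTE_KVS` and returns to `WAITING` once the worker reports it finished receiving; the private scheduler may differ, and `Status` and `next_status` are illustrative names, not real vLLM or PegaFlow APIs.

```python
from enum import Enum, auto


class Status(Enum):
    WAITING = auto()
    WAITING_FOR_REMOTE_KVS = auto()
    RUNNING = auto()


def next_status(status, load_kv_async, finished_recving):
    """Toy transition rule (assumed, not taken from the real scheduler)."""
    if status is Status.WAITING:
        # An async load parks the request instead of running it.
        return Status.WAITING_FOR_REMOTE_KVS if load_kv_async else Status.RUNNING
    if status is Status.WAITING_FOR_REMOTE_KVS:
        # The request becomes schedulable again only after the load completes.
        return Status.WAITING if finished_recving else status
    return status
```

If a request never leaves `WAITING_FOR_REMOTE_KVS`, the thing to verify is whether the completion signal ever reaches the scheduler.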
3) Follow worker-side lifecycle
- `python/pegaflow/connector/worker.py`: `start_load_kv()` calls `engine_client.load()` and tracks `PyLoadState`.
- `get_finished()` polls `is_ready()` and emits `finished_recving`.
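The worker-side lifecycle can be pictured with a toy model. `FakeLoadState` and `ToyWorker` are hypothetical stand-ins and the signatures are simplified; the real `PyLoadState` and connector worker carry more state.

```python
class FakeLoadState:
    """Stand-in for PyLoadState: reports ready after a fixed number of polls."""

    def __init__(self, polls_until_ready):
        self._remaining = polls_until_ready

    def is_ready(self):
        if self._remaining > 0:
            self._remaining -= 1
            return False
        return True


class ToyWorker:
    """Tracks in-flight loads and reports finished ones, one poll at a time."""

    def __init__(self):
        self._pending = {}  # req_id -> load state

    def start_load_kv(self, req_id, polls_until_ready):
        # The real worker would call engine_client.load() here.
        self._pending[req_id] = FakeLoadState(polls_until_ready)

    def get_finished(self):
        # Poll every pending load; completed ones are reported exactly once.
        finished_recving = {
            rid for rid, state in self._pending.items() if state.is_ready()
        }
        for rid in finished_recving:
            del self._pending[rid]
        return finished_recving
```

A load that is started but never shows up in `finished_recving` points at `is_ready()` never returning true, or at `get_finished()` not being polled.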
4) Check preemption and retries
- `.project-plans/scheduler.py` (private): `_preempt_request()` and `reset_prefix_cache()` behavior.
- `invalid_block_ids` handling can trigger recompute or failure based on `kv_load_failure_policy`.
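The policy branch can be sketched as follows. The policy values `"recompute"` and `"fail"` are assumed here, and `handle_invalid_blocks` is a hypothetical helper; check the actual config and scheduler code for the accepted values and the exact recovery logic.

```python
def handle_invalid_blocks(request_block_ids, invalid_block_ids, policy):
    """Return (action, num_valid_prefix_blocks) for one request.

    Blocks before the first invalid block are treated as still usable.
    """
    invalid = set(invalid_block_ids)
    first_bad = next(
        (i for i, block in enumerate(request_block_ids) if block in invalid),
        None,
    )
    if first_bad is None:
        return ("ok", len(request_block_ids))
    if policy == "recompute":
        # Keep the valid prefix and recompute from the first bad block onward.
        return ("recompute", first_bad)
    if policy == "fail":
        return ("fail", 0)
    raise ValueError(f"unknown policy: {policy!r}")
```

When debugging, compare the number of tokens the request resumes from against the position of the first invalid block.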
Reference Files
- Read `references/async-loading.md` for full call flow and log markers.