name: debug-challenge
description: Debugs an existing challenge by starting it, inspecting tasks, troubleshooting over SSH, fixing source files, and restarting. Use when a challenge's tasks are failing or hanging.
argument-hint:
disable-model-invocation: true
Debug the challenge $0. Follow these steps in order.
Debugging why a solution doesn't solve? Prefer the CI helper path (the
solve-challengeskill /tests/challenges/main.go --skip-auth) instead of the manual flow below. It starts its own playground and, while it runs, you can point ordinarylabctlcommands at that playground (labctl playground tasks <id> -o yaml,labctl ssh <id> -- <diag>). Most importantly it writes a per-solution artifacts directory (path printed at the start of the run) whose files are appended dynamically during the attempt —chunk-NN.*.log(per-chunk output),tasks-*.txt(labctl playground tasks -o yamlsnapshots every 15s), andjournal-<machine>.examiner.log. Analyzing those artifacts is the fastest way to find the failing chunk or task. The manual flow below is for debugging the challenge environment/tasks directly.
1. Start the challenge in the background
Run the challenge start command in the background (it may block waiting for init tasks):
labctl challenge start $0 --no-open --no-ssh
Do not wait for it to complete - proceed immediately to the next step.
2. Find the playground ID
Use the list-running-playgrounds skill (or labctl playground list) to find the
playground ID corresponding to this challenge. You'll need this ID for all subsequent steps.
3. Inspect examiner tasks
Get the overview via the API (no SSH tunnel) - poll this in a loop while init tasks are still running:
labctl playground tasks <playID> -o yaml
Tasks are edge-activated and never regress: once a task shows completed
(40) it stays there for the life of the playground, so a single completed
observation is authoritative.
SSH discipline: run labctl ssh calls foreground and one at a time - never in a
backgrounded retry loop. Orphaned labctl ssh children exhaust the tunnel pool and
every later ssh then fails with StartTunnel(): retry after 2562047h47m16s. If you
hit that, pkill -9 -f 'labctl ssh' and retry (see run-playground-command).
4. Troubleshoot over SSH
If the task output is not enough to identify the problem, run commands on the
playground VM using the run-playground-command skill:
labctl ssh <playID> -- <diagnostic-command>
Common diagnostics: checking file contents, running scripts manually, inspecting service status, reviewing logs, testing network connectivity, etc.
Tip: If the machine you need to inspect has noSSH: true, you can still read its
task status and last-run output via labctl playground tasks <playID> -o yaml (the API
covers hidden machines too). Only when you need to run live diagnostic commands on
that machine must you temporarily comment out the directive (#noSSH: true), push, and
restart the challenge. Remember to restore it when done debugging.
5. Fix the challenge source
Based on your findings, edit the challenge's local source files (typically index.md)
to fix the failing tasks. Common fixes include:
- Adjusting init task scripts
- Fixing verification task conditions
- Correcting expected outputs or paths
6. Push changes and restart the challenge
Important: After pushing changes, you must restart the challenge for the changes to take effect. The running playground uses the old version of the content.
First, push the changes using the edit-remote-content skill:
labctl content push challenge $0 --dir <folder> --force
Then stop the current attempt and start a fresh one (the stop is required - a running attempt keeps serving the old content):
labctl challenge stop $0
labctl challenge start $0 --no-open --no-ssh
Go back to step 2 and verify that the tasks now pass.
7. Repeat if needed
If tasks are still failing, repeat steps 2-6 until all tasks pass.