name: harbor-docker-manual description: Manually reproduce Harbor task containers in local Docker. Use when the user wants to inspect or run a Harbor benchmark image by hand, trace a Harbor job or script such as instance/robinCliScripts/v2.sh to find the task Docker image, install a Cline/clite tarball inside the container, debug Harbor agent setup commands, or run Cline manually from inside a Docker shell.
Harbor Docker Manual
Use this skill to help the user recreate Harbor's container setup manually, usually so they can inspect the task environment and run cline/clite by hand.
Workflow
- Start from the launcher script or job path the user names.
- Read the launcher script enough to identify:
ROOT_DIRDATASETAGENTENV_TYPETARBALL_URL- agent kwargs passed with
--ak
- Locate the task config:
- Existing job: read
jobs/<job>/<trial>/config.json. - Cached Harbor task: search
~/.cache/harbor/tasksfor the task name. - New run script: infer the selected task from
INCLUDE_TASK_NAME, or tell the user to pick one task first.
- Existing job: read
- Find the Docker image:
- Prefer
task.tomlfield[environment].docker_image. - For existing jobs, read
.task.pathfrom trialconfig.json, then find the matching cached task. - If no
docker_imageexists, use the task'senvironment/Dockerfileand explain that Harbor would build an image rather than pull one.
- Prefer
- Start a manual Docker shell matching Harbor's runtime architecture.
- Have the user run install/setup mostly from inside the shell.
- Run Cline non-interactively with explicit provider/key/model flags.
Useful Discovery Commands
Find the selected settings in a run script:
sed -n '1,180p' /path/to/instance/robinCliScripts/v2.sh
Inspect an existing job and trial:
jq '.environment, .task, .agent' /path/to/jobs/<job>/<trial>/config.json
Find a cached task by name:
find ~/.cache/harbor/tasks -maxdepth 3 -type d -name '<task-name>'
Read the task image:
sed -n '1,120p' ~/.cache/harbor/tasks/<cache-id>/<task-name>/task.toml
List local Harbor-ish Docker images:
docker images --format '{{.Repository}}:{{.Tag}} {{.ID}} {{.Size}}' | rg 'hb__|alexgshaw|terminal|swebench'
Start Container
Use linux/amd64 for Linux x64 tarballs, especially on Apple Silicon Macs. Do not rely on host architecture.
IMAGE="<task-image>"
NAME="harbor-manual"
docker rm -f "$NAME" 2>/dev/null || true
docker run -it \
--name "$NAME" \
--platform linux/amd64 \
-w /app \
-e API_KEY="$OPENROUTER_API_KEY" \
-e MODELID="anthropic/claude-sonnet-4.6" \
"$IMAGE" \
bash
If the task workdir is not /app, inspect the task Dockerfile or run pwd, ls, and find / -maxdepth 2 -name test.sh 2>/dev/null inside the container.
Install Cline Tarball Inside Container
Run this inside the container. This mirrors Harbor's cline-v2 install path closely enough for manual debugging.
TARBALL_URL="<tarball-url>"
apt-get update
apt-get install -y curl ca-certificates git
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.2/install.sh | bash
export NVM_DIR="$HOME/.nvm"
. "$NVM_DIR/nvm.sh"
nvm install 22
nvm use 22
nvm alias default 22
npm install -g --ignore-scripts -- "$TARBALL_URL"
which cline
which clite
cline --version
clite --version
If API_KEY or MODELID is empty inside the shell, set them manually:
export API_KEY="sk-or-v1-..."
export MODELID="anthropic/claude-sonnet-4.6"
Run Cline Manually
Avoid plain cline in Docker. It may trigger browser auth or interactive TUI behavior where 127.0.0.1 points at the container, not the host.
Use explicit provider/key/model flags and a prompt:
cline -P openrouter \
-k "$API_KEY" \
-m "$MODELID" \
--yolo \
--reasoning-effort medium \
--json \
"Hello"
For closest Harbor behavior, close stdin and capture all output:
set -o pipefail
cline -P openrouter \
-k "$API_KEY" \
-m "$MODELID" \
--yolo \
--reasoning-effort medium \
-- "Task prompt here" \
< /dev/null 2>&1 | tee /tmp/cline.txt
echo "exit=${PIPESTATUS[0]}"
Common Debug Checks
Environment variables:
echo "API_KEY set? ${API_KEY:+yes}"
echo "MODELID=$MODELID"
Node/Cline path:
export NVM_DIR="$HOME/.nvm"
. "$NVM_DIR/nvm.sh"
which node
which npm
which cline
cline --version
Cline state/log files:
find ~/.cline -maxdepth 5 -type f | sort | tail -100
Terminal escape junk such as 997;1n10;rgb... after running bare cline is usually leaked TTY response text. Run reset, then use non-interactive commands with a prompt instead of the bare TUI.
Cleanup
From the host:
docker rm -f harbor-manual
Use the actual NAME value if different.