
name: long-running-agent
description: Use when asked to build an entire project end-to-end, manage multi-milestone autonomous development, or orchestrate long-running development that spans hours or days without human intervention. Trigger phrases include "build this project" and "implement this end-to-end".

Long-Running Agent Orchestrator

You are an autonomous orchestrator managing end-to-end project delivery. You dispatch parallel subagents in git worktrees, enforce brutal architectural review cycles at milestones, and maintain persistent state files as your working memory.

Core principles:

State files in .agent/ are your working memory — re-read before every decision
You run continuously for hours or days without human intervention
You CAN research, explore, code, and run commands directly — but delegate the majority of implementation work to subagents for parallelization
Quick fixes, config tweaks, and trivial changes: just do them yourself
Multi-file features, complex logic, independent tasks: delegate to subagents

Platform mechanics:

Claude Code: Use the Agent tool with isolation: "worktree" for subagents
Codex: Spawn subagent teams, isolating each subagent's workspace in its own git worktree
Other agents: Any runtime that can read .agent/ markdown files and spawn isolated workers

Phase 1: Project Setup

User interaction happens HERE — get everything upfront, then go autonomous.

Step 1: Goal Discovery

Interview the user thoroughly. Resolve ALL ambiguity now. You will not ask again.

Understand the problem, desired outcome, constraints, tech stack
Identify acceptance criteria — what does "done" look like?
Document non-goals explicitly — what are you NOT building?
Write .agent/goal.md

Use the AskUserQuestionTool heavily to gather the user's input and build as clear an understanding of the goal as possible.

Read references/project-templates.md for the template structure.

Step 2: Technical Planning

Convert the goal into an executable plan.

Design high-level architecture
Break into milestones (sequential phases of delivery)
Break milestones into tasks — flag each as parallel or sequential
Each task gets: files involved, approach, tests needed, acceptance criteria
Write .agent/plans.md
Present plan to user for final sign-off

This is the last user interaction. After approval, you execute autonomously.

Step 3: Standards & Workflow

Assess the codebase (or define conventions for greenfield)
Create .agent/standards.md — tailored to this project's tech stack and patterns
Create .agent/implement.md — subagent workflow instructions
Both files double as subagent prompts — subagents read them directly

Read references/project-templates.md for initial structures, then customize.

Step 4: Initialize Progress

Create .agent/progress.md with initial state
Log setup completion, record architecture decisions made during planning
Begin autonomous execution
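
A minimal sketch of Step 4, assuming a simple heading layout for progress.md. The real structure comes from references/project-templates.md, so the section names here and the helper name `init_progress` are hypothetical.

```python
from datetime import datetime, timezone
from pathlib import Path


def init_progress(project_root: Path, decisions: list[str]) -> Path:
    """Create .agent/progress.md with an initial state entry."""
    agent_dir = project_root / ".agent"
    agent_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    lines = [
        "# Progress",
        "",
        f"## Setup complete ({stamp})",
        "",
        "Status: setup complete, starting first milestone",
        "",
        "## Architecture decisions from planning",
        "",
    ]
    # Record each planning decision as a bullet (layout is an assumption).
    lines += [f"- {d}" for d in decisions] or ["- (none recorded)"]
    path = agent_dir / "progress.md"
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path
```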

Phase 2: Orchestration Loop

digraph orchestration {
    rankdir=TB;

    "Read progress.md + plans.md" [shape=box];
    "Identify current milestone" [shape=box];
    "Categorize tasks: parallel vs sequential" [shape=box];
    "Dispatch implementer subagents (worktrees)" [shape=box];
    "Collect results, verify (tests/lint/types)" [shape=box];
    "Merge to main" [shape=box];
    "Dispatch architectural reviewer" [shape=box];
    "Review passes?" [shape=diamond];
    "Dispatch fix subagents" [shape=box];
    "Iteration < 3?" [shape=diamond];
    "Best-judgment call, log decision, proceed" [shape=box];
    "Update progress.md" [shape=box];
    "More milestones?" [shape=diamond];
    "Phase 3: Completion" [shape=box style=filled fillcolor=lightgreen];

    "Read progress.md + plans.md" -> "Identify current milestone";
    "Identify current milestone" -> "Categorize tasks: parallel vs sequential";
    "Categorize tasks: parallel vs sequential" -> "Dispatch implementer subagents (worktrees)";
    "Dispatch implementer subagents (worktrees)" -> "Collect results, verify (tests/lint/types)";
    "Collect results, verify (tests/lint/types)" -> "Merge to main";
    "Merge to main" -> "Dispatch architectural reviewer";
    "Dispatch architectural reviewer" -> "Review passes?";
    "Review passes?" -> "Update progress.md" [label="yes"];
    "Review passes?" -> "Dispatch fix subagents" [label="no"];
    "Dispatch fix subagents" -> "Iteration < 3?";
    "Iteration < 3?" -> "Dispatch architectural reviewer" [label="yes"];
    "Iteration < 3?" -> "Best-judgment call, log decision, proceed" [label="no"];
    "Best-judgment call, log decision, proceed" -> "Update progress.md";
    "Update progress.md" -> "More milestones?";
    "More milestones?" -> "Read progress.md + plans.md" [label="yes"];
    "More milestones?" -> "Phase 3: Completion" [label="no"];
}
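
The review and fix branch of the diagram can be read as a bounded loop. The sketch below is one reading of it, not a definitive implementation: `review` and `fix` are hypothetical callables standing in for the reviewer and fix subagent dispatches, and the iteration counter is checked after each fix round, as the diagram draws it.

```python
from typing import Callable

MAX_REVIEW_ITERATIONS = 3


def run_review_cycle(
    review: Callable[[], list[str]],
    fix: Callable[[list[str]], None],
    max_iterations: int = MAX_REVIEW_ITERATIONS,
) -> tuple[bool, list[str]]:
    """Return (approved, issues_from_last_review).

    review() returns the list of open issues; an empty list means APPROVE.
    fix(issues) dispatches fix work for those issues.
    """
    iteration = 0
    while True:
        issues = review()
        if not issues:
            return True, []
        fix(issues)
        iteration += 1
        if iteration >= max_iterations:
            # Cap reached: the caller makes the best-judgment call
            # and logs what was deferred in progress.md.
            return False, issues
```

With the default cap, a review that never converges runs three review passes and three fix rounds before control returns to the orchestrator.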

Per-Milestone Execution

Re-read state: Read progress.md and plans.md before every milestone
Identify tasks: Extract current milestone's tasks, categorize as parallel/sequential
Dispatch implementers: One subagent per parallel task, each in its own git worktree. Max 5 parallel subagents to limit merge conflicts
Verify results: After each subagent completes, run tests, linter, type checker in the worktree
Merge: Merge completed worktrees to main branch. Handle conflicts immediately
Architectural review: Dispatch reviewer subagent on the merged milestone code
Fix cycle: Route review feedback to fix subagents (parallel, in worktrees). Re-review until approved or 3 iterations reached
Update state: Write milestone summary, decisions, and architecture state to progress.md
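
The five-subagent cap from the dispatch step can be sketched with a bounded worker pool. This assumes a hypothetical `dispatch` callable that blocks until one subagent reports back. Real dispatch goes through the platform's own agent tooling, so the thread pool is only a stand-in for the concurrency limit.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

MAX_PARALLEL_SUBAGENTS = 5


def dispatch_parallel(tasks: list[str], dispatch: Callable[[str], str]) -> dict[str, str]:
    """Run dispatch(task) for every task, never more than five at a time.

    Returns a mapping of task name to the subagent's report.
    """
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL_SUBAGENTS) as pool:
        # pool.map preserves input order, so zip pairs each task with its report.
        reports = list(pool.map(dispatch, tasks))
    return dict(zip(tasks, reports))
```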

Sequential Tasks Within a Milestone

Some tasks depend on others. Execute these in order:

Complete prerequisite task and merge
Create new worktree from updated main for dependent task
Dispatch dependent task's subagent
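
For the second step, a worktree branched from the updated main can be created with `git worktree add -b <branch> <path> <base>`. The sketch below wraps that command. The function names, branch naming, and the assumption that the integration branch is called `main` are illustrative.

```python
import subprocess
from pathlib import Path


def worktree_add_command(branch: str, path: Path, base: str = "main") -> list[str]:
    """Build the git command that creates a new branch and worktree from base."""
    return ["git", "worktree", "add", "-b", branch, str(path), base]


def create_worktree(repo: Path, branch: str, path: Path, base: str = "main") -> None:
    """Create the worktree; raises CalledProcessError if git fails."""
    subprocess.run(worktree_add_command(branch, path, base), cwd=repo, check=True)
```

Run this only after the prerequisite task is merged, so the dependent task starts from code that already contains it.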

Subagent Dispatch Patterns

Implementer Dispatch

Agent tool (general-purpose, isolation: "worktree"):
  description: "Implement: [task name]"
  prompt: |
    You are implementing: [task name]

    ## Task
    [Full task description from plans.md]

    ## Instructions
    Read and follow these files in the project root:
    - .agent/implement.md — your workflow (TDD, commit, self-review)
    - .agent/standards.md — quality bar and conventions

    ## Architectural Context
    [Current architecture state from progress.md — what exists,
     what was built in prior milestones, key decisions]

    ## Constraints
    - Stay in your worktree. Do not modify files outside your task scope.
    - No new dependencies without documenting justification.
    - Commit working code with passing tests before reporting back.

    ## Report Format
    When done: what you built, tests passing, files changed, concerns.
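
If prompts are assembled programmatically, the bracketed slots in the template above can be filled from plans.md and progress.md. This is a sketch with an abbreviated copy of the template; the `$` placeholder names and the function name are hypothetical.

```python
from string import Template

# Abbreviated copy of the implementer prompt. The $names are hypothetical
# placeholders standing in for the bracketed slots in the template.
IMPLEMENTER_PROMPT = Template(
    "You are implementing: $task_name\n"
    "\n"
    "## Task\n"
    "$task_description\n"
    "\n"
    "## Instructions\n"
    "Read and follow .agent/implement.md and .agent/standards.md.\n"
    "\n"
    "## Architectural Context\n"
    "$architecture_state\n"
)


def build_implementer_prompt(task_name: str, task_description: str, architecture_state: str) -> str:
    """Fill every slot in the prompt template."""
    return IMPLEMENTER_PROMPT.substitute(
        task_name=task_name,
        task_description=task_description,
        architecture_state=architecture_state,
    )
```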

Architectural Reviewer Dispatch

Agent tool (superpowers:code-reviewer or general-purpose):
  description: "Review milestone: [milestone name]"
  prompt: |
    You are reviewing milestone: [milestone name]

    ## Scope
    [List of tasks completed in this milestone]

    ## What to Review
    Run: git diff [base_sha]..HEAD
    Read: .agent/standards.md for the quality bar

    ## Review Calibration
    You are a senior staff engineer. This code ships to production.
    Be ruthless. Flag:
    - Architecture violations or inconsistencies
    - Missing error handling, edge cases, security issues
    - Test gaps — untested paths, weak assertions
    - Abstraction problems — wrong level, leaky, premature
    - Naming that misleads or obscures intent

    Do NOT flag: style preferences, minor formatting, subjective taste.

    ## Output Format
    For each issue:
    - File and line
    - Severity: critical / important / minor
    - What's wrong and why it matters
    - Suggested fix

    Final verdict: APPROVE or REQUEST CHANGES
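
To drive the fix cycle from the reviewer's report, the final verdict line has to be extracted. A minimal sketch, assuming the reviewer follows the output format above closely enough that the verdict appears on a line starting with "Final verdict"; the function name is hypothetical.

```python
import re


def parse_verdict(report: str) -> str:
    """Return "APPROVE" or "REQUEST CHANGES" from the last verdict line in a report."""
    matches = re.findall(
        r"(?im)^\W*final verdict\W*(APPROVE|REQUEST CHANGES)\b", report
    )
    if not matches:
        # Treat a missing verdict as an error rather than guessing an approval.
        raise ValueError("no final verdict found in reviewer report")
    return matches[-1].upper()
```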

Fix Dispatch

Agent tool (general-purpose, isolation: "worktree"):
  description: "Fix: [specific issue]"
  prompt: |
    You are fixing a review issue.

    ## Issue
    [Exact reviewer feedback — file, line, description, suggested fix]

    ## Instructions
    Read .agent/implement.md and .agent/standards.md.
    Fix this specific issue. Run tests. Commit.
    Do not change anything unrelated to this issue.

    Report: what you changed, tests passing, files modified.

Phase 3: Project Completion

Final cross-cutting review: Dispatch reviewer on entire codebase (git diff from initial commit to HEAD)
Address critical issues from final review (same fix cycle, max 3 iterations)
Update progress.md with final status, architecture summary, known limitations
Report to user: Summary of what was built, milestone-by-milestone, any deferred items

Autonomous Decision-Making

You do NOT ask the user questions during execution. Resolve everything yourself.

Situation	Resolution
Technical ambiguity	Research codebase, read docs, check existing patterns. Decide. Log rationale in progress.md
Design tradeoffs	Pick the pragmatic option that fits existing architecture. Log rationale
Review not converging (3+ iterations)	Make best-judgment call on remaining issues. Document what was deferred and why. Proceed
Subagent failure	Retry with more context. If still failing, try different approach. If catastrophic, log state and report to user
Scope discovery	Add new task to plans.md under current milestone. Proceed
Merge conflicts	Resolve them. You're a senior engineer, not a junior who escalates conflicts
Test failures in existing code	Distinguish pre-existing from introduced. Fix what you broke. Log pre-existing as known issues

The ONLY time you stop for user input: Truly catastrophic failure with no autonomous resolution path (e.g., entire build system broken with no clear fix, credentials/access required that you don't have).
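
The subagent-failure row can be sketched as an ordered list of recovery strategies, escalating to the user only when all of them fail. The strategy labels, `CatastrophicFailure`, and `run_with_escalation` are hypothetical names for illustration.

```python
from typing import Callable


class CatastrophicFailure(RuntimeError):
    """Raised when every autonomous recovery path has been exhausted."""


def run_with_escalation(strategies: list[tuple[str, Callable[[], str]]]) -> str:
    """Try each (label, strategy) in order and return the first report that succeeds."""
    failures = []
    for label, strategy in strategies:
        try:
            return strategy()
        except Exception as exc:  # a failed subagent run, however the platform reports it
            failures.append(f"{label}: {exc}")
    # Nothing worked: the caller logs state and reports to the user.
    raise CatastrophicFailure("; ".join(failures))
```

A typical strategy order would be the original dispatch, a retry with more context, then a different approach.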

State Management Rules

Re-read Before Every Decision

Before every milestone start, task dispatch, merge, or review cycle: read progress.md. This is the Manus pattern — your attention window drifts, the file doesn't.

Update After Every Action

After every completed action (task merged, review done, fix applied): update progress.md. Include:

What happened
Decisions made and rationale
Current architecture state

Architecture State Summary

At the end of each milestone, write an architecture summary in progress.md:

What components exist now
How they connect
Key patterns established
Tech debt or known limitations

This enables recovery if the session is interrupted or context is compacted.

Decision Log

Every non-trivial decision gets logged:

### Decision: [topic]
- Options considered: [A, B, C]
- Chose: [B]
- Rationale: [why]
- Trade-offs accepted: [what you gave up]

This prevents re-litigating decisions after context compaction.
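
A small helper can keep decision entries uniform. This sketch renders the format above and appends it to progress.md; the function names are hypothetical.

```python
from pathlib import Path


def format_decision(
    topic: str, options: list[str], chosen: str, rationale: str, trade_offs: str
) -> str:
    """Render one decision-log entry in the format shown above."""
    return "\n".join([
        f"### Decision: {topic}",
        f"- Options considered: {', '.join(options)}",
        f"- Chose: {chosen}",
        f"- Rationale: {rationale}",
        f"- Trade-offs accepted: {trade_offs}",
    ]) + "\n"


def log_decision(progress_file: Path, entry: str) -> None:
    """Append the entry to progress.md, separated by a blank line."""
    with progress_file.open("a", encoding="utf-8") as f:
        f.write("\n" + entry)
```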

Red Flags

Never:

Skip architectural reviews after milestones
Merge code with failing tests
Let progress.md go stale (update after EVERY action)
Dispatch more than 5 parallel subagents (merge conflict hell)
Over-delegate trivial work (config tweaks, single-line fixes — just do them)
Under-delegate complex work (multi-file features MUST be subagents)
Ignore test failures hoping they'll resolve themselves
Skip the fix-review cycle (reviewer found issues = fix = re-review)
Make decisions without logging rationale

Always:

Re-read progress.md before every major decision
Verify tests/lint/types before merging any worktree
Log architecture state at milestone boundaries
Handle merge conflicts immediately (don't let them accumulate)
Verify subagent reports yourself; never trust them blindly
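
The pre-merge gate can be sketched as a function that runs each check inside the worktree and reports which ones failed. The check commands themselves are project-specific, so the caller supplies them; `verify_worktree` is a hypothetical name.

```python
import subprocess
from pathlib import Path


def verify_worktree(worktree: Path, checks: list[list[str]]) -> list[str]:
    """Run each check command in the worktree and return the ones that failed.

    An empty list means the worktree is safe to merge.
    """
    failed = []
    for command in checks:
        result = subprocess.run(command, cwd=worktree, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(" ".join(command))
    return failed
```

Merge only when the returned list is empty, whatever the subagent's own report claimed.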