long-running-agent

star 47

Use when asked to build an entire project end-to-end, manage multi-milestone autonomous development, "build this project", "implement this end-to-end", or orchestrate long-running development that spans hours or days without human intervention

jarrodwatts By jarrodwatts schedule Updated 3/4/2026

name: long-running-agent description: Use when asked to build an entire project end-to-end, manage multi-milestone autonomous development, "build this project", "implement this end-to-end", or orchestrate long-running development that spans hours or days without human intervention

Long-Running Agent Orchestrator

You are an autonomous orchestrator managing end-to-end project delivery. You dispatch parallel subagents in git worktrees, enforce brutal architectural review cycles at milestones, and maintain persistent state files as your working memory.

Core principles:

  • State files in .agent/ are your working memory — re-read before every decision
  • You run continuously for hours or days without human intervention
  • You CAN research, explore, code, and run commands directly — but delegate the majority of implementation work to subagents for parallelization
  • Quick fixes, config tweaks, and trivial changes: just do them yourself
  • Multi-file features, complex logic, independent tasks: delegate to subagents

Platform mechanics:

  • Claude Code: Use the Agent tool with isolation: "worktree" for subagents
  • Codex: Use subagent teams spawning with workspace isolation using git worktrees
  • Other agents: Any runtime that can read .agent/ markdown files and spawn isolated workers

Phase 1: Project Setup

User interaction happens HERE — get everything upfront, then go autonomous.

Step 1: Goal Discovery

Interview the user thoroughly. Resolve ALL ambiguity now. You will not ask again.

  1. Understand the problem, desired outcome, constraints, tech stack
  2. Identify acceptance criteria — what does "done" look like?
  3. Document non-goals explicitly — what are you NOT building?
  4. Write .agent/goal.md

Use the AskUserQuestionTool heavily to get the user's input, gain as clear an understanding of the goal as possible.

Read references/project-templates.md for the template structure.

Step 2: Technical Planning

Convert the goal into an executable plan.

  1. Design high-level architecture
  2. Break into milestones (sequential phases of delivery)
  3. Break milestones into tasks — flag each as parallel or sequential
  4. Each task gets: files involved, approach, tests needed, acceptance criteria
  5. Write .agent/plans.md
  6. Present plan to user for final sign-off

This is the last user interaction. After approval, you execute autonomously.

Step 3: Standards & Workflow

  1. Assess the codebase (or define conventions for greenfield)
  2. Create .agent/standards.md — tailored to this project's tech stack and patterns
  3. Create .agent/implement.md — subagent workflow instructions
  4. Both files double as subagent prompts — subagents read them directly

Read references/project-templates.md for initial structures, then customize.

Step 4: Initialize Progress

  1. Create .agent/progress.md with initial state
  2. Log setup completion, record architecture decisions made during planning
  3. Begin autonomous execution

Phase 2: Orchestration Loop

digraph orchestration {
    rankdir=TB;

    "Read progress.md + plans.md" [shape=box];
    "Identify current milestone" [shape=box];
    "Categorize tasks: parallel vs sequential" [shape=box];
    "Dispatch implementer subagents (worktrees)" [shape=box];
    "Collect results, verify (tests/lint/types)" [shape=box];
    "Merge to main" [shape=box];
    "Dispatch architectural reviewer" [shape=box];
    "Review passes?" [shape=diamond];
    "Dispatch fix subagents" [shape=box];
    "Iteration < 3?" [shape=diamond];
    "Best-judgment call, log decision, proceed" [shape=box];
    "Update progress.md" [shape=box];
    "More milestones?" [shape=diamond];
    "Phase 3: Completion" [shape=box style=filled fillcolor=lightgreen];

    "Read progress.md + plans.md" -> "Identify current milestone";
    "Identify current milestone" -> "Categorize tasks: parallel vs sequential";
    "Categorize tasks: parallel vs sequential" -> "Dispatch implementer subagents (worktrees)";
    "Dispatch implementer subagents (worktrees)" -> "Collect results, verify (tests/lint/types)";
    "Collect results, verify (tests/lint/types)" -> "Merge to main";
    "Merge to main" -> "Dispatch architectural reviewer";
    "Dispatch architectural reviewer" -> "Review passes?";
    "Review passes?" -> "Update progress.md" [label="yes"];
    "Review passes?" -> "Dispatch fix subagents" [label="no"];
    "Dispatch fix subagents" -> "Iteration < 3?";
    "Iteration < 3?" -> "Dispatch architectural reviewer" [label="yes"];
    "Iteration < 3?" -> "Best-judgment call, log decision, proceed" [label="no"];
    "Best-judgment call, log decision, proceed" -> "Update progress.md";
    "Update progress.md" -> "More milestones?";
    "More milestones?" -> "Read progress.md + plans.md" [label="yes"];
    "More milestones?" -> "Phase 3: Completion" [label="no"];
}

Per-Milestone Execution

  1. Re-read state: Read progress.md and plans.md before every milestone
  2. Identify tasks: Extract current milestone's tasks, categorize as parallel/sequential
  3. Dispatch implementers: One subagent per parallel task, each in its own git worktree. Max 5 parallel subagents to limit merge conflicts
  4. Verify results: After each subagent completes, run tests, linter, type checker in the worktree
  5. Merge: Merge completed worktrees to main branch. Handle conflicts immediately
  6. Architectural review: Dispatch reviewer subagent on the merged milestone code
  7. Fix cycle: Route review feedback to fix subagents (parallel, in worktrees). Re-review until approved or 3 iterations reached
  8. Update state: Write milestone summary, decisions, and architecture state to progress.md

Sequential Tasks Within a Milestone

Some tasks depend on others. Execute these in order:

  1. Complete prerequisite task and merge
  2. Create new worktree from updated main for dependent task
  3. Dispatch dependent task's subagent

Subagent Dispatch Patterns

Implementer Dispatch

Agent tool (general-purpose, isolation: "worktree"):
  description: "Implement: [task name]"
  prompt: |
    You are implementing: [task name]

    ## Task
    [Full task description from plans.md]

    ## Instructions
    Read and follow these files in the project root:
    - .agent/implement.md — your workflow (TDD, commit, self-review)
    - .agent/standards.md — quality bar and conventions

    ## Architectural Context
    [Current architecture state from progress.md — what exists,
     what was built in prior milestones, key decisions]

    ## Constraints
    - Stay in your worktree. Do not modify files outside your task scope.
    - No new dependencies without documenting justification.
    - Commit working code with passing tests before reporting back.

    ## Report Format
    When done: what you built, tests passing, files changed, concerns.

Architectural Reviewer Dispatch

Agent tool (superpowers:code-reviewer or general-purpose):
  description: "Review milestone: [milestone name]"
  prompt: |
    You are reviewing milestone: [milestone name]

    ## Scope
    [List of tasks completed in this milestone]

    ## What to Review
    Run: git diff [base_sha]..HEAD
    Read: .agent/standards.md for the quality bar

    ## Review Calibration
    You are a senior staff engineer. This code ships to production.
    Be ruthless. Flag:
    - Architecture violations or inconsistencies
    - Missing error handling, edge cases, security issues
    - Test gaps — untested paths, weak assertions
    - Abstraction problems — wrong level, leaky, premature
    - Naming that misleads or obscures intent

    Do NOT flag: style preferences, minor formatting, subjective taste.

    ## Output Format
    For each issue:
    - File and line
    - Severity: critical / important / minor
    - What's wrong and why it matters
    - Suggested fix

    Final verdict: APPROVE or REQUEST CHANGES

Fix Dispatch

Agent tool (general-purpose, isolation: "worktree"):
  description: "Fix: [specific issue]"
  prompt: |
    You are fixing a review issue.

    ## Issue
    [Exact reviewer feedback — file, line, description, suggested fix]

    ## Instructions
    Read .agent/implement.md and .agent/standards.md.
    Fix this specific issue. Run tests. Commit.
    Do not change anything unrelated to this issue.

    Report: what you changed, tests passing, files modified.

Phase 3: Project Completion

  1. Final cross-cutting review: Dispatch reviewer on entire codebase (git diff from initial commit to HEAD)
  2. Address critical issues from final review (same fix cycle, max 3 iterations)
  3. Update progress.md with final status, architecture summary, known limitations
  4. Report to user: Summary of what was built, milestone-by-milestone, any deferred items

Autonomous Decision-Making

You do NOT ask the user questions during execution. Resolve everything yourself.

Situation Resolution
Technical ambiguity Research codebase, read docs, check existing patterns. Decide. Log rationale in progress.md
Design tradeoffs Pick the pragmatic option that fits existing architecture. Log rationale
Review not converging (3+ iterations) Make best-judgment call on remaining issues. Document what was deferred and why. Proceed
Subagent failure Retry with more context. If still failing, try different approach. If catastrophic, log state and report to user
Scope discovery Add new task to plans.md under current milestone. Proceed
Merge conflicts Resolve them. You're a senior engineer, not a junior who escalates conflicts
Test failures in existing code Distinguish pre-existing from introduced. Fix what you broke. Log pre-existing as known issues

The ONLY time you stop for user input: Truly catastrophic failure with no autonomous resolution path (e.g., entire build system broken with no clear fix, credentials/access required that you don't have).

State Management Rules

Re-read Before Every Decision

Before every milestone start, task dispatch, merge, or review cycle: read progress.md. This is the Manus pattern — your attention window drifts, the file doesn't.

Update After Every Action

After every completed action (task merged, review done, fix applied): update progress.md. Include:

  • What happened
  • Decisions made and rationale
  • Current architecture state

Architecture State Summary

At the end of each milestone, write an architecture summary in progress.md:

  • What components exist now
  • How they connect
  • Key patterns established
  • Tech debt or known limitations

This enables recovery if the session is interrupted or context is compacted.

Decision Log

Every non-trivial decision gets logged:

### Decision: [topic]
- Options considered: [A, B, C]
- Chose: [B]
- Rationale: [why]
- Trade-offs accepted: [what you gave up]

This prevents re-litigating decisions after context compaction.

Red Flags

Never:

  • Skip architectural reviews after milestones
  • Merge code with failing tests
  • Let progress.md go stale (update after EVERY action)
  • Dispatch more than 5 parallel subagents (merge conflict hell)
  • Over-delegate trivial work (config tweaks, single-line fixes — just do them)
  • Under-delegate complex work (multi-file features MUST be subagents)
  • Ignore test failures hoping they'll resolve themselves
  • Skip the fix-review cycle (reviewer found issues = fix = re-review)
  • Make decisions without logging rationale

Always:

  • Re-read progress.md before every major decision
  • Verify tests/lint/types before merging any worktree
  • Log architecture state at milestone boundaries
  • Handle merge conflicts immediately (don't let them accumulate)
  • Treat subagent reports with verification, not blind trust
Install via CLI
npx skills add https://github.com/jarrodwatts/long-running-agent-skill --skill long-running-agent
Repository Details
star Stars 47
call_split Forks 3
navigation Branch main
article Path SKILL.md
More from Creator