maintenance

name: maintenance
description: Goal-oriented repository maintenance and release-readiness work. Use when the user asks for maintenance, release prep, repo health review, dependency refreshes, spec/docs alignment, test gap review, technical debt analysis, or general cleanup without prescribing an exact sequence.
metadata: internal: true user-invocable: true

Goal: leave the repo materially healthier and closer to a release-ready state, with evidence.

This skill implements specs/maintenance.md. Keep operational guidance here. Keep design intent and constraints in the spec.

This skill is outcome-oriented. Do not blindly walk a fixed checklist. Choose the smallest set of actions that closes the real maintenance risk in front of you.

When To Use

Use this skill when the task is about repo maintenance rather than a single feature:

release-readiness review
dependency refreshes
spec or docs drift
test coverage gaps
threat-model or security hygiene review
performance review of recently changed code
technical debt analysis and issue tracking
AGENTS/skills/command hygiene

Required Outcomes

The maintenance scope is explicit.
- If the user provided a scope, use it.
- If not, infer a reasonable scope from recent changes, release posture, and obviously stale areas. State the assumption.
The work produces concrete improvement.
- Fix issues when the change is small and local.
- If an issue is too large for the current task, capture a crisp finding with evidence and the next action.
Validation matches risk.
- Run checks that prove the updated areas are healthy.
- Increase depth for auth, persistence, migrations, public API, external integrations, and end-to-end UI flows.
A release claim is backed by evidence.
- Do not call the repo release-ready unless the changed or high-risk surfaces were actually checked.

Operating Model

Start from goals and risk surface, not checklist order.
Prefer the highest-signal path first: recent diffs, flaky areas, failing checks, stale specs, outdated dependencies, or known security/performance hotspots.
Always run cargo outdated (or cargo search per-crate) and pnpm outdated during release-readiness or dependency-scoped maintenance — even when no security advisory exists. Patch/minor bumps are cheap to miss and cheap to apply; skipping them silently accumulates drift.
When maintenance covers release readiness or workflow hygiene, check Linear issues in the OSS project (EVE team) that are already In Progress. Treat issues whose updatedAt is older than 1 day as stale by default, then triage or report them.
When maintenance covers release readiness or repo workflow hygiene, review recent upstream plugin-platform changes before declaring local plugins current. Check the Codex and Claude Code Everruns Dev plugin surfaces together:
- compare .agents/plugins/marketplace.json, .claude-plugin/marketplace.json, plugins/everruns-dev/.codex-plugin/plugin.json, plugins/everruns-dev/.claude-plugin/plugin.json, shipped plugin behavior, skills, docs, and marketplace entries
- run scripts/test-everruns-dev-plugin.sh or equivalent metadata validation so registration, version parity, compatibility, and non-contradiction are proven
When maintenance covers recently shipped features or release readiness, check for half-built cross-surface features: UI disconnected from backend behavior, backend capabilities missing intended MCP/CLI/docs exposure, MCP or CLI behavior lagging API semantics, or tests/manual cases claiming more than the product provides.
Skip untouched areas when there is a clear reason. Say why they were skipped.
Prefer fixing over reporting.
For bugs uncovered during maintenance, prefer a failing test before the fix when practical.
Keep changes PR-sized. If a maintenance theme explodes in scope, finish the highest-value slice and report the boundary.

Maintenance Surfaces

Use judgment on which surfaces matter for the current task.

Dependency Health

Goal: all packages — including CLI, server, worker, integrations, UI, and docs — run on current dependency versions. Outdated major versions are upgraded proactively, not deferred indefinitely.

Actions:

audit every workspace crate and pnpm-managed package for outdated dependencies, including major-version bumps
upgrade major versions when the migration path is clear; document blockers when it is not
flag deprecated crates/packages and identify replacements
check for unused dependencies (cargo udeps or manual review)

Good evidence:

cargo outdated (or cargo search) checked for each CLI and workspace dependency
pnpm outdated checked for apps/ui/ and apps/docs/
major-version upgrades applied and tested, not just noted
deprecated dependencies flagged with replacement plan
lockfiles updated intentionally
relevant build/lint/test checks pass
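
The audit above separates major-version bumps from patch/minor ones. A minimal Python sketch of that classification, assuming plain MAJOR.MINOR.PATCH version strings (pre-release tags and build metadata are not handled). The helper names `parse_version` and `classify_bump` are hypothetical, and the sketch does not parse `cargo outdated` or `pnpm outdated` output; version pairs would have to be supplied by the caller.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Parse a plain MAJOR.MINOR.PATCH string; missing parts default to 0."""
    core = version.strip().lstrip("v")
    parts = [int(p) for p in core.split(".")[:3]]
    while len(parts) < 3:
        parts.append(0)
    return parts[0], parts[1], parts[2]


def classify_bump(current: str, latest: str) -> str:
    """Return 'major', 'minor', 'patch', or 'none' for current -> latest."""
    cur, new = parse_version(current), parse_version(latest)
    if new <= cur:
        return "none"  # already current (or ahead of the reported latest)
    if new[0] != cur[0]:
        return "major"
    if new[1] != cur[1]:
        return "minor"
    return "patch"
```

Cargo treats a minor bump on a 0.x crate as incompatible under its default caret requirements, so 0.x minor bumps may deserve the same scrutiny as a major bump even though this sketch labels them "minor".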

Specs And Docs Alignment

Goal: docs describe the current system intent and constraints, without drifting into code duplication.

Good evidence:

changed behavior reflected in specs/, apps/docs/, OpenAPI, or examples when relevant
stale or duplicate spec detail removed in favor of links to source files

Feature Completeness Across Surfaces

Goal: features that appear shipped in one surface are connected, reachable, and consistent across the intended UI, backend, MCP, CLI, docs, and tests.

Good evidence:

UI affordances call real backend APIs and handle loading, errors, auth, and state refresh
backend features intended for agent or automation use are exposed through MCP/app surfaces where applicable
CLI commands and flags match current API semantics and do not carry over stale assumptions
docs, specs, examples, tests, and manual test cases do not claim unavailable behavior
gaps are fixed locally or captured as specific findings with the missing surface, user impact, and next action

Security And Threat Posture

Goal: new or changed attack surface is understood, and mitigations/docs match reality.

Actions:

run DeepSec when maintenance includes security posture, release readiness, auth/tenant/public-ingress review, or repo-wide hygiene:
- if .deepsec/ is missing, initialize it with npx deepsec init
- install from .deepsec/ with pnpm install
- keep data/<project>/INFO.md short and project-specific before processing
- run pnpm deepsec scan --project-id everruns from .deepsec/
- use pnpm deepsec process --project-id everruns --agent codex only with an explicit budgeted focus (--filter, --limit, --only-slugs, or --manifest) unless the user asks for a full AI pass
- revalidate high-severity results with pnpm deepsec revalidate --project-id everruns --agent codex --min-severity HIGH
keep durable DeepSec workspace files tracked: .deepsec/.gitignore, .deepsec/AGENTS.md, .deepsec/README.md, .deepsec/deepsec.config.ts, .deepsec/package.json, .deepsec/pnpm-lock.yaml, .deepsec/pnpm-workspace.yaml, and .deepsec/data/*/{INFO.md,SETUP.md}
do not commit generated DeepSec state unless explicitly requested: .deepsec/node_modules/, .deepsec/.env*.local, .deepsec/data/*/{files,runs,reports,project.json,tech.json}
create Linear issues in the OSS project (EVE team) for actionable DeepSec findings that are not fixed in the current maintenance pass

Good evidence:

threat model updated when behavior or trust boundaries changed
obvious gaps in auth, validation, secret handling, or data exposure were reviewed
DeepSec scan/process/revalidate run IDs, scope, finding count, and budget/cost noted when DeepSec was used
GitHub Security Overview checked for advisories
Dependabot alerts reviewed and triaged
Secret scanning alerts reviewed — no open generic secret leaks

Test And Runtime Confidence

Goal: important paths are covered by the right proof, not ceremony.

Good evidence:

targeted tests added or updated for regressions
smoke tests or manual verification used where unit tests are insufficient
checks match the touched surface instead of running an arbitrary full matrix

Performance And Operational Safety

Goal: recent changes do not introduce obvious scale or latency regressions.

Good evidence:

query shape, pagination, indexes, batching, and background job cost reviewed where relevant
no unbounded list paths or easy N+1 regressions in touched code

Technical Debt Analysis

Goal: structural debt is identified, quantified, and tracked before it compounds into development friction or bugs.

Good evidence:

god objects, duplicated logic, and boilerplate patterns identified with line counts and file locations
severity assessed (critical/high/medium/low) based on active harm vs. friction
concrete Linear issues created for each finding with actionable scope
hacks, shortcuts, and open vulnerabilities surfaced with code references
large files (>2K lines non-test) catalogued with the structural reason they grew
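
One way to build the large-file catalogue is sketched below. It assumes Rust and TypeScript sources, a 2000-line threshold matching the ">2K lines" guidance, and a path-based heuristic for what counts as a test file. The suffix list, the skipped directories, and the helper names are all assumptions, not repo conventions.

```python
from pathlib import Path

LINE_THRESHOLD = 2000  # ">2K lines" from the guidance above


def is_test_file(path: Path) -> bool:
    """Heuristic (an assumption): test directories or test-like file names."""
    parts = {p.lower() for p in path.parts}
    name = path.name.lower()
    return (
        "tests" in parts
        or "__tests__" in parts
        or name.endswith(("_test.rs", ".test.ts", ".test.tsx", ".spec.ts"))
    )


def large_files(root: Path, suffixes=(".rs", ".ts", ".tsx"), threshold=LINE_THRESHOLD):
    """Return (relative_path, line_count) for non-test files over the threshold, largest first."""
    results = []
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        rel = path.relative_to(root)
        # Skip tests plus vendored/build output directories.
        if is_test_file(rel) or "node_modules" in rel.parts or "target" in rel.parts:
            continue
        with path.open(encoding="utf-8", errors="replace") as handle:
            count = sum(1 for _ in handle)
        if count > threshold:
            results.append((rel, count))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Inline Rust test modules (`#[cfg(test)]`) still count toward a file's total here, so a flagged file may need a manual look before it is recorded as structural debt.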

Issue Tracking Hygiene

Goal: Linear reflects reality closely enough that active work is visible, stalled work is noticed, and release planning is not distorted by stale execution state.

Good evidence:

OSS project issues already in In Progress were reviewed for stale ownership or stalled execution
issues whose updatedAt was older than 1 day were triaged, commented, re-scoped, or moved out of In Progress
maintenance findings that should not be fixed immediately were captured as actionable Linear issues or comments instead of left implicit
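
The 1-day staleness rule can be sketched as a small filter. This assumes the In Progress issues were already fetched (for example through the Linear MCP) into dictionaries with an `identifier` string and an ISO 8601 `updatedAt` string. That record shape and the helper names are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=1)  # default staleness window from the guidance above


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def stale_issues(issues, now=None, stale_after=STALE_AFTER):
    """Return (identifier, age) pairs for issues not updated within stale_after, oldest first."""
    now = now or datetime.now(timezone.utc)
    stale = []
    for issue in issues:
        age = now - parse_timestamp(issue["updatedAt"])
        if age > stale_after:
            stale.append((issue["identifier"], age))
    return sorted(stale, key=lambda item: item[1], reverse=True)
```

Passing `now` explicitly keeps the check reproducible when the result is quoted as evidence in the report.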

Repo Workflow Hygiene

Goal: agent instructions, commands, skills, examples, and release helpers still match reality.

Good evidence:

AGENTS.md, .claude/commands/, and .claude/skills/ do not contradict each other
release or maintenance instructions point at the canonical workflow instead of duplicating stale detail
.agents/plugins/marketplace.json, .claude-plugin/marketplace.json, plugins/everruns-dev/.codex-plugin/plugin.json, plugins/everruns-dev/.claude-plugin/plugin.json, and shipped plugin behavior were checked against recent upstream plugin-platform changes; registration, version parity, compatibility, or discoverability gaps were fixed or captured
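
A rough illustration of the version-parity check, assuming each plugin.json is a JSON object with top-level `name` and `version` string fields. That schema is an assumption, and scripts/test-everruns-dev-plugin.sh remains the canonical validation. The helper names are hypothetical.

```python
import json
from pathlib import Path

# Manifest paths are taken from the guidance above.
PLUGIN_MANIFESTS = [
    "plugins/everruns-dev/.codex-plugin/plugin.json",
    "plugins/everruns-dev/.claude-plugin/plugin.json",
]


def parity_gaps(manifests: dict[str, dict], fields=("name", "version")) -> list[str]:
    """Describe every field that is missing or differs across the given manifests."""
    gaps = []
    for field in fields:
        values = {path: data.get(field) for path, data in manifests.items()}
        if None in values.values():
            missing = sorted(p for p, v in values.items() if v is None)
            gaps.append(f"{field}: missing in {', '.join(missing)}")
        elif len(set(values.values())) > 1:
            detail = ", ".join(f"{p}={v}" for p, v in sorted(values.items()))
            gaps.append(f"{field}: mismatch ({detail})")
    return gaps


def load_manifests(repo_root: Path, paths=PLUGIN_MANIFESTS) -> dict[str, dict]:
    """Read each manifest as JSON, keyed by its repo-relative path."""
    return {p: json.loads((repo_root / p).read_text(encoding="utf-8")) for p in paths}
```

An empty result from `parity_gaps(load_manifests(Path(".")))` would only show that the compared fields agree; registration, compatibility, and discoverability still need their own checks.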

Common Evidence Commands

Pick only what matches the task:

just pre-push
just pre-pr
cargo fmt --check
cargo clippy --all-targets --all-features -- -D warnings
cargo test --all-features
cd apps/ui && pnpm run lint && pnpm run build
cd apps/docs && pnpm run build
./scripts/export-openapi.sh
doppler run -- bash -lc 'GH_TOKEN="$GITHUB_TOKEN" gh api repos/everruns/everruns/dependabot/alerts --jq "[.[] | select(.state==\"open\")] | length"' — open Dependabot alert count
doppler run -- bash -lc 'GH_TOKEN="$GITHUB_TOKEN" gh api repos/everruns/everruns/secret-scanning/alerts --jq "[.[] | select(.state==\"open\")] | length"' — open secret scanning alert count
scripts/test-everruns-dev-plugin.sh — Everruns Dev plugin metadata, registration, and version parity
Linear MCP: list OSS project issues in In Progress, compare each issue's updatedAt to current time, and flag items older than 1 day for triage

Deliverable

Report:

what scope was covered
what was fixed or found
what evidence was gathered
which stale In Progress Linear issues were triaged, if that check was in scope
what was intentionally skipped and why
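
The report items above could be captured in a small structure so nothing is left implicit. This is a sketch only: the `MaintenanceReport` class, its field names, and the plain-text layout are hypothetical, not a required format.

```python
from dataclasses import dataclass, field


@dataclass
class MaintenanceReport:
    scope: str
    fixed_or_found: list[str] = field(default_factory=list)
    evidence: list[str] = field(default_factory=list)
    stale_issues_triaged: list[str] = field(default_factory=list)
    skipped: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the report as plain text, omitting sections with no entries."""
        sections = [
            ("Fixed or found", self.fixed_or_found),
            ("Evidence", self.evidence),
            ("Stale In Progress issues triaged", self.stale_issues_triaged),
            ("Skipped, and why", self.skipped),
        ]
        lines = [f"Scope: {self.scope}"]
        for title, items in sections:
            if not items:
                continue  # e.g. the stale-issue check was out of scope
            lines.append(f"{title}:")
            lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)
```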

If the user asks to ship after maintenance, hand off to /ship.