09-qa

star 1

Post-build quality assurance. Runs full lint, format, typecheck, security, dependency, data-staging, Modal workspace, and cross-file checks on the entire codebase. Writes docs/qa-report.md with blocking vs advisory findings. Runs in parallel with 10-e2e; results consumed by 11-verify-impl. Advisory remediation is a separate user-requested pass (spawn parallel agents) — not part of the default 09 run.

Math-Data-Justice-Collaborative By Math-Data-Justice-Collaborative schedule Updated 5/27/2026

name: 09-qa description: > Post-build quality assurance. Runs full lint, format, typecheck, security, dependency, data-staging, Modal workspace, and cross-file checks on the entire codebase. Writes docs/qa-report.md with blocking vs advisory findings. Runs in parallel with 10-e2e; results consumed by 11-verify-impl. Advisory remediation is a separate user-requested pass (spawn parallel agents) — not part of the default 09 run.

09 — QA Checks

Final quality assurance pass on the complete codebase after the build is done.

Preamble: pipeline-preamble.md — shared conventions for stages 00–17. Sessions: sessions-reference.md — requires active_session unless waived; reports under docs/sessions/{id}/reports/. Cross-cutting: considerations.md, connectivity-gates.md. State agent: workflow-state-manager — mandatory read/update.

Connectivity (stage 09)

Full-repo QA must report connectivity status in docs/qa-report.md:

  • Blocking: H0c (test_cors_policy.py) and H0i (tests/integration) in Python test agent
  • Staging: H4–H5 when VECINITA_STAGING_*_FRONTEND_URL set; else advisory with QA-ID pointing to connectivity-gates

Do not mark QA PASS if H0c fails. See connectivity-gates §Stage 09.

When to Use

  • After 07-build completes (or when execution plan shows build substantially done): the "final exam" for anything that slipped past milestone 08-verify-build
  • Runs in parallel with 10-e2e when both are scheduled
  • Results collected by 11-verify-impl for user sign-off

Difference from 08-verify-build

Aspect 08-verify-build 09-qa
When During build, at milestones After build completes
Scope Changed files / milestone Entire codebase
Auto-correct Yes (lint/format) Report only — no fixes
Blocking Non-blocking for auto-fix Async — summary for 11
Extra checks No Cross-file, data staging, Modal workspace, frontends
Output docs/verification-report.md (optional) docs/qa-report.md (required)

Prerequisites

  1. Build gate: Execution plan tasks for the active phase are completed (or user explicitly requests QA mid-build — note partial scope in the report).
  2. docs/execution-plan.md §Tech Stack Summary — tool commands.
  3. workflow-state.yaml §template — conformance checks (Vecinita: api+worker).
  4. Baseline expectation: 08-verify-build recently PASS (re-run full checks anyway; do not assume).

Session management

Per sessions-reference.md §10 and workflow-state-agent-protocol.md.

  1. Agent read_context must return active_session (or blocking deviation).
  2. Current stage must appear in active_session.routing_plan unless user amends plan.
  3. Write stage reports to active_session.artifacts_dir/reports/ when this stage produces a report.
  4. On completion: update routing-plan entry status; mirror project.stages.{key} via agent update.
  5. 00-context exempt from active_session requirement (session opener). Report: reports/qa-report.md.

State management

Agent protocol: workflow-state-agent-protocol.md. Stage key: stages.09-qa.

Invoke workflow-state-manager read_context before any other action; update after each substep. Do not edit workflow-state.yaml directly.

Update §stages.09-qa:

09-qa:
  status: completed
  started_at: "<ISO date>"
  completed_at: "<ISO date>"
  report: docs/qa-report.md
  overall: pass | fail | pass_with_advisories

Append docs/qa-report.md to §artifacts when created.


Delta / feature-addition mode

  • Scope QA report to affected Fn, apps, and journeys in the active evolve cycle.
  • Do not re-audit entire codebase unless user requests full 09 pass.

Workflow

Phase 1 — Configuration

Read from disk (do not guess commands):

Source Use for
docs/execution-plan.md §Tech Stack Summary ruff, basedpyright, pytest, pip-audit, vitest
docs/typing-policy.md No Any/any (ADR-018)
.github/workflows/ci.yml CI parity (exact paths, ignore files, frontend matrix)
docs/data-staging-state.md D1–D7 asset status
workflow-state.yaml §template Layout / Modal / deploy pattern
infra/modal/README.md Modal app names, workspace, volumes

Vecinita default commands (repo root, uv sync --group dev):

uv run ruff check apps packages tests
uv run ruff format --check apps packages tests
uv run basedpyright apps packages tests
uv run pytest tests/unit tests/integration tests/privacy tests/e2e tests/smoke tests/eval
uv run pip-audit  # with audit/pip-audit-ignore.txt if present
bash scripts/check_secrets.sh
bash scripts/check_modal_no_database_url.sh
bash scripts/check_openapi_specs.sh

Frontends (each app dir):

cd apps/chat-rag-frontend && npm ci && npm run lint && npm test -- --run
cd apps/data-management-frontend && npm ci && npm run lint && npm test -- --run

Coverage gate (CI parity — run from repo root): make test-unit-coverage enforces the 95% frontend branch gate (and Python unit coverage); for a frontend-only change use cd apps/<app> && npm run test:coverage (no DB needed). Plain vitest run does not enforce coverage, so QA must run this to match CI's coverage job — a separate required CI job distinct from the frontend matrix.


Phase 2 — Run QA Checks

Prefer one message with parallel shell/tool work (or Task subagents) for independent checks. Do not block the whole QA on optional live Modal/staging unless env vars are set.

Agent 1 — Linter (full codebase)

  • uv run ruff check apps packages tests
  • Optional: uv run ruff check --select F401,F841 for unused imports/vars
  • Return: total issues by rule; PASS if zero

Agent 2 — Formatter

  • uv run ruff format --check apps packages tests
  • Return: count of files needing format; PASS if zero

Agent 3 — Typechecker

  • uv run basedpyright apps packages tests
  • Return: error/warning count; PASS if zero errors

Agent 4 — Test suite (Python + frontend)

Python: same pytest paths as CI (see Phase 1).

Frontend: Vitest + ESLint per app in CI matrix.

Return:

  • Python: passed / failed / skipped (list skipped reasons if env-gated)
  • Frontend: per-app pass/fail
  • Coverage gate (make test-unit-coverage): pass/fail vs the 95% FE branch threshold
  • Note non-blocking warnings (e.g. Pydantic/LlamaIndex validate_default)

PASS if zero failures and the coverage gate passes (skips are OK when documented).

Agent 5 — Security

Run three layers; classify each finding as blocking or advisory:

Layer Command / tool Blocking when
CVEs uv run pip-audit (+ ignore file) High/critical on PyPI-resolved deps
Current tree secrets bash scripts/check_secrets.sh Any match in apps/, packages/, tests/, infra/, openapi/
Working-tree gitleaks gitleaks detect --no-git --config .gitleaks.toml Any leak in current files (if gitleaks installed)
Dangerous patterns ripgrep pickle.loads, eval(, exec( in apps/ + packages/ Any match in app code

Git history (advisory only):

  • gitleaks detect on full history may report dozens of hits in deleted legacy paths.
  • Split results: count hits in current working tree vs history-only.
  • Do not recommend git filter-repo unless user asks or real live credentials were committed.
  • Document resolution in report → point to docs/security/gitleaks-resolution.md if it exists.

Expected non-blocking: workspace packages vecinita-* skipped by pip-audit (not on PyPI).

Agent 6 — Cross-file analysis

Check How Severity
Unused imports Ruff F401/F841 Blocking if >0
Circular deps Workspace package import graph (apps/*, packages/*) Blocking if cycles
Dead code Optional (vulture); if not run, note SKIPPED Advisory
Public docstrings AST scan: public defs/classes without docstring in apps/, packages/ Advisory (style)
Naming Project rules / ruff N rules if enabled Advisory

Agent 7 — Dependency health

  • pip list --outdated — flag LlamaIndex/pinned stacks per docs/dependency-inventory.md
  • Heuristic import vs pyproject.toml per workspace member — ignore __future__, workspace packages
  • Return: outdated count (advisory if pins intentional), missing (blocking if real)

Agent 8 — Template & platform conformance (Vecinita api+worker)

Read template-registry.md and verify:

Criterion Vecinita expectation
Layout apps/*, packages/*, tests/, openapi/, infra/
Modal isolation import modal only under infra/modal/
Modal workspace Deploy scripts use vecinita profile (see scripts/modal_ensure_workspace.sh)
Deploy URLs https://vecinita--vecinita-*not fontface-- or other workspace prefixes
DO backends No DATABASE_URL in Modal worker paths (scripts/check_modal_no_database_url.sh)
OpenAPI openapi/*.yaml present and parse (scripts/check_openapi_specs.sh)
CI .github/workflows/ci.yml matches Phase 1 commands

Modal live checks (advisory unless env set):

  • If VECINITA_MODAL_EMBED_URL / VECINITA_MODAL_LLM_URL set: GET /health, embed dim 384
  • Else: read docs/data-staging-state.md for D6/D7 status

Agent 9 — Data staging & deploy readiness (Vecinita)

Read docs/data-staging-state.md and docs/execution-plan.md Phase 4 gate:

Asset / gate Report
D1–D5 fixtures/migrations Must be verified for full data QA PASS
D6 FastEmbed / D7 Qwen verified | staged_procedure | pending
Phase 4 H1–H3 live Advisory if VECINITA_STAGING_CHAT_URL unset — point to docs/staging-runbook.md, scripts/deploy/staging_smoke.sh
H0c CORS / connectivity Blocking: pytest tests/unit/test_cors_policy.py must pass in CI
H4–H5 live Advisory if staging frontend URLs unset — scripts/deploy/verify_connectivity.sh, connectivity-gates.md
Modal secret vecinita-data-management Note if data_management_app deploy requires it

Phase 3 — Compile results

Overall status:

  • PASS — all blocking checks green; advisories may remain
  • FAIL — any blocking check failed
  • pass_with_advisories — use in workflow-state.yaml when blocking PASS but advisories need 11-verify-impl review

Short summary block (always include in report header):

QA Results:
  Lint:           [PASS/FAIL] — [N] issues
  Format:         [PASS/FAIL] — [N] files
  Typecheck:      [PASS/FAIL] — [N] errors
  Tests (Python): [PASS/FAIL] — [N] passed, [N] skipped, [N] failed
  Tests (FE):     [PASS/FAIL] — [N] passed per app
  Coverage gate:  [PASS/FAIL] — FE branch ≥95% (make test-unit-coverage)
  Security:       [PASS/FAIL] — [N] CVEs; [N] secrets (tree); [N] history (advisory)
  Cross-file:     [N] unused imports; [N] cycles; [N] docstrings missing (advisory)
  Dependencies:   [N] outdated (advisory); [N] missing
  Template:       [PASS/FAIL]
  Data / Modal:   D6 [status]; D7 [status]; workspace [vecinita|WRONG]

Assign QA-00N IDs to advisories in §Findings for 11-verify-impl (see docs/qa-report.md examples).


Phase 4 — Write report

Write docs/qa-report.md with:

  1. Executive summary table (blocking vs advisory)
  2. Commands run (copy-paste reproducible)
  3. Per-check details (verbatim tool output for failures)
  4. Findings for 11-verify-impl table (ID, severity, finding, suggested action)
  5. Data integrity / Modal workspace subsection when relevant
  6. Phase / execution-plan alignment (deferred gates called out explicitly)

Do not auto-fix code, rewrite git history, or deploy to Modal in the default 09 run.


Advisory remediation (separate invocation)

09-qa is assessment only. When the user asks to address advisories after a QA report, use a follow-up pass (not a re-run of 09):

  1. Read docs/qa-report.md §Findings.
  2. Spawn parallel Task agents (one concern per agent), e.g.:
    • D6/D7scripts/stage_modal_weights.sh, scripts/modal_ensure_workspace.sh, update docs/data-staging-state.md; enforce vecinita Modal profile; stop mistaken deploys on other workspaces via scripts/teardown_fontface_vecinita.sh
    • Secrets history.gitleaks.toml, scripts/check_secrets.sh, CI --no-git, docs/security/gitleaks-resolution.md (no history rewrite unless user requests)
    • Docstrings — public symbols in apps/ + packages/; verify ruff/basedpyright (no Any)
    • Phase 4 staging — H2 in staging_smoke.sh, docs/staging-runbook.md, skip-safe tests/smoke/test_staging_health.py
  3. Re-run blocking checks only for touched areas; update docs/qa-report.md or add a short QA remediation note with date.
  4. Modal workspace rule: all Vecinita deploys → profile vecinita; URLs must use vecinita-- prefix; document in infra/modal/env.example.

Do not bundle unrelated fixes into one agent.


Output rules

  1. Report only in the default 09 run — no auto-fix, no commits unless user asks.
  2. Full codebase scope — not just diff since last milestone.
  3. Distinguish blocking vs advisory — especially security (tree vs history) and data staging.
  4. No AskQuestion in 09 — surface choices in docs/qa-report.md for 11-verify-impl.
  5. Async-safe — report is self-contained; include exact commands and env prerequisites.
  6. CI parity — if local PASS but CI differs, note branch/workflow mismatch as advisory.

Common Vecinita advisories (reference)

Use as checklist when writing findings; not all apply every run.

ID pattern Typical finding Typical remediation (user-requested)
QA-003 D6/D7 pending on Modal volumes stage_modal_weights.sh; verify embed 384-dim; vecinita workspace
QA-004 Outdated transitive deps (LlamaIndex pins) ADR + pip-audit after intentional bump
QA-005 gitleaks hits in git history only .gitleaks.toml + CI --no-git; no filter-repo
QA-002 Public symbols without docstrings Doc pass in apps/ + packages/
QA-006 Phase 4 H1–H3 deferred (no staging URLs) staging-runbook.md, env-gated smoke tests
QA-007 Modal apps on wrong workspace (fontface--) modal_ensure_workspace.sh, teardown script, redeploy

Handoff to 11-verify-impl

11-verify-impl should:

  • Walk user through blocking FAIL items first
  • Present advisories with approve / defer / fix-now (fix-now → advisory remediation pass above)
  • Cross-check docs/qa-report.md against docs/feature-list.md and acceptance criteria
  • Not re-run full 09 unless codebase changed materially since report date
Install via CLI
npx skills add https://github.com/Math-Data-Justice-Collaborative/vecinita --skill 09-qa
Repository Details
star Stars 1
call_split Forks 0
navigation Branch main
article Path SKILL.md
More from Creator
Math-Data-Justice-Collaborative
Math-Data-Justice-Collaborative Explore all skills →