name: dogfood-target description: >- Technical behavior test harness for Processminer's Target / Transformation track. Takes a process that already has an As-Is and drives it through the to-be skills via the running app's UI — transformation-agent, council-review, area-summary — exercising every interaction path (Y / E / R, Accept / Reject / Reopen), asserting that responses reflect correctly in the UI and the process JSON, and benchmarking speed per turn. The DTP Enhancer skills (dtp-regenerate, dtp-compare, dtp-summary) are covered by the separate /dogfood-dtp harness. Run ONLY when the user explicitly invokes /dogfood-target in the CLI. Never auto-route to it from the app's chat.
Dogfood Target
You are an automated skill behavior test harness for the Processminer
Target / Transformation track — the As-Is → To-Be development arc. You are
the sibling of dogfood-run (which covers As-Is authoring); this harness covers
everything downstream of a documented As-Is. You care about three things:
- Interaction paths — every target skill's branching paths (Y / E / R on drafts, Accept / Reject / Reopen on council items) must be explicitly exercised and produce the correct app behavior.
- UI + state reflection — skill responses must produce observable, correct
changes in the UI and the backing state: target element arrays
(
to-be-design,transformation-decisions,gap-resolution, …) and their provenance/approval inwiki/processes/<slug>.json; andtargetReview/summarieson the process JSON. - Execution speed — wall-clock per skill turn is recorded. Slow turns are findings; runaway turns are failures.
You are not a domain expert and you do not score content quality. The target you develop is a throwaway test artifact.
This skill is a test harness, invoked only by the user typing
/dogfood-target. Never run it against a real process and never let the app's
free chat route to it.
Two Claude instances — keep them straight
When you run this skill you are the outer Claude, running in the CLI with
browser tools. The app's chat spawns its own claude worker (see
SKILLS.md §2) that runs the wiki skills. You never call the target skills
yourself — you type into the app's chat or click its CTAs and the app's worker
runs them. You only ever drive the browser and write the report.
The precondition that makes this track testable
The Target track develops a To-Be from an existing As-Is. This harness does not author an As-Is — it needs one already on disk. On invocation:
- With a slug argument (
/dogfood-target funds-release-dogfood-v1) — use that process. - Without an argument — pick the most recently-modified process whose As-Is
arrays are populated (
process-steps≥ 5 and at least one ofpain-points/process-gaps/compliance-gapspresent — the problems the target must cover). Good candidates are processes a priordogfood-runbuilt.
If no process has a usable As-Is, stop and tell the user to run
/dogfood-run first (or name a slug). A target with nothing to transform is not
a valid test.
Tools
Use the preview_* MCP tools for everything browser-related — never Bash for
browser work, never the Chrome MCP. Use Bash only for: reading the LAN IP,
git-status style checks, and reading wiki / runtime JSON files the run produced.
Available preview_* tools: preview_start, preview_list, preview_stop,
preview_logs, preview_console_logs, preview_network, preview_screenshot,
preview_click, preview_fill, preview_inspect, preview_eval.
There is no preview_snapshot. Use preview_eval to query DOM state. Use
preview_screenshot for your own live verification; evidence in the report is
textual.
Screenshots cannot be written to disk. Record per-stage evidence as DOM
snapshots from preview_eval, assertion logs, and console/network captures.
Preconditions and setup
- Dev server. Start with
preview_startbound to0.0.0.0(command:npm run dev -- -H 0.0.0.0). If one is already running on localhost only, stop it and restart. - LAN IP.
ipconfig getifaddr en0(fall back toen1). The report URL ishttp://<ip>:<port>/target-report.html. - Run id.
YYYY-MM-DD-HHMM. Keys everything in this run. - Run folder. Create
public/test-report-assets/target-<run-id>/— holdsstate.jsonand any DOM snapshot text files.
Resumability
A full run takes time. Make it resumable.
- After each stage, write
public/test-report-assets/target-<run-id>/state.json: run id, target process slug, completed stage numbers, per-stage results (pass/fail, assertions, notes, timings). - On invocation with a run id argument (
/dogfood-target target-2026-06-08-1234), load thatstate.jsonand skip completed stages. - With no run-id argument, start fresh. Never delete a previous run's work.
Timing
Record wall-clock time for every skill turn:
- Start the timer just before
preview_fill+ submit (or the CTA click). - Stop it when the textarea placeholder returns to
Message the assistant…. - Log the elapsed seconds in the stage notes.
Speed thresholds:
- ≤ 60 s — fast (PASS)
- 61–180 s — acceptable (PASS, note it)
- 181–300 s — slow (PASS with WARNING)
300 s — runaway (FAIL; stop waiting after 600 s and mark the turn TIMEOUT)
Reading a chat turn
The chat textarea placeholder is Working… while a turn runs and
Message the assistant… when idle. To run one turn:
preview_fillthe textarea (orpreview_clickthe CTA).- Submit —
preview_clickthe send control (button.chat-send; while a turn runs it isbutton.chat-stop). Find selectors withpreview_eval. - Poll
preview_evaluntil placeholder isMessage the assistant…. Stop polling after 600 s; if not done, mark the turn TIMEOUT. - Read the assistant's reply via
preview_evalof the chat message list. - Run
preview_console_logsandpreview_network; record errors as findings.
Interaction matrix — target track
| Path | Where | What to do | Expected behavior |
|---|---|---|---|
| Y — accept | transformation-agent draft | [Y] |
Element stays status: draft; provenance updated |
| E — edit | transformation-agent draft | [E] + correction |
Edit incorporated; edited heading provenance resets to proposed; not approved |
| R — reject | transformation-agent draft | [R]/[N] + reason |
Skill redrafts (differs); element stays in-progress/draft; not approved |
| Accept | council item | click Accept |
targetReview.items[i].triage = "accepted" |
| Reject | council item | click Reject |
triage = "rejected" |
| Reopen | council item | click Reopen |
triage = "pending" |
UI + state assertions — what to check after each interaction
After each interaction, read from the DOM and the backing state:
- Target elements — do new
target-state(TS-…),transformation-decision(TD-…),gap(VG-…),requirement,dependency,assumptionelements appear in their arrays inwiki/processes/<slug>.json? - Draft, not approved — everything the target skills write is
meta.status: "draft"/meta.approvalunset. Nothing should beapproved(the SME approves later in the app). - Provenance — for any edited heading, is
meta.provenance.<heading>.sourceproposedafter an[E]? - Council review —
process.targetReviewhasran: [...]anditems: [{ specialist, title, detail, targets, triage }]; triage clicks fliptriage. - Coverage —
CoverageRollupshows "{covered} / {total} open problems covered" and reflects the transformation-decisions'resolveslinks. - Summaries —
process.summaries.<area>(area-summary) gets the four-section memo.
Fail the assertion if the expected state is not present. Record the raw DOM text or JSON field value as evidence.
The run — stages
Each stage is a checkpoint. Complete it, assert every item, capture DOM
snapshots as text, write state.json, move on. A stage with any FAIL assertion
is itself FAIL, but the run always continues to the next stage.
Stage 0 — Preflight & As-Is precondition
Open the app and select the target process (per "the precondition" above). Assert:
- App loads without JS errors (console clean)
- The chosen process exists and is open (URL
?p=<slug>) - Its As-Is is usable:
wiki/processes/<slug>.jsonhasprocess-steps≥ 5 and at least one problem array (pain-points/process-gaps/compliance-gaps) with ≥ 1 element
Record the slug into state.json. If the precondition fails, mark Stage 0 FAIL,
write the report, and stop (the rest of the run cannot be meaningful).
Stage 1 — transformation-agent skill
transformation-agent is chat-triggered (no CTA). Trigger it:
Let's design the target state for this process.
Note on SME identity. The session scope preamble (which hands the SME's name/role to the skill) is only injected on the first turn of a fresh chat session. If you trigger transformation-agent in a chat that already has history, no identity is supplied and the skill will correctly ask for it (per its Phase 0 spec) — that prompt is expected, not a defect. To avoid it, start this stage from a fresh chat (clear the conversation) so the preamble lands; otherwise just answer the identity question and continue.
Paths to exercise: Y, E, R.
Walk the first few elements the skill drafts, cycling the paths:
| Turn | Path | What to send |
|---|---|---|
| 1 | Y | [Y] |
| 2 | E | [E] + a concrete correction (e.g. "tighten the rationale to name the pain-point it closes") |
| 3 | Y | [Y] after the edit is incorporated |
| 4 | R | [R] This decision doesn't say what it replaces — redraft with the As-Is steps it supersedes |
| 5 | Y | [Y] after the redraft |
After each path, assert:
- Y: the element is written to its target array
(
to-be-design/transformation-decisions/gap-resolution/ …) asstatus: draft; provenance present; notapproved. - E: the edit is incorporated in the re-presented draft; the edited heading's
provenance is
proposed; the element is not approved. - R: the next turn's draft differs from the rejected one; the element stays
in-progress/ unapproved.
Then drive the skill through its later optional phases so the full target
data model is exercised — these do not run unless you ask for them (the skill
offers [N] close out after target-states + decisions, and stops there if you
take it). Send, accepting each draft with [Y]:
Now capture the gap(s) for this target.→ at least onegap(VG-…) ingap-resolution.Now capture the key requirements the target state implies.→ at least onerequirementinrequirements.Now capture the dependencies and assumptions behind this target.→ at least onedependencyindependenciesand oneassumptioninassumptions.
Then Wrap up the target design for this test.
Assert: at least one element exists in each of to-be-design,
transformation-decisions, gap-resolution, requirements, dependencies
and assumptions; all status: draft, none approved. (A phase the skill
genuinely cannot populate — e.g. no dependencies apply — is a documented skip,
not a fail: record which array stayed empty and the skill's stated reason.)
Speed: per-turn timing. Flag any turn > 180 s.
Stage 2 — Coverage + TO-BE synthesis (UI reflection)
No skill — assert the UI reflects Stage 1's writes.
- Open the TO-BE Design section (nav key
to-be-design). Assert theTargetSynthesisview renders one row per As-Is process-step, with target themes (or "Unchanged") mapped to each. - Open the Validation section (nav key
validation). Assert theCoverageRolluprenders "{covered} / {total} open problems covered by the target", and thatcoveredreflects theresolveslinks on thetransformation-decisionelements written in Stage 1 (cross-check the count against the JSON).
Record the coverage line as evidence. (No timing — this is a render check.)
Stage 3 — council-review skill
In the Validation section, find the council CTA.
Paths to exercise: full council, then per-item Accept / Reject / Reopen.
- Click
✦ Run full council(button.canvas-act). Wait for completion. Assert:process.targetReview.ranlists the five perspective specialists andtargetReview.items[]is non-empty. - In the Council Review panel (
TargetReviewPanel), on three different items: click Accept on the first, Reject on the second, Reopen on the third (button.act/button.act.ai). Assert each item'striageinprocess.targetReviewbecomesaccepted/rejected/pendingrespectively. - Click an element-ref chip (
button.link-chip-nav) on one item; assert the app navigates to that element (?URL or the element card opens).
Speed: time the full-council run (web/LLM heavy; 180–300 s acceptable; flag
300 s). Triage clicks are instant UI writes — assert, don't time.
Stage 4 — area-summary skill (Target area)
Open the Target area view (nav key __area:target). Click
✦ Generate executive summary (button.section-summary-btn.primary →
generateAreaSummary("target")).
Paths to exercise: none interactive (the four parts are editable post-hoc; optionally exercise one Edit → Save on a part).
Assert:
process.summaries.targetis written with the four required section headings (## Introduction,## Current state,## What stands out,## Recommendation)- The
SummaryPanelrenders the four parts, each with anEditaffordance - (Optional) editing one part and saving persists the change to
summaries.target
Speed: time from click to summary rendering.
Stage 5 — Report
Assemble the QA report and write it to:
public/target-report.html(latest)public/target-report-<run-id>.html(archived)
The report
A self-contained pass/fail QA report — single HTML file, inline CSS, no external
assets. Match DESIGN.md for type, colour, spacing.
Contents, in order:
- Verdict header — run id, date, target process slug, overall PASS/FAIL, health score (% of assertions passed), total duration.
- Stage table — every stage: name, PASS/FAIL, duration, assertion count passed/total, one-line note.
- Per-stage detail — every assertion with PASS/FAIL and reason; DOM snapshot text as evidence; console / network errors; per-turn timing table (turn, path, elapsed s, speed rating).
- Interaction path coverage — a matrix: every target path (Y / E / R, Accept / Reject / Reopen) × every skill exercised, EXERCISED / NOT EXERCISED. Every "NOT EXERCISED" is a finding.
- Speed summary — per-skill average turn time, ranked slowest to fastest; slow (> 180 s) and runaway (> 300 s) highlighted.
- State reflection findings — every case where a skill action did NOT
produce the expected UI or state change (target element, provenance,
targetReviewtriage, summary). Behavior bugs, not content issues. - Target artifact — slug, counts per target array (
to-be-design,transformation-decisions,gap-resolution, …),targetReviewitem count + triage breakdown, whichsummariesexist. - Errors and findings — every console error, network failure, app exception, or unexpected behavior, with the stage it occurred in.
Close-out
Report to the user: overall verdict, health score, report URL (with real LAN IP), target process slug, interaction-path coverage (how many paths exercised / total), speed summary (any runaway turns?), and count of state-reflection failures.
If the run was stopped early, say which stages still need to run and how to
resume (/dogfood-target target-<run-id>).