name: agent-feedback-artifact description: "Use when the user wants in-page annotation widget on HTML artifacts, marker-local chat, or comment-triggered agent work. Add, serve, queue, and process marker feedback. Triggers: annotation, feedback, marker, artifact."
Agent Feedback Artifact
Injects a managed annotation widget into an HTML artifact (static file or running web app), serves it through a local feedback server, and queues marker-scoped user comments as push-delivered agent work items.
Self-validate after edits:
./scripts/validate.shfrom the skill directory.
Delivery modes
The one thing that decides whether markers arrive: where the widget POSTs /api/feedback, and whether a queue server is listening there. The two modes differ only in that.
| Static artifact | Running app | |
|---|---|---|
| When | Single HTML file | Live web app (devpulse, dashboards) |
| Widget injected by | add-agent-feedback.mjs (baked into the file) |
hermes-chrome extension popup toggle (no app code change) |
| Processes | 1 — artifact-feedback-server.mjs both serves the file AND hosts the queue |
2 — the app (its own server) + a queue-only artifact-feedback-server.mjs sidecar |
| Widget posts to | same-origin (__FB_BASE="") — the server that served the page |
the queue sidecar: popup "Queue origin" override, else default http://localhost:4177 |
| Operator opens | the served URL http://localhost:<port>/<file>.html |
the app's own URL |
| Queue-origin config | none | none if the sidecar runs on 4177; else set the popup override |
Two rules that prevent silent "Send failed" / markers-never-arrive:
- Static: open the SERVED URL, never
file://…. Afile://page posts to afile://base → no server → marker dropped. - App: the queue sidecar must be reachable at the widget's queue origin. Default is
4177; if you run the sidecar on another port, set the popup "Queue origin" to match. The server's default port is also4177(artifact-feedback-server.mjs), so leaving both at the default just works.
Diagnosing a no-show marker: a
curl -X POST :<port>/api/feedbackthat returns 202 + fires the Monitor proves server+watcher are fine → the break is browser→server (wrong port/origin orfile://). In extension mode, read the live tab with the hermes bridge (page_context, read-onlyevaluateof__FB_BASE) rather than guessing.
Operating sequence
preflight → add widget → serve → arm/heal wake → process markers → disarm Monitor → closeout (+ optimize self-introspect)
Full step-by-step + script invocations → references/operate.md.
close lane (/agent-feedback-artifact close) — tear down the live session. Distinct from closeout (which stays report-only + optimize self-introspection; close does not run it). close stops both running pieces:
- Server —
node scripts/agent-feedback-close.mjs [--port 4177]SIGTERMs the listener (the server traps it and closes gracefully; SIGKILLs a straggler).serverWasRunning:false= already stopped, fine. - Monitor — the script can't end a harness Task. You call
TaskStopon theagent-feedback-watch.mjsMonitor handle you armed in step 4 of the operating sequence. Skip if you never armed one.
Key invariants (detail in operate.md):
- Wake isn't durable; the queue is. The Monitor can die silently mid-session → markers sit
queued, unwoken. At session entry / each/agent-feedback-artifact(re)invocation runagent-feedback-wake-status.mjs --root <root>; onrearm_required(exit 1),pkill -f agent-feedback-watch.mjs+ re-arm the Monitor (backfills the backlog). - Closeout
optimize= model-judged self-introspection, not a ritual. Clean session → change nothing; only when warranted, loadoptimize.md→ "Optimize step". - Args: read-only/file scripts take
--root; mutating CLIs (dispatch,mark) take--port(default 4177).
Per-marker action (every wake notification)
Wake payload: {id, markerId, route, status, url, origin, artifactTitle, artifactPath, summary, visibleText, sentAt, createdAt, emittedAt} (~400–460 chars). origin ∈ localhost | external | github | file | blocked | unknown — derived from url. visibleText is the marked element's text (≤60 chars), dedup'd to null when identical to summary — disambiguates deictic comments ("change this", "make the number red"). Pre-decide the action from these fields before paying any round-trip.
Decision is origin × route × summary clarity:
origin |
route + summary |
What to do |
|---|---|---|
localhost |
direct + clear ("make X red") | Act on the local app — file follows from URL/artifactPath. Skip details. |
localhost |
direct + vague, or any cheap_marker_worker |
Pull details once for selector/rect/ui |
external |
direct/cheap + clear question | Answer from summary + URL. Skip details. |
external |
direct/cheap + vague | Pull details (selectedText, visibleText) |
github |
any | URL is the work target — fetch/inspect the linked repo, then answer or act |
file |
any | Static artifact mode — file path is in the URL |
blocked |
any | Surface as warning (chrome://, about:, view-source:) |
| any | deep_marker_worker |
Pull details. Dispatch a background Task subagent for data/calc work |
Where to run it — inline vs background subagent
The dispatch field on the dispatched item ("inline" | "background_task") says where to run the work:
Route / dispatch |
Run it | Why |
|---|---|---|
no_worker_main_agent_direct → inline |
Main thread, directly | Single-element style/text tweak; spawning a subagent is pure overhead |
deep_marker_worker / work-requests → background_task |
Background Task subagent (Agent, run_in_background:true); prompt = the item's workerPrompt; Explore (read-only diagnosis/scoping), general-purpose/Plan when you assign a write scope |
Real investigation/implementation — keeps the main thread free and isolates context |
cheap_marker_worker |
Inline by default; background if ≥2 markers are concurrent | One scoped diagnosis is cheap inline; a burst crosses the parallelism threshold |
| Multiple worker markers queued together | Dispatch concurrently (one message, N Agent calls); cap concurrency; serialize write-scope markers per file (read-only parallelize freely) |
Wall-clock |
spawn.model / reasoningEffort / fork_context are advisory — the harness Task tool owns model/isolation; you can't set them per-subagent from a skill.
Durability rules (the operator may have walked away)
- Terminal-state guarantee: every claimed marker MUST end
doneorblocked— even on subagent error (catch →mark <id> blocked "<error>"). Never leave a marker inprocessing. The server's reclaim sweep is the backstop (a dead worker's lease expires → requeued, orblockedafterAGENT_FEEDBACK_MAX_ATTEMPTS), but don't rely on it for normal completion. - Ambiguity →
blocked, neverAskUserQuestion: there may be no one to answer.mark <id> blocked "<what you need>"— the reason surfaces on the marker for the returning operator. - Long subagent (> lease TTL, default 10 min): send a heartbeat
mark <id> processing "still working: <phase>"once per window so the sweep doesn't reclaim live work.
After applying the fix (the CLIs are now HTTP clients of the server — pass --port, default 4177, not --root):
node scripts/agent-feedback-mark.mjs <id> done "reply" [--reload | --reload-full] [--port <port>]
--reload— CSS hot-swap (use for every CSS change; ~700 ms to visible)--reload-full— full page reload (use for HTML/template edits)- (no flag) — reply only, no auto-refresh (question-only replies)
Batch concurrent wakes into one turn — dispatch worker-routed ones as parallel background subagents (~2× wall-clock on 3 markers).
Rationale + reply conventions → references/best-practices.md.
Load references on need
| When | Load |
|---|---|
| Step-by-step operating procedure | references/operate.md |
| Optimize step (closeout self-introspection) or diagnosing a runtime issue | references/optimize.md |
| Skill-specific defaults + conventions | references/best-practices.md |
| Modifying or extending the skill itself | references/agent-handbook.md |
| Wake adapter (Monitor / file-watch / Hermes) | references/wake-bridge.md |
| CORS issues | references/cors-setup.md |
| Browser acceptance checklist | references/browser-acceptance.md |
| Widget HTML/CSS/JS internals | references/overlay.html (canonical source) |
| Headed browser testing for running-app mode | hermes-chrome skill (separate top-level skill) |