name: yggui-changelog-demo
description: Capture deterministic proof bundles, screenshots, traces, and curated changelog notes for YggUI app changes.
YggUI Changelog Demo
Use this workflow when a yggui app feature or fix should ship with proof, screenshots, and a curated changelog entry.
Observability note: the terminal attempt ledger, viewport classifier, and app-control terminal-surface helpers now live in crates/yggterm-shell/src/terminal_observe.rs. Do not keep spelunking only in shell.rs when validating or extending proof semantics.
Goals
- capture deterministic evidence
- produce a reusable proof bundle
- draft changelog text from real artifacts
- keep documentation and workflow in sync when automation grows
Inputs
- a user-visible feature or fix
- the relevant macro or app-control path
- a target proof bundle id under artifacts/demos/unreleased/
Workflow
- Identify the user-visible claim.
- Choose or write the deterministic macro path.
- Capture:
  - screenshots
  - optional recording
  - app-state snapshot
  - event trace / perf evidence
  - active_surface_requests when a terminal load/restore claim depends on whether the request is still truthfully in flight
  - when terminal open/restore is involved, the exact terminal_open_attempt object, active_terminal_surface, interactive, and terminal_settled_kind
  - for terminal geometry bugs, include whether active_terminal_surface.geometry_problem was set
  - for input/focus bugs, include dom.active_element, terminal_hosts[].effective_input_focus, terminal_hosts[].helper_textarea_focused, terminal_hosts[].host_has_active_element, and shell.terminal_input_override_active
  - for startup input-contract bugs, also include terminal_hosts[].host_stdin_enabled
  - for degraded app-control state under live terminal load, include dom.snapshot_mode, dom.degraded_reason, active terminal host geometry, retained replay prompt-follow fields, and viewport-force diagnostics
  - for session/view ownership bugs, include session_view_contract_violations and reject the proof unless it is empty
  - for session-selection/copy-budget bugs, include generation.copy_generation_start_count, generation.implicit_copy_generation_enabled, and the title/precis/summary in-flight path arrays before and after selection
  - for inline rename bugs, include shell.tree_rename_value, dom.tree_rename_input_value, dom.tree_rename_input_focused, dom.tree_rename_input_selection_start, dom.tree_rename_input_selection_end, and any snapshot_mode == "action-fallback" evidence if KDE forced a degraded snapshot
  - for titlebar search typing bugs, include shell.search_query, shell.search_focused, dom.active_element.value, dom.titlebar_search_active, and any snapshot_mode == "action-fallback" evidence if KDE forced a degraded snapshot
- Create or update the proof bundle:
  - manifest.json
  - summary.md
  - captures/
  - trace/
- Update CHANGELOG.md with a concise user-facing note.
- If new automation or capture powers were required, update:
  - docs/demos/ARCHITECTURE.md
  - docs/demos/FORMAT.md
  - docs/demos/STYLE.md
  - this skill file
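The bundle layout named in the workflow can be scaffolded with a short script. This is a minimal sketch, not the project's tooling: the helper name init_proof_bundle and the manifest keys (id, claim) are hypothetical placeholders, and the authoritative manifest schema is whatever docs/demos/FORMAT.md defines. Only the directory and file names come from the workflow text.

```python
import json
from pathlib import Path


def init_proof_bundle(repo_root: Path, bundle_id: str, claim: str) -> Path:
    """Create an empty proof bundle skeleton and return its directory.

    Hypothetical helper: only the layout (manifest.json, summary.md,
    captures/, trace/) comes from the workflow text. The manifest keys
    are placeholders, not the real schema.
    """
    bundle = repo_root / "artifacts" / "demos" / "unreleased" / bundle_id
    (bundle / "captures").mkdir(parents=True, exist_ok=True)
    (bundle / "trace").mkdir(parents=True, exist_ok=True)

    manifest = bundle / "manifest.json"
    if not manifest.exists():
        # Placeholder keys; replace with the fields FORMAT.md requires.
        manifest.write_text(
            json.dumps({"id": bundle_id, "claim": claim}, indent=2) + "\n",
            encoding="utf-8",
        )

    summary = bundle / "summary.md"
    if not summary.exists():
        # Start the summary from the user-visible claim being proven.
        summary.write_text(claim + "\n", encoding="utf-8")
    return bundle
```

Existing files are left untouched, so the helper can be rerun while captures and traces accumulate in the same bundle.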
Standards
- Prefer exact screenshots and traces over vague prose.
- For terminal restore claims, bind the proof to one attempt id and fail the claim if that attempt latched any failure, even if a later state looks healthy.
- For remote careful-restore claims, prove the one-minute boundary and the protected-runtime rule together: a Keep Alive or temporary update-restored runtime that is still running may get a non-destructive resume_recovery/ensure request after the timeout, but the trace must not show force_remote_restart_begin or a duplicate codex resume under the same label unless daemon truth first reports the runtime process gone or the proof explicitly drove a user/harness force restart.
- For startup restore work, prove the app did not issue a second reopen of the already-active terminal. One startup mount sequence is correct. A duplicate reopen is a bug, even if a later attempt recovers.
- For remote startup restore, the hot path no longer blocks on a separate saved-session existence probe. Expect remote_saved_session_preflight_elided_runtime_launch in the daemon trace, then prove missing-session truth from the runtime launch itself, the attempt ledger, and the overlay excerpt.
- For fresh remote full-screen attaches, also capture whether ui/terminal_mount emitted resize_nudge_begin/resize_nudge_end. The nudge is part of the product contract now: it forces a repaint before Yggterm concludes that a live TUI attach is still blank.
- Do not treat a visible terminal failure overlay as final proof if shell.terminal_attach_in_flight still contains the active session path. That is an in-flight recovery state, not a finished verdict.
- In Terminal mode, saved preview context is no longer accepted as a terminal-ready settle. Expect terminal_settled_kind == "recovering" until the resume chip clears and the live terminal is visually revealed.
- If the terminal host already has staged transcript bytes while the resume chip is still up, treat that as recovering, not overlay_context and not interactive.
- Do not call a terminal overlay_context just because the host has meaningful text. overlay_context_visible only applies when the saved-context fallback is still the user-visible truth. Terminal-mode recovery should now stay recovering instead.
- Codex model-permission setup selectors are interactive terminal surfaces even when they sit mid-screen with many blank rows below the hidden cursor, no attach-ready visual deadline, or only the lower half of the selector visible to the live health tail, including tails that start inside the auto-reviewer line. A proof should show terminal_settled_kind == "interactive", no remote-attention notification, and terminal_hosts[].host_stdin_enabled == true while the selector text is visible.
- Codex "Conversation interrupted - tell the model what to do differently" input surfaces are interactive terminal surfaces when the active remote host is mounted, input-enabled, focused, and the cursor/interrupted line is visible. Do not require a normal › prompt glyph for this state; prove it with terminal_settled_kind == "interactive", no active surface problem, screenshot evidence, and a deterministic non-submitted probe-type echo.
- The main viewport should stay available during terminal recovery. Resume progress belongs in notifications/toasts, not as a full-viewport curtain over the host. If a proof screenshot shows the terminal surface replaced by a recovery card, treat that as a UX regression.
- For startup timing, prefer the app trace startup/window_spawned event over slower X11 root-tree detection when both are available. Use X11 tree timing only as fallback evidence.
- For terminal session-switch bugs, capture both the source and destination terminal surfaces on a second X11 display and verify the destination screenshot text matches the destination active_session_path, not stale text from the previous session.
- For KDE/restart lifecycle bugs, include the
linux_daemon_sweep trace slice plus any spawned_daemon_child/spawned_daemon_exit or local_spawned_daemon_child/local_spawned_daemon_exit events. The proof should show same-home daemon cleanup only, no cross-home orphan reap, and no lingering temp-home GUI/daemon after the bundle closes. For hot-update daemon herds, the sweep should protect the current preserved PTY owner endpoint and the newest clean preserved-only startup bridge sidecar, then retire older preserved-only sidecars with owned_terminal_session_count == 0; a sidecar must not stay alive solely because another daemon in the same home has recoverable runtime activity, and a daemon owning only runtime keys absent from the current owner registry is a cleanup candidate. A pid named in hot-update-terminal-owners.json is a session-survival root, so exact-key coverage by a newer daemon is not by itself permission to kill that pid. Cleanup client checks must use the candidate daemon's exact client-instance endpoint scope; app-control may scan legacy scopes for handoff discovery, but that broad scan is not a cleanup guard. Startup handoff must reject old owners whose inferred terminal key set includes keys absent from the current owner registry, even when the old owner also has authorized kept sessions.
- For small-window chrome or settings-rail UI regressions, resize and scroll through app-control rather than desktop-global automation: server app resize-window --width 520 --height 380, server app panel settings, and server app panel scroll --ratio 1. Pair screenshots with --only-check small_window_chrome, --only-check settings_zoom_input, and, when a terminal is active, --only-check settings_terminal_theme_dropdown so the proof includes visible titlebar bounds, native-free zoom inputs, dropdown scroll-into-view, keyboard filtering, and Enter commit state.
- For daemon hot-update, multi-version control, hung-session, or latency incident claims, start the proof with
yggterm-headless server monitor --scenario panic-report --expect-path <session-path> --jsonl-out <path>. Include server-list, the matching hot-restart result when lifecycle recovery is used, and a post-restart latency-check --all or wait-session proof so the bundle shows both the incident picture and the recovered server surface. If any target daemon still owns live terminal runtimes, the proof must prefer session survival. A successful handoff should expose daemon_update_state.state == "hot_update_handoff_active", update_priority == "handoff_preserve_sessions", owned_terminal_session_count, preserved_terminal_owner_count > 0, and preserved runtime keys in app-control/server status, while the monitor hot-restart result should show hot_update_handoff == true and fallback_shutdown_skipped == true. A daemon with only preserved-owner entries and owned_terminal_session_count == 0 should be allowed to restart without extending the sidecar chain, but its preserved-owner registry must be retargeted before exit, and startup reconcile should prefer that active/default sidecar over older orphaned PTY owners. If an older daemon owns only runtime keys that are absent from the current preserved-owner registry and persisted live-session state, treat that as a ghost-owned closed session, not a hot-update handoff reason. Preserved-owner entries are not durable session truth: current live-session metadata must authorize them, update-restart state may protect an unkept row only when server status.terminal_session_keys still contains that runtime key, and daemon load/keep/close proofs should show unrepresented entries pruned from hot-update-terminal-owners.json instead of allowing old non-keep-alive sessions to reappear. If handoff cannot be prepared safely, app-control should expose daemon_update_state.state == "hot_update_pending" and update_priority == "defer_update_preserve_sessions", while the monitor result should show fallback_shutdown_skipped == true rather than prepare_update_restart plus shutdown. Treat any forced shutdown of a live PTY owner during update handoff as a failed hot-update proof. For KDE duplicate-icon claims, add yggterm-headless server app desktop-identity so the bundle captures pinned launchers, desktop file fields, live client app ids, and update-handoff env.
- For stale multi-version remote runtime incidents, prove that the current client treats a stale daemon with live PTYs as a hot-update owner before stdio attach: include
server/remote_runtime hot_update_stale_runtime_owner_begin plus either hot_update_stale_runtime_owner_handoff or an explicit hot_update_stale_runtime_owner_direct_bridge_fallback, server-list or latency-check --all showing both versions, and a final state where the terminal-open attempt either reaches ready on the current daemon or latches a failure with terminal_attach_in_flight and active_surface_requests cleared. A trace that only skips the stale owner and then spawns a duplicate failed resume path is a regression.
- For update-restored remote Live Sessions that are no longer really live, capture a fresh remote scan or app state after scan. Unkept temporary update-restore rows must disappear from Live Sessions once the scan reports live_runtime=false, and the trace should include server/remote_machine prune_temporary_stale_live_sessions. Explicit keep-alive rows may remain as recovery targets, but the proof must show live_session_snapshot_debug[].keep_alive == true, clean active_terminal_surface.problem/active_terminal_surface.geometry_problem after settle, and live remote runtime truth. Until all remote hosts are updated, a daemon terminal key shaped as local://<session-id> is compatible live-runtime evidence for the matching remote Codex codex-runtime://<session-id>. Fresh daemon-owned Codex starts can have a synthetic runtime key before Codex creates the actual transcript id; proof must use the snapshot Codex Session+Storage metadata from the PTY process tree's open JSONL fd as the saved-session source of truth while preserving Runtime Session as the terminal I/O key. The settled app-control proof must also show session_view_contract_violations == [], runtime_truth.live_row_count > 0, and a visible Live Sessions row that preserves the codex-runtime://... runtime path rather than rewriting it to local://....
- For Live Sessions keep-alive UI changes, prove the kept marker is a fixed left-side status rail and the close affordance remains a separate right-side hit target. Reject proof where
live_keep_alive_rect.left varies with the session title width. For remote live sessions, a second row under the remote cwd folder is expected when that row reports live_member = true; live_keep_alive should reflect durability only and may be false on the cwd projection. Local historical transcript rows should still avoid stored-tree duplicates until explicitly opened into a runtime.
- For title/summary budget regressions, prove selection did not start LLM work by showing generation.copy_generation_start_count unchanged across the open/select action. Cached copy hydration is allowed; title, precis, or summary generation is not allowed unless the user used an explicit regenerate action.
- For stored Codex transcript regressions, prove two separate moments: the cold-start selected row must stay idle with no
active_session_path, no matching terminal_open_attempt, and no matching active_surface_requests; then an explicit server app open <path> without --view must promote the row to Terminal, move the resulting LiveLocal runtime under Live Sessions without leaving a duplicate stored-tree row, keep generation.copy_generation_start_count unchanged, and show sidebar row cursors as normal pointer values rather than idle grab/grabbing.
- For expandable sidebar hit-zone regressions, prove both actions from the real DOM row: clicking dom.sidebar_visible_rows[*].label_rect selects the group and shows its scoped Startpage with no active terminal input target, while clicking icon_toggle_rect, group_expander_rect, or row_trailing_toggle_rect toggles only group_expanded. Apply the same proof to cwd folders, machine rows, and Live Sessions.
- For inline rename regressions, prove the initial title is selected from
0..len(title), each typed prefix survives app-control observation, Ctrl+A selects inside the input rather than the sidebar, and Enter or click-away clears shell.tree_rename_path. If KDE forces dom_debug_snapshot_timeout, the proof may use the action fallback or shell.tree_rename_value, but a state that loses both DOM and shell rename values while shell.tree_rename_path is set is a failed proof.
- For titlebar rename regressions, capture dom.titlebar_title_rect and dom.titlebar_summary_title_rect, then prove clicking either the title chip text or the title/summary modal title enters the same focused inline rename contract as the sidebar context menu.
- For titlebar search typing regressions, prove the shell query and focused DOM input value advance together. If the app falls back to snapshot_mode == "action-fallback", it must still include the active search input rect and active element value.
- When a proof bundle uses
server app screenshot on Linux X11, state whether the branch includes the real-window screenshot path. Older WebKit-only captures could miss embedded xterm content and produce false blank-terminal evidence.
- For terminal geometry or overdraw bugs, include terminal_hosts[].host_rect, terminal_hosts[].screen_rect, and terminal_hosts[].viewport_rect alongside the screenshot and attempt ledger.
- Include terminal_hosts[].host_content_width, host_content_height, host_padding_left_px, host_padding_right_px, host_padding_top_px, and host_padding_bottom_px when the fix uses xterm gutter compensation or any host-content-box adjustment.
- For typing/cursor visibility bugs, also include terminal_hosts[].viewport_y and terminal_hosts[].base_y so the proof shows whether the live cursor fell below the visible viewport.
- For retained replay or hot-update prompt-follow bugs, include
terminal_hosts[].retained_replay_source, retained_replay_expected, retained_replay_prompt_follow_ready, retained_replay_unsafe_skip_prompt_ready, retained_replay_rejected_visible_text, last_retained_replay_follow_debug, scrollback_expected, scrollback_intent, scroll_controller_visible, scroll_controller_distance_rows, last_viewport_force_debug, viewport_y, and base_y. A degraded terminal-fallback state is acceptable only when those fields are still present and the state command stays within the live app-control latency budget. A prompt-ready unsafe skip only proves the skipped cursor-addressed snapshot did not break interactivity; it is not proof that scrollback was preserved unless probe-scroll also moves real viewport/text. Scroll-controller evidence is a YggUI control-surface signal only; it must not be treated as terminal-render proof.
- For xterm input-hitbox or overtyping bugs, also include terminal_hosts[].helpers_rect and terminal_hosts[].helper_textarea_rect. A drifted helper textarea is now a classified geometry failure, not a cosmetic quirk.
- For terminal input bugs, also prove focus ownership. The good state is an active xterm-helper-textarea inside the active host plus helper_textarea_focused: true and host_has_active_element: true.
- Do not treat stale helper focus on an inactive retained host as foreground truth by itself. A different-session host is an identity mismatch only if it still reports host_stdin_enabled/raw_input_enabled or is otherwise the active session host with focused document truth. Proof bundles should include the stale host fields when this class is under investigation.
- For remote terminal input bugs, also capture one deterministic
server app terminal send ... --data "__SENTINEL__" proof and the matching terminal text sample after settle. When multiple GUI clients exist, first capture server app clients and then target the proof with --pid <pid> so automation cannot bleed into the wrong desktop window. If local input works but remote input does not, inspect server/remote_stdio_bridge events such as bridge_stdin_raw_mode_enable, bridge_stdin_raw_mode_skip, and bridge_stdin_raw_mode_restore in ~/.yggterm/event-trace.jsonl.
- For remote live-session latency regressions, inspect whether terminal writes are using the hot local runtime bridge before remote-direct fallback. A live remote session with a mounted local runtime should not spawn a fresh remote yggterm server terminal write --stdin command per character; the proof should include scripts/smoke_ui_latency.py --host <host> --pid <pid> --clear-after and the daemon write-strategy regression test.
- server app terminal probe-type and probe-scroll now exercise the active xterm host in the main viewport. probe-type first uses xterm core data injection, then falls back to the mounted input path only if needed, with optional --per-char, --ctrl-c, --tab, and --enter. It reports visible_echo_observed plus timings.visible_echo_ms from the xterm buffer/cursor sample, so canvas-rendered terminals cannot pass by returning empty host.innerText. On 2.1.113+ --per-char dispatches characters without per-character settle sleeps in the fallback path; slow visible echo means the app/input path is slow, not the probe loop.
- For latency reports, run
scripts/smoke_ui_latency.py --host <host> --pid <pid> --clear-after against the live client or a second-display client. Use --read-only-drawing for live Codex sessions where typing is not acceptable; that mode now records idle render/write churn plus current/proc-delta combined GUI/WebKit CPU and should fail when a readable terminal is still burning CPU or repainting continuously. The smoke fails before typing if the active terminal is still in terminal_attach_in_flight, not rendered, not interactive, missing xterm/viewport evidence, input-disabled, scrollback-locked away from the prompt, reporting the cursor outside the visible viewport, or showing leaked internal transport output such as a prompt-line terminal session not found: local://... in text_tail/buffer_text_sample. Retained replay must also reject an already-visible xterm buffer with internal attach/SSH transport residue and force sanitized daemon replay instead of treating existing scrollback as healthy; if a later clean write repairs a dirty visible buffer, record transport_leak_reset_count as the recovery evidence. --clear-after clears the prompt before and after short marker samples, so marker runs cannot wrap and create false missing-echo failures. The first post-open terminal token is reported as warmup; steady-state samples enforce the stricter visible-echo budget, drift budget, and terminal scroll budget. The proof should include the readiness gate result, state/rows/search/panel timings, terminal warmup or read-only activity rates, terminal steady p50/p95/max/drift, scroll probe result, scrollback intent after wheel release, combined CPU where available, and the active session path.
- For typing-fan regressions, run the same latency smoke with a longer sample count, for example --samples 40, and include process_samples, terminal_render_events_per_sample, terminal_write_flushes_per_sample, and terminal_skipped_perf_events. A pass must show bounded visible-echo drift and bounded client/render churn, not only fast first echo.
- For resize redraw regressions, include terminal_hosts[].last_fit_guard, last_skipped_fit, xterm_dimensions, fit_overflow_px, and cursor_bottom_overflow_px. Reject proof where a visible host with usable dimensions still reports last_skipped_fit.cause == "host_not_usable" after resize settle.
- The default latency-smoke budgets are for live SSH app-control proof: 1200 ms for state/rows/search/panel command round trips, 700 ms for the first terminal warmup visible echo, 500 ms for steady terminal visible echo, and 450 ms for steady terminal visible-echo p95. Tighten the flags for local CI runs.
- On direct installs, run terminal probe actions through the public launcher on 2.1.55+ so the headless path can use the real X11 keyboard probe. On 2.1.52-2.1.54, the launcher/headless path cannot dispatch focus, probe-type, probe-scroll, or probe-select, so use the exact active GUI executable from install-state.json for those actions and state that limitation in the proof.
- For generic terminal input regressions, raise the proof bar further: use a deterministic non-submitted marker such as server app terminal probe-type --mode xterm --data '__YGGTERM_STREAM_PROOF__' on a fresh second-X11 client and require the resulting screenshot plus state to show the marker echoed in the live runtime prompt, no transcript-resume footer, no USER/ASSISTANT preview artifacts, a visible cursor, and a still-interactive terminal.
- Use /status only for a defect specifically about Codex slash-command handling or status-panel rendering. During partial /status typing, Codex may keep updating slash-command suggestions; require the typed prefix, focused helper textarea, enabled input, cursor evidence, and the screenshot instead of quiescent app-state samples.
- For prompt/cursor regressions, also run the partial-input loop: type
/sta without Enter, capture screenshot + state, then scroll and capture again. Reject the fix if the typed partial input is not visible on cursor_line_text, if the cursor row drifts out of the prompt band near the bottom of the viewport, or if focus/input drops during the scroll step.
- server app terminal probe-select drives xterm's pointer selection path against mounted rows and reports the selected excerpt/length/contrast, selection_method, selection_layer_rect_count, and a gesture paint stack. A pass requires non-empty term.getSelection() plus xterm selection-layer rectangles. In canvas/no-row diagnostics it may report selection_method = "buffer_fallback_unverified" with visible text length/excerpt, but that is not selection proof; pair it with app-state low-contrast diagnostics and a screenshot before calling readability fixed.
- For terminal-selection hit-test regressions, also capture terminal_hosts[].focus_capture_pointer_events and focus_capture_hit_target_enabled. The focus-capture overlay must stay observer-only with pointer-events: none, and a visible context-menu backdrop must not block primary xterm drag/double-click gestures.
- For terminal selection-copy hangs, reject any fix whose xterm embed still calls navigator.clipboard.writeText. Proof should include the focused script/unit guard, a live app-control state response after the copy path is exercised, and terminal_clipboard/selection_copy_queued or selection_copy_owner_updated trace evidence. The copy operation must leave the WebKit render loop responsive even if the desktop clipboard stack is slow.
- For browser-selection leak regressions on embedded xterm, also capture
terminal_hosts[].xterm_root_user_select, rows_user_select, selection_range_count, selection_layer_count, and selection_layer_rect_count. Reject the fix if the mounted host can still accumulate a browser DOM range selection or if the xterm root/rows stop reporting user-select: none.
- scripts/smoke_xterm_embed_faults.py is now the top-level fault-model suite for embedded xterm regressions. Use it when the bug spans multiple symptoms such as cursor drift, invisible text, geometry mismatch, focus/input breakage, scroll failure, and theme/readability regressions at once.
- For isolated second-display labs, pass --home /tmp/... to the smoke script and prefix any follow-up server app ... commands with the same YGGTERM_HOME=/tmp/... so the proof does not accidentally target your real desktop client.
- For fresh local-terminal regressions, keep one detached second-display proof that uses server app terminal new and reject the fix unless the screenshot shows the prompt in the main viewport within a few seconds, the runtime row appears under the first Live Sessions group with a close affordance, the active host reports non-empty text_sample/text_tail or canvas-mode buffer_text_sample/cursor_line_text, fresh terminals are not marked keep-alive until explicitly toggled, blank Enter does not leave the row spinning, and the same row can enter the rotating busy icon during a foreground command and recover back to plain-terminal once the prompt returns.
- On 2.1.93+
server app terminal new must return a non-empty session_path for both local and remote terminal creation. Treat a missing path as an app-control regression because latency, /status, and spawn-timeline probes cannot target the created terminal deterministically.
- Codex managed-CLI refresh/update checks must stay out of the foreground terminal new --kind codex path. If a proof shows npm install/managed CLI ensure blocking terminal creation, classify it as a launch-latency regression; the expected foreground event is only a fast managed-CLI launch probe followed by PTY creation.
- For local startup-restore regressions, also run scripts/smoke_terminal_local_restart.py (or an equivalent second-display proof) and reject the fix unless the same local session survives app restart, reopens without a blank xterm host, and agrees three ways: active_session_path, the DOM-selected sidebar row, and browser.selected_row must all point at the same session. A stale same-session active_surface_requests entry or nonzero open request id must not keep startup restore permanently stuck in terminal_attach_in_flight; after the recovery window, the app should clear the stale bootstrap lease and retry.
- The renderer contract defaults to
canvas and treats dom as an explicit opt-out path. For canvas mode, inspect terminal_hosts[].buffer_text_sample and cursor_line_text because .xterm-rows is absent by design; still reject any proof where the screenshot is visually blank, geometry is wrong, or cursor/input evidence is missing. For DOM mode, reject buffered terminal text with xterm_present=true, screen_present=true, rows_present=false, and zero canvas layers; that is dom_renderer_missing_text_layer_with_buffer_text, not a healthy but empty viewport.
- On 2.1.93+ active visible Codex and remote output should render through xterm.js, not the low-power text overlay. Treat terminal_hosts[].low_power_tui_overlay_active == true on those active hosts as suspect unless the proof is explicitly about offscreen/replay behavior. For 2.1.166+ plain local full-screen TUI bursts, the active low-power text surface is allowed only while the alternate-screen TUI is running; proof must show low_power_tui_frame_count advancing with readable low_power_tui_text_sample, Codex/remote sessions not using the overlay, and the overlay cleared after exit. A corrupted low-power sample with repeated incremental words such as BBoBooBoot... is a rendering regression, not a valid readiness signal.
- The sidebar proof now has an explicit idle contract for local shells: after probe traffic settles back to a prompt, the selected row must recover from the rotating busy icon to the macOS-command plain-terminal icon, even when the active summary is condensed as pi@host$ >.. It also has a scroll-bounds contract: after launch, search, refresh, or expansion shrink, a sidebar whose rows fit must report sidebar_scroll_top == 0, and visible top rows such as Live Sessions must not be clipped above the sidebar frame.
- Cursor visibility now has explicit native-cursor evidence too:
terminal_hosts[].cursor_sample_rect, cursor_sample_text, cursor_sample_color, cursor_node_rects, and xterm_cursor_hidden. For light-theme terminal readability fixes, reject the proof unless the screenshot itself shows the cursor, cursor_sample_rect is visible while input is enabled, and xterm_cursor_hidden agrees with what the screenshot shows.
- Cursor alignment now has explicit native-cursor evidence too: compare terminal_hosts[].cursor_sample_rect against cursor_expected_rect, and use cursor_node_rects as supporting evidence when xterm exposes additional raw cursor DOM spans. Reject the fix if the visible native cursor drifts away from the expected cursor cell.
- Codex prompt-band proof is xterm-owned. Reject Yggterm software prompt/cursor overlays, and require the xterm_input_line_decoration_* state to agree with the screenshot: no decoration error, not disposed, and marker line matching the cursor line. In DOM renderer mode also require xterm_input_line_decoration_element_visible == true and xterm_input_line_decoration_render_count > 0; in canvas renderer mode require screenshot/pixel proof because canvas can paint the decoration without an xterm decoration DOM element.
- Retained terminal hosts can coexist. Do not assume
terminal_hosts[0] is the active terminal. Select the host that matches the active session path and focused input ownership, or use an explicit active-host marker if present.
- When xterm emits a very wide raw .xterm-cursor span, do not fail on width alone. Fail only if that wide span is still visually active via background, border, outline, or box-shadow. The native xterm cursor is now the visible cursor contract.
- Do not trust the probe-type response by itself. Always pair it with a follow-up server app state and server app screenshot, then judge the bug from the resulting screenshot plus terminal_hosts[].text_sample, terminal_hosts[].text_tail, and in canvas mode terminal_hosts[].buffer_text_sample/cursor_line_text.
- For UI-theme or terminal-theme claims, prefer
server app theme light|dark --pid <pid> over click-based toggles during proof capture. The resulting app state now exposes settings.theme, settings.terminal_light_theme_name, settings.terminal_dark_theme_name, settings.effective_terminal_theme_name, and the mounted xterm renderer fields terminal_hosts[].xterm_font_family, xterm_font_weight, xterm_font_weight_bold, xterm_line_height, xterm_theme_background, and xterm_theme_foreground. Also inspect the actual rendered row sample fields terminal_hosts[].rows_sample_font_family, rows_sample_font_weight, rows_sample_font_feature_settings, rows_sample_letter_spacing, rows_sample_line_height, rows_sample_color, rows_sample_class_name, rows_sample_style_attr, dim_sample_*, cursor_sample_*, low_contrast_span_count, low_contrast_min_contrast, and low_contrast_span_samples (with the older rows_* fields as fallback), or run scripts/smoke_terminal_theme_ui.py, so the proof covers the actual rendered xterm rows and not just terminal option values. Reject any proof where the sampled row font family is still a single doubly-quoted literal stack, the cursor styling is transparent, visible low-contrast spans remain, or the mounted screen width drifts far from the host viewport.
- Still keep one second-X11-display proof in the loop for GUI fixes, but do not rely on flaky xdotool focus alone when the viewport probe can prove the same input path more deterministically.
- For startup restore, the healthy recovery state is a visible toast plus
host_stdin_enabled == falseuntil the live terminal actually settles interactive. - For fresh Codex startup, the
Update Model Permissionsselector is an interactive surface when it shows the Default/Auto-review/Full Access options plus the "Press enter to confirm or esc to go back" hint, even if app-control reports many blank rows below the hidden cursor because the selector sits mid-screen. Proof should showterminal_settled_kind == "interactive", no remote-attention timeout notification, andhost_stdin_enabled == truefor that mounted host instead of treating the menu as stale retained transcript text. - For fresh local or remote Codex startup, use
scripts/smoke_codex_launch_timeline.pyorserver app terminal new --machine-key <machine> --kind codex. Keep its resource baseline enabled; the smoke now records pre-launchresource_timeline.jsonl, resource-relativephase_trace.jsonl, storage preflight output, a per-phase resource summary split by live profile, isolated test profile, SSH, Codex, and WebKit buckets, focuses the owned test window and reclaims terminal focus before capture, drives app-control through the matchedyggterm-headlesssibling, and rejects prompt-rendered states that are not actually input-ready. Capture sub-1s, 1s, 3s, 5s, ready, and post-30s state/rows/screenshot triples; when reconciling a screenshot, use the post-screenshot state (screenshot_state_*.json), not the screenshot command response. Reject the proof if the visible host is blank, shows local Codex scaffold text, shows a prompt-only remote Codex surface without the welcome frame, has a session-specificRemote Terminal Needs Attentionnotification, reports readiness after settle withactive_terminal_surface.problemset, reports a rendered prompt withruntime_truth.active_host_input_enabled != true, reports a rendered prompt that is focus-gated withouteffective_terminal_input_focus, reports app-control focus command/state disagreement after consideringterminal_hosts[].effective_input_focusandshell.terminal_input_override_active, leaves the generated session id alive on the app host or remote worker host during cleanup, or reports readiness-gatedactive_terminal_surface.host_stdin_enabled=truebeforeruntime_truth.active_host_ready=true.active_terminal_surface.raw_input_enabledmay remain true during startup only to keep terminal-emulator protocol responses flowing back to the PTY, and xterm focus-in/focus-out bytes should be classified as protocol traffic rather than user input. On remote live-user hosts, prefer the default/home/pi/.cachesmoke profile/output location over/tmpso low-space temp storage cannot corrupt staged proof runs. 
- For fresh remote Codex onboarding or permission-setup surfaces, also prove saved-session durability: before Codex Session plus non-empty Storage metadata exists, closing the live runtime must leave no saved remote-session://... row under the machine/cwd tree. If a sidebar row exists with a generated UUID but no storage path, treat it as a phantom-session regression, not a valid saved session.
- A local Codex welcome card with a focused blank cursor line can be accepted as a blank-prompt surface when host_stdin_enabled=true, the xterm bridge is connected, and the screenshot shows the cursor. Remote Codex title-card-only or prompt-only surfaces must stay rejected unless the remote welcome/status/prompt frame is complete.
- For fan/CPU regressions, run scripts/remote_linux_idle_cpu_smoke.py against an isolated release artifact and keep the default post-state cooldown enabled. The smoke should measure app CPU after app-control has settled, and active/background high-volume TUI proof should show a real TUI frame/drop signal before sampling, post-interrupt drain before background-idle sampling, active_write_frame_budget, effective_terminal_write_frame_ms, frame-like render probes/canvas health sampling, chunked alt-screen read cadence, unfocused local stream cadence, per-thread CPU rows, render-counter deltas, hot_host_health_suppressed_count, and the active plain-local low-power TUI overlay state rather than heredoc echo churn or a GUI/WebKit spin loop. On KDE Wayland+Xwayland machines, the default proof should show GDK_BACKEND=wayland, WINIT_UNIX_BACKEND=wayland, linux_desktop_backend_policy.policy == "kde_wayland_native_default", transparent_window_profile_reason == "kde_wayland_transparent_profile", and YGGTERM_XTERM_CANVAS_POLICY=xterm_canvas_enabled_for_wayland; the vendored Dioxus DMA-BUF workaround may set WEBKIT_DISABLE_DMABUF_RENDERER=1 but must not force GDK_BACKEND=x11 after Yggterm selected Wayland. Canvas mounted in an X11 WebKit child is a renderer-policy mismatch, not an acceptable CPU sample.
- Stable-channel idle CPU proof must also include dom.css_running_animation_count. Sidebar/tree busy marks may be visible, but they should be static; if css_running_animation_count stays nonzero after settle without an explicit modal/probe animation, treat it as a GUI/WebKit fan-budget failure.
- For active Codex Working/status CPU regressions, also capture terminal_hosts[].last_raw_payload_length and terminal_hosts[].last_coalesced_payload_length. Repeated synchronized ?2026h repaint bursts should either arrive bounded from the Rust bridge or report a much smaller coalesced payload before xterm writes; a 64KB+ repaint with no coalesced-size evidence is a failed resource proof.
- After app-control backgrounds a proof window, include shell.app_control_backgrounded in the resource evidence. A selected terminal in that state must not keep active_write_frame_budget=true or user host_stdin_enabled=true; if compositor focus truth lags, the app-control background flag is the deterministic contract for the low-power path.
- For canvas-mode idle CPU claims, include terminal_hosts[].visible_canvas_layer_count, hidden_canvas_layer_count, software_canvas_layer_optimization_active, software_canvas_cursor_overlay_present, software_canvas_cursor_overlay_visible, and the Codex prompt-band xterm_input_line_decoration_* diagnostic fields. Idle canvas mode may hide inactive selection/link layers, but the xterm cursor layer stays xterm-owned, live Codex/remote sessions must not use Yggterm software prompt or cursor overlays, and release proof should show xterm_input_line_decoration_present == false unless the run explicitly opts into YGGTERM_ALLOW_XTERM_INPUT_LINE_DECORATION=1 for diagnostic comparison. The Codex prompt band, typed input, cursor, status panel, resize redraw, and Working/status animation must be painted by PTY bytes and xterm cells, not by a Yggterm overlay or a release-visible decoration layer. Reject proof where the terminal host is missing, required canvas policy is inactive, prompt/cursor software overlays are visible, typed-input glyph pixel density is below the smoke threshold, the xterm decoration unexpectedly appears, or unexpected full-viewport layer churn remains. App-control cheap snapshots must expose these counters without canvas pixel reads; a dom_debug_snapshot_timeout from diagnostics is not a valid CPU proof.
- If a remote terminal drops back into retry/recovery after a bad intermediate surface, the resume toast should stay visible until the session reaches the real visual reveal again. Do not accept a run where the toast disappears while host_stdin_enabled == false, terminal_settled_kind != "interactive", or the terminal request is still truthfully recovering.
- If a remote resume times out, the attention toast may remain as user-facing error state, but the open-attempt ledger must move to failed and the matching terminal_attach_in_flight, bootstrap lease, and terminal surface request must clear. Reject proof where a no-progress loading toast stays in active_surface_requests indefinitely or drives high idle render counts.
- For terminal-resume toast regressions, also verify the inverse case: once there are no visible notifications, the screenshot should not show an empty blurred/white toast shell still hanging under the titlebar. Capture both notifications_count and a screenshot of the same moment.
- Treat /bin/bash: line 1: exec: __yggterm_initial_tty_size=...: not found as a remote startup transport regression, not as user shell output. The proof should include the open-attempt ledger staying non-interactive before the fix and a fresh remote startup restore that becomes host_stdin_enabled == true after the command wrapper is corrected.
- Treat any non-null active_terminal_surface.geometry_problem as a failed terminal proof, even if the surface otherwise looks rendered.
- Treat non-null active_terminal_surface.performance_problem as performance evidence, not a readiness failure by itself. It must not be used to justify disabled input unless problem / geometry_problem is also non-null. For Codex activity spinner proof, require the sidebar row busy hint to come from mounted xterm activity while hot frame text remains excluded from title/detail sampling.
- Exception: the stable retained-xterm layout may present screen_rect / helpers_rect about 16px narrower than host_rect while viewport_rect still matches the host. That compensated gap is now accepted and should not be treated as a failed proof by itself.
- For startup latency claims, include whether the daemon emitted daemon/startup_prewarm begin|end|error for the active terminal. Startup restore should now be prewarmed after the control socket binds instead of waiting for the first UI mount to pay the whole cost.
- For remote terminal startup restore, also capture whether the initial attach stream included __YGGTERM_ATTACH_READY__. That server marker now means the PTY attach itself is live even when Codex is sitting on low-signal idle/footer chrome.
- Once __YGGTERM_ATTACH_READY__ has arrived, a quiet attached terminal is allowed to settle after the reveal grace deadline only when the retained host surface is prompt-ready. Retained non-prompt text from a previous Codex answer is stale evidence: it may remain visible, but it must not clear the resume toast, mark the attempt interactive, or enable input.
- For loading-truth bugs, capture one state while active_surface_requests still contains the terminal request and one after settle so the bundle shows that the UI did not silently drop the request before attach finished.
- Keep changelog language user-visible and concise.
- Treat demo assets as release material, not disposable debugging leftovers.
- When a result is not live-verified, say so explicitly.
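
A few of the accept/reject rules above can be checked mechanically once an app-state snapshot has been saved as JSON. The following is a minimal sketch, not part of the YggUI tooling: the function names check_snapshot and _get are hypothetical, and the nesting of the snapshot (for example, that active_write_frame_budget sits under active_terminal_surface) is an assumption that should be adjusted to the real app-state layout. It covers only the geometry_problem, performance_problem, css_running_animation_count, and app_control_backgrounded rules, and it does not model the explicit modal/probe animation exception.

```python
from typing import Any


def _get(state: dict[str, Any], dotted: str, default: Any = None) -> Any:
    """Walk a dotted path through nested dicts; return default if a key is missing."""
    node: Any = state
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def check_snapshot(state: dict[str, Any]) -> tuple[list[str], list[str]]:
    """Return (failures, notes) for one post-settle app-state snapshot.

    The field paths below are assumptions about the snapshot layout.
    """
    failures: list[str] = []
    notes: list[str] = []

    # Any non-null geometry_problem fails the proof, even if the surface looks rendered.
    geometry = _get(state, "active_terminal_surface.geometry_problem")
    if geometry is not None:
        failures.append(f"geometry_problem set: {geometry}")

    # performance_problem is recorded as evidence only, never as a failure by itself.
    perf = _get(state, "active_terminal_surface.performance_problem")
    if perf is not None:
        notes.append(f"performance_problem recorded: {perf}")

    # Running CSS animations after settle count as a fan-budget failure.
    if _get(state, "dom.css_running_animation_count", 0) != 0:
        failures.append("css_running_animation_count nonzero after settle")

    # A backgrounded proof window must not keep the active write-frame budget.
    backgrounded = _get(state, "shell.app_control_backgrounded", False)
    budget = _get(state, "active_terminal_surface.active_write_frame_budget", False)
    if backgrounded and budget:
        failures.append("backgrounded terminal kept active_write_frame_budget")

    return failures, notes
```

A proof bundle script could load each saved state file with json.load, pass it to check_snapshot, and refuse to draft changelog text while the failures list is non-empty; the notes list is meant to be attached to the bundle as supporting evidence.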