name: probe-win95
description: Boot Windows 95 in Electron under Claude's control, without a human clicking anything. Use when testing v86 updates, SMB changes, keyboard input, boot stability, or bisecting regressions.
Probing Windows 95 autonomously
You can run and test the Win95 VM yourself. The harness is already wired up — three pieces:
| File | Role |
|---|---|
| src/renderer/debug-harness.ts | Activated by WIN95_PROBE=1. Boots fresh automatically, samples CPU + VGA + text screen every 5s, writes /tmp/win95-probe.json + /tmp/win95-screen.png, detects SUCCESS vs FAIL modes, optionally drives keyboard input. |
| src/renderer/smb/index.ts | Wraps console.log so [smb] and [nbns] lines tee to $TMPDIR/windows95-smb.log (outside Electron, readable by any polling script; no CDP needed). |
| tools/probe-boot.sh | One-shot: kill leftovers → parcel build → launch Electron → poll /tmp/win95-probe.done → report → kill. |
Running from a git worktree
images/ is gitignored, so a fresh worktree has no disk image or default
state and every probe will fail at boot. Clone them from the main checkout
first (APFS clonefile — instant, no extra disk space):
mkdir -p images
cp -c "$(git rev-parse --git-common-dir)/.."/images/*.{img,bin} images/
One-shot boot test
tools/probe-boot.sh
Prints SUCCESS or a FAIL verdict. ~40s on a clean run.
Boot + type into Run
pkill -9 -f "windows95.*electron"; sleep 2
rm -f "$HOME/Library/Application Support/windows95/"state-v*.bin
rm -f /tmp/win95-probe.json /tmp/win95-probe.done \
"$TMPDIR/windows95-smb.log"
WIN95_PROBE=1 \
WIN95_PROBE_SCRIPT='HOST/HOST' \
WIN95_SMB_SHARE="$HOME/Downloads" \
./node_modules/.bin/electron . > /tmp/win95-electron.log 2>&1 &
WIN95_PROBE_SCRIPT='HOST/HOST' types \\HOST\HOST into Start → Run on
the desktop.

WIN95_PROBE_DOSBOX=1 instead opens command, types dir,
and (with WIN95_PROBE_DOSBOX_ALTENTER=1) toggles fullscreen. This is
the regression scenario for the windowed-DOS-box VBE leak.

WIN95_PROBE_CDROM=/path/to.iso mounts an ISO on the secondary-IDE
ATAPI drive (bypasses the settings UI). WIN95_PROBE_CDTRACE=1 logs
every secondary-channel ATA/ATAPI command to /tmp/win95-cdtrace.log.

WIN95_PROBE_VGATRACE=1 wraps the VGA I/O ports at the io.ports[]
layer and writes [port, op, value, "eip VMPE cplN"] tuples to
/tmp/win95-vgatrace.json every tick (heavy: can hit 1M entries during
boot).

In WIN95_PROBE_SCRIPT, / stands in for \ and the harness substitutes
backslashes when typing; this is a pragmatic way around env-var and
shell quoting. The harness drives the Run dialog via XT scancodes.
Win95 doesn't have Win+R (Win98+ only), so the sequence is Esc, Esc,
Ctrl+Esc, R, backslashes + text, Enter.
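The / → \ mapping can be illustrated with a small shell function. This is only a sketch of the behaviour described above, not the harness's actual code: `probe_script_to_unc` is a hypothetical name, and the leading `\\` is inferred from the HOST/HOST example.

```bash
# Hypothetical illustration: show what a WIN95_PROBE_SCRIPT value
# is expected to become once typed into the Run dialog.
probe_script_to_unc() {
  local script="$1"
  local converted
  # Swap every forward slash for a backslash.
  converted="$(printf '%s' "$script" | tr '/' '\\')"
  # Prefix the UNC double backslash (printf turns each \\ into one \).
  printf '\\\\%s\n' "$converted"
}
```

For example, `probe_script_to_unc 'HOST/HOST'` prints `\\HOST\HOST`.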
Reading results
| File | What |
|---|---|
| /tmp/win95-probe.json | Live status: phase (init/text-mode/splash/desktop), gfxW/H, textScreen, instructionDelta, verdict |
| /tmp/win95-probe.done | Written once when verdict is decided |
| /tmp/win95-screen.png | Canvas screenshot, refreshed each tick |
| $TMPDIR/windows95-smb.log | SMB/NBNS protocol trace |
| /tmp/win95-electron.log | Electron stderr |
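
When Electron is launched by hand rather than through tools/probe-boot.sh, a polling loop over these files might look like the sketch below. `wait_for_verdict` is a hypothetical helper, not part of the repo. It assumes the status JSON stores the verdict as a string field named "verdict"; the exact serialization is not specified here, so adjust the pattern if it differs.

```bash
# Hypothetical helper: wait for the done-marker, then print the verdict.
# Usage: wait_for_verdict [timeout_seconds] [json_path] [done_path]
wait_for_verdict() {
  local timeout="${1:-120}"
  local json="${2:-/tmp/win95-probe.json}"
  local done_file="${3:-/tmp/win95-probe.done}"
  local waited=0

  # Poll once per second until the marker appears or the timeout expires.
  while [ ! -e "$done_file" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "TIMEOUT"
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done

  # Assumption: the JSON contains "verdict": "<value>" as a string field.
  sed -n 's/.*"verdict"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$json" | head -n 1
}
```

The default timeout of 120s is a guess that leaves room for the slower FAIL verdicts; a clean run finishes well before that.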
Verdicts
| Verdict | Meaning | Action |
|---|---|---|
| SUCCESS | Canvas ≥640×480, CPU active, uptime >30s | desktop reached |
| FAIL_VXDLINK | "Invalid VxD dynamic link call" | flaky; retry |
| FAIL_IOS / FAIL_PROTECTION | IOS subsystem protection error | usually driver/BIOS mismatch |
| FAIL_KRNL386 | "Cannot find KRNL386.EXE" in safe mode | disk reads returning garbage (wasm/BIOS drift) |
| FAIL_SPLASH_HANG | Canvas stuck 320×400 for >70s | IRQ starvation; if you're on v86 master, check the IDE register fix |
| FAIL_HUNG | CPU stopped advancing or text screen frozen 40s | hard hang |
Rules of the road
- Sporadic bluescreens are normal on all v86 versions. One FAIL_VXDLINK or FAIL_HUNG doesn't prove anything — retry up to 3×.
- Always clean state (state-v*.bin; the suffix tracks STATE_VERSION in src/constants.ts, so never hardcode a version) before a probe. pkill on a wedged Electron triggers onbeforeunload, saving the corrupted state. Deleting it forces fallback to images/default-state.bin.
- Don't trust the text buffer in graphics mode. After desktop (≥640×480) the stale BIOS text lingers in the buffer. The harness's phase field accounts for this; don't re-read textScreen in a desktop phase and think you hit a BSOD.
- Kill Electron when done. Background processes pile up, each holding the disk image lock. pkill -f "windows95.*electron" on every path out.
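
The cleanup and kill rules can be combined into a small wrapper. The sketch below is one way to do it; `clean_probe_state` and `with_probe_cleanup` are hypothetical names, and the default state directory is taken from the manual commands shown earlier.

```bash
# Hypothetical helpers; names and parameters are illustrative, not part of the repo.

# Remove saved VM state and stale probe outputs so the next boot starts clean.
# The glob avoids hardcoding the STATE_VERSION suffix.
clean_probe_state() {
  local state_dir="${1:-$HOME/Library/Application Support/windows95}"
  local tmp_dir="${2:-/tmp}"
  rm -f "$state_dir"/state-v*.bin
  rm -f "$tmp_dir/win95-probe.json" "$tmp_dir/win95-probe.done"
}

# Run any probe command with state cleanup before it and a kill on the way out.
# The EXIT trap fires when the calling shell exits, whether the probe passed or failed.
with_probe_cleanup() {
  trap 'pkill -f "windows95.*electron" || true' EXIT
  clean_probe_state
  "$@"
}
```

Usage would be along the lines of `with_probe_cleanup tools/probe-boot.sh` from a script. Avoid calling it in an interactive shell, since the trap stays installed until that shell exits.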
Bisecting v86
tools/bisect-v86.sh <commit> handles one step. The harness retries 3×
per commit. Hard-won lessons:
- Validate bounds against a known-good binary. Source-built wasm can drift from prod due to cargo/rustc version differences. We hit this: the "GOOD" bound produced a wasm that couldn't read the disk at all.
- JS-only when toolchain drifts. Keep the prod wasm, rebuild only libv86.js at each commit. Closure is deterministic enough; cargo isn't always. Works until you cross a commit that changes the JS↔wasm ABI (for v86, the APIC→Rust port in Aug 2025).
- Retry on FAIL, never on SUCCESS. One SUCCESS = commit is good. Three different FAILs at the same commit = commit is bad.
- State cleanup between runs (see above). Skipping this is the #1 cause of spurious "bad" verdicts during bisect.
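
The "retry on FAIL, never on SUCCESS" rule amounts to a short loop. The sketch below assumes the probe command exits 0 on SUCCESS and non-zero on any FAIL verdict, which is not confirmed here for tools/probe-boot.sh or tools/bisect-v86.sh; `retry_until_success` is a hypothetical name.

```bash
# Hypothetical helper: run a probe command until it succeeds, at most max_attempts times.
# Assumption: the command exits 0 on SUCCESS and non-zero on any FAIL verdict.
retry_until_success() {
  local max_attempts="$1"
  shift
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Probe output goes to stderr so stdout carries only the conclusion.
    if "$@" >&2; then
      echo "GOOD"
      return 0
    fi
    attempt=$((attempt + 1))
  done
  echo "BAD"
  return 1
}
```

A call such as `retry_until_success 3 some-probe-command` stops at the first success and reports BAD only after three failed attempts. Each attempt should still start from clean state.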
Extending the harness
- New verdicts: add to the chain in collectStatus in debug-harness.ts
- New keyboard actions: extend runScript (current types: keys, chord, text, wait)
- New probe signals: add to the ProbeStatus interface
Gate everything new on process.env.WIN95_PROBE === "1" so it stays out
of the normal app.
Common failure diagnostics
| Symptom | Check |
|---|---|
| No SMB traffic at all | $TMPDIR/windows95-smb.log should contain a hooked adapter line. If absent, the v86 API changed; see src/renderer/smb/README.md |
| SMB hooks fire, no connection | Win95's "NetBIOS over TCP/IP" checkbox — bake into default-state.bin |
| Boot hangs on 2996c087 or older v86 | You probably have an ABI-mismatched wasm/JS pair. Prod wasm is the ground truth; rebuild JS against it. |
VXDLINK: flake vs. real bug
Two different things produce FAIL_VXDLINK:
- Sporadic flake (~1 in 2–3 runs even on known-good images): passes on retry. This is why the retry-3× rule exists.
- Deterministic failure (same address every run, e.g. VMM(01)+000036E5 → device "C000" service E3E4): the image's disk layout triggers a real v86 disk-path bug. Known triggers: zeroing free space via the "mcopy a giant zero file, then delete it" trick, and offline (mtools) mass-deletion of recently-written file trees. The same image boots fine in QEMU and passes fsck. Retrying never helps; the image content must change.
This is the canonical verdict-interpretation policy (other docs link here):
- One SUCCESS = the image can boot. Good — same rule as bisecting.
- Three identical failures (same VxD address) = the image is in a bad state. Stop retrying; the content must change.
- Anything in between = keep retrying, you are looking at flakes.
Never conclude anything from a single FAIL.
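
The policy above can be written as a small decision function over the outcomes seen so far. This is a sketch: `interpret_runs` is a hypothetical name, and the outcome strings (for example a verdict joined to its VxD address with `@`) are an illustrative convention, not something the harness emits. It counts identical signatures across all runs, not only consecutive ones.

```bash
# Hypothetical helper: decide what a series of probe outcomes means.
# Usage: interpret_runs OUTCOME...   (e.g. SUCCESS, or "FAIL_VXDLINK@VMM(01)+000036E5")
# Prints GOOD, BAD, or RETRY.
interpret_runs() {
  local outcome
  # One SUCCESS is enough: the image can boot.
  for outcome in "$@"; do
    if [ "$outcome" = "SUCCESS" ]; then
      echo "GOOD"
      return 0
    fi
  done

  # Otherwise find how often the most common failure signature occurred.
  local max
  max="$(printf '%s\n' "$@" | sort | uniq -c | sort -rn | awk 'NR==1 {print $1}')"

  # Three identical failures: stop retrying, the content must change.
  if [ "${max:-0}" -ge 3 ]; then
    echo "BAD"
  else
    echo "RETRY"
  fi
}
```

A single FAIL therefore always yields RETRY, which matches the rule that one failure proves nothing.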
Probing the state-restore path
WIN95_PROBE_RESTORE=1 tools/probe-boot.sh makes the probe restore state
(user state → images/default-state.bin fallback) instead of cold
booting. Use it to verify a freshly generated default-state.bin actually
resumes to the desktop — required after every image change.