name: gameplay-playthrough-testing
description: Run an authenticated end-to-end gameplay playthrough of a D20 Adventures adventure in a real browser to test the actual player flow (character select, turns, dice rolls, encounter transitions, NPC roleplay, and completion). Use when verifying a gameplay change, reproducing or confirming a fix in the running app (not just unit/bridge tests), assessing AI Game Master / roleplay quality, or driving an adventure (Midnight Summons, Covert Cargo, Road to Kordavos, March of Davos) to completion. Bridge tests cover compiled data flow; this covers the live runtime that only a real authenticated session exercises.
Gameplay playthrough testing
Drive a real authenticated adventure in a browser to exercise the live runtime (create → turns → rolls → transitions → completion). This catches bugs the wiki-adventure bridge tests cannot — e.g. the solo auto-start redirect swallow, adventurePatch dropping malformed AI fields, and the player-reply / NPC-DM paths reading a stale legacy S3 plan instead of the wiki runtime.
1. Start the app
pnpm dev # Convex + Next on :3000; confirm `next-server` (this project's cwd) owns :3000
A stray wikibop-2 dev server may also be running on the machine — verify the :3000 owner's cwd is this repo.
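One way to check the :3000 owner is sketched below. It assumes `lsof` is installed (typical on macOS and most Linux distributions); `port_owner_cwd` and `owned_by_repo` are hypothetical helper names, not part of this repo.

```sh
# Print the working directory of the process listening on a TCP port.
# Assumption: lsof is available; both helper names are illustrative only.
port_owner_cwd() {
  pid=$(lsof -t -iTCP:"$1" -sTCP:LISTEN 2>/dev/null | head -n 1)
  [ -n "$pid" ] || return 1
  # -a ANDs the filters, -d cwd selects the cwd entry, -Fn prints it as "n<path>".
  lsof -a -p "$pid" -d cwd -Fn 2>/dev/null | sed -n 's/^n//p'
}

# Succeed only when the port's owner runs from the expected directory.
owned_by_repo() {
  owner=$(port_owner_cwd "$1") || return 1
  [ "$owner" = "$2" ]
}

# Usage, from the repo root:
#   owned_by_repo 3000 "$(pwd -P)" && echo "ok" || echo ":3000 belongs to another checkout"
```

If the check fails, stop the other dev server before starting the playthrough so the browser does not talk to the wrong app.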
2. Test env flags (local only; revert when done)
Add to .env.local (gitignored), then restart pnpm dev (env is read at startup):
`NEXT_PUBLIC_USE_PLACEHOLDER_IMAGES=true`: disables Replicate image generation (returns placeholders). Replicate (104.18.x) is often unreachable from dev machines and its connect-timeouts bog down heavy multi-NPC turns; it is never on the gameplay/LLM path, so disabling it is safe and recommended. The text model (gemini-3.1-flash-lite) is fast and is the real engine.
In .env, add the test user's Clerk id to ADMIN_USER_IDS (comma-separated) for admin + practice-mode access, then restart.
In-game token economy: a fresh user starts with ~700 tokens and each LLM call draws from that balance (INSUFFICIENT_TOKENS stalls turns). Top up:
npx convex run userTokenManagement:incrementTokens '{"userId":"<clerkId>","tokensToCredit":1000000,"transactionType":"adjustment_manual"}'
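The env-flag edit can be made idempotent with a small helper, so repeated runs do not stack duplicate lines. This is a minimal sketch: `set_env_flag` is a hypothetical name, and it assumes simple unquoted `KEY=VALUE` lines.

```sh
# Set KEY=VALUE in an env file: replace an existing line for KEY, or append one.
# set_env_flag is a hypothetical helper, not part of the repo.
set_env_flag() {
  file=$1 key=$2 value=$3
  touch "$file"
  # Drop any existing assignment for the key (grep exits 1 if nothing is left).
  grep -v "^${key}=" "$file" > "$file.tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$file.tmp"
  mv "$file.tmp" "$file"
}

# Usage:
#   set_env_flag .env.local NEXT_PUBLIC_USE_PLACEHOLDER_IMAGES true
```

Restart `pnpm dev` after changing the file, since env is read at startup.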
3. Auth — create a test user (don't sign up in-browser)
Clerk is a dev instance (pk_test). Browser sign-up hangs on Clerk bot/CAPTCHA, so create the user via the Backend API with CLERK_SECRET_KEY (in .env):
curl -s -X POST https://api.clerk.com/v1/users \
-H "Authorization: Bearer $CLERK_SECRET_KEY" -H "Content-Type: application/json" \
-d '{"username":"d20tester","email_address":["d20-tester+clerk_test@example.com"],"password":"<pw>","skip_password_checks":true,"skip_legal_checks":true}'
API-created emails are auto-verified, so password sign-in works (sign-in is not CAPTCHA-gated). +clerk_test emails need no real inbox.
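The create-user response is a JSON user object whose `id` (a `user_...` string) is the Clerk id needed for ADMIN_USER_IDS and the token top-up. A rough way to capture it without extra tooling is sketched below; `extract_clerk_user_id` is a hypothetical helper, and `jq -r .id` is the sturdier choice where `jq` is installed.

```sh
# Read a Clerk create-user JSON response on stdin and print the user id.
# Assumption: the first "id" value starting with user_ is the top-level user id.
extract_clerk_user_id() {
  grep -o '"id"[[:space:]]*:[[:space:]]*"user_[A-Za-z0-9]*"' \
    | head -n 1 \
    | sed 's/.*"\(user_[A-Za-z0-9]*\)"$/\1/'
}

# Usage: pipe the curl call from this step into the helper, e.g.
#   CLERK_ID=$(curl -s -X POST https://api.clerk.com/v1/users ... | extract_clerk_user_id)
```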
4. Drive the browser with agent-browser
Use the `agent-browser` CLI (run `npx -y agent-browser@latest skills get core` for current docs). `--headed` shows the window; persist auth with `state save` / `--state <file>` (there is no standalone Clerk sign-in page, only a modal, so restoring saved state is the reliable way to rerun).
Turn loop per character: Go To Reply → wait ~2.5s+ for the <textarea> to appear → fill a present-tense third-person action → Send Reply → (maybe Roll D20) → Go To Next Turn.
Hard-won driving rules:
- The reply textbox takes ~2.5s after "Go To Reply"; retry, and `agent-browser reload` to recover when it doesn't appear (long sessions leave the UI stale). Reloading also re-triggers stalled NPC processing.
- Match the `button "Roll D20"` exactly; don't grep loosely (the "D20 Adventures" logo also matches `d20`).
- Multi-PC turns act in initiative order; the engine's `[LLM] Stopping at player character: X` log (not the roster "YOU" badge) tells you the real current actor.
- LLM turn-advance takes ~20–60s; long `agent-browser wait` calls may background, so poll the task output file.
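The driving rules above can be wrapped in a few shell helpers. This is a sketch under stated assumptions: all three function names are hypothetical, the check command passed to `retry_with_recovery` is whatever agent-browser call you use to detect the textarea, and the snapshot text is assumed to contain the literal `button "Roll D20"` label.

```sh
# Retry a command up to N times, running a recovery command between failures.
# Args: attempts, delay in seconds, recovery command string, then the command to try.
retry_with_recovery() {
  attempts=$1 delay=$2 recover=$3
  shift 3
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    if [ "$i" -lt "$attempts" ]; then
      sh -c "$recover" || true   # e.g. "agent-browser reload"
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Print the most recent actor named in the engine log read from stdin.
current_actor() {
  sed -n 's/.*\[LLM\] Stopping at player character: //p' | tail -n 1
}

# Succeed only if the snapshot text on stdin has the exact Roll D20 button label.
has_roll_button() {
  grep -qF 'button "Roll D20"'
}
```

A fixed-string match (`grep -F`) on the full label avoids the false positive from the "D20 Adventures" logo, and taking the last matching log line reflects the engine's latest stop rather than an earlier turn.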
5. Adventure shapes
- Solo (1-player, premade — e.g. Midnight Summons): selecting the premade auto-starts and redirects to turn 1.
- Multi-player (Covert Cargo, party 2-2): the public flow is a real lobby (needs 2 players + invite). Solo-test via practice mode (`/settings/<setting>/<plan>/practice`, admin-gated) to control all PCs.
- Custom-character (Road to Kordavos, 1-3): select a saved character or create one. To backend-create a character (the UI image step needs Replicate), write a `pcTemplateSchema` (`types/character.ts`) JSON to S3 `characters/<clerkId>/<id>.json` via `updateJsonOnS3` (run with `node --env-file=.env --env-file=.env.local --import tsx scripts/<x>.ts`).
6. Encounter transitions
Transitions fire when the AI GM picks the next encounter based on the ## Transitions cues in the encounter source (content/settings/<setting>/adventures/<plan>/encounters/*.md). Read them first. Examples: the-shipment → the-transaction needs "Lyra verbally confirms the magic"; battle-on-the-boat → the-crate needs overcoming the guards; well-met → the-gates-ahead is "after 4 turns" (automatic). Generic actions that match no cue correctly keep the GM in the encounter — that is not a bug; match the real cue.
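To read the cues before playing, the `## Transitions` section of each encounter file can be printed with a short `awk` filter. This is a sketch: `print_transitions` is a hypothetical helper, and it assumes the section runs until the next level-2 heading.

```sh
# Print the "## Transitions" section of a markdown file, up to the next "## " heading.
print_transitions() {
  awk '/^## Transitions/ {on=1; print; next} /^## / {on=0} on' "$1"
}

# Usage (replace <setting> and <plan> with real directory names):
#   for f in content/settings/<setting>/adventures/<plan>/encounters/*.md; do
#     echo "== $f"; print_transitions "$f"
#   done
```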
7. Proving a fix
Reproduce the broken state, fix, and prove the fix uses the intended path by deliberately breaking the old source. E.g. for the legacy-plan bugs: re-stub the S3 plan (empty sections[].scenes[].encounters) and confirm the action still works (proving it reads the wiki runtime), then restore the complete plan from wiki/sources/adventure plans/*.json.
8. Cleanup
Revert .env.local (NEXT_PUBLIC_USE_PLACEHOLDER_IMAGES) and the ADMIN_USER_IDS test entry. The Clerk test user, credited tokens, and any backend-created characters/plans persist in the dev backend (harmless to leave). Stop `pnpm dev` and run `agent-browser close`. The agent-browser session/vault lives in gitignored .agent-browser/.
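Removing the test id from ADMIN_USER_IDS by hand is easy to get wrong when other ids are present. The sketch below does it mechanically. Both helper names are hypothetical, and it assumes a single unquoted `ADMIN_USER_IDS=a,b,c` line; the rewritten line is moved to the end of the file.

```sh
# Print a comma-separated list with one exact id removed.
remove_csv_id() {
  printf '%s' "$1" | tr ',' '\n' | grep -vxF -- "$2" | paste -sd, -
}

# Rewrite ADMIN_USER_IDS in an env file without the given id.
# Assumption: the value is unquoted and the key appears once.
drop_admin_id() {
  current=$(sed -n 's/^ADMIN_USER_IDS=//p' "$1" | tail -n 1)
  updated=$(remove_csv_id "$current" "$2")
  grep -v '^ADMIN_USER_IDS=' "$1" > "$1.tmp" || true
  printf 'ADMIN_USER_IDS=%s\n' "$updated" >> "$1.tmp"
  mv "$1.tmp" "$1"
}

# Usage:
#   drop_admin_id .env "<clerkId>"
```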