name: contributing-to-element-interactions
subagent-only: true
description: >
Subagent-only. The full contribution methodology is too heavy to keep
in the orchestrator's transcript every cycle. The orchestrator detects
contribution intent and dispatches a contribution-handover- subagent;
the subagent loads this skill. Loading this skill into orchestrator
context is a methodology violation (the skill is heavy enough to
contaminate orchestrator context). The previous harness-side guard was
retired in 0.3.6; respect the convention by dispatching a subagent.
Use this skill when contributing to the @civitas-cerebrum/element-interactions package or its skill suite — and, just as importantly, when a consumer hits the package's edges from the outside. Two trigger families:
(A) API gap. A user, test, or skill needs a method, option, matcher, or
assertion shape that does not exist on Steps / ElementAction / ExpectMatchers
/ the matcher tree, and the temptation is to drop down to raw Playwright
Locator.* calls. Triggers: "extend the Steps API", "add a new method to
ElementAction", "no equivalent in the framework", "the package doesn't have",
"missing API in element-interactions", "missing matcher", "drop down to raw
Playwright", "fall back to page.locator", "the framework doesn't expose X",
"how do I add to this framework".
(B) Structural / framework / protocol gap. A skill, workflow, or documented invariant declares a rule that the package's current architecture cannot satisfy without changing the package itself, switching its underlying tooling, or relaxing the rule. The MCP→playwright-cli migration (#121, #122) is the canonical example: the parallel-isolation rule was structurally unsatisfiable on top of the Playwright MCP plugin and required a tooling change at the package layer, not a skill-level workaround. Triggers: "the framework can't satisfy", "framework limitation", "this rule cannot be satisfied", "the package's architecture prevents", "structural gap", "protocol gap", "isolation can't be guaranteed", "this prereq isn't satisfiable", "the underlying tooling doesn't support", "should I file an issue against the package", "is this a skill issue or a package issue", "do we need to change the package to fix this", any case where a skill is about to silently weaken or skip a documented invariant because the package can't back it.
Also use when contributing to the skill suite under skills/ (adding,
modifying, or registering a skill, debugging the package's own tests, or
opening a PR against this repo). This skill explains the separation of
concerns between element-repository and element-interactions, the test
coverage rules, the design principles that must be respected when scaling
the package, the exact workflow for adding new APIs cleanly, and how to
distinguish an API gap from a structural gap.
(C) Issue-queue / roadmap work on this repo. Any request to triage,
plan, or implement open issues filed against civitas-cerebrum/element-interactions. Triggers: "check the github issues", "look at the open
issues", "implementation roadmap", "implement issue #N", "ship issue #N",
"work on the open issues", "let's get started on the issues", "address
the issue queue", "pick up an issue", "what's left to ship", "go through
the issues", "what should we work on next" (when CWD is the package's
own repo).
Triggers also on: "contribute to element-interactions", any request to
modify files under the package's src/, skills/, or hooks/, "open an
issue on element-interactions", "open a PR on element-interactions", any
of the structural / protocol-gap phrases above, or any framing that
implies work on the package itself rather than with it.
Activation banner: The first user-facing reply after this skill loads MUST begin with the line: Protocol Achilles activated. Once per session — skip if already declared in this conversation. Subagents (which return structured data, not user-facing text) are exempt.
Contributing to @civitas-cerebrum/element-interactions
This package is a Playwright-on-top facade. Every API decision should preserve the framework's two non-negotiable promises:
- No raw selectors in user test files. Tests refer to elements by name ('submitButton', 'CheckoutPage'), never by CSS/XPath/locator strings.
- No raw Playwright Locator.* calls in user test files. Every interaction, verification, and extraction goes through Steps, ElementAction, or the matcher tree, never await page.locator('x').click() directly.
If a contribution undermines either promise, it doesn't ship.
🏛️ Software Architecture
The two packages
The framework is split across two packages for a reason. Understand the split before adding anything.
┌──────────────────────────────────────────────────────────────────┐
│ User test file (tests/*.spec.ts) │
│ │
│ await steps.expect('price', 'ProductPage').text.toBe('$19.99') │
│ await steps.on('btn', 'Page').nth(2).click() │
└────────────────────────────┬─────────────────────────────────────┘
│ string names only — no selectors,
│ no Locators, no driver primitives
▼
┌──────────────────────────────────────────────────────────────────┐
│ @civitas-cerebrum/element-interactions │
│ │
│ Steps ──┬─► Interactions (click, fill, hover, ...) │
│ ├─► Verifications (verifyText, verifyCount, ...) │
│ ├─► Extractions (getText, getAttribute, ...) │
│ └─► ExpectBuilder (.text.toBe, .count.toBeGT, ...) │
│ │
│ ElementAction (fluent builder behind steps.on(...)) │
│ BaseFixture (wires Steps + Repository + Interactions) │
└────────────────────────────┬─────────────────────────────────────┘
│ uses Element abstraction —
│ never raw Locator
▼
┌──────────────────────────────────────────────────────────────────┐
│ @civitas-cerebrum/element-repository │
│ │
│ ElementRepository.get('btn', 'Page') ──► Element │
│ │
│ Element (platform-agnostic interface) │
│ ├─► WebElement (Playwright-backed) │
│ └─► PlatformElement (Appium / WebDriverIO-backed) │
│ │
│ page-repository.json (single source of truth for selectors) │
└────────────────────────────┬─────────────────────────────────────┘
│
▼
Playwright Locator / WebDriverIO Element
Layer responsibilities
| Layer | Responsibility | Forbidden |
|---|---|---|
| User test | Describe scenarios in domain language | Constructing locators, importing @playwright/test directly for assertions, calling page.locator() |
| Steps | Top-level facade users call | Holding state across calls, exposing Locator in return types |
| ElementAction | Fluent builder for steps.on(...) chains | Long-lived state (only in-flight chain state); exposing raw Playwright |
| ExpectMatchers | Chain-style assertion tree | Mocking, side-effects beyond the awaited assertion |
| Interactions / Verifications / Extractions | Internal helpers that accept Element only (no Locator). Wrap raw Locators in new WebElement(locator) at the seam if you must. | Calling raw locator.X() instead of going through Element |
| BaseFixture | Constructs Steps with the right deps; auto-attaches failure screenshots | Test-specific logic |
| Element interface | Cross-platform element abstraction | Concept that doesn't exist on one of the platforms |
| WebElement | Playwright impl + web-only methods | Anything that's not a thin Playwright delegation |
| PlatformElement | WebDriverIO/Appium impl | Web-only DOM concepts |
| ElementRepository | Resolves name → Element, owns page-repository.json | Wrapping interactions or assertions (that's element-interactions' job) |
Data flow — anatomy of one call
Tracing await steps.on('submit-button', 'CheckoutPage').text.toBe('Place Order'):
1. steps.on('submit-button', 'CheckoutPage'): Steps constructs an ElementAction with the element/page names and a fresh ExpectBuilder context.
2. .text: the getter on ElementAction returns a TextMatcher carrying the builder's context (timeout, page, name, negation flag).
3. .toBe('Place Order'): TextMatcher.toBe queues a QueuedAssertion on the builder's queue and returns the builder. No work runs yet. The chain is synchronous up to this point.
4. await: JavaScript invokes builder.then(...) because ExpectBuilder implements PromiseLike<void>. then calls flush().
5. flush(): drains the queue. For each assertion:
   - Calls ctx.captureSnapshot() → ElementAction.captureSnapshot() resolves the named element via ElementRepository.get(...) (returning an Element), then calls Element.count / textContent / inputValue / getAllAttributes / isVisible / isEnabled in parallel.
   - Runs the matcher's predicate against the snapshot.
   - On failure, throws with a structured error that includes the snapshot pretty-printed.
6. Element.click / textContent / ... call, under the hood, into WebElement (Playwright Locator) or PlatformElement (WebDriverIO). User test code never sees these primitives.
The same shape applies to actions — steps.on('btn', 'Page').click() flows through Interactions.click(target) → toElement(target) → Element.click({ timeout }) → WebElement.click() → Locator.click().
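The queue-then-flush behaviour in steps 3 to 5 can be sketched in isolation. This is a minimal illustration under stated assumptions, not the package's source: SketchExpectBuilder, textToBe, pending, and the Snapshot shape are hypothetical stand-ins, and the real builder carries more context (timeout, negation, element and page names).

```typescript
// Hypothetical stand-ins: these names mirror the prose, not the package's real code.
type Snapshot = { text: string };
type QueuedAssertion = { describe: string; predicate: (s: Snapshot) => boolean };

class SketchExpectBuilder implements PromiseLike<void> {
  private readonly queue: QueuedAssertion[] = [];
  private readonly captureSnapshot: () => Promise<Snapshot>;

  constructor(captureSnapshot: () => Promise<Snapshot>) {
    this.captureSnapshot = captureSnapshot;
  }

  // Synchronous: records the assertion and returns the builder for chaining.
  textToBe(expected: string): this {
    this.queue.push({
      describe: `text to be "${expected}"`,
      predicate: (s) => s.text === expected,
    });
    return this;
  }

  // Number of assertions recorded but not yet run.
  get pending(): number {
    return this.queue.length;
  }

  // `await builder` calls then(), which is where the queued work finally runs.
  then<R1 = void, R2 = never>(
    onfulfilled?: ((value: void) => R1 | PromiseLike<R1>) | null,
    onrejected?: ((reason: any) => R2 | PromiseLike<R2>) | null,
  ): PromiseLike<R1 | R2> {
    return this.flush().then(onfulfilled, onrejected);
  }

  private async flush(): Promise<void> {
    const drained = this.queue.splice(0); // drain the queue
    for (const assertion of drained) {
      const snapshot = await this.captureSnapshot();
      if (!assertion.predicate(snapshot)) {
        throw new Error(
          `Expected ${assertion.describe}; snapshot: ${JSON.stringify(snapshot)}`,
        );
      }
    }
  }
}
```

In this sketch, calling textToBe(...) without await leaves the assertion queued and never reads the element, which is why a chain that is not awaited silently asserts nothing.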
Why this split exists
- Cross-platform abstraction has to be at the bottom. If Element lived in element-interactions, every package that wanted platform support would have to depend on the entire interaction surface. Keeping Element in its own package means future platforms (desktop, smart TV, native macOS) can implement only the Element contract.
- Element acquisition is a different concern from interaction. Repository logic (parsing page-repository.json, applying selection strategies, formatting selectors per platform) is independent of what you do with the resolved element. Mixing them produces a god-class.
- The fixture is the wiring layer, not the API. Tests import from BaseFixture; Steps itself is constructible standalone for unusual scenarios. The fixture is opinionated; Steps is composable.
Module / file conventions
- src/steps/: user-facing Steps, ElementAction, ExpectMatchers. The chain-style API lives here.
- src/interactions/: internal Interactions, Verifications, Extractions, plus the facade/ElementInteractions aggregator.
- src/utils/: shared helpers (ElementUtilities for waiting, DateUtilities for date formatting). Pure functions only.
- src/enum/: public enum types (DropdownSelectType, EmailFilterType, etc.).
- src/fixture/: BaseFixture and related fixture helpers.
- src/config/: environment / credentials parsing.
- src/logger/: debug logger for verify/interact/email categories.
- tests/: Playwright tests, all hitting the real Vue test app.
- tests/fixture/: test fixture wiring + shared helper functions (e.g. pageHelpers.ts).
- tests/data/: page-repository.json and any fixture data.
- skills/contributing-to-element-interactions/: this skill (top-level so the harness auto-discovers it). Agent-facing skill files for the broader suite live under sibling directories at skills/<skill-name>/SKILL.md.
When you add a new file:
- New public API entrypoint? src/steps/.
- New internal helper (called only by the package itself)? src/utils/ or co-located in the file that uses it.
- New enum or public type? src/enum/Options.ts (or a new file in the same dir for large groups).
- Never create a top-level "misc" folder.
🚦 Decision tree: where does my new API go?
When you want to add something, walk this in order:
1. Is it a raw element capability (e.g. "read CSS variable", "drag with custom timing")? → Add to the Element interface in element-repository (and/or WebElement if web-only). Bump the element-repository version. Then expose it through element-interactions.
2. Is it a verification/assertion (e.g. "assert element has class X", "assert N items in this list")? → Add a matcher to ExpectMatchers.ts. Either extend an existing matcher class (TextMatcher, CountMatcher, etc.) or add a new field matcher under ExpectBuilder.
3. Is it a composite workflow (e.g. "fill an entire form from an object", "retry an action until verification passes")? → Add a method to Steps in CommonSteps.ts. Use existing primitives (steps.fill, steps.verifyText, steps.on(...)); never call page.locator() from inside the new method.
4. Is it a strategy selector or filter (e.g. "select first matching by aria-label")? → Add to ElementAction as a chainable strategy method. It should mutate resolutionOptions and return this.
5. Is it a fixture-level concern (e.g. "auto-clean cookies between tests")? → Extend BaseFixture or compose via test.extend<T>(); don't pollute Steps with global cross-cutting setup.
If none of the above fit, stop and discuss before writing code. There's probably a deeper design issue.
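For the strategy-selector branch of the tree, the "mutate resolutionOptions and return this" shape might look like the sketch below. It assumes a simplified option bag: only resolutionOptions and nth appear in this document, while StrategyChainSketch, ResolutionOptionsSketch, the field names, and byAriaLabel are hypothetical.

```typescript
// Hypothetical option fields; only `resolutionOptions` and `nth` are named in this document.
interface ResolutionOptionsSketch {
  index?: number;
  ariaLabel?: string;
}

class StrategyChainSketch {
  readonly resolutionOptions: ResolutionOptionsSketch = {};

  // Existing-style strategy: record the index, resolve nothing yet.
  nth(index: number): this {
    this.resolutionOptions.index = index;
    return this;
  }

  // Hypothetical new strategy: record the filter, do no DOM work, keep the chain going.
  byAriaLabel(label: string): this {
    this.resolutionOptions.ariaLabel = label;
    return this;
  }
}

// Strategies compose in any order because each one only mutates the option bag.
const closeButtonChain = new StrategyChainSketch().byAriaLabel('Close').nth(2);
```

Returning this rather than a new object keeps the chain's in-flight state in one place, which matches the "only in-flight chain state" constraint in the layer table.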
🚨 Hard rules — don't violate
Read this skill before editing the package
Rule. Any agent preparing to modify files inside this package's contribution surface — src/, hooks/, skills/, scripts/, package.json, tsconfig*.json, .github/ — MUST first load this skill (skills/contributing-to-element-interactions/SKILL.md) in the current session. Either invoke it via the Skill tool or Read the file directly. The skill encodes the architecture, the API-vs-structural-gap distinction, the hard rules, and the design invariants every contribution must respect; an agent that hasn't loaded it is editing blind.
Enforcement note. The previous harness pre-read guard (which DENY'd edits when CWD was this package's repo and the skill hadn't been loaded) was retired in the 0.3.6 cleanup for public-dep cleanliness. The rule itself still applies as a methodology convention.
Editing this SKILL.md itself is exempt — the edit IS the read.
Methodology improvements ship as programmatic hooks, not just markdown
Every PR that adds, modifies, or strengthens a rule, workflow, phase, gate, invariant, or contract in any skills/*/SKILL.md (or its referenced files under references/) MUST ship a corresponding harness hook in hooks/ that enforces the rule programmatically — or include an explicit, reviewer-visible note explaining why mechanical enforcement is impossible.
Markdown is documentation, not enforcement. Under context pressure, an orchestrator reading its own rule will rationalise around it ("this case is different", "given session constraints", "I'll be transparent about the trade-off") and stop / narrow / skip anyway. This is not a hypothetical — it is the documented failure pattern of issues #139, #154, #155, and #156. The harness layer is the only second-reader the orchestrator cannot talk past.
Decision rule (apply when you write or edit any SKILL.md rule):
| Rule shape | Hook surface |
|---|---|
| "Read X before doing Y" | PreToolUse:Edit\|Write\|Agent checks transcript for the required Read before allowing the dependent tool call. |
| "Don't stop until Y is done" | Stop or SubagentStop reads a ledger / state file, denies stop when invariant fails. |
| "Don't dispatch shape Z here" | PreToolUse:Agent greps tool_input.prompt for the forbidden pattern. |
| "State file Z must satisfy invariant W" | PreToolUse:Write validates the JSON / markdown shape. |
| "Subagent return must follow shape S" | SubagentStop parses the handover envelope, exit-2-blocks non-compliant returns. |
| "After phase N, file F must exist" | PreToolUse:Agent denies advancing to phase N+1 when F is absent or stale. |
If none of these apply because the rule is genuinely unenforceable mechanically (e.g. "use the right level of detail in the brief", "be honest about uncertainty"), the SKILL.md edit MUST add a markdown-only tag to the relevant entry in coverage-expansion/references/anti-rationalizations.md so the registry continues to track the failure surface even without harness backing.
Why this is non-negotiable: every markdown-only methodology rule that survives a release is a future incident waiting to happen. The cost of writing the hook is hours; the cost of debugging a wrong-classification incident the rule was meant to prevent is days plus the operator trust the package is supposed to earn. The asymmetry is the rule.
Reference: skills/contributing-to-element-interactions/SKILL.md §"Workflow: adding a harness hook" (line 803 of this file) details the hook authoring patterns, test-case expectations, and scripts/postinstall.js registration. Read it before authoring any SKILL.md edit so the hook is designed alongside the rule rather than retro-fitted.
Before filing an issue or opening a PR — check existing work and sync status
Two duplicate-prevention checks are mandatory before creating any new GitHub issue or PR, plus a third for cross-package gaps. Skipping them wastes maintainer time and has produced duplicate issues / PRs against already-fixed code.
1. Search existing issues and PRs first. Both open AND closed — a closed issue often contains the resolution you need:
# Issues matching the topic
gh issue list --state all --search "<keyword>" --repo civitas-cerebrum/element-interactions
gh issue list --state all --search "<keyword>" --repo civitas-cerebrum/element-repository
# PRs matching the topic
gh pr list --state all --search "<keyword>" --repo civitas-cerebrum/element-interactions
gh pr list --state all --search "<keyword>" --repo civitas-cerebrum/element-repository
If a matching open issue/PR exists, comment on it — don't open a duplicate. If a matching closed one exists, read the resolution first; the fix may already be on main (see check #2).
2. Diff local vs. latest upstream before claiming a gap. "Missing API" / "this is broken" reports filed from stale local branches are the single largest source of false-positive issues. Before filing anything:
git fetch origin
git log --oneline HEAD..origin/main # commits you don't have locally
git diff HEAD origin/main -- src/ # source changes you're missing
If there are incoming commits, pull/rebase first, rebuild, and re-verify the gap still exists before filing.
3. For cross-package gaps, also check the published dependency version. When the report is "element-interactions doesn't expose X" but X is really an Element capability, the fix may already be in a newer element-repository release you simply haven't bumped to:
# Currently pinned version in this repo
grep -E '"@civitas-cerebrum/element-(repository|interactions)"' package.json
# Latest published version
npm view @civitas-cerebrum/element-repository version
npm view @civitas-cerebrum/element-interactions version
# All published versions (compare against your pinned version to see what has been released since)
npm view @civitas-cerebrum/element-repository versions --json
If the capability landed in a newer version, bump the dep and re-verify — don't file "missing" against an outdated pin.
Report the check results in the issue/PR body so maintainers don't have to redo them. One line each:
Searched existing issues/PRs: gh issue list / gh pr list — no matches for "<keyword>" in either repo.
Local vs. origin/main: in sync (or: rebased onto <sha> and re-verified).
Dependency version: element-repository pinned at 1.4.2; latest is 1.4.2.
Attribute issue reporters
Every commit and PR that closes a GitHub issue MUST credit the issue's author with a Reported-by: line in the commit body and the PR description.
The contract:
- The commit body that includes a Closes #N / Fixes #N / Resolves #N reference also includes: Reported-by: @<github-handle>
- Multi-reporter is fine: Reported-by: @contributor-a, @contributor-b.
- The PR description repeats the same attribution near the top, before the rest of the summary.
Why: issue-driven improvements are the load-bearing input that makes this package's methodology improve faster than any internal review process could. The minimum acknowledgement is a verifiable line in the commit body — it travels with the merge commit, survives squash-merge, surfaces in git log, and is mechanically detectable. Without it, the issue author's contribution silently disappears into the maintainer's PR description and the credit graph rots over time.
How to find the author:
gh issue view <N> --json author -q .author.login
# Multi-issue:
for n in 156 157; do gh issue view "$n" --json number,author --jq '[.number, .author.login] | @csv'; done
Self-reported / chore caveat. When the contributor is also the issue author, self-attribution is still appropriate — the audit trail is the value, not the social acknowledgement. For purely-chore commits with no upstream issue, the rule does not apply.
Harness backstop. The PreToolUse:Bash commit-attribution guardrail that previously surfaced missing Reported-by: attribution at commit time was retired in 0.3.6; the rule still applies and PR reviewers enforce it. The live hooks/commit-message-gate.sh checks commit-message conventions (type/scope/bypass flags) but does not check attribution trailers. (See harness-hooks.md.)
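Because the attribution line is mechanically detectable, a reviewer-side check can be small. The sketch below is an assumption about how such a check could look, not the retired guardrail's implementation; closedIssues and missingAttribution are hypothetical names, and the handle pattern is a simplification.

```typescript
// Collects issue numbers referenced with Closes / Fixes / Resolves (case-insensitive).
function closedIssues(message: string): number[] {
  const pattern = /\b(?:closes|fixes|resolves)\s+#(\d+)/gi;
  const issues: number[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(message)) !== null) {
    issues.push(Number(match[1]));
  }
  return issues;
}

// One or more @handles on a single "Reported-by:" line (simplified handle pattern).
const REPORTED_BY = /^Reported-by:\s*@[\w-]+(?:\s*,\s*@[\w-]+)*\s*$/m;

// Returns the closed issues that lack attribution; empty means the message is compliant.
function missingAttribution(commitMessage: string): number[] {
  const closed = closedIssues(commitMessage);
  if (closed.length === 0) return []; // no issue closed, so the rule does not apply
  return REPORTED_BY.test(commitMessage) ? [] : closed;
}
```

A message such as "fix: x" followed by "Fixes #156" with no Reported-by: line would be reported as missing attribution for issue 156, while a chore commit with no closing reference passes untouched.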
AI assistants don't get Co-Authored-By: trailers
Rule. Every commit's sole author is the human contributor. AI assistants (Claude, Anthropic, borealis.local, anything similar) MUST NOT appear as a Co-Authored-By: trailer in the commit body. Real-human co-author lines (Co-Authored-By: Jane Doe <jane@example.com>) are unaffected.
The Anthropic CLAUDE.md template appends Co-Authored-By: borealis.local … to every commit Claude generates — that is the single source of these trailers. The upstream fix is to remove the trailer instruction from your project CLAUDE.md or ~/.claude/CLAUDE.md so it stops being suggested.
No raw locator.*() in element-interactions src/
Every locator.click(), locator.fill(), locator.evaluate(), etc. that creeps into src/ is a regression. If you need a primitive Playwright doesn't expose through Element, add it to the Element interface in element-repository first.
The one exception: the WebElement constructor itself (new WebElement(locator)) is the boundary where a raw Locator legitimately enters. Everywhere else uses Element.
To audit:
grep -rn "locator\.\(click\|fill\|textContent\|inputValue\|getAttribute\|count\|evaluate\|isVisible\|isEnabled\|waitFor\|scrollIntoView\|hover\|check\|uncheck\|selectOption\|dispatchEvent\|boundingBox\|press\|setInputFiles\|screenshot\|dragTo\|clear\)" src/ --include="*.ts" | grep -v "dist/"
Should return zero results in user-facing call sites. The only allowed calls are in Element implementations themselves (which live in element-repository).
Action methods presence-detect
Every action on Element (click, fill, dragTo, ...) calls ensureAttached(timeout) first. When you add a new action to element-repository, follow the same pattern:
async myNewAction(options?: ElementActionOptions): Promise<Element> {
await this.ensureAttached(options?.timeout); // <-- mandatory
await this.locator.myUnderlyingCall({ timeout: options?.timeout });
return this;
}
This is what gives the framework predictable failure modes ("element never attached" instead of opaque driver errors) and makes Appium actions stable without depending on auto-wait.
Web-only methods only get cast at the call site, not aliased
If element-interactions needs selectOption (which is WebElement-only), the call site does the narrowing:
const element = toElement(target) as WebElement;
await element.selectOption(...);
Don't smuggle web-only methods onto Element with throw-stubs on PlatformElement. The cast makes the web-only intent explicit and keeps the cross-platform contract honest.
Maintain 100% API coverage
The CI gate requires 100% API coverage — every public method on Steps, ElementAction, Verifications, Interactions, Extractions, and the matcher classes must have at least one test that exercises it. The coverage tool (@civitas-cerebrum/test-coverage) introspects the public surface and fails the build if anything is uncovered.
When you add a new method:
- Add a passing test for it (even a one-liner against the Vue test app).
- Run npx test-coverage --format=github-plain locally to confirm 100%.
- The CI coverage job will fail otherwise.
In-package smoke tests must still verify — and the verification must be causally meaningful
100% API coverage is a floor, not a ceiling. The coverage tool only checks that every public method is called from at least one test — it doesn't check that the test asserts anything after calling it, let alone that the assertion proves the action did something.
Two levels of failure to avoid:
Level 1 — no assertion at all. A test like
test('hover()', async ({ steps }) => {
await steps.on('primaryButton', 'ButtonsPage').hover(); // ❌ no assertion
});
satisfies coverage but is indistinguishable from a no-op — it only catches thrown exceptions.
Level 2 — tautological assertion. Worse than no assertion, because it looks like coverage:
test('clickListedElement with regex alternation', async ({ steps }) => {
await steps.clickListedElement('rows', 'TablePage', { text: { regex: 'A|B|C' } });
await steps.verifyPresence('rows', 'TablePage'); // ❌ list was there before the click
});
test('hover', async ({ steps }) => {
await steps.on('btn', 'Page').hover();
await steps.on('btn', 'Page').verifyState('visible'); // ❌ it was visible to be hovered
});
test('fill', async ({ steps }) => {
await steps.fill('input', 'Page', 'hello');
await steps.verifyPresence('input', 'Page'); // ❌ inputs don't disappear when filled
});
These pass even if the action silently does nothing.
Rule: every test in tests/ must end with an assertion that would fail under a no-op. Ask yourself: "If the exercised method had been replaced with an empty function body, would this test still pass?" If yes, the assertion is tautological — rewrite it.
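The "empty function body" question can be made concrete with a toy model. The sketch below is only a reasoning aid and not a test to add to this repo (which runs everything against the real Vue test app): ToyInput, realFill, noOpFill, and the two checks are hypothetical in-memory stand-ins.

```typescript
// Toy in-memory stand-in for an input element; nothing here touches a browser.
type ToyInput = { present: boolean; value: string };
type Fill = (input: ToyInput, text: string) => void;

const realFill: Fill = (input, text) => {
  input.value = text;
};
const noOpFill: Fill = () => {}; // the "empty function body" from the rule

// Tautological: the input was present before the action, so this cannot fail.
const tautologicalCheck = (input: ToyInput): boolean => input.present;
// Causal: only true if the action really changed the value.
const causalCheck = (input: ToyInput): boolean => input.value === 'hello';

// Runs an action against a fresh input, then evaluates the chosen assertion.
function passesUnder(fill: Fill, check: (input: ToyInput) => boolean): boolean {
  const input: ToyInput = { present: true, value: '' };
  fill(input, 'hello');
  return check(input);
}
```

Here passesUnder(noOpFill, tautologicalCheck) is true, so the presence check cannot tell a working fill from a broken one, whereas passesUnder(noOpFill, causalCheck) is false and the regression is caught.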
Acceptable verification forms (ordered by strength):
- Direct effect on a feedback element: the action updates a resultText, status, stateSummary, selectedCount, etc. Verify that specific element's text/attribute.
- Navigation: click a listed element that navigates; verifyUrlContains(...) or verifyAbsence(...) on an element only present before the click.
- Extraction + assertion: expect(await steps.getInputValue(...)).toBe('filled') for fill; expect(cellText).toMatch(/pattern/) for regex filters.
- State-change verification: verifyState('checked') after check(), verifyState('disabled') after a submit that disables the button, etc.
- Fallback: verifyState('visible') or verifyPresence(...) on the target is acceptable ONLY when (a) the method has genuinely no observable side-effect at any layer, and (b) a one-line comment explains why. Framework-only smoke cases qualify; feature tests do not.
When reviewing a PR:
- grep the diff (extended regex, grep -E) for await steps.*\.(click|fill|drag|hover|check|uncheck|type|upload|setSliderValue|scrollIntoView|rightClick|doubleClick|clickListedElement)\( as the last line of a test body. Every hit is a missing assertion (Level 1).
- For every verifyPresence / verifyState('visible') / verifyState('enabled') added in the diff, ask whether the element was in that state before the action. If yes, it's a tautology (Level 2). The fix is usually to reach for a feedback element (resultText, status, etc.) instead.
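The Level 1 check can also be expressed as a small helper for review scripts. This is a sketch under assumptions: endsWithBareAction is a hypothetical name, the method list is copied from the grep pattern above, and "last line" is approximated as the last non-empty, non-comment line of the body text passed in.

```typescript
// Same method list as the review grep above, written as a JavaScript regex.
const ACTION_CALL =
  /await steps.*\.(click|fill|drag|hover|check|uncheck|type|upload|setSliderValue|scrollIntoView|rightClick|doubleClick|clickListedElement)\(/;

// True when the last meaningful line of a test body is an action with nothing asserted after it.
function endsWithBareAction(testBody: string): boolean {
  const lines = testBody
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && !line.startsWith('//'));
  const last = lines[lines.length - 1];
  return last !== undefined && ACTION_CALL.test(last);
}
```

A body ending in await steps.on('primaryButton', 'ButtonsPage').hover(); is flagged, while one ending in an expect(...) on an extracted value is not. The helper cannot detect Level 2 tautologies; those still need the manual question from the second bullet.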
No mocked unit tests
Every test in this repo runs against the real Vue test app at https://civitas-cerebrum.github.io/vue-test-app/ via Playwright. We do not use mocked locators / mock Steps / spy fixtures.
Reason: the framework is a Playwright facade. Mocked tests would only verify that we wire up Playwright "correctly" — but Playwright's actual behavior is what users care about. End-to-end tests catch real regressions; mocks don't.
When adding tests, place them in tests/ and use the existing StepFixture import pattern:
import { test, expect } from './fixture/StepFixture';
test('new feature', async ({ steps }) => {
await steps.navigateTo('/');
// ... real interactions against the live app
});
📐 Design rules — invariants that must stay consistent
These are the contracts that hold the framework together. Every change must respect them. If a change requires breaking one, that's a major-version-bump conversation, not a casual PR.
1. Argument order — (elementName, pageName, ...rest) everywhere
Every method that targets a named element starts with elementName, pageName. No exceptions, no historical accidents.
steps.click('submit-button', 'CheckoutPage');
steps.verifyText('summary', 'CartPage', 'Total: $42');
steps.expect('price', 'ProductPage').text.toBe('$19');
steps.on('row', 'TablePage').nth(2).text.toBe('Active');
repo.get('submit-button', 'CheckoutPage');
repo.getByText('option', 'DropdownPage', 'United States');
Adding a method that flips this (e.g. (pageName, elementName)) is a hard rejection in review.
2. Async-everywhere
Every public method that reaches the DOM/driver is async. No synchronous element accessors. If you find yourself wanting a sync getter, you're doing something wrong (the only sync exception is repo.getSelector() which returns a string, not an element).
3. Chain-style for assertions, flat for actions
- Assertions extend the matcher tree (steps.expect(el, page).field.matcher(value)). New assertions add to ExpectMatchers.ts, not new flat verifyX on Steps.
- Actions stay flat on Steps (steps.click, steps.fill, steps.dragAndDrop). Composite workflows (steps.fillForm, steps.retryUntil) stay flat too.
Every element-scoped verify* is exposed in two forms that share one implementation:
- Fluent form on ElementAction: steps.on(el, page).verifyX(...). This is the canonical implementation. Each method is either a thin wrapper over the matcher tree (for verifyPresence, verifyText, verifyTextContains, verifyCount, verifyAttribute, verifyInputValue, verifyCssProperty) or a direct call into Verifications where a specialized fast path is needed (verifyAbsence via toBeHidden, verifyState, verifyImages, verifyOrder, verifyListOrder).
- Standalone form on Steps: steps.verifyX(el, page, ...). This is a thin delegate that constructs the fluent builder via actionWithStrategy(...) and calls the matching ElementAction.verifyX(...). One implementation, two entry points.
This is the invariant to preserve when adding a new verification:
- Add the method on ElementAction (or grow Verifications first if the underlying primitive doesn't exist).
- Add the matching standalone method on Steps that delegates via this.actionWithStrategy(elementName, pageName, options).verifyX(...). Keep the logging on the Steps side so tester:verify output stays consistent.
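The "one implementation, two entry points" shape might look like the following sketch. It assumes heavily simplified classes: VerificationsSketch, FluentActionSketch, and StepsSketch are hypothetical stand-ins, the options argument is omitted, and the real Verifications would resolve the element and poll rather than record a string.

```typescript
// Hypothetical stand-ins: only the delegation shape comes from the text above.
class VerificationsSketch {
  readonly calls: string[] = [];

  async text(elementName: string, pageName: string, expected: string): Promise<void> {
    // The real class would resolve the element and poll; here we only record the call.
    this.calls.push(`${pageName}.${elementName} text "${expected}"`);
  }
}

class FluentActionSketch {
  private readonly verify: VerificationsSketch;
  private readonly elementName: string;
  private readonly pageName: string;

  constructor(verify: VerificationsSketch, elementName: string, pageName: string) {
    this.verify = verify;
    this.elementName = elementName;
    this.pageName = pageName;
  }

  // Canonical implementation: the fluent form owns the call into Verifications.
  async verifyText(expected: string): Promise<void> {
    await this.verify.text(this.elementName, this.pageName, expected);
  }
}

class StepsSketch {
  readonly log: string[] = [];
  private readonly verify: VerificationsSketch;

  constructor(verify: VerificationsSketch) {
    this.verify = verify;
  }

  on(elementName: string, pageName: string): FluentActionSketch {
    return this.actionWithStrategy(elementName, pageName);
  }

  // Standalone form: log on the Steps side, then delegate to the fluent method.
  async verifyText(elementName: string, pageName: string, expected: string): Promise<void> {
    this.log.push(`verifyText ${pageName}.${elementName}`);
    await this.actionWithStrategy(elementName, pageName).verifyText(expected);
  }

  private actionWithStrategy(elementName: string, pageName: string): FluentActionSketch {
    return new FluentActionSketch(this.verify, elementName, pageName);
  }
}
```

In this sketch, steps.verifyText('summary', 'CartPage', ...) and steps.on('summary', 'CartPage').verifyText(...) reach the same VerificationsSketch.text call, so a fix there lands in both entry points.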
A handful of verifications only make sense as page-level or filter-then-match shapes and only exist on Steps:
- verifyUrlContains, verifyTabCount: page-level, not element-scoped; the tree starts at an element.
- verifyListedElement: filter-then-match; the fluent tree operates on a single resolved element.
The matcher tree (.text.toBe, .count.toBeGreaterThan, etc.) remains the place to grow new assertion shapes — chainable negation, regex, substring, custom predicates, etc. When a matcher-tree shape lands that subsumes an existing verify* form, don't deprecate the verify*; the two coexist as equally valid entry points.
3a. Implementation lives in the Interactions / Verifications / Extractions layer. Everything else is a facade.
The single source of truth for assertion behavior — retry mechanics, web-first polling, error formatting, negation, timeout handling — is the Verifications class. For actions, it's Interactions. For reads, Extractions.
All user-facing layers are dispatch-only and must ultimately call into the appropriate interaction class:
Steps.verifyText(el, page, ...) ──┐
Steps.expect(el, page).text.toBe(...) │
ElementAction.verifyText(...) ├─► Verifications.text(target, expected, options)
ElementAction.text.toBe(...) │ ↑ one implementation, one codepath
interactions.verify.text(locator, ...) ──┘
The rule for new assertions:
- If Verifications can do what you need, add a matcher in ExpectMatchers.ts that delegates to it (2–3 lines, e.g. return this.ctx.verify.X(target, ..., opts)).
- If Verifications can't do what you need, add a method to Verifications first. Implementation goes there. Then add the matcher that delegates.
- Never reimplement assertion logic in the matcher tree (snapshot-capture + predicate polling + custom retry). The exception is .satisfy(predicate): the predicate escape hatch legitimately needs a snapshot-based poll because user lambdas run against plain data, not against a live element.
The rule for new actions:
- Same shape: Steps.X and ElementAction.X both delegate into Interactions.X. Never write click/fill/hover logic directly on Steps.
Why this matters:
- One bug fix propagates everywhere. Fix Playwright's web-first assertion handling in one place, every entry point benefits.
- Error messages stay consistent because describeFailure-style messages are threaded as errorMessage into the single implementation, which embeds them via Playwright's expect(locator, message) overload.
- The raw interactions.verify.X / interactions.interact.X public API (documented as the escape hatch for users with custom locators) is never out of sync with the matcher-tree / Steps behavior.
- Adding a new matcher is cheap: write a one-liner in the tree, add one method to Verifications (which is itself a thin Playwright wrapper).
Helper pattern the matcher tree uses:
// Matcher method — 2-line dispatch
toBe(expected: string): ExpectBuilder {
return this.builder.enqueue(this.ctx, (entry) =>
runWithElement(entry.ctx,
el => entry.ctx.verify.text(el, expected, this.msgOpts(entry.ctx, 'text', 'to be', expected)),
entry.messageOverride));
}
runWithElement handles the ifVisible gate + resolves the Element. this.msgOpts builds the { negated, timeout, errorMessage } shape every Verifications method accepts. Verifications does the actual work.
Audit grep: if you find yourself writing retry loops, snapshot capture, or Playwright expect(locator)... calls outside of Verifications / Interactions / Extractions, stop. It probably belongs in one of those classes instead.
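A minimal sketch of the delegation rule above, under simplifying assumptions: every class here is a hypothetical stand-in, targets are plain strings rather than `Element` instances, and the methods are synchronous so the sketch runs without Playwright (the real methods are async). It only illustrates two entry points dispatching into one implementation.

```typescript
// Hypothetical, simplified stand-ins for illustration only.
interface SketchVerifyOptions {
  readonly negated?: boolean;
  readonly errorMessage?: string;
}

// The single implementation: comparison, negation and error text live here.
class SketchVerifications {
  readonly calls: string[] = [];

  text(actual: string, expected: string, options: SketchVerifyOptions = {}): void {
    this.calls.push("text:" + expected);
    const matches = actual === expected;
    // Fail when the outcome contradicts the (possibly negated) expectation.
    if (matches === Boolean(options.negated)) {
      throw new Error(options.errorMessage ?? "text check failed, received " + actual);
    }
  }
}

// Facade 1: flat verify* method, dispatch only.
class SketchSteps {
  constructor(private readonly verify: SketchVerifications) {}

  verifyText(actual: string, expected: string, options?: SketchVerifyOptions): void {
    this.verify.text(actual, expected, options);
  }
}

// Facade 2: matcher-style entry point, also dispatch only.
class SketchTextMatcher {
  constructor(
    private readonly verify: SketchVerifications,
    private readonly actual: string,
  ) {}

  toBe(expected: string, options?: SketchVerifyOptions): void {
    this.verify.text(this.actual, expected, options);
  }
}

const sharedVerifications = new SketchVerifications();
new SketchSteps(sharedVerifications).verifyText("$19.99", "$19.99");
new SketchTextMatcher(sharedVerifications, "$19.99").toBe("$19.99");
```

Because neither facade contains comparison logic, a fix to `SketchVerifications.text` would reach both entry points at once, which is the property the rule is protecting.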
4. One-shot semantics for .not
.not flips the next matcher only, then resets. Don't introduce sticky-negation modes or multi-matcher negation scopes; it confuses reading. Both steps.expect('el', 'Page').not.text.toBe('x') and steps.expect('el', 'Page').text.not.toBe('x') produce the same single-call negation.
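The one-shot behaviour can be sketched with a toy builder. `NegationBuilder` and `QueuedCheck` are hypothetical names, and only the negation bookkeeping is modelled; nothing here touches a page.

```typescript
// Hypothetical minimal builder: only the negation bookkeeping is modelled.
interface QueuedCheck {
  readonly matcher: string;
  readonly negated: boolean;
}

class NegationBuilder {
  readonly queue: QueuedCheck[] = [];
  private pendingNot = false;

  // `.not` arms negation for the next matcher only.
  get not(): this {
    this.pendingNot = true;
    return this;
  }

  toBe(expected: string): this {
    return this.enqueue("toBe(" + expected + ")");
  }

  toContain(expected: string): this {
    return this.enqueue("toContain(" + expected + ")");
  }

  private enqueue(matcher: string): this {
    // Consume the flag, then reset so the following matcher is positive again.
    this.queue.push({ matcher, negated: this.pendingNot });
    this.pendingNot = false;
    return this;
  }
}

const negationChain = new NegationBuilder().not.toBe("x").toContain("y");
```

In this sketch the first queued check is negated and the second is not, so a reader never has to track a negation scope beyond the next matcher.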
5. One timeout, uniform mutation
A single chain-level timeout var is the source of truth across the whole chain:
Steps.timeout (fixture) → ElementAction._timeout → ExpectContext.timeout → VerifyOptions.timeout (threaded into Verifications)
.timeout(ms) mutates at every layer it appears — no cloning, no divergent semantics:
- `ElementAction.timeout(ms)` mutates `_timeout`; `.text`, `.count`, etc. getters rebuild the `ExpectContext` with the new value.
- `ExpectBuilder.timeout(ms)` mutates `ctx.timeout` and retroactively patches the last queued assertion (so `.satisfy(pred).timeout(500)` applies 500ms to that predicate).
- Matcher `.timeout(ms)` (e.g. `.text.timeout(500)`) mutates its own ctx AND propagates to the builder for subsequent matchers — but does NOT retroactively patch a prior matcher's queued entry.
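A toy model of the builder-level behaviour only (mutate in place, then patch the last queued entry). `TimeoutBuilder` and `TimedEntry` are hypothetical names, the 30000ms starting value is an assumption, and the matcher-level variant that does not patch earlier entries is not modelled.

```typescript
// Hypothetical toy builder: models only the builder-level `.timeout(ms)`.
interface TimedEntry {
  matcher: string;
  timeout: number;
}

class TimeoutBuilder {
  readonly queue: TimedEntry[] = [];

  constructor(private currentTimeout: number) {}

  // Queues a matcher with whatever timeout is current at this point in the chain.
  satisfy(label: string): this {
    this.queue.push({ matcher: label, timeout: this.currentTimeout });
    return this;
  }

  // Mutates the chain timeout and retroactively patches the last queued entry.
  timeout(ms: number): this {
    this.currentTimeout = ms;
    const last = this.queue[this.queue.length - 1];
    if (last !== undefined) {
      last.timeout = ms;
    }
    return this;
  }
}

const timedChain = new TimeoutBuilder(30000) // assumed starting timeout
  .satisfy("first")
  .satisfy("second")
  .timeout(500)
  .satisfy("third");
```

Here "first" keeps the starting timeout, "second" is patched to 500 because it was the last queued entry when `.timeout(500)` ran, and "third" inherits 500 because the mutation persists for the rest of the chain.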
Scope — what .timeout(ms) affects:
- Every verification/matcher (`.text.toBe`, `.count.toBeGreaterThan`, `.satisfy(pred)`, `.verifyText`, `.verifyCount`, etc.).
- Element-routed actions that go through `element.action(this._timeout).X()` on `ElementAction` — `hover`, `fill`, `check`, `uncheck`, `doubleClick`, `typeSequentially`, `clearInput`, `scrollIntoView`, `getText`, `getAttribute`, `getCount`, `getInputValue`.
- Interactions-routed actions — `click`, `clickIfPresent`, `rightClick`, `uploadFile`, `dragAndDrop`, `selectDropdown`, `setSliderValue`, `selectMultiple`. `ElementAction` passes `this._timeout` through the option bag of each `interactions.interact.*` call, which then uses it for both the pre-action `Utils.waitForState(...)` and the Playwright primitive (`element.click({ timeout })`, etc.).
When adding a new Interactions-routed action, extend its option bag with timeout?: number (or accept an ActionTimeoutOptions parameter for modifier-free methods) and plumb it to the same two places — pre-wait and primitive. The ElementAction call site passes { timeout: this._timeout } into the bag.
Repo resolution has its own timeout. repo.get(...) pays ElementRepository.defaultTimeout (configured by repoTimeout on the fixture, 15000ms default) waiting for the element to reach attached. This is upstream of ElementAction._timeout — the chain-level .timeout(ms) only governs action + verification, not resolution. If you need to bound resolution too, use repo.setDefaultTimeout(ms) on the fixture or in a beforeEach.
Visibility probe/gate is another deliberate exception. isVisible(options?) (the unified replacement for the old ifVisible() / boolean isVisible() pair) and its older aliases use a short visibilityTimeout (default 2000ms) because their whole purpose is fast-skip: a hidden element should abort the action in ~2s, not 30s. Do not unify it into the main timeout.
isVisible(options?) returns a VisibleChain that is both awaitable (Promise<boolean>) and chainable (.click(), .text.toBe(...), etc.). The probe constructs a WebElement directly from repo.getSelector(...) rather than going through repo.get(...) — otherwise the 15s repository-resolution wait would swallow the caller's short timeout. Every probe and gate decision is logged under tester:visible with a [probe] or [gate] tag.
Other builder state (queue, pendingNot) also mutates, but stays scoped: each .expect() / .on() call returns a fresh builder, so mutation doesn't leak across chains. .not is one-shot — it flips the next matcher only, then resets.
6. Snapshot-based predicates
The predicate escape hatch (steps.expect(el, page).satisfy(predicate)) takes a function that receives an ElementSnapshot — plain data, no async access. This keeps custom assertions readable and predictable.
// ✓
await steps.expect('price', 'Page').satisfy(el => parseFloat(el.text.slice(1)) > 10);
// ✗ Never change to this — users would need to await inside the predicate
await steps.expect('price', 'Page').satisfy(async el => (await el.getText()) === '$10');
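To illustrate why synchronous predicates over plain data stay easy to reason about, here is a small sketch. `PriceSnapshot` is a hypothetical, trimmed-down stand-in for `ElementSnapshot` (only the `text` field is taken from the example above), and `firstPassingAttempt` replays already-captured snapshots instead of polling a live page.

```typescript
// Hypothetical, trimmed-down snapshot: only `text` comes from the example above.
interface PriceSnapshot {
  readonly text: string;
}

type SnapshotPredicate = (snapshot: PriceSnapshot) => boolean;

// Same predicate as the accepted example: strip the currency sign, compare the number.
const costsMoreThanTen: SnapshotPredicate = (el) => parseFloat(el.text.slice(1)) > 10;

// Replays captured snapshots in order and returns the index of the first pass, or -1.
function firstPassingAttempt(
  attempts: readonly PriceSnapshot[],
  predicate: SnapshotPredicate,
): number {
  for (let index = 0; index < attempts.length; index += 1) {
    const snapshot = attempts[index];
    if (snapshot !== undefined && predicate(snapshot)) {
      return index;
    }
  }
  return -1;
}

const capturedAttempts: readonly PriceSnapshot[] = [{ text: "$5.00" }, { text: "$12.50" }];
```

The predicate never awaits anything, so it can be unit-tested against literal objects like `capturedAttempts` without a browser.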
7. Naming conventions
| Prefix | Returns | Behavior on failure |
|---|---|---|
| `verify*` | `Promise<void>` | Throws |
| `expect(...)...` (matcher tree) | thenable that throws on failure | Throws on failure |
| `is*` | `Promise<boolean>` | Returns `false` (never throws) |
| `get*` | `Promise<value>` | Throws if element not found |
| `wait*` | `Promise<void>` | Throws on timeout |
| `click*` / `fill*` / `hover*` etc. | `Promise<void>` (or `Promise<boolean>` for the `IfPresent` variants) | Throws on failure |
If your new method doesn't fit one of these, reconsider the shape — the naming is the API contract.
8. Public API stability
steps.click, steps.verifyText, steps.on(...).fill, the matcher tree shape — all the entry points users have written tests against — stay stable across patch and minor versions. Internal refactors are fine; signature changes on user-facing methods need a major bump and a clear migration note in the PR description.
The public Target type on Interactions, Verifications, Extractions, and Utils is Element (no Locator union). Consumers with custom Playwright locators wrap them via new WebElement(locator) at the seam — that's the single documented bridging point.
9. Action methods presence-detect
Every action on Element (click, fill, hover, dragTo, ...) calls ensureAttached(timeout) first. New action methods MUST do the same. This is what gives the framework predictable failure modes ("element not attached" instead of opaque driver errors) and stable Appium behavior.
async myNewAction(options?: ElementActionOptions): Promise<Element> {
await this.ensureAttached(options?.timeout); // mandatory
await this.locator.myUnderlyingCall({ timeout: options?.timeout });
return this;
}
10. No raw locator.*() in element-interactions src/
Every locator.click(), locator.fill(), locator.evaluate(), etc. that creeps into src/ is a regression. If you need a Playwright primitive that Element doesn't yet expose, add it to the Element interface in element-repository first.
The one exception: the WebElement constructor itself (new WebElement(locator)) is the boundary where a raw Locator legitimately enters. Everywhere else uses Element.
To audit:
grep -rn "locator\.\(click\|fill\|textContent\|inputValue\|getAttribute\|count\|evaluate\|isVisible\|isEnabled\|waitFor\|scrollIntoView\|hover\|check\|uncheck\|selectOption\|dispatchEvent\|boundingBox\|press\|setInputFiles\|screenshot\|dragTo\|clear\)" src/ --include="*.ts" | grep -v "dist/"
Should return zero results in user-facing call sites.
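The same audit can be sketched as a small scanner, for example as the core of a hypothetical lint script. The regular expression below covers only a subset of the method names in the grep pattern and is slightly stricter (it requires a word boundary and an opening parenthesis); `findRawLocatorCalls` and `sampleSource` are illustrative names, not part of the package.

```typescript
// Subset of the method names from the grep pattern above (illustrative, not exhaustive).
const RAW_LOCATOR_CALL = /\blocator\.(click|fill|evaluate|hover|waitFor|isVisible)\(/;

// Returns the 1-based line numbers that contain a raw locator call.
function findRawLocatorCalls(source: string): number[] {
  return source
    .split("\n")
    .map((line, index) => (RAW_LOCATOR_CALL.test(line) ? index + 1 : 0))
    .filter((lineNumber) => lineNumber > 0);
}

const sampleSource = [
  "const element = new WebElement(locator);", // allowed: the documented boundary
  "await element.click();", // allowed: goes through Element
  "await this.locator.click({ timeout });", // flagged: raw locator call
].join("\n");
```

On `sampleSource` only the third line is reported; the `new WebElement(locator)` boundary and the `Element` call pass.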
11. Web-only methods only get cast at the call site
If element-interactions needs selectOption (which is WebElement-only), the call site does the narrowing:
const element = toElement(target) as WebElement;
await element.selectOption(...);
Don't smuggle web-only methods onto Element with throw-stubs on PlatformElement. The cast makes the web-only intent explicit at the call site and keeps the cross-platform contract honest.
12. Error message format
User-facing assertion failures follow a consistent header format:
expected <PageName>.<elementName> <field> [not ]<verb> <expected>
Examples:
- `expected ProductPage.price text to be "$19.99"`
- `expected CheckoutPage.submitBtn count not to be 5`
The actual value comes from Playwright's built-in "Expected / Received" diff block appended below the header — we pass the header string as the message argument to expect(locator, message).<matcher>(), and Playwright prepends it to its own assertion output. Don't hand-roll the got <actual> suffix — it'll duplicate what Playwright already emits.
Use the BaseMatcher.msgOpts(ctx, field, verb, expected) helper in ExpectMatchers.ts — it builds { negated, timeout, errorMessage } in the exact shape every Verifications method accepts. Don't hand-roll error strings.
For predicate failures (satisfy(pred)), the path is different — we poll a snapshot manually, so there's no Playwright diff block. The message includes the full ElementSnapshot JSON pretty-printed under the header. Don't truncate or summarize the snapshot — users debug from it.
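As an illustration of the header format only, here is a standalone formatter. It is a sketch: `formatHeader` and `HeaderParts` are hypothetical names, the quoting rule (strings quoted, numbers bare) is inferred from the two examples above, and real code should keep going through `BaseMatcher.msgOpts`.

```typescript
// Hypothetical input shape for the illustration.
interface HeaderParts {
  readonly pageName: string;
  readonly elementName: string;
  readonly field: string;
  readonly verb: string;
  readonly expected: string | number;
  readonly negated?: boolean;
}

// Builds: expected <PageName>.<elementName> <field> [not ]<verb> <expected>
function formatHeader(parts: HeaderParts): string {
  // Assumption inferred from the examples: strings are quoted, numbers are printed bare.
  const expected =
    typeof parts.expected === "string" ? JSON.stringify(parts.expected) : String(parts.expected);
  const verb = parts.negated ? "not " + parts.verb : parts.verb;
  return (
    "expected " + parts.pageName + "." + parts.elementName + " " + parts.field + " " + verb + " " + expected
  );
}

const priceHeader = formatHeader({
  pageName: "ProductPage",
  elementName: "price",
  field: "text",
  verb: "to be",
  expected: "$19.99",
});
```

Note that the formatter stops at the header: the received value is left to Playwright's own diff block, as described above.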
13. Logging
Every public method on Steps logs at one of: tester:navigate, tester:interact, tester:verify, tester:extract, tester:wait, tester:email. The category mirrors the operation kind. Use the existing log.X(...) helpers in CommonSteps.ts rather than console.log.
14. TypeScript discipline
- No `any` in `src/`. Test fixtures are exempted (the Playwright fixture types are awkward to spell exactly).
- Prefer interfaces over type aliases for public surfaces. `ExpectContext`, `ElementSnapshot`, `QueuedAssertion` are interfaces.
- Use `readonly` on snapshot/data interface fields. Mutable internal state is fine on classes; data passing between layers should be readonly.
- Use `as const` for matcher verb strings and similar string literals when they need narrow types.
- Avoid `as unknown as X` double-casts. If you need one, the type model is wrong somewhere — refactor.
15. No version bumps without explicit authorisation
Don't run npm version <X>. Don't edit package.json's version field. Don't push a tag. Versioning is release-time, not per-PR. The user controls when bumps happen.
A contributor (or an agent acting for one) may bump only when the user has explicitly authorised that specific bump in the conversation. The authorisation must be visible in the conversation thread — auditable in context, copy-pasteable from the user's authorising message.
npm version patch --no-git-tag-version
npm version 0.4.0
Why. Per-PR bumping causes version-number collisions when PRs merge out of order, reviewer cognitive cost from a version line in every diff, and rebase churn that has nothing to do with the actual change. Release-time bumping collapses every PR diff to "the actual change" and keeps release control with the maintainer.
Why bump against npm-latest, not package.json:
When multiple PRs are open in parallel, every branch bumping current+1 from its own diverged base produces version collisions on merge — two branches off the same base both bump to the same next version, the second to merge clobbers or duplicates the first's published version. Bumping against npm-latest collapses every open branch to a known monotonic ceiling: the first PR to merge sets the new published version, and subsequent PRs rebase + re-bump against the new ceiling. No collisions, no manual reconciliation in CI.
Edge case — npm view fails (no network, package not yet published). Fall back to bumping against the current package.json value (the old recipe) and call out the deviation in the PR description so the reviewer can spot-check for collision against any other open PR.
For minor/major bumps, same rule: bump once, at the start, against (npm-latest + 1 minor/major).
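For an authorised patch bump, the "npm-latest + 1" arithmetic amounts to the following. This is a sketch with a hypothetical helper name (`nextPatchVersion`); it assumes a plain `MAJOR.MINOR.PATCH` string with no pre-release suffix, and it takes the published version as an argument rather than calling `npm view` itself.

```typescript
// Hypothetical helper: computes the next patch version from the published one.
function nextPatchVersion(published: string): string {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(published.trim());
  if (match === null) {
    // Pre-release tags and partial versions are out of scope for this sketch.
    throw new Error("not a plain MAJOR.MINOR.PATCH version: " + published);
  }
  const patch = Number(match[3]) + 1;
  return match[1] + "." + match[2] + "." + patch;
}

const bumpedFromPublished = nextPatchVersion("0.3.6");
```

Every open branch that computes its bump from the same published version gets the same answer, which is what makes the ceiling monotonic; the second branch to merge then re-runs the computation against the new published version.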
16. Tests hit the real Vue test app
No mocks, no spies, no fake locators. Every test in tests/ runs against https://civitas-cerebrum.github.io/vue-test-app/ via Playwright. The framework's value is its Playwright wiring — mocks would only verify wiring against itself.
17. 100% API coverage is a CI gate
Every public method on Steps, ElementAction, Verifications, Interactions, Extractions, and the matcher classes must have at least one test that exercises it. The coverage tool (@civitas-cerebrum/test-coverage) introspects the public surface and fails the build if anything is uncovered. New methods need new tests.
18. Keep Steps lightweight — fewer methods, more flexibility
Steps is the user-facing facade. It is a dispatch surface, not an implementation surface. The implementation layers — Interactions, Verifications, Extractions — should grow many small specialized methods (localStorage, localStorageContains, localStorageMatches, localStoragePresent). Steps should grow as few methods as possible, each accepting a flexible options shape that selects between the underlying variants.
Why this split exists:
- Grep-ability for users. A user reads a test and asks "what assertions exist for X?" — finding one `verifyX(key, options)` plus typed options is faster than scanning five sibling methods.
- Discoverability via TypeScript. A discriminated-union options type (e.g. `StorageVerifyOptions`) gives autocomplete the matcher names without forcing the user to recall five method suffixes.
- Refactor blast radius. Adding a new matcher variant means adding one method to `Verifications` and one branch to a Steps dispatcher — not a full new public method on `Steps` (with logging, doc block, coverage test, surface-area churn).
- Cognitive load on the API surface. Every method on `Steps` is a thing a user can call. The API budget is finite; spend it on distinct resources (an element, the URL, page HTML, browser storage), not on every variant of how to assert against them.
The rule:
When you add a new family of related verifications/extractions on Steps, the default shape is one method per resource, accepting a discriminated-union options type that picks the matcher.
✓ DO:
// One Steps method, four matchers selected via discriminated union.
type StorageVerifyOptions =
| { equals: string; contains?: never; matches?: never; present?: never; ... }
| { equals?: never; contains: string; matches?: never; present?: never; ... }
| { equals?: never; contains?: never; matches: RegExp; present?: never; ... }
| { equals?: never; contains?: never; matches?: never; present: boolean; ... };
async verifyLocalStorage(key: string, options: StorageVerifyOptions): Promise<void> {
// Dispatch to verify.localStorage / localStorageContains / localStorageMatches / localStoragePresent.
}
✗ DON'T:
// Four separate Steps methods — bloats the surface, splits docs, splits log lines.
async verifyLocalStorage(key, expected, options?) { ... }
async verifyLocalStorageContains(key, substring, options?) { ... }
async verifyLocalStorageMatches(key, regex, options?) { ... }
async verifyLocalStoragePresent(key, options?) { ... }
Variety still belongs on Interactions / Verifications / Extractions. Those classes are the implementation. They take concrete arguments and have concrete shapes — one method per matcher is the right granularity there because each method maps to a single Playwright primitive (e.g. expect.toHaveText vs expect.toContainText vs expect.toMatch). Don't try to merge Verifications.localStorage and Verifications.localStorageContains into one — the implementation layer benefits from specialization.
Existing technical debt. Several legacy families on Steps do have multiple methods per resource (verifyText / verifyTextContains / verifyTextMatches, verifyHtml / verifyHtmlContains / etc.). These predate this rule. Don't refactor them in the same PR that adds new work — that's a separate cleanup. But every new family must follow this rule. When in doubt: one Steps method, dispatch via options.
Exception: matcher tree. The matcher tree (steps.expect(el, page).text.toBe(...)) is itself the flexible-shape API — the chained matchers play the role that an options-union plays for flat methods. So .text.toBe / .text.toContain / .text.toMatch are correct on the matcher tree. The rule applies to flat verifyX methods on Steps, not to the chain.
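A runnable sketch of the one-method-per-resource dispatch, under stated assumptions: the extra option fields elided with `...` in the example above are left out, an in-memory object stands in for browser storage, and `checkStorage` returns a boolean instead of throwing so it can run without Playwright. All names other than `StorageVerifyOptions` are hypothetical.

```typescript
// Simplified version of the union above (elided extra fields are omitted).
type StorageVerifyOptions =
  | { equals: string; contains?: never; matches?: never; present?: never }
  | { equals?: never; contains: string; matches?: never; present?: never }
  | { equals?: never; contains?: never; matches: RegExp; present?: never }
  | { equals?: never; contains?: never; matches?: never; present: boolean };

// In-memory stand-in for browser storage so the sketch runs anywhere.
const fakeStorage: { readonly [key: string]: string | undefined } = { cart: "apple,pear" };

// One facade function; the key present in `options` selects the variant.
function checkStorage(key: string, options: StorageVerifyOptions): boolean {
  const value = fakeStorage[key];
  if (options.present !== undefined) {
    return (value !== undefined) === options.present;
  }
  if (value === undefined) {
    return false; // every remaining variant needs a stored value
  }
  if (options.equals !== undefined) {
    return value === options.equals;
  }
  if (options.contains !== undefined) {
    return value.indexOf(options.contains) !== -1;
  }
  if (options.matches !== undefined) {
    return options.matches.test(value);
  }
  throw new Error("no matcher selected");
}

const cartContainsPear = checkStorage("cart", { contains: "pear" });
```

The `?: never` members are what make combinations such as `{ equals: "a", contains: "b" }` a compile-time error, so the dispatcher only ever sees one selected variant.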
19. Doc updates are mandatory for new public API
Any PR that adds a new public method to Steps, ElementAction, the matcher tree, or a new public matcher class must update both of:
- `README.md` — under the relevant `🛠️ API Reference: Steps` subsection (Interaction / Verification / Data Extraction / Visibility / Listed Elements / etc.). One bullet per new method, plus an inline code example block when the API has a non-obvious option shape (e.g. discriminated unions, multi-form matchers).
- `skills/element-interactions/references/api-reference.md` — under the matching section. The api-reference is the canonical documentation consumed by other skills (test-composer, coverage-expansion, bug-discovery), so missing entries here cause downstream agents to write tests that drop out of the framework.
No "headline-worthy" exception. The previous version of this rule allowed README updates only for headline-worthy features and produced silent doc drift — the HTML extraction surface (commit d2f200e) shipped without a README entry. If the change adds a method a user can call from a test, both files get an entry. The PR description should quote the new bullets verbatim so reviewers can grep them.
Internal-only changes don't trigger this rule. Adding a method to Verifications, Interactions, or Extractions without a corresponding Steps / ElementAction / matcher-tree entry point is internal — it's reachable only from the raw escape hatch (interactions.verify.X). The README docs the recommended surface; raw escape-hatch methods are documented inline via JSDoc on the class.
Skill file updates (skills/element-interactions/SKILL.md, skills/contributing-to-element-interactions/SKILL.md, etc.) are required only when the change affects a workflow stage, the contribution rules, or a hard rule. A new verify* method does not normally require a SKILL.md change.
📝 Contribution Handover
Every PR against this repo must produce a populated .contribution-handover.json at the repo root before push. The handover captures one boolean per guardrail in this skill, plus a small set of free-form fields (PR title, summary, version delta).
The schema lives at schemas/contribution-handover.schema.json. A blank template lives at .contribution-handover.template.json. Copy the template, fill it in, and run the gate at push time. The file is gitignored — DO NOT commit it. Carrying a previous PR's handover into a new branch is the failure mode the gate exists to catch (each PR's claims must reflect that PR's actual contents, not whatever the prior handover said).
A companion PreToolUse:Bash push/PR gate used to intercept git push origin and gh pr create while the handover was missing, malformed, or had unset booleans. That gate was retired in 0.3.6, but the rule still applies: populate and self-validate the handover before pushing; PR reviewers enforce it. See harness-hooks.md.
Why a handover, not just a checklist:
- Structured booleans are machine-checkable. The gate spot-verifies a subset of claims against the actual repo state (e.g. `readmeUpdated: true` is cross-checked against the README diff vs. `origin/main`).
- The local handover is the contributor's pre-push sign-off. The gate validates the contributor's working-tree claims against the working-tree diff at push time — no chance of a stale handover travelling with the branch and being mistaken for a fresh one.
- The shape evolves with the rules. When a new hard rule lands in this skill, it gets a new field in the schema. Old handovers fail validation and contributors can't push until they review the new rule. The schema is the rule index.
Field families:
- `preflight` — duplicate-search, branch sync, dependency version checks (Hard Rule "Before filing").
- `design` — argument order, async, no-raw-locator, action-presence-detect, lightweight Steps, naming, error format, logging, TypeScript discipline (Design Rules 1–18).
- `tests` — implementation, real-Vue-app, non-tautological assertions, passing (Hard Rules "no mocked", "must verify causally").
- `build` — TypeScript build clean, full suite green, knownFailures (free-form for legitimate skips).
- `coverage` — 100% API coverage gate (Hard Rule).
- `docs` — README, api-reference, skill files (Rule 19).
- `version` — single patch bump (Rule 15).
For any boolean set to false or "n/a", the corresponding *Reason field must be populated. Vague reasons ("not applicable", "didn't need it") fail the gate; specific reasons ("change is internal-only on Verifications, no public Steps surface added — Rule 19 doesn't apply") pass.
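A self-validation pass for the reason rule might look like the sketch below. It is an assumption-laden illustration, not the gate's implementation: the `<field>Reason` naming is read off the `*Reason` wording above, the vague-phrase list contains only the two examples quoted, and every field name except `readmeUpdated` is hypothetical. The real shape is defined by `schemas/contribution-handover.schema.json`.

```typescript
// Hypothetical flat handover shape; the real one is defined by the schema file.
type HandoverSketch = { readonly [field: string]: boolean | string | undefined };

// Only the two vague phrases quoted in the text; the real gate may use a different list.
const VAGUE_REASONS: readonly string[] = ["not applicable", "didn't need it"];

// Returns the claim fields that are not `true` and lack a specific reason.
function fieldsMissingReason(handover: HandoverSketch, claimFields: readonly string[]): string[] {
  return claimFields.filter((field) => {
    if (handover[field] === true) {
      return false; // affirmed claims need no reason
    }
    const reason = handover[field + "Reason"]; // assumed naming: <field>Reason
    if (typeof reason !== "string") {
      return true;
    }
    const normalized = reason.trim().toLowerCase();
    return normalized === "" || VAGUE_REASONS.indexOf(normalized) !== -1;
  });
}

const sampleHandover: HandoverSketch = {
  readmeUpdated: false,
  readmeUpdatedReason: "change is internal-only on Verifications, no public Steps surface added",
  apiReferenceUpdated: "n/a", // hypothetical field name
  apiReferenceUpdatedReason: "not applicable",
  testsPassing: true, // hypothetical field name
};

const unresolvedFields = fieldsMissingReason(sampleHandover, [
  "readmeUpdated",
  "apiReferenceUpdated",
  "testsPassing",
]);
```

In this sample only `apiReferenceUpdated` is reported, because its reason is one of the vague phrases; the specific reason on `readmeUpdated` passes.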
Worked example. Copy .contribution-handover.template.json and run the gate; it prints the per-field validation map. The template is the canonical populated shape.
Hook error message format — repo standard
Every hook under hooks/*.sh that emits a permissionDecision: "deny" (or a systemMessage warn) must format the reason text using the layout below. The shape is identical across hooks so contributors recognize a hook block instantly and know where to look.
[BLOCKED] <one-line headline — what's wrong, in present tense>
──────────────────────────
Do this instead:
──────────────────────────
Option A — <case>
<concrete template / command / config diff>
Option B — <other case>
<concrete next step>
──────────────────────────
What was wrong:
──────────────────────────
File: <path or N/A>
<observed values — claim, actual, diff, etc.>
<one-paragraph why it matters — the rule, the prior incident, the cost of the failure>
──────────────────────────
If <common motivation> — read this:
──────────────────────────
<pointer to the upstream fix or the rule the contributor is bumping against>
References:
<canonical docs — file paths or URLs>
[WARN] replaces [BLOCKED] for systemMessage-style soft warnings. Box-drawing characters are U+2500 — copy them from this skill, not from any other hook (existing hooks predate this standard and use ad-hoc formatting; they'll be normalized in a separate cleanup PR).
Why these sections exist:
- Headline — the contributor sees the failure in one line in their terminal. Don't bury the rule in paragraph two.
- Do this instead — concrete, copy-pasteable. At least two options when there are two valid resolutions (fix the work vs. update the claim). One option when there's only one path (e.g. file-corruption → repair the file).
- What was wrong — observed state, including the file path, the claim, and the actual value. This is the audit-log section; without it, contributors can't tell which check fired.
- If `<common motivation>` — the empathy line. Anticipates the most common reason a contributor hit this gate ("you ticked the box without updating the file") and routes them to the right fix path. Skip this section if there's no common motivation worth naming.
- References — the canonical docs for the rule. Always include the SKILL.md section that defines the rule, plus the schema / config file the contributor will edit. Two to four lines.
The commit-message-gate.sh hook is the canonical implementation — copy its message-extraction block (the MSG=/SCAN= section handling -m, --message=, -F/--file, with the raw-command fallback that never denies blind on extraction failure) and its rich-error-context deny messages when writing a new hook.
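The layout can also be expressed as a small renderer, which is handy for checking section order. This is a sketch only: hooks themselves are shell scripts, `renderHookMessage` and its input shape are hypothetical, the optional "If <common motivation>" section is left out, and the rule-line width is copied from the template above rather than specified anywhere.

```typescript
// Hypothetical input shape for the illustration.
interface HookOptionSketch {
  readonly label: string;
  readonly step: string;
}

interface HookMessageSketch {
  readonly level: "BLOCKED" | "WARN";
  readonly headline: string;
  readonly options: readonly HookOptionSketch[];
  readonly file: string;
  readonly observed: string;
  readonly why: string;
  readonly references: readonly string[];
}

// U+2500 box-drawing characters, copied from the template above.
const RULE_LINE = "──────────────────────────";

function renderHookMessage(message: HookMessageSketch): string {
  const lines: string[] = [
    "[" + message.level + "] " + message.headline,
    RULE_LINE,
    "Do this instead:",
    RULE_LINE,
  ];
  message.options.forEach((option, index) => {
    const letter = String.fromCharCode(65 + index); // 65 is "A"
    lines.push("Option " + letter + " — " + option.label);
    lines.push(option.step);
  });
  lines.push(RULE_LINE, "What was wrong:", RULE_LINE);
  lines.push("File: " + message.file, message.observed, message.why);
  lines.push("References:");
  message.references.forEach((reference) => {
    lines.push(reference);
  });
  return lines.join("\n");
}

const sampleHookMessage = renderHookMessage({
  level: "BLOCKED",
  headline: "handover claims a README update but README.md is unchanged",
  options: [
    { label: "the API is public", step: "Add the README bullet, then re-run the gate." },
    { label: "the change is internal-only", step: "Set readmeUpdated to \"n/a\" and fill in the reason." },
  ],
  file: ".contribution-handover.json",
  observed: "claim: readmeUpdated=true, actual: no README.md diff vs. origin/main",
  why: "Rule 19 requires README and api-reference entries for new public API.",
  references: ["skills/contributing-to-element-interactions/SKILL.md"],
});
```

The sample uses two options because it has two valid resolutions (fix the work or update the claim), matching the guidance in the section list above.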
🧰 Workflow: adding a new API
A. Adding to element-repository (the underlying capability)
cd /path/to/element-repository
git checkout main && git pull
git checkout -b feat/your-feature
# 1. Update src/types/Element.ts (interface)
# 2. Implement in src/types/WebElement.ts
# 3. Implement in src/types/PlatformElement.ts (or stub if web-only — but prefer cross-platform)
# 4. Add live test in tests/live-element-location.spec.ts using the Vue test app
# 5. Verify
npm run build
npx playwright test tests/live-element-location.spec.ts
npx test-coverage --format=github-plain # must show 100%
# 6. Bump version against npm-latest (Rule 15 — collision-safe across parallel PRs)
npm version "$(npm view @civitas-cerebrum/element-repository version | awk -F. '{print $1"."$2"."$3+1}')" --no-git-tag-version
# 7. Commit + push + open PR
git add -A
git commit -m "feat: add Element.<method> for <use case>"
git push -u origin feat/your-feature
gh pr create --base main --title "feat: ..." --body "..."
After this PR merges, element-repository auto-publishes to npm. Then update element-interactions to use the new version.
B. Adding to element-interactions (the user-facing API)
cd /path/to/element-interactions
git checkout main && git pull
git checkout -b feat/your-feature
# 1. Add the API to the right layer:
# - New matcher → src/steps/ExpectMatchers.ts
# - New step / composite → src/steps/CommonSteps.ts
# - New strategy → src/steps/ElementAction.ts
# - Internal helper → src/interactions/{Interaction,Verification,Extraction}.ts
# 2. Add tests in tests/ — must hit the real Vue test app
# 3. Run full suite + coverage
npm run build
npm run test # all tests must pass
npx test-coverage --format=github-plain # must show 100%
# 4. Update docs (Rule 19 — both files mandatory for any new public API):
# - skills/element-interactions/references/api-reference.md (the canonical source)
# - README.md (the user-facing reference under "🛠️ API Reference: Steps")
# - skills/element-interactions/SKILL.md (only if the change affects workflow stages)
# 5. Bump version once, against npm-latest (Rule 15 — collision-safe across parallel PRs)
npm version "$(npm view @civitas-cerebrum/element-interactions version | awk -F. '{print $1"."$2"."$3+1}')" --no-git-tag-version
# 6. Populate the contribution handover
cp .contribution-handover.template.json .contribution-handover.json
# fill in every boolean; pair every false / "n/a" with a *Reason field
# 7. Commit + push + open PR
# (Methodology rule: do not push or open a PR until the
# `.contribution-handover.json` is valid. The harness gate
# that previously refused `git push` / `gh pr create` was
# retired in 0.3.6; the rule still applies.)
git add -A
git commit -m "feat: add steps.<method> for <use case>"
git push -u origin feat/your-feature
gh pr create --base main --title "feat: ..." --body "..."
C. Cross-package change (new Element capability + matching Steps API)
Open both PRs in parallel. Element-repository PR ships first; element-interactions PR depends on it:
- Push element-repository PR.
- Locally, point element-interactions at `file:../element-repository` so you can develop both sides simultaneously.
- Once element-repository PR merges and the new version publishes, flip element-interactions back to `^X.Y.Z`.
- Push the version-flip commit; CI goes green; merge.
🪝 Workflow: adding a harness hook
Hooks live in hooks/<name>.sh, are installed into ~/.claude/hooks/ by scripts/postinstall.js, and are registered in ~/.claude/settings.json via the HOOK_MANIFEST array. They run at PreToolUse / PostToolUse / SubagentStop / Stop boundaries to enforce skill contracts mechanically — markdown rules can be rationalised away mid-run, hooks cannot.
This section is the how. The when is fixed by the Hard rule §"Methodology improvements ship as programmatic hooks": every SKILL.md rule edit comes paired with a hook unless the rule is genuinely unenforceable mechanically. Re-read that hard rule first if you're authoring a SKILL.md change — its decision table maps each rule shape to a concrete hook surface.
When to add a hook (vs declaring the rule markdown-only):
- The rule is mechanically detectable at a tool-use boundary (specific tool, file path, command pattern, response-shape signal). → Hook.
- Markdown enforcement has been observed to fail under context pressure. → Hook (the failure mode is no longer hypothetical).
- The cost of a violation is high (corrupt state, lost work, contract violation propagating downstream). → Hook.
- The rule is too contextual to detect mechanically (e.g. "use the right level of detail in this brief", "be honest about uncertainty"). → Stays markdown-only and the rule gets tagged in `coverage-expansion/references/anti-rationalizations.md` so the un-backed surface stays visible.
The default is "ship a hook." Choosing markdown-only is an explicit reviewer-visible exception, not the absence of a decision.
Hook authoring — three required patterns
1. Documentation header — uniform across all hooks
Every hook starts with a structured comment block. Readers should be able to scan the header alone and answer: what event does it fire on, what does it block / warn on, where's the canonical rule it implements, what's the exact failure → action mapping.
#!/bin/bash
# <name>.sh — <one-line summary of what this hook does>
#
# Hook : <event>:<matcher> (e.g. PreToolUse:Agent, PostToolUse:Bash, SubagentStop)
# Mode : <DENY | WARN | RECORD | combinations> (DENY blocks the tool call,
# WARN emits systemMessage, RECORD updates state without output)
# State : <none | <repo-or-home>/.claude/<file>.json>
# Env : <none | CIVITAS_X_Y=<int> (default <N>, semantics)>
#
# Rule
# ----
# <Single paragraph: what this hook enforces. Names the contract surface
# concretely. No ambiguity about which tool calls are caught.>
#
# Why
# ---
# <Single paragraph: motivation. Why mechanical enforcement here? What
# failure mode does it catch that markdown couldn't?>
#
# Canonical reference
# -------------------
# skills/<skill>/SKILL.md §"<section>" (and/or)
# skills/<skill>/references/<file>.md §"<section>"
#
# (Optional sections: Conventions / Allowed list / Migration / etc. —
# use them when the rule has a non-trivial vocabulary the reader needs
# alongside the comment block.)
#
# Failure → action
# ----------------
# - <violation> → DENY|WARN|RECORD
# - <other violation> → DENY|WARN|RECORD
# - <legitimate-looking case that's exempt> → silent allow
# - Anything else → silent allow
This pattern is followed by every hook in hooks/. Adding a new hook with a different shape regresses scannability — match the existing template. Examples to read first: hooks/playwright-cli-isolation-guard.sh (DENY with multi-case classification), hooks/commit-message-gate.sh (DENY with rich error context), hooks/subagent-return-schema-guard.sh (PostToolUse observer + state-file deregister).
2. Helper functions — consistent shape
Hooks emit two output shapes: a deny JSON (PreToolUse-only, blocks the tool call) and a warn JSON (any event, emits a systemMessage). Both are wrapped in helpers defined inline at the top of the script:
emit_deny() {
jq -n --arg r "$1" '{
"hookSpecificOutput": {
"hookEventName": "PreToolUse",
"permissionDecision": "deny",
"permissionDecisionReason": $r
}
}'
}
emit_warn() {
jq -n --arg m "$1" '{
"systemMessage": $m,
"suppressOutput": false
}'
}
Define only what the hook actually uses (a deny-only hook doesn't need emit_warn). Don't inline a fresh jq -n in each call site — use the unified helpers instead.
3. Action-first error message template — guide the agent back on track
Hook deny / warn messages are read by an agent under context pressure. The agent's next action is what matters most — not the diagnosis, not the references. Lead with the action.
Template:
[BLOCKED|WARN] <one-line headline of what was caught>
──────────────────────────────────────────────────────────────────
Do this instead — <option list or concrete template>:
──────────────────────────────────────────────────────────────────
Option A — <case>
<concrete next step: code template, command, or option>
Option B — <other case, if applicable>
<concrete next step>
──────────────────────────────────────────────────────────────────
What was wrong:
──────────────────────────────────────────────────────────────────
File: <path>
<observed values that triggered the rule>
<one-paragraph diagnosis: what the violation was, what failure mode it
represents, name the framings/symptoms verbatim where applicable>
──────────────────────────────────────────────────────────────────
If <common motivation for the violation> — read this:
──────────────────────────────────────────────────────────────────
<pointer to the upstream fix that resolves the underlying concern, NOT
the symptom-level workaround>
References:
<canonical-doc-path-1>
<canonical-doc-path-2>
Why this shape:
- Action first. The agent reading the message under context pressure should see the next step in the first ~10 lines. References at the end are for follow-up, not primary action.
- Show, don't describe. A concrete `Agent({...})` template, command, or option-list beats prose. Substitute extracted values where possible (slug from file path, count from JSON, etc.) so the agent can copy-paste.
- Named symptoms. When the violation has a recognisable internal-monologue framing ("honest stopping point", "I'll be transparent", "given session constraints"), name it verbatim in the diagnosis. Future agents recognise their own self-talk.
- Underlying concern + upstream fix. When a violation is driven by a real concern (e.g., parallel dispatch felt unsafe due to shared-DB races), acknowledge the concern and point at the upstream fix (per-test-user pattern in test-optimization §1.A) — NOT the symptom-level workaround. Otherwise the agent re-violates as soon as the same concern recurs.
- References last. Two to four canonical doc paths. Don't bury them in prose; list them.
Examples to read: hooks/subagent-schema-preread-gate.sh (PreToolUse gate citing a schema) and hooks/commit-message-gate.sh (DENY with rich error context; Option A / Option B layout). The earlier reference implementations of the in-flight-composer registry pattern (a dispatch-guard registrar paired with a direct-compose-block consumer) were removed in the 0.3.6 cleanup; the pattern itself is documented here for future contributors.
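The section order of the template can be illustrated with a small renderer. This is a hedged TypeScript sketch only: the hooks themselves are shell scripts, and `HookMessage`, `renderHookMessage`, and the divider width are hypothetical choices made for the example.

```typescript
// Hypothetical structured form of an action-first hook message.
interface HookMessage {
  severity: "BLOCKED" | "WARN";
  headline: string;
  options: { label: string; nextStep: string }[];
  file: string;
  observed: string[];
  diagnosis: string;
  motivation?: { concern: string; upstreamFix: string };
  references: string[];
}

const DIVIDER = new Array(67).join("─"); // width chosen for the example
const DASH = "\u2014"; // em dash, as used in the template's option lines

function renderHookMessage(m: HookMessage): string {
  const out: string[] = [];
  // 1. Headline, then the action section before any diagnosis.
  out.push("[" + m.severity + "] " + m.headline);
  out.push(DIVIDER, "Do this instead:", DIVIDER);
  m.options.forEach((opt, i) => {
    const letter = String.fromCharCode(65 + i); // A, B, C...
    out.push("Option " + letter + " " + DASH + " " + opt.label);
    out.push("  " + opt.nextStep);
  });
  // 2. Diagnosis comes second.
  out.push(DIVIDER, "What was wrong:", DIVIDER);
  out.push("File: " + m.file);
  m.observed.forEach((line) => out.push(line));
  out.push(m.diagnosis);
  // 3. Optional pointer at the upstream fix for the underlying concern.
  if (m.motivation) {
    out.push(DIVIDER, "If " + m.motivation.concern + " " + DASH + " read this:", DIVIDER);
    out.push(m.motivation.upstreamFix);
  }
  // 4. References always last.
  out.push("References:");
  m.references.forEach((ref) => out.push("  " + ref));
  return out.join("\n");
}
```

Keeping the order inside one function means a new message cannot put the references ahead of the next step by accident.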
Hook checklist
When opening a PR that adds or modifies a hook:
- Documentation header follows the unified template (Hook / Mode / State / Env / Rule / Why / Canonical reference / Failure → action).
- `emit_deny` / `emit_warn` helpers used consistently — no inline `jq -n --arg` calls in the body.
- Error messages follow the action-first template (headline → Do this instead → What was wrong → upstream fix → References).
- Test cases added to `hooks/tests/cases/<NN>-<topic>.sh` covering: happy-path allow, each rule's deny/warn path, exempt cases, edge cases (empty inputs, special characters, alternate runner forms, etc.).
- `bash hooks/tests/run.sh` reports green on the new case file plus all existing cases.
- If the hook records state, the state-file path and shape are documented in the canonical reference.
- `scripts/postinstall.js` HOOK_MANIFEST updated with the new entry (file, event, matcher, timeout, optional async).
- If the hook gates a markdown rule, the kernel-resident invariants in the relevant SKILL.md mention the harness backstop, naming the live hook precisely (e.g. "Harness-enforced by `hooks/standard-mode-first-pass-guard.sh`"). Never cite a retired hook; if a hook is removed, rewrite its skill-side claims to the honest-retirement form ("the harness guard for this rule was retired in; the rule still applies").
- If the rule has a category in the anti-rationalization registry, the registry entry's `Hooks that catch this:` list is updated.
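As a rough illustration of the manifest item in the checklist, the five fields it names can be written down as a type with a completeness check. This is a sketch under assumptions: the field types, the rule that every non-optional field must be non-empty, and the example values are all hypothetical. The real `HOOK_MANIFEST` shape in `scripts/postinstall.js` may differ.

```typescript
// Field names come from the checklist; their types are assumed.
interface HookManifestEntry {
  file: string;
  event: string;
  matcher: string;
  timeout: number; // units are whatever postinstall.js expects
  async?: boolean; // optional per the checklist
}

// Hypothetical pre-PR check: list what is missing from a draft entry.
function manifestEntryProblems(entry: Partial<HookManifestEntry>): string[] {
  const problems: string[] = [];
  if (!entry.file) problems.push("file is required");
  if (!entry.event) problems.push("event is required");
  if (!entry.matcher) problems.push("matcher is required");
  if (typeof entry.timeout !== "number" || entry.timeout <= 0) {
    problems.push("timeout must be a positive number");
  }
  if (entry.async !== undefined && typeof entry.async !== "boolean") {
    problems.push("async, when present, must be a boolean");
  }
  return problems;
}

// Hypothetical entry for a new PreToolUse guard on Write calls.
const exampleEntry: HookManifestEntry = {
  file: "hooks/my-new-guard.sh",
  event: "PreToolUse",
  matcher: "Write",
  timeout: 10,
};
```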
Approximating is_subagent — the in-flight-registry pattern
The Claude Code harness payload doesn't include an is_subagent field on hook input — Write calls from a dispatched subagent and Write calls from the orchestrator are indistinguishable at hook-fire time.
When a hook needs to distinguish "was this tool call made by a legitimately-dispatched subagent doing its expected work" from "was this the orchestrator absorbing work that should have been delegated", use the in-flight-registry pattern:
1. PreToolUse:Agent (the dispatch-guard) writes a registration entry to a state file (e.g. `tests/e2e/docs/.in-flight-composers.json`) when the dispatch matches a known role-prefix that produces specific tool calls (e.g. `composer-j-<slug>:` produces a `Write tests/e2e/j-<slug>.spec.ts`).
2. PostToolUse / PreToolUse on the produced tool call reads the registry and gates the call: if the slug is in-flight (within a TTL window), the writer is the legitimate subagent — ALLOW. If not in-flight, it's the orchestrator absorbing — DENY with a redirect to dispatch the right subagent.
3. TTL / cleanup as a failsafe: the registry uses a rolling 30-min TTL — entries that aren't deregistered explicitly (see point 4) expire on the next dispatch-guard run, so stale registrations don't accumulate when a subagent crashes or is abandoned mid-flight.
4. Explicit deregistration on terminal handover (the primary cleanup path). Each subagent return is prefaced with a `handover:` envelope (`role`, `cycle`, `status`, `next-action` — schema in `../element-interactions/references/subagent-return-schema.md` §2.0). In this pattern, a PostToolUse:Agent consumer parses the envelope, cycle-matches against the registry entry, and deregisters the slot immediately on terminal status instead of waiting for TTL. Cycle-mismatch (envelope claims a different cycle than the registered dispatch) refuses to deregister and asks the orchestrator to redispatch under the correct cycle. This shorter leash matters because the orchestrator's redispatch under the same slug can race with stale handovers from a slow / auto-compacted prior cycle — the cycle-match contract pins the deregistration to one specific dispatch.
The reference implementation paired a dispatch-guard registrar (registering composer-j-* / composer-sj-* / probe-j-* / probe-sj-* dispatches with a cycle field) with a direct-compose-block consumer (gating tests/e2e/{j,sj}-*.spec.ts writes against the registry) and hooks/subagent-return-schema-guard.sh (parses the handover envelope, cycle-matches, deregisters terminal handovers). All three registry-coupled behaviours were removed in the 0.3.6 cleanup; subagent-return-schema-guard.sh survives, but today it only validates returns against the role schemas via the bundled validator — it no longer parses-and-deregisters registry entries. The pattern avoids false positives that would otherwise force a WARN — the gate runs as a hard DENY because the registry mechanically distinguishes legitimate from violation, and the leash is bounded by the explicit handover instead of the looser 30-min window. (Reference implementation removed in 0.3.6; pattern documented here for future contributors.)
When you ship a new harness pattern that needs the same distinction, register at the dispatch boundary, gate at the produced-tool-call boundary, deregister on the canonical handover envelope, and keep the TTL as a failsafe. Use a hidden state file under tests/e2e/docs/.<topic>-<scope>.json to keep the registry alongside other coverage-expansion state.
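The register / gate / deregister / TTL steps can be sketched as pure functions over the registry contents. This is a minimal TypeScript illustration, not the removed reference implementation: the entry shape, the numeric `cycle`, the key format (`j-<slug>` / `sj-<slug>`), and the caller-supplied list of terminal statuses are all assumptions.

```typescript
interface InFlightEntry {
  key: string;          // assumed format, e.g. "j-checkout"
  cycle: number;        // assumed numeric
  registeredAt: number; // epoch milliseconds
}

interface HandoverEnvelope {
  role: string;
  cycle: number;
  status: string;
  nextAction: string;
}

const TTL_MS = 30 * 60 * 1000; // rolling 30-minute failsafe

function isLive(e: InFlightEntry, now: number): boolean {
  return now - e.registeredAt < TTL_MS;
}

// Dispatch boundary: drop expired entries, then (re)register the key.
function register(registry: InFlightEntry[], key: string, cycle: number, now: number): InFlightEntry[] {
  const kept = registry.filter((e) => isLive(e, now) && e.key !== key);
  return kept.concat([{ key, cycle, registeredAt: now }]);
}

// Produced-tool-call boundary: allow only writes with a live registration.
function gateWrite(registry: InFlightEntry[], filePath: string, now: number): "allow" | "deny" {
  const match = /^tests\/e2e\/((?:j|sj)-.+)\.spec\.ts$/.exec(filePath);
  if (match === null) return "allow"; // not a gated path
  const key = match[1];
  return registry.some((e) => e.key === key && isLive(e, now)) ? "allow" : "deny";
}

type DeregisterOutcome = "deregistered" | "not-terminal" | "cycle-mismatch" | "not-registered";

// Handover boundary: deregister only on a terminal status with a matching cycle.
function deregisterOnHandover(
  registry: InFlightEntry[],
  key: string,
  envelope: HandoverEnvelope,
  terminalStatuses: string[] // assumption: supplied by the caller
): { registry: InFlightEntry[]; outcome: DeregisterOutcome } {
  const entry = registry.filter((e) => e.key === key)[0];
  if (entry === undefined) return { registry, outcome: "not-registered" };
  if (terminalStatuses.indexOf(envelope.status) === -1) return { registry, outcome: "not-terminal" };
  if (envelope.cycle !== entry.cycle) return { registry, outcome: "cycle-mismatch" };
  return { registry: registry.filter((e) => e.key !== key), outcome: "deregistered" };
}
```

A real hook would read and write the state file around these calls. Keeping the decisions pure makes the cycle-mismatch and TTL paths easy to cover in test cases.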
📚 Contributing to the niche-edge-cases catalogue
skills/failure-diagnosis/references/niche-edge-cases.md documents failure shapes that LLMs routinely misclassify during the failure-diagnosis pipeline. It's a living catalogue: new entries are added as diagnostic sessions surface new shapes that trap the diagnoser and aren't already covered. The full criteria + entry template live in that file's §"Adding an entry"; this section explains the contribution path and how it slots into the rest of this skill's PR conventions.
When an entry qualifies
All three must hold:
- The shape misclassifies in practice. Stage 0 + Stage 4 of `failure-diagnosis/SKILL.md` weren't enough to land the right answer cleanly — the diagnoser went the wrong direction (or came visibly close to doing so). The catalogue is for traps, not for failures whose classification was obvious.
- The disambiguating probe was non-obvious. The thing that flipped the classification (a specific tool call, DOM read, evidence grab) is what the next diagnoser most needs. "Look at the screenshot more carefully" is not a probe.
- The shape is reproducible across consumers. A bug in this app's checkout flow is a project finding (goes in that project's bug ledger). A bug shape any consumer of the package could plausibly hit (modal-fetch hangs, stale page-repo entry resolves to a hidden duplicate, role-attribute serialisation breaking implicit ARIA roles, etc.) is catalogue-worthy.
If any criterion fails: don't add an entry. The catalogue's value is in being skimmable during a live diagnosis, not in being exhaustive.
Entry shape
Five fields per entry — Symptom / Why LLMs struggle / Disambiguating probe / Classification / Cross-link. One paragraph per field is the target. The full template + worked examples live in niche-edge-cases.md's §"Adding an entry" — read it once before authoring your first entry; it's the single source of truth for the structure.
How to ship the addition
Three pathways depending on what you're already shipping:
| Situation | PR shape |
|---|---|
| You're already mid-PR for something else (a hook fix, a skill rule edit, etc.) | Add the catalogue entry to the same PR — one extra commit, scope-clean (purely additive to a docs file). Mention in the PR description that the entry was discovered while debugging the PR's own work. Reviewers expect this path; it doesn't trigger a scope-split flag. |
| You hit the niche shape outside any PR (during a normal coverage / authoring / debugging session) | Open a small standalone PR titled `docs(failure-diagnosis): catalogue <shape-name> in niche-edge-cases`. Single-commit, single-file (this catalogue). The `docs(...)` commit-message convention from coverage-expansion's commit table applies; no version bump per Rule 15. |
| You hit it inside a dispatched subagent (e.g. failure-diagnosis sub-skill, bug-discovery per-journey probe) | Surface the find in the subagent's return — name the shape, the probe, and the classification. The parent orchestrator either appends to the catalogue inline (if mid-PR) or opens the standalone PR above. Subagents do NOT push commits directly to this catalogue, the same way they don't push commits directly to other source files; the parent owns the write. |
Cross-link discipline
When a new entry refines an existing Stage 4 / 4a row in failure-diagnosis/SKILL.md, update that row to point at the new entry — short citation only (`see [references/niche-edge-cases.md](references/niche-edge-cases.md) entry (N)`); don't duplicate the entry's prose into the SKILL.md table cell. The table is the skim path; the catalogue carries the depth.
When a new entry is a brand-new shape with no existing Stage 4 / 4a row, leave the cross-link as (none — new shape). Don't fabricate a Stage 4 row to point back at the entry; let the table remain stable until the shape is well-trodden enough to deserve a row.
What does NOT belong in the catalogue
- Project-specific failure shapes (those go in the project's adversarial-findings ledger or its own bug tracker).
- War stories from a long debugging session (the catalogue is the answer — the trap and the probe and the classification, nothing more).
- Failure shapes whose Stage 4 row already covers them adequately (extending the existing row is sufficient).
- Anything that contradicts the canonical `subagent-return-schema.md` finding-block shape (the catalogue lives alongside the finding format, not as an alternative to it).
When in doubt: if the next diagnoser would benefit from finding your entry under a Cmd-F for the symptom keyword, add it. If they'd just shrug and skim past, leave it out.
🧯 When a user runs into an API gap
If you're using the package and want to write something like:
// ❌ Don't do this — drops out of the framework
const locator = page.locator('button.submit');
const cssVar = await locator.evaluate(el => getComputedStyle(el).getPropertyValue('--brand-color'));
Stop. The right path:
1. Check if the framework already supports it. Read `skills/element-interactions/references/api-reference.md` end-to-end. The matcher tree, predicate form, `.css(prop)`, and `interactions` raw escape hatch cover most needs.
2. Run the duplicate-prevention checks from the "Before filing an issue or opening a PR" hard rule above — search existing issues/PRs (open + closed) in both repos, diff local vs. `origin/main`, and confirm your pinned dependency version is the latest. A large share of "missing API" reports are already fixed on main or in a newer published version.
3. If it's genuinely missing after those checks:
   - Open an issue on `civitas-cerebrum/element-interactions` describing the use case. Include the check results (see the hard rule's reporting template).
   - If it's a generic element capability (CSS variable, custom property, drag with timing), it belongs in element-repository's `Element` interface first.
   - If it's an assertion shape, it belongs on the matcher tree.
4. If you need to ship NOW, the documented escape hatch is `interactions.interact.*`, `interactions.verify.*`, `interactions.extract.*` — they accept either `Locator` or `Element`. Use these for the one-off, but file the issue so the proper API can land.
5. Never check raw `locator.*()` calls into a test file or into the element-interactions src/. The audit grep above will catch it in code review.
🧱 When the framework cannot satisfy a documented rule
Sometimes the problem is not a missing method on Steps — it's that a skill, workflow, or invariant declares a rule the package's current architecture cannot back. The MCP→playwright-cli migration (#121, #122) is the canonical case: every browser-using skill in this suite required parallel-subagent isolation, but the Playwright MCP plugin shared one browser process across all subagents. The rule was unsatisfiable until the package switched tooling.
Distinguishing a structural gap from an API gap:
| Symptom | Class | What you're missing |
|---|---|---|
| User wants `steps.foo()` and it doesn't exist | API gap | A method on the public surface |
| Skill prereq says "X must be true at dispatch time" and the package can't make X true | Structural gap | A primitive / mechanism the package doesn't currently provide |
| Workaround would mean turning off, weakening, or silently skipping a documented invariant | Structural gap | The invariant is load-bearing; the fix is at the package layer |
| Two parallel subagents corrupt each other's state through the package's chosen tool | Structural gap | OS-level isolation the current tool can't give |
| The package's protocol assumes a host capability the runtime doesn't expose | Structural gap | A different protocol or a different tool |
If it's a structural gap, the workflow is different from "open an API-gap issue":
1. Write down the unsatisfied invariant precisely. Quote the rule from the skill that depends on it (file + line). State the mechanism in the package that fails to back it. Without this, the issue reads as "a thing didn't work" instead of "this contract is structurally broken."
2. Don't relax the invariant in the consuming skill. The rest of the suite is built on it. Patching around it locally hides the structural problem and creates inconsistencies between skills that respect the rule and skills that don't.
3. Open an issue on `civitas-cerebrum/element-interactions` (the package, not the consuming skill repo, even if you found the gap while writing a skill) — with the duplicate-prevention checks above and a "smallest credible structural fix" sketch. Examples of "smallest fix": switch underlying tool, expose a new primitive, change a protocol shape. If the fix is large, that's fine — name it; don't hide it.
4. The PR that fixes it lands in the package, not in the consuming skill. The consuming skill only updates once the new primitive is published — and at that point, the consuming skill's job is to delete its workaround and trust the new contract.
5. Decide between "block the rollout" and "ship a documented workaround." A structural gap blocks the rollout when the invariant is safety-critical (data corruption, cross-tenant leakage, false-pass tests). A documented workaround is acceptable when (a) the workaround is local and reversible, (b) the cost of waiting exceeds the cost of the workaround, and (c) the issue is filed and the cleanup is tracked.
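The block-versus-workaround decision above reduces to a small predicate. The sketch below is illustrative only: the type and function names are hypothetical, and returning "block-rollout" when the gap is not safety-critical but the workaround conditions are not all met is an assumption, since the text does not cover that case.

```typescript
interface StructuralGapAssessment {
  safetyCritical: boolean;                 // data corruption, cross-tenant leakage, false-pass tests
  workaroundLocalAndReversible: boolean;   // condition (a)
  waitingCostsMoreThanWorkaround: boolean; // condition (b)
  issueFiledAndCleanupTracked: boolean;    // condition (c)
}

type RolloutDecision = "block-rollout" | "ship-documented-workaround";

function decideRollout(gap: StructuralGapAssessment): RolloutDecision {
  // A safety-critical invariant blocks the rollout regardless of workaround quality.
  if (gap.safetyCritical) return "block-rollout";

  // A documented workaround needs all three conditions at once.
  const workaroundAcceptable =
    gap.workaroundLocalAndReversible &&
    gap.waitingCostsMoreThanWorkaround &&
    gap.issueFiledAndCleanupTracked;

  // Assumption: if any condition fails, wait for the package-level fix.
  return workaroundAcceptable ? "ship-documented-workaround" : "block-rollout";
}
```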
Examples that should trigger this skill, not a skill-level workaround:
- "I need parallel browser isolation, but the package's MCP protocol shares one browser." → File an issue; consider a tool swap. (#121 / #122 — actual case.)
- "My skill needs auth state to survive a failure boundary, but the package doesn't expose state-save / state-load." → File an issue against the package; do not write a brittle re-login loop in the skill.
- "The orchestrator's Rule X requires Y before dispatch, but the package can't tell us Y." → File an issue; add the primitive in the package; consume it from the orchestrator.
If a skill's prereq check is consistently failing because the package can't satisfy it, that's a structural gap, not a skill bug. Route it here.
📋 PR checklist
Before opening a PR on element-interactions:
- Searched existing issues + PRs (both repos, open + closed) for duplicates — none found, or linked to related work in the PR body
- Local branch is up-to-date with `origin/main` (`git fetch && git log HEAD..origin/main` is empty, or rebased)
- Dependency versions (`@civitas-cerebrum/element-repository`) checked against `npm view` — pinned to latest or intentionally older with a reason
- Tests pass: `npm run test` shows all tests passing
- Coverage 100%: `npx test-coverage --format=github-plain` shows ✅
- No raw Playwright leak: `grep -rn "locator\.\(click\|fill\|...\)" src/ --include="*.ts"` returns zero matches in non-Element-impl code
- No version bump in this PR (Rule 15 — versioning is release-time, not per-PR). Bump only when the user has explicitly authorised it in the conversation.
- API reference updated (`skills/element-interactions/references/api-reference.md`) — mandatory for any new public method on Steps / ElementAction / matcher tree (Rule 19)
- README updated under `🛠️ API Reference: Steps` — mandatory for any new public method on Steps / ElementAction / matcher tree (Rule 19)
- If adding a new method, it has a JSDoc block on the public-facing class
- `.contribution-handover.json` populated against `schemas/contribution-handover.schema.json` — every boolean set; every `false` / `"n/a"` paired with a specific `*Reason` field (methodology rule — the harness gate that previously verified this on push / PR-create was retired in 0.3.6 for public-dep cleanliness)
- If this PR adds, modifies, or strengthens any `skills/*/SKILL.md` rule, workflow, phase, gate, invariant, or contract, it ALSO ships a hook under `hooks/` that enforces the rule programmatically (Hard rule §"Methodology improvements ship as programmatic hooks"). When mechanical enforcement is genuinely impossible, the PR description includes a paragraph explaining why and the rule is tagged `markdown-only` in `coverage-expansion/references/anti-rationalizations.md`.
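Since the harness gate for the handover file was retired, the "every `false` / `"n/a"` paired with a `*Reason` field" rule can be self-checked before pushing. The following is a minimal sketch, not the retired gate. It assumes the reason for a field `foo` lives in a sibling key `fooReason`, and the field names in the usage example are hypothetical. The authoritative shape is `schemas/contribution-handover.schema.json`.

```typescript
// Returns the names of fields set to false or "n/a" that lack a
// non-empty sibling "<field>Reason" string (assumed naming convention).
function missingReasons(handover: Record<string, unknown>): string[] {
  const missing: string[] = [];
  for (const key of Object.keys(handover)) {
    // Skip the reason fields themselves.
    if (key.slice(-6) === "Reason") continue;

    const value = handover[key];
    if (value === false || value === "n/a") {
      const reason = handover[key + "Reason"];
      if (typeof reason !== "string" || reason.trim() === "") {
        missing.push(key);
      }
    }
  }
  return missing;
}
```

For example, with hypothetical fields, `missingReasons({ testsPass: true, readmeUpdated: false })` reports `readmeUpdated`, because it is `false` with no `readmeUpdatedReason` beside it.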
If you're adding to element-repository first:
- Searched existing issues + PRs on `civitas-cerebrum/element-repository` (open + closed) — no duplicate
- Local branch is up-to-date with `origin/main` on element-repository
- New method on `Element` interface (cross-platform) OR `WebElement` only (with rationale comment)
- `WebElement` implementation included
- `PlatformElement` implementation included if cross-platform
- Action methods include the `ensureAttached(timeout)` preamble
- Live test added in `tests/live-element-location.spec.ts`
- Coverage 100% (`npx test-coverage`)
- No version bump in this PR — release-time only, per Rule 15. Bump happens on the release branch when the maintainer publishes.
- README updated if adding to the public surface