marimo-pair - SKILL.md Agent Skill

name: marimo-pair description: >- Collaboration protocol for pairing with a user through a running marimo notebook via bundled scripts or MCP. Use when the user asks you to work in, build, explore, or modify a marimo notebook — or when you detect a running marimo session by listing sessions. Do NOT use for general Python scripting outside of marimo or for marimo plugin/package development.

marimo Pair Programming Protocol

This skill gives you full access to a running marimo notebook. You can read cell code, create and edit cells, install packages, run cells, and inspect the reactive graph — all programmatically. The user sees results live in their browser while you work through bundled scripts or MCP.

Philosophy

marimo notebooks are a dataflow graph — cells are the fundamental unit of computation, connected by the variables they define and reference. When a cell runs, marimo automatically re-executes downstream cells. You have full access to the running notebook.

Cells are your main lever. Use them to break up work and choose how and when to bring the human into the loop. Not every cell needs rich output — sometimes the object itself is enough, sometimes a summary is better. Match the presentation to the intent.
Understand intent first. When clear, act. When ambiguous, clarify.
Follow existing signal. Check imports, pyproject.toml, existing cells, and dir(ctx) before reaching for external tools.
Stay focused. Build first, polish later — cell names, layout, and styling can wait.

Prerequisites

The marimo server must be running with token and skew protection disabled.

How to invoke marimo

marimo must be invoked with --no-token --no-skew-protection to be discoverable. The right way to invoke it depends on context (project tooling, global install, sandbox mode). See finding-marimo.md for the full decision tree.

Do NOT use --headless unless the user asks for it. Omitting it lets marimo auto-open the browser, which is the expected pairing experience. If the user explicitly requests headless, offer to open it with open http://localhost:<port>.

How to Discover Servers and Execute Code

Two operations: discover servers and execute code.

Operation	Script	MCP
Discover servers	`bash scripts/discover-servers.sh`	`list_sessions()` tool
Execute code	`bash scripts/execute-code.sh -c "code"`	`execute_code(code=..., session_id=...)` tool
Execute code (complex)	`bash scripts/execute-code.sh /tmp/code.py`	same

Scripts auto-discover sessions from the registry on disk. Use --port to target a specific server when multiple are running. If the server was started with --mcp, you'll have MCP tools available as an alternative.

No servers running?

Always discover before starting. Background task "completed" notifications do not mean the server died — check the output or run discover first.

If no servers are found, read the user's intent — if they want a notebook, start one. See finding-marimo.md.

If there's no .py file yet, pick a descriptive filename based on context (e.g., exploration.py, analysis.py, dashboard.py). Don't ask — just pick something reasonable.

Use a file for complex code. When code contains quotes, backticks, ${} template literals, or multiline strings (common with anywidget ESM modules), write the code to a temp file with the Write tool first, then pass the file path as a positional argument. This avoids shell escaping issues entirely. Write temp files under /tmp/mo-<id>/ where <id> is a short random ID you pick once per session, to avoid collisions with stale files from other sessions.

Inline ESM in cell code. Temp files are for execute-code.sh transport only — never for runtime. Use """ for ESM inside ''' for the cell code.

Executing Code

Every execute-code call runs inside the notebook's kernel. All cell variables are in scope — print(df.head()) just works. Nothing you define persists between calls (variables, imports, side-effects all reset), but you can freely introspect the notebook: inspect variables, test code snippets, check types and shapes. Use this to explore, prototype, and validate before committing anything to the notebook — then create cells to persist state and make results visible to the user.

To mutate the notebook's dataflow graph — create, edit, and delete cells, install packages, and run cells — use marimo._code_mode:

import marimo._code_mode as cm

async with cm.get_context() as ctx:
    cid = ctx.create_cell("x = 1")
    ctx.install_packages("pandas")
    ctx.run_cell(cid)

You must use async with — without it, operations silently do nothing. All ctx.* methods are synchronous — they queue operations and the context manager flushes them on exit. Do not await them.

Cells are not auto-executed. create_cell and edit_cell are structural changes only — use run_cell to queue execution.

code_mode is a tested, safe API for notebook mutations — prefer it for all structural changes. You also have access to marimo internals from the kernel, but treat that as a last resort and only with high confidence after exploration.

UI state lives outside the reactive graph. Anywidget traitlets can be read or set directly (e.g., slider.value = 5). For mo.ui.* elements, use ctx.set_ui_value(element, new_value) inside code_mode.

First Step: Explore the API

The code_mode API can change between marimo versions — and each running server could be a different version. Inspect what's available at the start of each session, especially when switching between servers.

import marimo._code_mode as cm

async with cm.get_context() as ctx:
    ctx  # inspect me — dir(), help(), .cells, ...

Guard Rails

Skip these and the UI breaks:

Install packages via ctx.install_packages(), not uv add or pip. The code API handles kernel restarts and dependency resolution correctly. Only fall back to external CLIs if the API is unavailable or fails.
Custom widget = anywidget. For bespoke visual components, use anywidget with HTML/CSS/JS. Composed mo.ui is fine for simple forms and controls. See rich-representations.md.
NEVER write to the .py file directly while a session is running — the kernel owns it.
No temp-file deps in cells. pathlib.Path("/tmp/...") in cell code is a bug.
Avoid empty cells. Prefer edit_cell into existing empty cells rather than creating new ones. Clean up any cells that end up empty after edits.
Don't worry about cell names. Most cells don't need explicit names — see notebook-improvements.md.

Widgets and Reactivity

Anywidget state (traitlets) lives outside marimo's reactive graph. To hook a widget trait into the graph, pick one strategy per widget — never mix them:

mo.state + .observe() — you pick specific traits to bridge. Default choice.
mo.ui.anywidget() — wraps all synced traits into one reactive .value. Convenient but coarser.

Read rich-representations.md before wiring either.

Keep in Mind

The user is editing too. The notebook can change between your calls — re-inspect notebook state if it's been a while since you last looked.
Deletions are destructive. Deleting a cell removes its variables from kernel memory — restoring means recreating the cell and re-running it and its dependents. If intent seems ambiguous, ask first.
Installing packages changes the project. ctx.install_packages() adds real dependencies — confirm when it's not obvious from context.

References

finding-marimo.md — how to find and invoke the right marimo
gotchas.md — cached module proxies and other traps
rich-representations.md — custom widgets and visualizations
notebook-improvements.md — improving existing notebooks