name: improve-codebase-architecture
description: Find deepening opportunities in the ELI PANDA codebase, informed by the domain glossary in docs/CONTEXT.md, the per-feature pages in docs/technical/, the ADRs in docs/adr/, and the conventions in CLAUDE.md. Use when the user wants to improve architecture, find refactoring opportunities, consolidate tightly-coupled modules, or make the codebase more testable and AI-navigable.
Improve Codebase Architecture
Surface architectural friction in this codebase and propose deepening opportunities — refactors that turn shallow modules into deep ones. The aim is testability and AI-navigability.
Glossary
Use these terms exactly in every suggestion. Consistent language is the point — don't drift into "component," "service," "API," or "boundary." Full definitions in LANGUAGE.md.
- Module — anything with an interface and an implementation (a function, a hook, a .cont.tsx/.comp.tsx pair, a Zustand store, a feature folder under src/modules/, a server-side helper).
- Interface — everything a caller must know to use the module: types, invariants, error modes, ordering, config. Not just the TypeScript signature.
- Implementation — the code inside.
- Depth — leverage at the interface: a lot of behaviour behind a small interface. Deep = high leverage. Shallow = interface nearly as complex as the implementation.
- Seam — where an interface lives; a place behaviour can be altered without editing in place. (Use this, not "boundary.")
- Adapter — a concrete thing satisfying an interface at a seam.
- Leverage — what callers get from depth.
- Locality — what maintainers get from depth: change, bugs, knowledge concentrated in one place.
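To make depth concrete, here is a minimal sketch contrasting a shallow module with a deep one. Every name in it (OrderLine, multiply, priceOrder) and the bulk-discount rule are hypothetical, invented for illustration, and not taken from the PANDA codebase.

```typescript
// Hypothetical types for illustration only; not taken from the PANDA codebase.
interface OrderLine {
  unitPriceCents: number;
  quantity: number;
}

// Shallow: the interface (two numbers in, one number out) is as complex as the
// implementation. Inlining the multiplication at the call site loses nothing.
function multiply(unitPriceCents: number, quantity: number): number {
  return unitPriceCents * quantity;
}

// Deep: one small call hides validation, a discount rule, and rounding.
// Callers never learn those rules, so a change to them stays in this function.
function priceOrder(lines: OrderLine[]): number {
  let totalCents = 0;
  for (const line of lines) {
    if (line.quantity < 0 || line.unitPriceCents < 0) {
      throw new Error("quantity and unit price must be non-negative");
    }
    const gross = multiply(line.unitPriceCents, line.quantity);
    // Assumed rule for the sketch: 10% off any line with 10 or more units.
    const discount = line.quantity >= 10 ? Math.round(gross * 0.1) : 0;
    totalCents += gross - discount;
  }
  return totalCents;
}
```

In this sketch, priceOrder gives leverage (callers pass lines and get a total) and locality (the discount rule has one home), while multiply gives neither.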
Key principles (see LANGUAGE.md for the full list):
- Deletion test: imagine deleting the module. If complexity vanishes, it was a pass-through. If complexity reappears across N callers, it was earning its keep.
- The interface is the test surface.
- One adapter = hypothetical seam. Two adapters = real seam.
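The last two principles can be sketched together: a seam with two adapters, and a module that is exercised only through its interface. The names below (OrderLineSource, InMemoryOrderLineSource, CsvOrderLineSource, totalUnits) and the CSV row layout are assumptions made for this example, not PANDA code.

```typescript
// Hypothetical seam for illustration; none of these names come from the PANDA codebase.
interface UnitLine {
  sku: string;
  quantity: number;
}

// The seam: the only thing totalUnits knows about where lines come from.
interface OrderLineSource {
  linesFor(orderId: string): UnitLine[];
}

// Adapter 1: backed by a plain object, convenient in tests.
class InMemoryOrderLineSource implements OrderLineSource {
  private readonly data: Record<string, UnitLine[]>;
  constructor(data: Record<string, UnitLine[]>) {
    this.data = data;
  }
  linesFor(orderId: string): UnitLine[] {
    return this.data[orderId] || [];
  }
}

// Adapter 2: parses rows shaped like "orderId,sku,quantity" (assumed layout).
// A second adapter is what turns the hypothetical seam into a real one.
class CsvOrderLineSource implements OrderLineSource {
  private readonly rows: string[];
  constructor(csv: string) {
    this.rows = csv.split("\n").filter((row) => row.trim() !== "");
  }
  linesFor(orderId: string): UnitLine[] {
    const lines: UnitLine[] = [];
    for (const row of this.rows) {
      const [id = "", sku = "", qty = ""] = row.split(",");
      if (id.trim() === orderId && sku.trim() !== "" && qty.trim() !== "") {
        lines.push({ sku: sku.trim(), quantity: Number(qty) });
      }
    }
    return lines;
  }
}

// The module under test: its interface is the test surface, so the same
// assertions hold no matter which adapter sits behind the seam.
function totalUnits(source: OrderLineSource, orderId: string): number {
  return source.linesFor(orderId).reduce((sum, line) => sum + line.quantity, 0);
}
```

If only the in-memory adapter existed, the seam would be speculative: the interface could be deleted and the object passed directly without losing anything.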
This skill is informed by PANDA's technical docs. They give names to good seams and record decisions the skill should not re-litigate.
Project context this skill relies on
- docs/CONTEXT.md — cross-cutting domain glossary. The first place to look when naming a deepened module. Authoritative names; full definitions live on the per-feature pages it links to.
- docs/technical/app-architecture.md — stack, module layout, request lifecycle, deprecated/legacy clusters, maintenance recommendations. Read this first when proposing cross-cutting refactors.
- docs/technical/<feature>.md (e.g. orders-and-order-items.md, catalogue-and-items.md, permissions-model.md, systems-family/*.md) — per-feature domain vocabulary and known integration points. Read the page covering the area you're touching before naming anything.
- docs/adr/ — architecture decision records. Check this directory for prior decisions that would conflict with a proposed refactor. See docs/adr/README.md for when an ADR should be written.
- CLAUDE.md — coding conventions, container/component split, toast/fetch/modal patterns.
- .agents/skills/architecture/SKILL.md — canonical folder layout and file-naming rules. A deepening proposal that moves code must respect these or call out the deviation explicitly.
The ADR directory is intentionally lightweight today. Decisions also live in the Maintenance recommendations, Deprecated / legacy, Open questions, and 🔮 Planned sections of the per-feature pages and app-architecture.md — treat those sections as informal ADRs and promote them into docs/adr/ when a real decision crystallizes.
Process
1. Explore
Read docs/CONTEXT.md for vocabulary, the relevant docs/technical/ page(s) for the area, and app-architecture.md for anything cross-cutting. Scan docs/adr/ for any decision that touches the area. Note the existing Maintenance recommendations and Open questions in the technical pages — your proposals should either build on them or explain why they're wrong.
Then use the Agent tool with subagent_type=Explore to walk the codebase. Don't follow rigid heuristics — explore organically and note where you experience friction:
- Where does understanding one concept require bouncing between many small modules?
- Where are modules shallow — interface nearly as complex as the implementation?
- Where have pure functions been extracted just for testability, but the real bugs hide in how they're called (no locality)?
- Where do tightly-coupled modules leak across their seams?
- Which parts of the codebase are untested, or hard to test through their current interface?
Apply the deletion test to anything you suspect is shallow: would deleting it concentrate complexity, or just move it? A "yes, concentrates" is the signal you want.
PANDA-specific friction patterns to look for
The codebase has recurring shapes that often produce shallow modules. Use these as prompts, not as a checklist:
- .cont.tsx leaking into .comp.tsx — a "pure" component that still reads from a store, calls a mutation, or knows about GraphQL types. The container's interface is wider than it looks. Deepening usually means pulling logic back into the container (or a hook) and narrowing the props the .comp.tsx receives.
- Hooks extracted for tests, then called from one place — a useXyz hook whose body is small and whose only caller is one container. The deletion test usually clears it.
- Mutation glue around queryMutate — wrappers that re-throw, re-shape, or re-toast around an axios-shaped response. The duplication is locality begging to live behind toast.promise (see the toast skill) or a typed mutation helper.
- Filter / table state spread across module store + useTableStateStore + URL params — three places to update when a filter is added. A deeper module owns the filter state and exposes a small read/write interface.
- Modal duality — code that touches both useModalStore (legacy) and useDynamicModalStore (current). See app-architecture.md "Maintenance recommendations" #3; new proposals should pick the dynamic side.
- fetchClient / axiosInstance / queryMutate's AxiosResponse<T> shape — three transports for the REST gateway. The interface is wider than the work needs (see app-architecture.md "Maintenance recommendations" #1).
- Form module + Zod schema + RHF wiring repeated per feature — wizard steps, field-level Zod, RHF resolver bindings duplicated across src/modules/<feature>/components/. Look for a deeper "form module" the feature can instantiate.
- GraphQL query + custom hook + container plumbing — a query is defined in queries/, a typed hook generated by codegen, then re-wrapped by a feature hook that adds one parameter. Often the codegen-generated hook is already deep enough.
These are pointers, not mandates. Use the deletion test before proposing any of them.
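As one illustration of the filter / table state pattern above, the following is a minimal sketch of a deeper module that owns the filter state and exposes a small read/write interface. It is framework-free on purpose: createFilterState, FilterState, and the onChange callback are hypothetical names, and the callback stands in for whatever would actually push the query string to the URL or a store. It does not use or describe useTableStateStore.

```typescript
// Hypothetical sketch; names and shape are assumptions, not PANDA code.
type Filters = Record<string, string>;

interface FilterState {
  get(): Filters;
  set(key: string, value: string | null): void;
  toQueryString(): string;
}

// Parse "?a=1&b=2" (leading "?" optional) into a plain object.
function parseQuery(query: string): Filters {
  const filters: Filters = {};
  const trimmed = query.charAt(0) === "?" ? query.slice(1) : query;
  if (trimmed === "") return filters;
  for (const pair of trimmed.split("&")) {
    const eq = pair.indexOf("=");
    if (eq <= 0) continue; // skip malformed pairs and empty keys
    filters[decodeURIComponent(pair.slice(0, eq))] = decodeURIComponent(pair.slice(eq + 1));
  }
  return filters;
}

// One owner for filter state: adding a filter means calling set(), nothing else.
function createFilterState(initialQuery: string, onChange: (query: string) => void): FilterState {
  let filters = parseQuery(initialQuery);

  // Keys are sorted so the same filters always serialize to the same string.
  const toQueryString = (): string =>
    Object.keys(filters)
      .sort()
      .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(filters[key])}`)
      .join("&");

  return {
    get: () => ({ ...filters }), // copy, so callers cannot mutate internal state
    set(key, value) {
      const next = { ...filters };
      if (value === null || value === "") {
        delete next[key]; // clearing a filter removes it from the query string
      } else {
        next[key] = value;
      }
      filters = next;
      onChange(toQueryString());
    },
    toQueryString,
  };
}
```

The interface is three methods; parsing, serialization order, and clearing rules sit behind it, which is where tests for "adding a filter" would go.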
2. Present candidates
Present a numbered list of deepening opportunities. For each candidate:
- Files — which files/modules are involved (use repo-relative paths like src/modules/orders/Orders.cont.tsx:42)
- Problem — why the current architecture is causing friction
- Solution — plain English description of what would change
- Benefits — explained in terms of locality and leverage, and also in how tests would improve
Use docs/CONTEXT.md + docs/technical/ vocabulary for the domain, and LANGUAGE.md vocabulary for the architecture. If docs/CONTEXT.md (and docs/technical/orders-and-order-items.md) calls something an "order line," say "the order-line intake module" — not "the OrderItemHandler," not "the order-item service."
Existing decision conflicts: if a candidate contradicts an ADR in docs/adr/, or an item in a docs/technical/ page's Maintenance recommendations, Open questions, or Deprecated / legacy section, mark it clearly (e.g. "contradicts ADR-0007 — but worth reopening because…" or "contradicts app-architecture.md Maintenance rec #2 — but worth reopening because…"). Don't list every theoretical refactor those sources forbid.
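The candidate shape described above can be pictured as a small record plus a formatter. This is only a sketch of one possible layout: the Candidate and Conflict types, the formatCandidate helper, and the output wording are hypothetical, and nothing here is a required format.

```typescript
// Hypothetical record for one deepening candidate; field names mirror the list above.
interface Conflict {
  source: string; // e.g. an ADR id or a Maintenance recommendation reference
  whyReopen: string;
}

interface Candidate {
  title: string;
  files: string[]; // repo-relative paths, optionally with a line number
  problem: string;
  solution: string;
  benefits: string;
  conflict?: Conflict; // set only when the candidate contradicts a recorded decision
}

// Render one numbered entry. No interface proposal appears at this stage.
function formatCandidate(index: number, candidate: Candidate): string {
  const lines = [
    `${index}. ${candidate.title}`,
    `   - Files: ${candidate.files.join(", ")}`,
    `   - Problem: ${candidate.problem}`,
    `   - Solution: ${candidate.solution}`,
    `   - Benefits: ${candidate.benefits}`,
  ];
  if (candidate.conflict) {
    lines.push(
      `   - Conflict: contradicts ${candidate.conflict.source}, but worth reopening because ${candidate.conflict.whyReopen}`
    );
  }
  return lines.join("\n");
}
```

Keeping the conflict as an optional field makes the marked cases stand out without forcing a line on every candidate.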
Do NOT propose interfaces yet. Ask the user: "Which of these would you like to explore?"
3. Grilling loop
Once the user picks a candidate, drop into a grilling conversation. Walk the design tree with them — constraints, dependencies, the shape of the deepened module, what sits behind the seam, what tests survive. The companion grill-me skill describes the cadence.
Side effects happen inline as decisions crystallize:
- Naming a deepened module after a concept that's not in docs/CONTEXT.md? Add a short entry to docs/CONTEXT.md (one or two lines + link to the per-feature page that owns the long definition). If the long definition doesn't exist yet, add it to the relevant docs/technical/ page (or app-architecture.md if it's cross-cutting). Don't invent terminology that lives only in code.
- Sharpening a fuzzy term during the conversation? Update docs/CONTEXT.md and the per-feature doc right there. Vocabulary drift across docs is the most common cause of repeat re-suggestions.
- User rejects the candidate with a load-bearing reason? Offer to record it as an ADR under docs/adr/, framed as: "Want me to record this as an ADR so future architecture reviews don't re-suggest it?" Use the template at docs/adr/0000-template.md. Only offer when the reason would actually be needed by a future explorer to avoid re-suggesting the same thing — skip ephemeral reasons ("not worth it right now") and self-evident ones. For ephemeral or open-ended rejections, an Open question entry in the per-feature docs/technical/ page is the right home instead.
- Want to explore alternative interfaces for the deepened module? See INTERFACE-DESIGN.md.
- Deepening crosses into a domain another skill already owns? Defer to it for the implementation shape: fetching for query/mutation hooks, toast for mutation feedback, modals for dialog/sheet wiring, tables for PandaTableV2, wizard for multi-step forms, architecture for where new files land. The architectural argument is still yours; the local pattern is theirs.