context-engineer - SKILL.md Agent Skill

name: context-engineer description: Design agent memory architectures and context window optimization strategies. Use when building persistent memory systems, context budgeting, dynamic context loading, knowledge retrieval, or managing token limits. Covers three-tier memory (episodic, semantic, procedural), context priority frameworks, just-in-time loading patterns, cache invalidation, and provider-agnostic context layers. Based on patterns from Kimi's skill injection, Cursor's scratchpad, BabyAGI's graph memory, and emerging context engineering practices.

Context Engineer

Design memory architectures and context window strategies for AI agents.

Workflow

Memory Design Workflow

Identify what the agent needs to remember (facts, procedures, episodes)
Classify memory into tiers using the three-tier model
Select storage backend for each tier
Define retrieval strategies and cache invalidation rules
Set token budgets per context section

Context Audit Workflow

Measure current context utilization (tokens per section)
Identify redundant or stale content
Apply the priority framework to rank sections
Implement dynamic loading for low-priority knowledge
Re-measure and compare

Three-Tier Memory Model

Read the relevant reference for implementation templates.

Tier	What It Stores	Lifespan	Reference
Episodic	Specific interaction logs and outcomes	Session or cross-session	`references/01-episodic-memory.md`
Semantic	General knowledge and learned patterns	Persistent	`references/02-semantic-memory.md`
Procedural	Workflows, strategies, and refined processes	Persistent, versioned	`references/03-procedural-memory.md`

Context Budgeting

Read the reference for budget allocation templates.

Section	Priority	Budget	Reference
System Identity	Critical	Fixed (5-10%)	`references/04-context-budgeting.md`
Active Task Context	Critical	Dynamic (30-50%)	`references/04-context-budgeting.md`
Retrieved Knowledge	High	Dynamic (20-30%)	`references/04-context-budgeting.md`
Conversation History	Medium	Sliding window (10-20%)	`references/04-context-budgeting.md`
Cached Results	Low	Evictable (5-10%)	`references/04-context-budgeting.md`

Dynamic Context Loading

Read the reference for loading pattern templates.

Pattern	Description	Reference
Just-In-Time	Load knowledge only when task requires it	`references/05-dynamic-loading.md`
Prefetch	Predict and preload likely-needed context	`references/05-dynamic-loading.md`
Eviction	Remove low-relevance content when budget exceeded	`references/05-dynamic-loading.md`

Context Priority Framework

When context window is full, evict in this order (lowest priority first):

Cached tool outputs — regenerable on demand
Old conversation turns — summarize instead of keeping verbatim
Background reference material — reload from storage if needed
Retrieved examples — keep only the most relevant
NEVER evict — system identity, safety rules, active task state

Provider-Agnostic Context Layer

Separate context from model:

<context_layer>
  <identity>[System prompt — model-independent]</identity>
  <knowledge>[Retrieved facts — stored externally]</knowledge>
  <state>[Task progress — persisted to DB/file]</state>
  <history>[Conversation — sliding window]</history>
</context_layer>

<model_layer>
  <provider>[OpenAI | Anthropic | Google | Local]</provider>
  <model>[Specific model name]</model>
  <token_limit>[Context window size]</token_limit>
</model_layer>

Switching providers requires ONLY changing the model layer. Context layer stays identical.

Anti-Patterns

Context Stuffing — cramming everything into the prompt regardless of relevance
Stateless Agent — no memory between sessions, relearns everything
Stale Cache — cached information never expires, becomes incorrect
Token Waste — verbose formatting consuming budget (XML when plain text suffices)
Lost in the Middle — critical information buried in the center of long contexts

Validation Scripts

Validate context architecture with automated scoring (0-10):

python3 scripts/validate_context.py <config_file> [--strict]

Checks three-tier memory detection (episodic/semantic/procedural), token budgeting, eviction policies, and flags anti-patterns (unbounded injection, raw history dumping, no eviction).