name: context-engineer
description: Design agent memory architectures and context window optimization strategies. Use when building persistent memory systems, context budgeting, dynamic context loading, knowledge retrieval, or managing token limits. Covers three-tier memory (episodic, semantic, procedural), context priority frameworks, just-in-time loading patterns, cache invalidation, and provider-agnostic context layers. Based on patterns from Kimi's skill injection, Cursor's scratchpad, BabyAGI's graph memory, and emerging context engineering practices.
Context Engineer
Design memory architectures and context window strategies for AI agents.
Workflow
Memory Design Workflow
- Identify what the agent needs to remember (facts, procedures, episodes)
- Classify memory into tiers using the three-tier model
- Select storage backend for each tier
- Define retrieval strategies and cache invalidation rules
- Set token budgets per context section
Context Audit Workflow
- Measure current context utilization (tokens per section)
- Identify redundant or stale content
- Apply the priority framework to rank sections
- Implement dynamic loading for low-priority knowledge
- Re-measure and compare
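The measurement step of the audit can be sketched as below. This is a minimal illustration, not the skill's own tooling: `estimate_tokens` and `audit_context` are hypothetical names, and the four-characters-per-token rule is only a rough heuristic standing in for a real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token (heuristic, not a tokenizer)."""
    return (len(text) + 3) // 4


def audit_context(sections: dict[str, str], token_limit: int) -> dict[str, dict[str, float]]:
    """Report estimated tokens and share of the window for each context section."""
    report = {}
    for name, text in sections.items():
        tokens = estimate_tokens(text)
        report[name] = {"tokens": tokens, "share": tokens / token_limit}
    return report


if __name__ == "__main__":
    sections = {"identity": "a" * 400, "history": "b" * 1600}
    for name, row in audit_context(sections, token_limit=1000).items():
        print(f"{name}: {row['tokens']} tokens ({row['share']:.0%} of window)")
```

Running the same function before and after trimming gives the two measurements to compare in the final step.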
Three-Tier Memory Model
Read the relevant reference for implementation templates.
| Tier | What It Stores | Lifespan | Reference |
|---|---|---|---|
| Episodic | Specific interaction logs and outcomes | Session or cross-session | references/01-episodic-memory.md |
| Semantic | General knowledge and learned patterns | Persistent | references/02-semantic-memory.md |
| Procedural | Workflows, strategies, and refined processes | Persistent, versioned | references/03-procedural-memory.md |
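As an illustration of how the three tiers differ in lifespan, the sketch below models them in memory. It is not taken from the reference files: `Tier`, `MemoryItem`, and `MemoryStore` are hypothetical names, and a real system would back each tier with its own storage.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    EPISODIC = "episodic"      # specific interaction logs and outcomes
    SEMANTIC = "semantic"      # general knowledge and learned patterns
    PROCEDURAL = "procedural"  # workflows and strategies, versioned


@dataclass
class MemoryItem:
    tier: Tier
    content: str
    session_only: bool = False  # episodic items may end with the session
    version: int = 1            # bumped when a procedure is refined


@dataclass
class MemoryStore:
    items: list[MemoryItem] = field(default_factory=list)

    def remember(self, tier: Tier, content: str, session_only: bool = False) -> MemoryItem:
        item = MemoryItem(tier, content, session_only)
        self.items.append(item)
        return item

    def refine(self, item: MemoryItem, new_content: str) -> None:
        """Update a procedural item in place and bump its version."""
        if item.tier is not Tier.PROCEDURAL:
            raise ValueError("only procedural memory is versioned")
        item.content = new_content
        item.version += 1

    def end_session(self) -> None:
        """Drop session-scoped episodic items; everything else persists."""
        self.items = [
            i for i in self.items
            if not (i.tier is Tier.EPISODIC and i.session_only)
        ]

    def recall(self, tier: Tier) -> list[str]:
        return [i.content for i in self.items if i.tier is tier]
```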
Context Budgeting
Read the reference for budget allocation templates.
| Section | Priority | Budget | Reference |
|---|---|---|---|
| System Identity | Critical | Fixed (5-10%) | references/04-context-budgeting.md |
| Active Task Context | Critical | Dynamic (30-50%) | references/04-context-budgeting.md |
| Retrieved Knowledge | High | Dynamic (20-30%) | references/04-context-budgeting.md |
| Conversation History | Medium | Sliding window (10-20%) | references/04-context-budgeting.md |
| Cached Results | Low | Evictable (5-10%) | references/04-context-budgeting.md |
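The ranges in the table can be turned into concrete token counts in several ways. The sketch below shows one possible policy, assumed here for illustration: give every section its minimum share, then hand the remaining percentage to sections in priority order up to their maximum. `BUDGET_RANGES` and `allocate_budget` are hypothetical names.

```python
# (section, min %, max %) in priority order, taken from the table above.
BUDGET_RANGES = [
    ("system_identity", 5, 10),
    ("active_task", 30, 50),
    ("retrieved_knowledge", 20, 30),
    ("conversation_history", 10, 20),
    ("cached_results", 5, 10),
]


def allocate_budget(token_limit: int) -> dict[str, int]:
    """Return a token budget per section for a given context window size."""
    # Start every section at its minimum share.
    shares = {name: low for name, low, _ in BUDGET_RANGES}
    remaining = 100 - sum(shares.values())
    # Distribute what is left to higher-priority sections first.
    for name, low, high in BUDGET_RANGES:
        extra = min(high - low, remaining)
        shares[name] += extra
        remaining -= extra
    return {name: token_limit * pct // 100 for name, pct in shares.items()}


if __name__ == "__main__":
    for section, tokens in allocate_budget(8000).items():
        print(f"{section}: {tokens}")
```

Under this policy the critical sections reach their upper bound first, and the lower-priority sections stay near their minimum.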
Dynamic Context Loading
Read the reference for loading pattern templates.
| Pattern | Description | Reference |
|---|---|---|
| Just-In-Time | Load knowledge only when task requires it | references/05-dynamic-loading.md |
| Prefetch | Predict and preload likely-needed context | references/05-dynamic-loading.md |
| Eviction | Remove low-relevance content when budget exceeded | references/05-dynamic-loading.md |
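A minimal sketch of the three patterns, assuming knowledge lives behind some fetch function keyed by name. `ContextLoader` and its `fetch` callback are hypothetical; the callback stands in for whatever storage backend is in use.

```python
from typing import Callable, Iterable


class ContextLoader:
    """Loads context on demand and tracks how often the backend is hit."""

    def __init__(self, fetch: Callable[[str], str]):
        self._fetch = fetch
        self._loaded: dict[str, str] = {}
        self.fetch_count = 0

    def _load(self, key: str) -> str:
        if key not in self._loaded:
            self._loaded[key] = self._fetch(key)
            self.fetch_count += 1
        return self._loaded[key]

    def get(self, key: str) -> str:
        """Just-in-time: fetch only when the task asks for this key."""
        return self._load(key)

    def prefetch(self, keys: Iterable[str]) -> None:
        """Prefetch: load keys the caller predicts will be needed."""
        for key in keys:
            self._load(key)

    def evict(self, key: str) -> None:
        """Eviction: drop a loaded item; it is fetched again if requested later."""
        self._loaded.pop(key, None)
```

Which keys to prefetch and which to evict is a policy decision left to the caller in this sketch.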
Context Priority Framework
When the context window is full, evict in this order (lowest priority first):
- Cached tool outputs — regenerable on demand
- Old conversation turns — summarize instead of keeping verbatim
- Background reference material — reload from storage if needed
- Retrieved examples — keep only the most relevant
- NEVER evict — system identity, safety rules, active task state
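The eviction order above could be applied as in the following sketch. The names (`EVICTION_RANK`, `Block`, `evict_to_fit`) and the `relevance` score are assumptions for illustration. For brevity the sketch drops old conversation turns outright; a fuller version would replace them with a summary, as the list recommends.

```python
from dataclasses import dataclass

# Lower rank is evicted first; None marks content that is never evicted.
EVICTION_RANK = {
    "cached_tool_output": 0,
    "old_conversation_turn": 1,
    "background_reference": 2,
    "retrieved_example": 3,
    "system_identity": None,
    "safety_rules": None,
    "active_task_state": None,
}


@dataclass
class Block:
    kind: str
    tokens: int
    relevance: float = 0.0  # hypothetical score; higher means more relevant


def evict_to_fit(blocks: list[Block], budget: int) -> list[Block]:
    """Drop evictable blocks in priority order until the total fits the budget."""
    total = sum(b.tokens for b in blocks)
    candidates = sorted(
        (b for b in blocks if EVICTION_RANK[b.kind] is not None),
        key=lambda b: (EVICTION_RANK[b.kind], b.relevance),
    )
    evicted: set[int] = set()
    for victim in candidates:
        if total <= budget:
            break
        evicted.add(id(victim))
        total -= victim.tokens
    # Protected blocks are kept even if the budget is still exceeded.
    return [b for b in blocks if id(b) not in evicted]
```

Within a rank, the least relevant block goes first, so only the most relevant retrieved examples survive.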
Provider-Agnostic Context Layer
Separate the context from the model:
<context_layer>
<identity>[System prompt — model-independent]</identity>
<knowledge>[Retrieved facts — stored externally]</knowledge>
<state>[Task progress — persisted to DB/file]</state>
<history>[Conversation — sliding window]</history>
</context_layer>
<model_layer>
<provider>[OpenAI | Anthropic | Google | Local]</provider>
<model>[Specific model name]</model>
<token_limit>[Context window size]</token_limit>
</model_layer>
Switching providers requires changing only the model layer; the context layer stays identical.
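One way to express this separation in code is sketched below. `ContextLayer`, `ModelLayer`, and `build_request` are hypothetical names, the provider and model strings are placeholders, and no real provider SDK is called.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ContextLayer:
    identity: str               # system prompt, model-independent
    knowledge: tuple[str, ...]  # retrieved facts, stored externally
    state: str                  # task progress, persisted to DB/file
    history: tuple[str, ...]    # conversation, sliding window


@dataclass(frozen=True)
class ModelLayer:
    provider: str
    model: str
    token_limit: int


def build_request(context: ContextLayer, model: ModelLayer) -> dict:
    """Pair an unchanged context layer with whichever model layer is active."""
    return {
        "provider": model.provider,
        "model": model.model,
        "token_limit": model.token_limit,
        "context": context,
    }


if __name__ == "__main__":
    context = ContextLayer("You are a helpful agent.", ("fact 1",), "step 2 of 5", ("hi",))
    model_a = ModelLayer("provider-a", "model-a", 128_000)
    # Switching providers touches only the model layer.
    model_b = replace(model_a, provider="provider-b", model="model-b")
    print(build_request(context, model_a)["context"] is build_request(context, model_b)["context"])
```

A per-provider adapter would then translate the returned dictionary into that provider's request format.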
Anti-Patterns
- Context Stuffing — cramming everything into the prompt regardless of relevance
- Stateless Agent — no memory between sessions, relearns everything
- Stale Cache — cached information never expires, becomes incorrect
- Token Waste — verbose formatting consuming budget (XML when plain text suffices)
- Lost in the Middle — critical information buried in the center of long contexts
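The Stale Cache anti-pattern can be avoided by giving every cached entry an expiry. The sketch below assumes a simple time-to-live rule; `TTLCache` is a hypothetical name, and the injectable `clock` exists only to make the behavior easy to test.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """Cache whose entries expire after a fixed number of seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, str]] = {}

    def put(self, key: str, value: str) -> None:
        # Store the value together with its expiry time.
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: remove it so the caller regenerates fresh data.
            del self._entries[key]
            return None
        return value
```

Time-based expiry is only one invalidation rule; entries could also be invalidated when the underlying source changes.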
Validation Scripts
Validate context architecture with automated scoring (0-10):
python3 scripts/validate_context.py <config_file> [--strict]
The script checks for three-tier memory (episodic/semantic/procedural), token budgeting, and eviction policies, and flags anti-patterns (unbounded injection, raw history dumping, no eviction).