aaf-architecture-review

Applies the Agentic Architecture Framework when designing or reviewing agentic systems. Covers the pillar checklist, trade-offs, the design-time spec, and pre-production review. Use when designing agentic systems, reviewing architectures, assessing pre-production readiness, or discussing autonomy and governance trade-offs.

By AgenticAF-Community. Updated 2/22/2026.

AAF Architecture Review

Use this skill when designing new agentic systems or reviewing existing architectures against the Agentic Architecture Framework (AAF). The framework is a set of architectural lenses and a method for surfacing trade-offs, not a one-size-fits-all checklist.

When to use

  • Designing an agentic system (before you build).
  • Conducting an architecture review (pre-production readiness).
  • Discussing autonomy levels, epistemic gates, or pillar trade-offs.
  • Aligning with AAF pillars (security, reliability, cost, operations, performance, sustainability) and cross-cutting foundations (context optimization, autonomy & outcome governance).

Mode 1: Design-time (before you build)

A design spec should answer, at minimum:

Autonomy declaration

  • What autonomy level: assistive, delegated, bounded autonomous, or supervisory?
  • What tasks are permitted at each level?
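
The declaration above can be made checkable by encoding it as data. The following is a minimal sketch, not part of AAF itself: the task names and the `is_permitted` helper are hypothetical, and denying anything not explicitly listed is an assumption.

```python
from enum import Enum


class AutonomyLevel(Enum):
    ASSISTIVE = "assistive"
    DELEGATED = "delegated"
    BOUNDED_AUTONOMOUS = "bounded autonomous"
    SUPERVISORY = "supervisory"


# Hypothetical task names; a real spec would list its own task classes.
PERMITTED_TASKS = {
    AutonomyLevel.ASSISTIVE: {"summarize_ticket"},
    AutonomyLevel.DELEGATED: {"summarize_ticket", "draft_refund"},
    AutonomyLevel.BOUNDED_AUTONOMOUS: {"summarize_ticket", "draft_refund", "issue_refund"},
}


def is_permitted(task: str, level: AutonomyLevel) -> bool:
    """Return True only if the task is explicitly listed for this level."""
    # Levels with no entry (SUPERVISORY here) permit nothing by default.
    return task in PERMITTED_TASKS.get(level, set())
```

Keeping the table explicit per level avoids assuming that a higher level automatically inherits everything below it.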

Authority model (epistemic gates)

  • Where are the gates between generation → validation → authority?
  • Which gates are deterministic checks, human approval, or policy-engine enforced?
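
To illustrate the generation → validation → authority sequence, here is a small sketch with one gate of each kind. The `Proposal` type, the refund limit of 100, and the gate names are all assumptions made for the example; a real policy engine would replace `policy_check`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Proposal:
    """Output of the generation phase: a proposed action, not yet authorized."""
    action: str
    amount: float


def schema_check(p: Proposal) -> bool:
    # Deterministic validation: reject malformed or unknown proposals.
    return p.action in {"refund", "note"} and p.amount >= 0


def policy_check(p: Proposal) -> bool:
    # Stand-in for a policy engine: refunds above a hypothetical limit are refused.
    return p.action != "refund" or p.amount <= 100.0


def authorize(p: Proposal, human_approves: Callable[[Proposal], bool]) -> tuple[bool, str]:
    """Run the gates in order; return (authorized, name of the deciding gate)."""
    if not schema_check(p):
        return False, "schema_check"
    if not policy_check(p):
        return False, "policy_check"
    # Human approval gate: in this sketch only refunds need sign-off.
    if p.action == "refund" and not human_approves(p):
        return False, "human_approval"
    return True, "all_gates"
```

Returning the name of the deciding gate makes each refusal attributable in logs and reviews.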

Outcome specification

  • Definition of Done for each task class.
  • Evidence required to validate completion.
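
A Definition of Done per task class can be expressed as named acceptance checks over collected evidence. This is an illustrative sketch only: the `issue_refund` task class, the evidence keys, and the `evaluate_done` helper are hypothetical.

```python
from typing import Callable

Evidence = dict[str, object]

# Hypothetical task class and checks; each check inspects the collected evidence.
DEFINITION_OF_DONE: dict[str, dict[str, Callable[[Evidence], bool]]] = {
    "issue_refund": {
        "refund_id_present": lambda ev: bool(ev.get("refund_id")),
        "ledger_confirms": lambda ev: ev.get("ledger_status") == "posted",
    },
}


def evaluate_done(task_class: str, evidence: Evidence) -> tuple[bool, list[str]]:
    """Return (done, names of failed checks). Unknown task classes are never done."""
    checks = DEFINITION_OF_DONE.get(task_class)
    if checks is None:
        return False, ["no_definition_of_done"]
    failed = [name for name, check in checks.items() if not check(evidence)]
    return not failed, failed
```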

Tool governance

  • Which tools exist, and is each one read-only, write, or irreversible?
  • Least-privilege scopes; verification after each tool call.
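
One possible shape for these rules is a tool wrapper that checks scope, gates irreversible effects, and verifies the result after every call. The `Tool` record, the scope strings, and `call_tool` below are assumptions for illustration, not an AAF API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Effect(Enum):
    READ = "read"
    WRITE = "write"
    IRREVERSIBLE = "irreversible"


@dataclass(frozen=True)
class Tool:
    name: str
    effect: Effect
    scope: str                      # hypothetical scope string, e.g. "tickets:read"
    run: Callable[[dict], dict]
    verify: Callable[[dict], bool]  # post-call check on the tool's result


def call_tool(tool: Tool, args: dict, granted_scopes: set[str], approved: bool = False) -> dict:
    """Enforce least privilege, gate irreversible effects, then verify the result."""
    if tool.scope not in granted_scopes:
        raise PermissionError(f"scope {tool.scope!r} not granted")
    if tool.effect is Effect.IRREVERSIBLE and not approved:
        raise PermissionError(f"{tool.name} is irreversible and needs approval")
    result = tool.run(args)
    if not tool.verify(result):
        raise RuntimeError(f"verification failed after {tool.name}")
    return result
```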

Budgets

  • Step/tool/token/time/spend budgets.
  • Behavior on budget exhaustion (escalation, defer, degrade).
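
A runtime budget can be sketched as a small counter that refuses any charge that would exceed its limit and reports the configured exhaustion behavior. The class names and the dimension keys (`steps`, `tokens`) are hypothetical; the `on_exhaustion` values mirror the options listed above.

```python
from dataclasses import dataclass, field


class BudgetExhausted(Exception):
    def __init__(self, dimension: str, behavior: str):
        super().__init__(f"{dimension} budget exhausted; {behavior}")
        self.dimension = dimension
        self.behavior = behavior


@dataclass
class Budget:
    limits: dict[str, float]              # e.g. {"steps": 20, "tokens": 50_000}
    on_exhaustion: str = "escalate"       # or "defer", "degrade"
    used: dict[str, float] = field(default_factory=dict)

    def charge(self, dimension: str, amount: float) -> float:
        """Record usage and return what remains; raise before exceeding the limit."""
        spent = self.used.get(dimension, 0.0) + amount
        if spent > self.limits[dimension]:
            raise BudgetExhausted(dimension, self.on_exhaustion)
        self.used[dimension] = spent
        return self.limits[dimension] - spent
```

The caller decides what "escalate", "defer", or "degrade" means in practice; the budget only guarantees that the limit is never silently crossed.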

Context policy

  • How context is constructed and bounded.
  • Separation of memory from task context; provenance (trusted vs untrusted).
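
The sketch below shows one way to bound context and carry provenance with every item. It is a simplification under stated assumptions: the whitespace token estimate stands in for a real tokenizer, and the `ContextItem` fields and the "trusted first" ordering are illustrative choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextItem:
    text: str
    source: str      # e.g. "system_policy", "retrieval", "memory"
    trusted: bool    # provenance flag: trusted policy vs untrusted data


def estimate_tokens(text: str) -> int:
    # Crude whitespace count, used here only as a stand-in for a real tokenizer.
    return len(text.split())


def build_context(items: list[ContextItem], token_budget: int) -> list[ContextItem]:
    """Keep trusted items first, then add untrusted ones while the budget allows."""
    ordered = sorted(items, key=lambda item: not item.trusted)  # stable sort
    selected: list[ContextItem] = []
    used = 0
    for item in ordered:
        cost = estimate_tokens(item.text)
        if used + cost > token_budget:
            continue  # skip items that do not fit rather than truncating them
        selected.append(item)
        used += cost
    return selected
```

Because each item keeps its `source` and `trusted` fields, downstream prompt assembly can keep memory and task context in separate sections and mark untrusted text as data.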

Failure and escalation

  • Escalation triggers (uncertainty, verification failure, high-risk action, suspected injection, tool failures).
  • Degraded modes (read-only, observe-only, human-required).
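
Escalation triggers and degraded modes can be tied together with a simple lookup. The trigger-to-mode mapping and the restrictiveness ordering below are assumptions for illustration; a real system would choose its own, along with the thresholds that raise each trigger.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    READ_ONLY = "read-only"
    OBSERVE_ONLY = "observe-only"
    HUMAN_REQUIRED = "human-required"


# Hypothetical mapping from trigger to degraded mode.
TRIGGER_TO_MODE = {
    "uncertainty": Mode.HUMAN_REQUIRED,
    "verification_failure": Mode.READ_ONLY,
    "high_risk_action": Mode.HUMAN_REQUIRED,
    "suspected_injection": Mode.OBSERVE_ONLY,
    "tool_failure": Mode.READ_ONLY,
}

# Assumed ordering from least to most restrictive.
RESTRICTIVENESS = [Mode.NORMAL, Mode.READ_ONLY, Mode.OBSERVE_ONLY, Mode.HUMAN_REQUIRED]


def next_mode(triggers: set[str]) -> Mode:
    """Pick the most restrictive mode among fired triggers; unknown triggers fail closed."""
    modes = [TRIGGER_TO_MODE.get(t, Mode.HUMAN_REQUIRED) for t in triggers]
    return max(modes, key=RESTRICTIVENESS.index, default=Mode.NORMAL)
```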

Mode 2: Architecture review (pre-production)

Apply the pillars as structured review lenses. If budgets, verification, or observability are missing, that is a strong indicator that the system is not production-ready.

Security Architecture

  • Are all entry points authenticated and authorized?
  • Are tool scopes least privilege?
  • Are write actions gated and verified?
  • Are untrusted inputs (including retrieved content) treated as hostile?

Reliability

  • Is success defined as a verifiable end state?
  • Are tool failures expected and handled?
  • Are actions idempotent or checkpointed?
  • Are retries safe?
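
As a sketch of what "idempotent" and "safe retries" can mean in practice, the executor below records results by idempotency key so that a repeated request is not applied twice. The class is hypothetical, the store is in-memory only, and the sketch assumes a failed attempt had no side effect; a real tool would need the downstream system to honor the key.

```python
from typing import Callable


class IdempotentExecutor:
    """Caches results by idempotency key so a retried action is not applied twice."""

    def __init__(self) -> None:
        self._completed: dict[str, object] = {}

    def run(self, key: str, action: Callable[[], object], max_attempts: int = 3) -> object:
        if key in self._completed:
            # Already applied: return the recorded result instead of acting again.
            return self._completed[key]
        for attempt in range(1, max_attempts + 1):
            try:
                result = action()
            except ConnectionError:
                # Treated as transient; re-raise once attempts are used up.
                if attempt == max_attempts:
                    raise
                continue
            self._completed[key] = result
            return result
        raise ValueError("max_attempts must be at least 1")
```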

Cost Optimization

  • Are budgets enforced at runtime?
  • Is model routing explicit by phase and risk?
  • Is context budgeted (no uncontrolled prompt accumulation)?
  • Are caching and early stopping designed in?
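
Explicit routing by phase and risk can be as simple as a lookup table with no silent default. The phase names, risk labels, and model tier names below are placeholders, not real model identifiers.

```python
# Hypothetical model tiers; substitute the names your provider actually offers.
ROUTES = {
    ("plan", "low"): "small-model",
    ("plan", "high"): "large-model",
    ("act", "low"): "small-model",
    ("act", "high"): "small-model",
    ("verify", "low"): "small-model",
    ("verify", "high"): "large-model",
}


def route_model(phase: str, risk: str) -> str:
    """Return the model tier for a phase and risk level; unknown pairs are an error."""
    try:
        return ROUTES[(phase, risk)]
    except KeyError:
        raise ValueError(f"no route declared for phase={phase!r}, risk={risk!r}") from None
```

Raising on an undeclared pair keeps routing decisions reviewable instead of falling back to the most expensive model by accident.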

Operational Excellence

  • Is the full control loop (Trigger → Decide → Act → Verify) observable, with a trace captured for each run (intent → plan → act → verify)?
  • Is there an evaluation harness and regression suite?
  • Is rollout staged with rollback?
  • Are skills/tools versioned and reviewed?
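
For the observability question above, a minimal trace recorder might look like the following. The `Trace` class and its event fields are assumptions for the example; a production system would more likely emit these events through its existing tracing stack.

```python
import json
import time
import uuid
from dataclasses import dataclass, field

PHASES = ("intent", "plan", "act", "verify")


@dataclass
class Trace:
    run_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    events: list[dict] = field(default_factory=list)

    def record(self, phase: str, detail: dict) -> None:
        if phase not in PHASES:
            raise ValueError(f"unknown phase {phase!r}")
        self.events.append({"phase": phase, "ts": time.time(), "detail": detail})

    def missing_phases(self) -> list[str]:
        """Phases with no event; a non-empty result means the run is not fully traced."""
        seen = {event["phase"] for event in self.events}
        return [phase for phase in PHASES if phase not in seen]

    def to_json(self) -> str:
        return json.dumps({"run_id": self.run_id, "events": self.events})
```

A review can then assert that `missing_phases()` is empty for every run in the regression suite.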

Performance Efficiency

  • Is topology justified by task structure (single-agent by default; orchestration only where it helps)?
  • Are tool round trips minimized?
  • Is work partitioned into interactive vs batch?

Sustainability

  • Is usage measured and visible?
  • Are efficiency levers used as defaults (minimal context, concise outputs, cached prefixes, bounded loops)?

Context Optimization

  • Is context separated from memory (task-scoped vs durable)?
  • Is context budgeted per task/step with explicit allocations?
  • Is provenance tracked (trusted policy vs untrusted data)?
  • Is retrieval/context construction minimal and explainable?

Autonomy & Outcome Governance

  • Is autonomy level declared (assistive, delegated, bounded autonomous, supervisory)?
  • Is there a Definition of Done with acceptance checks and evidence per task class?
  • Are budgets (steps/tools/tokens/time/spend) enforced and visible?
  • Are escalation triggers and degraded modes defined?

Maturity model (scaling autonomy safely)

  • Stage 0 — Assistive: Read-only tools; human decision authority; logging and evaluation baselines.
  • Stage 1 — Delegated: Preview/approve for write actions; deterministic outcome verification; scoped tool permissions.
  • Stage 2 — Bounded autonomy: Enforced budgets; policy gates for privileged actions; escalation triggers and degraded modes; canary and rollback.
  • Stage 3 — Supervisory: Orchestrator as validation bottleneck; specialist agents with narrow scopes; cross-agent budgets and provenance.
  • Stage 4 — Interoperable: MCP/A2A under strict governance; shared policy and audit; cross-domain budgets and verification.

Key principle

Epistemic and autonomy gates must scale with risk. Low-stakes workloads can tolerate lighter gates; high-stakes require strong, unavoidable gates and explicit accountability.
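
As a rough illustration of gates scaling with risk, the table below adds gates as the risk tier rises and falls back to the strictest tier for unknown labels. The tier names and gate names are hypothetical.

```python
# Hypothetical gate sets; each tier includes everything the tier below requires.
GATES_BY_RISK = {
    "low": ["deterministic_check"],
    "medium": ["deterministic_check", "policy_engine"],
    "high": ["deterministic_check", "policy_engine", "human_approval"],
}


def required_gates(risk: str) -> list[str]:
    """Return the gates for a risk tier; unknown labels get the strictest tier."""
    return list(GATES_BY_RISK.get(risk, GATES_BY_RISK["high"]))
```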

Additional resources

  • Framework overview and pillars: docs/05-framework-overview.md
  • Application method (design + review): docs/15-application-method.md
  • Whitepaper and website: https://agenticaf.io/

Install via CLI

npx skills add https://github.com/AgenticAF-Community/FrameworkCore --skill aaf-architecture-review