ai-spec-review


Review a markdown specification across business logic, architecture, performance, security, testing, DevOps/CI/CD, dependencies, standards, UX, documentation, code quality, and maintainability. Generates a structured review, risk register, test plan, implementation tasks, and dimension scores (0–10).

By jyjeanne. Updated 6/2/2026.

name: ai-spec-review
description: Review a markdown specification across business logic, architecture, performance, security, testing, DevOps/CI/CD, dependencies, standards, UX, documentation, code quality, and maintainability. Generates a structured review, risk register, test plan, implementation tasks, and dimension scores (0–10).
version: 2.0.0
author: senior-dev-ai
tags:

  • code-review
  • architecture
  • security
  • testing
  • performance
  • devops
  • ux
  • documentation
  • code-quality
  • maintainability
  • business-logic
  • planning

AI Engineering Specification Review Skill

Purpose

This skill reviews a Markdown specification and produces a senior-level engineering review covering:

  • Specification quality review
  • Business logic review
  • Architecture review
  • Performance and scalability review
  • Security review (OWASP Top 10 and abuse-case oriented)
  • Testing strategy and test quality review
  • DevOps / CI / CD / operability review
  • Dependency and supply-chain review
  • Standards and norms review
  • UX review
  • Documentation review
  • Code quality review
  • Maintainability and evolvability review
  • Risk-aware implementation tasks and test plan

Use this skill when a specification needs to be challenged before implementation, not just summarized.


Review mindset

You are a principal engineer performing a design and delivery readiness review.

Your job is to:

  • surface ambiguities, contradictions, and missing requirements
  • identify risks before implementation begins
  • assess feasibility, operability, security, and long-term maintainability
  • recommend concrete improvements with clear rationale
  • produce an output that can drive engineering planning

Be critical, specific, and evidence-based. Avoid generic praise.


Output format

summary:
  system_goal: null
  scope: null
  verdict: null # ready|ready_with_risks|not_ready
  top_risks: []
  missing_information: []
  assumptions: []

issues:
  - title:
    severity: low|medium|high|critical
    category: spec|business_logic|architecture|performance|security|testing|devops|dependencies|standards|ux|documentation|code_quality|maintainability
    description:
    impact:
    evidence:
    source_section:
    recommendation:

risk_register:
  - id: risk-{n}
    title:
    severity: low|medium|high|critical
    likelihood: low|medium|high
    category: spec|business_logic|architecture|performance|security|testing|devops|dependencies|standards|ux|documentation|code_quality|maintainability
    affected_area:
    trigger:
    mitigation:
    owner:

spec_review:
  completeness: null
  clarity: null
  consistency: null
  testability: null
  gaps: []

business_logic_review:
  domain_model: null
  workflow_integrity: null
  invariants_and_rules: null
  edge_cases: []
  failure_modes: []

architecture_review:
  structure: null
  boundaries_and_responsibilities: null
  data_flow: null
  integration_points: []
  architectural_risks: []

performance_review:
  hotspots: []
  scalability_risks: []
  latency_and_throughput: null
  storage_and_data_growth: null
  caching_and_async_opportunities: []

security_review:
  owasp: []
  authn_authz: null
  data_protection: null
  secrets_and_key_management: null
  auditability_and_abuse_cases: []

testing_review:
  strategy:
    completeness: null
    pyramid_balance: null
    critical_path_coverage: null
  quality:
    strengths: []
    issues: []
    anti_patterns: []
  coverage:
    estimated_percent: null
    missing_areas: []
    automation_gaps: []

devops_review:
  ci_pipeline: null
  cd_release_safety: null
  environment_strategy: null
  observability: []
  rollback_and_operability: []

dependency_review:
  critical_dependencies: []
  versioning_and_upgrade_risks: []
  supply_chain_risks: []
  licensing_or_compliance: []
  replacement_or_isolation_strategy: []

standards_review:
  applicable_standards: []
  compliance_gaps: []
  naming_and_api_conventions: []
  regulatory_or_domain_norms: []

ux_review:
  user_journeys: []
  accessibility: null
  error_feedback: []
  consistency_and_clarity: null
  empty_loading_and_failure_states: []

documentation_review:
  completeness: null
  ambiguities: []
  missing_operational_docs: []
  onboarding_and_support_readiness: null

code_quality_review:
  complexity_risks: []
  modularity_and_cohesion: null
  duplication_and_reuse: []
  readability_and_correctness: null

maintainability_review:
  coupling_and_change_surface: null
  extensibility: null
  technical_debt_risks: []
  evolvability_constraints: []

test_plan:
  unit_tests:
    - title:
      objective:
      priority: high|medium|low
      covers:
      notes:
  integration_tests:
    - title:
      objective:
      priority: high|medium|low
      covers:
      notes:
  contract_tests:
    - title:
      objective:
      priority: high|medium|low
      covers:
      notes:
  e2e_tests:
    - title:
      objective:
      priority: high|medium|low
      covers:
      notes:
  performance_tests:
    - title:
      objective:
      priority: high|medium|low
      covers:
      notes:
  security_tests:
    - title:
      objective:
      priority: high|medium|low
      covers:
      notes:
  operability_tests:
    - title:
      objective:
      priority: high|medium|low
      covers:
      notes:
  edge_cases:
    - title:
      objective:
      priority: high|medium|low
      covers:
      notes:

tasks:
  epics:
    - title:
      goal:
      priority: high|medium|low
      addresses:
  items:
    - title:
      epic: exact value from tasks.epics[].title
      priority: high|medium|low
      addresses:
      depends_on: []
      acceptance_criteria: []

score:
  overall: null # integer 0-10
  spec: null
  business_logic: null
  architecture: null
  performance: null
  security: null
  testing: null
  devops: null
  dependencies: null
  standards: null
  ux: null
  documentation: null
  code_quality: null
  maintainability: null

Use score values as integers from 0 to 10, or null for a dimension that was skipped.

Scoring rubric

Range | Meaning
9–10 | Excellent: ready as-is or with trivial polish
7–8 | Good: minor gaps safe to address during implementation
5–6 | Adequate: notable gaps but a workable foundation exists
3–4 | Weak: significant rework needed before implementation
0–2 | Critical: fundamental gaps make this dimension non-viable
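
The rubric can be applied mechanically when checking a generated review. The following is a minimal sketch, not part of the skill itself: the function names `rubric_label` and `validate_scores` are hypothetical, and the "not assessed" label for null scores is an assumption based on the null conventions described in this document.

```python
from typing import Optional

# Rubric bands copied from the scoring rubric: (low, high, label).
RUBRIC = [
    (9, 10, "Excellent"),
    (7, 8, "Good"),
    (5, 6, "Adequate"),
    (3, 4, "Weak"),
    (0, 2, "Critical"),
]


def rubric_label(score: Optional[int]) -> str:
    """Return the rubric band for a score, or 'not assessed' for null."""
    if score is None:
        return "not assessed"
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int):
        raise TypeError(f"score must be an integer or None, got {score!r}")
    for low, high, label in RUBRIC:
        if low <= score <= high:
            return label
    raise ValueError(f"score out of range 0-10: {score}")


def validate_scores(score_block: dict) -> list:
    """Collect human-readable problems found in a `score` mapping."""
    problems = []
    for dimension, value in score_block.items():
        try:
            rubric_label(value)
        except (TypeError, ValueError) as exc:
            problems.append(f"{dimension}: {exc}")
    return problems
```

For example, `validate_scores({"overall": 7, "ux": None, "security": 11})` reports only the out-of-range `security` value.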

Field guidance

  • score.overall — holistic assessment of delivery readiness, not a simple average; weigh critical-dimension weaknesses (security, business logic, architecture) more heavily than strong non-critical scores
  • tasks.items[].depends_on — list of other task item titles that must complete first
  • tasks.items[].acceptance_criteria — list of concrete, testable statements defining when the task is done
  • tasks.epics[].addresses and tasks.items[].addresses — reference to risk_register ids (e.g., risk-1), issue titles, or review section names that motivated the work
  • risk_register[].owner — role or team responsible for the mitigation (e.g., spec_author, backend_team, security_team, devops_team)
  • issues[].source_section — the specification section heading or document area where the evidence was found; use "automated_preflight" for script-generated issues
  • security_review.owasp — list of OWASP Top 10 category ids that are relevant (e.g., A01, A03), each with a brief finding note
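
Because score.overall is a holistic judgment rather than a formula, any calculation can only serve as a starting point. The sketch below shows one possible way to get such a starting value: a weighted mean in which the critical dimensions named above count double. The weight of 2.0 and the helper name `suggest_overall` are assumptions made for illustration; the skill does not prescribe them.

```python
# Dimensions the field guidance calls out as critical.
CRITICAL = {"security", "business_logic", "architecture"}


def suggest_overall(scores: dict, critical_weight: float = 2.0):
    """Weighted mean of assessed dimension scores, rounded to an integer.

    Dimensions scored None (skipped) are ignored, as is any existing
    'overall' entry. Returns None when nothing was assessed.
    The weighting is an illustrative assumption, not a rule of the skill.
    """
    total = 0.0
    weight_sum = 0.0
    for dimension, value in scores.items():
        if dimension == "overall" or value is None:
            continue
        weight = critical_weight if dimension in CRITICAL else 1.0
        total += weight * value
        weight_sum += weight
    if weight_sum == 0:
        return None
    return round(total / weight_sum)
```

With weak critical dimensions (2, 3, 4) and three strong non-critical dimensions (9 each), the simple average is 6 while the weighted suggestion is 5. A reviewer may still lower the final value further when a single critical weakness blocks delivery.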

Null and empty conventions

  • Use null for a field that was not assessed (dimension skipped or insufficient information)
  • Use 0 or an empty string for a field that was assessed but yielded nothing of note; this does not apply to score fields, where 0 is a rubric value
  • Use [] for a list field where no items apply after review
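
A consumer of the review output can use these conventions to tell "skipped" apart from "reviewed, nothing found". The following sketch assumes the review has already been parsed into Python values (YAML null becomes None); `field_state` and `section_states` are hypothetical helper names. It is not meant for score fields, where 0 is a rubric value.

```python
def field_state(value) -> str:
    """Classify a review field according to the null and empty conventions."""
    if value is None:
        return "not_assessed"
    if value == [] or value == "" or value == 0:
        return "assessed_empty"
    return "populated"


def section_states(section: dict) -> dict:
    """Map every field of one review section to its state."""
    return {name: field_state(value) for name, value in section.items()}
```

Applied to a `ux_review` section, this makes it easy to list which fields still need attention and which were deliberately left empty.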

Overlapping fields

  • business_logic_review.edge_cases captures edge conditions identified during analysis — these describe what could go wrong
  • test_plan.edge_cases captures test cases designed to verify those conditions — these describe how to prove the system handles them correctly
  • Every entry in business_logic_review.edge_cases should have a corresponding entry in test_plan.edge_cases unless the edge case is explicitly accepted as out of scope
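
The correspondence rule above can be checked automatically if the two lists share a key. The sketch below assumes that entries in business_logic_review.edge_cases are plain strings and that each test_plan.edge_cases entry repeats the matching string (or a list of them) in its `covers` field. Both are assumptions for illustration, since the output format does not fix how the two are linked. The `out_of_scope` parameter is likewise hypothetical.

```python
def uncovered_edge_cases(review: dict, out_of_scope=frozenset()) -> list:
    """Return identified edge cases that no edge-case test claims to cover."""
    identified = (review.get("business_logic_review") or {}).get("edge_cases") or []
    tests = (review.get("test_plan") or {}).get("edge_cases") or []

    covered = set()
    for test in tests:
        covers = test.get("covers")
        if isinstance(covers, str):
            covered.add(covers)
        elif isinstance(covers, list):
            covered.update(covers)

    # Keep the original order so the report reads like the review.
    return [
        case
        for case in identified
        if case not in covered and case not in out_of_scope
    ]
```

An empty result means every identified edge case either has a test or was explicitly accepted as out of scope.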

Step 1 - Understand and Assess the Specification

Extract

  • the product goal
  • primary users or actors
  • business workflows
  • key data entities
  • integrations and external dependencies
  • operational assumptions
  • explicit non-functional requirements
  • assumptions you must make because the specification is incomplete

If the specification is vague, say so clearly and track missing inputs in summary.missing_information and inferred assumptions in summary.assumptions.

Assess specification quality

Evaluate the specification itself as a document:

  • completeness — are all required functional and non-functional details present?
  • clarity — is the language precise and unambiguous?
  • consistency — do sections agree with each other? are terms used uniformly?
  • testability — can each requirement be verified with a concrete test?

Record specific gaps in spec_review.gaps and set score.spec to reflect how ready the specification is for implementation.

Ground specification-quality feedback in references/spec_review.md.


Step 2 - Business Logic Review

Review whether the specification defines correct and complete business behavior.

Evaluate

  • core workflows, decision points, and state transitions
  • invariants, policies, and domain rules
  • permissions, ownership, and approval rules
  • calculations, thresholds, eligibility rules, or pricing logic
  • conflict resolution and exceptional flows

Flag

  • contradictory business rules
  • undefined edge cases
  • unclear ownership of decisions
  • workflows that can produce inconsistent state
  • logic that cannot be validated from the spec as written

Ground business-rule feedback in references/business_logic_review.md.


Step 3 - Architecture Review

You are a senior architect.

Evaluate

  • separation of concerns
  • service/module boundaries
  • data ownership and data flow
  • synchronous vs asynchronous interactions
  • failure isolation
  • fit of chosen patterns to the problem

Look for

  • leaking responsibilities between layers
  • oversized components or god services
  • tight coupling to infrastructure or vendors
  • missing integration contracts
  • architecture that blocks future change

Ground architecture feedback in references/architecture_review.md, using clean_code.md and design_patterns.md as supporting material.


Step 4 - Performance and Scalability Review

Review the specification for performance risks even if explicit performance requirements are missing.

Evaluate

  • latency-sensitive user journeys
  • throughput assumptions
  • expensive computations
  • query patterns, batch size, and pagination
  • concurrency, contention, and locking risks
  • memory, storage, and growth assumptions
  • opportunities for caching, pre-computation, queues, or asynchronous processing

Flag

  • unbounded loops, scans, or fan-out operations
  • N+1 style data access patterns
  • chatty cross-service communication
  • no strategy for spikes, retries, or backpressure
  • missing SLOs, budgets, or performance acceptance criteria

Ground performance feedback in references/performance_review.md.
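
To make the "N+1 style data access" flag concrete, the sketch below contrasts the pattern with a single joined query using an in-memory SQLite database. The `orders` and `order_lines` tables and both function names are hypothetical; the query counter is maintained by hand purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_lines (order_id INTEGER, sku TEXT);
""")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(i, f"customer-{i}") for i in range(1, 51)])
conn.executemany("INSERT INTO order_lines VALUES (?, ?)",
                 [(i, f"sku-{i}") for i in range(1, 51)])


def lines_n_plus_one(conn):
    """One query for the orders, then one more query per order."""
    queries = 1
    order_ids = [row[0] for row in conn.execute("SELECT id FROM orders")]
    result = {}
    for order_id in order_ids:
        rows = conn.execute(
            "SELECT sku FROM order_lines WHERE order_id = ?", (order_id,)
        ).fetchall()
        queries += 1
        result[order_id] = [row[0] for row in rows]
    return result, queries


def lines_single_join(conn):
    """Same data fetched with a single joined query."""
    result = {}
    rows = conn.execute(
        "SELECT o.id, l.sku FROM orders o "
        "LEFT JOIN order_lines l ON l.order_id = o.id ORDER BY o.id"
    )
    for order_id, sku in rows:
        result.setdefault(order_id, [])
        if sku is not None:
            result[order_id].append(sku)
    return result, 1
```

Both functions return the same mapping, but the first issues 51 queries for 50 orders. A specification that describes "load each item, then load its details" without batching or pagination limits is a candidate for this flag.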


Step 5 - Security Review

You are a senior security reviewer.

Evaluate with OWASP Top 10 and abuse-case thinking

  • authentication and authorization boundaries
  • data classification and exposure risks
  • input validation and injection risks
  • cryptographic requirements
  • secrets handling
  • tenant isolation or data partitioning
  • logging, auditability, and incident response hooks
  • external service trust boundaries

Flag

  • privilege escalation paths
  • missing access control rules
  • insecure defaults
  • vague data retention or deletion rules
  • integrity failures in workflow approvals or callbacks
  • missing controls for misuse and abuse

Map findings to OWASP categories where relevant.

Ground security feedback in references/owasp_top10.md, including the review signals and abuse cases documented there.
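
When findings are mapped to OWASP categories, a small lookup helps catch mistyped ids. The sketch below uses the category names of the OWASP Top 10 2021 edition; which edition the skill's references/owasp_top10.md follows is not stated here, so the edition is an assumption. The entry shape (`id` plus `note`) and the function name `check_owasp_entries` are also assumptions.

```python
# OWASP Top 10 category ids and names, 2021 edition (edition is an assumption).
OWASP_2021 = {
    "A01": "Broken Access Control",
    "A02": "Cryptographic Failures",
    "A03": "Injection",
    "A04": "Insecure Design",
    "A05": "Security Misconfiguration",
    "A06": "Vulnerable and Outdated Components",
    "A07": "Identification and Authentication Failures",
    "A08": "Software and Data Integrity Failures",
    "A09": "Security Logging and Monitoring Failures",
    "A10": "Server-Side Request Forgery (SSRF)",
}


def check_owasp_entries(entries: list) -> list:
    """Report unknown category ids and entries without a finding note."""
    problems = []
    for index, entry in enumerate(entries):
        category = entry.get("id")
        if category not in OWASP_2021:
            problems.append(f"owasp[{index}]: unknown category id {category!r}")
        if not str(entry.get("note") or "").strip():
            problems.append(f"owasp[{index}]: missing finding note")
    return problems
```

For instance, an entry such as `{"id": "A01", "note": "No ownership check on document download"}` passes, while `{"id": "A11", "note": ""}` produces two problems.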


Step 6 - Testing Review

You are a senior QA engineer.

Evaluate Test Strategy

  • Is there a clear test pyramid?
    • unit > integration > e2e
  • Are business-critical paths covered?
  • Are non-functional concerns testable?
  • Can failures be reproduced deterministically?

Evaluate Test Quality

Apply best practices:

  • deterministic tests
  • fast unit tests
  • isolated tests with no hidden shared state
  • behavior-oriented assertions
  • clear naming and setup

Detect anti-patterns:

  • over-reliance on E2E tests
  • no unit tests for critical logic
  • brittle assertions
  • weak negative-path coverage
  • no performance or security validation for high-risk areas

Coverage Analysis

Estimate:

  • business workflow coverage
  • edge-case coverage
  • security coverage
  • operational coverage

Identify:

  • missing scenarios
  • high-risk untested paths
  • automation gaps

Ground testing feedback in references/testing_best_practices.md.
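
The pyramid check (unit > integration > e2e) can be approximated by counting entries in the generated test plan. This is only a crude proxy, since plan entries are not the same as implemented tests, and the helper name `pyramid_balance` is hypothetical.

```python
def pyramid_balance(test_plan: dict) -> dict:
    """Count planned tests per layer and check unit > integration > e2e."""
    layers = ("unit_tests", "integration_tests", "e2e_tests")
    counts = {layer: len(test_plan.get(layer) or []) for layer in layers}
    balanced = (
        counts["unit_tests"] > counts["integration_tests"] > counts["e2e_tests"]
    )
    return {"counts": counts, "balanced": balanced}
```

A plan with more E2E entries than unit entries returns `balanced: False`, which matches the "over-reliance on E2E tests" anti-pattern listed above.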


Step 7 - DevOps / CI / CD / Operability Review

Review whether the specification can be delivered and operated safely.

Evaluate

  • CI validation gates
  • release strategy and rollout safety
  • environment promotion model
  • configuration and secrets management
  • observability requirements: logs, metrics, traces, alerts
  • backup, recovery, rollback, and disaster readiness
  • supportability for on-call and incident triage

Flag

  • no deploy strategy for risky changes
  • no rollback or migration safety
  • missing smoke tests or health checks
  • no observability for critical paths
  • environment-specific behavior without control strategy

Ground DevOps and operability feedback in references/devops_ci_cd.md.


Step 8 - Dependency Review

Review external and internal dependencies as design risks.

Evaluate

  • critical libraries, services, and third-party platforms
  • versioning strategy
  • upgrade path and compatibility risk
  • lock-in or vendor dependency
  • package trust and supply-chain exposure
  • blast radius if a dependency degrades or disappears

Flag

  • transitive risk concentrated in one component
  • no isolation layer around critical providers
  • no fallback or degradation strategy
  • use of immature or unmaintained dependencies

Ground dependency feedback in references/dependency_review.md.
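
As an illustration of what an "isolation layer" with a "fallback or degradation strategy" might look like, the sketch below wraps a provider behind a small interface and serves the last known value when the provider fails. Everything here is hypothetical: the exchange-rate domain, the class names, and the choice of a last-known-value fallback are assumptions chosen to keep the example short.

```python
from typing import Dict, Protocol, Tuple


class ProviderError(Exception):
    """Raised by an adapter when the upstream provider fails."""


class RateProvider(Protocol):
    """The only surface the rest of the system is allowed to depend on."""

    def get_rate(self, currency: str) -> float: ...


class FlakyProvider:
    """Hypothetical stand-in for a third-party client; fails on demand."""

    def __init__(self) -> None:
        self.down = False

    def get_rate(self, currency: str) -> float:
        if self.down:
            raise ProviderError("upstream unavailable")
        return 1.25


class RatesWithFallback:
    """Isolation layer: callers never touch the provider directly."""

    def __init__(self, primary: RateProvider, last_known: Dict[str, float]) -> None:
        self._primary = primary
        self._last_known = dict(last_known)

    def get_rate(self, currency: str) -> Tuple[float, str]:
        """Return (rate, freshness), where freshness is 'live' or 'stale'."""
        try:
            rate = self._primary.get_rate(currency)
        except ProviderError:
            if currency in self._last_known:
                return self._last_known[currency], "stale"
            raise  # no degraded answer available, surface the failure
        self._last_known[currency] = rate
        return rate, "live"
```

Returning the freshness alongside the value keeps the degradation visible to callers, so the specification can state what the user sees when data is stale.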


Step 9 - Standards and Norms Review

Review alignment with explicit and implicit standards.

Evaluate

  • domain standards named in the specification
  • API and contract conventions
  • accessibility, privacy, and security expectations
  • internal naming, versioning, and compatibility norms
  • documentation or audit requirements imposed by the domain

Flag

  • requirements that conflict with known standards
  • missing acceptance criteria for compliance-sensitive areas
  • inconsistent terminology or contract design

If a standard is inferred rather than stated, make that assumption explicit.

Ground standards feedback in references/standards_and_norms.md.


Step 10 - UX Review

Review the user experience defined by the specification.

Evaluate

  • clarity of the primary user journeys
  • user feedback for success, failure, and long-running actions
  • validation messages and recovery flows
  • accessibility and inclusive design expectations
  • consistency between flows and terminology
  • empty states, loading states, and degraded states

Flag

  • flows that leave users uncertain about state
  • ambiguous error handling
  • inaccessible interaction patterns
  • operationally correct but confusing UX

Ground UX feedback in references/ux_review.md.


Step 11 - Documentation Review

Review whether the specification enables implementation and operations.

Evaluate

  • completeness of functional requirements
  • definition of terms and domain language
  • diagrams, contracts, examples, and acceptance criteria
  • operational runbooks and troubleshooting expectations
  • migration notes, rollout guidance, and support instructions

Flag

  • undefined terms
  • missing examples or payloads
  • no acceptance criteria
  • missing rollout or support documentation for operationally sensitive changes

Ground documentation feedback in references/documentation_review.md.


Step 12 - Code Quality, Maintainability, and Evolvability Review

Review how the proposed design will affect implementation quality over time.

Evaluate code quality risks

  • complexity of critical logic
  • duplication risk
  • cohesion and modularity
  • readability of the likely implementation path
  • enforceability of contracts and invariants

Evaluate maintainability and evolvability

  • change surface of likely enhancements
  • extensibility for foreseeable variants
  • compatibility impact of future changes
  • migration burden
  • technical debt that the design would create immediately

Flag

  • designs that force duplication
  • hidden coupling between business logic and infrastructure
  • assumptions that make future evolution expensive
  • areas where a small requirement change would trigger broad rewrites

Ground code-quality and maintainability feedback in references/code_quality_maintainability.md, with references/clean_code.md as supporting material.


Step 13 - Test Plan

Generate:

Unit Tests

  • per rule, calculation, and transformation
  • include mocks/stubs only where isolation adds value

Integration Tests

  • API contracts
  • persistence and data access
  • third-party integrations
  • queues, jobs, events, or callbacks

Contract Tests

  • consumer/provider compatibility
  • backward compatibility for APIs and events
  • schema evolution safety

E2E Tests

  • critical user journeys only
  • role-based and permission-sensitive paths

Performance Tests

  • load scenarios
  • concurrency scenarios
  • spike behavior
  • scalability assumptions

Security Tests

  • injection attempts
  • auth bypass
  • privilege escalation
  • data exposure

Operability Tests

  • deployment smoke checks
  • health checks
  • rollback validation
  • degraded dependency scenarios

Edge Cases

  • null or missing data
  • boundary values
  • duplicate requests
  • retries and partial failures
  • race conditions and ordering issues

Ground test plan generation in references/testing_best_practices.md.


Step 14 - Task Breakdown

Create actionable tasks grouped into epics.

Include tasks for:

  • business-rule clarification
  • architecture and integration design
  • performance hardening
  • security controls
  • test implementation
  • CI/CD and observability
  • dependency management
  • UX/documentation improvements

Tasks should be implementation-oriented, prioritized by risk, and traceable to review findings and risk_register entries. Use addresses to reference risk_register ids (e.g., risk-1), issue titles, or review section names where the need was identified.
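
The task structure lends itself to a consistency check: every item should name an existing epic, every depends_on entry should name an existing item, and the dependencies should not form a cycle. The sketch below uses the standard-library `graphlib` module (Python 3.9+) to derive one valid execution order. The function name `check_tasks` and the shape of the returned dictionary are assumptions.

```python
from graphlib import CycleError, TopologicalSorter


def check_tasks(tasks: dict) -> dict:
    """Validate epic and depends_on references and compute a task order."""
    epic_titles = {epic["title"] for epic in tasks.get("epics") or []}
    items = tasks.get("items") or []
    item_titles = {item["title"] for item in items}

    problems = []
    graph = {}
    for item in items:
        title = item["title"]
        if item.get("epic") not in epic_titles:
            problems.append(f"{title}: unknown epic {item.get('epic')!r}")
        deps = item.get("depends_on") or []
        for dep in deps:
            if dep not in item_titles:
                problems.append(f"{title}: unknown dependency {dep!r}")
        # Only known titles go into the graph; unknown ones are already reported.
        graph[title] = {dep for dep in deps if dep in item_titles}

    try:
        # Each node maps to its predecessors, so dependencies come first.
        order = list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        problems.append(f"dependency cycle: {exc.args[1]}")
        order = []
    return {"problems": problems, "order": order}
```

An empty `problems` list together with a non-empty `order` indicates that the breakdown is internally consistent. It says nothing about whether the priorities are right.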


Behavior Rules

  • be critical about missing information and contradictions
  • prioritize high-impact risks over cosmetic concerns
  • prefer precise, evidence-based findings over broad statements
  • call out assumptions explicitly
  • distinguish business-logic problems from implementation-detail problems
  • treat operability, security, and maintainability as first-class review dimensions
  • do not hide uncertainty — state what cannot be assessed from the current specification
  • populate the risk_register with every material risk surfaced during Steps 1–12; each entry must have a severity, likelihood, trigger, and mitigation
  • when a review dimension does not apply to the specification (e.g., UX for a pure backend library), set its review section fields to null or [], set its score to null, and add a brief note in the section explaining why the dimension was skipped
  • if the same problem is relevant to multiple review dimensions, file it as a single issue under the most specific category and cross-reference the affected dimensions in the description
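
The risk_register rule above (every entry has a severity, likelihood, trigger, and mitigation) is straightforward to verify after generation. The sketch below also ranks entries by severity multiplied by likelihood. That ranking scheme and its numeric weights are assumptions for illustration, as are the function names.

```python
# Numeric weights are illustrative assumptions, not part of the skill.
SEVERITY = {"low": 1, "medium": 2, "high": 3, "critical": 4}
LIKELIHOOD = {"low": 1, "medium": 2, "high": 3}
REQUIRED = ("severity", "likelihood", "trigger", "mitigation")


def incomplete_risks(register: list) -> dict:
    """Map risk id to the required fields that are missing or empty."""
    report = {}
    for risk in register:
        missing = [field for field in REQUIRED if not risk.get(field)]
        if missing:
            report[risk.get("id", "?")] = missing
    return report


def rank_risks(register: list) -> list:
    """Sort risks by severity x likelihood, highest exposure first.

    Assumes severity and likelihood hold the enumerated values
    from the output format.
    """
    return sorted(
        register,
        key=lambda risk: SEVERITY[risk["severity"]] * LIKELIHOOD[risk["likelihood"]],
        reverse=True,
    )
```

Running `incomplete_risks` before `rank_risks` avoids ranking entries whose severity or likelihood was never filled in.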

Install via CLI
npx skills add https://github.com/jyjeanne/ditacraft --skill ai-spec-review