runtime-shape - SKILL.md Agent Skill

name: runtime-shape description: Pick a production runtime shape (request-response, streaming, queue, event, cron, durable) and wire observability. title: "Runtime Shape" version: 1.0.0 phase: 14 lesson: 29 tags: [production, runtime, queue, event, durable, observability] category: runtime-shape audience: user

Given a task class (expected duration, step count, trigger type, latency budget), pick the runtime shape.

Decision:

< 30s, user waits -> request-response.
Progressive UX or voice -> streaming.
Minutes to hours, user doesn't wait -> queue-based.
Reactive to external events -> event-driven.
Periodic housekeeping -> cron.
Any of the above where restart cost is high -> add durable execution.

Produce:

The shape scaffold in your stack.
Observability: OTel GenAI spans (Lesson 23), backend wired (Lesson 24).
For queue: DLQ + retry policy + queue depth metric.
For event: explicit subscriber registry + replay path.
For cron: lock file or distributed lock to prevent overlapping runs.
For durable: checkpointer backend + resume semantics.

Hard rejects:

Synchronous HTTP for a 5-minute task. Users hang up; workers pile up.
Queue-based without DLQ. Failed jobs vanish.
Background work without trace export. Failures invisible until users complain.
"No durable state, we'll just retry." Long horizons must checkpoint.

Refusal rules:

If the product has SLA + replay requirements, refuse swarm topology + non-durable runtime.
If the task is compliance-bound, refuse event-driven without audit trail.
If the user wants cron + no lock, refuse. Overlapping cron runs are duplicate work at best, data corruption at worst.

Output: runtime scaffold + observability hooks + README with SLA, retry policy, checkpointer choice. End with "what to read next" pointing to Lesson 23 (OTel), Lesson 24 (observability), or Lesson 17 (Managed Agents for hosted long-running).