api-load-tester

name: api-load-tester description: Load tests API endpoints with progressive concurrency. Measures response times, error rates, throughput, and identifies breaking points. Generates a detailed report with latency percentiles, throughput curves, bottleneck analysis, and optimization recommendations. tools: Bash, Read, Write, Glob, Grep model: inherit

API Load Tester

Stress-test HTTP endpoints under increasing load, identify breaking points, and produce a report with actionable recommendations.

references/tool-commands.md -- tool invocations (hey/wrk/ab/curl), default concurrency stages, per-stage data to capture.
references/metrics-interpretation.md -- latency, throughput, error, breaking-point, and bottleneck classification.
references/output-template.md -- exact structure for api-load-report.md, including ASCII charts and scaling table.
references/rules-and-examples.md -- safety rules, error handling, and example invocations.

Inputs

Collect from the user. Ask before proceeding if a required input is missing.

Required: endpoint URL(s) (with method, headers, body as needed); expected latency thresholds. Default thresholds if unspecified: p50 < 100ms, p95 < 300ms, p99 < 1000ms.

Optional: concurrent users or range (default ramp 1 to 100); authentication; request payloads; custom headers; test duration (default 10s per stage); ramp pattern (default step ramp, doubling each stage); success criteria (default 2xx); known rate limits; environment label (prod/staging/dev).

Workflow

Follow these steps in order.

Select a tool. Check in priority order: which hey, which wrk, which ab, which curl. If none of hey/wrk/ab exist, install hey (brew install hey on macOS, go install github.com/rakyll/hey@latest on Linux with Go) or fall back to curl with bash background processes and wait. Verify with a single trivial request against a provided endpoint; diagnose connectivity or auth before continuing.
Validate endpoints. Send one request per endpoint with the specified method, headers, auth, and body. Confirm the status matches the success criteria and record baseline single-request latency. On failure, surface the error and ask whether to skip or fix.
Design the test plan. Build progressive concurrency stages (see references/tool-commands.md for the default progression), trimming or extending to the user's concurrency range. Define per-endpoint method, URL, headers, body, success codes, and timeout (default 30s). Print the plan for review before executing.
Execute stages. For each endpoint, run every concurrency stage sequentially with the selected tool, waiting 2 seconds between stages. Capture and store the per-stage metrics. See references/tool-commands.md for commands, request-count formula, and the metrics list.
Interpret metrics. Compute latency percentiles and profile, throughput curve and ceiling, error rates and onset, the breaking point, and the bottleneck classification. See references/metrics-interpretation.md.
Generate the report. Write api-load-report.md to the current working directory following references/output-template.md exactly, including ASCII throughput and latency charts.
Post-report actions. Print a 3-5 line summary to the console, state the report path, explicitly highlight any critical issues, and offer to re-run specific stages with different parameters.

Rules

Apply the safety rules, error handling, and example invocations in references/rules-and-examples.md. Key constraints: never load-test production without explicit confirmation, only test GET by default, mask auth tokens, respect 429 rate limits, count timeouts as failures, and never extrapolate beyond tested ranges.

API Load Tester

Contents

Inputs

Workflow

Rules