openrouter-generations

star 169

Retrieve detailed metadata and stored content for individual OpenRouter generations. Use when the user wants to inspect a specific request — its cost, latency, token usage, provider routing, or the actual prompt/completion text — or is debugging a failed or unexpected generation.

OpenRouterTeam By OpenRouterTeam schedule Updated 6/10/2026

name: openrouter-generations description: Retrieve detailed metadata and stored content for individual OpenRouter generations. Use when the user wants to inspect a specific request — its cost, latency, token usage, provider routing, or the actual prompt/completion text — or is debugging a failed or unexpected generation. version: 0.1.0

openrouter-generations

Retrieve detailed metadata and stored content for individual OpenRouter generations. Use this skill when you need to inspect a specific request — its cost, latency, token usage, provider routing, or the actual prompt/completion text.

Prerequisites

  • Any valid OpenRouter API key (regular or management key). Get one at openrouter.ai/settings/keys.
  • Pass it via --api-key <key> or set the OPENROUTER_API_KEY environment variable
  • Generation IDs look like gen-1234567890 or gen-aBcDeFgHiJkLmNoPqRsT.

First-Time Setup

cd <skill-path>/scripts && npm install

Endpoints

Endpoint Method Purpose
/api/v1/generation GET Request metadata and usage (tokens, cost, latency, model, provider)
/api/v1/generation/content GET Stored prompt and completion text

Both take a single query parameter: id (the generation ID).

Full API reference: openrouter.ai/docs/api/api-reference/generations/get-generation

Workflow

1. Get generation metadata

Retrieves everything about a generation except the actual prompt/completion text:

cd <skill-path>/scripts && npx tsx get-generation.ts gen-1234567890
npx tsx get-generation.ts --id gen-1234567890 --json

What you get back:

  • Model & routing: model, provider_name, router, service_tier
  • Tokens: tokens_prompt, tokens_completion, native_tokens_reasoning, native_tokens_cached
  • Cost: total_cost, usage, upstream_inference_cost, cache_discount
  • Performance: latency, generation_time, moderation_latency
  • Status: finish_reason, streamed, cancelled, is_byok
  • Context: created_at, app_id, external_user, session_id, request_id
  • Provider chain: provider_responses array showing fallback attempts with per-provider latency and status

2. Get generation content

Retrieves the stored prompt and completion:

cd <skill-path>/scripts && npx tsx get-generation-content.ts gen-1234567890
npx tsx get-generation-content.ts --id gen-1234567890 --json

What you get back:

  • Input: prompt (raw text) and/or messages (array of {role, content})
  • Output: completion (the model's response) and reasoning (chain-of-thought, if applicable)

Note: Content is only available if the generation was not made with Zero Data Retention (ZDR) enabled. If ZDR was on, this endpoint returns empty/null content.

Direct API Usage (curl)

Get metadata

curl -G https://openrouter.ai/api/v1/generation \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -d id=gen-1234567890

Get content

curl -G https://openrouter.ai/api/v1/generation/content \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -d id=gen-1234567890

Response Schemas

Metadata response (/api/v1/generation)

{
  "data": {
    "id": "gen-3bhGkxlo4XFrqiabUM7NDtwDzWwG",
    "api_type": "completions",
    "model": "openai/gpt-4o",
    "provider_name": "OpenAI",
    "created_at": "2024-07-15T23:33:19.433273+00:00",
    "tokens_prompt": 10,
    "tokens_completion": 25,
    "native_tokens_reasoning": 5,
    "native_tokens_cached": 3,
    "total_cost": 0.0015,
    "usage": 0.0015,
    "upstream_inference_cost": 0.0012,
    "latency": 1250,
    "generation_time": 1200,
    "finish_reason": "stop",
    "streamed": true,
    "is_byok": false,
    "cancelled": false,
    "router": "openrouter/auto",
    "service_tier": "priority",
    "provider_responses": [
      {
        "provider_name": "OpenAI",
        "model_permaslug": "openai/gpt-4o",
        "status": 200,
        "latency": 1200,
        "is_byok": false
      }
    ]
  }
}

Content response (/api/v1/generation/content)

{
  "data": {
    "input": {
      "prompt": "What is the meaning of life?",
      "messages": [
        {
          "content": "What is the meaning of life?",
          "role": "user"
        }
      ]
    },
    "output": {
      "completion": "The meaning of life is a philosophical question...",
      "reasoning": null
    }
  }
}

Common Use Cases

Debug a failed generation

# Check what happened — look at finish_reason, provider_responses, and cancelled
cd <skill-path>/scripts && npx tsx get-generation.ts gen-abc123 --json

Look for:

  • finish_reason = "length" means the model hit max tokens
  • finish_reason = "content_filter" means content was filtered
  • cancelled = true means the request was cancelled by the client
  • provider_responses with multiple entries means fallbacks occurred

Check cost of a specific request

cd <skill-path>/scripts && npx tsx get-generation.ts gen-abc123

Check total_cost (what you were charged) vs upstream_inference_cost (what the provider charged OpenRouter).

Review what was actually sent/received

cd <skill-path>/scripts && npx tsx get-generation-content.ts gen-abc123

Useful for debugging unexpected outputs — verify the actual prompt sent and completion received.

Trace a multi-generation session

If you have a request_id or session_id from one generation, you can find related generations via the analytics query endpoint (see openrouter-analytics skill).

Error Handling

Status Meaning
401 Invalid or missing API key
403 You don't have access to this generation (belongs to another user)
404 Generation ID not found
429 Rate limited — wait and retry
500 Server error — retry
502 Upstream failure — retry

Key Fields Reference

Metadata fields

Field Type Description
id string Generation ID (gen-...)
model string Model permaslug (e.g., openai/gpt-4o)
provider_name string|null Provider that served the request
api_type string One of: completions, embeddings, rerank, tts, stt, video
tokens_prompt int|null Prompt token count
tokens_completion int|null Completion token count
native_tokens_reasoning int|null Reasoning/thinking tokens
native_tokens_cached int|null Cached input tokens
total_cost number Total cost in USD
usage number Usage amount in USD
upstream_inference_cost number|null Provider's cost in USD
cache_discount number|null Discount from caching
latency number|null Total latency in ms
generation_time number|null Model generation time in ms
moderation_latency number|null Moderation check time in ms
finish_reason string|null Why generation stopped (stop, length, content_filter, etc.)
native_finish_reason string|null Raw finish reason from provider
streamed bool|null Whether response was streamed
is_byok bool Whether user's own provider key was used
cancelled bool|null Whether request was cancelled
app_id int|null OAuth app ID
external_user string|null External user identifier (X-External-User header)
session_id string|null Session grouping ID
request_id string|null Request grouping ID (all gens from one API call)
router string|null Router used (e.g., openrouter/auto)
service_tier string|null Provider service tier
web_search_engine string|null Search engine used (e.g., exa, firecrawl)
num_search_results int|null Number of search results included
provider_responses array|null Provider attempt chain with per-provider latency/status

Content fields

Field Type Description
data.input.prompt string|null Raw prompt text
data.input.messages array|null Messages array ([{role, content}])
data.output.completion string|null Model's completion text
data.output.reasoning string|null Chain-of-thought reasoning
Install via CLI
npx skills add https://github.com/OpenRouterTeam/skills --skill openrouter-generations
Repository Details
star Stars 169
call_split Forks 21
navigation Branch main
article Path SKILL.md
More from Creator
OpenRouterTeam
OpenRouterTeam Explore all skills →