model-routing - SKILL.md Agent Skill

name: model-routing
description: >
  Guide for configuring and debugging the model routing layer in Superagent.
  TRIGGER when: adding a new model provider, configuring fallback chains,
  tuning cost/latency routing strategy, debugging "model not found" errors,
  or when the user asks "how does model routing work", "模型路由怎么配置"
  ("how do I configure model routing"), "add a new LLM provider".
  DO NOT TRIGGER when: writing agent YAML (use agent-yaml-authoring),
  implementing tool logic, or working on the UI.
origin: learned
tags: [model, routing, llm, provider, fallback, cost, latency]

Model Routing

Source: backend/pkg/modelrouter/ and configs/models/routing-rules.yaml.

Routing Strategies

Strategy	When to use
`capability`	Route by required feature (vision, function_calling, long_context)
`cost`	Pick the cheapest model that has the capabilities the task requires
`latency`	Pick the fastest model (lowest streaming first-token latency)
`fallback`	Try the primary model; on error or timeout, move to the next model in the chain
`round_robin`	Distribute requests evenly across providers
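To make the table concrete, here is a minimal Python sketch of how the `capability`, `cost`, and `latency` strategies might choose a model. It is an illustration only, not the code in backend/pkg/modelrouter/: the `Model` class and `pick_model` function are hypothetical names, the catalog values are copied from the routing-rules.yaml example later in this guide, and the handling of a missing latency figure is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    id: str
    capabilities: frozenset
    cost_per_1k_tokens: float
    avg_latency_ms: Optional[int] = None  # None when no latency figure is configured


# Values copied from the routing-rules.yaml example in this guide.
CATALOG = [
    Model("gpt-4o", frozenset({"vision", "function_calling"}), 0.005, 800),
    Model("claude-sonnet-4-6", frozenset({"function_calling", "long_context"}), 0.003),
]


def pick_model(strategy, required=(), catalog=CATALOG):
    """Return the id of the model chosen by `strategy`, or raise LookupError."""
    needed = set(required)
    # Every strategy first narrows the catalog to models with the required capabilities.
    capable = [m for m in catalog if needed <= m.capabilities]
    if strategy == "latency":
        # Assumption: a model without a latency figure is not eligible for this strategy.
        capable = [m for m in capable if m.avg_latency_ms is not None]
    if not capable:
        raise LookupError(f"no model satisfies {sorted(needed)} for strategy {strategy!r}")
    if strategy == "cost":
        return min(capable, key=lambda m: m.cost_per_1k_tokens).id
    if strategy == "latency":
        return min(capable, key=lambda m: m.avg_latency_ms).id
    if strategy == "capability":
        return capable[0].id  # assumption: first declared model that qualifies
    raise ValueError(f"strategy not covered by this sketch: {strategy!r}")
```

With this catalog, `pick_model("cost", ["function_calling"])` returns `"claude-sonnet-4-6"`, because both models qualify and it has the lower `cost_per_1k_tokens`. The `fallback` and `round_robin` strategies are left out because they depend on request history rather than on a single comparison.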

routing-rules.yaml Structure

strategies:
  default: fallback

providers:
  openai:
    base_url: https://api.openai.com/v1
    api_key: ${OPENAI_API_KEY}
    models:
      - id: gpt-4o
        capabilities: [vision, function_calling]
        cost_per_1k_tokens: 0.005
        avg_latency_ms: 800

  anthropic:
    base_url: https://api.anthropic.com
    api_key: ${ANTHROPIC_API_KEY}
    models:
      - id: claude-sonnet-4-6
        capabilities: [function_calling, long_context]
        cost_per_1k_tokens: 0.003

fallback_chains:
  default:
    - gpt-4o
    - claude-sonnet-4-6
    - gpt-4o-mini          # cheap fallback
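A quick consistency check can catch one likely source of "model not found" errors: a fallback chain that names a model no provider declares. The sketch below is a hypothetical helper, not part of Superagent, and whether the router itself rejects such a chain is not stated in this guide. It operates on the dict a YAML loader would produce from routing-rules.yaml.

```python
def undeclared_chain_models(config):
    """Return {chain_name: [model ids not declared under any provider]}."""
    declared = {
        model["id"]
        for provider in config.get("providers", {}).values()
        for model in provider.get("models", [])
    }
    problems = {}
    for chain_name, chain in config.get("fallback_chains", {}).items():
        missing = [model_id for model_id in chain if model_id not in declared]
        if missing:
            problems[chain_name] = missing
    return problems


# The example above as a plain dict, trimmed to the fields this check reads.
config = {
    "providers": {
        "openai": {"models": [{"id": "gpt-4o"}]},
        "anthropic": {"models": [{"id": "claude-sonnet-4-6"}]},
    },
    "fallback_chains": {"default": ["gpt-4o", "claude-sonnet-4-6", "gpt-4o-mini"]},
}

print(undeclared_chain_models(config))  # prints {'default': ['gpt-4o-mini']}
```

In the example config, `gpt-4o-mini` appears in the default chain but not under `providers.openai.models`, so the check flags it.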

Agent-Level Override

spec:
  # Either pin an explicit model:
  model: gpt-4o
  # or omit `model` and let the router pick the cheapest model
  # that has every capability listed below:
  model_strategy: cost
  model_capabilities:
    - vision
    - function_calling
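The two forms of the override can be read as a small decision: an explicit model, or a strategy plus required capabilities. The Python sketch below shows one way to interpret a `spec` mapping. The function name `resolve_request` is hypothetical, and the precedence it applies (an explicit `model` wins over `model_strategy`) is an assumption that this guide does not confirm.

```python
def resolve_request(spec, default_strategy="fallback"):
    """Turn an agent `spec` mapping into a routing request."""
    if spec.get("model"):
        # Assumed precedence: an explicit model pins the choice and
        # any strategy fields in the same spec are ignored.
        return {"mode": "explicit", "model": spec["model"]}
    return {
        "mode": "strategy",
        # Fall back to the config-wide default (strategies.default) when unset.
        "strategy": spec.get("model_strategy", default_strategy),
        "capabilities": list(spec.get("model_capabilities", [])),
    }
```

For example, `resolve_request({"model_strategy": "cost", "model_capabilities": ["vision"]})` yields a strategy request for `cost` with `["vision"]` as the required capabilities, while `resolve_request({"model": "gpt-4o"})` yields an explicit request for `gpt-4o`.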

Fallback Behavior

The primary model is called first; on an error or timeout, the router moves to the next model in the chain.
If every model in the chain fails, the last error is returned to the caller.
The per-model timeout is configured at the provider level (timeout_ms).
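The behavior described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the router's actual code: `call_with_fallback` and `call_model` are hypothetical names, `call_model` is assumed to raise on any error or timeout, the 30000 ms default is a placeholder, and the provider-level `timeout_ms` is flattened into a per-model map for brevity.

```python
def call_with_fallback(chain, call_model, timeouts_ms, default_timeout_ms=30_000):
    """Try each model id in `chain` in order.

    Returns (model_id, result) from the first model that succeeds.
    If every model fails, re-raises the last error seen.
    """
    if not chain:
        raise ValueError("fallback chain is empty")
    last_error = None
    for model_id in chain:
        timeout_ms = timeouts_ms.get(model_id, default_timeout_ms)
        try:
            return model_id, call_model(model_id, timeout_ms)
        except Exception as exc:  # errors and timeouts both trigger the next model
            last_error = exc
    raise last_error


# A stand-in provider call: the primary times out, the second model answers.
def fake_call(model_id, timeout_ms):
    if model_id == "gpt-4o":
        raise TimeoutError(f"{model_id} exceeded {timeout_ms} ms")
    return f"reply from {model_id}"


print(call_with_fallback(["gpt-4o", "claude-sonnet-4-6"], fake_call, {"gpt-4o": 5000}))
# prints ('claude-sonnet-4-6', 'reply from claude-sonnet-4-6')
```

Returning the model id alongside the result makes it easy to log which link in the chain actually served the request.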

Adding a New Provider

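Add a provider block to routing-rules.yaml with base_url, api_key, and a models list.
Ensure each model ID matches exactly what the provider's API expects.
Restart the server, or rely on hot reload (the router watches the config file).
Test, replacing <host>:<port> with the address of your running server: curl -X POST "http://<host>:<port>/api/v1/chat" -H "Content-Type: application/json" -d '{"model":"new-model","messages":[...]}'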

Debugging

# Check which model was selected for a request (log level DEBUG)
APP_LOG_LEVEL=debug make dev-server

# Grep for routing decisions
grep "model_router" logs/app.log