name: kalibr
description: Ship agents that fix themselves. Kalibr learns what's working as your agents run in production and routes them around failures, degradations, and cost spikes before you know they're happening.
version: 0.2.0
metadata:
openclaw:
emoji: "๐ฆ"
requires:
env:
- KALIBR_API_KEY
- KALIBR_TENANT_ID
primaryEnv: KALIBR_API_KEY
homepage: https://kalibr.systems
Kalibr โ agents that fix themselves
Ship agents that fix themselves. Kalibr learns what's working as your agents run in production and routes them around failures, degradations, and cost spikes before you know they're happening.
You define candidate execution paths โ model + tools + parameters. Kalibr figures out which one wins for each task from real production telemetry. When a path degrades at 3am, your agents are already on the next best path. No alerts. No debugging. No you.
When to use this
User is building agents and asks about reliability, failover, or model selection
User's agent is failing because a model provider is degraded or down
User wants different tasks routed to different models automatically
User is hardcoding
model="gpt-4o"and wants something that adaptsUser asks about production monitoring, cost optimization, or agent routing
User asks how to stop agents from breaking when a provider has an outage
Install
pip install kalibr
Get credentials at https://dashboard.kalibr.systems/settings
export KALIBR_API_KEY="your-api-key"
export KALIBR_TENANT_ID="your-tenant-id"
Install via OpenClaw
openclaw plugins install @kalibr/openclaw
Quick start
from kalibr import Router
router = Router(
goal="extract-emails",
paths=[
{"model": "gpt-4o", "tools": ["web_search"]},
{"model": "claude-sonnet-4-20250514"},
{"model": "gemini-2.0-flash", "params": {"temperature": 0.2}},
]
)
response = router.completion(
messages=[{"role": "user", "content": "Extract emails from this page..."}]
)
# This is how Kalibr learns โ tell it what worked
router.report(success="@" in response.choices[0].message.content)
Kalibr routes the full execution path โ model + tools + parameters โ not just the model. After ~20 outcomes it knows what's winning. After 50 it's locked in and adapting.
Auto-reporting
Skip manual reporting โ define success inline:
router = Router(
goal="extract-emails",
paths=["gpt-4o", "claude-sonnet-4-20250514", "gemini-2.0-flash"],
success_when=lambda output: "@" in output,
)
# Kalibr reports outcomes automatically after every call
response = router.completion(messages=[...])
How it's different
OpenRouter / LiteLLM routing: Model proxy. Routes based on price, speed, availability. Doesn't know if the response was actually good for your task.
Fallback systems (LangChain ModelFallbackMiddleware): Reactive. Waits for a failure, then tries the next model. You already lost that request.
Kalibr: Learns from your actual production telemetry โ per task, per path. Routes to what's working before anything breaks. 10% canary traffic keeps testing alternatives so Kalibr catches degradation before your users do.
Works with
LangChain / LangGraph:
pip install langchain-kalibrโ drop-in ChatModelCrewAI: Pass
ChatKalibras any agent'sllmOpenAI Agents SDK: Drop-in replacement
Any Python code that calls an LLM
How it works
Kalibr captures telemetry on every agent run โ latency, success, cost, provider status. It uses Thompson Sampling to balance exploration (trying paths) vs. exploitation (using the best). 10% canary traffic keeps testing alternatives so Kalibr catches degradation before your users do.
Success rate always dominates. Kalibr never sacrifices quality for cost.