ai-provider-debug

Debug and troubleshoot AI/LLM provider integrations in ettametta. Use when investigating API failures, rate limits, fallback chain issues, model selection problems, or cost anomalies across 17+ providers.

By psalmprax. Updated 6/2/2026

AI Provider Debugging

Quick Diagnostics

curl http://localhost:8000/api/v1/llm/providers
curl http://localhost:8000/api/v1/llm/models
curl -X POST http://localhost:8000/api/v1/llm/reset-circuits
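
The two read-only checks above can also be scripted when several environments need probing. This is a minimal sketch, assuming the endpoints return JSON (the source does not show their response format); `fetch_json` and `probe` are hypothetical helper names, not part of ettametta. The circuit reset is deliberately left out because it changes server state.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8000/api/v1/llm"  # same host and prefix as the curl commands


def fetch_json(url, timeout=5.0):
    """GET a URL and decode the body as JSON (assumes the API returns JSON)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def probe(paths=("providers", "models"), fetch=fetch_json):
    """Call each read-only endpoint and collect either the payload or the error."""
    results = {}
    for path in paths:
        url = f"{BASE_URL}/{path}"
        try:
            results[path] = {"ok": True, "data": fetch(url)}
        except (urllib.error.URLError, ValueError, OSError) as exc:
            # URLError covers connection problems, ValueError covers bad JSON.
            results[path] = {"ok": False, "error": str(exc)}
    return results
```

Calling `probe()` with the API running returns one entry per endpoint; an entry with `"ok": False` carries the error text, which is usually enough to tell a connection problem from a malformed response.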

Provider Registry

| Provider | SDK | Primary Use |
| --- | --- | --- |
| Groq | groq SDK | Default LLM, fast inference |
| OpenAI | openai SDK | High-reasoning (GPT-4o) |
| Anthropic | anthropic SDK | Premium reasoning (Claude 3.5) |
| Google Gemini | google.genai | Cost-effective medium tasks |
| Ollama | OpenAI-compat | Zero-cost local inference |
| xAI (Grok) | openai (custom URL) | Grok-2 reasoning |
| DeepSeek | openai (custom URL) | DeepSeek-Chat/Coder |
| Mistral | openai (custom URL) | Mistral-Large |
| Cohere | cohere SDK | Command-R-Plus |
| Cerebras | openai (custom URL) | Llama-3.3-70b (30 RPM free) |
| Cloudflare | REST dict | Llama-3.1-70b |
| Hugging Face | REST dict | Llama-3.3-70B |
| OpenRouter | openai (custom URL) | Multi-model gateway |
| NVIDIA NIM | openai (custom URL) | Llama-3.3-70b |
| SiliconFlow | openai (custom URL) | Qwen2.5-72B |
| Ollama Cloud | openai (custom URL) | Qwen2.5:72b |
| Dify | Custom client | Orchestrator/workflow |
| vLLM | REST (OpenAI-compat) | High-throughput inference |

Three Orchestration Layers

1. IntelligenceHub (src/services/llm/intelligence_hub.py)

  • Complexity routing: low -> ollama, medium -> gemini/groq, high -> openai
  • Circuit breaker per engine (5 failures -> open, 60s recovery)
  • Auto-heal: a rate-limited engine is restored after 10 min; an engine with 3+ errors is marked degraded
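
When debugging why an engine is being skipped, it helps to have the breaker's state machine in mind. The following is a minimal sketch using the thresholds listed above (5 failures, 60 s recovery). It is not the code in src/api/utils/resilience.py; the method names, the choice to count consecutive failures, and the injectable `clock` are all assumptions made for illustration.

```python
import time


class CircuitBreaker:
    """Per-engine breaker: opens after N consecutive failures, retries after a cooldown."""

    def __init__(self, failure_threshold=5, recovery_seconds=60.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_seconds = recovery_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def allow_request(self):
        """Return True if a call may be attempted right now."""
        if self._opened_at is None:
            return True
        # Open: allow a trial call only once the recovery window has elapsed.
        return self._clock() - self._opened_at >= self.recovery_seconds

    def record_success(self):
        """A successful call closes the circuit and clears the failure count."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self):
        """Count a failure; open (or re-open) the circuit at the threshold."""
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

Under this model, a failed trial call after the recovery window re-opens the circuit for another full window, which is one way every engine can appear to be failing at once.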

2. UnifiedLLMService (src/services/llm/service.py)

  • 7 providers; tries the requested provider first, then iterates through all of them
  • tenacity retries: 3 attempts, exponential backoff 1-10s
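
To estimate how long a failing call can block before it falls through to the next provider, the retry schedule can be reproduced with the standard library. This sketch does not use tenacity; it is a stand-in that mimics "3 attempts, exponential backoff 1-10s". The exact wait parameters in service.py are not shown in the source, so the doubling factor is an assumption, and `backoff_delays` and `call_with_retries` are hypothetical names.

```python
import time


def backoff_delays(attempts=3, base=1.0, cap=10.0):
    """Delays slept between attempts: base * 2**n, clamped to [base, cap]."""
    return [min(cap, base * (2 ** n)) for n in range(attempts - 1)]


def call_with_retries(func, attempts=3, base=1.0, cap=10.0, sleep=time.sleep):
    """Call func(); on an exception, wait and retry, re-raising after the last attempt."""
    delays = backoff_delays(attempts, base, cap)
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # retries exhausted: surface the final error
            sleep(delays[attempt])
```

With these assumed settings, three attempts sleep for 1 s and then 2 s, so a provider that always fails adds about 3 s of waiting, plus request time, before the next provider is tried.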

3. BaseEttamettaAgent (src/services/base_agent.py)

  • 17 providers, fixed fallback: primary -> ollama -> xai -> deepseek -> cerebras -> groq -> openai -> openrouter -> mistral -> siliconflow -> nvidia -> gemini -> anthropic
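
The fixed fallback order above can be sketched as a simple loop, which is handy for predicting which provider answers when the primary is down. This is an illustration and not the code in base_agent.py. `build_chain`, `first_success`, and the `call_provider` callback are hypothetical, and skipping the primary when it reappears in the fixed list is an assumption.

```python
FALLBACK_ORDER = [
    "ollama", "xai", "deepseek", "cerebras", "groq", "openai",
    "openrouter", "mistral", "siliconflow", "nvidia", "gemini", "anthropic",
]


def build_chain(primary):
    """Primary first, then the fixed order with the primary de-duplicated."""
    return [primary] + [name for name in FALLBACK_ORDER if name != primary]


def first_success(primary, call_provider):
    """Try each provider in order; return (provider, result) for the first that works."""
    errors = {}
    for name in build_chain(primary):
        try:
            return name, call_provider(name)
        except Exception as exc:  # a real agent would likely catch narrower errors
            errors[name] = exc
    raise RuntimeError(f"all providers failed: {sorted(errors)}")
```

Because the order is fixed, a response that unexpectedly comes from a late entry such as gemini or anthropic suggests that every earlier provider in the chain failed or was skipped.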

Fallback Chains

  • VLM (Vision): Groq Vision -> Local Moondream2 -> Gemini 1.5 Flash -> heuristic
  • CrewAI: Groq (llama-3.3-70b-versatile) -> OpenAI (gpt-4o-mini) -> Ollama
  • Ollama self-failover: primary URL -> localhost:11434

Key Files

| File | Purpose |
| --- | --- |
| src/services/llm/service.py | UnifiedLLMService (7 providers) |
| src/services/llm/intelligence_hub.py | IntelligenceHub (complexity routing) |
| src/services/llm/dify_client.py | DifyClient with retries |
| src/services/base_agent.py | BaseEttamettaAgent (17 providers) |
| src/services/openclaw/agent.py | OpenClawAgent (17+ providers) |
| src/services/video_engine/vlm_service.py | VLMService (vision chain) |
| src/api/utils/vault.py | 3-tier secret resolution |
| src/api/utils/llm_vault.py | LLM vault (17 providers) |
| src/api/utils/resilience.py | CircuitBreaker |
| src/api/routes/llm.py | LLM API endpoints |

Common Issues

All providers failing — circuit breakers open

curl -X POST http://localhost:8000/api/v1/llm/reset-circuits

Rate limited — auto-heal not working

IntelligenceHub auto-heals a rate-limited engine after 10 min. If an engine stays stuck beyond that, reset the circuits and check the logs for HTTP 429 responses.

Placeholder key rejected

IntelligenceHub rejects any key containing "your_", "placeholder", or "CHANGE_ME". Replace such values with real keys.
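
A quick way to check a key before blaming the provider is to run the same kind of substring test locally. This is a minimal sketch of such a check, not IntelligenceHub's actual code. The function name is hypothetical, and treating empty values as placeholders and matching case-insensitively are both assumptions.

```python
PLACEHOLDER_MARKERS = ("your_", "placeholder", "CHANGE_ME")


def looks_like_placeholder(key):
    """True if the key is empty or contains one of the placeholder markers."""
    if not key or not key.strip():
        return True
    lowered = key.lower()
    # Case-insensitive matching is an assumption; the real check may be case-sensitive.
    return any(marker.lower() in lowered for marker in PLACEHOLDER_MARKERS)
```

For example, `looks_like_placeholder("your_api_key_here")` returns `True`, which is the usual result of copying a sample .env file without editing it.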

Ollama not responding

docker compose ps ollama
curl http://localhost:11434/api/tags
docker compose exec ollama ollama list

API key priority

vault.py resolves each key in three tiers, in priority order: UserSetting DB -> SystemSettings DB -> .env. A value set at an earlier tier overrides the later ones.
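
When a key in .env seems to be ignored, the usual cause is a value at a higher-priority tier. The lookup order can be sketched as follows, with plain dictionaries standing in for the two database tiers. This is an illustration under stated assumptions, not the code in vault.py: `resolve_api_key` is a hypothetical name, and skipping empty values is assumed behavior.

```python
import os


def resolve_api_key(name, user_settings, system_settings, env=None):
    """Return (value, tier) from the first tier that holds a non-empty value."""
    env = os.environ if env is None else env
    tiers = (
        ("user", user_settings),      # stand-in for the UserSetting DB lookup
        ("system", system_settings),  # stand-in for the SystemSettings DB lookup
        ("env", env),                 # stand-in for values loaded from .env
    )
    for tier, source in tiers:
        value = source.get(name)
        if value:
            return value, tier
    return None, None
```

Returning the tier alongside the value makes it easy to log where a key came from, which helps spot a stale user-level or system-level key that shadows a correct .env entry.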

Rate Limits

| Tier | Limit |
| --- | --- |
| FREE | 100/hr |
| PRO/PREMIUM/BASIC | 500/hr |
| SOVEREIGN/STUDIO | 5000/hr |
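
To judge whether a 429 response is expected for a given account, the limits above can be written as a lookup. This is a minimal sketch. The source does not say how ettametta counts or windows requests, so `remaining_requests` is a hypothetical helper that only compares a usage count against the hourly limit.

```python
HOURLY_LIMITS = {
    "FREE": 100,
    "BASIC": 500, "PRO": 500, "PREMIUM": 500,
    "SOVEREIGN": 5000, "STUDIO": 5000,
}


def remaining_requests(tier, used_this_hour):
    """Requests left in the current hour for a tier (never negative)."""
    limit = HOURLY_LIMITS[tier.upper()]
    return max(0, limit - used_this_hour)
```

A result of 0 means further requests in that hour should be expected to fail with a rate-limit error rather than a provider error.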