---
name: a7-plugin-ai-proxy
description: >-
  Skill for configuring the API7 Enterprise Edition ai-proxy plugin via the a7 CLI.
  Covers proxying requests to LLM providers (OpenAI, Azure OpenAI, DeepSeek,
  Anthropic, Gemini, Vertex AI, and more), authentication per provider,
  model configuration, streaming, logging, and route/service usage.
version: "1.0.0"
author: API7.ai Contributors
license: Apache-2.0
metadata:
  category: plugin
  apisix_version: ">=3.9.0"
  plugin_name: ai-proxy
  a7_commands:
    - a7 route create
    - a7 route update
    - a7 service create
    - a7 config sync
---
a7-plugin-ai-proxy
Overview
The ai-proxy plugin turns API7 Enterprise Edition (API7 EE) into an AI gateway. It proxies requests in
OpenAI-compatible format to LLM providers, handling authentication, endpoint
routing, and response streaming. Clients send a standard chat-completion
request; the plugin translates and forwards it to the configured provider.
When to Use
- Proxy chat-completion or embedding requests to any supported LLM provider
- Centralize API keys at the gateway instead of distributing them to clients
- Add observability (token counts, latency) to LLM calls
- Combine with ai-prompt-template, ai-prompt-decorator, or content moderation plugins for a full AI gateway pipeline
- Apply consistent AI proxy configurations directly on services or routes
Supported Providers
| Provider | Value | Default Endpoint |
|---|---|---|
| OpenAI | openai | https://api.openai.com/v1/chat/completions |
| DeepSeek | deepseek | https://api.deepseek.com/chat/completions |
| Azure OpenAI | azure-openai | Custom via override.endpoint |
| Anthropic | anthropic | https://api.anthropic.com/v1/chat/completions |
| AIMLAPI | aimlapi | https://api.aimlapi.com/v1/chat/completions |
| OpenRouter | openrouter | https://openrouter.ai/api/v1/chat/completions |
| Gemini | gemini | https://generativelanguage.googleapis.com/v1beta/openai/chat/completions |
| Vertex AI | vertex-ai | https://aiplatform.googleapis.com |
| OpenAI-Compatible | openai-compatible | Custom via override.endpoint |
Plugin Configuration Reference
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| provider | string | Yes | — | One of the 9 supported providers |
| auth | object | Yes | — | Authentication config (see below) |
| options | object | No | — | Model and generation parameters |
| options.model | string | No | — | Model name (provider-specific) |
| options.temperature | number | No | — | Sampling temperature |
| options.top_p | number | No | — | Nucleus sampling |
| options.max_tokens | integer | No | — | Maximum tokens to generate |
| options.stream | boolean | No | false | Enable SSE streaming |
| override | object | No | — | Override default endpoint |
| override.endpoint | string | No | — | Full URL for the provider API |
| provider_conf | object | No | — | Provider-specific config (Vertex AI) |
| provider_conf.project_id | string | No | — | GCP project ID (Vertex AI) |
| provider_conf.region | string | No | — | GCP region (Vertex AI) |
| logging | object | No | — | Logging options |
| logging.summaries | boolean | No | false | Log model, duration, tokens |
| logging.payloads | boolean | No | false | Log request/response bodies |
| timeout | integer | No | 30000 | Request timeout (ms) |
| keepalive | boolean | No | true | Keep connection alive |
| keepalive_timeout | integer | No | 60000 | Keepalive timeout (ms) |
| keepalive_pool | integer | No | 30 | Keepalive pool size |
| ssl_verify | boolean | No | true | Verify SSL certificate |
Authentication by Provider
OpenAI / DeepSeek / Anthropic / AIMLAPI / OpenRouter
{
  "auth": {
    "header": {
      "Authorization": "Bearer sk-your-api-key"
    }
  }
}
Azure OpenAI
{
  "auth": {
    "header": {
      "api-key": "your-azure-key"
    }
  },
  "override": {
    "endpoint": "https://YOUR-RESOURCE.openai.azure.com/openai/deployments/gpt-4/chat/completions?api-version=2024-02-15-preview"
  }
}
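The Azure endpoint above is built from three variable parts: the resource name, the deployment name, and the API version. A minimal sketch of composing it in the shell follows; the function name azure_endpoint is hypothetical and not part of the a7 CLI, and the URL layout is simply the one shown in the example above.

```shell
# Hypothetical helper (not part of the a7 CLI): compose the Azure OpenAI
# override.endpoint from resource name, deployment name, and api-version.
azure_endpoint() {
  resource="$1"
  deployment="$2"
  api_version="$3"
  printf 'https://%s.openai.azure.com/openai/deployments/%s/chat/completions?api-version=%s\n' \
    "$resource" "$deployment" "$api_version"
}

# Usage:
#   azure_endpoint YOUR-RESOURCE gpt-4 2024-02-15-preview
```

Generating the URL this way keeps the api-version query parameter from being dropped by accident, which is the cause of the Azure 404 listed under Troubleshooting.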
Vertex AI (GCP Service Account)
{
  "auth": {
    "gcp": {
      "service_account_json": "{ ... }",
      "max_ttl": 3600,
      "expire_early_secs": 60
    }
  },
  "provider_conf": {
    "project_id": "your-project-id",
    "region": "us-central1"
  }
}
Step-by-Step: Route to OpenAI
1. Create a route with ai-proxy
Runtime resources such as routes must be scoped to a gateway group with --gateway-group (or -g).
a7 route create -g default -f - <<'EOF'
{
  "id": "openai-chat",
  "uri": "/v1/chat/completions",
  "methods": ["POST"],
  "plugins": {
    "ai-proxy": {
      "provider": "openai",
      "auth": {
        "header": {
          "Authorization": "Bearer sk-your-openai-key"
        }
      },
      "options": {
        "model": "gpt-4",
        "temperature": 0.7,
        "max_tokens": 1024
      }
    }
  }
}
EOF
2. Send a request
curl http://127.0.0.1:9080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is 1+1?"}
    ]
  }'
Using Services
In API7 EE, configure ai-proxy directly on a service or route. Services are the preferred place for reusable upstream and plugin configuration.
a7 service create -g default -f - <<'EOF'
{
  "id": "standard-ai-proxy",
  "name": "Standard AI Proxy",
  "plugins": {
    "ai-proxy": {
      "provider": "openai",
      "auth": {
        "header": {
          "Authorization": "Bearer sk-global-key"
        }
      },
      "options": {
        "model": "gpt-4"
      }
    }
  }
}
EOF
Common Patterns
Streaming responses
{
  "plugins": {
    "ai-proxy": {
      "provider": "openai",
      "auth": {
        "header": {
          "Authorization": "Bearer sk-your-key"
        }
      },
      "options": {
        "model": "gpt-4",
        "stream": true
      }
    }
  }
}
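With streaming enabled, the client receives server-sent events rather than a single JSON body. The sketch below assumes the OpenAI-style event framing, where each event is a line starting with "data: " and the stream ends with "data: [DONE]"; the function name sse_payloads is hypothetical, and other providers may frame events differently.

```shell
# Hypothetical filter: print the JSON payload of every SSE "data:" line
# and stop at the OpenAI-style "[DONE]" sentinel.
sse_payloads() {
  # Strip carriage returns first, since SSE lines may end in CRLF.
  tr -d '\r' | sed -n -e '/^data: \[DONE\]$/q' -e 's/^data: //p'
}

# Usage against the route from the step-by-step section (-N disables curl's
# output buffering so chunks appear as they arrive):
#   curl -sN http://127.0.0.1:9080/v1/chat/completions \
#     -H "Content-Type: application/json" \
#     -d '{"messages": [{"role": "user", "content": "What is 1+1?"}]}' \
#     | sse_payloads
```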
Model Routing with Multiple Routes
The plugin does not natively route by model. Use separate routes with vars matching on request body fields:
# Route requests for gpt-4 to OpenAI
a7 route create -g default -f - <<'EOF'
{
  "id": "openai-gpt4",
  "uri": "/v1/chat/completions",
  "methods": ["POST"],
  "vars": [["post_arg.model", "==", "gpt-4"]],
  "plugins": {
    "ai-proxy": {
      "provider": "openai",
      "auth": { "header": { "Authorization": "Bearer sk-openai-key" } },
      "options": { "model": "gpt-4" }
    }
  }
}
EOF
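Because each model needs its own route, the route bodies differ only in id, provider, model, and key. One possible way to avoid copy-paste errors is a small generator like the sketch below; model_route_json is a hypothetical helper, the deepseek-chat model name in the usage comment is only an example, and the arguments are not JSON-escaped, so it is suitable only for simple values.

```shell
# Hypothetical helper: print a route body that matches a single model name.
# Arguments: route id, provider value, model name, API key.
# Note: values are inserted verbatim and are NOT JSON-escaped.
model_route_json() {
  cat <<EOF
{
  "id": "$1",
  "uri": "/v1/chat/completions",
  "methods": ["POST"],
  "vars": [["post_arg.model", "==", "$3"]],
  "plugins": {
    "ai-proxy": {
      "provider": "$2",
      "auth": { "header": { "Authorization": "Bearer $4" } },
      "options": { "model": "$3" }
    }
  }
}
EOF
}

# Usage (pipes the generated body to the same command used above):
#   model_route_json deepseek-chat deepseek deepseek-chat "$DEEPSEEK_API_KEY" \
#     | a7 route create -g default -f -
```

All generated routes share the same URI and differ only in their vars condition, so a request is matched by the model field in its body.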
Access Log Variables
| Variable | Description |
|---|---|
| $request_type | traditional_http, ai_chat, or ai_stream |
| $llm_time_to_first_token | Time to first token (ms) |
| $llm_model | Actual model used by provider |
| $request_llm_model | Model requested by client |
| $llm_prompt_tokens | Prompt token count |
| $llm_completion_tokens | Completion token count |
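Once these variables are written to the access log, token usage can be aggregated with standard tools. The sketch below assumes a purely hypothetical log format in which each variable appears as a space-separated key=value pair (for example llm_model=gpt-4 llm_prompt_tokens=12); the function name token_totals is also hypothetical, and the parsing needs adjusting to whatever format your gateway actually writes.

```shell
# Hypothetical aggregator: sum prompt and completion tokens per model.
# ASSUMPTION: log lines contain space-separated key=value pairs named after
# the variables above (without the leading "$").
token_totals() {
  awk '{
    model = ""; p = 0; c = 0
    for (i = 1; i <= NF; i++) {
      split($i, kv, "=")
      if (kv[1] == "llm_model") model = kv[2]
      else if (kv[1] == "llm_prompt_tokens") p = kv[2]
      else if (kv[1] == "llm_completion_tokens") c = kv[2]
    }
    # Lines without llm_model (e.g. traditional_http requests) are skipped.
    if (model != "") { prompt[model] += p; completion[model] += c }
  }
  END {
    for (m in prompt)
      printf "%s prompt=%d completion=%d\n", m, prompt[m], completion[m]
  }'
}

# Usage:
#   token_totals < access.log
```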
Config Sync Example
Config sync is scoped by gateway group:
a7 config sync -f config.yaml --gateway-group default
version: "1"
routes:
  - id: openai-chat
    uri: /v1/chat/completions
    methods:
      - POST
    plugins:
      ai-proxy:
        provider: openai
        auth:
          header:
            Authorization: Bearer sk-your-openai-key
        options:
          model: gpt-4
          max_tokens: 1024
          temperature: 0.7
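To keep the provider key out of version control, the file can be rendered at deploy time instead of stored with a literal key. The following is a minimal sketch under that assumption; render_config and the OPENAI_API_KEY environment variable are hypothetical names, and the YAML is the same configuration shown above.

```shell
# Hypothetical helper: render the config-sync file with the API key taken
# from the OPENAI_API_KEY environment variable instead of a literal value.
render_config() {
  # Abort with a message if the variable is unset or empty.
  : "${OPENAI_API_KEY:?set OPENAI_API_KEY first}"
  cat <<EOF
version: "1"
routes:
  - id: openai-chat
    uri: /v1/chat/completions
    methods:
      - POST
    plugins:
      ai-proxy:
        provider: openai
        auth:
          header:
            Authorization: Bearer ${OPENAI_API_KEY}
        options:
          model: gpt-4
          max_tokens: 1024
          temperature: 0.7
EOF
}

# Usage:
#   render_config > config.yaml
#   a7 config sync -f config.yaml --gateway-group default
```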
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| 502 Bad Gateway | Wrong endpoint or provider value | Verify provider matches; check override.endpoint |
| 401 from upstream | Invalid API key | Check auth.header value |
| 404 Not Found | Missing --gateway-group | Ensure all runtime commands include -g <group> |
| Azure 404 | Missing api-version in URL | Include ?api-version=YYYY-MM-DD-preview in override.endpoint |
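The table can be turned into a quick first-pass check. The sketch below maps an HTTP status code to the fix suggested above; ai_proxy_hint is a hypothetical helper, and it only restates the table, so it is a starting point rather than a diagnosis.

```shell
# Hypothetical helper: map a status code to the first check suggested in the
# troubleshooting table.
ai_proxy_hint() {
  case "$1" in
    502) echo "Verify the provider value and override.endpoint" ;;
    401) echo "Check the auth.header value" ;;
    404) echo "Include -g <group> in a7 commands; for Azure, add api-version to override.endpoint" ;;
    *)   echo "No hint for status $1" ;;
  esac
}

# Usage: capture only the status code of a test request, then look it up.
#   status="$(curl -s -o /dev/null -w '%{http_code}' \
#     http://127.0.0.1:9080/v1/chat/completions \
#     -H 'Content-Type: application/json' \
#     -d '{"messages": [{"role": "user", "content": "ping"}]}')"
#   ai_proxy_hint "$status"
```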