---
name: a6-plugin-ai-content-moderation
description: >-
  Skill for configuring APISIX AI content moderation plugins via the a6 CLI.
  Covers both ai-aws-content-moderation (AWS Comprehend, request-only) and
  ai-aliyun-content-moderation (Aliyun, request + response with streaming),
  toxicity thresholds, category filtering, and integration with ai-proxy.
version: "1.0.0"
author: Apache APISIX Contributors
license: Apache-2.0
metadata:
  category: plugin
  apisix_version: ">=3.9.0"
  plugin_name: ai-aws-content-moderation
  related_plugins:
    - ai-aliyun-content-moderation
  a6_commands:
    - a6 route create
    - a6 route update
    - a6 config sync
---
a6-plugin-ai-content-moderation
Overview
APISIX provides two content moderation plugins that filter harmful content in LLM requests and responses:
| Plugin | Provider | Request | Response | Streaming |
|---|---|---|---|---|
| ai-aws-content-moderation | AWS Comprehend | ✅ | ❌ | ❌ |
| ai-aliyun-content-moderation | Aliyun Moderation Plus | ✅ | ✅ | ✅ |
Both must be used alongside ai-proxy or ai-proxy-multi.
When to Use
- Block toxic, hateful, or sexual content before it reaches the LLM
- Filter harmful LLM responses before they reach clients (Aliyun only)
- Enforce content policies with configurable thresholds
- Comply with content safety regulations
Plugin Execution Order
ai-prompt-template (priority 1071)
ai-prompt-decorator (priority 1070)
ai-aws-content-moderation (priority 1050) ← runs BEFORE ai-proxy
ai-proxy (priority 1040)
ai-aliyun-content-moderation (priority 1029) ← runs AFTER ai-proxy
The AWS plugin blocks requests before they reach the LLM. The Aliyun plugin
runs after ai-proxy sets context and can check both requests and responses.
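
The ordering above can be reproduced with a small sketch. It assumes only that plugins with a higher priority number run earlier, and it uses the priority values listed in this section; `PRIORITIES` and `execution_order` are hypothetical names introduced for illustration, not part of APISIX or a6.

```python
# Priorities copied from the list above; a higher number is assumed to run earlier.
PRIORITIES = {
    "ai-proxy": 1040,
    "ai-aliyun-content-moderation": 1029,
    "ai-prompt-template": 1071,
    "ai-aws-content-moderation": 1050,
    "ai-prompt-decorator": 1070,
}


def execution_order(plugins):
    """Return plugin names sorted from highest to lowest priority."""
    return sorted(plugins, key=lambda name: PRIORITIES[name], reverse=True)


if __name__ == "__main__":
    for name in execution_order(PRIORITIES):
        print(f"{PRIORITIES[name]:>5}  {name}")
```

Sorting only the three plugins used in this document places the AWS plugin first, ai-proxy second, and the Aliyun plugin last, which matches the arrows in the list.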
Plugin 1: ai-aws-content-moderation
Uses the AWS Comprehend detectToxicContent API to score request content.
Configuration Reference
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| comprehend.access_key_id | string | Yes | — | AWS access key ID |
| comprehend.secret_access_key | string | Yes | — | AWS secret access key |
| comprehend.region | string | Yes | — | AWS region (e.g. us-east-1) |
| comprehend.endpoint | string | No | Auto | Custom Comprehend endpoint |
| comprehend.ssl_verify | boolean | No | true | Verify SSL certificate |
| moderation_categories | object | No | — | Per-category thresholds (0-1) |
| moderation_threshold | number | No | 0.5 | Overall toxicity threshold (0-1) |
Moderation Categories
| Category | Description |
|---|---|
| PROFANITY | Profane language |
| HATE_SPEECH | Hateful content |
| INSULT | Insulting language |
| HARASSMENT_OR_ABUSE | Harassment or abusive content |
| SEXUAL | Sexual content |
| VIOLENCE_OR_THREAT | Violent or threatening content |
Each category accepts a score threshold from 0 (strictest, blocks nearly
everything) to 1 (most permissive). If moderation_categories is set,
each category is checked individually. Otherwise, the moderation_threshold
is used as an overall toxicity check.
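
The decision rule described above might be sketched as follows. This is an illustration of the paragraph, not the plugin's source: the function name `moderate`, the strict `>` comparison, and the wording of the overall-threshold message are assumptions; only the per-category message format appears in this document.

```python
def moderate(scores, toxicity, categories=None, threshold=0.5):
    """Return a rejection reason, or None if the request passes.

    scores:     dict mapping category name to its score (0-1)
    toxicity:   overall toxicity score (0-1)
    categories: optional dict of per-category thresholds
    threshold:  overall threshold, used only when categories is not set
    """
    if categories:
        # Per-category mode: each configured category is checked individually.
        for name, limit in categories.items():
            if scores.get(name, 0.0) > limit:  # strict ">" is an assumption
                return f"request body exceeds {name} threshold"
        return None
    # Overall mode: a single toxicity check (message text is an assumption).
    if toxicity > threshold:
        return "request body exceeds toxicity threshold"
    return None


if __name__ == "__main__":
    print(moderate({"HATE_SPEECH": 0.4}, 0.1, {"HATE_SPEECH": 0.3}))
    print(moderate({}, 0.6, threshold=0.7))
```

Lower thresholds reject more content: a HATE_SPEECH score of 0.4 is rejected under a 0.3 threshold but would pass under 0.5.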
Step-by-Step: AWS Content Moderation
a6 route create -f - <<'EOF'
{
  "id": "moderated-chat",
  "uri": "/v1/chat/completions",
  "methods": ["POST"],
  "plugins": {
    "ai-aws-content-moderation": {
      "comprehend": {
        "access_key_id": "AKIAIOSFODNN7EXAMPLE",
        "secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
        "region": "us-east-1"
      },
      "moderation_categories": {
        "HATE_SPEECH": 0.3,
        "VIOLENCE_OR_THREAT": 0.2,
        "SEXUAL": 0.5
      }
    },
    "ai-proxy": {
      "provider": "openai",
      "auth": {
        "header": {
          "Authorization": "Bearer sk-your-key"
        }
      },
      "options": {
        "model": "gpt-4"
      }
    }
  }
}
EOF
Toxic requests are rejected with HTTP 400:
request body exceeds HATE_SPEECH threshold
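
To keep credentials out of the route file, the same payload could be generated from environment variables and piped to `a6 route create -f -`, which reads the route from standard input as in the step above. The sketch below is one possible approach, not an a6 feature: `build_route` and the `AWS_REGION` variable are assumptions, while the other variable names follow the placeholders used elsewhere in this document.

```python
import json
import os
import sys

# Variables the sketch expects; AWS_REGION is optional and is an assumption.
REQUIRED = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "OPENAI_API_KEY")


def build_route(env):
    """Build the moderated-chat route from a mapping of environment values."""
    return {
        "id": "moderated-chat",
        "uri": "/v1/chat/completions",
        "methods": ["POST"],
        "plugins": {
            "ai-aws-content-moderation": {
                "comprehend": {
                    "access_key_id": env["AWS_ACCESS_KEY_ID"],
                    "secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
                    "region": env.get("AWS_REGION", "us-east-1"),
                },
                "moderation_categories": {
                    "HATE_SPEECH": 0.3,
                    "VIOLENCE_OR_THREAT": 0.2,
                    "SEXUAL": 0.5,
                },
            },
            "ai-proxy": {
                "provider": "openai",
                "auth": {
                    "header": {"Authorization": "Bearer " + env["OPENAI_API_KEY"]}
                },
                "options": {"model": "gpt-4"},
            },
        },
    }


if __name__ == "__main__":
    missing = [name for name in REQUIRED if name not in os.environ]
    if missing:
        print("missing environment variables: " + ", ".join(missing), file=sys.stderr)
    else:
        json.dump(build_route(os.environ), sys.stdout, indent=2)
```

Saved under a hypothetical name such as `build_route.py`, it could be used as `python build_route.py | a6 route create -f -`.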
Overall threshold (no per-category filtering)
{
  "plugins": {
    "ai-aws-content-moderation": {
      "comprehend": {
        "access_key_id": "AKIA...",
        "secret_access_key": "secret...",
        "region": "us-east-1"
      },
      "moderation_threshold": 0.7
    }
  }
}
Plugin 2: ai-aliyun-content-moderation
Uses Aliyun Machine-Assisted Moderation Plus. Supports request moderation, response moderation, and real-time streaming moderation.
Configuration Reference
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| endpoint | string | Yes | — | Aliyun service endpoint URL |
| region_id | string | Yes | — | Aliyun region (e.g. cn-shanghai) |
| access_key_id | string | Yes | — | Aliyun access key ID |
| access_key_secret | string | Yes | — | Aliyun access key secret |
| check_request | boolean | No | true | Enable request moderation |
| check_response | boolean | No | false | Enable response moderation |
| stream_check_mode | string | No | final_packet | realtime or final_packet |
| stream_check_cache_size | integer | No | 128 | Max chars per batch (realtime) |
| stream_check_interval | number | No | 3 | Seconds between batch checks (realtime) |
| request_check_service | string | No | llm_query_moderation | Aliyun service for request checks |
| request_check_length_limit | number | No | 2000 | Max chars per request chunk |
| response_check_service | string | No | llm_response_moderation | Aliyun service for response checks |
| response_check_length_limit | number | No | 5000 | Max chars per response chunk |
| risk_level_bar | string | No | high | Threshold: none, low, medium, high, max |
| deny_code | number | No | 200 | HTTP status code for rejected content |
| deny_message | string | No | — | Custom rejection message |
| timeout | integer | No | 10000 | Request timeout (ms) |
| ssl_verify | boolean | No | true | Verify SSL certificate |
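
The two length limits imply that long content is checked in chunks. How the plugin splits text is not documented here, so the sketch below assumes plain fixed-size slicing by character count; `split_for_check` is a hypothetical helper, useful mainly for estimating how many moderation calls a payload would trigger.

```python
def split_for_check(text, length_limit=2000):
    """Split text into chunks of at most length_limit characters.

    Fixed-size slicing is an assumption; the plugin may split differently.
    """
    if length_limit <= 0:
        raise ValueError("length_limit must be positive")
    return [text[i:i + length_limit] for i in range(0, len(text), length_limit)]


if __name__ == "__main__":
    prompt = "x" * 4500
    chunks = split_for_check(prompt)  # default mirrors request_check_length_limit
    print([len(chunk) for chunk in chunks])
```

Under this assumption a 4500-character request yields three chunks with the default request limit of 2000, while the same text fits in a single chunk under the response limit of 5000.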
Risk Level System
Content is blocked when its risk level meets or exceeds the risk_level_bar:
none (0) < low (1) < medium (2) < high (3) < max (4)
Setting risk_level_bar: "high" blocks content rated high or max.
Setting risk_level_bar: "low" blocks everything rated low or above.
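
The comparison can be written out as a short sketch using the numeric ranks given above. `RISK_LEVELS` and `is_blocked` are illustrative names, not plugin internals.

```python
# Numeric ranks as listed above: none (0) < low (1) < medium (2) < high (3) < max (4).
RISK_LEVELS = {"none": 0, "low": 1, "medium": 2, "high": 3, "max": 4}


def is_blocked(content_level, risk_level_bar="high"):
    """Return True when the content's risk level meets or exceeds the bar."""
    return RISK_LEVELS[content_level] >= RISK_LEVELS[risk_level_bar]


if __name__ == "__main__":
    for level in RISK_LEVELS:
        print(level, is_blocked(level, "high"), is_blocked(level, "low"))
```

With the default bar of high, content rated medium passes; lowering the bar to medium or low blocks progressively more content.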
Streaming Modes
| Mode | Behavior |
|---|---|
| final_packet | Buffers entire response, checks at end |
| realtime | Checks content in batches during streaming, can interrupt mid-response |
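
A rough model of realtime batching is sketched below. It assumes a batch is sent for checking once the buffer reaches stream_check_cache_size characters or once stream_check_interval seconds have passed since the last check; the plugin's actual trigger logic is not described in this document, and `RealtimeBatcher` is a hypothetical class.

```python
import time


class RealtimeBatcher:
    """Buffer streamed fragments and release them in batches for checking."""

    def __init__(self, cache_size=128, interval=3.0, clock=time.monotonic):
        self.cache_size = cache_size  # mirrors stream_check_cache_size
        self.interval = interval      # mirrors stream_check_interval (seconds)
        self.clock = clock            # injectable so the timing can be simulated
        self.buffer = ""
        self.last_check = clock()

    def feed(self, fragment):
        """Add a fragment; return a batch to check, or None if not yet due."""
        self.buffer += fragment
        now = self.clock()
        size_due = len(self.buffer) >= self.cache_size
        time_due = now - self.last_check >= self.interval
        if size_due or time_due:
            return self._flush(now)
        return None

    def finish(self):
        """Return whatever is left at the end of the stream, or None."""
        return self._flush(self.clock()) if self.buffer else None

    def _flush(self, now):
        batch, self.buffer = self.buffer, ""
        self.last_check = now
        return batch


if __name__ == "__main__":
    batcher = RealtimeBatcher(cache_size=10, interval=3.0)
    for piece in ["hello", "world!", "tail"]:
        batch = batcher.feed(piece)
        if batch is not None:
            print("check:", batch)
    print("final:", batcher.finish())
```

In final_packet mode the equivalent behavior would be to skip `feed` checks entirely and moderate the full response once the stream ends.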
Step-by-Step: Aliyun Request + Response Moderation
a6 route create -f - <<'EOF'
{
  "id": "aliyun-moderated-chat",
  "uri": "/v1/chat/completions",
  "methods": ["POST"],
  "plugins": {
    "ai-proxy": {
      "provider": "openai",
      "auth": {
        "header": {
          "Authorization": "Bearer sk-your-key"
        }
      },
      "options": {
        "model": "gpt-4"
      }
    },
    "ai-aliyun-content-moderation": {
      "endpoint": "https://green.cn-shanghai.aliyuncs.com",
      "region_id": "cn-shanghai",
      "access_key_id": "your-aliyun-key-id",
      "access_key_secret": "your-aliyun-key-secret",
      "check_request": true,
      "check_response": true,
      "risk_level_bar": "high",
      "deny_code": 400,
      "deny_message": "Content policy violation"
    }
  }
}
EOF
Realtime streaming moderation
{
  "plugins": {
    "ai-aliyun-content-moderation": {
      "endpoint": "https://green.cn-shanghai.aliyuncs.com",
      "region_id": "cn-shanghai",
      "access_key_id": "key-id",
      "access_key_secret": "key-secret",
      "check_request": true,
      "check_response": true,
      "stream_check_mode": "realtime",
      "stream_check_cache_size": 256,
      "stream_check_interval": 2,
      "risk_level_bar": "medium"
    }
  }
}
Integration Patterns
Pattern A: Request-only filtering (AWS)
Client → [AWS Comprehend blocks toxic] → ai-proxy → LLM → Response → Client
plugins:
  ai-aws-content-moderation:
    comprehend:
      access_key_id: "${AWS_ACCESS_KEY_ID}"
      secret_access_key: "${AWS_SECRET_ACCESS_KEY}"
      region: us-east-1
    moderation_threshold: 0.5
  ai-proxy:
    provider: openai
    auth:
      header:
        Authorization: "Bearer ${OPENAI_API_KEY}"
Pattern B: Request + response filtering (Aliyun)
Client → ai-proxy [sets context] → [Aliyun checks request] → LLM
→ [Aliyun checks response] → Client
plugins:
  ai-proxy:
    provider: openai
    auth:
      header:
        Authorization: "Bearer ${OPENAI_API_KEY}"
  ai-aliyun-content-moderation:
    endpoint: "https://green.cn-shanghai.aliyuncs.com"
    region_id: cn-shanghai
    access_key_id: "${ALIYUN_KEY_ID}"
    access_key_secret: "${ALIYUN_KEY_SECRET}"
    check_request: true
    check_response: true
    risk_level_bar: high
Secret Management
Both plugins support APISIX secret management for credentials:
plugins:
  ai-aws-content-moderation:
    comprehend:
      access_key_id: "$secret://vault/aws_key_id"
      secret_access_key: "$secret://vault/aws_secret_key"
      region: us-east-1
Config Sync Example
version: "1"
routes:
  - id: moderated-chat
    uri: /v1/chat/completions
    methods:
      - POST
    plugins:
      ai-aws-content-moderation:
        comprehend:
          access_key_id: AKIAIOSFODNN7EXAMPLE
          secret_access_key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
          region: us-east-1
        moderation_categories:
          HATE_SPEECH: 0.3
          VIOLENCE_OR_THREAT: 0.2
        moderation_threshold: 0.5
      ai-proxy:
        provider: openai
        auth:
          header:
            Authorization: Bearer sk-your-key
        options:
          model: gpt-4
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| "no ai instance picked" | Aliyun plugin used without ai-proxy | Always configure ai-proxy or ai-proxy-multi on the same route |
| AWS plugin not blocking | Threshold too permissive | Lower moderation_threshold or per-category thresholds |
| Aliyun response moderation inactive | check_response defaults to false | Explicitly set check_response: true |
| "Specified signature is not matched" | Wrong Aliyun credentials | Verify access_key_id and access_key_secret |
| High latency | Double moderation (both plugins) | Use one moderation provider per route, not both |
| Streaming interrupted mid-response | Aliyun realtime mode detected violation | Expected behavior; adjust risk_level_bar or use final_packet mode |