name: vss-generate-video-report-rag
description: Generates VSS video summary reports with LVS HITL and optional Enterprise RAG document grounding. Trigger when the user asks for a frag/RAG-assisted video report, knowledge-enhanced analysis, or Enterprise RAG context in a video summary.
license: Apache-2.0
metadata:
  version: "3.2.0"
  github-url: "https://github.com/NVIDIA-AI-Blueprints/video-search-and-summarization"
  tags: "nvidia blueprint operational"
VSS Generate Video Report RAG — Video Analysis with Enterprise RAG
Generate video summary reports using the LVS profile's RAG-enabled agent config. This skill adds Enterprise RAG document grounding and guided human-in-the-loop (HITL) parameter collection on top of the VSS agent.
Always run curl commands yourself; never instruct the user to run them.
Enable Enterprise RAG on the LVS Profile
The repository ships the RAG-enabled LVS agent config at
deploy/docker/developer-profiles/dev-profile-lvs/vss-agent/configs/config_rag.yml.
It is a superset of the default LVS config: regular caption retrieval remains
enabled, and frag_retrieval adds Enterprise RAG document grounding.
Use the normal /vss-deploy-profile workflow for deployment. The source
.env remains read-only; apply non-secret overrides to
deploy/docker/developer-profiles/dev-profile-lvs/generated.env.
generated.env is ignored by the repository, but it is still a plaintext file:
do not commit it, paste it into logs, or store long-lived credentials there.
Prefer a vault, Docker secrets, or ephemeral shell environment variables for
API keys.
Step 1: Configure the generated env file
REPO=${REPO:-$(git rev-parse --show-toplevel)}
cd "$REPO"
cp deploy/docker/developer-profiles/dev-profile-lvs/.env \
deploy/docker/developer-profiles/dev-profile-lvs/generated.env
Set these non-secret values in generated.env:
- HOST_IP — host IP (hostname -I | awk '{print $1}')
- VSS_AGENT_CONFIG_FILE=./deploy/docker/developer-profiles/dev-profile-lvs/vss-agent/configs/config_rag.yml
- RAG_SERVER_URL — Enterprise RAG server HTTP endpoint (defaults to http://rag-server:8081/v1)
- KNOWLEDGE_COLLECTION — default Enterprise RAG collection for frag_retrieval
Keep sensitive values (NGC_CLI_API_KEY, NVIDIA_API_KEY, RAG_API_KEY) out
of generated.env and out of resolved.yml. Do not export them before running
docker compose config > resolved.yml, because Compose expands environment
variables into that file. Use a secret manager, an existing authenticated Docker
session, or a local override file that references an ephemeral shell variable at
up time.
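The rules above can be checked mechanically before running docker compose config. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and it assumes generated.env uses plain KEY=VALUE lines. RAG_SERVER_URL is left out of the required list because it has a default.

```python
# Hypothetical helper: names below are illustrative, not from the repository.
REQUIRED = ("HOST_IP", "VSS_AGENT_CONFIG_FILE", "KNOWLEDGE_COLLECTION")
SECRETS = ("NGC_CLI_API_KEY", "NVIDIA_API_KEY", "RAG_API_KEY")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_env(text):
    """Return (missing_required, secrets_present) for an env file's contents."""
    values = parse_env(text)
    missing = [key for key in REQUIRED if not values.get(key)]
    leaked = [key for key in SECRETS if values.get(key)]
    return missing, leaked


if __name__ == "__main__":
    path = "deploy/docker/developer-profiles/dev-profile-lvs/generated.env"
    with open(path, encoding="utf-8") as handle:
        missing, leaked = check_env(handle.read())
    print("missing:", missing)
    print("secrets present:", leaked)
```

An empty assignment such as RAG_API_KEY= is treated as safe, since the copied .env may carry blank placeholders; only non-empty secret values are flagged.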
Step 2: Log in to NGC registry
Prefer an existing authenticated Docker session or a secret-managed login. If a login is required, use --password-stdin without printing token values:
read -rsp "NGC API key: " NGC_CLI_API_KEY
printf '%s\n' "$NGC_CLI_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
unset NGC_CLI_API_KEY
Step 3: Deploy the LVS profile with the RAG config
Do not export RAG_API_KEY for the dry run below. If the RAG server requires an
API key, create this untracked local override after resolved.yml is generated, in
the same deploy/docker directory so the relative -f path used at up time resolves:
cat > rag-secret.override.yml <<'EOF'
services:
  vss-agent:
    environment:
      RAG_API_KEY: ${RAG_API_KEY:?Set RAG_API_KEY only for docker compose up}
EOF
REPO=${REPO:-$(git rev-parse --show-toplevel)}
cd "$REPO/deploy/docker"
docker compose --env-file developer-profiles/dev-profile-lvs/generated.env \
config > resolved.yml
uv run "$REPO/skills/vss-deploy-profile/scripts/normalize_resolved_yml.py" \
"$REPO/deploy/docker/resolved.yml"
docker compose --env-file developer-profiles/dev-profile-lvs/generated.env \
-f resolved.yml up -d
When rag-secret.override.yml is needed, run this in place of the up -d command above:
read -rsp "RAG API key: " RAG_API_KEY
RAG_API_KEY="$RAG_API_KEY" docker compose \
--env-file developer-profiles/dev-profile-lvs/generated.env \
-f resolved.yml -f rag-secret.override.yml up -d
unset RAG_API_KEY
Step 4: Verify deployment
# Check containers are running
docker ps --format "table {{.Names}}\t{{.Status}}"
# Health check
curl -sf --max-time 5 "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/health" >/dev/null \
&& echo "VSS LVS RAG agent is running" \
|| echo "VSS LVS RAG agent is NOT reachable"
Tear down
REPO=${REPO:-$(git rev-parse --show-toplevel)}
cd "$REPO/deploy/docker"
docker compose -f resolved.yml down
When to Use
- User wants to generate a video summary or report using the RAG-enabled LVS pipeline
- User asks to analyze a video with Enterprise RAG knowledge context
- User mentions "frag", "enterprise RAG", or "knowledge-enhanced report"
When NOT to Use
- Simple video understanding queries (use video-understanding skill)
- Direct LVS summarization without HITL (use video-summarization skill)
- Deployment tasks (use deploy skill)
- Real-time alerts (use alerts skill)
Workflow: Generate an LVS Report with Enterprise RAG
Step 1: List available videos
curl -sS -X POST "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/v1/chat" \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "What videos are available?"}]}' | \
python3 -c "import json,sys; d=json.load(sys.stdin); print(d['choices'][0]['message']['content'])"
A selected video is required before Step 2. If the user has not already named one, return the short list and stop; resume when the user supplies the video name.
Step 2: Collect parameters from the user
Required user-provided parameters:
- Scenario — scenario label for the video. Example: "warehouse monitoring", "traffic monitoring", "retail store activity"
- Events — comma-separated event names to detect. Example: "accident, forklift stuck, workers not wearing PPE, person entering restricted area"
- Objects of Interest — focus objects, or "skip". Example: "forklifts, pallets, workers"
If any required value is missing, return a concise missing-fields message and stop; resume the workflow when the user supplies the missing values.
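The missing-fields check can be sketched as below. This is an illustration only: the field keys and the message wording are assumptions, not names used by the VSS agent. The value "skip" counts as a supplied answer for Objects of Interest.

```python
# Hypothetical field keys for the three values collected in this step.
REQUIRED_FIELDS = ("scenario", "events", "objects_of_interest")


def missing_fields(params):
    """Return the required fields that are absent or blank in params."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(params.get(field) or "").strip()
    ]


def missing_message(params):
    """Build a concise message for the user, or None when nothing is missing."""
    missing = missing_fields(params)
    if not missing:
        return None
    labels = ", ".join(field.replace("_", " ") for field in missing)
    return f"Missing required values: {labels}."


if __name__ == "__main__":
    print(missing_message({"scenario": "warehouse monitoring"}))
```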
There is no separate Enterprise RAG Query HITL prompt. Document grounding comes
from the RAG-enabled agent config exposing frag_retrieval; if the user wants
specific SOP, policy, or procedure context reflected in the report, capture that
context in the original report request or resolve it as a document-grounding
question before starting the HITL report flow.
Step 3: Start the report (HTTP HITL)
Send a POST to /v1/chat. This returns HTTP 202 with an execution_id and the first
HITL prompt. Replace VIDEO_NAME with the chosen video:
curl -sS -X POST "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/v1/chat" \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "Generate a report for VIDEO_NAME using long video summarization"}]}'
The response contains:
- execution_id — save this, used in all subsequent requests
- interaction_id — identifies the current prompt
- prompt.text — the HITL prompt text
- response_url — the URL to POST the response to
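A small parser can refuse to continue when any of these fields is absent. The sketch below assumes the three identifiers are top-level JSON keys and that prompt is an object holding text; the exact payload layout is not confirmed here, so adjust the lookups to the real response.

```python
import json


def parse_hitl_start(body):
    """Extract the four HITL fields from a JSON response body (a string).

    Assumed layout: top-level execution_id, interaction_id, response_url,
    plus a nested prompt object with a text key.
    """
    data = json.loads(body)
    fields = {
        "execution_id": data.get("execution_id"),
        "interaction_id": data.get("interaction_id"),
        "prompt_text": (data.get("prompt") or {}).get("text"),
        "response_url": data.get("response_url"),
    }
    absent = [name for name, value in fields.items() if not value]
    if absent:
        raise ValueError("HITL response is missing: " + ", ".join(absent))
    return fields
```

Raising on a partial response keeps the workflow from entering the HITL loop without a usable execution_id, interaction_id, and response_url.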
Step 4: Respond to HITL prompts
For each prompt, POST the user's parameter to the response_url. Replace EXECUTION_ID, INTERACTION_ID, and the text value:
curl -sS -X POST \
"http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/executions/EXECUTION_ID/interactions/INTERACTION_ID/response" \
-H "Content-Type: application/json" \
-d '{"response": {"type": "text", "text": "USER_VALUE_HERE"}}'
Then poll for the next prompt:
curl -sS "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/executions/EXECUTION_ID" | python3 -m json.tool
The HITL prompts come in this order:
- Scenario — respond with the scenario from Step 2
- Events — respond with the events from Step 2
- Objects of Interest — respond with the objects from Step 2, or "skip"
- Confirmation — respond with an empty string "" to confirm and start processing
Repeat the POST-then-poll cycle for each prompt.
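The four answers and their request bodies can be prepared up front, in prompt order. This is a sketch: the helper names are hypothetical, and only the body shape comes from this document.

```python
import json


def hitl_payload(text):
    """Wrap one answer in the HITL response body format."""
    return {"response": {"type": "text", "text": text}}


def hitl_answers(scenario, events, objects=None):
    """Return the four payloads in prompt order.

    Objects of Interest falls back to "skip"; the final empty string
    answers the confirmation prompt.
    """
    ordered = [scenario, events, objects or "skip", ""]
    return [hitl_payload(value) for value in ordered]


if __name__ == "__main__":
    for payload in hitl_answers("warehouse monitoring", "accident, forklift stuck"):
        print(json.dumps(payload))
```

Each printed line is a body suitable for the -d argument of the response POST; the interaction_id still has to be read from the poll result before each POST.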
Step 5: Wait for completion
After the confirmation prompt, the system processes the video, which takes 3-5 minutes for a video of about 3.5 minutes. Keep polling until the status changes from "running" to "completed":
curl -sS "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/executions/EXECUTION_ID" | python3 -m json.tool
Tell the user about the expected wait before the first poll, then poll every 30 seconds.
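The polling loop can be sketched with the status lookup injected as a callable, so the same logic works with any HTTP client. The 600-second timeout is an assumption (twice the upper end of the stated window), and fetch_status is a hypothetical callable that returns the execution's status string.

```python
import time


def wait_for_completion(fetch_status, interval=30.0, timeout=600.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until it stops returning "running".

    Returns the first non-running status. Raises TimeoutError when the
    status is still "running" after the (assumed) timeout in seconds.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status != "running":
            return status
        if clock() >= deadline:
            raise TimeoutError(f"still running after {timeout:.0f} seconds")
        sleep(interval)
```

The sleep and clock parameters exist only so the loop can be exercised without waiting; in normal use the defaults apply.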
Step 6: Present the results
When status is "completed", the response contains the full report with:
- Detected events with timestamps
- Narrative analysis summary
- Enterprise RAG context (if queried)
- PDF report download link (if available)
Present the report content to the user in a readable format.
Error Handling
- If a deployment, health, or chat request fails, report the failing endpoint, HTTP status or command error, and the most useful next check. Do not continue into HITL without a valid execution_id, interaction_id, and response_url.
- If a HITL response is rejected or the next execution poll omits the expected prompt, stop and show the execution status plus any error payload instead of guessing the next prompt.
- If the execution status becomes failed or cancelled, or stays running without progress beyond the expected processing window, surface the status and recommend checking the vss-agent logs before retrying.
- If the final response lacks report text or a PDF link, return the available response fields and clearly state which output was missing.
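For the post-confirmation phase, the status handling above can be written as one decision function. This is a sketch under assumptions: the action labels are hypothetical, and the 300-second window is taken from the upper end of the 3-5 minute estimate.

```python
def next_action(status, elapsed_s, window_s=300.0):
    """Map an execution status to a handling step (labels are illustrative)."""
    if status == "completed":
        return "present"
    if status in ("failed", "cancelled"):
        return "report-and-check-logs"
    if status == "running":
        # Past the expected window, treat a running job as stalled.
        if elapsed_s > window_s:
            return "report-and-check-logs"
        return "keep-polling"
    # Any status not covered above: stop and show the payload.
    return "stop-and-show-payload"
```

Unknown statuses fall through to the stop case so that the workflow never guesses at the next step.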
Quick Commands
Simple chat query (non-report)
For simple questions that do NOT involve report generation:
curl -sS -X POST "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/v1/chat" \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "YOUR_QUESTION_HERE"}]}' | \
python3 -c "import json,sys; d=json.load(sys.stdin); print(d['choices'][0]['message']['content'])"
Notes
- LVS reports take 3-5 minutes for a ~3.5 minute video; set that expectation before polling
- Enterprise RAG requires a reachable RAG server with data already ingested in KNOWLEDGE_COLLECTION
- If objects are not needed, respond with "skip"
- The HITL response format is always: {"response": {"type": "text", "text": "value"}}
- The RAG-enabled agent config must keep its HITL templates and hitl_enabled: true settings for HTTP HITL to work
- See also: video-summarization, video-understanding, report, vios, deploy