---
name: dlstreamer-coding-agent
description: "Build new DL Streamer video-analytics applications (Python, C, C++ or GStreamer command line). Use when: user describes a vision AI pipeline, wants to create a new sample app, combine elements from existing samples, add detection/classification/VLM/tracking/alerts/recording to a video pipeline, or create custom GStreamer elements in Python or C++. Translates natural-language pipeline descriptions into working DL Streamer code using established design patterns."
permissions:
  - write
  - command
---
DL Streamer Coding Agent
Build new DL Streamer video-analytics applications (Python, C, C++ or GStreamer command line) by composing design patterns extracted from existing sample apps.
File Resolution
This skill uses repo-root-relative paths to reference files outside the skill folder (e.g. docs/user-guide/elements/, samples/gstreamer/python/hello_dlstreamer/). When the full repo is cloned, the repo root is three directories above this skill file. If only the skill files were copied, resolve these paths against https://github.com/open-edge-platform/dlstreamer instead.
When to Use
- User describes a vision AI pipeline in natural language
- User wants to create a new Python sample application built on DL Streamer
- User wants to create a new C or C++ sample application built on DL Streamer
- User wants to create a new GStreamer command line using DL Streamer elements
- User wants to combine elements from multiple existing samples (e.g. detection + VLM + recording)
- User needs to add custom analytics logic or custom GStreamer elements in Python or C++
See example prompts for inspiration.
Directory Layout for a New Sample App
<new_sample_app_name>
├── <app_name>.py or .sh # Main application (Python or shell script)
├── export_models.py or .sh # Model download and export script
├── requirements.txt # Python dependencies for the application
├── export_requirements.txt # Python dependencies for model export scripts
├── README.md # Setup and usage instructions
├── plugins/ # Only if custom GStreamer elements are needed
│ ├── python/
│ │ └── <element>.py
│ └── c/
│ └── <element>.c
├── config/ # Only if config files are needed
│ └── *.txt / *.json
├── models/ # Created at runtime (cached model exports)
├── videos/ # Created at runtime (cached video downloads)
└── results/ # Created at runtime (output files)
Procedure
Execution Overview
After Step 0 (requirements gathering), kick off all independent long-running tasks in parallel via async terminals, then continue with reasoning-heavy work while they complete. When in doubt about ordering, always wait for a step's listed prerequisites to finish before starting it — the dependency graph below is the single source of truth.
Step 0 (gather requirements — interactive)
│
├──► Step 1 (Docker pull — async) ───────────────────────────────────────┐
├──► Step 2a (export scripts + pip install — async) ──► Step 2c (export)──┤
├──► Step 2b (video download — async) ────────────────────────────────────┤───► Step 5 (run & validate)
└──► Step 3 (design pipeline — reasoning) ──► Step 4 (generate app) ─────┘
Parallelization rules:
- Steps 1, 2a, 2b, and 3 are fully independent — start them all immediately after Step 0
- Step 2c (model export) depends on Step 2a (pip install) completing
- Step 4 (generate app) depends on Step 3 (pipeline design) completing
- Step 5 (run and validate) depends on Steps 1, 2c, and 4 all completing, plus Step 2b when a video download was started (as the graph above shows)
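The ordering rules above can be checked mechanically. The following is a minimal sketch, assuming Python 3.9+ and its standard `graphlib` module. The step identifiers are hypothetical names chosen for this illustration, and Step 2b is listed as a prerequisite of Step 5 because the graph routes it there.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps that must finish before it can start.
# Identifiers are illustrative; they mirror the dependency graph above.
PREREQUISITES = {
    "step1_docker_pull": {"step0_requirements"},
    "step2a_pip_install": {"step0_requirements"},
    "step2b_video_download": {"step0_requirements"},
    "step3_design_pipeline": {"step0_requirements"},
    "step2c_model_export": {"step2a_pip_install"},
    "step4_generate_app": {"step3_design_pipeline"},
    "step5_run_validate": {
        "step1_docker_pull",
        "step2b_video_download",
        "step2c_model_export",
        "step4_generate_app",
    },
}


def execution_waves(prerequisites):
    """Group steps into waves; all steps in one wave may run in parallel."""
    sorter = TopologicalSorter(prerequisites)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for index, wave in enumerate(execution_waves(PREREQUISITES)):
        print(index, ", ".join(wave))
```

Waves are coarser than the real schedule: Step 2c can start as soon as Step 2a finishes, without waiting for the rest of its wave.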
Safety rules for autonomous execution:
- Before running any command that installs packages, downloads external content, or modifies/deletes files, show the exact command and request explicit user confirmation in chat.
- Never interpolate raw user input into shell commands. Use validated allowlists and fixed argument templates.
- Restrict file operations to the sample application directory unless the user explicitly approves a wider scope.
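One way to follow the "allowlists and fixed argument templates" rule is to build every command as an argument list from validated values and to show the user the exact text before running it. The sketch below is an illustration only: the allowlist contents, the file-name pattern, and the function names are assumptions, not part of DL Streamer.

```python
import re
import shlex

# Hypothetical allowlist; a real agent would fill it from the confirmed Step 0 values.
ALLOWED_IMAGES = frozenset({"intel/dlstreamer:latest"})

# Plain file names only: no path separators, spaces, or shell metacharacters.
SAFE_SCRIPT = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,63}\.py")


def docker_pull_argv(image):
    """Fixed template for 'docker pull'; the image must be allowlisted."""
    if image not in ALLOWED_IMAGES:
        raise ValueError(f"image not in allowlist: {image!r}")
    return ["docker", "pull", image]


def export_argv(script_name):
    """Fixed template for running a local export script."""
    if not SAFE_SCRIPT.fullmatch(script_name):
        raise ValueError(f"unsafe script name: {script_name!r}")
    return ["python3", script_name]


def preview(argv):
    """Exact command text to show the user before asking for confirmation."""
    return shlex.join(argv)
```

Passing the resulting list to `subprocess.run(argv)` without `shell=True` keeps user-supplied text from being interpreted by a shell.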
Reference Lookup
Each reference document is used in one primary step to avoid redundant reads:
| Reference | Primary Step | Purpose |
|---|---|---|
| Requirements Questionnaire | Step 0 | Detailed questions to ask when user prompt is incomplete |
| Model Preparation | Step 2 | Prepare AI models in OpenVINO IR format |
| Pipeline Construction | Step 3 | Element selection, pipeline rules, common patterns |
| Sample Index | Step 3 | Existing samples to study before generating code |
| Design Patterns | Step 3 | Python application structure, patterns, and coding conventions |
| Debugging Hints | Step 5 | Docker testing, common gotchas, validation checklist |
Fast Path (Pattern Table Match)
Before proceeding with the full procedure, check if the user's prompt maps directly to a row in the Common Pipeline Patterns table. If a match is found:
- Pre-fill Step 0 fields from the matched row
- If any required field is missing or inferred from the matched row, present the pre-filled values to the user for confirmation (skip the full Requirements Questionnaire unless info is still missing)
- If all required fields were explicitly provided by the user (not inferred), skip requirement-field confirmation, but still request explicit user approval before running any command in Steps 1–2
- After the user confirms (or overrides), read only the design patterns, reference sections, and model-preparation sections needed for the confirmed selections
- Proceed to Steps 1–5
Step 0 — Gather Requirements
Extract the following from the user's prompt:
| Required info | Look for | Default if missing |
|---|---|---|
| Video input | File path, HTTP URL (for download), or RTSP URI | — (must ask) |
| AI model(s) | Model name/URL and task (detection, classification, VLM, OCR, …) | — (must ask) |
| Target hardware | Intel platform, available accelerators (GPU/NPU/CPU) | Not sure / detect at runtime |
| Output format | Annotated video, JSON, JPEG snapshots, display window | All of the above |
| Application type | Python app, C/C++ app, or GStreamer command line | When the prompt references an existing application to convert, determine the application type by inspecting the source application's file extensions. Application type must match the programming language of the input application (C/C++ → C/C++, Python → Python, shell → GStreamer command line) |
| Docker image | DL Streamer Docker tag | intel/dlstreamer:latest (this tag is treated as the latest Ubuntu 24 image) |
Application type override: If the user's prompt contains explicit language like
"bash script", "shell script", "gst-launch", or "command line", set Application type to
GStreamer command line regardless of the default. Only default to Python application when the prompt does not indicate a preference and there is no source application to convert.
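The override and default rules can be sketched as a small decision function. This is an illustrative reading of the rules, not an official implementation: the function name and the extension map are assumptions, and explicit language requests such as "write it in C++" are deliberately left out.

```python
import re
from pathlib import PurePosixPath

# Phrases that force the GStreamer command line application type.
CLI_HINTS = re.compile(
    r"\b(bash script|shell script|gst-launch|command line)\b", re.IGNORECASE
)

# Assumed mapping from source-file extension to application type.
EXTENSION_TO_TYPE = {
    ".py": "Python application",
    ".c": "C/C++ application",
    ".cc": "C/C++ application",
    ".cpp": "C/C++ application",
    ".sh": "GStreamer command line",
}


def choose_application_type(prompt, source_files=()):
    """Apply the override first, then source-file inspection, then the default."""
    if CLI_HINTS.search(prompt):
        return "GStreamer command line"
    for name in source_files:
        suffix = PurePosixPath(name).suffix.lower()
        if suffix in EXTENSION_TO_TYPE:
            return EXTENSION_TO_TYPE[suffix]
    return "Python application"
```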
If the user's prompt explicitly provides all required info (video input AND model names are explicitly stated, not inferred), proceed directly to Step 1.
If any required info is missing or was inferred via Fast Path (not explicitly stated
by the user), you MUST present the pre-filled values and ask the user to confirm
or override before proceeding. Use the interactive question tool if available
(e.g. vscode_askQuestions in VS Code Copilot), otherwise list the values inline
in chat. Do NOT silently assume defaults and skip confirmation.
If the user requests NPU but the selected model or elements do not support NPU inference, inform the user and suggest falling back to GPU or CPU.
Step 1 — Pull Docker Image (async)
Start the Docker image pull in an async terminal immediately after Step 0 completes.
Always pull the latest available image. Do NOT reuse a locally cached image without pulling first.
docker pull intel/dlstreamer:latest
If docker pull fails (for example, image not found or network error), inform the user
and suggest checking Docker login and network connectivity before retrying.
Step 2 — Prepare Models and Video (async)
2a — Create export scripts and kick off venv + pip install
Check whether the requested models (or similar ones) appear in the model exporters bundled with DL Streamer.
| Model exporter | Typical Models | Path |
|---|---|---|
| download_public_models.sh | Traditional computer vision models | samples/download_public_models.sh |
| download_hf_models.py | HuggingFace models, including VLM models and Transformer-based detection/classification models (RTDETR, CLIP, ViT) | scripts/download_models/download_hf_models.py |
| download_ultralytics_models.py | Specialized model downloader for Ultralytics YOLO models | scripts/download_models/download_ultralytics_models.py |
If a model is found, extract its download recipe and create a local export_models.py in the application directory.
If a model is not listed, check the Model Preparation Reference for export instructions, then write a new script using the Export Models Template.
Create the export_requirements.txt file using the Export Requirements Template if the model export script requires additional Python packages (e.g. HuggingFace transformers, Ultralytics, optimum-cli, etc.). Add comments in export_requirements.txt to indicate which model export script requires a specific package. Use exact pinned versions from the Model Preparation Reference → Requirements.
CRITICAL (CPU-only PyTorch): The first line of
export_requirements.txt must be --extra-index-url https://download.pytorch.org/whl/cpu (before any torch-dependent package like ultralytics or nncf). Without this, pip pulls multi-GB GPU libraries not needed for model export. See Model Preparation Reference → Requirements for the full template.
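A quick check before starting pip install can catch a misplaced index line. The sketch below is an assumption-laden illustration: it treats leading comments and blank lines as harmless, which is looser than "first line"; compare the literal first line instead if a strict check is preferred.

```python
CPU_INDEX_LINE = "--extra-index-url https://download.pytorch.org/whl/cpu"


def first_directive(text):
    """Return the first line that is neither blank nor a comment, or None."""
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            return stripped
    return None


def has_cpu_only_index_first(text):
    """True when the CPU-only PyTorch index precedes every package entry."""
    return first_directive(text) == CPU_INDEX_LINE


if __name__ == "__main__":
    with open("export_requirements.txt", encoding="utf-8") as handle:
        if not has_cpu_only_index_first(handle.read()):
            raise SystemExit("export_requirements.txt: CPU-only index line must come first")
```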
Once both files are written, start venv creation and pip install in an async terminal:
# Run in async mode — do NOT wait for completion
python3 -m venv .<app_name>-export-venv && \
source .<app_name>-export-venv/bin/activate && \
pip install -r export_requirements.txt
2b — Download video to local directory
If the user provided an HTTP URL for video input, download it now:
mkdir -p videos && curl -L -o videos/<video_name>.mp4 \
-H "Referer: https://www.pexels.com/" \
-H "User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36" \
"<DIRECT_VIDEO_URL>"
The application itself should not download videos — it accepts only --input
pointing to a local file or RTSP URI. Document download steps in the README.
Pexels page URLs → direct file URLs: A Pexels page URL (
https://www.pexels.com/video/<slug>-<ID>/) is not a direct download link. Scrape the page with curl -s and search the HTML for videos.pexels.com/video-files/ links to get the actual .mp4 URL. Do not guess resolution or FPS, since they vary per video. If scraping fails, ask the user for the direct URL.
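The search step can be sketched as a regular-expression scan over the HTML that curl -s returned. This is a sketch under assumptions: the page markup is not guaranteed to keep this shape, and the sample URL in the usage block is a made-up placeholder, not a real video.

```python
import re
from html import unescape

# Matches direct file links on the videos.pexels.com host, up to the first ".mp4".
VIDEO_LINK = re.compile(
    r"https://videos\.pexels\.com/video-files/[^\s\"'<>\\]+?\.mp4"
)


def extract_video_links(html):
    """Return direct .mp4 links found in the page, in order, without duplicates."""
    links = []
    for link in VIDEO_LINK.findall(unescape(html)):
        if link not in links:
            links.append(link)
    return links


if __name__ == "__main__":
    # Placeholder markup for illustration only.
    sample = '<source src="https://videos.pexels.com/video-files/0000/example_hd.mp4">'
    print(extract_video_links(sample))
```

When several links are returned, present them to the user rather than picking one, because resolution and FPS differ per file.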
Git LFS warning: Videos from
edge-ai-resources may return HTML instead of video data. Verify: file videos/sample.mp4 | grep -q "ISO Media". Prefer Pexels direct URLs as default test videos.
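Where the `file` utility is unavailable, a rough equivalent of the check above is to look for the `ftyp` box that normally opens an MP4 (ISO base media) file. This is only an approximation of what `file` reports, and the function names are illustrative.

```python
def looks_like_mp4(header):
    """True when bytes 4..8 hold the 'ftyp' box type that usually opens an MP4 file."""
    return len(header) >= 8 and header[4:8] == b"ftyp"


def is_valid_video(path):
    """Read the first bytes of a downloaded file and apply the header check."""
    with open(path, "rb") as handle:
        return looks_like_mp4(handle.read(12))


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, "ok" if is_valid_video(name) else "NOT an MP4 (HTML or LFS pointer?)")
```

An HTML error page or a Git LFS pointer file (which begins with the text `version https://git-lfs.github.com/spec/v1`) fails this check.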
Proceed to Step 3 while pip install and docker pull run in the background.
2c — Run model export (after pip install completes)
Before running the export, confirm the async terminal from Step 2a has completed successfully. If the install failed, diagnose and re-run before continuing.
Once confirmed, run the model export:
source .<app_name>-export-venv/bin/activate
python3 export_models.py # or bash export_models.sh
If model export fails, check command output for common causes (unsupported architecture, insufficient RAM, missing model weights), report the error with a suggested fix, then retry.
Step 3 — Design Pipeline
Design a DL Streamer pipeline that fulfills the user's requirements. This step covers element selection and application structure.
3a — Select elements and assemble pipeline string
Use the Pipeline Construction Reference to identify elements for each pipeline stage (source, decode, inference, metadata, sink). Follow the Pipeline Design Rules in that reference.
For common use cases, go straight to file generation using the use-case → template/pattern mapping table.
For complex cases, consult the Sample Index for relevant reference implementations, then read the specific samples that match the user's use case.
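As an illustration of assembling a pipeline string stage by stage (source, decode, inference, overlay, sink), the sketch below joins a fixed element list with " ! ". The element choice is an assumption made for this example, not a recommended design; the paths are placeholders, and element properties should be verified with gst-inspect-1.0 against the installed DL Streamer version.

```python
ALLOWED_DEVICES = ("CPU", "GPU", "NPU")


def build_pipeline(input_path, model_xml, device="CPU"):
    """Assemble a simple detection pipeline string (illustrative element set)."""
    if device not in ALLOWED_DEVICES:
        raise ValueError(f"unsupported device: {device!r}")
    for value in (input_path, model_xml):
        if '"' in value:
            raise ValueError(f"double quotes are not allowed in paths: {value!r}")
    stages = [
        f'filesrc location="{input_path}"',                 # source
        "decodebin3",                                       # decode
        f'gvadetect model="{model_xml}" device={device}',   # inference
        "queue",
        "gvawatermark",                                     # draw detections
        "videoconvert",
        "fakesink sync=false",                              # sink (no display)
    ]
    return " ! ".join(stages)


if __name__ == "__main__":
    print(build_pipeline("videos/sample.mp4", "models/detector.xml", "GPU"))
```

The same string can be passed to gst-launch-1.0 or to `Gst.parse_launch` in a Python application.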
Converting from DeepStream
When converting a DeepStream application, follow these additional rules:
- Inventory the source pipeline. Identify all elements in the DeepStream pipeline first.
- Map each element 1-to-1 using the Converting Guide at docs/user-guide/dev_guide/converting_deepstream_to_dlstreamer.md.
- Connect DL Streamer elements using the Common Pipeline Patterns table or Sample Index.
- Do not add elements absent from the source pipeline. Every element in the converted pipeline must trace back to the inventory.
3b — Choose application structure
For a CLI application, the pipeline string from 3a is the deliverable — wrap it in a gst-launch-1.0 shell script.
For a Python application, map the user's description to one or more design patterns using the Pattern Selection Table:
- Select the pipeline construction approach — see Pattern 1: Pipeline Core
- Add callbacks/probes as needed
- Add custom Python elements if the user needs inline analytics — check first whether existing GStreamer elements can handle the logic. If not, follow the Conventions under Pattern 7.
- Wire up argument parsing
- Add the pipeline event loop — see Pattern 2: Pipeline Event Loop
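For the argument-parsing step, a minimal sketch with the standard `argparse` module might look like this. Only `--input` comes from this document (a local file or RTSP URI, never a download URL); the `--device` and `--output-dir` flags and their defaults are hypothetical additions for illustration.

```python
import argparse


def local_file_or_rtsp(value):
    """Reject HTTP(S) URLs: the application itself must not download videos."""
    if value.lower().startswith(("http://", "https://")):
        raise argparse.ArgumentTypeError(
            "download the video first; --input takes a local file or RTSP URI"
        )
    return value


def build_parser():
    parser = argparse.ArgumentParser(description="DL Streamer sample application")
    parser.add_argument("--input", required=True, type=local_file_or_rtsp,
                        help="local video file or rtsp:// URI")
    # Hypothetical flags, shown only to illustrate the pattern.
    parser.add_argument("--device", default="CPU", choices=["CPU", "GPU", "NPU"],
                        help="inference device")
    parser.add_argument("--output-dir", default="results",
                        help="directory for output files")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```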
Step 4 — Generate Application
Generate all application files following the directory layout defined at the beginning of this document.
Language-specific generation:
- C/C++ applications: Use the Application Template as the starting skeleton. Read the Design Patterns Reference for coding conventions and application structure.
- Python applications: Use the Application Template as the starting skeleton. Read the Design Patterns Reference for coding conventions and application structure.
For all languages:
- Use the README Template to generate README.md by replacing all {{PLACEHOLDERS}} as described below:

  | Placeholder | What to generate |
  |---|---|
  | {{APP_TITLE}} | Short title of the application |
  | {{APP_DESCRIPTION}} | 2–3 sentences describing what the application does and its main use case |
  | {{DLSTREAMER_CODING_AGENT_PROMPT}} | The verbatim initial user prompt wrapped in a Markdown blockquote (>). Do not paraphrase or summarize. |
  | {{APP_VISUALIZATION}} | Optional screenshot line. Omit this line entirely if no screenshot is available. |
  | {{DETAILED_DESCRIPTION}} | Extended description: model names, hardware requirements, expected outputs. If the input video is from a publicly available source (e.g. Pexels), add: This sample uses a video from <link> by <author>. |
  | {{NUMBERED_STEPS}} | Numbered list of pipeline stages, e.g. 1. **Detects** objects using gvadetect |
  | {{PIPELINE_DIAGRAM}} | Mermaid diagram. Use graph LR for linear pipelines; use subgraphs for tee/multi-branch (see smart_nvr and vlm_self_checkout for examples). |
  | {{PIPELINE_ELEMENTS_LIST}} | Optional bulleted list of each GStreamer/DL Streamer element and its role. Omit if the pipeline is straightforward. |
  | {{VIDEO_DOWNLOAD_INSTRUCTIONS}} | curl command to download the test video into videos/. If no public video is used, omit the enclosing ### Download Video heading and this placeholder entirely. |
  | {{ADVANCED_USAGE}} | Optional second usage block showing non-default CLI options. Omit if not needed. |
  | {{HOW_IT_WORKS_SECTIONS}} | One ### STEP N subsection per major pipeline stage or custom element, with relevant code snippets. |
  | {{CONFIGURATION_FILES_SECTION}} | Optional ## Configuration Files table (file name + purpose). Omit the section if unused. |
  | {{CLI_ARGUMENTS_TABLE}} | One table row per CLI argument: flag name, default value, description. |
  | {{OUTPUT_FILES_LIST}} | Bulleted list of output files produced under results/. |

- If the application requires Python packages, list them in requirements.txt. If the OpenVINO Python runtime is required, pin the same version as the OpenVINO runtime installed with DL Streamer.
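Placeholder replacement in the README Template can be sketched as a single regular-expression pass that fails loudly when a placeholder has no value. The function name and the tiny template in the usage block are illustrative assumptions; the real template lives in the skill's reference files.

```python
import re

# Placeholders look like {{UPPER_CASE_NAME}}.
PLACEHOLDER = re.compile(r"\{\{([A-Z_]+)\}\}")


def render_readme(template, values):
    """Replace every {{NAME}} with values[NAME]; refuse to leave any unresolved."""
    missing = sorted(set(PLACEHOLDER.findall(template)) - set(values))
    if missing:
        raise KeyError(f"unresolved placeholders: {missing}")
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


if __name__ == "__main__":
    demo = "# {{APP_TITLE}}\n\n{{APP_DESCRIPTION}}\n"
    print(render_readme(demo, {
        "APP_TITLE": "Demo",
        "APP_DESCRIPTION": "Detects objects in a video file.",
    }))
```

Optional placeholders that should be omitted entirely still need their line or heading removed from the template; substituting an empty string leaves a blank line behind.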
Step 5 — Run, Debug, and Validate
Run in Docker
docker run --init --rm \
-u "$(id -u):$(id -g)" \
-e PYTHONUNBUFFERED=1 \
-v "$(pwd)":/app -w /app \
--device /dev/dri \
--group-add $(stat -c "%g" /dev/dri/render*) \
--device /dev/accel \
--group-add $(stat -c "%g" /dev/accel/accel*) \
intel/dlstreamer:latest \
python3 <app_name>.py
Autonomous execution: never wait for user confirmation. Launch in async mode, poll
get_terminal_output every 15–30s until completion. Only ask the user when a decision is needed (e.g. device change after OOM). This applies to all long-running commands: docker run, docker pull, pip install, model export.
Validate: check that output matches the user's expected results. Use the Debugging Hints and Validation Checklist for common gotchas. For continuous or long inputs, send EOS to finalize.
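The docker run command above assumes that both /dev/dri and /dev/accel exist on the host. A hedged variant is to build the argument list in Python and add each device flag only when matching device nodes are found. The helper names are hypothetical, and the glob patterns are copied from the command above.

```python
import glob
import os


def group_ids(pattern):
    """Distinct owning group IDs of the device nodes matching a glob pattern."""
    return sorted({os.stat(path).st_gid for path in glob.glob(pattern)})


def docker_run_argv(app_script, workdir, uid, gid, render_gids, accel_gids,
                    image="intel/dlstreamer:latest"):
    """Build the docker run argument list, skipping devices that are absent."""
    argv = ["docker", "run", "--init", "--rm",
            "-u", f"{uid}:{gid}",
            "-e", "PYTHONUNBUFFERED=1",
            "-v", f"{workdir}:/app", "-w", "/app"]
    for device, gids in (("/dev/dri", render_gids), ("/dev/accel", accel_gids)):
        if gids:  # only pass the device when at least one node was found
            argv += ["--device", device]
            for device_gid in gids:
                argv += ["--group-add", str(device_gid)]
    return argv + [image, "python3", app_script]


if __name__ == "__main__":
    print(docker_run_argv("app.py", os.getcwd(), os.getuid(), os.getgid(),
                          group_ids("/dev/dri/render*"),
                          group_ids("/dev/accel/accel*")))
```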
Post-Task — Report Coding Agent Analytics
Report the analytics table immediately after the application is validated, in the same response as the validation results.
After the application is working, report timing metrics:
- AI reasoning time — understanding the prompt, designing the pipeline, writing code
- Environment setup time: waiting for pip install, model export, Docker image pull
- Debug and validation time: running the application, checking outputs, fixing issues
- User wait time — waiting for user input or confirmation
- Total activity time (phases may overlap, so total ≠ sum of individual phases)
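Because phases may overlap, total activity time is the length of the union of the phase intervals rather than their sum. A minimal sketch follows, assuming each phase is recorded as a (start, end) pair in seconds; the recording format and the sample numbers are assumptions made for this illustration.

```python
def total_activity(intervals):
    """Length of the union of (start, end) intervals; overlaps are counted once."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Gap before this interval: close the previous merged block.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or touching interval: extend the merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


if __name__ == "__main__":
    # Illustrative timings in seconds since the task started.
    phases = {
        "AI reasoning": (0, 60),
        "Environment setup": (10, 100),
        "Debug and validation": (100, 130),
    }
    for name, (start, end) in phases.items():
        print(f"{name}: {end - start}s")
    print(f"Total activity: {total_activity(phases.values())}s")
```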
Examples
See example prompts for inspiration and practical demonstrations of the procedure.