name: hub-detection description: "Use for object detection and tracking nodes in dora. Triggers on: dora-yolo, dora-sam2, dora-cotracker, YOLO, YOLOv8, SAM, SAM2, CoTracker, object detection, segmentation, tracking, bounding box, mask, point tracking, 目标检测, 分割, 跟踪, 边界框" globs: ["/dataflow.yml", "/dataflow.yaml"] source: "https://github.com/dora-rs/dora-hub"
Object Detection & Tracking Nodes
YOLO detection, SAM2 segmentation, and CoTracker point tracking
Available Detection Nodes
| Node | Install | Description | GPU Required |
|---|---|---|---|
| dora-yolo | pip install dora-yolo |
YOLOv8 object detection | Recommended |
| dora-sam2 | pip install dora-sam2 |
Segment Anything 2 | Required (CUDA) |
| dora-cotracker | pip install dora-cotracker |
Point tracking | Recommended |
dora-yolo
YOLOv8 object detection with bounding boxes, confidence scores, and labels.
YAML Configuration
- id: yolo
build: pip install dora-yolo
path: dora-yolo
inputs:
image: camera/image
outputs:
- bbox
env:
MODEL: yolov8n.pt # yolov5n, yolov8n/s/m/l/x
Input Format
# image: UInt8Array
metadata = {
"width": 640,
"height": 480,
"encoding": "bgr8" # or "rgb8"
}
Output Format (bbox)
# StructArray with bounding boxes
bbox = {
"bbox": np.array([x1,y1,x2,y2, ...]).flatten(), # xyxy format
"conf": np.array([0.95, 0.87, ...]), # confidence scores
"labels": np.array(["person", "car", ...]) # class names
}
metadata = {"format": "xyxy", "primitive": "boxes2d"}
Decoding Bounding Boxes
bbox_data = event["value"][0]
bbox = {
"bbox": bbox_data["bbox"].values.to_numpy().reshape(-1, 4),
"conf": bbox_data["conf"].values.to_numpy(),
"labels": bbox_data["labels"].values.to_numpy(zero_copy_only=False)
}
# Draw boxes
for i, box in enumerate(bbox["bbox"]):
x1, y1, x2, y2 = box.astype(int)
label = bbox["labels"][i]
conf = bbox["conf"][i]
cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
cv2.putText(frame, f"{label} {conf:.2f}", (x1, y1-10),
cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
dora-sam2
Segment Anything Model 2 for object segmentation.
Requirements
- NVIDIA GPU with CUDA support
YAML Configuration
- id: sam2
build: pip install dora-sam2
path: dora-sam2
inputs:
image: camera/image
bbox: yolo/bbox # Optional: use YOLO boxes as prompts
outputs:
- masks # UInt8Array segmentation masks
Output Format
# masks: UInt8Array
metadata = {
"width": 640,
"height": 480,
"primitive": "masks"
}
YOLO + SAM2 Pipeline
nodes:
- id: camera
build: pip install opencv-video-capture
path: opencv-video-capture
inputs:
tick: dora/timer/millis/33
outputs:
- image
- id: yolo
build: pip install dora-yolo
path: dora-yolo
inputs:
image: camera/image
outputs:
- bbox
- id: sam2
build: pip install dora-sam2
path: dora-sam2
inputs:
image: camera/image
bbox: yolo/bbox
outputs:
- masks
- id: rerun
build: pip install dora-rerun
path: dora-rerun
inputs:
image: camera/image
detections: yolo/bbox
segmentation: sam2/masks
dora-cotracker
Real-time point tracking using Facebook's CoTracker model.
YAML Configuration
- id: tracker
build: pip install dora-cotracker
path: dora-cotracker
inputs:
image: camera/image
points_to_track: detector/points # Optional programmatic input
outputs:
- tracked_image # Visualization with tracked points
- tracked_points # Current point positions
Interactive Usage
- Left-click in "Raw Feed" window to add tracking points
- Points assigned unique IDs (C0, C1 for clicks, I0, I1 for inputs)
Programmatic Point Input
import numpy as np
import pyarrow as pa
# Send points to track
points = np.array([
[320, 240], # Center
[160, 120], # Top-left
[480, 360] # Bottom-right
], dtype=np.float32)
node.send_output("points_to_track", pa.array(points.ravel()), {
"num_points": len(points),
"dtype": "float32",
"shape": (len(points), 2)
})
Output Format
# tracked_points: Float32Array
# Same format as input points
metadata = {
"num_points": N,
"dtype": "float32",
"shape": (N, 2)
}
YOLO Detection to CoTracker Pipeline
nodes:
- id: camera
build: pip install opencv-video-capture
path: opencv-video-capture
inputs:
tick: dora/timer/millis/100
outputs:
- image
env:
ENCODING: "rgb8"
IMAGE_WIDTH: 640
IMAGE_HEIGHT: 480
- id: yolo
build: pip install dora-yolo
path: dora-yolo
inputs:
image: camera/image
outputs:
- bbox
- centroids # Custom output: detection centers
- id: tracker
build: pip install dora-cotracker
path: dora-cotracker
inputs:
image: camera/image
points_to_track: yolo/centroids
outputs:
- tracked_image
- tracked_points
- id: rerun
build: pip install dora-rerun
path: dora-rerun
inputs:
raw_image: camera/image
tracking_viz: tracker/tracked_image
Bounding Box Data Format
Sending Bounding Boxes
import pyarrow as pa
import numpy as np
# Create bbox structure
bbox_dict = {
"bbox": np.array([x1, y1, x2, y2, ...], dtype=np.float32),
"conf": np.array([0.95, ...], dtype=np.float32),
"labels": np.array(["person", ...])
}
# Encode as Arrow
encoded = pa.array([bbox_dict])
node.send_output("bbox", encoded, {
"format": "xyxy",
"primitive": "boxes2d"
})
Box Format Conversion
# xyxy to xywh
def xyxy_to_xywh(box):
x1, y1, x2, y2 = box
return [x1, y1, x2 - x1, y2 - y1]
# xywh to xyxy
def xywh_to_xyxy(box):
x, y, w, h = box
return [x, y, x + w, y + h]
Complete Detection Pipeline
nodes:
- id: camera
build: pip install opencv-video-capture
path: opencv-video-capture
inputs:
tick: dora/timer/millis/33
outputs:
- image
env:
IMAGE_WIDTH: 640
IMAGE_HEIGHT: 480
- id: yolo
build: pip install dora-yolo
path: dora-yolo
inputs:
image: camera/image
outputs:
- bbox
env:
MODEL: yolov8n.pt
- id: rerun
build: pip install dora-rerun
path: dora-rerun
inputs:
camera_feed:
source: camera/image
metadata:
primitive: "image"
detections:
source: yolo/bbox
metadata:
primitive: "boxes2d"
env:
IMAGE_WIDTH: 640
IMAGE_HEIGHT: 480
Related Skills
- hub-camera - Camera input nodes
- hub-visualization - Rerun visualization
- domain-vision - Vision pipeline patterns