pmoves-model-registry - SKILL.md Agent Skill

name: PMOVES Model Registry
description: |
  Query, discover, and enrich the PMOVES.AI model catalog. Manages all AI model metadata (LLM, embedding, TTS, vision), HuggingFace enrichment, TensorZero TOML config export, and GPU deployment tracking across the fleet.
keywords: models, registry, catalog, embedding, llm, tensorzero, huggingface, gpu, deployment, discovery
version: 1.0.0
category: Infrastructure/AI
tier: 1
agent_class: Standard
agent_id: pmoves_model_registry

PMOVES Model Registry

Agent Class: Standard (Pmoves-)
Category: Infrastructure/AI
Version: 1.0.0
Tier: 1 (Core Infrastructure)
Status: Active (Supabase-backed model catalog + HuggingFace enrichment)
Port: 8110

Capabilities

Command	What It Does
`list-models`	List all active models with optional type/provider filter
`get-model`	Get detailed metadata for a single model by ID
`enrich-hf`	Fetch embedding dimensions, tags, and CUDA support from the HuggingFace API
`enrich-hf-bulk`	Batch-enrich all models that have hf_id in metadata
`export-tensorzero`	Generate TensorZero TOML config from catalog
`list-deployments`	Show active GPU model deployments across fleet
`register-deployment`	Register or update a model deployment (called by the GPU Orchestrator)
`service-models`	List models mapped to a specific service

Trigger Phrases (Pinokio 7 Interpreter)

Phrase	Action	Endpoint
`"list available models"`	Show full model catalog	`GET /api/models`
`"show embedding models"`	Filter catalog by type	`GET /api/models?model_type=embedding`
`"show LLM models"`	Filter catalog by type	`GET /api/models?model_type=llm`
`"get model details for [id]"`	Single model lookup	`GET /api/models/{id}`
`"enrich model from huggingface"`	Fetch HF metadata + dimensions	`POST /api/models/{id}/enrich-hf`
`"enrich all embedding models"`	Batch HF enrichment	`POST /api/models/enrich-hf-bulk?model_type=embedding`
`"export tensorzero config"`	Download TensorZero TOML	`GET /api/tensorzero/config`
`"show GPU deployments"`	List active model deployments	`GET /api/deployments`
`"which models are on 5090"`	Filter deployments by node	`GET /api/deployments?node_id=5090`
`"what models does hi-rag use"`	Service-specific model lookup	`GET /api/services/hi-rag/models`
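
The phrase table above amounts to a lookup from an utterance to an HTTP method and path. A minimal sketch of such a lookup follows. The `resolve_phrase` function, the `ROUTES` list, and the regular expressions are illustrative assumptions, not the Pinokio interpreter's actual matching logic; only the paths come from the table, and the two enrichment phrases are left out because they need extra context (a model ID) that the phrase itself does not carry.

```python
import re
from typing import Optional, Tuple

# Hypothetical routing table: (regex, HTTP method, path template).
# Paths are copied from the trigger-phrase table; the regexes are assumptions.
ROUTES = [
    (r"list available models", "GET", "/api/models"),
    (r"show (?P<model_type>embedding|llm) models", "GET",
     "/api/models?model_type={model_type}"),
    (r"get model details for (?P<id>\S+)", "GET", "/api/models/{id}"),
    (r"export tensorzero config", "GET", "/api/tensorzero/config"),
    (r"show gpu deployments", "GET", "/api/deployments"),
    (r"which models are on (?P<node_id>\S+)", "GET",
     "/api/deployments?node_id={node_id}"),
    (r"what models does (?P<service>\S+) use", "GET",
     "/api/services/{service}/models"),
]


def resolve_phrase(phrase: str) -> Optional[Tuple[str, str]]:
    """Return (method, path) for a known phrase, or None if nothing matches."""
    text = " ".join(phrase.split())  # collapse repeated whitespace
    for pattern, method, template in ROUTES:
        match = re.fullmatch(pattern, text, flags=re.IGNORECASE)
        if match is None:
            continue
        values = match.groupdict()
        # Model types appear in lowercase in the query examples (llm, embedding).
        if "model_type" in values:
            values["model_type"] = values["model_type"].lower()
        return method, template.format(**values)
    return None
```

Matching is case-insensitive, but captured IDs keep their original case so that model and service names pass through unchanged.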

API Reference

Model Catalog

# List all models
curl http://localhost:8110/api/models

# Filter by type (embedding, llm, tts, vision, audio)
curl "http://localhost:8110/api/models?model_type=embedding"

# Filter by provider (ollama, openai, anthropic, venice)
curl "http://localhost:8110/api/models?provider=ollama"

# Get single model
curl http://localhost:8110/api/models/{model_id}

# Models for a service
curl http://localhost:8110/api/services/hi-rag/models
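
For scripted access, the catalog queries above can be wrapped in a small helper. This is a sketch using only the Python standard library. The names `models_url` and `fetch_json` are hypothetical, the host `localhost` is taken from the curl examples, and it is an assumption that `model_type` and `provider` can be combined in one request and that responses are JSON.

```python
import json
from typing import Any, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8110"  # host and port as used in the curl examples


def models_url(model_type: Optional[str] = None,
               provider: Optional[str] = None,
               base_url: str = BASE_URL) -> str:
    """Build the /api/models URL, adding only the filters that were given."""
    params = {}
    if model_type:
        params["model_type"] = model_type
    if provider:
        params["provider"] = provider
    query = urlencode(params)  # percent-encodes values safely
    return f"{base_url}/api/models" + (f"?{query}" if query else "")


def fetch_json(url: str, timeout: float = 10.0) -> Any:
    """GET a URL and decode the body as JSON (assumes a JSON response)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With the service running, `fetch_json(models_url(model_type="embedding"))` would issue the same request as the second curl command.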

HuggingFace Enrichment

# Enrich a single model (requires metadata.hf_id to be set)
curl -X POST http://localhost:8110/api/models/{model_id}/enrich-hf

# Batch-enrich all embedding models
curl -X POST "http://localhost:8110/api/models/enrich-hf-bulk?model_type=embedding"
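
How the registry derives embedding dimensions is not described here. As an illustration only, the sketch below reads the dimension from a HuggingFace `config.json` that has already been parsed into a dict. The helper names and the `DIMENSION_KEYS` order are assumptions. The keys themselves are common transformers config fields (`hidden_size` in BERT-style models, `d_model` in T5, `n_embd` in GPT-2, `dim` in DistilBERT).

```python
from typing import Any, Mapping, Optional

# Config keys that commonly hold the hidden width; the order is an assumption.
DIMENSION_KEYS = ("hidden_size", "d_model", "n_embd", "dim")


def hf_config_url(hf_id: str, revision: str = "main") -> str:
    """URL of a public model's config.json on the HuggingFace Hub."""
    return f"https://huggingface.co/{hf_id}/resolve/{revision}/config.json"


def embedding_dimension(config: Mapping[str, Any]) -> Optional[int]:
    """Return the first positive integer found under a known dimension key."""
    for key in DIMENSION_KEYS:
        value = config.get(key)
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, int) and not isinstance(value, bool) and value > 0:
            return value
    return None
```

The hidden width matches the output embedding size only when the model adds no projection layer after pooling, so a value obtained this way is best treated as a default that can be overridden.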

TensorZero Config Export

# Generate TOML config from catalog
curl http://localhost:8110/api/tensorzero/config -o tensorzero.toml

GPU Deployments

# List active deployments
curl http://localhost:8110/api/deployments

# Filter by node
curl "http://localhost:8110/api/deployments?node_id=5090"

# Filter by status
curl "http://localhost:8110/api/deployments?status=loaded"

Health Check

curl http://localhost:8110/healthz
# → {"status": "healthy", "timestamp": "...", "services": {"supabase": "...", "nats": "..."}}
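
A monitoring script can reduce the health payload to a pass/fail flag plus the per-service details. The sketch below assumes only the shape shown above: a top-level `status` that equals `"healthy"` when all is well, and an optional `services` object. The function name `summarize_health` is hypothetical, and the per-service values are passed through without interpretation because their possible values are not listed here.

```python
import json
from typing import Any, Dict, Tuple


def summarize_health(body: str) -> Tuple[bool, Dict[str, Any]]:
    """Parse a /healthz response body into (is_healthy, services)."""
    payload = json.loads(body)
    healthy = payload.get("status") == "healthy"
    # Service values are returned as-is; their vocabulary is not documented.
    services = dict(payload.get("services") or {})
    return healthy, services
```

Any status other than `"healthy"`, including a missing one, is reported as unhealthy, which is the conservative choice for an alerting check.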

Integration Points

Supabase: pmoves_core.models, pmoves_core.model_service_mapping, pmoves_core.v_active_deployments
NATS: Publishes model.registry.updated.v1 on catalog mutations (model enriched, deployment registered)
GPU Orchestrator: Calls POST /api/deployments when models are loaded or unloaded on GPU nodes
TensorZero Gateway: Consumes the exported TOML config for model provider routing
HuggingFace API: Source of model cards and config.json, from which embedding dimensions, tags, and CUDA support are read
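
The GPU Orchestrator integration boils down to a JSON POST against `/api/deployments`. The request body schema is not documented in this skill, so the sketch below is hypothetical: the field names `model_id`, `node_id`, and `status` are assumptions (the last two borrowed from the query filters on the deployments listing), as is the helper name `build_deployment_request`. The function only builds the request and sends nothing.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8110"  # host and port as used in the curl examples


def build_deployment_request(model_id: str, node_id: str, status: str,
                             base_url: str = BASE_URL) -> Request:
    """Build (but do not send) a POST /api/deployments request."""
    # Hypothetical payload; the real schema may use different field names.
    payload = {"model_id": model_id, "node_id": node_id, "status": status}
    return Request(
        f"{base_url}/api/deployments",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the payload easy to inspect before it touches the catalog.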

Prerequisites

Supabase running with pmoves_core schema seeded
NATS message bus at port 4222 (optional; catalog changes still work without NATS)
HuggingFace API accessible (no auth required for public models)
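
Before starting the service, the network-facing prerequisites can be probed with a plain TCP connect. This is a minimal sketch: the helper names are hypothetical, and any host used with them is an assumption, since only the NATS port (4222) and the registry port (8110) are given above. A successful connect shows that something is listening on the port, not that the right service is behind it.

```python
import socket
from typing import Dict, Tuple


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or DNS failure
        return False


def check_prerequisites(targets: Dict[str, Tuple[str, int]]) -> Dict[str, bool]:
    """Probe each named (host, port) target and report reachability."""
    return {name: port_open(host, port) for name, (host, port) in targets.items()}
```

For example, `check_prerequisites({"nats": ("localhost", 4222)})` reports whether the optional NATS bus is reachable on the local machine.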