name: llmobs-integration description: | Use when adding, debugging, or modifying LLMObs plugins for an LLM library in dd-trace-js. Triggers: "add LLMObs support", "instrument chat completions / streaming / embeddings / agent runs / orchestration / tool calls / retrieval", "LLMObsPlugin", "getLLMObsSpanRegisterOptions", "setLLMObsTags", "LlmObsCategory", "LlmObsSpanKind", any provider tag ("openai" / "anthropic" / "genai" / "google" / "langchain" / "langgraph" / "ai-sdk" llmobs), "VCR cassettes".
LLM Observability Integration Skill
This skill covers creating LLMObs plugins that instrument LLM library operations and emit span events. Supported operations: chat completions (streaming and non-streaming), embeddings, agent runs, orchestration (workflows / graphs), tool calls, retrieval (RAG / vector DB).
Read Upstream Source First
LLM libraries iterate fast — six-month-old assumptions about an SDK's response shape, streaming contract, or tool-call format are usually wrong. Before category detection or any plugin work, read the upstream library's source for the installed version (versions/<lib>@<range>/node_modules/<lib>). The category decision tree below depends on facts the source carries (does this package make HTTP calls? does it orchestrate? does it support multiple providers?). See apm-integrations § Read Upstream Source First for the shallow-clone / npm pack shapes.
Core Concepts
1. LLMObsPlugin Base Class
All LLMObs plugins extend LLMObsPlugin. Two methods must be implemented:
getLLMObsSpanRegisterOptions(ctx)— returns{ modelProvider, modelName, kind, name }.setLLMObsTags(ctx)— extracts and tags input / output messages, token metrics, and model metadata.
Lifecycle: start(ctx) registers the span and captures context; the wrapped operation runs; asyncEnd(ctx) calls setLLMObsTags(); end(ctx) restores the parent.
See references/plugin-architecture.md for the full implementation surface.
2. Package Category System
CRITICAL: Every integration must be classified into one category using the LlmObsCategory enum. This determines test strategy and implementation approach.
LlmObsCategory Enum Values
LlmObsCategory.LLM_CLIENT- Direct API wrappers (openai, anthropic, genai)- Signs: Makes HTTP calls to LLM provider endpoints, requires API keys
- Test strategy: VCR with real API calls via proxy
- Instrumentation: Hook chat/completion methods
LlmObsCategory.MULTI_PROVIDER- Multi-provider frameworks (ai-sdk, langchain)- Signs: Supports multiple LLM providers via configuration, wraps LLM_CLIENT libraries
- Test strategy: VCR with real API calls via proxy
- Instrumentation: Hook provider abstraction layer
LlmObsCategory.ORCHESTRATION- Workflow managers (langgraph)- Signs: Graph/workflow execution, state management, NO direct HTTP to LLM providers
- Test strategy: Pure function tests, NO VCR, NO real API calls
- Instrumentation: Hook workflow lifecycle (invoke, stream, run)
- Special: Tests should use actual LLM as orchestration node (not mock responses)
LlmObsCategory.INFRASTRUCTURE- Protocols/servers (MCP)- Signs: Protocol implementation, server/client architecture, transport layers
- Test strategy: Mock server tests
- Instrumentation: Hook protocol handlers
Decision Tree
Answer these questions by reading the code:
- Does the package make direct HTTP calls to LLM provider endpoints?
- YES → Go to question 2
- NO → Go to question 3
- Does it support multiple LLM providers via configuration?
- YES →
LlmObsCategory.MULTI_PROVIDER - NO →
LlmObsCategory.LLM_CLIENT
- Does it implement workflow/graph orchestration with state management?
- YES →
LlmObsCategory.ORCHESTRATION - NO →
LlmObsCategory.INFRASTRUCTURE
See references/category-detection.md for detailed heuristics and examples.
3. LLM Span Kinds
Use the LlmObsSpanKind enum:
LlmObsSpanKind.LLM- Chat completions, text generationLlmObsSpanKind.WORKFLOW- Graph/chain executionLlmObsSpanKind.AGENT- Agent runsLlmObsSpanKind.TOOL- Tool/function callsLlmObsSpanKind.EMBEDDING- Embedding generationLlmObsSpanKind.RETRIEVAL- Vector DB/RAG retrieval
Most common: Use 'llm' for chat completions/text generation in LLM_CLIENT and MULTI_PROVIDER categories.
4. Message Extraction
All plugins must convert provider-specific message formats to the standard format:
Standard format: [{content: string, role: string}]
Common roles: 'user', 'assistant', 'system', 'tool'
Provider-specific handling:
- OpenAI: Direct format match, handle
function_callandtool_calls - Anthropic: Map
rolevalues, flatten nested content arrays - Google GenAI: Extract from
partsarrays, map role names - Multi-provider: Detect provider and apply appropriate extraction
See references/message-extraction.md for provider-specific patterns.
Implementation Steps
- Detect package category (REQUIRED FIRST STEP)
- Follow decision tree above
- Output: category, confidence, reasoning
- Create plugin file
- Location:
packages/dd-trace/src/llmobs/plugins/{integration}/index.js - Extend:
LLMObsPluginbase class - Implement: Required methods per plugin architecture
- Implement
getLLMObsSpanRegisterOptions(ctx)
- Extract model provider and name from context
- Determine span kind (usually
'llm') - Return registration options object
- Implement
setLLMObsTags(ctx)
- Extract input messages from
ctx.arguments - Extract output messages from
ctx.result - Extract token metrics (input_tokens, output_tokens, total_tokens)
- Extract metadata (temperature, max_tokens, etc.)
- Tag span using
this._taggermethods
- Handle edge cases
- Streaming responses (if applicable)
- Error cases (empty output messages)
- Non-standard message formats
- Missing metadata
See references/plugin-architecture.md for step-by-step implementation guide.
Plugin Registration
All plugins must export an array:
Static properties required:
integration- Integration name (e.g., 'openai')id- Unique plugin ID (e.g., 'llmobs_openai')prefix- Channel prefix (e.g., 'tracing:apm:openai:chat')
References
For detailed information, see:
- references/plugin-architecture.md - Complete plugin structure, implementation steps, helper methods
- references/category-detection.md - Package classification heuristics and detection process
- references/message-extraction.md - Provider-specific message format patterns
- references/reference-implementations.md - Working plugin examples (Anthropic, Google GenAI)