llmobs-integration

star 817

Use when adding, debugging, or modifying LLMObs plugins for an LLM library in dd-trace-js. Triggers: "add LLMObs support", "instrument chat completions / streaming / embeddings / agent runs / orchestration / tool calls / retrieval", "LLMObsPlugin", "getLLMObsSpanRegisterOptions", "setLLMObsTags", "LlmObsCategory", "LlmObsSpanKind", any provider tag ("openai" / "anthropic" / "genai" / "google" / "langchain" / "langgraph" / "ai-sdk" llmobs), "VCR cassettes".

DataDog By DataDog schedule Updated 6/3/2026

name: llmobs-integration description: | Use when adding, debugging, or modifying LLMObs plugins for an LLM library in dd-trace-js. Triggers: "add LLMObs support", "instrument chat completions / streaming / embeddings / agent runs / orchestration / tool calls / retrieval", "LLMObsPlugin", "getLLMObsSpanRegisterOptions", "setLLMObsTags", "LlmObsCategory", "LlmObsSpanKind", any provider tag ("openai" / "anthropic" / "genai" / "google" / "langchain" / "langgraph" / "ai-sdk" llmobs), "VCR cassettes".

LLM Observability Integration Skill

This skill covers creating LLMObs plugins that instrument LLM library operations and emit span events. Supported operations: chat completions (streaming and non-streaming), embeddings, agent runs, orchestration (workflows / graphs), tool calls, retrieval (RAG / vector DB).

Read Upstream Source First

LLM libraries iterate fast — six-month-old assumptions about an SDK's response shape, streaming contract, or tool-call format are usually wrong. Before category detection or any plugin work, read the upstream library's source for the installed version (versions/<lib>@<range>/node_modules/<lib>). The category decision tree below depends on facts the source carries (does this package make HTTP calls? does it orchestrate? does it support multiple providers?). See apm-integrations § Read Upstream Source First for the shallow-clone / npm pack shapes.

Core Concepts

1. LLMObsPlugin Base Class

All LLMObs plugins extend LLMObsPlugin. Two methods must be implemented:

  • getLLMObsSpanRegisterOptions(ctx) — returns { modelProvider, modelName, kind, name }.
  • setLLMObsTags(ctx) — extracts and tags input / output messages, token metrics, and model metadata.

Lifecycle: start(ctx) registers the span and captures context; the wrapped operation runs; asyncEnd(ctx) calls setLLMObsTags(); end(ctx) restores the parent.

See references/plugin-architecture.md for the full implementation surface.

2. Package Category System

CRITICAL: Every integration must be classified into one category using the LlmObsCategory enum. This determines test strategy and implementation approach.

LlmObsCategory Enum Values

  • LlmObsCategory.LLM_CLIENT - Direct API wrappers (openai, anthropic, genai)

    • Signs: Makes HTTP calls to LLM provider endpoints, requires API keys
    • Test strategy: VCR with real API calls via proxy
    • Instrumentation: Hook chat/completion methods
  • LlmObsCategory.MULTI_PROVIDER - Multi-provider frameworks (ai-sdk, langchain)

    • Signs: Supports multiple LLM providers via configuration, wraps LLM_CLIENT libraries
    • Test strategy: VCR with real API calls via proxy
    • Instrumentation: Hook provider abstraction layer
  • LlmObsCategory.ORCHESTRATION - Workflow managers (langgraph)

    • Signs: Graph/workflow execution, state management, NO direct HTTP to LLM providers
    • Test strategy: Pure function tests, NO VCR, NO real API calls
    • Instrumentation: Hook workflow lifecycle (invoke, stream, run)
    • Special: Tests should use actual LLM as orchestration node (not mock responses)
  • LlmObsCategory.INFRASTRUCTURE - Protocols/servers (MCP)

    • Signs: Protocol implementation, server/client architecture, transport layers
    • Test strategy: Mock server tests
    • Instrumentation: Hook protocol handlers

Decision Tree

Answer these questions by reading the code:

  1. Does the package make direct HTTP calls to LLM provider endpoints?
  • YES → Go to question 2
  • NO → Go to question 3
  1. Does it support multiple LLM providers via configuration?
  • YES → LlmObsCategory.MULTI_PROVIDER
  • NO → LlmObsCategory.LLM_CLIENT
  1. Does it implement workflow/graph orchestration with state management?
  • YES → LlmObsCategory.ORCHESTRATION
  • NO → LlmObsCategory.INFRASTRUCTURE

See references/category-detection.md for detailed heuristics and examples.

3. LLM Span Kinds

Use the LlmObsSpanKind enum:

  • LlmObsSpanKind.LLM - Chat completions, text generation
  • LlmObsSpanKind.WORKFLOW - Graph/chain execution
  • LlmObsSpanKind.AGENT - Agent runs
  • LlmObsSpanKind.TOOL - Tool/function calls
  • LlmObsSpanKind.EMBEDDING - Embedding generation
  • LlmObsSpanKind.RETRIEVAL - Vector DB/RAG retrieval

Most common: Use 'llm' for chat completions/text generation in LLM_CLIENT and MULTI_PROVIDER categories.

4. Message Extraction

All plugins must convert provider-specific message formats to the standard format:

Standard format: [{content: string, role: string}]

Common roles: 'user', 'assistant', 'system', 'tool'

Provider-specific handling:

  • OpenAI: Direct format match, handle function_call and tool_calls
  • Anthropic: Map role values, flatten nested content arrays
  • Google GenAI: Extract from parts arrays, map role names
  • Multi-provider: Detect provider and apply appropriate extraction

See references/message-extraction.md for provider-specific patterns.

Implementation Steps

  1. Detect package category (REQUIRED FIRST STEP)
  • Follow decision tree above
  • Output: category, confidence, reasoning
  1. Create plugin file
  • Location: packages/dd-trace/src/llmobs/plugins/{integration}/index.js
  • Extend: LLMObsPlugin base class
  • Implement: Required methods per plugin architecture
  1. Implement getLLMObsSpanRegisterOptions(ctx)
  • Extract model provider and name from context
  • Determine span kind (usually 'llm')
  • Return registration options object
  1. Implement setLLMObsTags(ctx)
  • Extract input messages from ctx.arguments
  • Extract output messages from ctx.result
  • Extract token metrics (input_tokens, output_tokens, total_tokens)
  • Extract metadata (temperature, max_tokens, etc.)
  • Tag span using this._tagger methods
  1. Handle edge cases
  • Streaming responses (if applicable)
  • Error cases (empty output messages)
  • Non-standard message formats
  • Missing metadata

See references/plugin-architecture.md for step-by-step implementation guide.

Plugin Registration

All plugins must export an array:

Static properties required:

  • integration - Integration name (e.g., 'openai')
  • id - Unique plugin ID (e.g., 'llmobs_openai')
  • prefix - Channel prefix (e.g., 'tracing:apm:openai:chat')

References

For detailed information, see:

Install via CLI
npx skills add https://github.com/DataDog/dd-trace-js --skill llmobs-integration
Repository Details
star Stars 817
call_split Forks 395
navigation Branch main
article Path SKILL.md
More from Creator