dspy-langwatch


Use LangWatch for DSPy auto-tracing and real-time optimizer progress. Use when you want to set up LangWatch, langwatch.dspy.init, auto-tracing DSPy, real-time optimization dashboard, optimizer progress tracking, app.langwatch.ai, or DSPy optimizer dashboard. Also used for langwatch setup, pip install langwatch, langwatch trace, optimizer progress, real-time optimization, watch optimizer run, LangWatch self-hosted, langwatch docker, langwatch vs langtrace, langwatch autotrack_dspy.

By AcidicSoil. Updated 5/8/2026


LangWatch — Auto-Tracing + Real-Time Optimizer Progress for DSPy

Guide the user through setting up LangWatch for automatic DSPy tracing and live optimizer progress tracking.

What is LangWatch

LangWatch is an open-source LLMOps platform with two distinct DSPy integrations:

  1. Auto-tracing (inference): automatically captures module inputs/outputs, LM calls, and retrieval queries
  2. Optimizer progress tracking (unique feature): streams live step-by-step scores, predictor states, and cost as optimizers run

Of the other tools compared in this skill (Langtrace, Phoenix, Weave, MLflow), none patches DSPy optimizers to stream live progress.

When to use LangWatch

Use LangWatch when:

  • You run long optimization passes and want to see progress in real-time
  • You want auto-tracing of DSPy inference with no manual decorators
  • You want a dashboard showing optimizer scores, cost, and predictor state as they happen
  • You need both inference tracing AND optimizer monitoring in one tool

Do NOT use LangWatch when:

  • You only need tracing and want the simplest one-line setup — see /dspy-langtrace
  • You want a local trace viewer with built-in evals — see /dspy-phoenix
  • Your team already uses W&B for experiment tracking — see /dspy-weave
  • You need a model registry and full ML lifecycle — see /dspy-mlflow

Setup

Install

pip install langwatch
# Or install the dspy extra for DSPy version compatibility
# (quoted so the brackets survive shells such as zsh):
pip install "langwatch[dspy]"

Cloud setup (quickest)

  1. Sign up at app.langwatch.ai
  2. Create a project and copy your API key
  3. Set the environment variable:
export LANGWATCH_API_KEY="your-key"

Self-hosted setup

Docker Compose

git clone https://github.com/langwatch/langwatch.git
cd langwatch
docker compose up -d

Then point your SDK at your local instance:

export LANGWATCH_ENDPOINT="http://localhost:5560"
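
Before running traced code, it can help to confirm that the two environment variables above resolve to what you expect. The following is a minimal sketch, not part of the LangWatch SDK: the helper name langwatch_settings is hypothetical, and the fallback to https://app.langwatch.ai when LANGWATCH_ENDPOINT is unset is an assumption based on the cloud URL mentioned above.

```python
import os

# Assumption: with no LANGWATCH_ENDPOINT set, the cloud instance is the target.
DEFAULT_ENDPOINT = "https://app.langwatch.ai"


def langwatch_settings(env=os.environ):
    """Return the API key and endpoint read from the environment.

    Hypothetical pre-flight helper; it only reads variables and never
    contacts LangWatch.
    """
    key = env.get("LANGWATCH_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "LANGWATCH_API_KEY is not set; export it before running traced code."
        )
    endpoint = env.get("LANGWATCH_ENDPOINT", "").strip() or DEFAULT_ENDPOINT
    return {"api_key": key, "endpoint": endpoint.rstrip("/")}
```

Calling langwatch_settings() at the top of a script fails fast with a readable message instead of producing traces that never arrive.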

Helm chart (Kubernetes)

LangWatch provides a Helm chart for production Kubernetes deployments. See the LangWatch docs for Helm values and configuration.

Integration 1: Auto-Tracing (Inference)

Use @langwatch.trace() and autotrack_dspy() to automatically capture all DSPy calls during inference.

What gets traced

| Component | Details captured |
|---|---|
| Module calls | Inputs/outputs per dspy.Module.forward() |
| LM calls | Model name, messages, response, token counts |
| Retrievals | Queries, retrieved passages |
| Nested spans | Full call tree with parent-child relationships |

Basic auto-tracing

import langwatch
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # or "anthropic/claude-sonnet-4-5-20250929", etc.

@langwatch.trace()
def answer_question(question):
    langwatch.get_current_trace().autotrack_dspy()

    program = dspy.ChainOfThought("question -> answer")
    return program(question=question)

result = answer_question("What is DSPy?")
# View traces at app.langwatch.ai (or your self-hosted URL)

Tracing a full pipeline

import langwatch
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # or "anthropic/claude-sonnet-4-5-20250929", etc.

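class RAGPipeline(dspy.Module):
    def __init__(self):
        super().__init__()  # initialize dspy.Module state before adding sub-modules
        # dspy.Retrieve needs a retrieval model configured, e.g. dspy.configure(rm=...)
        self.retrieve = dspy.Retrieve(k=3)
        self.answer = dspy.ChainOfThought("context, question -> answer")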

    def forward(self, question):
        context = self.retrieve(question).passages
        return self.answer(context=context, question=question)

pipeline = RAGPipeline()

@langwatch.trace()
def handle_query(question):
    langwatch.get_current_trace().autotrack_dspy()
    return pipeline(question=question)

result = handle_query("How do refunds work?")
# LangWatch captures:
#   - The RAGPipeline call
#   - The Retrieve call (query, passages)
#   - The ChainOfThought LM call (prompt, response, tokens)

Adding metadata to traces

@langwatch.trace()
def handle_query(user_id, question):
    trace = langwatch.get_current_trace()
    trace.autotrack_dspy()
    trace.update(metadata={"user_id": user_id, "environment": "production"})
    return pipeline(question=question)
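
Each traced function above repeats the same two steps: open a trace with @langwatch.trace() and call autotrack_dspy() on the current trace inside it. A minimal sketch of a helper that bundles both steps follows. The name dspy_traced is hypothetical and not part of LangWatch; it relies only on the two calls shown above, and it takes the langwatch module as an argument so it can be tried with a stand-in object.

```python
import functools


def dspy_traced(langwatch_module):
    """Build a decorator that opens a trace and enables DSPy auto-tracking.

    Hypothetical helper: `langwatch_module` is expected to expose trace()
    and get_current_trace(), as the langwatch package does in the examples
    above.
    """
    def decorator(fn):
        @langwatch_module.trace()      # outer: creates the trace context
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # inner: runs inside the trace, so a current trace exists here
            langwatch_module.get_current_trace().autotrack_dspy()
            return fn(*args, **kwargs)
        return wrapper
    return decorator
```

With langwatch installed, usage would be @dspy_traced(langwatch) placed above a function such as handle_query. Because autotrack_dspy() is called inside the wrapper, it cannot be forgotten or placed outside the trace.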

Integration 2: Optimizer Progress Tracking (Unique Feature)

LangWatch patches DSPy optimizer classes to stream live step-by-step progress. Among the tools compared in this skill, it is the only one that does this.

What the optimizer dashboard shows

  • Live scores: see each trial's score as it completes
  • Predictor states: which instructions and demos the optimizer is testing
  • LM calls: every call the optimizer makes during search
  • Cost tracking: running cost total as the optimizer runs
  • Progress bar: how far through the optimization you are

Supported optimizers

| Optimizer | Supported |
|---|---|
| dspy.BootstrapFewShot | Yes |
| dspy.BootstrapFewShotWithRandomSearch | Yes |
| dspy.COPRO | Yes |
| dspy.MIPROv2 | Yes |
| Others | Raises ValueError |
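
Since unsupported optimizers raise ValueError, a script that accepts an arbitrary optimizer may want to check first and skip tracking instead of crashing. The sketch below is an assumption-laden illustration: is_supported_optimizer is a hypothetical helper that only compares the class name against the table above, which is not necessarily how LangWatch itself decides.

```python
# Class names copied from the supported-optimizers table above.
SUPPORTED_OPTIMIZERS = {
    "BootstrapFewShot",
    "BootstrapFewShotWithRandomSearch",
    "COPRO",
    "MIPROv2",
}


def is_supported_optimizer(optimizer):
    """Return True if the optimizer's class name appears in the table.

    Hypothetical name-based check; subclasses with a different name are
    reported as unsupported.
    """
    return type(optimizer).__name__ in SUPPORTED_OPTIMIZERS
```

A caller could guard langwatch.dspy.init(...) with this check and run the optimizer untracked when it returns False.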

Setup optimizer tracking

import langwatch.dspy
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # or "anthropic/claude-sonnet-4-5-20250929", etc.

trainset = [...]  # your training examples

def metric(example, prediction, trace=None):
    return prediction.answer.strip().lower() == example.answer.strip().lower()

program = dspy.ChainOfThought("question -> answer")
optimizer = dspy.MIPROv2(metric=metric, auto="medium")

# Initialize LangWatch optimizer tracking
langwatch.dspy.init(
    experiment="mipro-medium-run1",
    optimizer=optimizer,
)

# Run optimization — progress streams to the LangWatch dashboard
optimized = optimizer.compile(program, trainset=trainset)
# Watch live progress at app.langwatch.ai

Tracking BootstrapFewShot

import langwatch.dspy
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # or "anthropic/claude-sonnet-4-5-20250929", etc.

# metric and trainset are defined as in the MIPROv2 example above
program = dspy.ChainOfThought("question -> answer")
optimizer = dspy.BootstrapFewShot(metric=metric, max_bootstrapped_demos=4)

langwatch.dspy.init(
    experiment="bootstrap-4demos",
    optimizer=optimizer,
)

optimized = optimizer.compile(program, trainset=trainset)

Comparing multiple optimizer runs

Run multiple experiments with different names — they appear side-by-side in the LangWatch dashboard:

import langwatch.dspy
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # or "anthropic/claude-sonnet-4-5-20250929", etc.

# metric and trainset are defined as in the MIPROv2 example above
experiments = [
    ("bootstrap-4", dspy.BootstrapFewShot, {"metric": metric, "max_bootstrapped_demos": 4}),
    ("bootstrap-8", dspy.BootstrapFewShot, {"metric": metric, "max_bootstrapped_demos": 8}),
    ("mipro-light", dspy.MIPROv2, {"metric": metric, "auto": "light"}),
    ("mipro-medium", dspy.MIPROv2, {"metric": metric, "auto": "medium"}),
]

for name, opt_class, kwargs in experiments:
    program = dspy.ChainOfThought("question -> answer")
    optimizer = opt_class(**kwargs)
    langwatch.dspy.init(experiment=name, optimizer=optimizer)
    optimized = optimizer.compile(program, trainset=trainset)
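
The loop above gives each configuration its own name, but rerunning the script would reuse those names. One way to keep reruns separate is to append a timestamp to each base name. This is a minimal sketch: unique_experiment_name is a hypothetical helper, and the timestamp format is an arbitrary choice, not a LangWatch convention.

```python
from datetime import datetime, timezone


def unique_experiment_name(base, now=None):
    """Append a UTC timestamp (second resolution) to an experiment name.

    Hypothetical helper; pass `now` explicitly to get a reproducible name.
    """
    now = now or datetime.now(timezone.utc)
    return f"{base}-{now.strftime('%Y%m%d-%H%M%S')}"
```

In the loop, experiment=unique_experiment_name(name) would replace experiment=name. Two runs started within the same second would still collide, so add a counter or random suffix if that can happen.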

LangWatch vs Langtrace vs Phoenix vs Weave vs MLflow

| Feature | LangWatch | Langtrace | Phoenix | Weave | MLflow |
|---|---|---|---|---|---|
| DSPy auto-tracing | Yes | Yes (built-in) | Yes (plugin) | No (manual) | Yes (autolog) |
| Optimizer progress | Yes (unique) | No | No | No | No |
| Live scores dashboard | Yes | No | No | No | No |
| Setup effort | 2-3 lines | One line | Two lines + launch | Manual decorators | One line |
| Self-hosted | Yes (Docker, Helm) | Yes (Docker) | Yes | No (cloud only) | Yes |
| Cloud option | Yes (app.langwatch.ai) | Yes (app.langtrace.ai) | Yes (Arize) | Yes (wandb.ai) | Yes (Databricks) |
| Model registry | No | No | No | No | Yes |
| Built-in evals | Basic | Basic | Yes | Basic | Basic |

Decision guide

What do you need?
|
+- Watch optimizer progress live? -> LangWatch (this skill)
+- Easiest auto-tracing setup? -> Langtrace (/dspy-langtrace)
+- Tracing + evals (local)? -> Phoenix (/dspy-phoenix)
+- Tracing + experiment tracking (cloud)? -> Weave (/dspy-weave)
+- Full ML lifecycle + model registry? -> MLflow (/dspy-mlflow)

Gotchas

  1. Claude forgets to call autotrack_dspy() inside the traced function. The @langwatch.trace() decorator creates the trace context, but DSPy auto-tracking only activates when you call langwatch.get_current_trace().autotrack_dspy() inside the function body. Without it, you get an empty trace with no DSPy spans.
  2. Claude puts autotrack_dspy() outside the @langwatch.trace() function. The autotrack_dspy() call must be inside the decorated function where a trace context exists. Calling it at module level or before the trace starts raises an error because there is no current trace.
  3. Claude calls langwatch.dspy.init() after optimizer.compile(). The init() call must come before compile() — it patches the optimizer to stream progress. If called after, no progress data is captured. Always: create optimizer, call langwatch.dspy.init(experiment=..., optimizer=...), then call optimizer.compile().
  4. Claude reuses the same experiment name across runs. Each langwatch.dspy.init(experiment=...) call should use a unique experiment name so runs appear as separate entries in the dashboard. Reusing names overwrites or merges data, making comparison impossible.
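
The ordering rule in gotcha 3 can be enforced in code by routing every optimization through one function. The sketch below is illustrative: compile_with_tracking is a hypothetical helper, and the init argument is meant to receive langwatch.dspy.init, passed in explicitly so the ordering can be checked with stand-ins.

```python
def compile_with_tracking(optimizer, program, trainset, experiment, init):
    """Register the optimizer for tracking, then compile.

    Hypothetical helper. `init` is expected to accept the same keyword
    arguments as langwatch.dspy.init in the examples above.
    """
    # Step 1: patch the optimizer so progress is streamed.
    init(experiment=experiment, optimizer=optimizer)
    # Step 2: only now run the optimization.
    return optimizer.compile(program, trainset=trainset)
```

With the real libraries, a call would look like compile_with_tracking(optimizer, program, trainset, "mipro-medium-run1", init=langwatch.dspy.init).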

Additional resources

Cross-references

Install any skill: npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill <name>

  • Langtrace (auto-instrumentation, easiest one-line setup) — /dspy-langtrace
  • Arize Phoenix (open-source with evals) — /dspy-phoenix
  • W&B Weave (team dashboards, experiment tracking) — /dspy-weave
  • MLflow (full ML lifecycle, model registry) — /dspy-mlflow
  • Lightweight experiment tracking (JSONL-based, no extra tools) — /ai-tracking-experiments
  • Production monitoring: /ai-monitoring
  • For worked examples, see examples.md
  • Install /ai-do if you do not have it — it routes any AI problem to the right skill and is the fastest way to work: npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill ai-do
Install via CLI
npx skills add https://github.com/AcidicSoil/lms-llmsTxt --skill dspy-langwatch
Repository Details
Branch: main
Path: SKILL.md