name: 01-mlflow-genai-foundation
description: >
  Use when starting any new GenAI agent project on Databricks. Covers MLflow 3.x
  GenAI setup: install dependencies, enable autolog, configure connection pooling,
  understand model signatures, detect runtime environment. Foundation Step 1.
  Produces the configured MLflow environment consumed by all subsequent skills.
license: Apache-2.0
clients: [ide_cli, genie_code]
bundle_resource: none
deploy_verb: none
deploy_note: "MLflow 3.x GenAI environment setup — no deployed resource; configured via the MLflow SDK identically on both clients. On Genie Code the runtime is serverless workspace compute (pre-authenticated); see skills/genie-code-environment."
coverage: full
metadata:
  last_verified: "2026-06-05"
  volatility: high
  upstream_sources: []
  author: "prashanth-subrahmanyam"
  version: "3.1.0"
  domain: "genai-agents"
  pipeline_position: "F1"
  produces: "mlflow_environment, autolog_enabled, connection_pool"
  grounded_in: "docs.databricks.com/aws/en/mlflow3/genai, mlflow.org/docs/latest/genai/serving/responses-agent/"
MLflow GenAI foundation
When to Use
Use this skill when you are:
- Starting a new GenAI agent project on Databricks
- Standardizing MLflow 3.x setup before tracing, evaluation, or deployment
- Aligning dependencies and autolog behavior with the Genie Space Optimizer job bundle
Downstream skills assume this foundation: experiment configuration, ResponsesAgent logging, and template formatting for UC-backed resources.
CRITICAL: ResponsesAgent is MANDATORY
For Databricks agents that must interoperate with AI Playground and MLflow GenAI tooling:
- ResponsesAgent auto-infers model signatures that are compatible with AI Playground. Rely on that inference instead of hand-written signatures.
- Never pass signature= to log_model() when using ResponsesAgent (or other GenAI agent flavors that supply their own schema). Manual signatures often drift from what the runtime and UI expect.
- Use the input key in input examples, not messages. Example payloads must match the schema the agent and evaluator consume.
DO / DON'T: Model Logging
import mlflow
from mlflow.models.signature import ModelSignature
from mlflow.types.schema import ColSpec, Schema
# DON'T: Manual signature overrides auto-inference
sig = ModelSignature(
    inputs=Schema([ColSpec("string", "messages")]),
    outputs=Schema([ColSpec("string", "response")]),
)
mlflow.pyfunc.log_model(python_model=agent, signature=sig)
# DO: Let MLflow infer the signature from ResponsesAgent
mlflow.pyfunc.log_model(
    name="agent",  # MLflow 3.x: name replaces the deprecated artifact_path
    python_model=agent,  # agent is your ResponsesAgent instance
    input_example={"input": [{"role": "user", "content": "Hello"}]},
)
For complete signature compatibility rules, migration patterns, and verification steps, see: references/model-signatures.md
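The input-key rule can be checked before a model is logged. The following is a minimal sketch, not an MLflow API: validate_input_example is a hypothetical helper, and it assumes only the payload shape shown in the DO example above (a top-level input list whose items carry role and content).

```python
from typing import Any


def validate_input_example(example: dict[str, Any]) -> list[str]:
    """Return a list of problems found in an input example (empty means OK).

    Hypothetical helper, not part of MLflow. It checks only the shape used
    in this skill: {"input": [{"role": ..., "content": ...}, ...]}.
    """
    problems: list[str] = []
    if "messages" in example:
        problems.append("found 'messages'; use the 'input' key instead")
    items = example.get("input")
    if not isinstance(items, list) or not items:
        problems.append("'input' must be a non-empty list of messages")
        return problems
    for i, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append(f"input[{i}] must be a dict")
            continue
        for key in ("role", "content"):
            if key not in item:
                problems.append(f"input[{i}] is missing '{key}'")
    return problems


good = {"input": [{"role": "user", "content": "Hello"}]}
bad = {"messages": [{"role": "user", "content": "Hello"}]}

print(validate_input_example(good))  # no problems reported
print(validate_input_example(bad))   # two problems: 'messages' present, 'input' absent
```

Running a check like this in a unit test catches a messages-style payload before it reaches log_model(); it does not replace the signature verification described in references/model-signatures.md.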
Decision tree
| Path | When | Notes |
|---|---|---|
| ResponsesAgent | New agents, Playground, standard GenAI flows | Recommended default |
| ChatAgent | Legacy chat-style agents already on this API | Maintain only if required |
| PythonModel | Custom arbitrary Python | Avoid for standard LLM agents; more manual signature and tracing burden |
Dependencies
Minimum versions are declared in the bundle job environment (databricks.yml):
| Package | Constraint | Purpose |
|---|---|---|
| mlflow[databricks] | >=3.10.1 | GenAI tracing, evaluation, model registry |
| databricks-sdk | >=0.40.0 | WorkspaceClient, API access |
| databricks-agents | >=1.1.0 | Agent deployment, production monitoring |
| openai | >=1.17 | OpenAI-compatible API client |
| databricks-openai | >=0.3.0 | Supervisor API client (Track B) |
Install locally (use uv pip per project convention):
uv pip install "mlflow[databricks]>=3.10.1" "databricks-sdk>=0.40.0" "databricks-agents>=1.1.0" "openai>=1.17"
Track B also needs databricks-openai. All dependencies are declared in the
course-level pyproject.toml and each track's template pyproject.toml.
DO / DON'T: Dependency Pinning
# DON'T: Unpinned — may resolve to MLflow 2.x with incompatible APIs
# requirements.txt
mlflow
databricks-sdk
# DO: Lower-bounded — ensures GenAI features are available
dependencies:
- mlflow[databricks]>=3.10.1
- databricks-sdk>=0.40.0
For the full list of dependency anti-patterns, see: references/anti-patterns.md (entry #8)
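A quick way to confirm that an environment satisfies these lower bounds is to compare installed versions at startup. The sketch below uses only the standard library; MINIMUMS, parse_version, meets_minimum, and check_installed are hypothetical names, and the parser deliberately ignores pre-release suffixes (so 3.10.1rc1 is treated as 3.10.1). For strict comparisons, packaging.version.Version is the more robust choice.

```python
import re
from importlib import metadata

# Lower bounds copied from the Dependencies table in this skill.
MINIMUMS = {
    "mlflow": "3.10.1",
    "databricks-sdk": "0.40.0",
    "databricks-agents": "1.1.0",
    "openai": "1.17",
}


def parse_version(text: str) -> tuple[int, ...]:
    """Keep the leading numeric release parts, e.g. '3.10.1rc1' -> (3, 10, 1)."""
    parts: list[int] = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def _pad(version: tuple[int, ...], width: int) -> tuple[int, ...]:
    """Right-pad with zeros so (1, 17) and (1, 17, 0) compare as equal."""
    return version + (0,) * (width - len(version))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Numeric comparison; plain string comparison would rank '3.9' above '3.10'."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    return _pad(a, width) >= _pad(b, width)


def check_installed(minimums: dict[str, str] = MINIMUMS) -> dict[str, str]:
    """Map each package to 'ok', 'too old (<version>)', or 'missing'."""
    report: dict[str, str] = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = meets_minimum(installed, minimum)
        report[name] = "ok" if ok else f"too old ({installed})"
    return report


for package, status in check_installed().items():
    print(f"{package}: {status}")
```

The report depends on the local environment, so no expected output is shown. An MLflow 2.x install would be reported as too old, which is the failure the unpinned requirements.txt above can produce silently.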
Autolog Setup
Enable autolog early (typically at module import or top of the job entrypoint) so traces and model calls are captured consistently.
import mlflow
mlflow.openai.autolog() # OpenAI-compatible / Databricks Foundation Models
# OR
mlflow.langchain.autolog() # LangChain / LangGraph agents
Call autolog at the top of your agent module or job entrypoint, before any LLM calls.
DO / DON'T: Autolog Placement
# DON'T: Autolog inside a nested function; misses early calls
def run_evaluation(space_id: str):
    import mlflow
    mlflow.openai.autolog()
    result = call_llm("Analyze this space")
    return result
# DO: Module-level autolog; captures everything
import mlflow
mlflow.openai.autolog()
def run_evaluation(space_id: str):
    result = call_llm("Analyze this space")
    return result
For all autolog and tracing anti-patterns, see: references/anti-patterns.md (entry #4)
UC Naming Conventions
Use consistent naming for Unity Catalog resources. Build fully qualified names from catalog, schema, and resource name parts using f-strings or config values:
catalog = "main"
schema = "my_agent_project"
table_name = "benchmarks"
full_name = f"{catalog}.{schema}.{table_name}"
# Result: "main.my_agent_project.benchmarks"
For prompt registry names, follow the same pattern:
prompt_name = f"{catalog}.{schema}.system_prompt"
For UC naming anti-patterns, see: references/anti-patterns.md
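Building names in one place makes typos and stray dots easier to catch. The sketch below is a hypothetical helper (uc_full_name is not an MLflow or Databricks API). It assumes a deliberately conservative project policy of letters, digits, and underscores in each part; Unity Catalog itself accepts more than this, so treat the pattern as a convention, not a platform rule.

```python
import re

# Assumption: a conservative project-level policy, not the full set of
# identifiers that Unity Catalog accepts.
_SIMPLE_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def uc_full_name(catalog: str, schema: str, name: str) -> str:
    """Join three parts into catalog.schema.name, rejecting unexpected characters."""
    for label, part in (("catalog", catalog), ("schema", schema), ("name", name)):
        if not _SIMPLE_IDENTIFIER.fullmatch(part):
            raise ValueError(f"{label} {part!r} is not a simple identifier")
    return f"{catalog}.{schema}.{name}"


print(uc_full_name("main", "my_agent_project", "benchmarks"))
# main.my_agent_project.benchmarks
print(uc_full_name("main", "my_agent_project", "system_prompt"))
# main.my_agent_project.system_prompt
```

Rejecting a part that already contains a dot prevents accidental four-part names when a caller passes an already-qualified value.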
Connection pooling
Reuse long-lived clients instead of creating a new HTTP stack per request:
- Instantiate WorkspaceClient (and OpenAI-compatible clients) once per process or request scope and pass them into agent code.
- Avoid constructing a fresh SDK client inside hot loops; the databricks-sdk and typical OpenAI clients keep connection pools internally when reused.
Downstream skills may specify pool sizes or timeouts for serving vs notebook; this skill only establishes the rule: one shared client pattern per runtime.
DO / DON'T: Client Reuse
from databricks.sdk import WorkspaceClient
# DON'T: New client per benchmark (TLS overhead, rate limit risk)
for b in benchmarks:
    w = WorkspaceClient()
    result = w.serving_endpoints.query(...)
# DO: Shared client; the connection pool is reused across all calls
w = WorkspaceClient()
for b in benchmarks:
    result = w.serving_endpoints.query(...)
For the full connection pooling anti-pattern, see: references/anti-patterns.md (entry #6)
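One common way to enforce the one-shared-client rule is a cached factory function. The sketch below uses functools.lru_cache with a stand-in class so it runs without any SDK installed; FakeClient and get_client are hypothetical names, and in real code the factory body would return WorkspaceClient() or an OpenAI-compatible client instead.

```python
from functools import lru_cache


class FakeClient:
    """Stand-in for an SDK client such as WorkspaceClient (illustration only)."""

    instances = 0  # counts how many clients were ever constructed

    def __init__(self) -> None:
        FakeClient.instances += 1


@lru_cache(maxsize=1)
def get_client() -> FakeClient:
    """Build the client on first use, then return the same object on later calls."""
    return FakeClient()


# Simulate a hot loop: every iteration asks for a client.
for _ in range(3):
    client = get_client()

print(FakeClient.instances)  # 1
```

Callers can ask for the client wherever they need it without threading it through every function signature, while construction still happens once per process. Under heavy concurrency at startup the factory may run more than once before the cache is populated, so construct the client eagerly if that matters.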
Environment detection
Use environment variables to branch behavior (serving vs app vs cluster vs local):
import os
def detect_environment() -> str:
    if os.environ.get("IS_IN_DB_MODEL_SERVING_ENV"):
        return "model_serving"
    if os.environ.get("DATABRICKS_APP_NAME"):
        return "databricks_apps"
    if os.environ.get("DATABRICKS_RUNTIME_VERSION"):
        return "notebook"  # or job
    return "local"
Adjust return values or add finer distinctions (e.g. job vs notebook) if your agent needs different connection or logging settings per runtime.
For the complete implementation with RuntimeEnvironment enum, per-environment configuration table, client factory, and testing patterns, see: references/environment-detection.md
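The detection logic becomes easier to unit-test when the environment is passed in as a mapping. The following is a minimal sketch under that assumption, not the contents of references/environment-detection.md: the enum member names are illustrative, and the env parameter is an addition that defaults to os.environ.

```python
import os
from enum import Enum
from typing import Mapping


class RuntimeEnvironment(str, Enum):
    """Illustrative member names; values match the strings used in this skill."""

    MODEL_SERVING = "model_serving"
    DATABRICKS_APPS = "databricks_apps"
    NOTEBOOK_OR_JOB = "notebook"
    LOCAL = "local"


def detect_environment(env: Mapping[str, str] = os.environ) -> RuntimeEnvironment:
    """Same precedence as the function above: serving, then apps, then cluster.

    Any non-empty value counts as set, including the string "false".
    """
    if env.get("IS_IN_DB_MODEL_SERVING_ENV"):
        return RuntimeEnvironment.MODEL_SERVING
    if env.get("DATABRICKS_APP_NAME"):
        return RuntimeEnvironment.DATABRICKS_APPS
    if env.get("DATABRICKS_RUNTIME_VERSION"):
        return RuntimeEnvironment.NOTEBOOK_OR_JOB
    return RuntimeEnvironment.LOCAL


# Tests can pass a plain dict instead of patching os.environ.
print(detect_environment({}).value)                               # local
print(detect_environment({"DATABRICKS_APP_NAME": "demo"}).value)  # databricks_apps
```

Because the checks run in order, a serving container that also exposes a runtime version variable is still classified as model_serving.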
Common mistakes
| Mistake | Why it hurts | Fix |
|---|---|---|
| Passing signature= to log_model() for ResponsesAgent | Breaks auto-inferred GenAI schema; Playground/eval mismatch | Omit signature=; use framework defaults |
| Using messages in input examples | Wrong schema for many GenAI agents and scorers | Use input (and keys your model expects) |
| Logging with PythonModel for simple LLM agents | Extra maintenance; easy to get traces/signatures wrong | Prefer ResponsesAgent |
| Autolog only inside a nested function | Misses early calls; inconsistent traces | Call autolog at module level or job start |
| Carrying MLflow 2.x patterns over to 3.x | Deprecated APIs, wrong tracing hooks | Stay on mlflow[databricks]>=3.10.1 and follow the current docs |
| Hardcoding UC names without templates | Environment-specific drift | Use format_mlflow_template for shared patterns |
For 12 complete DO/DON'T pairs with code examples, see: references/anti-patterns.md
Validation checklist
- mlflow[databricks]>=3.10.1 installed (match databricks.yml job env when deploying)
- Autolog enabled at module level or immediately after experiment setup
- ResponsesAgent used for new agents (not PythonModel unless justified)
- No manual signature= in log_model() for GenAI agent types
- input key used in examples (not messages)
References
Related skills
- Foundation Step 2 (experiment tracing): ../02-experiment-tracing-and-uc-storage/SKILL.md for experiment IDs, run lifecycle, and tracing details beyond autolog.
- Track A Step 2 (Agent Framework & ResponsesAgent): ../../tracks/A-custom-agent-apps/02-agent-framework/SKILL.md for implementation patterns and logging.
Official documentation
Reference files (this skill)
| File | Content |
|---|---|
| references/model-signatures.md | Signature auto-inference, compatibility matrix, migration guide, verification |
| references/environment-detection.md | Full detect_environment() implementation, per-env config, client factory, tests |
| references/anti-patterns.md | 12 DO/DON'T pairs with code examples covering all common mistakes |
Reference Implementation
The patterns in this skill are demonstrated in the Genie Space Optimizer codebase. In your own project, apply them to your module structure — the principles (autolog placement, connection pooling, environment detection) are framework-agnostic.
Version History
| Version | Date | Changes |
|---|---|---|
| 3.0.0 | 2026-03-26 | Initial skill: dependencies, autolog, signatures, environment detection, common mistakes |
| 3.1.0 | 2026-03-26 | Added reference files (model-signatures.md, environment-detection.md, anti-patterns.md); inline DO/DON'T pairs for model logging, dependency pinning, autolog placement, client reuse; version history table; expanded References section with reference file pointers |