name: 00-uc-resources-foundation
description: >
  Use when any agent or downstream skill needs Unity Catalog schemas and managed
  volumes (knowledge_sources, agent_outputs, memory tables, benchmark tables,
  etc.) provisioned idempotently. Foundation Step 0: runs before MLflow
  tracing setup and before any track-specific skill. Owned by no track. Reads
  the resolved spec to discover which volumes a use case needs and creates them
  alongside the canonical agent + ops UC schemas.
license: Apache-2.0
clients: [ide_cli, genie_code]
bundle_resource: schemas
deploy_verb: bundle_deploy
deploy_note: "Provisions UC schemas + managed volumes idempotently under the per-user prefixed schema ({user_schema_prefix}); created via SDK/DDL identically on both clients. On Genie Code run CLI steps through runDatabricksCli (pre-authenticated) and execute on serverless; see skills/genie-code-environment."
coverage: full
metadata:
last_verified: "2026-04-15"
volatility: medium
upstream_sources: []
author: "prashanth-subrahmanyam"
version: "1.1.0"
domain: "genai-agents"
pipeline_position: "F0"
consumes: "uc_catalog, USER_SCHEMA_PREFIX, AgentSpec.required_volumes(optional), AgentSpec.knowledge_base_backend(optional)"
produces: "agent_schema, ops_schema, uc_volumes(map of name → /Volumes path), signoff_volume_path"
grounded_in: "docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-ddl-create-schema, docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-ddl-create-volume, docs.databricks.com/aws/en/volumes/managed-vs-external"
fields_read:
- resources.uc.catalog
- resources.uc.user_schema_prefix
- agent.required_volumes
- agent.knowledge_base_backend
Foundation Step 0: UC Resources (Schemas + Volumes)
Cross-cutting. This skill is consumed by every Foundation step and
agent track. It exists so that no downstream skill (KA, agent memory, tool
exports, benchmark persistence, prompt registry) has to repeat schema or
volume provisioning. If you find yourself writing CREATE SCHEMA or
CREATE VOLUME inside another skill, delegate to F0 instead.
What This Skill Owns
| Resource | Purpose | Default name |
|---|---|---|
| Agent UC schema | Holds agent-owned assets: KA source volume, tool output volume, benchmark tables, prompt registry pointers, memory tables | ${user_schema_prefix}_agent |
| Ops UC schema | Holds operational assets: MLflow OTeL trace tables (F2), monitoring tables (SDLC 07), eval-run rollups (SDLC 04) | ${user_schema_prefix}_ops |
| knowledge_sources MANAGED volume (in agent schema) | Source markdown / PDFs for Knowledge Assistants (F5) and any retrieval-style tool | /Volumes/${catalog}/${agent_schema}/knowledge_sources |
| agent_outputs MANAGED volume (in agent schema) | CSV exports, charts, generated artifacts produced by @function_tool calls | /Volumes/${catalog}/${agent_schema}/agent_outputs |
| signoffs MANAGED volume (in ops schema) | Stakeholder + engineering sign-off decision.md artifacts (SDLC 04b); the durable, UC-governed location read by SDLC 06 promotion gates | /Volumes/${catalog}/${ops_schema}/signoffs |
| Custom volumes (in agent or ops schema) | Anything declared in state://AgentSpec.required_volumes[] | /Volumes/${catalog}/${schema}/${name} |
signoffs is generic. F0 pre-creates the volume under a workshop-neutral
name (signoffs) so SDLC 04b never has to provision its own UC location and
SDLC 06 can resolve /Volumes/${catalog}/${ops_schema}/signoffs/v<N>/decision.md
deterministically. The volume name does not carry any use-case-specific
prefix (no skyloyalty_signoffs, no ${use_case_slug}_signoffs); the
per-version directory v<N>/ carries the disambiguation.
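As an illustration of that deterministic resolution, here is a minimal sketch of a path builder. The helper name signoff_decision_path and the assumption that N is a non-negative integer are hypothetical; only the path layout comes from the text above.

```python
from pathlib import PurePosixPath


def signoff_decision_path(catalog: str, ops_schema: str, version: int) -> str:
    """Build the per-version decision.md path inside the shared signoffs volume."""
    # Assumption: versions are non-negative integers rendered as v<N>.
    if version < 0:
        raise ValueError("version must be a non-negative integer")
    # The volume name is always the generic "signoffs"; only v<N>/ varies.
    root = PurePosixPath("/Volumes") / catalog / ops_schema / "signoffs"
    return str(root / f"v{version}" / "decision.md")


print(signoff_decision_path("main", "alice_demo_ops", 3))
```

Because the volume name never changes, SDLC 04b and SDLC 06 could share a helper like this and agree on the location without exchanging any use-case-specific naming.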
Every operation is idempotent (CREATE SCHEMA IF NOT EXISTS,
CREATE VOLUME IF NOT EXISTS — AlreadyExists is treated as success). It
is safe to call this skill from any prompt that needs to be sure these
resources exist; the cost on a warm workspace is one round-trip per resource.
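The "already exists counts as success" rule can be sketched independently of the SDK. Everything named below (ensure_created, AlreadyExistsError, the in-memory registry) is a hypothetical stand-in used only to show the pattern; in the real skill the exception would be the SDK's AlreadyExists.

```python
from typing import Callable


class AlreadyExistsError(Exception):
    """Local stand-in for the SDK's AlreadyExists error (hypothetical)."""


def ensure_created(
    create: Callable[[], object],
    already_exists: type[BaseException] = AlreadyExistsError,
) -> bool:
    """Run create(); return True if it created something, False if it existed."""
    try:
        create()
    except already_exists:
        # Idempotency: an existing resource is the desired end state.
        return False
    return True


# In-memory registry standing in for Unity Catalog, for a dry run only.
registry: set[str] = set()


def fake_create_volume(name: str) -> None:
    if name in registry:
        raise AlreadyExistsError(name)
    registry.add(name)


first = ensure_created(lambda: fake_create_volume("knowledge_sources"))
second = ensure_created(lambda: fake_create_volume("knowledge_sources"))
print(first, second)
```

The second call is a no-op, which is why F0 can be invoked repeatedly on a warm workspace.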
What This Skill Does NOT Own
- Catalog creation. A UC catalog must already exist; F0 reads
state://Resources.uc.catalog and validates the caller has USE CATALOG +
CREATE SCHEMA on it.
- UC table DDL. Tables for OTeL traces, benchmarks, memory, or monitoring
are owned by their respective consuming skills (F2, SDLC 02/04/07).
- Source content. F0 creates an empty knowledge_sources volume; F5 (or any
other consumer) is responsible for populating it with files.
- Bronze data. Bronze schema + Delta tables are owned by the bronze layer
setup skill (data_product_accelerator/skills/bronze/00-bronze-layer-setup).
When to Use
- Always, near the top of the prompt sequence — before F1 (MLflow env) so
that anything downstream can assume schemas + volumes exist.
- Whenever you add a new consuming skill that needs a new volume — extend
state://AgentSpec.required_volumes[] rather than putting CREATE VOLUME
inside the consuming skill.
- After bootstrap when the resolved spec is available, but before any other
Foundation step.
Prerequisites
| Requirement | How to Check |
|---|---|
| state://Resources.uc.catalog is resolved | Set by vibecoding-state bootstrap op |
| state://Resources.uc.user_schema_prefix is resolved | Derived in bootstrap from the user identity + use_case_slug |
| Workshop SQL warehouse is RUNNING | state://Resources.warehouse_id (set by Prompt 1 preflight); needed for CREATE SCHEMA execution via WorkspaceClient.statement_execution |
| databricks-sdk >= 0.30 installed | python -c "from databricks.sdk import WorkspaceClient; from databricks.sdk.service.catalog import VolumeType" |
| Caller has USE CATALOG + CREATE SCHEMA on the catalog | databricks unity-catalog catalogs get $CATALOG --output json \| jq '.privileges' |
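For the privilege prerequisite, a minimal sketch of the comparison step, assuming the caller has already collected its effective catalog privileges as a set of strings (for example from the SDK's grants API or the CLI). The helper name and the treatment of ALL_PRIVILEGES as sufficient are assumptions of this sketch; catalog ownership is not modelled.

```python
# Privilege names follow the Unity Catalog API spelling (underscores).
REQUIRED_CATALOG_PRIVILEGES = frozenset({"USE_CATALOG", "CREATE_SCHEMA"})


def missing_catalog_privileges(effective: set[str]) -> set[str]:
    """Return the required privileges that are absent from `effective`."""
    # Assumption: ALL_PRIVILEGES on the catalog covers both requirements.
    if "ALL_PRIVILEGES" in effective:
        return set()
    return set(REQUIRED_CATALOG_PRIVILEGES - effective)


print(missing_catalog_privileges({"USE_CATALOG"}))
```

An empty result means the prerequisite is met; anything else is the list to hand to a catalog admin.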
Inputs
# Resolved by the caller from state — F0 does not read state directly.
uc_catalog: "string (required)" # e.g. "main"
agent_schema: "string (required)" # default: "${user_schema_prefix}_agent"
ops_schema: "string (required)" # default: "${user_schema_prefix}_ops"
warehouse_id: "string (required)" # SQL warehouse for DDL execution
required_volumes: # optional — defaults below if not provided
- { name: "knowledge_sources", schema: "agent", comment: "KA + retrieval source files" }
- { name: "agent_outputs", schema: "agent", comment: "Tool-generated artifacts (CSV, charts)" }
- { name: "signoffs", schema: "ops", comment: "SDLC 04b stakeholder + engineering signoff decision.md artifacts" }
# AgentSpec.required_volumes[] is appended verbatim
Operations
flowchart TD
start["F0 invoked"] --> grant["Verify USE CATALOG + CREATE SCHEMA"]
grant --> agentSch["CREATE SCHEMA IF NOT EXISTS<br/>${catalog}.${agent_schema}"]
grant --> opsSch["CREATE SCHEMA IF NOT EXISTS<br/>${catalog}.${ops_schema}"]
agentSch --> volLoop["For each required_volume:<br/>create MANAGED volume idempotently"]
opsSch --> volLoop
volLoop --> capture["Capture { agent_schema, ops_schema, uc_volumes{name → path} }"]
Reference implementation
from databricks.sdk import WorkspaceClient
from databricks.sdk.errors import AlreadyExists
from databricks.sdk.service.catalog import VolumeType
from databricks.sdk.service.sql import StatementState


def provision_uc_resources(
    *,
    uc_catalog: str,
    agent_schema: str,
    ops_schema: str,
    warehouse_id: str,
    required_volumes: list[dict] | None = None,
) -> dict:
    """Create the agent + ops schemas and their managed volumes idempotently."""
    w = WorkspaceClient()

    # 1. Schemas (DDL via statement execution against the workshop warehouse)
    for schema in (agent_schema, ops_schema):
        resp = w.statement_execution.execute_statement(
            warehouse_id=warehouse_id,
            statement=(
                f"CREATE SCHEMA IF NOT EXISTS {uc_catalog}.{schema} "
                f"COMMENT 'F0-managed: agent or ops assets'"
            ),
            wait_timeout="30s",
        )
        # A failed or still-running statement is reported in the response
        # status, so check it before moving on to the volumes.
        if resp.status is None or resp.status.state != StatementState.SUCCEEDED:
            raise RuntimeError(
                f"CREATE SCHEMA {uc_catalog}.{schema} did not succeed: {resp.status}"
            )

    # 2. Volumes (SDK; AlreadyExists is success)
    defaults = [
        {"name": "knowledge_sources", "schema": "agent",
         "comment": "KA + retrieval source files"},
        {"name": "agent_outputs", "schema": "agent",
         "comment": "Tool-generated artifacts (CSV, charts)"},
        {"name": "signoffs", "schema": "ops",
         "comment": "SDLC 04b stakeholder + engineering signoff decision.md artifacts"},
    ]
    # Caller-supplied entries come first so they win over a default of the same name.
    volumes_spec = (required_volumes or []) + defaults
    seen, paths = set(), {}
    for v in volumes_spec:
        target_schema = agent_schema if v["schema"] == "agent" else ops_schema
        key = (target_schema, v["name"])
        if key in seen:
            continue
        seen.add(key)
        try:
            w.volumes.create(
                catalog_name=uc_catalog,
                schema_name=target_schema,
                name=v["name"],
                volume_type=VolumeType.MANAGED,
                comment=v.get("comment", "F0-managed volume"),
            )
        except AlreadyExists:
            pass  # idempotent: an existing volume is the desired state
        paths[v["name"]] = f"/Volumes/{uc_catalog}/{target_schema}/{v['name']}"

    return {
        "agent_schema": agent_schema,
        "ops_schema": ops_schema,
        "uc_volumes": paths,
    }
The skill caller passes the result back to vibecoding-state exit so
downstream skills can read state://Resources.uc_volumes.knowledge_sources etc.
Outputs (handoff)
| Key | Consumer | Value |
|---|---|---|
| agent_schema | F2, F5, SDLC 02/04, Track A 05 (memory) | ${user_schema_prefix}_agent |
| ops_schema | F2 (OTeL tables), SDLC 07 (monitoring), SDLC 04b (signoffs) | ${user_schema_prefix}_ops |
| uc_volumes.knowledge_sources | F5 (Step 5_0, branch B/C upload), F3 (Vector Search MCP source dir) | /Volumes/${catalog}/${agent_schema}/knowledge_sources |
| uc_volumes.agent_outputs | Track A 03 csv_export @function_tool, any code-interpreter output | /Volumes/${catalog}/${agent_schema}/agent_outputs |
| uc_volumes.signoffs (alias signoff_volume_path) | SDLC 04b (writes decision.md), SDLC 06 (reads it as the promotion gate) | /Volumes/${catalog}/${ops_schema}/signoffs |
| uc_volumes.<custom> | Whichever consuming skill declared the volume | /Volumes/${catalog}/${schema}/${name} |
Validation Gate
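A minimal sketch of what a gate check could look like, assuming the gate amounts to confirming that both schemas and every volume in uc_volumes can be read back. The function name check_f0_gate, the rule that any lookup error fails the gate, and the stub client are assumptions of this sketch; against a real workspace, w would be a WorkspaceClient, whose schemas.get and volumes.read calls take the full dotted name.

```python
from types import SimpleNamespace  # used only for the offline stub below


def check_f0_gate(w, *, uc_catalog: str, agent_schema: str, ops_schema: str,
                  uc_volumes: dict[str, str]) -> list[str]:
    """Return failure messages; an empty list means the gate passes."""
    failures = []
    for schema in (agent_schema, ops_schema):
        full_name = f"{uc_catalog}.{schema}"
        try:
            w.schemas.get(full_name=full_name)
        except Exception as exc:  # any lookup error fails the gate
            failures.append(f"schema {full_name}: {exc}")
    for path in uc_volumes.values():
        # /Volumes/<catalog>/<schema>/<volume> -> <catalog>.<schema>.<volume>
        full_name = ".".join(path.strip("/").split("/")[1:4])
        try:
            w.volumes.read(name=full_name)
        except Exception as exc:
            failures.append(f"volume {full_name}: {exc}")
    return failures


def _stub_lookup(known: set[str]):
    """Build a fake lookup that raises for names outside `known`."""
    def lookup(**kwargs):
        target = kwargs.get("full_name") or kwargs.get("name")
        if target not in known:
            raise LookupError(f"{target} not found")
    return lookup


# Stub client for a dry run without a workspace: signoffs is deliberately missing.
stub = SimpleNamespace(
    schemas=SimpleNamespace(get=_stub_lookup({"main.u_agent", "main.u_ops"})),
    volumes=SimpleNamespace(read=_stub_lookup({"main.u_agent.knowledge_sources"})),
)
failures = check_f0_gate(
    stub, uc_catalog="main", agent_schema="u_agent", ops_schema="u_ops",
    uc_volumes={"knowledge_sources": "/Volumes/main/u_agent/knowledge_sources",
                "signoffs": "/Volumes/main/u_ops/signoffs"},
)
print(failures)
```

Returning the full list of failures, instead of stopping at the first one, lets the caller report every missing resource in a single pass.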
Delegation Map
| Need | Delegate To |
|---|---|
| Bronze schema + Delta tables | bronze/00-bronze-layer-setup |
| OTeL trace tables in ${ops_schema} | F2 |
| KA source markdown into knowledge_sources | F5 Step 5_0: branch (A) skip if non-empty / (B) upload ka_source / (C) auto-generate from PRD |
| Lakebase memory tables | Track A 05 |
| Benchmark tables in ${agent_schema} | SDLC 02 |
| Monitoring tables in ${ops_schema} | SDLC 07 |
Limitations to Plan For
- Catalog must exist. F0 does not create catalogs (org-level admin action).
- Workspace warehouse required. Schema DDL goes through the SQL warehouse,
not the Python serverless API.
- No drop semantics. F0 is create-only. To remove a schema or volume,
delete it out of band; do not put DROP in any agent skill.
- Volume size budgeting. Managed volumes inherit the catalog's storage
location; if your agent_outputs volume may grow large, document a retention
policy in the consuming skill.
Notes to Carry Forward
| Key | Produced By | Value |
|---|---|---|
| agent_schema | F0 | ${user_schema_prefix}_agent |
| ops_schema | F0 | ${user_schema_prefix}_ops |
| uc_volumes | F0 | Map of name → /Volumes/${catalog}/${schema}/${name} |
| knowledge_source_path | F0 (alias of uc_volumes.knowledge_sources) | Path that F5 attaches as a files knowledge source |
| agent_outputs_path | F0 (alias of uc_volumes.agent_outputs) | Path that Track A 03's csv_export writes to |
| signoff_volume_path | F0 (alias of uc_volumes.signoffs) | Generic, use-case-neutral signoff volume path SDLC 04b writes decision.md into and SDLC 06 reads as the promotion gate. Always /Volumes/${catalog}/${ops_schema}/signoffs, never workshop-specific. |
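The aliases in this table are plain lookups into uc_volumes. A small sketch of how a caller might derive them from the provisioning result before handing off to state; the function name carry_forward_notes and the sample values are hypothetical.

```python
def carry_forward_notes(result: dict) -> dict:
    """Expand the provisioning result with the alias keys listed in the table."""
    volumes = result["uc_volumes"]
    return {
        "agent_schema": result["agent_schema"],
        "ops_schema": result["ops_schema"],
        "uc_volumes": volumes,
        # Aliases: same values as the uc_volumes entries, under stable keys.
        "knowledge_source_path": volumes["knowledge_sources"],
        "agent_outputs_path": volumes["agent_outputs"],
        "signoff_volume_path": volumes["signoffs"],
    }


sample = {
    "agent_schema": "u_agent",
    "ops_schema": "u_ops",
    "uc_volumes": {
        "knowledge_sources": "/Volumes/main/u_agent/knowledge_sources",
        "agent_outputs": "/Volumes/main/u_agent/agent_outputs",
        "signoffs": "/Volumes/main/u_ops/signoffs",
    },
}
notes = carry_forward_notes(sample)
print(notes["signoff_volume_path"])
```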
Next Step
After F0's gate passes, continue with F1: MLflow GenAI Foundation. Every subsequent skill can now assume schemas + volumes exist.
References
Version History
| Version | Date | Changes |
|---|---|---|
| 1.1.0 | 2026-04-26 | Added generic, workshop-neutral signoffs MANAGED volume in ${ops_schema} as a default required_volume. SDLC 04b writes decision.md artifacts here and SDLC 06 reads them as the promotion gate. Captures signoff_volume_path (alias of uc_volumes.signoffs) so downstream skills resolve /Volumes/${catalog}/${ops_schema}/signoffs/v<N>/decision.md deterministically without per-use-case naming. |
| 1.0.0 | 2026-04-24 | Initial skill, extracted from F5 (05-knowledge-assistant) Step 5_0 so schema + volume provisioning is owned in one cross-cutting place. Consumers (F5, F2, Track A 05, SDLC 02/04/07) now delegate. |