00-uc-resources-foundation

Use when any agent or downstream skill needs Unity Catalog schemas and managed volumes (knowledge_sources, agent_outputs, memory tables, benchmark tables, etc.) provisioned idempotently. Foundation Step 0 — runs before MLflow tracing setup and before any track-specific skill. Owned by no track. Reads the resolved spec to discover which volumes a use case needs and creates them alongside the canonical agent + ops UC schemas.

By databricks-solutions. Updated 6/2/2026.

name: 00-uc-resources-foundation
description: >
  Use when any agent or downstream skill needs Unity Catalog schemas and managed
  volumes (knowledge_sources, agent_outputs, memory tables, benchmark tables, etc.)
  provisioned idempotently. Foundation Step 0: runs before MLflow tracing setup and
  before any track-specific skill. Owned by no track. Reads the resolved spec to
  discover which volumes a use case needs and creates them alongside the canonical
  agent + ops UC schemas.
license: Apache-2.0
clients: [ide_cli, genie_code]
bundle_resource: schemas
deploy_verb: bundle_deploy
deploy_note: "Provisions UC schemas + managed volumes idempotently under the per-user prefixed schema ({user_schema_prefix}); created via SDK/DDL identically on both clients. On Genie Code run CLI steps through runDatabricksCli (pre-authenticated) and execute on serverless; see skills/genie-code-environment."
coverage: full
metadata:
  last_verified: "2026-04-15"
  volatility: medium
  upstream_sources: []
  author: "prashanth-subrahmanyam"
  version: "1.1.0"
  domain: "genai-agents"
  pipeline_position: "F0"
  consumes: "uc_catalog, USER_SCHEMA_PREFIX, AgentSpec.required_volumes(optional), AgentSpec.knowledge_base_backend(optional)"
  produces: "agent_schema, ops_schema, uc_volumes(map of name → /Volumes path), signoff_volume_path"
  grounded_in: "docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-ddl-create-schema, docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-ddl-create-volume, docs.databricks.com/aws/en/volumes/managed-vs-external"
fields_read:
  - resources.uc.catalog
  - resources.uc.user_schema_prefix
  - agent.required_volumes
  - agent.knowledge_base_backend

Foundation Step 0: UC Resources (Schemas + Volumes)

Cross-cutting. This skill is consumed by every Foundation step and agent track. It exists so that no downstream skill (KA, agent memory, tool exports, benchmark persistence, prompt registry) has to repeat schema or volume provisioning. If you find yourself writing CREATE SCHEMA or CREATE VOLUME inside another skill, delegate to F0 instead.

What This Skill Owns

| Resource | Purpose | Default name |
|---|---|---|
| Agent UC schema | Holds agent-owned assets: KA source volume, tool output volume, benchmark tables, prompt registry pointers, memory tables | ${user_schema_prefix}_agent |
| Ops UC schema | Holds operational assets: MLflow OTeL trace tables (F2), monitoring tables (SDLC 07), eval-run rollups (SDLC 04) | ${user_schema_prefix}_ops |
| knowledge_sources MANAGED volume (in agent schema) | Source markdown / PDFs for Knowledge Assistants (F5) and any retrieval-style tool | /Volumes/${catalog}/${agent_schema}/knowledge_sources |
| agent_outputs MANAGED volume (in agent schema) | CSV exports, charts, generated artifacts produced by @function_tool calls | /Volumes/${catalog}/${agent_schema}/agent_outputs |
| signoffs MANAGED volume (in ops schema) | Stakeholder + engineering sign-off decision.md artifacts (SDLC 04b); the durable, UC-governed location read by SDLC 06 promotion gates | /Volumes/${catalog}/${ops_schema}/signoffs |
| Custom volumes (in agent or ops schema) | Anything declared in state://AgentSpec.required_volumes[] | /Volumes/${catalog}/${schema}/${name} |

signoffs is generic. F0 pre-creates the volume under a workshop-neutral name (signoffs) so SDLC 04b never has to provision its own UC location and SDLC 06 can resolve /Volumes/${catalog}/${ops_schema}/signoffs/v<N>/decision.md deterministically. The volume name does not carry any use-case-specific prefix (no skyloyalty_signoffs, no ${use_case_slug}_signoffs); the per-version directory v<N>/ carries the disambiguation.
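
The path convention above can be sketched as two small helpers. This is a minimal illustration, not part of the skill: the function names `volume_path` and `signoff_decision_path` are hypothetical, and treating `<N>` as a positive integer is an assumption.

```python
from pathlib import PurePosixPath


def volume_path(catalog: str, schema: str, name: str) -> str:
    """Return the /Volumes path for a UC volume (hypothetical helper)."""
    return str(PurePosixPath("/Volumes") / catalog / schema / name)


def signoff_decision_path(catalog: str, ops_schema: str, version: int) -> str:
    """Return the decision.md path for one sign-off version directory v<N>/."""
    # Assumption: versions are positive integers (v1, v2, ...).
    if version < 1:
        raise ValueError("version must be a positive integer")
    base = volume_path(catalog, ops_schema, "signoffs")
    return f"{base}/v{version}/decision.md"


print(signoff_decision_path("main", "jdoe_demo_ops", 3))
# /Volumes/main/jdoe_demo_ops/signoffs/v3/decision.md
```

Because the volume name is always `signoffs`, only the catalog, the ops schema, and the version number vary between use cases.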

Every operation is idempotent (CREATE SCHEMA IF NOT EXISTS, CREATE VOLUME IF NOT EXISTS; an AlreadyExists error from the SDK is treated as success). It is safe to call this skill from any prompt that needs to be sure these resources exist; the cost on a warm workspace is one round-trip per resource.
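
The "already exists is success" pattern can be illustrated without a workspace. The sketch below uses an in-memory stand-in rather than the Databricks SDK: `FakeVolumes`, `ensure_volume`, and the local `AlreadyExists` class are all hypothetical names introduced for illustration only.

```python
class AlreadyExists(Exception):
    """Local stand-in for an 'already exists' conflict error (not the SDK class)."""


class FakeVolumes:
    """In-memory stand-in for a volumes API; not the real SDK client."""

    def __init__(self) -> None:
        self.created = set()
        self.calls = 0

    def create(self, catalog: str, schema: str, name: str) -> None:
        self.calls += 1
        key = (catalog, schema, name)
        if key in self.created:
            raise AlreadyExists(f"{catalog}.{schema}.{name}")
        self.created.add(key)


def ensure_volume(api: FakeVolumes, catalog: str, schema: str, name: str) -> bool:
    """Create the volume if missing. Returns True if created, False if it existed."""
    try:
        api.create(catalog, schema, name)
        return True
    except AlreadyExists:
        # The resource is already there, which is the state we wanted.
        return False


api = FakeVolumes()
first = ensure_volume(api, "main", "demo_agent", "knowledge_sources")
second = ensure_volume(api, "main", "demo_agent", "knowledge_sources")
print(first, second, api.calls)  # True False 2
```

The second call still costs one round-trip, which matches the one-round-trip-per-resource cost described above.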

What This Skill Does NOT Own

  • Catalog creation. A UC catalog must already exist; F0 reads state://Resources.uc.catalog and validates the caller has USE CATALOG + CREATE SCHEMA on it.
  • UC table DDL. Tables for OTeL traces, benchmarks, memory, or monitoring are owned by their respective consuming skills (F2, SDLC 02/04/07).
  • Source content. F0 creates an empty knowledge_sources volume; F5 (or any other consumer) is responsible for populating it with files.
  • Bronze data. Bronze schema + Delta tables are owned by the bronze layer setup skill (data_product_accelerator/skills/bronze/00-bronze-layer-setup).

When to Use

  • Always, near the top of the prompt sequence — before F1 (MLflow env) so that anything downstream can assume schemas + volumes exist.
  • Whenever you add a new consuming skill that needs a new volume — extend state://AgentSpec.required_volumes[] rather than putting CREATE VOLUME inside the consuming skill.
  • After bootstrap when the resolved spec is available, but before any other Foundation step.

Prerequisites

| Requirement | How to Check |
|---|---|
| state://Resources.uc.catalog is resolved | Set by vibecoding-state bootstrap op |
| state://Resources.uc.user_schema_prefix is resolved | Derived in bootstrap from the user identity + use_case_slug |
| Workshop SQL warehouse is RUNNING | state://Resources.warehouse_id (set by Prompt 1 preflight); needed for CREATE SCHEMA execution via WorkspaceClient.statement_execution |
| databricks-sdk >= 0.30 installed | python -c "from databricks.sdk import WorkspaceClient; from databricks.sdk.service.catalog import VolumeType" |
| Caller has USE CATALOG + CREATE SCHEMA on the catalog | databricks unity-catalog catalogs get $CATALOG --output json \| jq '.privileges' |

Inputs

# Resolved by the caller from state — F0 does not read state directly.
uc_catalog:           "string (required)"        # e.g. "main"
agent_schema:         "string (required)"        # default: "${user_schema_prefix}_agent"
ops_schema:           "string (required)"        # default: "${user_schema_prefix}_ops"
warehouse_id:         "string (required)"        # SQL warehouse for DDL execution
required_volumes:                                 # optional — defaults below if not provided
  - { name: "knowledge_sources", schema: "agent", comment: "KA + retrieval source files" }
  - { name: "agent_outputs",     schema: "agent", comment: "Tool-generated artifacts (CSV, charts)" }
  - { name: "signoffs",          schema: "ops",   comment: "SDLC 04b stakeholder + engineering signoff decision.md artifacts" }
  # AgentSpec.required_volumes[] is appended verbatim
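
A minimal sketch of how the default volumes and `AgentSpec.required_volumes[]` might be combined into one list. It follows the reference implementation's rule that the first entry for a given (schema, name) pair wins, with caller-supplied entries placed first. The function name `merge_required_volumes`, the example volume `eval_artifacts`, and the strict check on the `schema` field are assumptions for illustration.

```python
from typing import Optional

DEFAULT_VOLUMES = [
    {"name": "knowledge_sources", "schema": "agent",
     "comment": "KA + retrieval source files"},
    {"name": "agent_outputs", "schema": "agent",
     "comment": "Tool-generated artifacts (CSV, charts)"},
    {"name": "signoffs", "schema": "ops",
     "comment": "SDLC 04b stakeholder + engineering signoff decision.md artifacts"},
]


def merge_required_volumes(custom: Optional[list] = None) -> list:
    """Combine caller-declared volumes with the defaults, dropping duplicates."""
    merged, seen = [], set()
    for vol in list(custom or []) + DEFAULT_VOLUMES:
        # Assumption: only the two schema aliases used in this skill are valid.
        if vol["schema"] not in ("agent", "ops"):
            raise ValueError(f"unknown schema alias: {vol['schema']!r}")
        key = (vol["schema"], vol["name"])
        if key not in seen:
            seen.add(key)
            merged.append(vol)
    return merged


spec = merge_required_volumes([
    {"name": "eval_artifacts", "schema": "ops"},  # hypothetical custom volume
    {"name": "agent_outputs", "schema": "agent", "comment": "custom comment"},
])
print([v["name"] for v in spec])
# ['eval_artifacts', 'agent_outputs', 'knowledge_sources', 'signoffs']
```

Rejecting an unknown `schema` value up front avoids a typo such as `"agnet"` silently landing a volume in the wrong schema.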

Operations

flowchart TD
  start["F0 invoked"] --> grant["Verify USE CATALOG + CREATE SCHEMA"]
  grant --> agentSch["CREATE SCHEMA IF NOT EXISTS<br/>${catalog}.${agent_schema}"]
  grant --> opsSch["CREATE SCHEMA IF NOT EXISTS<br/>${catalog}.${ops_schema}"]
  agentSch --> volLoop["For each required_volume:<br/>create MANAGED volume idempotently"]
  opsSch --> volLoop
  volLoop --> capture["Capture { agent_schema, ops_schema, uc_volumes{name → path} }"]

Reference implementation

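from __future__ import annotations

from databricks.sdk import WorkspaceClient
from databricks.sdk.errors import AlreadyExists, ResourceAlreadyExists
from databricks.sdk.service.catalog import VolumeType
from databricks.sdk.service.sql import StatementState

def provision_uc_resources(
    *,
    uc_catalog: str,
    agent_schema: str,
    ops_schema: str,
    warehouse_id: str,
    required_volumes: list[dict] | None = None,
) -> dict:
    """Idempotently create the agent + ops schemas and their managed volumes."""
    w = WorkspaceClient()

    # 1. Schemas (DDL via statement execution against the workshop warehouse)
    for schema in (agent_schema, ops_schema):
        resp = w.statement_execution.execute_statement(
            warehouse_id=warehouse_id,
            statement=(
                # Backticks keep prefixes containing hyphens or dots valid.
                f"CREATE SCHEMA IF NOT EXISTS `{uc_catalog}`.`{schema}` "
                f"COMMENT 'F0-managed: agent or ops assets'"
            ),
            wait_timeout="30s",
        )
        # execute_statement returns normally even if the statement failed or is
        # still running after the wait, so check the final state explicitly.
        state = resp.status.state if resp.status else None
        if state != StatementState.SUCCEEDED:
            raise RuntimeError(
                f"CREATE SCHEMA {uc_catalog}.{schema} did not succeed: {state}"
            )

    # 2. Volumes (SDK; an "already exists" conflict is success)
    defaults = [
        {"name": "knowledge_sources", "schema": "agent",
         "comment": "KA + retrieval source files"},
        {"name": "agent_outputs", "schema": "agent",
         "comment": "Tool-generated artifacts (CSV, charts)"},
        {"name": "signoffs", "schema": "ops",
         "comment": "SDLC 04b stakeholder + engineering signoff decision.md artifacts"},
    ]
    volumes_spec = (required_volumes or []) + defaults
    schema_by_alias = {"agent": agent_schema, "ops": ops_schema}
    seen, paths = set(), {}
    for v in volumes_spec:
        # Fail loudly on a misspelled alias instead of defaulting to the ops schema.
        if v["schema"] not in schema_by_alias:
            raise ValueError(f"Unknown schema alias {v['schema']!r} for volume {v['name']!r}")
        target_schema = schema_by_alias[v["schema"]]
        key = (target_schema, v["name"])
        if key in seen:
            continue
        seen.add(key)
        try:
            w.volumes.create(
                catalog_name=uc_catalog,
                schema_name=target_schema,
                name=v["name"],
                volume_type=VolumeType.MANAGED,
                comment=v.get("comment", "F0-managed volume"),
            )
        except (AlreadyExists, ResourceAlreadyExists):
            # Both are conflict errors in databricks.sdk.errors; which one is
            # raised depends on the error code the API returns.
            pass
        paths[v["name"]] = f"/Volumes/{uc_catalog}/{target_schema}/{v['name']}"

    return {
        "agent_schema": agent_schema,
        "ops_schema": ops_schema,
        "uc_volumes": paths,
        # Alias of uc_volumes.signoffs; always lives in the ops schema.
        "signoff_volume_path": f"/Volumes/{uc_catalog}/{ops_schema}/signoffs",
    }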

The skill caller passes the result back to vibecoding-state exit so downstream skills can read state://Resources.uc_volumes.knowledge_sources etc.


Outputs (handoff)

| Key | Consumer | Value |
|---|---|---|
| agent_schema | F2, F5, SDLC 02/04, Track A 05 (memory) | ${user_schema_prefix}_agent |
| ops_schema | F2 (OTeL tables), SDLC 07 (monitoring), SDLC 04b (signoffs) | ${user_schema_prefix}_ops |
| uc_volumes.knowledge_sources | F5 (Step 5_0, branch B/C upload), F3 (Vector Search MCP source dir) | /Volumes/${catalog}/${agent_schema}/knowledge_sources |
| uc_volumes.agent_outputs | Track A 03 csv_export @function_tool, any code-interpreter output | /Volumes/${catalog}/${agent_schema}/agent_outputs |
| uc_volumes.signoffs (alias signoff_volume_path) | SDLC 04b (writes decision.md), SDLC 06 (reads it as the promotion gate) | /Volumes/${catalog}/${ops_schema}/signoffs |
| uc_volumes.<custom> | Whichever consuming skill declared the volume | /Volumes/${catalog}/${schema}/${name} |

Validation Gate

  • ${uc_catalog}.${agent_schema} exists (SHOW SCHEMAS IN ${uc_catalog} LIKE '${agent_schema}')
  • ${uc_catalog}.${ops_schema} exists
  • Every volume in required_volumes is listable via databricks volumes list ${uc_catalog} ${schema} --output json | jq '.volumes[] | select(.name=="<name>")'
  • The generic signoffs volume in ${ops_schema} is present (consumed by SDLC 04b / SDLC 06 — name is workshop-neutral)
  • agent_schema, ops_schema, the full uc_volumes map, and signoff_volume_path are captured in state
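
The last gate item (state capture) can be checked offline against the handoff dict. The sketch below is a hedged illustration: `check_f0_handoff` and `REQUIRED_KEYS` are hypothetical names, and the function only inspects the captured values; it does not confirm that the schemas or volumes exist in the workspace, which the SQL and CLI checks above cover.

```python
REQUIRED_KEYS = ("agent_schema", "ops_schema", "uc_volumes", "signoff_volume_path")


def check_f0_handoff(handoff: dict, catalog: str, expected_volumes: list) -> list:
    """Return a list of problems; an empty list means the state-capture check passes."""
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if not handoff.get(k)]
    volumes = handoff.get("uc_volumes") or {}
    for name in expected_volumes:
        path = volumes.get(name)
        if path is None:
            problems.append(f"volume not captured: {name}")
        elif not (path.startswith(f"/Volumes/{catalog}/") and path.endswith(f"/{name}")):
            problems.append(f"unexpected path for {name}: {path}")
    # signoff_volume_path is documented as an alias of uc_volumes.signoffs.
    signoff = handoff.get("signoff_volume_path")
    if signoff and signoff != volumes.get("signoffs"):
        problems.append("signoff_volume_path does not match uc_volumes.signoffs")
    return problems


handoff = {
    "agent_schema": "jdoe_demo_agent",
    "ops_schema": "jdoe_demo_ops",
    "uc_volumes": {
        "knowledge_sources": "/Volumes/main/jdoe_demo_agent/knowledge_sources",
        "agent_outputs": "/Volumes/main/jdoe_demo_agent/agent_outputs",
        "signoffs": "/Volumes/main/jdoe_demo_ops/signoffs",
    },
    "signoff_volume_path": "/Volumes/main/jdoe_demo_ops/signoffs",
}
print(check_f0_handoff(handoff, "main", ["knowledge_sources", "agent_outputs", "signoffs"]))
# []
```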

Delegation Map

| Need | Delegate To |
|---|---|
| Bronze schema + Delta tables | bronze/00-bronze-layer-setup |
| OTeL trace tables in ${ops_schema} | F2 |
| KA source markdown into knowledge_sources | F5 Step 5_0: branch (A) skip if non-empty / (B) upload ka_source / (C) auto-generate from PRD |
| Lakebase memory tables | Track A 05 |
| Benchmark tables in ${agent_schema} | SDLC 02 |
| Monitoring tables in ${ops_schema} | SDLC 07 |

Limitations to Plan For

  • Catalog must exist. F0 does not create catalogs (org-level admin action).
  • Workspace warehouse required. Schema DDL goes through the SQL warehouse, not the Python serverless API.
  • No drop semantics. F0 is create-only — to remove a schema or volume, delete out of band; do not put DROP in any agent skill.
  • Volume size budgeting. Managed volumes inherit the catalog's storage location; if your agent_outputs may grow large, document a retention policy in the consuming skill.

Notes to Carry Forward

| Key | Produced By | Value |
|---|---|---|
| agent_schema | F0 | ${user_schema_prefix}_agent |
| ops_schema | F0 | ${user_schema_prefix}_ops |
| uc_volumes | F0 | Map of name → /Volumes/${catalog}/${schema}/${name} |
| knowledge_source_path | F0 (alias of uc_volumes.knowledge_sources) | Path that F5 attaches as a files knowledge source |
| agent_outputs_path | F0 (alias of uc_volumes.agent_outputs) | Path that Track A 03's csv_export writes to |
| signoff_volume_path | F0 (alias of uc_volumes.signoffs) | Generic, use-case-neutral signoff volume path SDLC 04b writes decision.md into and SDLC 06 reads as the promotion gate. Always /Volumes/${catalog}/${ops_schema}/signoffs; never workshop-specific. |
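
The three alias keys are plain lookups into the `uc_volumes` map. A minimal sketch of deriving them from the provisioning result follows; the function name `carry_forward` is hypothetical, and how the values are actually written into state is left to the caller.

```python
def carry_forward(result: dict) -> dict:
    """Flatten an F0 result into the keys listed in Notes to Carry Forward."""
    vols = result["uc_volumes"]
    return {
        "agent_schema": result["agent_schema"],
        "ops_schema": result["ops_schema"],
        "uc_volumes": dict(vols),  # copy so later edits do not mutate the input
        "knowledge_source_path": vols["knowledge_sources"],
        "agent_outputs_path": vols["agent_outputs"],
        "signoff_volume_path": vols["signoffs"],
    }


notes = carry_forward({
    "agent_schema": "jdoe_demo_agent",
    "ops_schema": "jdoe_demo_ops",
    "uc_volumes": {
        "knowledge_sources": "/Volumes/main/jdoe_demo_agent/knowledge_sources",
        "agent_outputs": "/Volumes/main/jdoe_demo_agent/agent_outputs",
        "signoffs": "/Volumes/main/jdoe_demo_ops/signoffs",
    },
})
print(notes["signoff_volume_path"])
# /Volumes/main/jdoe_demo_ops/signoffs
```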

Next Step

After F0's gate passes, continue with F1: MLflow GenAI Foundation. Every subsequent skill can now assume schemas + volumes exist.

Version History

| Version | Date | Changes |
|---|---|---|
| 1.1.0 | 2026-04-26 | Added generic, workshop-neutral signoffs MANAGED volume in ${ops_schema} as a default required_volume. SDLC 04b writes decision.md artifacts here and SDLC 06 reads them as the promotion gate. Captures signoff_volume_path (alias of uc_volumes.signoffs) so downstream skills resolve /Volumes/${catalog}/${ops_schema}/signoffs/v<N>/decision.md deterministically without per-use-case naming. |
| 1.0.0 | 2026-04-24 | Initial skill, extracted from F5 (05-knowledge-assistant) Step 5_0 so schema + volume provisioning is owned in one cross-cutting place. Consumers (F5, F2, Track A 05, SDLC 02/04/07) now delegate. |