name: databricks-execution-compute description: >- Execute code and manage compute on Databricks. Use this skill when the user mentions: "run code", "execute", "run on databricks", "serverless", "no cluster", "run python", "run scala", "run sql", "run R", "run file", "push and run", "notebook run", "batch script", "model training", "run script on cluster", "create cluster", "new cluster", "resize cluster", "modify cluster", "delete cluster", "terminate cluster", "create warehouse", "new warehouse", "resize warehouse", "delete warehouse", "node types", "runtime versions", "DBR versions", "spin up compute", "provision cluster".
Databricks Execution & Compute
Run code on Databricks. Three execution modes—choose based on workload.
Execution Mode Decision Matrix
| Aspect | Databricks Connect ⭐ | Serverless Job | Interactive Cluster |
|---|---|---|---|
| Use for | Spark code (ETL, data gen) | Heavy processing (ML) | State across tool calls, Scala/R |
| Startup | Instant | ~25-50s cold start | ~5min if stopped |
| State | Within Python process | None | Via context_id |
| Languages | Python (PySpark) | Python, SQL | Python, Scala, SQL, R |
| Dependencies | withDependencies() |
CLI with environments spec | Install on cluster |
Decision Flow
Spark-based code? → Databricks Connect (fastest)
└─ Python 3.12 missing? → Install it + databricks-connect
└─ Install fails? → Ask user (don't auto-switch modes)
Heavy/long-running (ML)? → Serverless Job (independent)
Need state across calls? → Interactive Cluster (list and ask which one to use)
Scala/R? → Interactive Cluster (list and ask which one to use)
How to Run Code
Read the reference file for your chosen mode before proceeding.
Databricks Connect (no MCP tool, run locally) → reference
python my_spark_script.py
Serverless Job → reference
execute_code(file_path="/path/to/script.py")
Interactive Cluster → reference
# Check for running clusters first (or use the one instructed)
list_compute(resource="clusters")
# Ask the customer which one to use
# Run code, reuse context_id for follow-up MCP call
result = execute_code(code="...", compute_type="cluster", cluster_id="...")
execute_code(code="...", context_id=result["context_id"], cluster_id=result["cluster_id"])
MCP Tools
| Tool | For | Purpose |
|---|---|---|
execute_code |
Serverless, Interactive | Run code remotely |
list_compute |
Interactive | List clusters, check status, auto-select running cluster |
manage_cluster |
Interactive | Create, start, terminate, delete. COSTLY: start takes 3-8 min—ask user |
manage_sql_warehouse |
SQL | Create, modify, delete SQL warehouses |
Related Skills
- databricks-synthetic-data-gen — Data generation using Spark + Faker
- databricks-jobs — Production job orchestration
- databricks-dbsql — SQL warehouse and AI functions