name: databricks-ads-session description: Conduct Azure Databricks Architecture Design Sessions (ADS). Orchestrate a structured, multi-turn conversation to gather solution requirements from users across any industry or starting point (on-prem migration, IoT, data warehousing, ML/AI, streaming, etc.), then generate a Databricks-centric architecture diagram as PNG. Use when the user wants to (1) design an Azure Databricks solution, (2) run an architecture design session, (3) scope a Databricks migration or greenfield project, (4) gather requirements for a data platform, or (5) generate an Azure Databricks architecture diagram from requirements. license: MIT compatibility: Works with Claude Code, GitHub Copilot, VS Code, Cursor, and any Agent Skills compatible tool. PNG export requires Node.js (npx @mermaid-js/mermaid-cli). Fallback is Mermaid in Markdown preview. metadata: author: community version: "3.0" domain: Azure Databricks
Azure Databricks ADS Session
This skill provides domain-specific knowledge for Azure Databricks to be used within an Architecture Design Session. The ADS methodology (persona, pacing, session structure, decision narration, trade-off framework, self-critique) is defined in the runtime system prompt — this skill supplies the Databricks-specific questions, patterns, components, and references that the methodology operates on.
Domain: Azure Databricks
This skill covers the Azure Databricks data platform including:
- Data Engineering: LakeFlow Connect, LakeFlow Jobs, LakeFlow Spark Declarative Pipelines (DLT), Auto Loader, Structured Streaming, Apache Flink, Delta Lake
- Data Warehousing: SQL Warehouse (Serverless/Pro/Classic), Lakehouse Federation, dbt integration, materialized views
- AI/ML: Mosaic AI (Model Serving, Feature Store, AI Gateway), MLflow 3.0, serverless GPU compute, distributed training
- GenAI: Mosaic AI Agent Framework, Agent Bricks, Vector Search, MCP Servers, AI Gateway with guardrails
- Governance: Unity Catalog, Delta Sharing, ABAC, column-level masking, data lineage, Compatibility Mode
- Infrastructure: Serverless Workspace, Classic VNet-injected workspace, ADLS Gen2, Azure Key Vault, Microsoft Entra ID
Phase-Specific Databricks Questions
Phase 1: Context Discovery
Ask about:
- Business problem or opportunity driving this initiative
- Industry and regulatory context
- Greenfield project vs. migration from existing system
- Key stakeholders and decision-makers
- Timeline and budget constraints
- Success criteria (what does "done" look like?)
- KPIs, latency targets, and cost envelope
Adapt: If user mentions migration, read references/migration-patterns.md. If user names a specific industry, read references/industry-templates.md for starter context.
Phase 2: Current Landscape
Ask about:
- Data sources (databases, APIs, files, streams, SaaS platforms)
- Current data platform (if migrating: Hadoop, Snowflake, on-prem SQL, etc.)
- Data volumes and growth rate
- Real-time vs. batch requirements
- Data governance and cataloging needs (Unity Catalog considerations)
- Sensitive data classification (PII, PHI, financial)
- Unstructured data (documents, PDFs, images, audio) for AI processing
See references/probing-questions.md for deep-dive question banks when the user's answers are vague or incomplete.
Phase 3: Security & Networking
Ask about:
- Network topology (VNet injection, private endpoints, hub-spoke)
- Identity provider (Entra ID, federation, SCIM provisioning)
- Data access control model (table-level, row-level, column-level, attribute-based access control / ABAC)
- Regulatory compliance (HIPAA, SOC2, GDPR, FedRAMP, industry-specific)
- Encryption requirements (at-rest, in-transit, customer-managed keys)
- Secrets management approach
Phase 4: Operational Requirements
Ask about:
- HA/DR requirements and RPO/RTO targets
- Multi-region or single-region deployment
- Cost optimization priorities (reserved capacity, spot instances, serverless compute, FinOps strategy)
- Workspace deployment model (Serverless Workspace vs Classic with VNet injection)
- Monitoring and alerting requirements
- Environment strategy (dev/staging/prod workspace separation)
- Tagging and cost allocation strategy
- Operating model: who owns data products, who approves access, who triages incidents
Use references/trade-offs-and-failure-modes.md for domain-specific failure scenarios to raise proactively during this phase.
Readiness Gate
After each phase, internally track information completeness using references/readiness-checklist.md.
Decision Logic
IF all must-have items are gathered:
→ Proceed to diagram generation
IF some should-have items are missing:
→ State assumptions explicitly, ask user to confirm or correct
IF must-have items are missing:
→ Ask targeted follow-up questions (max 2 additional turns)
IF user says "just generate something" or expresses impatience:
→ Generate with sensible defaults, document all assumptions
Always tell the user what you know and what you're assuming before generating.
Databricks Diagram Components
When generating diagrams, use these Databricks-specific node shapes (extends the generic style guide from the architecture-diagramming skill):
| Component Type | Shape | Mermaid Syntax | Example |
|---|---|---|---|
| Databricks Workspace | Rectangle | [Name] |
[Databricks Workspace] |
| Delta Tables / ADLS | Cylinder | [(Name)] |
[(ADLS Gen2)], [(Delta Tables)] |
| Unity Catalog | Rectangle | [Name] |
[Unity Catalog] |
| Security (Key Vault, Entra ID) | Rounded | (Name) |
(Azure Key Vault), (Microsoft Entra ID) |
| Networking (VNet, ExpressRoute) | Hexagon / Stadium | {{Name}} / ([Name]) |
{{Hub VNet}}, ([ExpressRoute]) |
| External / On-Prem Systems | Double-bordered | [[Name]] |
[[On-Prem HDFS]] |
Pattern Selection
Match gathered requirements to a Databricks architecture pattern. Read references/databricks-patterns.md for the pattern catalog.
Common patterns:
| Use Case | Pattern |
|---|---|
| Data lakehouse (general) | Medallion architecture with Unity Catalog |
| On-prem migration | Lift-and-shift → modernize with Delta Lake |
| Real-time analytics | Structured Streaming + Apache Flink + LakeFlow Spark Declarative Pipelines |
| ML/AI platform | Feature Store + MLflow 3.0 + Mosaic AI + Model Serving + Serverless GPU Compute |
| GenAI / AI agents | Mosaic AI Agent Framework + Agent Bricks + Vector Search + AI Gateway + MCP Servers |
| Business analytics | Databricks One + AI/BI Genie + SQL Warehouse |
| Data warehouse replacement | SQL Warehouse + dbt + Lakehouse Federation |
| IoT data platform | Event Hubs → Databricks Streaming → Delta |
| Multi-team data mesh | Unity Catalog + workspace per domain + Delta Sharing |
| Hybrid batch + streaming | LakeFlow Connect + LakeFlow Jobs + Structured Streaming + Flink |
Generating the Diagram
Generate a Mermaid flowchart diagram based on the gathered requirements. Use the pattern templates in references/databricks-patterns.md as a starting point, then customize based on the specific requirements gathered.
Follow the architecture-diagramming skill's style guide for general Mermaid conventions (arrow styles, subgraph naming, layout direction). Apply the Databricks-specific node shapes listed above.
For rendering, Architecture Recap format, and iteration workflow, defer to the architecture-diagramming skill.
Optional: Workload Profiling
These questions are not a mandatory phase — the customer may or may not raise workload-specific topics during the session. Have this content ready to deploy when the conversation naturally moves toward workloads, but do not force it as a separate phase.
If the customer discusses workloads, ask about:
- ETL/ELT pipelines (complexity, frequency, SLAs)
- ML/AI workloads (training, inference, MLOps maturity, Mosaic AI)
- GenAI applications (RAG, chatbots, AI agents, document intelligence, Mosaic AI Agent Bricks)
- BI/reporting and self-service analytics (tools, user count, concurrency, AI/BI Genie)
- Streaming/real-time analytics requirements
- SQL analytics workloads (ad-hoc queries, dashboards)
- Data application hosting needs (Databricks Apps, custom UIs)
- Notebook/interactive development needs
- CI/CD and DevOps practices for data engineering (DABs, Azure DevOps, GitHub Actions)
If a workload area warrants deeper exploration, offer a Technical Deep-Dive using references/technical-deep-dives.md.
Databricks Expertise
You know Databricks inside-out — the tradeoffs between serverless and provisioned, when LakeFlow Connect beats ADF, why Liquid Clustering replaced partitioning. Connect technical decisions to business outcomes: "LakeFlow Spark Declarative Pipelines" isn't just a product name — it's fewer pipeline engineers and faster time-to-insight. Translate tech into value.
Reference Files
| File | Load When |
|---|---|
| references/conversation-framework.md | Understanding the full phase detail, signal detection, and transition logic |
| references/databricks-patterns.md | Selecting architecture pattern before diagram generation |
| references/readiness-checklist.md | Evaluating if enough info has been gathered |
| references/industry-templates.md | User mentions a specific industry vertical |
| references/migration-patterns.md | User is migrating from an existing platform |
| references/probing-questions.md | User gives vague answers, need to dig deeper |
| references/trade-offs-and-failure-modes.md | Trade-off analysis or failure mode walkthrough needed |
| references/technical-deep-dives.md | User accepts a technical deep-dive (spike) on a workload topic |
Scripts
| Script | Purpose |
|---|---|
| scripts/generate_architecture.py | Generate Mermaid diagram code for a given Databricks architecture pattern |