name: generate-mdl description: "Generate a Wren MDL project by exploring a database with available tools (SQLAlchemy, database drivers, MCP connectors, or raw SQL). Guides agents through schema discovery, type normalization, and MDL YAML generation using the wren CLI. Use when: user wants to create or set up a new MDL, onboard a new data source, or scaffold a project from an existing database." license: Apache-2.0 metadata: author: wrenai
Generate Wren MDL — CLI Agent Workflow
Builds an MDL project by discovering database schema and converting it into Wren's YAML project format. The agent uses whatever database tools are available in its environment for introspection; the wren CLI handles type normalization, validation, and build.
For memory and query workflows after setup, run wren skills get usage.
Prerequisites
wrenCLI installed (pip install "wrenai[<datasource>]")- A working database connection (credentials available to the agent)
- A connection profile (set up via
wren profile add) or connection info ready
Phase 0 — Detect existing project
Goal: If the current directory is already inside a wren project, let the user decide how to proceed.
Check whether wren_project.yml exists in the current working directory
(or any parent up to the repository root). If found:
- Tell the user that an existing wren project was detected and show its path.
- Ask:
- Reset — wipe the existing project (
models/,views/,relationships.yml,instructions.md, and rebuildwren_project.yml) and regenerate from scratch in the same directory. - New path — keep the existing project untouched and choose a
different directory for the new project. Ask the user for the new path,
then
wren context init --path <new_path>and continue from Phase 1 using that path.
- Reset — wipe the existing project (
If no existing project is detected, proceed directly to Phase 1.
Phase 1 — Establish connection and scope
Goal: Confirm the agent can reach the database and agree on scope with the user.
Verify connectivity using whichever tool is available:
- If SQLAlchemy:
engine.connect()test - If database driver: simple query like
SELECT 1 - If wren profile exists:
wren profile debugto check config - If raw SQL via wren:
wren --sql "SELECT 1"(requires profile or connection file)
- If SQLAlchemy:
Ask the user:
- Which schema(s) or dataset(s) to include (skip if only one exists)
- Whether to include all tables or a subset
- The datasource type for wren (e.g.,
postgres(including Aurora),mysql(including Aurora),bigquery,snowflake) — needed for type normalization dialect
Phase 2 — Discover schema
Goal: Collect table names, column names, column types, and constraints.
Use whatever introspection method is available. Here are common approaches ranked by convenience:
Option A: SQLAlchemy (recommended if available)
from sqlalchemy import create_engine, inspect
engine = create_engine(connection_url)
inspector = inspect(engine)
tables = inspector.get_table_names(schema="public")
for table in tables:
columns = inspector.get_columns(table, schema="public")
# columns → [{"name": "id", "type": INTEGER(), "nullable": False, ...}]
pk = inspector.get_pk_constraint(table, schema="public")
# pk → {"constrained_columns": ["id"], "name": "orders_pkey"}
fks = inspector.get_foreign_keys(table, schema="public")
# fks → [{"constrained_columns": ["customer_id"],
# "referred_table": "customers",
# "referred_columns": ["id"]}]
Option B: Database-specific driver
- psycopg / asyncpg (Postgres): Query
information_schema.columnsandinformation_schema.table_constraints - google-cloud-bigquery:
client.list_tables(),client.get_table()→table.schema - snowflake-connector-python:
SHOW COLUMNS IN TABLE,SHOW PRIMARY KEYS IN TABLE - clickhouse-driver:
DESCRIBE TABLE,system.tables
Option C: Raw SQL via wren
If no driver is available but a wren profile is configured, query
information_schema through wren itself:
wren --sql "SELECT table_name FROM information_schema.tables WHERE table_schema = 'public'" -o json
wren --sql "SELECT column_name, data_type FROM information_schema.columns WHERE table_name = 'orders'" -o json
Note: this goes through the MDL layer, so it only works if you already
have a minimal MDL or if the database supports information_schema as
regular tables. For bootstrapping from zero, Option A or B is preferred.
Phase 3 — Normalize types
Goal: Convert raw database types to wren-core-compatible types.
Python import (recommended for batch processing)
from wren.type_mapping import parse_type, parse_types
# Single type
normalized = parse_type("character varying(255)", "postgres") # → "VARCHAR(255)"
# Batch — entire table at once
columns = [
{"column": "id", "raw_type": "int8"},
{"column": "name", "raw_type": "character varying"},
{"column": "total", "raw_type": "numeric(10,2)"},
]
normalized_cols = parse_types(columns, dialect="postgres")
# Each dict now has a "type" key with the normalized value
CLI (if Python import not available)
Single type:
wren utils parse-type --type "character varying(255)" --dialect postgres
# → VARCHAR(255)
Batch (stdin JSON):
echo '[{"column":"id","raw_type":"int8"},{"column":"name","raw_type":"character varying"}]' \
| wren utils parse-types --dialect postgres
Phase 4 — Scaffold and write MDL project
Goal: Create the YAML project structure.
Step 1 — Initialize project
wren context init --path /path/to/project
This creates:
project/
├── wren_project.yml
├── models/ # business-facing tables/models
├── views/ # named SQL statements
├── cubes/ # pre-aggregation cubes (measures + dimensions)
├── relationships.yml
└── instructions.md
When to define cubes: If the user asks aggregation questions like "revenue by month" or "top customers", define cubes alongside models — they give agents a structured query API instead of forcing them to hand-write
GROUP BY/DATE_TRUNCSQL. See the Cube guide.
IMPORTANT:
catalogandschemainwren_project.ymlThese are Wren Engine's internal namespace — they are NOT the database's native catalog or schema. Keep the defaults (
catalog: wren,schema: public) unless you are intentionally configuring a multi-project namespace.Your database's actual catalog/schema is specified per-model in
table_reference(see Step 2). Do not copy database catalog/schema values intowren_project.yml.
Step 2 — Write model files
For each table, create a YAML file under models/. Use snake_case
naming (the build step converts to camelCase automatically).
# models/orders/metadata.yml
name: orders
table_reference:
catalog: "" # database catalog (empty string if not applicable;
# for DuckDB, use the DB file name without extension,
# e.g. jaffle_shop.duckdb → catalog: jaffle_shop)
schema: public # database schema (this IS the DB schema)
table: orders # database table name
primary_key: order_id
columns:
- name: order_id
type: INTEGER
not_null: true
- name: customer_id
type: INTEGER
- name: total
type: "DECIMAL(10, 2)"
- name: status
type: VARCHAR
properties:
description: "Order status: pending, shipped, delivered, cancelled"
Step 3 — Write relationships
From foreign key constraints discovered in Phase 2:
# relationships.yml
- name: orders_customers
models:
- orders
- customers
join_type: many_to_one
condition: "orders.customer_id = customers.customer_id"
Join type mapping:
- FK table → PK table:
many_to_one - PK table → FK table:
one_to_many - Unique FK:
one_to_one - Junction table:
many_to_many
If no foreign keys were found, infer from naming conventions:
- Column
<table>_idor<table_singular>_id→ likely FK to<table> - Ask the user to confirm inferred relationships
Step 4 — Add descriptions (optional but valuable)
Ask the user to describe:
- Each model (1-2 sentences about what the table represents)
- Key columns (especially calculated fields or non-obvious names)
These descriptions are indexed by wren memory index and significantly
improve LLM query accuracy.
Phase 5 — Validate and build
# Validate YAML structure and integrity
wren context validate --path /path/to/project
# If strict mode is desired:
wren context validate --path /path/to/project --strict
# Build JSON manifest
wren context build --path /path/to/project
# Verify against database
wren --sql "SELECT * FROM <model_name> LIMIT 1"
If validation fails, fix the reported issues and re-run. Common errors:
- Duplicate model/column names
- Missing primary key
- Relationship referencing non-existent model
- Invalid column type (try re-running through
parse_type)
Phase 6 — Initialize memory
# Index schema (generates seed NL-SQL examples automatically)
wren memory index
# Verify
wren memory status
After this step, wren memory fetch and wren memory recall are
operational. See wren skills get usage for query workflows.
Phase 7 — Iterate with the user
The initial MDL is a starting point. Improve it by:
- Adding calculated columns based on business logic
- Adding views for common query patterns
- Refining descriptions based on actual query usage
- Adding access control (RLAC/CLAC) if needed
Each change follows: edit YAML → wren context validate →
wren context build → wren memory index.
Quick reference
| Task | Command / Method |
|---|---|
| Discover tables | Agent's own tools (SQLAlchemy, driver, raw SQL) |
| Discover columns + types | Agent's own tools |
| Discover constraints | Agent's own tools |
| Normalize types (Python) | from wren.type_mapping import parse_type |
| Normalize types (CLI) | wren utils parse-type --type T --dialect D |
| Normalize types (batch) | wren utils parse-types --dialect D < columns.json |
| Scaffold project | wren context init |
| Write models | Create models/<name>/metadata.yml |
| Write relationships | Edit relationships.yml |
| Validate | wren context validate |
| Build manifest | wren context build |
| Test query | wren --sql "SELECT * FROM <model> LIMIT 1" |
| Index memory | wren memory index |
Things to avoid
- Do not hardcode database-specific type strings in MDL — always normalize via
parse_type - Do not skip validation before build — invalid YAML produces broken manifests silently
- Do not guess column types — introspect from the actual database
- Do not write relationships without confirming join conditions — wrong conditions cause silent query errors
- Do not skip
wren memory indexafter build — stale indexes degrade recall quality