sql-workflow

name: sql-workflow description: "Use this skill before writing any SQL query. Covers: output shape inference (cardinality clues from the question), efficient schema exploration, iterative CTE-based query building, structured verification loop (row count, NULL audit, fan-out check, sample inspection), error recovery protocol, saving output to result.sql and result.csv, turn budget management, and common benchmark traps." type: skill

SQL Workflow Skill

0. Load Knowledge Base Context FIRST

Project-specific conventions, decisions, and quirks live in the Knowledge Base. Always consult before exploring the schema.

Step 0a — `get_knowledge`

Call once at the start of every task with a 1-line task_description. Returns the always-loaded baseline (org/project understanding + conventions) plus up to 5 task-relevant decisions/debugging/quirks. Treat the returned ## title blocks as authoritative for naming, joins, and known traps.

Step 0b — `search_knowledge(query=...)`

Call when you hit something unexpected — column meaning unclear, ambiguous join, surprising row count. Pass a 2–4 word query. It is a pure read — no side effects.

Step 0c — `propose_knowledge`

Call ONLY after you have completed work and verified a finding. Use it to record:

category="decisions" for choices made (auto-accepted).
category="debugging" for root-cause traps you hit and resolved (auto-accepted).
category="quirks" (scope=connection) for connector/dialect oddities (auto-accepted).
Do NOT propose understanding — humans only. Do NOT propose conventions or domain-rules as part of automated runs unless explicitly asked (these queue for human review).
Title must be a slug (^[a-z0-9-]+$, ≤120 chars). Body is markdown.
On duplicate-title: re-call with overwrite=true only if the prior doc is genuinely outdated.

What NOT to do

Do not paste raw KB text back into SQL comments — reference the doc title instead.
Do not call propose_knowledge mid-exploration — only after success.

1. Schema Exploration — Do This First

Before writing any SQL, load KB context (Phase 0 above) then understand the data:

Read local schema files first (if schema/ directory exists in workdir):
- schema/DDL.csv — all CREATE TABLE statements (if it exists)
- schema/{table_name}.json — column names, types, descriptions, sample values Reading these files costs zero tool calls and gives you table structure + sample data. Only call MCP tools for information not in the local files (e.g., row counts, live data exploration).
Call list_tables to get all schemas and tables — only if no local schema files exist or you need row counts.
Call describe_table on the tables that seem relevant to the question (only if JSON files lack detail)
Call explore_column on categorical columns to see distinct values (for filtering/grouping)
Call find_join_path if you need to join tables and the relationship is unclear

Stop exploring after 3-5 tool calls. Write SQL based on what you've found.

2. Output Shape Inference — Before Writing SQL

Read the task question carefully for cardinality clues:

"for each X" → GROUP BY X, one output row per X
"top N" / "top 5" → LIMIT N or QUALIFY RANK() <= N
"total / sum / average" → single row aggregate
"list all" → detail rows, no aggregation
"how many" → COUNT, result is 1 row 1 column

Write a comment at the top of your SQL:

-- EXPECTED: <row count estimate> rows because <reason from question>

Critical checks:

If the question asks for a single number, the result MUST be 1 row × 1 column
If the question says "how many", verify the CSV has exactly 1 row with a COUNT value
If "top N" appears in the question, verify the CSV has at most N rows

3. Iterative Query Building — Build Bottom-Up

Do NOT write a 50-line query and run it all at once:

Write the innermost subquery or first CTE first
Run it standalone with query_database — verify row count and sample values
Add the next CTE, verify again
Continue until the full query is built

Example incremental pattern:

-- Step 1: verify source
SELECT COUNT(*) FROM orders WHERE status = 'completed';

-- Step 2: verify join partner cardinality
SELECT COUNT(*), COUNT(DISTINCT customer_id) FROM orders;

-- Step 3: build first CTE, verify
WITH order_totals AS (
  SELECT customer_id, SUM(amount) AS total
  FROM orders
  GROUP BY customer_id
)
SELECT COUNT(*), COUNT(DISTINCT customer_id) FROM order_totals;

-- Step 4: add final aggregation

4. Execution and Structured Verification

mcp__signalpilot__query_database
  connection_name="<task_connection_name>"
  sql="SELECT ..."

After executing, run these checks IN ORDER before saving:

Row count sanity: Does 0 rows make sense? Does 1M rows make sense for a "top 10" question?
Column count: Does the result have the right number of columns for the question?
NULL audit: For each key column — unexpected NULLs indicate wrong JOINs:
```
SELECT COUNT(*) - COUNT(col) AS nulls FROM (your_query) t
```
Sample inspection: Look at 5 rows — are values in expected ranges? Do string columns have meaningful values (not join keys)?
Fan-out check: If JOINing, compare COUNT(*) vs COUNT(DISTINCT primary_key):
```
SELECT COUNT(*) AS total_rows, COUNT(DISTINCT <pk>) AS unique_keys FROM (your_query) t;
```
If they differ, you have duplicate rows from a fan-out JOIN.
Re-read the question: Does your output actually answer what was asked?

5. Error Recovery Protocol

Syntax error: Use validate_sql before query_database to catch errors without burning a query turn
Wrong results: Do NOT just re-run the same query. Diagnose: which JOIN is wrong? Which filter is too aggressive?

Zero rows: Binary-search your WHERE conditions — remove them one at a time to find the culprit:

SELECT COUNT(*) FROM table WHERE cond_1;             -- still same? keep it
SELECT COUNT(*) FROM table WHERE cond_1 AND cond_2;  -- drops? cond_2 is culprit

Too many rows: Check for fan-out (duplicate join keys) or missing GROUP BY
CTE debugging: Use debug_cte_query to run each CTE independently and find which step breaks

6. Saving Output

Once you have the correct result:

Write final SQL to result.sql:

Write tool: path="result.sql", content="<your SQL query>"

Write the result as CSV to result.csv:
```
Write tool: path="result.csv", content="col1,col2,...\nval1,val2,..."
```
- Always include a header row with column names
- Use comma as delimiter
- Quote string values that contain commas or newlines

7. Turn Budget Management

First 3 turns: Schema exploration only (schema_overview, describe_table on 2-3 tables, explore_column on key categorical columns). STOP exploring.
Turns 4 through (N-3): Write query iteratively — execute and verify each step.
Last 3 turns: Finalize result.sql and result.csv. If you have a working query, SAVE IT NOW — do not keep iterating.

If your query works and passes all verification checks, SAVE IMMEDIATELY — do not continue exploring "just in case".

8. Common Benchmark Traps

Rounding: Do NOT round unless the question explicitly asks for rounded values. The evaluator uses tolerance-based comparison — full precision is always safer.
Column naming: Match the question's phrasing exactly. If the question says "total revenue", name the column total_revenue, not sum_revenue or revenue_total.
CSV format: No trailing newline, no BOM, comma delimiter, double-quote strings containing commas.
Empty result: If the correct answer is 0 or empty, write a CSV with just the header row (or header + "0").
Date/time format in CSV: Use ISO 8601 (YYYY-MM-DD) unless the question specifies otherwise.
String case in CSV: Preserve the case from the database — do not uppercase/lowercase unless the question explicitly asks.
Fan-out from JOINs: Always check COUNT(*) vs COUNT(DISTINCT key) after every JOIN
Wrong NULL handling: Use IS NULL / IS NOT NULL, not = NULL
Date format mismatch: Check the actual format stored in the column with explore_column
Case sensitivity: Use the correct case-insensitive function for your backend
Interpretation errors: Before saving, re-read the original question. Verify:
- Filter conditions match domain values (check with explore_column if unsure)
- "Excluding X" means the right thing (NOT IN vs EXCEPT vs WHERE NOT)
- Metrics match domain definitions (e.g., "scored points" in F1 = points > 0, not just participated)