dsql

name: dsql description: "Build with Aurora DSQL — manage schemas, execute queries, handle migrations, diagnose query plans, load data, and develop applications with a serverless, distributed SQL database. Covers IAM auth, multi-tenant patterns, MySQL-to-DSQL and PostgreSQL-to-DSQL schema conversion, FK replacement code generation, OCC retry patterns, ORM migration (Django/Hibernate/Rails), DDL operations, query plan explainability, SQL compatibility validation, and bulk data loading. Triggers on phrases like: DSQL, Aurora DSQL, distributed SQL database, serverless PostgreSQL-compatible database, migrate to DSQL, DSQL query plan, DSQL EXPLAIN ANALYZE, DSQL ENUM, DSQL foreign key, DSQL OCC retry, DSQL multi-region, DSQL JSONB, DSQL GIN index, load into DSQL, load CSV into DSQL, bulk load DSQL, aurora-dsql-loader." license: Apache-2.0 metadata: tags: aws, aurora, dsql, distributed-sql, distributed, distributed-database, database, serverless, serverless-database, postgresql, postgres, sql, schema, migration, multi-tenant, iam-auth, aurora-dsql, mcp, orm, enum, foreign-key, occ-retry, django, hibernate, rails, multi-region, schema-conversion, type-mapping, data-loading

Amazon Aurora DSQL Skill

Aurora DSQL is a serverless, PostgreSQL-compatible distributed SQL database. This skill covers direct query execution via MCP tools, schema management, migrations, multi-tenant isolation, IAM auth, and bulk data loading via aurora-dsql-loader.

Reference Files

Load these files as needed for detailed guidance:

Core:

Reference	When to Load	Contains
development-guide.md	ALWAYS before schema changes or DB operations	Best practices, DDL rules, transaction limits, app-layer referential integrity
language.md	MUST load for language-specific choices	Driver selection, DSQL Connectors, connection code
access-control.md	MUST load for roles, grants, or sensitive data	Scoped role setup, IAM-to-database role mapping
troubleshooting.md	SHOULD load for errors or unexpected behavior	OCC errors, connection failures, cluster state errors, token expiry, DDL rejection causes
dsql-examples.md	Load for implementation examples	Multi-tenant schema examples, batch operations, FK validation patterns, connection pooling
onboarding.md	User requests "Get started with DSQL"	Interactive step-by-step guide
occ-retry-patterns.md	MUST load for OCC retry code or conflict mitigation	DSQL Connectors, manual retry pattern, idempotent design

MCP:

Reference	When to Load	Contains
mcp-setup.md	Always for MCP server guidance	Setup instructions, 2 configuration options
mcp-tools.md	For MCP tool syntax and examples	Tool parameters, input validation
dsql-lint.md	MUST load before running `dsql_lint` or processing external SQL	Tool reference, fix statuses, unfixable error resolution

DDL Migrations:

Reference	When to Load	Contains
ddl-migrations/overview.md	MUST load for DROP COLUMN, ALTER TYPE, DROP CONSTRAINT	Table recreation pattern, verify & swap
ddl-migrations/column-operations.md	DROP COLUMN, ALTER TYPE, SET/DROP NOT NULL/DEFAULT	Column-level migration patterns
ddl-migrations/constraint-operations.md	ADD/DROP CONSTRAINT, MODIFY PRIMARY KEY	Constraint and structural changes
ddl-migrations/batched-migration.md	Tables exceeding 3,000 rows	Batching patterns, progress tracking

MySQL Migrations:

Reference	When to Load	Contains
mysql-migrations/type-mapping.md	MUST load for MySQL → DSQL migration	Data type mappings, feature alternatives
mysql-migrations/ddl-operations.md	Translating MySQL DDL to DSQL	AUTO_INCREMENT, ENUM, SET, FK patterns
mysql-migrations/full-example.md	Complete MySQL table migration	End-to-end example with decision summary

PostgreSQL Migrations:

Reference	When to Load	Contains
pg-migrations/type-mapping.md	MUST load for PG → DSQL type questions	C collation rules, NUMERIC precision, JSON/JSONB
pg-migrations/fk-replacement.md	MUST load for FK validation code generation	Tenant-scoped validate_fk_*() template, cascade
pg-migrations/index-conversion.md	MUST load for unfixable index diagnostics	GIN/GiST/BRIN → btree, partial, expression indexes
pg-migrations/schema-objects.md	MUST load for ENUM, materialized views, extensions, multi-schema	ENUM → CHECK, views, role/IAM mapping
pg-migrations/multi-region.md	Multi-region, active-active, or HA questions	Architecture, geographic partitioning

ORM Guides:

Reference	When to Load	Contains
orm-guides/overview.md	Migrating any ORM to DSQL	Adapter names, key gotchas for Django/Hibernate/Rails/SQLAlchemy

Data Loading:

Reference	When to Load	Contains
data-loading.md	Planning or running bulk loads with `aurora-dsql-loader`	Fresh-vs-warm partitions, resume/retry, `--on-conflict` semantics, throughput diagnostics

Query Plan Explainability:

Reference	When to Load	Contains
query-plan/plan-interpretation.md	MUST load at Workflow 9 Phase 0	DSQL node types, Node Duration math, estimation-error bands
query-plan/catalog-queries.md	MUST load at Workflow 9 Phase 0	`pg_class`/`pg_stats`/`pg_indexes` SQL, correlated-predicate verification
query-plan/guc-experiments.md	MUST load at Workflow 9 Phase 0	GUC experiment procedures, 30-second skip protocol
query-plan/report-format.md	MUST load at Workflow 9 Phase 0	Required report structure, element checklist, support request template

MCP Tools Available

The aurora-dsql MCP server provides these tools:

Database Operations:

readonly_query - Execute SELECT queries (returns list of dicts)
transact - Execute DDL/DML statements in transaction (takes list of SQL statements)
get_schema - Get table structure for a specific table

SQL Validation:

dsql_lint - Validate SQL for DSQL compatibility and optionally auto-fix issues. Use before executing externally-sourced SQL.

Documentation & Knowledge:

dsql_search_documentation - Search Aurora DSQL documentation
dsql_read_documentation - Read specific documentation pages
dsql_recommend - Get DSQL best practice recommendations

Note: There is no list_tables tool. Use readonly_query with information_schema.

See mcp-setup.md for detailed setup instructions. See mcp-tools.md for detailed usage and examples.

AWS Knowledge MCP (`awsknowledge`)

Consult for verifying DSQL service limits before advising users. The numeric limits below are defaults that may change — when a user's decision depends on an exact limit, verify it first:

Limit	Default	Verify query
Max rows per transaction	3,000	`aurora dsql transaction limits`
Max data size per transaction	10 MiB	`aurora dsql transaction limits`
Max transaction duration	5 minutes	`aurora dsql transaction limits`
Max connections per cluster	10,000	`aurora dsql connection limits`
Auth token expiry	15 minutes	`aurora dsql authentication token`
Max connection duration	60 minutes	`aurora dsql connection limits`
Max indexes per table	24	`aurora dsql index limits`
Max columns per index	8	`aurora dsql index limits`
IDENTITY/SEQUENCE CACHE values	1 or >= 65536	`aurora dsql sequence cache`
Supported column data types	See docs	`aurora dsql supported data types`

When to verify: Before recommending batch sizes, connection pool settings, or schema designs where hitting a limit would cause failures; any time the exact number can affect user decision.

Fallback: If awsknowledge is unavailable, use the defaults above and flag that limits should be verified against DSQL documentation.

CLI Scripts Available

Bash scripts in scripts/ for cluster management (create, delete, list, cluster info), psql connection, and bulk data loading from local/s3 csv/tsv/parquet files. See scripts/README.md for usage and hook configuration.

Quick Start

Explore: Use readonly_query with information_schema to list tables. Use get_schema for table structure.
Query: Use readonly_query for SELECT queries. MUST include tenant_id in WHERE for multi-tenant apps. MUST build SQL with safe_query.build().
Schema changes: Use transact with one DDL per transaction. MUST batch DML under 3,000 rows. MUST use CREATE INDEX ASYNC in a separate call. Use dsql_lint to validate first.
Bulk load data: Use aurora-dsql-loader for CSV/TSV/Parquet. Load data-loading.md for details. Use --dry-run first.

Common Workflows

Workflow 1: Create Multi-Tenant Schema

Create main table with tenant_id column using transact
Create async index on tenant_id in separate transact call
Create composite indexes for common query patterns (separate transact calls)
Verify schema with get_schema

MUST include tenant_id in all tables
MUST use CREATE INDEX ASYNC exclusively
MUST issue each DDL in its own transact call: transact(["CREATE TABLE ..."])
MUST serialize arrays as JSONB; expand at query time with jsonb_array_elements_text(data)

Workflow 2: Safe Data Migration

MUST validate every DDL with dsql_lint(fix=true) before executing. DML does not require linting.

Validate DDL with dsql_lint(sql=..., fix=true) — handle diagnostics per dsql-lint.md
Add column: transact(["ALTER TABLE ... ADD COLUMN ..."])
Populate existing rows with UPDATE (batched under 3,000 rows)
Verify with readonly_query COUNT
Create index if needed: validate then transact(["CREATE INDEX ASYNC ..."])

MUST issue each ALTER TABLE in its own transact call — DSQL rejects multi-DDL transactions with multiple ddl statements not supported in a transaction
MUST add column with only name and type; apply DEFAULT via separate UPDATE
MUST batch updates under 3,000 rows in separate transact calls

Recovery: Resume failed batches by filtering WHERE new_column IS NULL.

Workflow 3: Bulk Data Loading

Use aurora-dsql-loader for CSV, TSV, or Parquet loads. MUST load data-loading.md before advising on throughput or diagnosing slow loads.

Validate with --dry-run first
Run with --manifest-dir on persistent storage (not /tmp — tmpfs on AL2023, lost on crash) and --header if file has a header row
On failure: resume with --resume-job-id; for duplicates use --on-conflict do-nothing
For large tables: create secondary indexes after load using CREATE INDEX ASYNC

Workflow 4: Application-Layer Referential Integrity

INSERT: MUST validate parent exists with readonly_query → throw error if not found → insert child with transact.

DELETE: MUST check dependents with readonly_query COUNT → return error if dependents exist → delete with transact if safe.

Workflow 5: Query with Tenant Isolation

MUST authorize the caller against the tenant — format validation does not establish authorization
MUST build SQL with safe_query.build() — use allow()/regex() for values (emits 'v'), ident() for table/column names (emits "v"). See input-validation.md
MUST include tenant_id in the WHERE clause; reject cross-tenant access at the application layer

Workflow 6: Set Up Scoped Database Roles

MUST load access-control.md for role setup, IAM mapping, and schema permissions.

Workflow 7: Table Recreation DDL Migration

Use the Table Recreation Pattern for ALTER COLUMN TYPE, DROP COLUMN, DROP CONSTRAINT, or MODIFY PRIMARY KEY. This is a destructive workflow that requires user confirmation at each step. Every generated DDL in the pattern (CREATE new, INSERT ... SELECT, DROP old, RENAME) MUST be validated with dsql_lint(sql=..., fix=true) before execution.

MUST load ddl-migrations/overview.md before attempting any of these operations.

Workflow 8: Validate and Migrate to DSQL

MUST load dsql-lint.md before running dsql_lint. Run dsql_lint(sql=source_sql, fix=true) to validate and auto-convert. For MySQL-origin SQL, MUST cross-check against mysql-migrations/type-mapping.md even when lint returns clean. On parse_error, fall back to manual conversion then re-lint.

Workflow 9: Query Plan Explainability

Explains why the DSQL optimizer chose a particular plan. Triggered by slow queries, high DPU, unexpected Full Scans, or plans the user doesn't understand. REQUIRES a structured Markdown diagnostic report as the deliverable.

MUST load all four reference files at Phase 0: query-plan/plan-interpretation.md, query-plan/catalog-queries.md, query-plan/guc-experiments.md, query-plan/report-format.md. The phase procedures (capture plan, gather evidence, experiment, produce report) are defined in those files.

Safety. Plan capture uses readonly_query exclusively. Rewrite DML to SELECT for plan capture. MUST NOT use transact --allow-writes for plan capture.

Workflow 10: Full PostgreSQL → DSQL Schema Migration

MUST load pg-migrations/type-mapping.md and pg-migrations/schema-objects.md. Run dsql_lint(fix=true) first for mechanical fixes, then apply semantic conversions from the pg-migrations references for unfixable diagnostics and patterns the linter cannot handle. Re-lint the final output before deploying.

Workflow 11: ORM Migration (Django/Hibernate/Rails)

Load orm-guides/overview.md for adapter names and framework-specific gotchas.

Error Scenarios

awsknowledge returns no results: Use the default limits in the table above and note that limits should be verified against DSQL documentation.
dsql_lint unavailable or timing out: See the Error Handling section of dsql-lint.md. Do not silently skip validation — inform the user and require explicit confirmation before proceeding with manual rules from development-guide.md.
OCC serialization error: Retry the transaction. If persistent, check for hot-key contention — see troubleshooting.md.
Transaction exceeds limits: Split into batches under 3,000 rows — see batched-migration.md.
Token expiration mid-operation: Generate a fresh IAM token — see authentication-guide.md. See troubleshooting.md for other issues.