hdbalembic

name: hdb:alembic description: Diagnose and fix Alembic migration issues — broken chains, multiple heads, failed downgrades, and CI gate failures

hdb:alembic

Diagnose and fix Alembic migration problems using invariants and patterns proven in production projects.

Usage

/hdb:alembic <problem description or task>

Description

Alembic migrations are a directed acyclic graph (DAG) where each revision points to its parent via down_revision. Most migration problems come from violating the invariants of this graph. This skill encodes those invariants, the diagnostic commands to detect violations, and the fixes for every common failure mode.

Instructions

When the user invokes /hdb:alembic <problem description>:

Phase 1: Diagnose

Gather state. Run these commands to understand the current migration state:

# Current heads (should be exactly 1 in most projects)
cd backend && uv run alembic heads

# Current database revision (what's applied)
cd backend && uv run alembic current

# Full revision history
cd backend && uv run alembic history --verbose

# Check for graph integrity issues
cd backend && uv run alembic check

Read the migration files. Look at the down_revision fields in every file under migrations/versions/. Build a mental model of the chain.
Identify the invariant violation. Match the symptoms to the invariants listed below.

Phase 2: Fix

Apply the appropriate fix from the catalog below. Every fix preserves existing data and maintains reversibility.

Verify the fix:

# Graph integrity
cd backend && uv run alembic heads          # must show exactly 1 head
cd backend && uv run alembic history         # must show linear chain (or explicit merges)

# Full round-trip (requires a test database)
cd backend && uv run alembic upgrade head
cd backend && uv run alembic downgrade base
cd backend && uv run alembic upgrade head

Alembic Invariants

These are the rules that must hold for a healthy migration graph. Every migration problem is a violation of one or more of these invariants.

Invariant 1: Single Head

Rule: There must be exactly one leaf node (head) in the migration DAG at any given time on a release branch.

Why: alembic upgrade head applies all migrations up to the head. If there are multiple heads, Alembic doesn't know which path to take and refuses to run.

Check:

cd backend && uv run alembic heads

If this shows more than one revision, you have multiple heads.

Common cause: Two PRs merged independently, each adding a migration with the same down_revision (both pointing to the previous head).

A → B → C (head on main)
         ↓
PR #1:   C → D₁ (down_revision = C)
PR #2:   C → D₂ (down_revision = C)

After both merge: C → D₁ (head)
                  C → D₂ (head)  ← TWO HEADS

Invariant 2: Linear Chain (or Explicit Merges)

Rule: Every revision has exactly one down_revision (a single parent), except:

The initial migration, which has down_revision = None
Merge migrations, which have down_revision = ("rev1", "rev2") — a tuple of two parents

Why: A linear chain guarantees deterministic ordering. Merges are the only way to join two branches.

Invariant 3: Referential Integrity

Rule: Every down_revision value must reference an existing revision ID in another migration file.

Why: A broken reference means Alembic cannot walk the graph. alembic upgrade head will fail with Can't locate revision identified by '<id>'.

Check:

cd backend && uv run alembic history --verbose 2>&1 | grep -i error

Invariant 4: No Orphans

Rule: Every revision must be reachable by walking from a head backward through down_revision links.

Why: Orphaned revisions are dead code — they exist on disk but never run. They confuse developers and can cause merge conflicts.

Invariant 5: Full Reversibility

Rule: Every upgrade() function must have a corresponding downgrade() that fully reverses it. The cycle upgrade head → downgrade base → upgrade head must succeed.

Why: Reversibility enables rollback in production and validates that no migration has hidden state dependencies.

Check:

cd backend && uv run alembic upgrade head
cd backend && uv run alembic downgrade base
cd backend && uv run alembic upgrade head

Invariant 6: Idempotent Operations

Rule: Migrations should check for existence before creating or dropping objects.

Why: Migrations may be re-run during development, testing, or disaster recovery. A migration that fails on re-run is fragile.

# Good: defensive
def upgrade() -> None:
    inspector = sa.inspect(op.get_bind())
    if not inspector.has_table("my_table"):
        op.create_table(...)

# Bad: crashes on re-run
def upgrade() -> None:
    op.create_table(...)  # raises if table already exists

Invariant 7: One Migration Per PR

Rule: Each PR should add at most one migration file.

Why: Multiple migrations in one PR increase merge conflict risk, complicate review, and make rollback harder. If you need multiple schema changes, combine them into one migration.

Fixing Multiple Heads

This is the most common Alembic problem. There are two approaches:

Fix A: Re-chain (preferred when one migration should come after the other)

If the two heads are logically sequential (D₂ should apply after D₁):

# In D₂'s migration file, change:
down_revision = "C"
# To:
down_revision = "D₁"

This creates: C → D₁ → D₂ (single head).

When to use: The migrations don't conflict and can be ordered arbitrarily, or there's a natural ordering.

Fix B: Merge migration (when both branches must be preserved)

Create a new merge migration:

cd backend && uv run alembic merge heads -m "merge D1 and D2"

This creates a new file with:

down_revision = ("D₁", "D₂")

def upgrade() -> None:
    pass  # no-op, just merges the branches

def downgrade() -> None:
    pass

Result: C → D₁ → M (head) C → D₂ ↗

When to use: The two branches have conflicting operations that can't be reordered, or you need to preserve the exact commit history.

Fix C: Squash (when the extra migration is unnecessary)

If one of the heads is empty, redundant, or can be folded into the other:

Delete the redundant migration file
Verify the remaining chain: uv run alembic heads (should show 1 head)

When to use: One migration was created by mistake or is a no-op.

Fixing Broken References

Symptom: Can't locate revision identified by 'abc123'

Cause: A migration's down_revision points to a revision ID that doesn't exist (deleted, renamed, or typo).

Fix:

Find the migration file with the broken reference
Identify what the down_revision should actually be (look at alembic history or the other files)
Correct the down_revision value
Verify: uv run alembic heads and uv run alembic history

Fixing Failed Downgrades

Symptom: alembic downgrade fails mid-way.

Common causes:

Missing downgrade() body — the function is pass or incomplete
Data dependency — the downgrade drops a column that a constraint references
Type conversion not reversible — e.g., Float → Integer loses precision

Fix pattern:

def downgrade() -> None:
    # Drop constraints BEFORE dropping columns
    op.drop_constraint("fk_my_table_other_id", "my_table", type_="foreignkey")
    # Drop indexes BEFORE dropping tables
    op.drop_index("ix_my_table_column", table_name="my_table")
    # Now safe to drop
    op.drop_column("my_table", "column_name")
    # Or drop entire table
    op.drop_table("my_table")

Order matters in downgrades:

Drop foreign key constraints first
Drop indexes
Drop columns
Drop tables

This is the reverse of the upgrade order (create table → add columns → add indexes → add constraints).

Fixing Data Migrations

Data migrations (INSERT, UPDATE, DELETE) require special care in downgrades.

Pattern:

def upgrade() -> None:
    # Add column
    op.add_column("tasks", sa.Column("board_id", sa.Uuid()))
    # Backfill from related table
    op.execute("""
        UPDATE tasks SET board_id = agents.board_id
        FROM agents WHERE tasks.agent_id = agents.id
    """)

def downgrade() -> None:
    # No need to undo the data — just drop the column
    op.drop_column("tasks", "board_id")

Rule: Downgrade of a data migration drops the column/table. You don't need to "un-backfill" data.

Creating a New Migration

cd backend && uv run alembic revision --autogenerate -m "add foo column to bars"

After autogeneration, always:

Read the generated file. Autogenerate guesses; it can miss renames (interprets as drop+add), miss data migrations entirely, and generate incorrect type mappings.
Verify down_revision points to the current single head.
Add defensive checks for idempotency where appropriate.
Write a complete downgrade() function.

Test the full round-trip:

uv run alembic upgrade head
uv run alembic downgrade -1    # undo just this migration
uv run alembic upgrade head    # re-apply

CI Gate: `backend-migration-check`

The CI pipeline validates migrations with this sequence:

# 1. Validate graph structure (no multiple heads, no orphans)
cd backend && uv run python scripts/check_migration_graph.py

# 2. Spin up ephemeral Postgres
docker run -d --rm -e POSTGRES_PASSWORD=postgres -e POSTGRES_DB=migration_ci \
    -p 55432:5432 postgres:16

# 3. Full round-trip test
AUTH_MODE=local \
LOCAL_AUTH_TOKEN=<long-token> \
DATABASE_URL=postgresql+psycopg://postgres:postgres@localhost:55432/migration_ci \
BASE_URL=http://localhost:8000 \
    uv run alembic upgrade head
    uv run alembic downgrade base
    uv run alembic upgrade head

# 4. Cleanup
docker rm -f <container>

If CI fails:

Read the error message — it tells you which step failed
If check_migration_graph.py fails → you have multiple heads or broken references
If upgrade head fails → a migration has a syntax error or incompatible operation
If downgrade base fails → a migration's downgrade() is incomplete
If the second upgrade head fails → a migration isn't idempotent
If it fails with ValidationError for missing settings → add the required env vars (AUTH_MODE, LOCAL_AUTH_TOKEN, DATABASE_URL, BASE_URL)

Environment Variables for Migration Commands

The application's Settings class validates required fields at import time. When running Alembic commands (which import app.core.config), you must set:

AUTH_MODE=local
LOCAL_AUTH_TOKEN=<at-least-32-chars>
DATABASE_URL=postgresql+psycopg://user:pass@host:port/dbname
BASE_URL=http://localhost:8000

If you get a Pydantic ValidationError when running alembic, you're missing one of these.

env.py: How Alembic Finds Models

The migrations/env.py file:

Imports app.models (which registers all SQLModel classes)
Uses SQLModel.metadata as the target metadata
Normalizes postgresql:// to postgresql+psycopg:// for the psycopg3 driver
Uses NullPool for migrations (no connection pooling needed)
Enables compare_type=True to detect column type changes

If autogenerate misses your model: Make sure it's imported in app/models/__init__.py.

Common Pitfalls

1. Autogenerate creates drop+add instead of rename

Alembic cannot detect renames. It sees "column A disappeared, column B appeared" and generates drop_column + add_column. Fix: Replace with op.alter_column(..., new_column_name=...).

2. Enum types on PostgreSQL

Adding or removing enum values requires special handling:

# Adding a value (PostgreSQL)
op.execute("ALTER TYPE myenum ADD VALUE 'new_value'")

# Downgrade: PostgreSQL cannot remove enum values!
# You must recreate the type and migrate data.

3. SQLite vs PostgreSQL differences in tests

If tests use SQLite (:memory:) but production uses PostgreSQL, some operations behave differently:

SQLite doesn't enforce foreign keys by default
SQLite doesn't support ALTER COLUMN
SQLite doesn't have enum types
Index and constraint naming differs

4. Alembic `batch_alter_table` for SQLite

When testing downgrades on SQLite, you may need batch mode:

with op.batch_alter_table("my_table") as batch_op:
    batch_op.drop_column("column_name")

PostgreSQL doesn't need this, but it's harmless.

5. The `alembic_version` table

Alembic tracks the current revision in an alembic_version table with a single row. If this gets corrupted:

# Check current stamp
uv run alembic current

# Force-stamp to a specific revision (does NOT run migrations)
uv run alembic stamp <revision_id>

# Force-stamp to head (marks all as applied without running)
uv run alembic stamp head

Warning: alembic stamp changes the recorded version without running any upgrade/downgrade logic. Use only when you know the schema already matches.

Diagnostic Cheat Sheet

Symptom	Command	Likely Cause
`Multiple heads`	`alembic heads`	Two migrations share the same `down_revision`
`Can't locate revision`	`alembic history -v`	Broken `down_revision` reference
`Target database is not up to date`	`alembic current`	Pending migrations not applied
`Table already exists`	Read the migration file	Missing idempotency check
`Column not found` during downgrade	Read the downgrade function	Wrong drop order (constraint before column)
`ValidationError` on any alembic command	Check env vars	Missing AUTH_MODE, DATABASE_URL, or BASE_URL
`No such revision`	`alembic history`	Migration file deleted but still referenced
Autogenerate produces empty migration	Check `app/models/__init__.py`	Model not imported into metadata

Guidelines

Always test the full round-trip (up → down → up) before committing a migration.
One migration per PR. Combine related schema changes into a single file.
Read autogenerated migrations. They are a starting point, not the final product.
Write defensive migrations. Check existence before creating. Future-you will thank present-you.
Downgrade order is reverse of upgrade order. Constraints → indexes → columns → tables.
Never edit a migration that has been applied to production. Create a new migration to fix issues.
Fix multiple heads immediately. They compound — every new PR adds another head if the base isn't fixed.
Respect CLAUDE.md. The project's instructions override everything in this skill.

name: hdb:alembic description: Diagnose and fix Alembic migration issues — broken chains, multiple heads, failed downgrades, and CI gate failures

hdb:alembic

Usage

Description

Instructions

Phase 1: Diagnose

Phase 2: Fix

Alembic Invariants

Invariant 1: Single Head

Invariant 2: Linear Chain (or Explicit Merges)

Invariant 3: Referential Integrity

Invariant 4: No Orphans

Invariant 5: Full Reversibility

Invariant 6: Idempotent Operations

Invariant 7: One Migration Per PR

Fixing Multiple Heads

Fix A: Re-chain (preferred when one migration should come after the other)

Fix B: Merge migration (when both branches must be preserved)

Fix C: Squash (when the extra migration is unnecessary)

Fixing Broken References

Fixing Failed Downgrades

Fixing Data Migrations

Creating a New Migration

CI Gate: backend-migration-check

Environment Variables for Migration Commands

env.py: How Alembic Finds Models

Common Pitfalls

1. Autogenerate creates drop+add instead of rename

2. Enum types on PostgreSQL

3. SQLite vs PostgreSQL differences in tests

4. Alembic batch_alter_table for SQLite

5. The alembic_version table

Diagnostic Cheat Sheet

Guidelines

CI Gate: `backend-migration-check`

4. Alembic `batch_alter_table` for SQLite

5. The `alembic_version` table