name: altinity-expert-clickhouse-mutations description: Track and diagnose ClickHouse ALTER UPDATE, ALTER DELETE, and other mutation operations. Use for stuck mutations and mutation performance issues.
Mutation Tracking and Analysis
Track and diagnose ALTER UPDATE, ALTER DELETE, and other mutation operations.
Quick Diagnostics
1. Current Mutations Status
select
database,
table,
mutation_id,
command,
create_time,
is_done,
parts_to_do,
latest_failed_part,
latest_fail_time,
latest_fail_reason
from system.mutations
where not is_done
order by create_time
2. Mutation Summary by Table
select
database,
table,
countIf(not is_done) as pending,
countIf(is_done) as completed,
countIf(latest_fail_reason != '') as failed,
min(create_time) as oldest_pending
from system.mutations
group by database, table
having pending > 0
order by pending desc
3. Stuck Mutations Detection
with
now() as current_time,
dateDiff('minute', create_time, current_time) as age_minutes
select
database,
table,
mutation_id,
substring(command, 1, 60) as command,
create_time,
age_minutes,
parts_to_do,
multiIf(age_minutes > 1440, 'Critical', age_minutes > 360, 'Major', age_minutes > 60, 'Moderate', 'OK') as severity,
latest_fail_reason
from system.mutations
where not is_done
and age_minutes > 30
order by create_time
Mutation History
Recent Completed Mutations
select
event_time,
database,
table,
mutation_id,
duration_ms,
formatReadableSize(size_in_bytes) as size,
rows,
formatReadableSize(peak_memory_usage) as peak_memory
from system.part_log
where event_type = 'MutatePart'
and event_date >= today() - 1
order by event_time desc
limit 30
Mutation Performance by Table
select
database,
table,
count() as mutations,
round(avg(duration_ms)) as avg_ms,
round(max(duration_ms)) as max_ms,
formatReadableSize(sum(size_in_bytes)) as total_size,
sum(rows) as total_rows
from system.part_log
where event_type = 'MutatePart'
and event_date >= today() - 7
group by database, table
order by count() desc
limit 30
Failed Mutations in Part Log
select
event_time,
database,
table,
part_name,
duration_ms,
error,
exception
from system.part_log
where event_type = 'MutatePart'
and error != 0
and event_date >= today() - 7
order by event_time desc
limit 30
Mutation Impact Analysis
Mutations Running Now
select
database,
table,
elapsed,
progress,
is_mutation,
num_parts,
formatReadableSize(total_size_bytes_compressed) as size,
formatReadableSize(memory_usage) as memory,
result_part_name
from system.merges
where is_mutation = 1
order by elapsed desc
Parts Awaiting Mutation
select
m.database,
m.table,
m.mutation_id,
m.parts_to_do,
m.command,
(select count() from system.parts where database = m.database and table = m.table and active) as total_active_parts,
round(100.0 * m.parts_to_do / total_active_parts, 1) as pct_remaining
from system.mutations m
where not m.is_done
order by m.parts_to_do desc
Problem Investigation
Why Is Mutation Stuck?
Check for competing operations:
-- Active merges on same table
select
database,
table,
is_mutation,
elapsed,
progress,
num_parts
from system.merges
where database = '{database}' and table = '{table}'
-- Replication queue for same table
select
type,
create_time,
is_currently_executing,
num_tries,
last_exception
from system.replication_queue
where database = '{database}' and table = '{table}'
order by create_time
limit 20
-- Part mutations status
select
name,
active,
mutation_version,
modification_time
from system.parts
where database = '{database}' and table = '{table}'
order by mutation_version desc
limit 30
Mutation vs Merge Competition
-- Check background pool saturation
select
metric,
value
from system.metrics
where metric like 'Background%'
Mutations and merges share the same pool. If pool is saturated, mutations wait.
Mutation Rate Analysis
Mutation Creation Rate
select
toStartOfHour(create_time) as hour,
count() as mutations_created,
countIf(is_done) as completed,
countIf(not is_done) as pending
from system.mutations
where create_time > now() - interval 7 day
group by hour
order by hour desc
Red flag: >1 mutation per 5 minutes sustained = mutation overload.
Mutation Types
select
multiIf(
command ilike '%DELETE%', 'DELETE',
command ilike '%UPDATE%', 'UPDATE',
command ilike '%MATERIALIZE%', 'MATERIALIZE',
command ilike '%DROP COLUMN%', 'DROP COLUMN',
command ilike '%ADD COLUMN%', 'ADD COLUMN',
command ilike '%MODIFY%', 'MODIFY',
'OTHER'
) as mutation_type,
count() as total,
countIf(not is_done) as pending,
countIf(latest_fail_reason != '') as failed
from system.mutations
group by mutation_type
order by total desc
Canceling Mutations
To kill a stuck mutation:
-- Find mutation_id first
select mutation_id, command from system.mutations
where database = '{database}' and table = '{table}' and not is_done;
-- Then kill it
-- KILL MUTATION WHERE database = '{database}' AND table = '{table}' AND mutation_id = '{mutation_id}';
Warning: Killed mutations leave table in partially-mutated state.
Best Practices
Mutation Anti-Patterns
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Frequent small UPDATEs | Creates many mutations | Batch updates together |
| DELETE without WHERE | Full table rewrite | Use TTL instead |
| UPDATE on high-cardinality column | Slow, lots of IO | Restructure data model |
| Many concurrent mutations | Queue builds up | Serialize mutations |
Monitoring Mutations
Set alerts for:
- Mutations pending > 10
- Mutation age > 1 hour
latest_fail_reasonnot empty
Ad-Hoc Query Guidelines
Required Safeguards
-- For part_log
where event_date >= today() - 7
limit 100
Key Tables
system.mutations- Current mutation statesystem.part_log(event_type = 'MutatePart') - Mutation historysystem.merges(is_mutation = 1) - Running mutationssystem.replication_queue- Pending replicated mutations
Cross-Module Triggers
| Finding | Load Module | Reason |
|---|---|---|
| Mutation blocked by merge | altinity-expert-clickhouse-merges |
Merge backlog |
| Mutation OOM | altinity-expert-clickhouse-memory |
Memory limits |
| Mutation slow due to disk | altinity-expert-clickhouse-storage |
IO bottleneck |
| Replicated mutation stuck | altinity-expert-clickhouse-replication |
Replication issues |
Settings Reference
| Setting | Notes |
|---|---|
mutations_sync |
0=async, 1=wait current replica, 2=wait all |
max_mutations_in_flight |
Max concurrent mutations |
number_of_mutations_to_delay |
Delay INSERTs threshold |
number_of_mutations_to_throw |
Throw error threshold |