name: sqlmemory-review description: Analyze SQL Server memory pressure using buffer pool metrics, plan cache composition, memory grants, and memory clerk data. Applies 20 checks (O1–O20) covering Page Life Expectancy degradation, single-use plan cache bloat, RESOURCE_SEMAPHORE queue depth, memory grant timeouts, buffer pool concentration, ColumnStore and In-Memory OLTP footprint, OS memory pressure notifications, and server memory configuration. Use this skill when the server is paging, queries queue for memory grants, or PLE is low and dropping. Trigger when pasting output from sys.dm_os_memory_clerks, sys.dm_os_ring_buffers, sys.dm_exec_query_memory_grants, or PLE perf counters. triggers: - /sqlmemory-review - /memory-review - /ple-check
SQL Server Memory Review Skill
Purpose
Analyze SQL Server memory state and identify the root cause of memory pressure. Applies 20 checks (O1–O20) across four categories:
- O1–O5 — Buffer pool and Page Life Expectancy: detect low PLE, declining trends, NUMA node imbalance, and buffer pool concentration in a single database
- O6–O10 — Plan cache: single-use plan bloat, excessive compile counts, large individual plans, and high plan cache churn
- O11–O15 — Memory grants and RESOURCE_SEMAPHORE: detect grant queuing, grant timeouts, oversized grants, and Resource Governor misconfigurations
- O16–O20 — Memory clerks, OS pressure, and configuration: ColumnStore/In-Memory OLTP memory footprint, OS pressure notifications, stolen (non-buffer) memory dominance, Lock Pages in Memory misconfiguration, and Max Server Memory not explicitly set
Input
Accept any of:
- Output from
sys.dm_os_memory_clerkscapture query below (paste the result grid) - Output from
sys.dm_os_ring_buffersWHEREring_buffer_type = N'RING_BUFFER_RESOURCE_MONITOR'— memory pressure notifications (also acceptRING_BUFFER_OOMrecords for out-of-memory events) - Output from
sys.dm_exec_query_memory_grantsfor current grant queue state - PLE counter values from
sys.dm_os_performance_countersor SSMS Activity Monitor - Output from
sys.dm_os_sys_memoryfor OS-level memory state - Combined paste of two or more of the above; apply all applicable checks
- A natural language description of symptoms ("PLE is 200 and dropping, RESOURCE_SEMAPHORE is 15% of waits")
Recommended capture queries
-- 1. Memory clerks — top consumers (paste top 20+ rows)
SELECT TOP 20
type,
name,
memory_node_id,
pages_kb,
virtual_memory_reserved_kb,
virtual_memory_committed_kb,
awe_allocated_kb,
shared_memory_reserved_kb,
shared_memory_committed_kb
FROM sys.dm_os_memory_clerks
ORDER BY pages_kb DESC;
-- 2. Page Life Expectancy (PLE)
SELECT object_name, counter_name, instance_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Buffer Manager%'
AND counter_name = 'Page life expectancy';
-- 3. Plan cache single-use waste
SELECT
SUM(CASE WHEN usecounts = 1 THEN size_in_bytes ELSE 0 END) / 1048576 AS single_use_mb,
SUM(size_in_bytes) / 1048576 AS total_plan_cache_mb,
COUNT(*) AS total_plans,
SUM(CASE WHEN usecounts = 1 THEN 1 ELSE 0 END) AS single_use_plans
FROM sys.dm_exec_cached_plans
WHERE objtype IN ('Adhoc', 'Prepared');
-- 4. Memory grant queue (current requests waiting)
SELECT
session_id,
request_id,
scheduler_id,
grant_time,
requested_memory_kb,
granted_memory_kb,
required_memory_kb,
used_memory_kb,
max_used_memory_kb,
query_cost,
timeout_sec,
resource_semaphore_id,
wait_order,
is_next_candidate
FROM sys.dm_exec_query_memory_grants
ORDER BY wait_order;
-- 5. OS memory state
SELECT
total_physical_memory_kb,
available_physical_memory_kb,
total_page_file_kb,
available_page_file_kb,
system_memory_state_desc
FROM sys.dm_os_sys_memory;
Thresholds Reference
| Metric | Info | Warning | Critical |
|---|---|---|---|
| PLE (single NUMA node) | ≥ 300 s | < 300 s | < 60 s |
| PLE (multi-NUMA, per node) | ≥ 300 s | < 300 s | < 60 s |
| PLE decline rate (trend) | < 10 s/min | ≥ 10 s/min | ≥ 60 s/min |
| Single-use plan cache as % of total plan cache | < 30% | ≥ 30% | ≥ 60% |
| RESOURCE_SEMAPHORE wait (from sqlwait-review) | 0 sessions | 1–5 queued | > 5 queued |
| Memory grant timeout | 0 | Any | — |
| Stolen memory (non-buffer) as % of target | < 15% | ≥ 15% | ≥ 30% |
| Buffer pool: one DB as % of pool | < 60% | ≥ 60% | ≥ 80% |
| ColumnStore pool (COLUMNSTORE_OBJECT_POOL) | — | > 25% | > 50% |
| In-Memory OLTP (XTP) memory | — | > 25% | > 50% |
Buffer Pool and PLE Checks (O1–O5)
Run these first to determine if SQL Server is under immediate memory pressure.
O1 — Low Page Life Expectancy
- Trigger: Any
cntr_valueforPage life expectancyinsys.dm_os_performance_counters(objectBuffer ManagerorBuffer Node) < 300 seconds - Severity: Warning if < 300 s; Critical if < 60 s
- Fix: PLE below 300 means pages cycle out of the buffer pool faster than expected (the traditional rule of thumb: 300 s per 4 GB of buffer pool, so a 128 GB server warrants PLE > ~9,600 s). Investigate: (1) run O4 to check if one database dominates the pool; (2) check if a scan-heavy query is flooding the pool — run
/sqltrace-reviewto identify full-scan workloads; (3) check if total committed memory is near Max Server Memory — run O20 to verify the setting.
O2 — PLE Declining Trend
- Trigger: Multiple PLE snapshots show a monotonically decreasing pattern over ≥ 3 data points, with a rate of decline ≥ 10 s per minute
- Severity: Warning if 10–59 s/min decline; Critical if ≥ 60 s/min decline
- Fix: A sustained PLE drop means the buffer pool is being churned. Identify the source: correlate PLE drop timestamps with batch execution times in
/sqltrace-reviewor Query Store. Common causes: a newly deployed query causing a large table scan, a growing database whose working set no longer fits in memory, or a maintenance job (REBUILD, DBCC CHECKDB) running withoutWITH TABLOCK/WITH PHYSICAL_ONLYlimits.
O3 — NUMA Node PLE Imbalance
- Trigger: When multiple
Buffer NodePLE counters are present (multi-NUMA system), any node's PLE is < 50% of the maximum node PLE - Severity: Warning
- Fix: One NUMA node is under higher memory pressure than others. This suggests either: uneven memory configuration across NUMA nodes (check BIOS / hardware), or queries are bound to specific schedulers/CPUs that concentrate working sets. Check
sys.dm_os_nodesfor memory allocation per node. Consider soft-NUMA if physical NUMA is unavailable.
O4 — Buffer Pool Dominated by Single Database
- Trigger: A single database holds ≥ 60% of all pages in the buffer pool (use
sys.dm_os_buffer_descriptorsgrouped bydatabase_id) - Severity: Warning if ≥ 60%; Critical if ≥ 80%
- Fix: One database is evicting all others from the buffer pool. If this is the intended primary OLTP database, increase Max Server Memory (run O20). If it is unexpected (reporting DB, audit DB), consider: (1) partitioning workloads to separate SQL instances; (2) using Resource Governor with memory limits; (3) scheduling large scan queries during off-peak hours.
O5 — Stolen Memory Excessive
- Trigger: Sum of all non-buffer-pool clerk
pages_kb≥ 15% ofcommitted_target_kbfromsys.dm_os_sys_info - Severity: Warning if 15–29%; Critical if ≥ 30%
- Fix: Stolen (non-buffer) memory includes locks, SQL CLR, linked servers, plan cache, ColumnStore/XTP pools, and third-party extended procedures. Run O9 (locks clerk), O16 (ColumnStore), and O17 (XTP) checks. If CLR or linked servers are involved, review their configuration. Stolen memory leaves less room for the buffer pool, causing the PLE symptoms seen in O1.
Plan Cache Checks (O6–O10)
O6 — Single-Use Plan Cache Bloat
- Trigger: Single-use plans represent ≥ 30% of total plan cache size OR total single-use plan count > 10,000
- Severity: Warning if 30–59%; Critical if ≥ 60%
- Fix: Ad-hoc queries that run once generate plans that consume memory and are never reused. Enable
optimize for ad hoc workloads(sp_configure 'optimize for ad hoc workloads', 1) to store only a stub on first execution, saving the full plan allocation until the query runs a second time. Longer term, parameterize application queries or usesp_executesqlwith parameters.
O7 — High Plan Cache Compilation Rate
- Trigger:
SQL Compilations/secorSQL Re-Compilations/secfromsys.dm_os_performance_counters> 500/sec - Severity: Warning if 500–999/sec; Critical if ≥ 1,000/sec
- Fix: Frequent compilations burn CPU and fragment the plan cache. Causes: unparameterized ad-hoc SQL (fix: O6), use of
WITH RECOMPILEhints on hot paths, DDL/statistics changes triggering recompiles, orSEToption changes per connection. Run/tsql-reviewon the top offending query text to check for recompile hints.
O8 — Large Individual Plan in Cache
- Trigger: Any single plan in
sys.dm_exec_cached_planshassize_in_bytes> 10 MB - Severity: Warning if 10–49 MB; Critical if ≥ 50 MB
- Fix: Large plans typically come from queries with many joins, OR-heavy predicates, or dynamic SQL generating enormous statement trees. A 50 MB plan indicates a query that is too complex to compile efficiently. Review the query for: missing indexes that force large merge trees, OR predicates that could be rewritten as UNION, or dynamically constructed WHERE clauses that generate cartesian predicate combinations.
O9 — Lock Memory Excessive
- Trigger:
OBJECTSTORE_LOCK_MANAGERclerk insys.dm_os_memory_clerkshaspages_kb> 100,000 KB (100 MB) - Severity: Warning if 100–499 MB; Critical if ≥ 500 MB
- Fix: SQL Server allocates a lock structure per row/page/object lock held. Excessive lock memory means: (1) row-level locking on very large result sets — review transactions holding locks and add appropriate WHERE clauses; (2) lock escalation is disabled or failing — check
sys.dm_tran_locksfor escalation events; (3) long-running transactions holding many locks — pair with/sqlwait-reviewfor LCK_ wait analysis.
O10 — Plan Cache Churn (Low Hit Rate)
- Trigger: Plan cache hit rate (=
Cache Hit Ratiocounter from thePlan Cacheperformance object) < 90% over a sustained period - Severity: Warning if 80–89%; Critical if < 80%
- Fix: A cache hit rate below 90% means SQL Server is recompiling more than 10% of queries. Pair with O6 (single-use bloat) and O7 (compile rate) to determine whether the cause is ad-hoc queries or forced recompiles. Enabling
optimize for ad hoc workloadsis the first mitigation; review for parameterization improvements next.
Memory Grant Checks (O11–O15)
O11 — Memory Grant Queue Depth
- Trigger:
sys.dm_exec_query_memory_grantshas any row wheregrant_time IS NULL(query waiting for a grant) — i.e.,wait_order > 0 - Severity: Warning if 1–5 sessions waiting; Critical if > 5 sessions waiting
- Fix: Queries are queuing for memory grants, meaning the server has insufficient memory for parallel sort, hash join, and hash aggregate operations. Immediate: identify the queries holding large grants (column
granted_memory_kb) and considerOPTION (MAXDOP 1)orOPTION (MAX_GRANT_PERCENT = 10)as emergency mitigations. Long-term: add indexes to eliminate hash operations; increase Max Server Memory if available physical memory allows (O20).
O12 — Memory Grant Timeout
- Trigger: Error 8645 ("A timeout occurred while waiting for memory resources to execute the query in the resource pool") OR error 701 ("There is insufficient system memory in resource pool 'default' to run this query") in the error log. Note:
timeout_secinsys.dm_exec_query_memory_grantsis the configured time-out before the query gives up its grant request, not an indicator that a timeout occurred - Severity: Critical
- Fix: Queries are failing due to memory exhaustion. Immediate actions: (1) kill the sessions holding the largest grants; (2) set Resource Governor minimum memory grant percent lower; (3) add
OPTION (MIN_GRANT_PERCENT = 1)to the offending query. Root cause: the query's estimated row count is drastically wrong, causing an oversized grant estimate — run/sqlplan-reviewto check N21 (cardinality estimate accuracy) and/sqlstats-reviewfor stale statistics.
O13 — Oversized Memory Grant
- Trigger: Any single session has
granted_memory_kb> 25% ofmax_memory_grantfor the resource pool, ORmax_used_memory_kb / granted_memory_kb< 0.25 (granted 4× more than actually used) - Severity: Warning
- Fix: The query received a large grant but used a fraction of it, blocking other queries from getting grants (see O11). The grant overestimate almost always traces to stale or low-sample statistics on join input tables. Run
/sqlplan-reviewon this query and look for N21 (row count estimate mismatch). Fix statistics withUPDATE STATISTICS ... WITH FULLSCAN.
O14 — Resource Governor Memory Pool Imbalance
- Trigger:
sys.dm_resource_governor_resource_poolsshows a user pool withmax_memory_percent= 100 ormin_memory_percent= 0 when multiple pools are in use - Severity: Warning
- Fix: Resource Governor is configured but workload pools are not memory-bounded. Set
min_memory_percentandmax_memory_percenton high-priority pools (OLTP) to protect them from being crowded out by reporting workloads. UseALTER RESOURCE POOL+ALTER RESOURCE GOVERNOR RECONFIGURE.
O15 — Buffer Pool Extension (BPE) Active on SSDs
- Trigger:
sys.dm_os_buffer_pool_extension_configurationshowsstate = 5(BUFFER POOL EXTENSION ENABLED) - Severity: Info
- Fix: Buffer Pool Extension is enabled. Verify the BPE file is on a fast NVMe/SSD — placing it on a spinner negates any benefit and can worsen latency. Note: BPE is available in Enterprise and Standard editions only (introduced SQL Server 2014); if the workload outgrows it, additional physical RAM outperforms BPE.
Memory Clerk, OS Pressure, and Configuration Checks (O16–O20)
O16 — ColumnStore Buffer Pool Pressure
- Trigger:
CACHESTORE_COLUMNSTOREOBJECTPOOLclerk insys.dm_os_memory_clerkshaspages_kb> 25% ofcommitted_target_kb - Severity: Warning if 25–49%; Critical if ≥ 50%
- Fix: Columnstore index structures (column segments and dictionaries cached for query execution) are consuming a large share of SQL Server memory. This is expected on a ColumnStore-heavy workload but reduces buffer pool space. Mitigations: increase total RAM; if the server runs both OLTP and ColumnStore workloads, separate them with Resource Governor; ensure ColumnStore indexes only exist on tables that benefit from them — run
/tsql-reviewto check for unnecessary column store usage (T-series checks for ColumnStore on low-cardinality tables).
O17 — In-Memory OLTP (XTP) Memory Pressure
- Trigger: The
MEMORYCLERK_XTPclerk haspages_kb> 25% ofcommitted_target_kb - Severity: Warning if 25–49%; Critical if ≥ 50%
- Fix: In-Memory OLTP memory is unbounded by default — bind the database to a Resource Governor resource pool with
MIN_MEMORY_PERCENT/MAX_MEMORY_PERCENTset, usingsp_xtp_bind_db_resource_pool(database must go offline/online for the binding to take effect). If XTP memory is growing, check:sys.dm_db_xtp_memory_consumersfor the largest consumers; whether durable in-memory tables have an appropriate checkpoint file footprint; whether non-durable (SCHEMA_ONLY) in-memory tables are accumulating rows that are never cleaned up.
O18 — OS Memory Pressure Notifications
- Trigger:
sys.dm_os_ring_buffers WHERE ring_buffer_type = N'RING_BUFFER_RESOURCE_MONITOR'contains records withRESOURCE_MEMPHYSICAL_LOWorRESOURCE_MEMVIRTUAL_LOWnotifications in the last 24 hours - Severity: Warning for RESOURCE_MEMPHYSICAL_LOW; Critical for RESOURCE_MEMVIRTUAL_LOW
- Fix: Windows is signaling SQL Server that physical or virtual memory is critically low. SQL Server will shrink its buffer pool in response, causing PLE drops (O1). Root causes: (1) Max Server Memory set too high — another process on the server needs memory; (2) physical RAM is genuinely insufficient for the workload; (3) memory leak in a third-party component. Review
sys.dm_os_memory_clerksfor unexpected non-buffer growth. Set Max Server Memory to leave at least 4–10% or 4 GB (whichever is larger) for the OS.
O19 — Lock Pages in Memory (LPIM) Misconfiguration
- Trigger: Server RAM > 32 GB AND
sys.dm_os_sys_info.sql_memory_model_desc='CONVENTIONAL'(LPIM not enabled on a large-memory server) - Severity: Warning
- Fix: Without Lock Pages in Memory, Windows can page out the SQL Server buffer pool to disk even when
sys.dm_os_sys_memoryshows available memory. On servers with > 32 GB RAM running under memory pressure, enable LPIM via Group Policy (Lock Pages in Memoryuser right for the SQL Server service account). LPIM prevents OS paging but does not protect against OOM — Max Server Memory must still be set correctly (O20).
O20 — Max Server Memory Not Explicitly Set
- Trigger:
sys.configurationsWHEREname = 'max server memory (MB)'hasvalue_in_use = 2147483647(the default unlimited value) - Severity: Critical
- Fix: Without an explicit Max Server Memory limit, SQL Server can consume all RAM, starving the OS, other services, and connection overhead. Set Max Server Memory using the formula:
Total RAM − OS overhead (4–10%) − SSAS/SSRS/SSIS if co-located − 1 GB per 4 GB RAM for non-buffer memory. Usesp_configure 'max server memory (MB)', <calculated value>; RECONFIGURE. Then verify LPIM is enabled (O19) on large-memory servers.
Output Format
Present findings in this order:
- Memory Pressure Summary — one sentence: is the server memory-pressured (PLE, grants queuing, OS notifications)?
- Findings table — one row per triggered check:
| Check | Severity | Metric | Finding | Fix |
|---|---|---|---|---|
| O1 | Critical | PLE = 45 s | Well below 300 s threshold; dropping 3 s/min | Increase Max Server Memory or identify scan-heavy queries |
- Root cause hypothesis — what is most likely causing the pressure? List top 1–3 ranked by evidence weight.
- Recommended next steps — ordered action list with companion skill references.
Analyzed by:
sqlmemory-review(O1–O20)
Companion Skills
/sqlwait-review— RESOURCE_SEMAPHORE (V11), CMEMTHREAD (V12), and PAGEIOLATCH (V1–V3) waits are the live signal of memory pressure that this skill explains at the configuration level/sqltrace-review— identify which workload queries are flooding the buffer pool or acquiring large memory grants/sqlplan-review— N21 (row count estimate mismatch) and N15/N16 (sort/hash spills) explain why grants are oversized/sqlquerystore-review— Q23 (memory grant regression) and Q24 (spill regression) link plan changes to memory pressure events/sqldiskio-review— if PLE is low andsys.dm_io_virtual_file_statsshows high latency, the working set is spilling to disk; disk I/O is the symptom of memory pressure
VERSION_COMPATIBILITY
See skills/VERSION_COMPATIBILITY.md for the full compatibility matrix.
| Check | 2008 R2 | 2012 | 2014 | 2016 | 2017 | 2019 | 2022 | Azure SQL |
|---|---|---|---|---|---|---|---|---|
| O1 PLE | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| O2 PLE Trend | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| O3 NUMA PLE | — | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | — |
| O4 DB concentration | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| O5 Stolen memory | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Partial |
| O6 Single-use plan | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| O7 Compile rate | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| O8 Large plan | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| O9 Lock clerk | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| O10 Plan hit rate | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| O11 Grant queue | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| O12 Grant timeout | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| O13 Oversized grant | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| O14 RG pool | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| O15 BPE | — | — | ✓ | ✓ | ✓ | ✓ | ✓ | — |
| O16 ColumnStore | — | — | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| O17 XTP/In-Memory | — | — | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| O18 OS pressure | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | — |
| O19 LPIM | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | N/A |
| O20 Max Server Mem | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | N/A |