name: polardb-daily-check description: Comprehensive daily health check for PolarDB for PostgreSQL databases, including LogIndex status, HTAP/MPP monitoring, shared storage performance, and high availability checks license: MIT compatibility: opencode
PolarDB Daily Check Agent
This skill guides the agent in conducting a thorough daily health check of a PolarDB for PostgreSQL database instance. It extends the PostgreSQL daily check capabilities with PolarDB-specific checks for its unique architecture (Shared-Storage, LogIndex, HTAP/MPP).
Purpose
The primary goal of this skill is to empower the agent to proactively monitor the health and performance of PolarDB for PostgreSQL, leveraging its unique features like storage-compute separation, LogIndex, and HTAP capabilities. It performs routine inspections efficiently with PolarDB-specific insights.
Core Capabilities
The agent performs checks across several key areas:
- PolarDB Core Health: Node type verification, LogIndex status, PFS storage usage
- Availability & Health: Standard PostgreSQL health checks (invalid indexes, XID wraparound, blocking locks)
- Performance & Activity: Active sessions, long-running queries, cache efficiency, MPP parallel query performance
- HTAP & MPP: Parallel query statistics, worker status, coordinator coordination
- Storage & I/O: Shared storage performance, PolarFS usage, buffer pool status
- Replication & Consistency: Primary-readonly node sync status, LogIndex replay lag
PolarDB-Specific Checks
LogIndex Architecture
PolarDB uses LogIndex to maintain page replay history for read-only nodes, solving the "past page" problem. Key checks include:
- LogIndex replay status and latency
- WAL metadata consistency between primary and read-only nodes
- Lazy and Parallel replay efficiency
HTAP & MPP Architecture
PolarDB supports HTAP via distributed MPP execution engine:
- PxScan and Shuffle operators for parallel execution
- Coordinator-Worker coordination for skew elimination
- Serverless elastic scaling of compute nodes
polar_enable_pxparameter controls MPP functionality
Storage-Compute Separation
In this architecture:
- Storage: Shared PolarFS, scales independently
- Compute: Multiple read-write and read-only nodes, stateless
- Benefits: No data replication during scaling, cost-effective read replicas
Workflow
When activated, this skill executes a predefined sequence of checks:
- First runs PolarDB-specific checks (node type, LogIndex, PFS)
- Then runs standard PostgreSQL compatibility checks
- Generates a comprehensive Markdown report with PolarDB-specific recommendations
Available Skills
Each item below represents a callable skill, returning structured JSON output.
1. PolarDB Core Health
Skill: get_polar_node_type
- Description: Checks the current PolarDB node type and role using
polar_node_type()function. - Usage:
./run_polardb_check.sh get_polar_node_type - Expected Output:
{ "skill": "get_polar_node_type", "status": "success", "data": [ { "node_type": "Primary", "is_writable": true, "polar_version": "2.0" } ] } - Analysis: Reports the node role (Primary/ReadOnly) and write capability.
Skill: get_logindex_status
- Description: Monitors LogIndex replay status and lag between primary and read-only nodes.
- Usage:
./run_polardb_check.sh get_logindex_status - Expected Output:
{ "skill": "get_logindex_status", "status": "success", "data": [ { "node_role": "ReadOnly", "replay_lag_mb": 15, "replay_lag_seconds": 2, "pending_wal_count": 50 } ] } - Analysis: WARNING if replay lag >100MB or >10 seconds. CRITICAL if lag >1GB.
Skill: get_pfs_usage
- Description: Reports Polar File System (PFS) storage usage using
pfs_du_with_depth()andpfs_info(). - Usage:
./run_polardb_check.sh get_pfs_usage - Expected Output:
{ "skill": "get_pfs_usage", "status": "success", "data": [ { "total_size_gb": 500, "used_size_gb": 350, "used_percentage": 70, "file_count": 15000 } ] } - Analysis: WARNING if usage >80%, CRITICAL if >90%.
Skill: get_polar_process_status
- Description: Detailed process information using
polar_stat_process()- PID, wait events, I/O stats, CPU, RSS. - Usage:
./run_polardb_check.sh get_polar_process_status - Expected Output:
{ "skill": "get_polar_process_status", "status": "success", "data": [ { "pid": 1234, "state": "active", "wait_event": "ClientRead", "cpu_user": 5.2, "cpu_system": 1.5, "rss_mb": 2048, "shared_storage_read_iops": 150, "shared_storage_read_throughput_mbps": 50, "shared_storage_read_latency_ms": 0.5 } ] } - Analysis: Reports detailed resource utilization per process.
Skill: get_polar_activity
- Description: Enhanced activity view combining
pg_stat_activityandpolar_stat_process(). - Usage:
./run_polardb_check.sh get_polar_activity - Expected Output:
{ "skill": "get_polar_activity", "status": "success", "data": [ { "pid": 1234, "usename": "app_user", "state": "active", "query": "SELECT * FROM orders WHERE...", "duration": "00:00:05", "wait_event": "IO polarfs", "shared_io": true } ] } - Analysis: Enhanced monitoring with PolarDB-specific wait events.
2. HTAP & MPP Checks
Skill: get_px_workers_status
- Description: Checks MPP parallel query worker status and configuration.
- Usage:
./run_polardb_check.sh get_px_workers_status - Expected Output:
{ "skill": "get_px_workers_status", "status": "success", "data": [ { "polar_enable_px": true, "polar_px_max_workers_number": 64, "polar_px_dop_per_node": 8, "active_px_queries": 3, "total_px_workers": 24 } ] } - Analysis: Reports MPP configuration and active parallel queries.
Skill: get_px_query_stats
- Description: Statistics on MPP parallel query execution and performance.
- Usage:
./run_polardb_check.sh get_px_query_stats - Expected Output:
{ "skill": "get_px_query_stats", "status": "success", "data": [ { "query_type": "PxScan", "execution_count": 1500, "avg_execution_time_ms": 45, "total_rows_scanned": 5000000000 } ] } - Analysis: Monitors HTAP workload performance.
Skill: get_px_nodes
- Description: Lists nodes participating in MPP execution via
polar_px_nodes. - Usage:
./run_polardb_check.sh get_px_nodes - Expected Output:
{ "skill": "get_px_nodes", "status": "success", "data": [ {"node_id": 1, "node_name": "primary", "is_coordinator": true}, {"node_id": 2, "node_name": "readonly1", "is_worker": true}, {"node_id": 3, "node_name": "readonly2", "is_worker": true} ] } - Analysis: Shows MPP cluster topology.
Skill: get_buffer_pool_affinity
- Description: Checks Buffer Pool affinity settings and effectiveness for MPP operations.
- Usage:
./run_polardb_check.sh get_buffer_pool_affinity - Expected Output:
{ "skill": "get_buffer_pool_affinity", "status": "success", "data": [ { "buffer_hit_ratio": 99.5, "local_buffer_usage": 80, "shared_buffer_usage": 75 } ] } - Analysis: Reports buffer efficiency for shared storage access.
3. Storage & I/O Performance
Skill: get_shared_storage_stats
- Description: Shared storage I/O performance metrics (IOPS, throughput, latency).
- Usage:
./run_polardb_check.sh get_shared_storage_stats - Expected Output:
{ "skill": "get_shared_storage_stats", "status": "success", "data": [ { "read_iops": 5000, "write_iops": 2000, "read_throughput_mbps": 200, "write_throughput_mbps": 100, "read_latency_ms": 0.3, "write_latency_ms": 0.5 } ] } - Analysis: WARNING if latency >5ms, CRITICAL if >10ms.
Skill: get_polar_io_stats
- Description: Detailed I/O statistics from PolarDB-specific monitoring.
- Usage:
./run_polardb_check.sh get_polar_io_stats - Expected Output:
{ "skill": "get_polar_io_stats", "status": "success", "data": [ { "polarfs_read_count": 1000000, "polarfs_write_count": 500000, "polarfs_read_bytes": "50 GB", "polarfs_write_bytes": "25 GB", "polarfs_iops": 8000, "polarfs_throughput_mbps": 350 } ] } - Analysis: Detailed PolarFS I/O patterns.
Skill: get_dirty_page_status
- Description: Dirty page flush status and coordination with read-only nodes.
- Usage:
./run_polardb_check.sh get_dirty_page_status - Expected Output:
{ "skill": "get_dirty_page_status", "status": "success", "data": [ { "dirty_pages_count": 5000, "dirty_bytes_mb": 200, "flush_rate_pages_per_sec": 1000, "oldest_modified_age": 120 } ] } - Analysis: Monitors primary node dirty page flushing coordination.
4. High Availability & Consistency
Skill: get_primary_readonly_sync
- Description: Synchronization status between primary and read-only nodes.
- Usage:
./run_polardb_check.sh get_primary_readonly_sync - Expected Output:
{ "skill": "get_primary_readonly_sync", "status": "success", "data": [ { "primary_node": "node_primary", "readonly_nodes": [ { "node_name": "node_readonly1", "sync_status": "streaming", "lag_bytes": 5242880, "lag_seconds": 1.5 } ] } ] } - Analysis: CRITICAL if any node not in streaming status.
Skill: get_online_promote_status
- Description: Checks readiness for online promotion capability.
- Usage:
./run_polardb_check.sh get_online_promote_status - Expected Output:
{ "skill": "get_online_promote_status", "status": "success", "data": [ { "promote_ready": true, "last_promote_time": "2024-01-15 10:30:00", "promote_in_progress": false } ] } - Analysis: Reports online promotion readiness.
Skill: get_recovery_progress
- Description: Recovery progress for read-only nodes (if applicable).
- Usage:
./run_polardb_check.sh get_recovery_progress - Expected Output:
{ "skill": "get_recovery_progress", "status": "success", "data": [ { "node_name": "node_readonly1", "received_lsn": "0/ABCDEF00", "replayed_lsn": "0/ABCDE800", "replay_lag_bytes": 256, "is_applying": true } ] } - Analysis: Monitors WAL replay progress.
5. PostgreSQL Compatibility Checks
The following skills are identical to the standard PostgreSQL daily check and work seamlessly with PolarDB.
Availability & Health
| Skill | Description |
|---|---|
get_invalid_indexes |
Check for corrupted indexes |
get_xid_wraparound_risk |
Monitor transaction ID wraparound |
get_blocking_locks |
Detect lock contention |
get_deadlock_detection |
Check for past deadlocks |
get_critical_settings |
Review critical parameters |
Session & Connection Monitoring
| Skill | Description |
|---|---|
get_long_running_queries |
Find long-running queries |
get_idle_in_transaction_sessions |
Find idle-in-transaction sessions |
get_long_running_transactions |
Find long transactions |
get_connection_usage |
Check connection pool usage |
get_lock_waiters |
Detailed lock wait analysis |
get_wait_events |
Current wait event analysis |
Performance & Activity
| Skill | Description |
|---|---|
get_cache_hit_rate |
Cache efficiency metric |
get_rollback_rate |
Transaction rollback ratio |
get_top_sql_by_time |
Most expensive queries |
get_table_hotspots |
Most active tables |
get_bgwriter_stats |
Background writer metrics |
get_wal_statistics |
WAL activity statistics |
Replication & Archiving
| Skill | Description |
|---|---|
get_replication_slots |
Replication slot status |
get_replication_status |
Streaming replica lag |
get_wal_archiver_status |
WAL archiving health |
Maintenance & Storage
| Skill | Description |
|---|---|
get_autovacuum_status |
Active vacuum workers |
get_table_bloat |
Table space bloat |
get_index_bloat |
Index space bloat |
get_top_objects_by_size |
Largest objects |
get_stale_statistics |
Outdated table stats |
get_database_sizes |
Database sizes |
Freeze & Wraparound Protection
| Skill | Description |
|---|---|
get_freeze_prediction |
Predict freeze thresholds |
Environment Setup
Requirements
psqlcommand-line tool (PostgreSQL client, version 10+)- Python 3.6+ (uses only standard library, no additional packages)
polar_monitorextension installed and available- Read access to PolarDB-specific functions
Configuration
Configure database connection in assets/db_config.env:
export PGHOST="127.0.0.1"
export PGPORT="5432"
export PGUSER="digoal"
export PGPASSWORD="your_password"
export PGDATABASE="postgres"
Required Extensions
Ensure the following extensions are available:
polar_monitor- Core PolarDB monitoring functionspg_stat_statements- SQL performance statistics (optional)pg_buffercache- Buffer pool analysis (optional)
Installation Check
Verify PolarDB extensions are installed:
SELECT * FROM pg_extension WHERE extname LIKE 'polar%';
SELECT * FROM pg_extension WHERE extname = 'pg_stat_statements';
Usage
Run Full Health Check
cd polardb-daily-check/scripts
# Run full health check (generates polar_daily_health_report.md)
python3 polardb_agent.py
# Or run via the bash wrapper
./run_polardb_check.sh full_check
Run Individual PolarDB-Specific Checks
./run_polardb_check.sh get_polar_node_type
./run_polardb_check.sh get_logindex_status
./run_polardb_check.sh get_pfs_usage
./run_polardb_check.sh get_px_workers_status
./run_polardb_check.sh get_shared_storage_stats
Run Standard PostgreSQL Compatibility Checks
./run_polardb_check.sh get_long_running_queries
./run_polardb_check.sh get_table_bloat
./run_polardb_check.sh get_replication_status
./run_polardb_check.sh get_cache_hit_rate
Output
PolarDB Daily Health Report
The agent generates polar_daily_health_report.md with:
PolarDB-Specific Status
- Node type and role
- LogIndex replay lag
- PFS storage usage
- MPP/HTAP status
Overall Health Status
- OK / WARNING / CRITICAL
Detailed Findings
- PolarDB-specific issues
- Standard PostgreSQL issues
- Performance recommendations
Actionable Recommendations
- LogIndex optimization
- MPP tuning suggestions
- Storage capacity planning
- High availability readiness
Sample Report Structure
# PolarDB Daily Health Report
Generated: 2024-01-15 10:00:00
## PolarDB Status
- Node Type: Primary (Writable)
- LogIndex Lag: 5MB (2s) ✅ OK
- PFS Usage: 70% ✅ OK
- MPP Enabled: true ✅ OK
## Overall Status: ✅ HEALTHY
### Critical Issues
None
### Warnings
- 3 long-running queries detected
- Table 'orders' has 15% bloat
### Recommendations
1. Consider running VACUUM FULL on 'orders' table
2. Review slow queries in pg_stat_statements
PolarDB-Specific Recommendations
LogIndex Optimization
- Monitor replay lag continuously
- If lag increases, check:
- Network bandwidth between compute and storage
- Storage I/O performance
- Read-only node workload
MPP Tuning
- Adjust
polar_px_max_workers_numberbased on workload - Use
ALTER TABLE ... SET(px_workers=...)for large tables only - Monitor
polar_px_dop_per_nodefor parallel degree
Storage Capacity
- Plan for 70% PFS usage threshold
- Consider storage tiering for cold data
- Monitor I/O latency for performance degradation
High Availability
- Test online promotion periodically
- Monitor sync lag between nodes
- Keep
polar_enable_pxconsistent across nodes
Skill Index
PolarDB Core Health
| Skill | Category | Description |
|---|---|---|
get_polar_node_type |
Core | Node type verification |
get_logindex_status |
Core | LogIndex replay status |
get_pfs_usage |
Core | PolarFS storage usage |
get_polar_process_status |
Core | Detailed process metrics |
get_polar_activity |
Core | Enhanced activity monitor |
HTAP & MPP
| Skill | Category | Description |
|---|---|---|
get_px_workers_status |
HTAP | MPP worker configuration |
get_px_query_stats |
HTAP | Parallel query statistics |
get_px_nodes |
HTAP | MPP cluster topology |
get_buffer_pool_affinity |
HTAP | Buffer efficiency |
Storage & I/O
| Skill | Category | Description |
|---|---|---|
get_shared_storage_stats |
I/O | Shared storage performance |
get_polar_io_stats |
I/O | PolarFS detailed I/O |
get_dirty_page_status |
I/O | Dirty page coordination |
High Availability
| Skill | Category | Description |
|---|---|---|
get_primary_readonly_sync |
HA | Primary-readonly sync |
get_online_promote_status |
HA | Promotion readiness |
get_recovery_progress |
HA | Recovery progress |
PostgreSQL Compatibility
| Skill | Category | Description |
|---|---|---|
| All standard PostgreSQL checks | Various | 40+ compatibility skills |
Notes
This skill is designed for PolarDB for PostgreSQL and requires the polar_monitor extension. Standard PostgreSQL checks work on both PolarDB and regular PostgreSQL instances.
For more information about PolarDB architecture:
Base directory for this skill: file:///Users/digoal/.config/opencode/skills/polardb-daily-check Relative paths in this skill (e.g., scripts/, assets/) are relative to this base directory.