vmware-monitor



By zw008 · Updated 6/12/2026

name: vmware-monitor
description: >
  Use this skill for safe, risk-free queries of VMware infrastructure: code-level enforced safety means no destructive operations exist in the codebase. Directly handles: list VMs/hosts/datastores/clusters, check active alarms with remediation hints, view recent events, get VM details (CPU/memory/disks/NICs/snapshots). Always use vmware-monitor when the user asks to "list VMs", "check vSphere alarms", "show host status", "get VM details", "what vSphere events happened", or needs read-only VMware information before making changes. Do NOT use for any write operations: this skill is code-level read-only and cannot modify, create, or delete any resource. For VM modifications use vmware-aiops, for networking use vmware-nsx, for metrics/capacity use vmware-aria. For load balancing/AVI/AKO use vmware-avi.
installer:
  kind: uv
  package: vmware-monitor
allowed-tools:
  - Bash
metadata: {"openclaw":{"requires":{"env":["VMWARE_MONITOR_CONFIG"],"bins":["vmware-monitor"],"config":["/.vmware-monitor/config.yaml","/.vmware-monitor/.env"]},"optional":{"env":["VMWARE_TARGET_PASSWORD","SLACK_WEBHOOK_URL","DISCORD_WEBHOOK_URL"],"bins":["vmware-policy"]},"primaryEnv":"VMWARE_MONITOR_CONFIG","homepage":"https://github.com/zw008/VMware-Monitor","emoji":"📊","os":["macos","linux"]}}
compatibility: >
  vmware-policy auto-installed as Python dependency (provides @vmware_tool decorator and audit logging). All operations audited to ~/.vmware/audit.db.
  Credentials: Each vCenter/ESXi target requires a per-target password env var in ~/.vmware-monitor/.env following the pattern VMWARE_<TARGET>_PASSWORD (e.g., target "vcenter-prod" → VMWARE_VCENTER_PROD_PASSWORD). SLACK_WEBHOOK_URL and DISCORD_WEBHOOK_URL are optional: disabled by default, user-configured only, used solely by the opt-in daemon scanner.
  Daemon: the background scanner (vmware-monitor daemon start) is user-initiated only, never auto-started.
  Webhook payloads contain only aggregated alert metadata (alarm counts, event types); no credentials, IPs, or PII.

VMware Monitor (Read-Only)

Disclaimer: This is a community-maintained open-source project and is not affiliated with, endorsed by, or sponsored by VMware, Inc. or Broadcom Inc. "VMware" and "vSphere" are trademarks of Broadcom. Source code is publicly auditable at github.com/zw008/VMware-Monitor under the MIT license.

Read-only VMware vCenter/ESXi monitoring — 21 MCP tools, zero destructive code.

Code-level safety: This skill contains NO power, create, delete, snapshot, or modify operations. They are not disabled; they do not exist in the codebase.

Companion skills: vmware-aiops (VM lifecycle), vmware-storage (iSCSI/vSAN), vmware-vks (Tanzu Kubernetes), vmware-nsx (NSX networking), vmware-nsx-security (DFW/firewall), vmware-aria (metrics/alerts/capacity), vmware-avi (AVI/ALB/AKO), vmware-harden (compliance baselines), vmware-pilot (workflow orchestration), vmware-policy (audit/policy).

What This Skill Does

All 21 tools are read-only.

| Category | Capabilities |
|---|---|
| Inventory | List VMs, ESXi hosts, datastores, clusters, networks |
| Health | Active alarms, recent events (filter by severity/time), hardware sensors, host services |
| Performance | Real-time host & VM CPU/memory/disk/network utilisation (PerfManager) |
| Capacity | Datastore thin-provisioning over-commit, resource-pool reservation/usage |
| Infra Health | ESXi certificate expiry, license usage/expiry, NTP configuration health |
| Snapshots | Inventory-wide snapshot aging & sprawl (flag old snapshots) |
| Activity | In-flight tasks, active login sessions |
| VM Details | CPU, memory, disks, NICs, snapshots, guest OS, IP |
| Scanning | Scheduled alarm/log scanning with Slack/Discord webhooks |

Quick Install

uv tool install vmware-monitor
vmware-monitor doctor

When to Use This Skill

  • List or search VMs, hosts, datastores, clusters
  • Check active alarms or recent events
  • Get detailed info about a specific VM
  • Set up scheduled monitoring with webhook alerts
  • Any read-only VMware query where safety is paramount

Alarm/Event Output: suggested_actions Field

get_alarms and get_events results include a suggested_actions list. Each item is a ready-to-use hint pointing to the correct companion skill and tool:

{
  "alarm_name": "VM CPU Ready High",
  "entity_name": "prod-db-01",
  "suggested_actions": [
    "vmware-aiops: acknowledge_vcenter_alarm(entity_name='prod-db-01', alarm_name='VM CPU Ready High')",
    "vmware-aiops: reset_vcenter_alarm(entity_name='prod-db-01', alarm_name='VM CPU Ready High')"
  ]
}

AI agents (especially smaller local models) can read these hints directly to determine which skill and tool to call next, without needing to reason about skill routing themselves.
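A minimal sketch of how an agent might split such a hint into a skill name, a tool name, and keyword arguments. It assumes every hint follows the "skill: tool(key='value', ...)" shape shown in the sample above; `parse_hint` is a hypothetical helper, not part of vmware-monitor.

```python
import ast


def parse_hint(hint: str) -> tuple[str, str, dict]:
    """Split "skill: tool(k='v', ...)" into (skill, tool, kwargs).

    Assumes the hint format shown in the sample output; any other
    shape raises ValueError instead of being guessed at.
    """
    skill, sep, call_src = hint.partition(": ")
    if not sep:
        raise ValueError(f"no 'skill: ' prefix in {hint!r}")
    # Parse the call as an expression; nothing is executed.
    call = ast.parse(call_src.strip(), mode="eval").body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError(f"not a simple tool call: {call_src!r}")
    # literal_eval only accepts literals, so arbitrary code is rejected.
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    return skill.strip(), call.func.id, kwargs


alarm = {
    "alarm_name": "VM CPU Ready High",
    "entity_name": "prod-db-01",
    "suggested_actions": [
        "vmware-aiops: acknowledge_vcenter_alarm(entity_name='prod-db-01', alarm_name='VM CPU Ready High')",
        "vmware-aiops: reset_vcenter_alarm(entity_name='prod-db-01', alarm_name='VM CPU Ready High')",
    ],
}

for hint in alarm["suggested_actions"]:
    skill, tool, kwargs = parse_hint(hint)
    print(skill, tool, kwargs)
```

Parsing with `ast` rather than `eval` keeps the hint as data: the agent decides whether to call the companion skill, and nothing runs as a side effect of reading the alarm.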

Use companion skills for:

  • Power on/off, deploy, clone, migrate --> vmware-aiops
  • iSCSI, vSAN, datastore management --> vmware-storage
  • Tanzu Kubernetes clusters --> vmware-vks
  • Load balancing, AVI/ALB, AKO, Ingress --> vmware-avi

Related Skills — Skill Routing

| User Intent | Recommended Skill |
|---|---|
| Read-only vSphere monitoring, zero risk | vmware-monitor ← this skill |
| Storage: iSCSI, vSAN, datastores | vmware-storage |
| VM lifecycle, deployment, guest ops | vmware-aiops |
| Tanzu Kubernetes (vSphere 8.x+) | vmware-vks |
| NSX networking: segments, gateways, NAT | vmware-nsx |
| NSX security: DFW rules, security groups | vmware-nsx-security |
| Aria Ops: metrics, alerts, capacity planning | vmware-aria |
| Multi-step workflows with approval | vmware-pilot |
| Compliance baselines (CIS / MLPS (等保) / PCI-DSS), drift detection, LLM remediation advisor | vmware-harden (uv tool install vmware-harden) |
| Load balancer, AVI, ALB, AKO, Ingress | vmware-avi (uv tool install vmware-avi) |
| Audit log query | vmware-policy (vmware-audit CLI) |

Common Workflows

Diagnostic investigations: Before running any "why is X failing / down / abnormal" workflow, follow references/investigation-protocol.md. It enforces the four root-cause completeness criteria (falsifiability / sufficiency / necessity / mechanism) and the up-to-three-rounds deepening loop. Since vmware-monitor is read-only, it serves as the data source — actuation belongs to companion skills like vmware-aiops.

Daily Health Check

Judgment: alarms tell you what vCenter has decided is wrong, events tell you what happened. They diverge — an event burst with no alarms often signals a metric threshold miscalibration, not "everything is fine." Read both.

  1. Check alarms --> vmware-monitor health alarms --target prod-vcenter — focus on Red severity AND alarms older than 1 hour (transient ones self-clear)
  2. Review recent events --> vmware-monitor health events --hours 24 --severity warning — look for repeated events from the same entity (a single event is noise; 50 events in an hour is a pattern)
  3. List hosts --> vmware-monitor inventory hosts — flag hosts disconnected, in maintenance mode unexpectedly, or memory > 90%
  4. If connection fails --> run vmware-monitor doctor to diagnose config/network issues
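The filtering rules in steps 1 and 2 can be sketched in a few lines of Python. This is an illustration only: the `severity` and `triggered_at` field names and the sample records are assumptions, not the tool's documented output schema (`entity_name` is the only field taken from the sample output in this document).

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def persistent_red_alarms(alarms, now, min_age=timedelta(hours=1)):
    """Keep Red alarms that have been active for at least min_age."""
    return [
        a for a in alarms
        if a["severity"].lower() == "red" and now - a["triggered_at"] >= min_age
    ]


def noisy_entities(events, threshold=50):
    """Return entities with at least `threshold` events in the given window."""
    counts = Counter(e["entity_name"] for e in events)
    return {name: n for name, n in counts.items() if n >= threshold}


# Made-up sample data for illustration.
now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
alarms = [
    {"alarm_name": "A", "severity": "red", "triggered_at": now - timedelta(hours=3)},
    {"alarm_name": "B", "severity": "red", "triggered_at": now - timedelta(minutes=10)},
    {"alarm_name": "C", "severity": "yellow", "triggered_at": now - timedelta(days=1)},
]
events_last_hour = [{"entity_name": "esxi-01"}] * 50 + [{"entity_name": "prod-db-01"}]

print([a["alarm_name"] for a in persistent_red_alarms(alarms, now)])
print(noisy_entities(events_last_hour))
```

Alarm B is dropped because it is younger than an hour and may self-clear; prod-db-01 is dropped because a single event is treated as noise.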

Investigate a Specific VM

  1. Find the VM --> vmware-monitor inventory vms --power-state poweredOff
  2. Get details --> vmware-monitor vm info problem-vm
  3. Check related events --> vmware-monitor health events --hours 48
  4. If VM not found --> verify VM name with vmware-monitor inventory vms --limit 100 or check target with --target <other-vcenter>

Performance Triage ("the cluster feels slow")

Judgment: inventory shows configured capacity (cores, GB); it cannot tell you what is actually hot. Use the real-time perf tools, then narrow.

  1. Rank hosts --> vmware-monitor perf hosts — the busiest host floats to the top (sorted by CPU%)
  2. Rank VMs on the suspect --> vmware-monitor perf vms --limit 25 — find the noisy neighbour
  3. Check for hidden storage pressure --> vmware-monitor capacity datastores — over-commit % > 100 means a thin datastore can fill mid-run even with "free" space showing
  4. Rule out snapshot drag --> vmware-monitor snapshots aging --only-old — old snapshots silently degrade I/O
  5. If perf tools return empty --> the host/VM may be disconnected or powered off (no real-time provider); confirm with inventory hosts / inventory vms
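To make step 3 concrete, here is a small worked example of the over-commit arithmetic. It assumes over-commit is the ratio of provisioned space to capacity, which matches the "provisioned vs capacity" wording used for datastore_capacity; the exact formula inside the tool is not confirmed here, and the numbers are invented.

```python
def overcommit_pct(provisioned_gb: float, capacity_gb: float) -> float:
    """Provisioned space as a percentage of physical capacity (assumed formula)."""
    if capacity_gb <= 0:
        raise ValueError("capacity must be positive")
    return 100.0 * provisioned_gb / capacity_gb


# Hypothetical thin-provisioned datastore: 1000 GB physical, 400 GB still free,
# but the VMs on it have been promised 1500 GB in total.
capacity_gb, free_gb, provisioned_gb = 1000.0, 400.0, 1500.0

pct = overcommit_pct(provisioned_gb, capacity_gb)
at_risk = pct > 100  # more space promised than physically exists

print(f"free: {free_gb:.0f} GB, over-commit: {pct:.0f}%, at risk: {at_risk}")
```

The datastore shows 400 GB free, yet the VMs are entitled to grow by 900 GB more than they use today, which is why free space alone is a misleading signal.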

Scheduled-Outage Pre-flight (certs, licenses, time)

  1. Cert expiry --> vmware-monitor infra certs --warn-days 60 — an expired ESXi cert drops host management
  2. License headroom --> vmware-monitor infra licenses — catch over-allocation before it disables features
  3. Time sync --> vmware-monitor infra ntp; a host reporting "healthy: no" breaks SSO/Kerberos/log correlation (note: live offset is not exposed by the SOAP API, only config health)
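The certificate check in step 1 comes down to date arithmetic. The sketch below shows one way a "days until expiry" value and an expiring flag could be derived for a given warn window; the function names and the sample dates are assumptions for illustration, not the tool's internals.

```python
from datetime import datetime, timezone


def days_until_expiry(not_after: datetime, now: datetime) -> int:
    """Whole days remaining before the certificate's not-after time."""
    return (not_after - now).days


def is_expiring(not_after: datetime, now: datetime, warn_days: int = 60) -> bool:
    """True when the certificate expires within warn_days (or already has)."""
    return days_until_expiry(not_after, now) <= warn_days


# Invented dates: a certificate that expires 45 days from "now".
now = datetime(2026, 1, 1, tzinfo=timezone.utc)
not_after = datetime(2026, 2, 15, tzinfo=timezone.utc)

print(days_until_expiry(not_after, now))
print(is_expiring(not_after, now, warn_days=60))
print(is_expiring(not_after, now, warn_days=30))
```

With a 60-day window this certificate is flagged; with a 30-day window it is not, which is why a wider window suits pre-outage planning.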

Set Up Continuous Monitoring

  1. Configure webhook in ~/.vmware-monitor/config.yaml
  2. Start daemon --> vmware-monitor daemon start
  3. Daemon scans every 15 min, sends alerts to Slack/Discord

Usage Mode

| Scenario | Recommended | Why |
|---|---|---|
| Local/small models (Ollama, Qwen) | CLI | ~2K tokens vs ~8K for MCP |
| Cloud models (Claude, GPT-4o) | Either | MCP gives structured JSON I/O |
| Automated pipelines | MCP | Type-safe parameters, structured output |

MCP Tools (21 — all read-only)

| Tool | Description |
|---|---|
| list_virtual_machines | List VMs with filtering (power state, sort, limit, folder_filter for case-insensitive folder-tree search); each VM includes folder_path |
| list_esxi_hosts | ESXi hosts with CPU, memory, version, uptime |
| list_all_datastores | Datastores with capacity, free space, type |
| list_all_clusters | Clusters with host count, DRS/HA status |
| list_all_networks | Networks with attached VM count and accessibility |
| get_alarms | All active/triggered alarms; includes suggested_actions remediation hints |
| get_events | Recent events filtered by severity and time; includes suggested_actions hints |
| get_host_sensors | Hardware sensor status (temperature/voltage/fan) per host with green/yellow/red health |
| get_host_services | Host service status (running state and startup policy), optionally filtered by host |
| vm_info | Detailed VM info (CPU, memory, disks, NICs, snapshots) |
| vm_list_snapshots | Snapshot list for one VM with nesting hierarchy (read-only) |
| host_performance | Real-time host CPU/mem/disk/net utilisation (PerfManager); busiest first |
| vm_performance | Real-time VM CPU/mem/disk/net utilisation (top 25 by default); powered-on only |
| snapshot_aging | Inventory-wide snapshot sweep with age + sprawl; flags snapshots older than N days |
| certificate_status | Per-host ESXi management certificate expiry (days until expiry, expiring flag) |
| license_status | vCenter/ESXi license inventory with used/total and expiration |
| ntp_status | Per-host NTP config health (servers + ntpd state); live offset not in SOAP API |
| datastore_capacity | Datastore over-commit (provisioned vs capacity); thin-provisioning risk |
| resource_pool_usage | Resource-pool CPU/memory reservation, limit, and current usage |
| active_tasks | In-flight (and recently completed) vCenter tasks with progress/errors |
| active_sessions | Currently authenticated vCenter/ESXi sessions (who is logged in) |

All tools are read-only. No tool can modify, create, or delete any resource. Performance/capacity readings are point-in-time samples — this skill retains no history, so it never reports a fabricated "trend" or runway date.
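As an example of what a point-in-time check looks like, the following sketch applies the "older than N days" rule that snapshot_aging describes. The record fields (`vm_name`, `created_at`) and the sample data are hypothetical; the tool's real output schema may differ.

```python
from datetime import datetime, timedelta, timezone


def old_snapshots(snapshots, now, threshold_days=30):
    """Return snapshots older than threshold_days, oldest first, with age_days added."""
    flagged = []
    for snap in snapshots:
        age_days = (now - snap["created_at"]).days
        if age_days > threshold_days:
            flagged.append({**snap, "age_days": age_days})
    return sorted(flagged, key=lambda s: s["age_days"], reverse=True)


# Invented inventory for illustration.
now = datetime(2026, 1, 1, tzinfo=timezone.utc)
snapshots = [
    {"vm_name": "vm-a", "created_at": now - timedelta(days=5)},
    {"vm_name": "vm-b", "created_at": now - timedelta(days=45)},
    {"vm_name": "vm-c", "created_at": now - timedelta(days=200)},
]

for snap in old_snapshots(snapshots, now):
    print(snap["vm_name"], snap["age_days"])
```

The age is computed from the creation timestamp at the moment of the query, so no stored history is needed to produce it.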

CLI Quick Reference

vmware-monitor inventory vms [--target <t>] [--limit 20] [--power-state poweredOn]
vmware-monitor inventory hosts [--target <t>]
vmware-monitor inventory datastores [--target <t>]
vmware-monitor inventory clusters [--target <t>]
vmware-monitor inventory networks [--target <t>]
vmware-monitor health alarms [--target <t>]
vmware-monitor health events [--hours 24] [--severity warning]
vmware-monitor health sensors [--target <t>]
vmware-monitor health services [--host <esxi>] [--target <t>]
vmware-monitor perf hosts [--host <esxi>] [--target <t>]
vmware-monitor perf vms [--vm <name>] [--limit 25] [--target <t>]
vmware-monitor capacity datastores [--target <t>]
vmware-monitor capacity pools [--target <t>]
vmware-monitor infra certs [--warn-days 30] [--target <t>]
vmware-monitor infra licenses [--target <t>]
vmware-monitor infra ntp [--host <esxi>] [--target <t>]
vmware-monitor snapshots aging [--threshold 30] [--only-old] [--target <t>]
vmware-monitor activity tasks [--active-only] [--target <t>]
vmware-monitor activity sessions [--target <t>]
vmware-monitor vm info <vm-name> [--target <t>]
vmware-monitor scan now [--target <t>]
vmware-monitor daemon start|stop|status
vmware-monitor doctor [--skip-auth]

Full CLI reference: see references/cli-reference.md
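For automated pipelines that shell out to the CLI, a small argv builder avoids string concatenation and quoting bugs. This is a sketch under assumptions: `build_argv` is a hypothetical helper, it only covers the "group command --flag value" shape from the quick reference above (not positional arguments such as the VM name in vm info), and it does not validate flags against the real CLI.

```python
def build_argv(group: str, command: str, **flags) -> list[str]:
    """Build an argument list such as
    ["vmware-monitor", "health", "events", "--hours", "24"].

    Keyword names map to long flags (only_old -> --only-old).
    True emits a bare flag; None and False are skipped.
    """
    argv = ["vmware-monitor", group, command]
    for name, value in flags.items():
        if value is None or value is False:
            continue
        argv.append("--" + name.replace("_", "-"))
        if value is not True:
            argv.append(str(value))
    return argv


print(build_argv("health", "events", hours=24, severity="warning"))
print(build_argv("snapshots", "aging", threshold=30, only_old=True))
```

On a machine where the CLI is installed, the resulting list can be passed directly to `subprocess.run(argv, check=True)` without going through a shell.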

Troubleshooting

Alarms return empty but vCenter shows alarms

The get_alarms tool queries triggered alarms at the root folder level. Some alarms are entity-specific, so check recent events as well: vmware-monitor health events --hours 1 --severity info (the get_events tool in MCP mode).

"Connection refused" error

  1. Run vmware-monitor doctor to diagnose
  2. Verify target hostname/IP and port (443) in config.yaml
  3. For self-signed certs: set disableSslCertValidation: true

Events return too many results

Use severity filter: --severity warning (default) filters out info-level events. Use --hours 4 to narrow time range.

VM info shows "guest_os: unknown"

VMware Tools not installed or not running in the guest. Install/start VMware Tools for guest OS detection, IP address, and guest family info.

Doctor passes but commands fail with timeout

vCenter may be under heavy load. Try targeting a specific ESXi host directly instead of vCenter, or increase connection timeout in config.yaml.

Setup

uv tool install vmware-monitor
vmware-monitor init      # guided: prompts for host/user/password, writes config + .env (chmod 600), then verifies

init stores the password in grep-safe form (obfuscated with a b64: prefix, never plaintext) and locks .env to mode 0600. Prefer it over hand-editing. Manual alternative:

mkdir -p ~/.vmware-monitor
cp config.example.yaml ~/.vmware-monitor/config.yaml
cp .env.example ~/.vmware-monitor/.env && chmod 600 ~/.vmware-monitor/.env
# Edit config.yaml (targets) and .env (passwords), then: vmware-monitor doctor
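When editing .env by hand, each target needs its own password variable. The only mapping this document confirms is the example target "vcenter-prod" → VMWARE_VCENTER_PROD_PASSWORD; the sketch below generalises it by upper-casing the name and turning every run of non-alphanumeric characters into an underscore, which is an assumption. Run vmware-monitor doctor afterwards to confirm the variable is picked up.

```python
import re


def password_env_var(target: str) -> str:
    """Derive the per-target password variable name.

    Assumed rule: upper-case the target name and replace each run of
    non-alphanumeric characters with "_". Only the hyphen case is
    confirmed by the documented example.
    """
    slug = re.sub(r"[^A-Za-z0-9]+", "_", target).strip("_").upper()
    if not slug:
        raise ValueError("target name has no usable characters")
    return f"VMWARE_{slug}_PASSWORD"


print(password_env_var("vcenter-prod"))
```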

All tools are automatically audited via vmware-policy. Audit logs: vmware-audit log --last 20

Full setup guide, security details, and AI platform compatibility: see references/setup-guide.md

Audit & Safety

All operations are automatically audited via vmware-policy (@vmware_tool decorator):

  • Every tool call logged to ~/.vmware/audit.db (SQLite, framework-agnostic)
  • Policy rules enforced via ~/.vmware/rules.yaml (deny rules, maintenance windows, risk levels)
  • Risk classification: each tool tagged as low/medium/high/critical
  • View recent operations: vmware-audit log --last 20
  • View denied operations: vmware-audit log --status denied

vmware-policy is automatically installed as a dependency — no manual setup needed.
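Because the audit log is a plain SQLite file, it can also be inspected with any SQLite client. The table layout is not documented here, so this sketch only opens the database read-only and lists the tables it contains rather than assuming a schema; `list_tables` is a hypothetical helper, and vmware-audit remains the supported way to query the log.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: str) -> list[str]:
    """Open a SQLite file read-only and return its table names, sorted."""
    # mode=ro guarantees the audit log cannot be modified by this script.
    uri = Path(db_path).expanduser().resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]


# Example call (requires the audit database to exist):
# print(list_tables("~/.vmware/audit.db"))
```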

License

MIT — github.com/zw008/VMware-Monitor
