reliability-engineer

Observability, Incident Management & Performance Optimization (SRE).

By Nhqvu2005. Updated 2/12/2026

name: reliability-engineer
description: Observability, Incident Management & Performance Optimization (SRE).

Reliability Engineer (SRE)

Purpose

Keeps systems running stably, makes them easy to monitor (observability), and supports effective incident response. Also covers performance optimization.

Usage

1. Observability Design

Proposes a monitoring stack covering metrics, logs, and tracing.

python .agent/skills/reliability-engineer/scripts/sre.py --action observability
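To make the three pillars concrete, here is a minimal sketch of what a proposed stack could look like as data. It is not taken from `sre.py`; the `Pillar` class, the `render_stack` helper, and the example tools (Prometheus, Grafana Loki, OpenTelemetry) are assumptions chosen for illustration, and the script's actual output may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pillar:
    """One observability pillar (hypothetical structure, not from sre.py)."""
    name: str
    purpose: str
    example_tool: str  # illustrative choice only


# Assumed example stack: one commonly used tool per pillar.
STACK = (
    Pillar("Metrics", "numeric time series for dashboards and alerts", "Prometheus"),
    Pillar("Logs", "timestamped event records for debugging", "Grafana Loki"),
    Pillar("Tracing", "per-request paths across services", "OpenTelemetry"),
)


def render_stack(stack=STACK) -> str:
    """Render the stack as one plain-text bullet per pillar."""
    return "\n".join(
        f"- {p.name}: {p.purpose} (e.g. {p.example_tool})" for p in stack
    )


print(render_stack())
```

Keeping the proposal as plain data makes it easy to swap a tool without touching the rendering code.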

2. Incident Report (RCA)

Creates an incident report template for root cause analysis (RCA).

python .agent/skills/reliability-engineer/scripts/sre.py --action incident --title "Database High Latency"
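As a rough illustration of what such a template generator might do, the sketch below builds a plain-text report skeleton from a title. The section names, the `incident_template` function, and the output layout are assumptions; the template that `sre.py` actually produces is not shown in this document and may look different.

```python
from datetime import date
from typing import Optional

# Assumed section list; common in RCA write-ups but not confirmed by the source.
RCA_SECTIONS = (
    "Summary",
    "Impact",
    "Timeline",
    "Root Cause",
    "Resolution",
    "Action Items",
)


def incident_template(title: str, day: Optional[date] = None) -> str:
    """Return an empty incident report skeleton for the given title."""
    day = day or date.today()
    lines = [f"# Incident Report: {title}", f"Date: {day.isoformat()}", ""]
    for section in RCA_SECTIONS:
        # Each section starts with a placeholder to be filled in during the review.
        lines += [f"## {section}", "TODO", ""]
    return "\n".join(lines)


print(incident_template("Database High Latency"))
```

Passing `day` explicitly keeps the output reproducible, which is convenient when the template is checked into a repository.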

3. Performance Tuning

Suggests performance optimizations for the selected layer.

python .agent/skills/reliability-engineer/scripts/sre.py --action performance --area database

Supported values for --area: database, backend, frontend
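The three commands above share one entry point with an `--action` switch. A minimal sketch of how such an interface could be parsed with `argparse` follows. The flag names and values come from the commands above; everything else, including the `parse` helper and the rule that `--title` and `--area` are mandatory for their actions, is an assumption and may not match the real `sre.py`.

```python
import argparse

ACTIONS = ("observability", "incident", "performance")
AREAS = ("database", "backend", "frontend")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags shown in the usage commands."""
    parser = argparse.ArgumentParser(prog="sre.py")
    parser.add_argument("--action", required=True, choices=ACTIONS)
    parser.add_argument("--title", help="incident title (with --action incident)")
    parser.add_argument("--area", choices=AREAS,
                        help="layer to tune (with --action performance)")
    return parser


def parse(argv):
    """Parse argv and enforce the per-action requirements assumed here."""
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: each action needs its companion flag.
    if args.action == "incident" and not args.title:
        parser.error("--title is required with --action incident")
    if args.action == "performance" and not args.area:
        parser.error("--area is required with --action performance")
    return args


args = parse(["--action", "performance", "--area", "database"])
print(args.action, args.area)
```

Using `choices` lets `argparse` reject an unsupported action or area with a usage message before any work is done.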

Install via CLI
npx skills add https://github.com/Nhqvu2005/VibeGravityKit --skill reliability-engineer
Repository Details
Stars: 54
Forks: 20
Branch: main
Path: SKILL.md