anomaly-detection

star 0

Detect statistical anomalies in log patterns — sudden error rate spikes, unusual latency, memory pressure, or cold start storms. Use when asked about anomalies, trends, spikes, or unusual behavior.

RohitRana2003 By RohitRana2003 schedule Updated 3/6/2026

name: anomaly-detection description: Detect statistical anomalies in log patterns — sudden error rate spikes, unusual latency, memory pressure, or cold start storms. Use when asked about anomalies, trends, spikes, or unusual behavior.

Skill: Anomaly Detection in Logs

Use this workflow to detect abnormal patterns compared to a baseline.

Step-by-Step Instructions

1. Establish Baseline

Use generate_time_report to get error counts per time bucket:

generate_time_report(log_dir="dummy_logs/", hours=24)

This returns error frequencies bucketed by hour.

2. Compute Error Rate

For each time bucket:

  • error_rate = error_count / total_request_count
  • Baseline = median error rate over the period
  • Anomaly threshold = baseline × 3 (or absolute >10% error rate)

3. Detect Specific Anomaly Types

Error Rate Spikes

  • Flag any 5-minute window with error rate > 3x the hourly baseline
  • Use grep with a time range to investigate that window

Latency Spikes

  • Search REPORT lines: grep(pattern="Duration: [2-9][0-9]{4}", file="...")
  • This matches durations ≥ 20,000ms (near timeout)
  • Group by 10-minute window

Memory Pressure

  • Search REPORT lines where Max Memory Used is > 90% of Memory Size
  • Pattern: look for Max Memory Used: ([4-9][0-9]{2}|[0-9]{4}) MB when Memory Size is 512 MB

Cold Start Storms

  • Search REPORT lines containing Init Duration
  • More than 5 cold starts per minute indicates a scale-out event or deployment

Error Cascades

  • Look for one error type that starts appearing, then other error types follow within minutes
  • This suggests upstream failure propagation

4. Correlate Anomalies

  • Do multiple anomaly types coincide? (e.g., error spike + latency spike = downstream issue)
  • Check if anomaly aligns with deployment time (look for "cold start" pattern)

5. Write Anomaly Report

Save to anomaly_report.md with:

  • Timeline visualization (ASCII chart if possible)
  • Each anomaly: type, time range, magnitude, affected requestIds (sample)
  • Correlation analysis
  • Recommended action

Anomaly Classification

Type Threshold Likely Cause
Error spike >3x baseline in 5min Deployment, traffic spike, dependency failure
Latency spike Duration >25s (>83% of 30s limit) Downstream timeout, cold start
Memory pressure MaxMem >90% of limit Memory leak, large payload
Cold start storm >5 cold starts/min Scale-out, deployment
Error cascade 3+ error types growing simultaneously Upstream failure
Install via CLI
npx skills add https://github.com/RohitRana2003/inventory-agent-ppo --skill anomaly-detection
Repository Details
star Stars 0
call_split Forks 0
navigation Branch main
article Path SKILL.md
More from Creator
RohitRana2003
RohitRana2003 Explore all skills →