hardware-calculator - SKILL.md Agent Skill

name: hardware-calculator description: Quick VRAM/RAM calculations, hardware recommendations, feasibility checks for AI models short_desc: VRAM/RAM calc + GPU sizing for AI models keywords: [VRAM, "GPU memory", "RAM requirements", "model footprint", "GPU recommendation", "which GPU", "GPU sizing", "fit on GPU", "memory requirements", H100, A100, "RTX 4090", "consumer GPU"] model: haiku

Hardware Calculator (Haiku)

Purpose: Quick VRAM/RAM calculations, hardware recommendations, feasibility checks for AI models.

Model: Haiku 4.5

When to Invoke Autonomously

"Can I run X?": User asks if model fits hardware
Hardware Shopping: "Which GPU for [model]?"
Quick VRAM Check: Before loading model
Multi-Model Planning: "Can I run 2 models simultaneously?"
Quantization Math: "How much VRAM saves Q4 vs Q8?"

DO NOT invoke for

Complex architecture decisions (use /architect skill)
Performance optimization (use /performance-optimizer)
Already know hardware fits

What This Skill Does

VRAM Calculations:

Calculate model VRAM requirements from parameters + quantization
Account for context overhead and batch size
Apply 20% safety margin
Formula: (params × bytes_per_param × 1.2) + context + batch

GPU Recommendations:

Match budget to appropriate GPU tier ($300-5000+)
Match VRAM needs to GPU options (8-80GB range)
Consider price/performance tradeoffs
Warn about overspending or underpowered options

Feasibility Checks:

Quick yes/no: Will model fit on GPU?
Account for OS overhead (~2GB)
Warn if <10% headroom (tight fit, unstable)

Multi-Model Planning:

Calculate combined VRAM for running multiple models
Suggest offloading strategies if tight
Account for shared context when applicable

See: examples/calculations.md for formulas, examples/gpu-specs.md for hardware details, scripts/vram_calculator.py for automated calculations

Quick Workflow Reference

Before calculating: Search for hardware specs and benchmarks

.claude/scripts/kg-search search "hardware" --type hardware

For deep research: Ask user "Use hybrid_search to research [GPU comparison]"

Development env: Python 3.12, Weaviate:8081, Ollama:11435, venv: source claude_mcp_servers/.venv/bin/activate

Success Metrics

✅ VRAM estimates within ±1GB of actual usage
✅ GPU recommendations fit user's budget and needs
✅ Calculations complete in <2 seconds
✅ Users don't run into OOM errors after following recommendations
✅ Hardware purchases are successful (not over/under-powered)