capture-nsys-profile


Capture an Nsight Systems (.nsys-rep) profile of a short PithTrain run for performance analysis. Use when the user asks to "capture an nsys profile", "profile training", or "grab an nsys trace", or wants to inspect kernel timelines, pipeline behavior, or all-to-all overheads. The capture adapts to the pipeline-parallel (PP), expert-parallel (EP), and context-parallel (CP) settings and to the sequence length: size the global batch so that the pipeline reaches steady state without producing a multi-GB .nsys-rep file. Run 5 warmup steps followed by 1 profiled step, starting from a released checkpoint.
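The "5 warmup steps followed by 1 profiled step" schedule can be sketched as a small capture window around the training loop. This is a minimal illustration, not the skill's actual implementation: `run_with_profile_window`, `train_step`, `start_capture`, and `stop_capture` are hypothetical names introduced here, and only the step counts come from the description above.

```python
from typing import Callable, List

WARMUP_STEPS = 5    # from the skill description
PROFILED_STEPS = 1  # from the skill description


def run_with_profile_window(
    train_step: Callable[[int], None],
    start_capture: Callable[[], None],
    stop_capture: Callable[[], None],
    warmup_steps: int = WARMUP_STEPS,
    profiled_steps: int = PROFILED_STEPS,
) -> List[str]:
    """Run warmup steps unprofiled, then bracket the profiled steps.

    All three callbacks are hypothetical hooks supplied by the caller;
    the returned list records where the capture window opened and closed.
    """
    if warmup_steps < 0 or profiled_steps < 1:
        raise ValueError("need warmup_steps >= 0 and profiled_steps >= 1")

    events: List[str] = []
    total_steps = warmup_steps + profiled_steps
    for step in range(total_steps):
        if step == warmup_steps:
            # Warmup is done: open the capture window before this step.
            start_capture()
            events.append(f"start@{step}")
        train_step(step)
        if step == total_steps - 1:
            # Last profiled step has finished: close the capture window.
            stop_capture()
            events.append(f"stop@{step}")
    return events


if __name__ == "__main__":
    log = run_with_profile_window(
        train_step=lambda s: None,
        start_capture=lambda: None,
        stop_capture=lambda: None,
    )
    print(log)
```

In a PyTorch-based run, one plausible wiring (an assumption, not confirmed by this page) is to pass `torch.cuda.profiler.start` and `torch.cuda.profiler.stop` as the two capture callbacks and launch the job under `nsys profile --capture-range=cudaProfilerApi --capture-range-end=stop`, so that only the profiled step lands in the .nsys-rep file and the warmup steps do not inflate it.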

By mlc-ai · Updated 6/4/2026


Install via CLI
npx skills add https://github.com/mlc-ai/pith-train --skill capture-nsys-profile
Repository Details
Stars: 185
Forks: 13
Branch: main
Path: SKILL.md