name: clone-repos
description: Clone SGLang, FlashInfer, sgl-cookbook, and flashinfer-trace repositories to tmp/. Use when setting up the project, preparing for kernel extraction, or when the user needs the source repositories.
Clone Repositories
Clone SGLang, FlashInfer, sgl-cookbook, and flashinfer-trace repositories to the tmp/ directory.
Description
This skill sets up the required repositories for kernel extraction, testing, and workload collection workflows. It:
- Clones SGLang, FlashInfer, sgl-cookbook, and flashinfer-trace repositories to the tmp/ directory (if not already present); SGLang and FlashInfer are cloned with all submodules
- Updates repositories that already exist by fetching from the remote, hard-resetting to the remote branch (local changes under tmp/ are discarded), and updating submodules
- Checks out the main branch by default (or the specified branch)
- Installs SGLang and FlashInfer packages from source in the current environment
Repositories:
- SGLang: Inference engine with model implementations and kernel calls
- FlashInfer: GPU kernel library with optimized implementations (ground truth)
- sgl-cookbook: Best serving configurations for each architecture and model (TP, EP flags)
- flashinfer-trace: Workload dataset and kernel definitions, cloned to tmp/flashinfer-trace
Usage
# Clone all repos to ./tmp directory, update if exists, and install from source
/clone-repos
# Clone specific branches
/clone-repos --sglang-branch v0.4.0 --flashinfer-branch v0.2.0
Parameters
- sglang_branch (optional): SGLang branch to checkout (default: "main")
- flashinfer_branch (optional): FlashInfer branch to checkout (default: "main")
- cookbook_branch (optional): sgl-cookbook branch to checkout (default: "main")
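As a rough sketch of how the flags in the Usage examples could map onto these parameters, the snippet below parses them into the shell variables used by the implementation steps. The function name `parse_branch_args` is hypothetical, and the `--cookbook-branch` spelling is an assumption inferred from the other two flags; the source only shows `--sglang-branch` and `--flashinfer-branch`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn /clone-repos style flags into the variables
# (sglang_branch, flashinfer_branch, cookbook_branch) used by later steps.
parse_branch_args() {
    # Defaults match the documented parameter defaults.
    sglang_branch="main"
    flashinfer_branch="main"
    cookbook_branch="main"
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --sglang-branch|--flashinfer-branch|--cookbook-branch)
                if [ "$#" -lt 2 ]; then
                    echo "missing value for $1" >&2
                    return 2
                fi
                case "$1" in
                    --sglang-branch)     sglang_branch="$2" ;;
                    --flashinfer-branch) flashinfer_branch="$2" ;;
                    --cookbook-branch)   cookbook_branch="$2" ;;  # assumed flag name
                esac
                shift 2
                ;;
            *)
                echo "unknown option: $1" >&2
                return 2
                ;;
        esac
    done
}

parse_branch_args --sglang-branch v0.4.0 --flashinfer-branch v0.2.0
echo "sglang=$sglang_branch flashinfer=$flashinfer_branch cookbook=$cookbook_branch"
# prints: sglang=v0.4.0 flashinfer=v0.2.0 cookbook=main
```

Unknown flags are rejected rather than ignored, so a mistyped flag name does not silently fall back to main.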
Implementation Steps
When executing this skill:
Create tmp directory if needed:
    mkdir -p tmp
Handle SGLang repository:
    # Check if repo exists
    if [ -d "tmp/sglang/.git" ]; then
        echo "SGLang exists, pulling latest changes..."
        (cd tmp/sglang &&
            git fetch origin &&
            git checkout "${sglang_branch:-main}" &&
            git reset --hard "origin/${sglang_branch:-main}" &&
            git submodule update --init --recursive)
    else
        echo "Cloning SGLang with submodules..."
        git clone --recurse-submodules https://github.com/sgl-project/sglang.git tmp/sglang
        # Re-sync submodules after checkout in case the branch pins different commits
        (cd tmp/sglang &&
            git checkout "${sglang_branch:-main}" &&
            git submodule update --init --recursive)
    fi
Note: Using (cd ...) subshell syntax ensures directory changes are isolated and don't affect subsequent commands. Because the update path resets to origin/<branch>, it assumes the value names a remote branch; a tag has no origin/ counterpart.
Handle FlashInfer repository:
    # Check if repo exists
    if [ -d "tmp/flashinfer/.git" ]; then
        echo "FlashInfer exists, pulling latest changes..."
        (cd tmp/flashinfer &&
            git fetch origin &&
            git checkout "${flashinfer_branch:-main}" &&
            git reset --hard "origin/${flashinfer_branch:-main}" &&
            git submodule update --init --recursive)
    else
        echo "Cloning FlashInfer with submodules..."
        git clone --recurse-submodules https://github.com/flashinfer-ai/flashinfer.git tmp/flashinfer
        # Re-sync submodules after checkout in case the branch pins different commits
        (cd tmp/flashinfer &&
            git checkout "${flashinfer_branch:-main}" &&
            git submodule update --init --recursive)
    fi
Note: Using (cd ...) subshell syntax ensures directory changes are isolated and don't affect subsequent commands.
Handle sgl-cookbook repository:
    # Check if repo exists
    if [ -d "tmp/sgl-cookbook/.git" ]; then
        echo "sgl-cookbook exists, pulling latest changes..."
        (cd tmp/sgl-cookbook &&
            git fetch origin &&
            git checkout "${cookbook_branch:-main}" &&
            git reset --hard "origin/${cookbook_branch:-main}")
    else
        echo "Cloning sgl-cookbook..."
        git clone https://github.com/sgl-project/sgl-cookbook.git tmp/sgl-cookbook
        (cd tmp/sgl-cookbook && git checkout "${cookbook_branch:-main}")
    fi
Note: sgl-cookbook doesn't require submodules or installation. It contains serving configuration files only.
Handle flashinfer-trace repository:
    # Check if repo exists
    if [ -d "tmp/flashinfer-trace/.git" ]; then
        echo "flashinfer-trace exists, pulling latest changes..."
        (cd tmp/flashinfer-trace &&
            git fetch origin &&
            git checkout main &&
            git reset --hard origin/main)
    else
        echo "Cloning flashinfer-trace..."
        git clone https://huggingface.co/datasets/flashinfer-ai/flashinfer-trace tmp/flashinfer-trace
    fi
Note: flashinfer-trace is a HuggingFace dataset repo (not GitHub). It contains kernel definitions, workloads, and safetensors blobs. All workload collection writes to this directory.
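The four repository steps above share one clone-or-update pattern, which could be factored into a single function. The following is a minimal sketch under stated assumptions, not the skill's actual implementation: `sync_repo`, `run`, and the `DRY_RUN` variable are hypothetical names. It uses `git -C` instead of a `(cd ...)` subshell, and it clones first and initializes submodules afterwards rather than passing `--recurse-submodules`.

```shell
#!/usr/bin/env bash
# run CMD...: execute CMD, or only print it when DRY_RUN=1 (hypothetical knob).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# sync_repo URL DEST [BRANCH] [SUBMODULES]
#   Clone URL into DEST if DEST/.git is absent; otherwise fetch and hard-reset
#   DEST to origin/BRANCH. SUBMODULES=yes also initializes submodules.
sync_repo() {
    local url="$1" dest="$2" branch="${3:-main}" submodules="${4:-no}"
    if [ -d "$dest/.git" ]; then
        echo "$dest exists, updating..."
        run git -C "$dest" fetch origin &&
        run git -C "$dest" checkout "$branch" &&
        run git -C "$dest" reset --hard "origin/$branch" || return 1
    else
        echo "Cloning $url into $dest..."
        run git clone "$url" "$dest" &&
        run git -C "$dest" checkout "$branch" || return 1
    fi
    if [ "$submodules" = "yes" ]; then
        run git -C "$dest" submodule update --init --recursive || return 1
    fi
}

# Dry run: print the git commands for all four repositories without running them.
DRY_RUN=1
sync_repo https://github.com/sgl-project/sglang.git        tmp/sglang       "${sglang_branch:-main}"     yes
sync_repo https://github.com/flashinfer-ai/flashinfer.git  tmp/flashinfer   "${flashinfer_branch:-main}" yes
sync_repo https://github.com/sgl-project/sgl-cookbook.git  tmp/sgl-cookbook "${cookbook_branch:-main}"   no
sync_repo https://huggingface.co/datasets/flashinfer-ai/flashinfer-trace tmp/flashinfer-trace main no
```

Setting `DRY_RUN=0` (or leaving it unset) would execute the commands for real. Keeping the hard reset in one place also makes it easier to see that any local edits under tmp/ are discarded on update.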
Install packages from source:
    # Upgrade pip once (python -m pip targets the same interpreter used for the installs)
    python -m pip install --upgrade pip
    # Install FlashInfer (pyproject.toml in repo root)
    (cd tmp/flashinfer && python -m pip install --no-build-isolation -e . -v)
    # Install SGLang (pyproject.toml in python/ subdirectory)
    (cd tmp/sglang && python -m pip install -e "python")
Note: Subshell syntax (cd ... && command) keeps the working directory unchanged.
Verify installations:
    # Test imports
    python -c "import sglang; print(f'SGLang: {sglang.__version__}')"
    python -c "import flashinfer; print(f'FlashInfer: {flashinfer.__version__}')"
    # Verify directory structure
    ls tmp/sglang/python/sglang/srt/models/
    ls tmp/flashinfer/flashinfer/
    ls tmp/flashinfer/tests/
    ls tmp/sgl-cookbook/
    ls tmp/flashinfer-trace/definitions/
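The directory checks above rely on reading `ls` output by eye. A scripted variant might look like the sketch below, which reports every missing directory and returns a non-zero status if any is absent. The function name `check_layout` is hypothetical; the directory list simply mirrors the `ls` commands.

```shell
#!/usr/bin/env bash
# check_layout [ROOT]: report each expected directory under ROOT (default: tmp)
# that is missing; return 1 if any is absent, 0 otherwise.
check_layout() {
    local root="${1:-tmp}" missing=0 dir
    for dir in \
        sglang/python/sglang/srt/models \
        flashinfer/flashinfer \
        flashinfer/tests \
        sgl-cookbook \
        flashinfer-trace/definitions
    do
        if [ ! -d "$root/$dir" ]; then
            echo "missing: $root/$dir" >&2
            missing=1
        fi
    done
    return "$missing"
}

if check_layout tmp; then
    echo "layout OK"
else
    echo "layout incomplete"
fi
```

Because the function reports through its exit status, it could also gate a later step, for example `check_layout tmp && /extract-kernel-definitions ...` in a wrapper script.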
Output Directory Structure
flashinfer-bench/
└── tmp/                                  # Cloned repositories (auto-updated)
    ├── sglang/                           # SGLang repository (installed in current env)
    │   └── python/sglang/srt/
    │       ├── models/                   # Model implementations
    │       │   ├── llama.py
    │       │   ├── deepseek_v3.py
    │       │   ├── qwen2_moe.py
    │       │   └── ...
    │       └── layers/                   # Layer implementations
    │           ├── attention/
    │           ├── moe/
    │           └── layernorm.py
    ├── flashinfer/                       # FlashInfer repository (installed in current env)
    │   ├── flashinfer/                   # Python package in root (not python/ subdir!)
    │   │   ├── attention.py
    │   │   ├── norm.py
    │   │   ├── moe.py
    │   │   └── ...
    │   ├── tests/                        # Reference tests with vanilla implementations
    │   ├── csrc/                         # CUDA source files
    │   └── include/                      # C++ headers with kernel implementations
    ├── sgl-cookbook/                     # Serving configuration repository (NOT installed)
    └── flashinfer-trace/                 # HuggingFace dataset clone: single source of truth
        │                                 # for definitions, ref tests, baselines, workloads,
        │                                 # blobs, and eval traces. All trace edits commit here.
        ├── definitions/{op_type}/        # Kernel definition JSONs
        ├── tests/references/             # Reference tests (pytest)
        ├── solutions/baseline/           # FlashInfer-wrapper baseline solutions
        ├── workloads/{op_type}/          # Sanitized workload JSONLs
        ├── blob/workloads/{op_type}/     # Safetensors blobs referenced by JSONLs
        └── traces/{op_type}/             # Eval traces (one entry per workload)
Requirements
- Git (with submodule support)
- Network access to GitHub (for sglang, flashinfer, sgl-cookbook, and their submodules) and to huggingface.co (for flashinfer-trace)
- Sufficient disk space (~6GB total including submodules and serving configs)
- Python development environment for building from source
- CUDA toolkit (for FlashInfer CUDA kernels)
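The requirements above could be checked before cloning with a short preflight script. This is only a sketch: `missing_tools` and `free_gb` are hypothetical helper names, treating `nvcc` on PATH as evidence of a CUDA toolkit is an assumption, and the 6 GB threshold is taken from the estimate in the list.

```shell
#!/usr/bin/env bash
# missing_tools NAME...: print every NAME that is not found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# free_gb [PATH]: whole gigabytes available on the filesystem holding PATH.
# Uses POSIX df output (-P) in 1024-byte blocks (-k); column 4 is "Available".
free_gb() {
    df -Pk "${1:-.}" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# nvcc is used here as a stand-in for "CUDA toolkit installed" (assumption).
missing=$(missing_tools git python nvcc)
if [ -n "$missing" ]; then
    echo "not on PATH:" $missing
fi

if [ "$(free_gb .)" -lt 6 ]; then
    echo "less than 6 GB free in $(pwd)"
fi
```

The script only warns and never aborts, since a missing `nvcc` on PATH does not necessarily mean the toolkit is absent.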
Common Issues
- Network errors: Check connectivity to GitHub and, for flashinfer-trace, to huggingface.co; repositories with submodules require a stable connection
- Submodule failures: Retry git submodule update --init --recursive
- Disk space: Requires ~6GB total for all repositories with submodules
- Installation failures: Verify Python ≥3.8, CUDA toolkit installed, and submodules initialized
- sgl-cookbook not found: Ensure you have network access to github.com/sgl-project/sgl-cookbook
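For transient network or submodule failures, the retry advice above could be automated with a small wrapper. The sketch below is one possible approach: the `retry` function and the `RETRY_DELAY` variable are hypothetical, and the doubling delay is a design choice rather than anything the source prescribes.

```shell
#!/usr/bin/env bash
# retry ATTEMPTS CMD...: run CMD until it succeeds or ATTEMPTS runs have failed.
# RETRY_DELAY (seconds, default 5) sets the first wait; it doubles after each failure.
retry() {
    local attempts="$1" delay="${RETRY_DELAY:-5}" n=1
    shift
    while true; do
        "$@" && return 0
        if [ "$n" -ge "$attempts" ]; then
            echo "failed after $n attempts: $*" >&2
            return 1
        fi
        echo "attempt $n failed, retrying in ${delay}s..." >&2
        sleep "$delay"
        delay=$((delay * 2))
        n=$((n + 1))
    done
}

# Only retry the submodule update if the SGLang clone is actually present.
if [ -d tmp/sglang/.git ]; then
    retry 3 git -C tmp/sglang submodule update --init --recursive
fi
```

`git submodule update --init --recursive` is safe to repeat, which is what makes a blind retry reasonable here.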
Integration with Other Skills
This skill provides the foundation for:
- extract-kernel-definitions: Uses SGLang model files to extract kernels and sgl-cookbook to find serving configurations (TP/EP flags); outputs to tmp/flashinfer-trace/definitions/ (the HuggingFace dataset clone)
- add-reference-tests: Uses FlashInfer for ground truth; outputs tests to tmp/flashinfer-trace/tests/references/
- collect-workloads: Uses tmp/flashinfer-trace (HuggingFace dataset clone) as the target for workload JSONL + safetensors blobs, then submits a PR
- onboard-model: End-to-end pipeline that calls this skill first (Phase 0) to ensure all repos are current before model discovery, definition generation, and workload collection.
Example workflow:
# Step 1: Clone SGLang, FlashInfer, sgl-cookbook, and flashinfer-trace repositories
/clone-repos
# Step 2: Extract kernel definitions from a model (uses sgl-cookbook for TP/EP configs)
/extract-kernel-definitions --model-name deepseek_v3
# Step 3: Add reference tests
/add-reference-tests --op-type mla_paged
# Or run the full end-to-end pipeline
/onboard-model --model-name qwen3-235b-a22b
Notes
- Updates existing repos or performs full clones with submodules
- Editable installs (pip install -e) for development
- FlashInfer package location: tmp/flashinfer/flashinfer/ (not in python/ subdirectory)
- sgl-cookbook is NOT installed (configuration files only, no Python package)
Maintaining This Document
Update this file when changing repository URLs, directory structure, or adding new repositories.