name: cookbook-review-pr
description: Review a pull request against the SGLang Cookbook (docs_new/, Mintlify) contribution checklist for the config-driven format (per-model config + benchmarks JSX consumed by the shared _deployment.jsx / _playground.jsx engines). Run with /cookbook-review-pr <PR number>.
Cookbook Review PR
Fetch the diff, run the checklist, report what you find. The cookbook is config-driven:
shared engines (_deployment.jsx, _playground.jsx) with NO model-specific code; each
model is a data config (+ optional benchmarks) under src/snippets/configs/<vendor>/
plus an MDX page. This checklist targets that layout. Field-schema detail lives in
.claude/skills/cookbook-add-model/references/authoring-reference.md — defer to it rather
than restating.
Usage
/cookbook-review-pr <PR number>
Steps
- gh pr view <N> --repo sgl-project/sglang --json title,body,files,author,baseRefName,headRefName,commits,reviews
- gh pr diff <N> --repo sgl-project/sglang
- gh pr list --repo sgl-project/sglang --state open --search "<model name>" (duplicate check)
- Run every checklist item against the diff.
- Output per-file verdicts + overall recommendation.
Checklist
1. File hygiene
- A cookbook PR should only touch: docs_new/src/snippets/configs/<vendor>/*.jsx (config + benchmarks), docs_new/cookbook/**/*.mdx, docs_new/docs.json, docs_new/cookbook/<category>/intro.mdx (vendor card), docs_new/cards/logos/<vendor>.png (new vendor only). Flag stray files (settings.local.json, lockfiles, IDE configs).
- Pages must be .mdx, not .md. Files end with a trailing newline. Check commit history for unrelated commits accidentally included.
- Engines untouched: _deployment.jsx / _playground.jsx should NOT change in a model-add PR (adding a model is data-only). Engine edits = a separate axis/feature PR (see cookbook-add-model/references/engine-axis.md); review them against that checklist.
2. Config quality (the per-model config)
- Single export const config = { ... } literal: no function calls, spreads, fragment refs, or IIFE (Mintlify re-evals at hydration → ReferenceError).
- No !(x in y) anywhere (Mintlify AST walker crashes); use obj.key === undefined.
- supportedHardware ⊆ HARDWARE_CATALOG (in _deployment.jsx) ∪ config.hardware. A model-specific GPU the shared catalog lacks must be declared in config.hardware ({id, label, vram, vendor}), not added to the engine catalog.
- placeholders declares every {{KEY}} used in curl or any cell.
- modelNames covers every cell (by hw|variant|quant triple or variant|quant pair).
- strategies count matches the page's operating points: 1 recipe → a single balanced; 2 → low-latency + high-throughput; 3 → the full trio. Tiers apply per (hw × variant × quant) combination: a single-recipe combination must park under its semantically honest tier (clear slant → that tier, e.g. a workstation card under low-latency; no slant → balanced, e.g. a CPU platform). Flag a no-slant recipe parked under low-latency/high-throughput. Mixed unions like [low-latency, balanced, high-throughput] with per-selection greying are fine. Also flag model-specific ids (e.g. mtp), and flag an INVERTED speculative mapping: the deterministic default is MTP/spec-decoding ON → low-latency, OFF → high-throughput (at saturation the draft+verify overhead outweighs the speedup); the reverse needs an explicit maintainer-confirmed justification in the PR. The MDX strategy bullets describe serving semantics in the DSv4 style (single-user chat / typical multi-user / batch jobs), not internal toggles.
- dockerImages covers the hw ids that have cells (else users hit the :dev fallback); a hw|quant key (resolved before the plain hw) is valid when one quant on a shared GPU needs a different image (e.g. an FP4 dev build). Don't flag those.
- multiNodeHints present ONLY for hw whose fabric needs manual NIC env (e.g. gb200NVL72), NOT every multi-N hw (standard-IB DeepEP / Marlin multi-node don't need it).
- github.cookbookModel is set to the model's HF id (<hf-org>/<model-slug>). The issue template's model field is a free-form input prefilled from this value; if the config omits the github block, the engine falls back to deepseek-ai/deepseek-v4 and the page's submissions get mislabeled.
- playgroundFeatures is opt-OUT: the general axes ship on every cookbook by default (attention TP/CP/DP-Attn, moe backend + EP for MoE models, parsers, speculative, pdDisagg, hicache). Flag a missing general axis unless the model genuinely cannot use it. Model-specific axes only where applicable (MegaMoE backend + megamoeQuant only on Blackwell MoE, gated by requiresHw; hisparse only DSA-style). Knobs that are meaningless for a subset of variants/hw are disabled with a reason, not silently live (e.g. MoE knobs greyed on dense variants). No empty/stub axes.
- No leftover __TOKEN__: the config was stamped from the template and every placeholder is filled (grep -rn '__[A-Z_]*__' on the new config/benchmarks/MDX returns nothing).
- All-hardware considered: every supportedHardware id (from the catalog or config.hardware) has ≥1 cell OR is a deliberate greyed "coming soon"; AMD was pruned or kept on purpose (not a leftover template family).
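Three of the config checks above are purely textual and can be pre-screened before reading the diff by hand. The following is a minimal sketch, not part of the cookbook tooling: the function names are hypothetical, and it assumes the config, benchmarks, and MDX sources are available as plain strings.

```python
import re

# {{KEY}} placeholders as used in curl snippets and cell flags.
PLACEHOLDER_RE = re.compile(r"\{\{([A-Z0-9_]+)\}\}")
# Unfilled template markers such as __MODEL_NAME__.
TEMPLATE_TOKEN_RE = re.compile(r"__[A-Z_]+__")
# The !(x in y) form the checklist forbids.
NEGATED_IN_RE = re.compile(r"!\s*\(\s*[^()]+\s+in\s+[^()]+\)")


def undeclared_placeholders(text, declared):
    """Return {{KEY}} names used in text but missing from the declared names."""
    return sorted(set(PLACEHOLDER_RE.findall(text)) - set(declared))


def leftover_template_tokens(text):
    """Return unfilled __TOKEN__ markers left over from the template."""
    return sorted(set(TEMPLATE_TOKEN_RE.findall(text)))


def has_negated_in(text):
    """Return True if the source contains a !(x in y) expression."""
    return NEGATED_IN_RE.search(text) is not None
```

A hit from any of these is a pointer to look at, not a verdict: the regexes do not parse JSX, so a match inside a comment or string still needs a human read.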
3. Cells / 5-dim matrix
- Every cell match has EXACTLY the 5 keys (hw, variant, quant, strategy, nodes).
- env / flags are flat literals (only {{PLACEHOLDER}} subst); no shared commonFlags reference (Mintlify won't inline it).
- NO --nnodes / --node-rank / --dist-init-addr literals in multi-node cells (the renderer injects them from match.nodes).
- NO literal --host / --port; use {{HOST_IP}} / {{PORT}}.
- NO --reasoning-parser / --tool-call-parser in any cell: parsers are a Playground-only feature added on top of the base command (DSv4 convention); flag any cell that bakes them in.
- Accuracy-degrading flags in cells: runtime quant below the checkpoint (e.g. MegaMoE W4A4, which DSv4 gates behind the Playground's megamoeQuant) and lossy --kv-cache-dtype (e.g. fp8_e4m3 over a higher-precision-KV checkpoint). Flag for explicit maintainer confirmation. Output quality should be exactly what the quant chip declares, so absent a recorded sign-off in the PR (e.g. carried verbatim from a measured legacy recipe's default command), request the flag move to Playground/tips.
- Flag order: --model-path first (an optional --trust-remote-code may precede it, as the DSv4 cells do), then parallelism, then MoE, then tuning, --host / --port last (the playground's insert anchors assume this).
- TP/memory sanity: model_weight_GB / (tp × gpu_mem) fits with ~20–30% headroom (BF16 ≈ params in billions × 2 GB, FP8 ≈ ×1, FP4 ≈ ×0.5; MoE uses total weight, not active params).
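The cell rules above are mechanical enough to sketch as a lint pass. This is an illustration under stated assumptions: a cell is modeled as a dict with a match dict and a flags list of "--name value" strings (the real JSX schema lives in the authoring reference), and lint_cell / weight_fits are hypothetical helper names.

```python
REQUIRED_MATCH_KEYS = {"hw", "variant", "quant", "strategy", "nodes"}
# Flags the renderer or the Playground owns; they must not be baked into cells.
FORBIDDEN_CELL_FLAGS = {
    "--nnodes", "--node-rank", "--dist-init-addr",
    "--reasoning-parser", "--tool-call-parser",
}
# Approximate weight size in GB per billion parameters (rule of thumb above).
GB_PER_BILLION_PARAMS = {"bf16": 2.0, "fp8": 1.0, "fp4": 0.5}


def lint_cell(cell):
    """Return findings for one cell (assumed shape: {'match': {...}, 'flags': [...]})."""
    findings = []
    keys = set(cell.get("match", {}))
    if keys != REQUIRED_MATCH_KEYS:
        findings.append(f"match keys differ: {sorted(keys ^ REQUIRED_MATCH_KEYS)}")
    for flag in cell.get("flags", []):
        name, _, value = flag.partition(" ")
        if name in FORBIDDEN_CELL_FLAGS:
            findings.append(f"{name} must not appear in a cell")
        # --host/--port are allowed only with a {{PLACEHOLDER}} value.
        if name in ("--host", "--port") and "{{" not in value:
            findings.append(f"{name} uses a literal value: {value!r}")
    return findings


def weight_fits(params_b, quant, tp, gpu_mem_gb, headroom=0.25):
    """True if total weights use at most (1 - headroom) of aggregate GPU memory."""
    weight_gb = params_b * GB_PER_BILLION_PARAMS[quant]
    return weight_gb / (tp * gpu_mem_gb) <= 1.0 - headroom
```

For a MoE model, pass the total parameter count as params_b, not the active count. The default headroom of 0.25 sits in the middle of the 20–30% band and is a judgment call, not a value taken from the engine.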
4. Benchmarks
- Each benchmarks[] entry's match tuple corresponds to a real cell.
- accuracyLabels is present whenever the benchmarks carry accuracy data: the engine ships NO default eval set; without it the accuracy rows silently don't render. defaultAccuracy / per-cell accuracy / benchmarkCommands.accuracy keys all ∈ config.accuracyLabels.
- A benchmark's quantization must match a variant actually listed: (BF16) on a model that only released FP8/FP4 is a factual bug.
- benchmarkCommands.speed is python3 -m sglang.bench_serving (the workload), separate from the sglang serve deploy command.
- sglang_version is a real build the author ran (a release, or dev/nightly), not a guessed/placeholder value (no leftover 0.0.0).
- Consistent accuracy harness across entries: every value under one accuracyLabels column must be produced by the SAME harness. Flag a page that, say, measures one platform's GSM8K with few_shot_gsm8k --num-questions 200 and another's with run_eval --eval-name gsm8k --num-examples 1319 and shows both as one "GSM8K %" (the scores aren't comparable). Either standardize on one harness (matching benchmarkCommands.accuracy) or require an explicit per-entry note. Common when folding a second contributor's measurements (e.g. an AMD/ROCm PR) into the page.
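The first two benchmark checks reduce to set membership. A rough sketch, assuming cells and benchmark entries have already been loaded into Python dicts and that each benchmark match carries the same five keys as a cell match (an assumption, not something the engine schema is confirmed to guarantee); both function names are hypothetical.

```python
MATCH_KEYS = ("hw", "variant", "quant", "strategy", "nodes")


def orphan_benchmarks(cells, benchmarks):
    """Return the match dicts of benchmark entries that no cell declares."""
    cell_keys = {tuple(c["match"].get(k) for k in MATCH_KEYS) for c in cells}
    return [
        b["match"]
        for b in benchmarks
        if tuple(b["match"].get(k) for k in MATCH_KEYS) not in cell_keys
    ]


def unknown_accuracy_keys(accuracy_labels, *accuracy_maps):
    """Return keys used in any accuracy map that accuracy_labels does not define."""
    used = set()
    for mapping in accuracy_maps:
        used.update(mapping)
    return sorted(used - set(accuracy_labels))
```

Pass defaultAccuracy, each per-cell accuracy, and benchmarkCommands.accuracy as the accuracy_maps arguments; any non-empty result is a key that will not render.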
5. Doc ↔ config parity (the #1 finding)
- Any sglang serve command shown in MDX prose (config tips, benchmark section) must equal what the engine emits from the corresponding cell: same flags, same order. Drift here is the most common review miss.
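One way to test "same flags, same order" without being fooled by line wrapping is to compare token lists instead of raw strings. A minimal sketch, assuming the reviewer has copied out both the prose command and the engine-rendered command as text (commands_match is a hypothetical helper):

```python
import shlex


def command_tokens(command):
    """Tokenize a shell command, joining backslash-continued lines first."""
    return shlex.split(command.replace("\\\n", " "))


def commands_match(prose_command, engine_command):
    """True if both commands have the same tokens in the same order."""
    return command_tokens(prose_command) == command_tokens(engine_command)
```

Because the comparison is ordered, a prose command that lists the right flags in a different sequence still fails, which matches the checklist's intent.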
6. Commands / port
- Launch uses sglang serve: flag any python -m sglang.launch_server / python3 -m sglang.launch_server (deprecated). The engine already emits sglang serve; guard against prose/cells reintroducing the old launcher.
- Port 30000 everywhere (launch, curl, client base_url, bench); flag 8000. Launch port must match client/curl port on the same page.
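Both command rules can be screened line by line. The sketch below is illustrative only: the port regex covers a few common spellings (--port N, localhost:N, 127.0.0.1:N, 0.0.0.0:N) and is an assumption about how ports appear on a page, so a clean result does not prove there is no stray port.

```python
import re

OLD_LAUNCHER_RE = re.compile(r"python3?\s+-m\s+sglang\.launch_server")
# Numeric ports only; {{PORT}} placeholders are intentionally not matched.
PORT_RE = re.compile(r"(?:--port[ =]|localhost:|127\.0\.0\.1:|0\.0\.0\.0:)(\d+)")


def command_findings(text, expected_port="30000"):
    """Return (line_number, message) pairs for launcher and port problems."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if OLD_LAUNCHER_RE.search(line):
            findings.append((lineno, "deprecated launcher; use sglang serve"))
        for port in PORT_RE.findall(line):
            if port != expected_port:
                findings.append((lineno, f"port {port}, expected {expected_port}"))
    return findings
```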
7. Frontmatter
- Every new MDX page has title: and a top-level description: (a real one-line value prop, not copied from another vendor), NOT metatags.description (non-canonical; the top-level field is what renders as the subtitle and SEO meta; see mintlify-authoring).
- No mode: wide on a model page: it hides the right-hand "On this page" ToC that every other model page has. Leave mode unset (the Deploy/Playground panels self-cap at 900px, so the default column holds them fine). mode: wide belongs only on category intro.mdx grids.
- tag: NEW only for genuine new launches; when one is added, stale tag: NEW on older pages should be dropped in the same PR (grep -RlE "^tag: NEW" docs_new/cookbook/).
- MDX imports BOTH Deployment and Playground from /src/snippets/... (absolute).
- Deploy heading slugs to deployment (or deploy), Playground to playground, so "↑ Switch base" and "Open the Playground →" scroll. No numbered headings for these two.
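The title, description, and mode rules can be checked with a small frontmatter reader. This sketch deliberately avoids a YAML dependency and only looks at unindented key: value lines, which is enough to tell a top-level description from a nested metatags.description; frontmatter_findings is a hypothetical name.

```python
def frontmatter_findings(mdx_text, is_category_intro=False):
    """Return findings for the frontmatter rules on one MDX page."""
    lines = mdx_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["missing frontmatter block"]
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return ["unterminated frontmatter block"]

    # Collect only top-level keys; indented lines belong to nested mappings.
    top_level = {}
    for line in lines[1:end]:
        if line and not line[0].isspace() and ":" in line:
            key, _, value = line.partition(":")
            top_level[key.strip()] = value.strip().strip("\"'")

    findings = []
    for key in ("title", "description"):
        if not top_level.get(key):
            findings.append(f"missing top-level {key}")
    if top_level.get("mode") == "wide" and not is_category_intro:
        findings.append("mode: wide on a model page hides the ToC")
    return findings
```

Whether the description is a real value prop or was copied from another vendor still needs a human read; the sketch only confirms the field exists at the top level.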
8. Navigation & homepage
- New page → docs_new/docs.json updated: under the right vendor group inside navigation → Cookbook → Autoregressive Models, root-relative, no .mdx: cookbook/<category>/<Vendor>/<Model>.
- Homepage <Card href> in docs_new/cookbook/<category>/intro.mdx points to the vendor's flagship; new vendors get a new <Card> + a logo at docs_new/cards/logos/<vendor>.png: 940×525 RGBA transparent, icon-only (no wordmark), lowercase filename, tracked via git add -f (*.png is gitignored repo-wide). Card order matches the docs.json nav order.
- Don't change docs_new/cookbook/intro.mdx for individual model adds (top-level only).
9. Links & factual
- HuggingFace URLs resolve to a real model. License section matches the actual HF license (don't copy from another model). Docker images from lmsysorg/sglang; no sgl-project-dev. The image tag is a real build (a release the author ran, or :dev/nightly), not a guessed version.
- Internal links root-relative, no extension (/cookbook/.../<Model>); flag .md / .mdx or ../-relative links. docs.sglang.io is canonical.
- No Google-Drive image links (don't render). Shell placeholders are export VAR=<value>, not ${VAR} (a bash no-op).
- Parser ids must exist in the code registries on the PR's target branch: every --reasoning-parser X / --tool-call-parser Y named in prose or in playgroundFeatures.parsers flags is a registered key in python/sglang/srt/parser/reasoning_parser.py (DetectorMap) / python/sglang/srt/function_call/function_call_parser.py (ToolCallParserEnum). Prose naming a near-miss id (e.g. the reasoning id where the tool id differs) is a factual bug. --…-parser auto is acceptable ONLY if the template-detection rules (python/sglang/srt/managers/template_detection.py) actually resolve THIS model's chat template to the right parser; no rule match means auto silently disables the parser. When in doubt require explicit ids (the DSv4 page pins explicit ids).
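The link-shape rules (root-relative, no extension, no Google-Drive images) can be pre-screened with a regex over markdown links. A sketch only: it sees [text](target) and ![alt](target) forms, not JSX href attributes, and link_findings is a hypothetical helper.

```python
import re

# Captures an optional leading "!" (image) and the link target.
LINK_RE = re.compile(r"(!?)\[[^\]]*\]\(([^)\s]+)\)")


def link_findings(mdx_text):
    """Return findings for internal-link shape and Google Drive image links."""
    findings = []
    for bang, target in LINK_RE.findall(mdx_text):
        if target.startswith(("http://", "https://")):
            if bang and "drive.google.com" in target:
                findings.append(f"{target}: Google Drive image link")
            continue
        if target.startswith(("#", "mailto:")):
            continue
        path = target.split("#", 1)[0]
        if not path.startswith("/"):
            findings.append(f"{target}: not root-relative")
        if path.endswith((".md", ".mdx")):
            findings.append(f"{target}: drop the file extension")
    return findings
```

Whether a HuggingFace URL resolves, and whether a parser id is registered, still has to be checked against the live site and the target branch respectively.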
9b. MDX authoring (Mintlify) — detail in cookbook-add-model/references/mintlify-authoring.md
- Forbidden syntax: no Docusaurus admonitions (:::), @site / @theme, GitHub alert blocks (> [!NOTE]), markdown pipe tables (use JSX <table>), inline <details>, or unknown components. <CardGroup> / <Card> only on category intro.mdx, not model pages.
- Code fences are labeled (e.g. python Example / bash Command / text Output after the opening fence); a fenced block nested inside another uses four backticks outside.
- §3 commands and outputs are collapsible (DeepSeek-V4 pattern): every runnable example wrapped in an <Accordion>, its real output in a following <Accordion title="Example Output"> (Pending update... only with user acknowledgement). Flag bare/inline example blocks and **Output Example:** headings.
- Reasoning-parser example matches the parser's output shape: separate-field (reasoning_content + content) vs inline <think> tags parsed out of content.
- No hardcoded sampling params (temperature / top_p) in sample code (SGLang uses generation_config.json defaults); listing them in §1 informationally is fine.
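Most of the forbidden-syntax list can be expressed as patterns. The table below is a rough sketch: the regexes are my own approximations of each construct, they do not skip fenced code blocks (so a page that documents the forbidden syntax inside a fence will be flagged), and "unknown components" is out of scope.

```python
import re

FORBIDDEN_MDX = {
    "Docusaurus admonition": re.compile(r"^:::", re.M),
    "Docusaurus alias import": re.compile(r"@site/|@theme/"),
    "GitHub alert block": re.compile(r"^>\s*\[![A-Z]+\]", re.M),
    # A table separator row such as "| --- | --- |" or "---|---".
    "markdown pipe table": re.compile(r"^\|?[ \t]*:?-{3,}:?[ \t]*\|", re.M),
    "inline <details>": re.compile(r"<details\b"),
    "bold output heading": re.compile(r"\*\*Output Example:\*\*"),
}


def forbidden_syntax(mdx_text):
    """Return the names of forbidden constructs found in the page text."""
    return [name for name, pattern in FORBIDDEN_MDX.items() if pattern.search(mdx_text)]
```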
10. Quantization rules
- NVFP4 checkpoints are Blackwell-only (B200/B300/GB300), never AMD. An AMD FP4 cell is legitimate ONLY when the vendor published an MXFP4 checkpoint for it (e.g. amd/Qwen3.5-397B-A17B-MXFP4 on MI355X). Verify the HF repo resolves; otherwise the AMD FP4 chip must be absent/disabled.
- BF16 / FP8 work on NVIDIA and AMD.
- --kv-cache-dtype fp8_e4m3 in a cell is an accuracy-degrading flag; see §3 (needs explicit maintainer sign-off; default home is Playground/tips).
11. Scope
- Changes match the PR title. Flag global changes hiding behind a platform-specific title (e.g. an "H200 FP8" PR that adds a flag to ALL cells). Unmentioned side-fixes belong in the PR body.
12. Duplicate PRs
- Another open PR for the same model? Flag it; compare completeness; note merge-conflict risk on docs.json + the vendor card; flag a superseded older PR by the same author.
13. Build / validate
cd docs_new
mint validate
mint broken-links
Optional: mint dev for a visual smoke test.
14. Reviewer feedback
- gh api repos/sgl-project/sglang/pulls/<N>/comments: have prior reviewer requests been addressed? Unresolved requested-changes should be flagged.
15. Grammar & spelling
- Check added/changed prose for typos and grammar (e.g. "recommend" vs "recommended"). Flag each with the exact wrong text + correction.
Output
Per file:
- ✅ PASS
- ⚠️ ISSUE: <what>
- 🔴 BLOCK: <what>
Overall: APPROVE / REQUEST CHANGES / BLOCKED
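The checklist does not spell out how per-file verdicts roll up into the overall line. One plausible rule, stated here as an assumption and not as the skill's definition, is "worst severity wins": any BLOCK gives BLOCKED, otherwise any ISSUE gives REQUEST CHANGES, otherwise APPROVE. A sketch with hypothetical helper names:

```python
ICONS = {"ISSUE": "⚠️ ISSUE", "BLOCK": "🔴 BLOCK"}


def overall_recommendation(file_verdicts):
    """Roll up findings; file_verdicts maps path -> list of (severity, message)."""
    severities = {sev for findings in file_verdicts.values() for sev, _ in findings}
    if "BLOCK" in severities:
        return "BLOCKED"
    if "ISSUE" in severities:
        return "REQUEST CHANGES"
    return "APPROVE"


def render_report(file_verdicts):
    """Render the per-file verdicts followed by the overall recommendation."""
    lines = []
    for path, findings in file_verdicts.items():
        lines.append(f"{path}:")
        if not findings:
            lines.append("- ✅ PASS")
        for severity, message in findings:
            lines.append(f"- {ICONS[severity]}: {message}")
    lines.append(f"Overall: {overall_recommendation(file_verdicts)}")
    return "\n".join(lines)
```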