llm-serving-patterns

Covers LLM inference infrastructure, serving frameworks (vLLM, TGI, TensorRT-LLM), quantization techniques, batching strategies, and streaming response patterns. Use when designing LLM serving infrastructure, optimizing inference latency, or scaling LLM deployments.

By melodic-software. Updated 12/27/2025

Install via CLI
npx skills add https://github.com/melodic-software/claude-code-plugins --skill llm-serving-patterns
Repository Details
Stars: 78
Forks: 12
Branch: main
Path: SKILL.md