llm-serving-patterns

Covers LLM inference infrastructure, serving frameworks (vLLM, TGI, TensorRT-LLM), quantization techniques, batching strategies, and streaming response patterns. Use when designing LLM serving infrastructure, optimizing inference latency, or scaling LLM deployments.

By melodic-software | Updated 12/27/2025

Install via CLI

npx skills add https://github.com/melodic-software/claude-code-plugins --skill llm-serving-patterns

Repository Details

Stars: 78

Forks: 12

Branch: main

Path: SKILL.md

Occupations

Computer Network Architects