Llumnix Documentation

Llumnix is a full-stack solution for distributed LLM inference serving, featuring fully dynamic request scheduling for modern LLM deployments.


🚀 Getting Started

New to Llumnix? Start here.

  • Quick Start - Deploy your first Llumnix cluster in minutes

  • Deployment Guide - Full deployment guide covering all modes: neutral, PD, and PD-KVS

  • Benchmark - Performance benchmarks for Llumnix on Kubernetes


📖 User Manual


🛠️ Development Guide

  • Development Setup - Environment setup, build commands, and e2e unit tests guide

  • Build Images - Manually build and push component images


🏗️ Design

Understand how Llumnix works internally.