Dynamo

NVIDIA Dynamo

Dynamo is a new modular inference framework designed for serving large language models (LLMs) in multi-node distributed environments. It enables seamless scaling of inference workloads across GPU nodes and the dynamic allocation of GPU workers to address traffic bottlenecks at various stages of the model pipeline.

This GitHub organization hosts repositories for Dynamo's core components and integrations, including:

Core Framework

Distributed inference runtime with Rust-based orchestration
Python bindings for workflow customization
Multi-GPU/multi-node serving capabilities

LLM Optimized Components

Disaggregated Serving Engine: Decoupling of prefill and decode to optimize for throughput at latency SLOs
Intelligent Routing System: Prefix-based and load-aware request distribution
KV Cache Management: Distributed KV Cache management

NVIDIA Optimized Transfer Library (NIXL)

Abstracts memory of heterogeneous devices, i.e., CPU, GPU, storage, and enables most efficient and low-latency communication among them
Integrates with distributed inference servers such as Dynamo. This library will target distributed inference communication patterns to effectively transfer the KV cache in disaggregated LLM serving platforms.

Getting Started

To learn more about NVIDIA Dynamo Inference Serving Platform, please refer to the Dynamo developer page and read our Quickstart Guide for container setup and basic workflows.

Documentation

User documentation on Dynamo features, APIs, and architecture is located in the Dynamo documents folder on GitHub.

FAQ

Consult the Dynamo FAQ Guide for frequently asked questions and answers.

Contribution & Support

Follow Contribution Guidelines
Report issues via GitHub Discussions
Enterprise support available through NVIDIA AI Enterprise

License

Apache 2.0 licensed with third-party attributions documented in each repository.

Note

This project is currently in alpha stage - APIs and components may evolve based on community feedback

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dynamo

NVIDIA Dynamo

Getting Started

Documentation

FAQ

Contribution & Support

License

Pinned Loading

Repositories

People

Top languages

Most used topics

Uh oh!