Commit 7d88f9c

update trtllm to 1.1.0rc5
Signed-off-by: richardhuo-nv <[email protected]>
1 parent e9cb942 commit 7d88f9c

3 files changed: 5 additions, 8 deletions

container/build.sh

Lines changed: 1 addition & 1 deletion
@@ -98,7 +98,7 @@ TRTLLM_GIT_URL=""
 TENSORRTLLM_INDEX_URL="https://pypi.python.org/simple"
 # TODO: Remove the version specification from here and use the ai-dynamo[trtllm] package.
 # Need to update the Dockerfile.trtllm to use the ai-dynamo[trtllm] package.
-DEFAULT_TENSORRTLLM_PIP_WHEEL="tensorrt-llm==1.1.0rc3"
+DEFAULT_TENSORRTLLM_PIP_WHEEL="tensorrt-llm==1.1.0rc5"
 TENSORRTLLM_PIP_WHEEL=""

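The hunk above shows a pinned default next to an empty `TENSORRTLLM_PIP_WHEEL` override. The following is a minimal sketch of how such a pair is commonly resolved in a shell script, assuming an empty override falls back to the default. The diff does not show how `build.sh` actually combines the two variables, and `resolve_wheel` is a hypothetical helper used only for illustration.

```shell
#!/usr/bin/env bash
# Illustrative only: values copied from the diff, resolution logic assumed.
DEFAULT_TENSORRTLLM_PIP_WHEEL="tensorrt-llm==1.1.0rc5"
TENSORRTLLM_PIP_WHEEL=""

# resolve_wheel is a hypothetical helper, not a function from build.sh.
resolve_wheel() {
    local override="$1"
    # ${var:-default} expands to the default when var is unset or empty.
    printf '%s\n' "${override:-$DEFAULT_TENSORRTLLM_PIP_WHEEL}"
}

# Empty override: the pinned default is selected.
resolve_wheel "$TENSORRTLLM_PIP_WHEEL"

# Explicit override: the caller's value wins.
resolve_wheel "tensorrt-llm==1.1.0rc3"
```

Under this assumption, bumping the default in one place changes what every build without an explicit override installs, which is consistent with this commit touching a single line in `container/build.sh`.
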
docs/guides/run_kvbm_in_trtllm.md

Lines changed: 3 additions & 7 deletions
@@ -27,7 +27,7 @@ To learn what KVBM is, please check [here](https://docs.nvidia.com/dynamo/latest
 > - KVBM only supports TensorRT-LLM’s PyTorch backend.
 > - To enable disk cache offloading, you must first enable a CPU memory cache offloading.
 > - Disable partial reuse `enable_partial_reuse: false` in the LLM API config’s `kv_connector_config` to increase offloading cache hits.
-> - KVBM requires TensorRT-LLM at commit ce580ce4f52af3ad0043a800b3f9469e1f1109f6 or newer.
+> - KVBM requires TensorRT-LLM v1.1.0rc5 or newer.
 > - Enabling KVBM metrics with TensorRT-LLM is still a work in progress.

 ## Quick Start
@@ -38,12 +38,8 @@ To use KVBM in TensorRT-LLM, you can follow the steps below:
 # start up etcd for KVBM leader/worker registration and discovery
 docker compose -f deploy/docker-compose.yml up -d

-# Build a container that includes TensorRT-LLM and KVBM. Note: KVBM integration is only available in TensorRT-LLM commit dcd110cfac07e577ce01343c455917832b0f3d5e or newer.
-# When building with the --tensorrtllm-commit option, you may notice that https://github.com keeps prompting for a username and password.
-# This happens because cloning TensorRT-LLM can hit GitHub’s rate limit.
-# To work around this, you can keep pressing "Enter" or "Return.".
-# Setting "export GIT_LFS_SKIP_SMUDGE=1" may also reduce the number of prompts.
-./container/build.sh --framework trtllm --tensorrtllm-commit dcd110cfac07e577ce01343c455917832b0f3d5e --enable-kvbm
+# Build a container that includes TensorRT-LLM and KVBM.
+./container/build.sh --framework trtllm --enable-kvbm

 # launch the container
 ./container/run.sh --framework trtllm -it --mount-workspace --use-nixl-gds

lib/bindings/python/Cargo.lock

Lines changed: 1 addition & 0 deletions
Some generated files are not rendered by default.
