Skip to content

Conversation

richardhuo-nv
Copy link
Contributor

@richardhuo-nv richardhuo-nv commented Sep 18, 2025

Overview:

Update the default trtllm to v1.1.0rc5, to be able to have the trtllm + KVBM integration by default

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • Documentation
    • Updated references to TensorRT-LLM 1.1.0rc5 across examples and support matrix.
    • Simplified KVBM guide with a single build command and version-based requirement (v1.1.0rc5+), removing commit pinning and rate-limit workarounds.
  • Chores
    • Bumped default TensorRT-LLM dependency to 1.1.0rc5, including optional dependency and build defaults.
    • Updated default commit used for build selection when not explicitly provided.

Copy link

copy-pr-bot bot commented Sep 18, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Copy link
Contributor

coderabbitai bot commented Sep 18, 2025

Walkthrough

This PR updates TensorRT-LLM references from 1.1.0rc3 to 1.1.0rc5 across docs and config, bumps the default experimental commit hash in the build script, and simplifies the KVBM guide’s build instructions to a single command without commit pinning.

Changes

Cohort / File(s) Summary
Version reference updates
README.md, docs/support_matrix.md, pyproject.toml
Bump TensorRT-LLM version references from 1.1.0rc3 to 1.1.0rc5 in README, support matrix, and optional dependency.
Build defaults (TRT-LLM)
container/build.sh
Update DEFAULT_EXPERIMENTAL_TRTLLM_COMMIT to 0c9430e5a530ba958fc9dca561a3ad865ad9f492 and DEFAULT_TENSORRTLLM_PIP_WHEEL to tensorrt-llm==1.1.0rc5.
KVBM guide simplification
docs/guides/run_kvbm_in_trtllm.md
Switch compatibility note to “v1.1.0rc5 or newer” and replace multi-step, commit-pinned build with single command: ./container/build.sh --framework trtllm --enable-kvbm.

Sequence Diagram(s)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Poem

I nudge my paws from rc3 to five,
A hop in hashes—scripts arrive.
One build command, so clean, so swift,
Docs align with the latest gift.
In burrows of bytes I quietly thrive,
Version carrots fresh—long live rc5! 🥕🐇

Pre-merge checks

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The PR description includes a brief Overview stating the version bump but leaves "Details" and "Where should the reviewer start?" empty and the Related Issues entry uses a placeholder "#xxx", so it does not follow the repository's required template or provide file-level guidance for reviewers. This lack of concrete details and reviewer pointers makes the description insufficient. Please populate "Details" with a concise summary of the changes made in each modified file (for example container/build.sh, pyproject.toml, README.md, and docs/*), fill "Where should the reviewer start?" with the key files to inspect, and replace the placeholder issue number with the actual issue or remove it before merging.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The PR title succinctly and accurately describes the primary change: bumping trtllm to v1.1.0rc5 to enable trtllm + KVBM integration, which matches the changes across container/build.sh, pyproject.toml, README and docs. It is clear and specific enough for a teammate scanning history.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Signed-off-by: richardhuo-nv <[email protected]>
Signed-off-by: richardhuo-nv <[email protected]>
Signed-off-by: richardhuo-nv <[email protected]>
@richardhuo-nv richardhuo-nv force-pushed the rihuo/update_trtllm_0918 branch from 97ec7f6 to c62a71e Compare September 18, 2025 20:19
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (10)
docs/support_matrix.md (1)

51-53: Call out Python 3.12 constraint for KVBM more explicitly.

You already note KVBM is Python 3.12-only. Consider adding “and thus requires Ubuntu 24.04 image” in this file to reduce user footguns when following the guide.

README.md (2)

200-204: Container version mapping example updated correctly to rc5.

LGTM. Consider adding a brief note that future TRT-LLM releases may change the required NGC base tag and to always check BASE_TAG in Dockerfile.multi.


208-217: Optional: surface cuda-python pin in package extras instead of README-only.

You pin cuda-python in the README instructions; moving this into the trtllm extra would make pip installs more reproducible.

Proposed change in pyproject.toml shown in that file’s comment.

pyproject.toml (1)

49-52: Add cuda-python pin to trtllm extra to match README prerequisites.

Keeps installations consistent across docs and packaging.

Apply:

 trtllm =[
     "uvloop",
-    "tensorrt-llm==1.1.0rc5",
+    "tensorrt-llm==1.1.0rc5",
+    "cuda-python>=12,<13",
 ]
container/build.sh (3)

92-96: Document which TRT-LLM tag rc5 commit this corresponds to.

Add a short comment tying DEFAULT_EXPERIMENTAL_TRTLLM_COMMIT to the rc5 cut to aid future bumps.

Apply:

-DEFAULT_EXPERIMENTAL_TRTLLM_COMMIT="0c9430e5a530ba958fc9dca561a3ad865ad9f492"
+DEFAULT_EXPERIMENTAL_TRTLLM_COMMIT="0c9430e5a530ba958fc9dca561a3ad865ad9f492"
+# ^ Commit known-good for TensorRT-LLM v1.1.0rc5 (used when --use-default-experimental-tensorrtllm-commit is passed)

101-103: Guard against providing both --tensorrtllm-commit and --tensorrtllm-pip-wheel.

Right now, if both are set (without the “use default” flag), commit is silently ignored. Fail fast to avoid confusion.

Apply near option parsing completion (after get_options) or at the start of the TRTLLM block:

 if [[ $FRAMEWORK == "TRTLLM" ]]; then
+    if [ -n "$TRTLLM_COMMIT" ] && [ -n "$TENSORRTLLM_PIP_WHEEL" ]; then
+        echo "ERROR: Provide only one of --tensorrtllm-commit or --tensorrtllm-pip-wheel, not both."
+        exit 1
+    fi

540-575: Minor: echo chosen default when neither commit nor wheel is provided.

Helps users see that rc5 wheel path was selected.

Apply:

     if [ -z "$TENSORRTLLM_PIP_WHEEL" ] && [ -z "$TRTLLM_COMMIT" ]; then
         TENSORRTLLM_PIP_WHEEL="$DEFAULT_TENSORRTLLM_PIP_WHEEL"
+        echo "INFO: Neither --tensorrtllm-commit nor --tensorrtllm-pip-wheel provided; using default: ${TENSORRTLLM_PIP_WHEEL}"
     fi
docs/guides/run_kvbm_in_trtllm.md (3)

20-22: Fix typo: “mange” → “manage”.

Small doc polish.

Apply:

-This guide explains how to leverage KVBM (KV Block Manager) to mange KV cache and do KV offloading in TensorRT-LLM (trtllm).
+This guide explains how to leverage KVBM (KV Block Manager) to manage KV cache and do KV offloading in TensorRT-LLM (trtllm).

41-46: Nice simplification to a single build command; add Python 3.12 reminder inline.

Since KVBM is Python 3.12-only (per support matrix), add a one-liner so users don’t miss it.

Apply:

-# Build a container that includes TensorRT-LLM and KVBM.
+# Build a container that includes TensorRT-LLM and KVBM.
+# Note: KVBM requires Python 3.12 (Ubuntu 24.04 base).
 ./container/build.sh --framework trtllm --enable-kvbm

86-97: Fix typos in example request payload.

A few misspellings in the sample prompt.

Apply:

-        "content": "In the heart of Eldoria, an ancient land of boundless magic and mysterious creatures, lies the long-forgotten city of Aeloria. Once a beacon of knowledge and power, Aeloria was buried beneath the shifting sands of time, lost to the world for centuries. You are an intrepid explorer, known for your unparalleled curiosity and courage, who has stumbled upon an ancient map hinting at ests that Aeloria holds a secret so profound that it has the potential to reshape the very fabric of reality. Your journey will take you through treacherous deserts, enchanted forests, and across perilous mountain ranges. Your Task: Character Background: Develop a detailed background for your character. Describe their motivations for seeking out Aeloria, their skills and weaknesses, and any personal connections to the ancient city or its legends. Are they driven by a quest for knowledge, a search for lost familt clue is hidden."
+        "content": "In the heart of Eldoria, an ancient land of boundless magic and mysterious creatures, lies the long-forgotten city of Aeloria. Once a beacon of knowledge and power, Aeloria was buried beneath the shifting sands of time. You are an intrepid explorer who has stumbled upon an ancient map hinting that Aeloria holds a secret so profound it could reshape reality. Your task: Develop a detailed character background, motivations, skills and weaknesses, and any personal connections to the city or its legends."
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e9cb942 and c62a71e.

⛔ Files ignored due to path filters (1)
  • lib/bindings/python/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • README.md (1 hunks)
  • container/build.sh (2 hunks)
  • docs/guides/run_kvbm_in_trtllm.md (2 hunks)
  • docs/support_matrix.md (1 hunks)
  • pyproject.toml (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-07-30T00:34:35.810Z
Learnt from: dmitry-tokarev-nv
PR: ai-dynamo/dynamo#2179
File: docs/support_matrix.md:61-63
Timestamp: 2025-07-30T00:34:35.810Z
Learning: In docs/support_matrix.md, the NIXL version difference between runtime dependencies (0.5.0) and build dependencies (0.4.0) is intentional and expected, not an error that needs to be corrected.

Applied to files:

  • docs/support_matrix.md
📚 Learning: 2025-08-28T20:43:54.701Z
Learnt from: bhuvan002
PR: ai-dynamo/dynamo#2702
File: components/backends/trtllm/src/dynamo/trtllm/logits_processing/adapter.py:7-13
Timestamp: 2025-08-28T20:43:54.701Z
Learning: CI import failures for optional dependencies like tensorrt_llm can be resolved by removing __init__.py files and restructuring imports, rather than wrapping imports in try/except blocks.

Applied to files:

  • pyproject.toml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build and Test - dynamo
  • GitHub Check: Build and Test - vllm
  • GitHub Check: Build and Test - sglang
🔇 Additional comments (3)
pyproject.toml (1)

49-52: Pin bump to tensorrt-llm==1.1.0rc5 is correct and minimal.

No issues spotted.

docs/guides/run_kvbm_in_trtllm.md (1)

30-32: Good: explicit minimum version (rc5) for KVBM.

Nice simplification vs. a specific commit hash.

docs/support_matrix.md (1)

70-74: Version bump to TensorRT-LLM rc5 looks good — repo-wide consistency verified.

No matches for 1.1.0rc3; 1.1.0rc5 appears in pyproject.toml, README.md, docs/guides/run_kvbm_in_trtllm.md, docs/support_matrix.md, container/build.sh.

Copy link
Contributor

@rmccorm4 rmccorm4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM assuming gitlab CI TRTLLM tests pass

@richardhuo-nv
Copy link
Contributor Author

LGTM assuming gitlab CI TRTLLM tests pass

https://gitlab-master.nvidia.com/dl/ai-dynamo/dynamo-ci/-/pipelines/35145191
CI passed here

@richardhuo-nv richardhuo-nv merged commit fb29bd5 into main Sep 19, 2025
13 checks passed
@richardhuo-nv richardhuo-nv deleted the rihuo/update_trtllm_0918 branch September 19, 2025 15:39
hhzhang16 pushed a commit that referenced this pull request Sep 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants