-
Notifications
You must be signed in to change notification settings - Fork 612
refactor: simplify Dockerfile.vllm, enable local-dev for all frameworks #3099
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: simplify Dockerfile.vllm, enable local-dev for all frameworks #3099
Conversation
- Move local-dev target to new Dockerfile.local_dev - Add build_local_dev.sh script for converting dev images to local-dev images - Redirect local-dev builds to new dedicated script - Enable local-dev builds for all frameworks via unified approach - Remove complexity by extracting local-dev stage - Update documentation with new two-step build process Signed-off-by: Keiven Chang <[email protected]>
- Update missing image section to reflect local-dev image requirement - Add two-step build process with framework example (vllm) - Show expected output image name (dynamo:latest-vllm-local-dev) - Clarify troubleshooting steps for new build_local_dev.sh workflow Signed-off-by: Keiven Chang <[email protected]>
WalkthroughIntroduces a dedicated local-dev image flow: removes local-dev stage from Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor U as Developer
participant BS as build.sh
participant D as Docker (dev build)
participant BLD as build_local_dev.sh
participant DL as Docker (local-dev build)
U->>BS: ./container/build.sh --framework VLLM --target dev
BS->>D: docker build -f Dockerfile.vllm (target=dev)
D-->>BS: dev image built (e.g., dynamo:latest-vllm)
BS-->>U: Done
U->>BLD: ./container/build_local_dev.sh --dev-image dynamo:latest-vllm
BLD->>DL: docker build -f Dockerfile.local_dev<br/>--build-arg LOCAL_DEV_BASE=...
DL-->>BLD: local-dev image built (e.g., dynamo:latest-vllm-local-dev)
BLD-->>U: Use image in Dev Container
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
Pre-merge checks❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
container/build_local_dev.sh (1)
1-279
: Add SPDX copyright header to container/build_local_dev.shCI "Copyright Checks" failing — no SPDX header found in container/build_local_dev.sh (rg for '^# SPDX-FileCopyrightText:' returned no matches).
Add the SPDX lines at the top (immediately after the shebang), e.g.:SPDX-FileCopyrightText: 2025
SPDX-License-Identifier: Apache-2.0
🧹 Nitpick comments (8)
container/build.sh (1)
472-485
: Make the guidance path-robust; print to stderr and use a usage‑style exit code.If users invoke this script from repo root (
container/build.sh
), the printed commands (./build.sh
,./build_local_dev.sh
) will be wrong. Prefer emitting paths relative to the script dir and write the note to stderr; exit 2 is conventional for usage errors.-if [[ $TARGET == "local-dev" ]]; then - cat << EOF +if [[ $TARGET == "local-dev" ]]; then + >&2 cat << EOF NOTE: The 'local-dev' target has been moved to a separate build process. Please use the build_local_dev.sh script instead: - # First build a dev image (this creates dynamo:latest-${FRAMEWORK,,}): - ./build.sh --framework $FRAMEWORK --target dev + # First build a dev image (this creates dynamo:latest-${FRAMEWORK,,}): + ${SOURCE_DIR}/build.sh --framework $FRAMEWORK --target dev # Then convert it to local-dev (this creates dynamo:latest-${FRAMEWORK,,}-local-dev) - ./build_local_dev.sh --dev-image dynamo:latest-${FRAMEWORK,,} + ${SOURCE_DIR}/build_local_dev.sh --dev-image dynamo:latest-${FRAMEWORK,,} -For more information, see: ./build_local_dev.sh --help +For more information, see: ${SOURCE_DIR}/build_local_dev.sh --help EOF - exit 1 + exit 2 fi.devcontainer/README.md (2)
97-101
: Fix script name and flag: build_local_dev.sh expects --dev-image, and the filename includes .sh.-./container/build_local_dev --image dynamo:latest-vllm +./container/build_local_dev.sh --dev-image dynamo:latest-vllm
360-361
: Minor: add --framework for consistency with Step 1.Troubleshooting should mirror the two‑step flow; default framework is VLLM but being explicit avoids confusion.
-./container/build.sh --target dev +./container/build.sh --framework VLLM --target devcontainer/Dockerfile.local_dev (3)
70-75
: Harden NVIDIA devtools repo setup: ensure keyrings dir exists; avoid hard‑pinning nsight-systems version.Create
/etc/apt/keyrings
and install the unversioned package to reduce breakage on repo updates. Consider making version an ARG if pinning is required.-RUN wget -qO - https://developer.download.nvidia.com/devtools/repos/ubuntu2404/${ARCH}/nvidia.pub | gpg --dearmor -o /etc/apt/keyrings/nvidia-devtools.gpg && \ - echo "deb [signed-by=/etc/apt/keyrings/nvidia-devtools.gpg] https://developer.download.nvidia.com/devtools/repos/ubuntu2404/${ARCH} /" | tee /etc/apt/sources.list.d/nvidia-devtools.list && \ - apt-get update && \ - apt-get install -y nsight-systems-2025.5.1 +RUN install -d -m 0755 /etc/apt/keyrings && \ + wget -qO - https://developer.download.nvidia.com/devtools/repos/ubuntu2404/${ARCH}/nvidia.pub | gpg --dearmor -o /etc/apt/keyrings/nvidia-devtools.gpg && \ + echo "deb [signed-by=/etc/apt/keyrings/nvidia-devtools.gpg] https://developer.download.nvidia.com/devtools/repos/ubuntu2404/${ARCH} /" > /etc/apt/sources.list.d/nvidia-devtools.list && \ + apt-get update && \ + apt-get install -y nsight-systems
99-103
: Guard Rust toolchain copy to avoid failing when toolchains aren’t present in base image.-RUN rsync -a --chown=$USER_UID:$USER_GID /usr/local/rustup/ $RUSTUP_HOME/ - -RUN rsync -a --chown=$USER_UID:$USER_GID /usr/local/cargo/ $CARGO_HOME/ +RUN if [ -d /usr/local/rustup ]; then rsync -a --chown=$USER_UID:$USER_GID /usr/local/rustup/ "$RUSTUP_HOME/"; else echo "rustup not found; skipping copy"; fi +RUN if [ -d /usr/local/cargo ]; then rsync -a --chown=$USER_UID:$USER_GID /usr/local/cargo/ "$CARGO_HOME/"; else echo "cargo not found; skipping copy"; fi
64-69
: Prefer ‘command -v’ over ‘which’; keep layer strictness.Also avoid swallowing failures silently; the current echo can hide real issues.
-RUN (apt-get install -y gawk || \ - apt-get install -y mawk || \ - apt-get install -y original-awk || \ - echo "Warning: Could not install any awk implementation") && \ - (which awk && echo "awk successfully installed: $(which awk)" || echo "awk not available") +RUN (apt-get install -y gawk || apt-get install -y mawk || apt-get install -y original-awk || \ + (echo "Warning: Could not install any awk implementation" && true)) && \ + (command -v awk >/dev/null && echo "awk installed: $(command -v awk)" || echo "awk not available")container/build_local_dev.sh (2)
73-109
: Image listing UX is solid. Optional: filter duplicates and sort by recency.Non-blocking improvement: sort by created time and uniq on image:tag to reduce noise.
139-153
: Deriving output tag is correct; consider preserving repository when non-‘dynamo’ repos are used.If base image is
myrepo/dynamo:tag
, defaulting todynamo:tag-local-dev
might be surprising. Optional: keep repo name by default.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
.devcontainer/README.md
(2 hunks)container/Dockerfile.local_dev
(1 hunks)container/Dockerfile.vllm
(0 hunks)container/README.md
(4 hunks)container/build.sh
(1 hunks)container/build_local_dev.sh
(1 hunks)
💤 Files with no reviewable changes (1)
- container/Dockerfile.vllm
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2797
File: container/Dockerfile:437-449
Timestamp: 2025-08-30T20:43:49.632Z
Learning: In the dynamo project's devcontainer setup, the team prioritizes consistency across framework-specific Dockerfiles (like container/Dockerfile, container/Dockerfile.vllm, etc.) by mirroring their structure, even when individual optimizations might be possible, to maintain uniformity in the development environment setup.
Learnt from: keivenchang
PR: ai-dynamo/dynamo#3035
File: lib/runtime/src/metrics/prometheus_names.rs:49-53
Timestamp: 2025-09-16T00:26:37.092Z
Learning: keivenchang prefers consistency in metric naming standardization over strict adherence to Prometheus conventions about gauge vs counter suffixes. When standardizing metrics naming, prioritize consistency across the codebase rather than technical pedantry about individual metric type conventions.
📚 Learning: 2025-09-03T01:10:12.599Z
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2822
File: container/Dockerfile.vllm:343-352
Timestamp: 2025-09-03T01:10:12.599Z
Learning: In the dynamo project's local-dev Docker targets, USER_UID and USER_GID build args are intentionally left without default values to force explicit UID/GID mapping during build time, preventing file permission issues in local development environments where container users need to match host user permissions for mounted volumes.
Applied to files:
container/build.sh
.devcontainer/README.md
container/Dockerfile.local_dev
container/README.md
📚 Learning: 2025-08-30T20:43:49.632Z
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2797
File: container/Dockerfile:437-449
Timestamp: 2025-08-30T20:43:49.632Z
Learning: In the dynamo project's devcontainer setup, the team prioritizes consistency across framework-specific Dockerfiles (like container/Dockerfile, container/Dockerfile.vllm, etc.) by mirroring their structure, even when individual optimizations might be possible, to maintain uniformity in the development environment setup.
Applied to files:
.devcontainer/README.md
container/Dockerfile.local_dev
container/README.md
📚 Learning: 2025-08-30T20:43:10.091Z
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2797
File: .devcontainer/devcontainer.json:12-12
Timestamp: 2025-08-30T20:43:10.091Z
Learning: In the dynamo project, devcontainer.json files use templated container names (like "dynamo-vllm-devcontainer") that are automatically processed by the copy_devcontainer.sh script to generate framework-specific configurations with unique names, preventing container name collisions.
Applied to files:
.devcontainer/README.md
container/README.md
📚 Learning: 2025-08-30T20:43:10.091Z
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2797
File: .devcontainer/devcontainer.json:12-12
Timestamp: 2025-08-30T20:43:10.091Z
Learning: In the dynamo project's devcontainer setup, hard-coded container names in devcontainer.json files serve as templates that are automatically processed by the copy_devcontainer.sh script to generate framework-specific configurations with unique names, preventing container name collisions.
Applied to files:
.devcontainer/README.md
🪛 GitHub Actions: Copyright Checks
container/build_local_dev.sh
[error] 1-1: Copyright header check failed during the pwsh /workspace/.github/workflows/copyright-check.ps1 step. File 'container/build_local_dev.sh' has an invalid/missing header.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Build and Test - vllm
- GitHub Check: Build and Test - sglang
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (7)
container/Dockerfile.local_dev (1)
126-127
: Confirm ENTRYPOINT exists in all dev images.If
/opt/nvidia/nvidia_entrypoint.sh
isn’t present in every base dev image, containers will fail to start. Consider keeping the base image’s ENTRYPOINT or add a lightweight wrapper script.container/build_local_dev.sh (1)
200-208
: Nice touch on usage hints.Clear, actionable guidance post-build.
container/README.md (5)
42-43
: Matrix entry reads well.Accurately reflects that local-dev is built via build_local_dev.sh.
110-111
: Good clarification on dev target scope.Docs now clearly separate dev vs local-dev paths.
123-125
: Example uses the correct script and flag.Matches the new CLI (
--dev-image
).
139-165
: Solid section describing build_local_dev.sh.Concise and accurate; no blockers.
218-226
: End-to-end flow is consistent.Two-step build then run with local-dev image is clear.
- Add --target local-dev support to build.sh (builds both dev and local-dev images) - Refactor build_local_dev.sh to use RUN_PREFIX and consistent tag handling - Add --squash option to build_local_dev.sh for smaller images - Update documentation to clarify dev vs local-dev image usage - Improve terminology: dev images run as root, local-dev run as local user - Add guidance on when to use each image type (mounting vs inference) - Streamline examples to show single-command approach with build.sh Signed-off-by: Keiven Chang <[email protected]>
- Add --dev-image option to build from existing dev images - Add --uid and --gid options for custom user permissions - Consolidate duplicate docker build logic with shared function - Add validation to prevent conflicting --target and --dev-image options - Fix tag generation to always add -local-dev suffix - Add dynamic path calculation for usage instructions - Rename LOCAL_DEV_BASE to DEV_BASE for clarity - Remove unused options from build_local_dev.sh - Update terminology from 'converting' to 'building' Signed-off-by: Keiven Chang <[email protected]>
All functionality has been moved to build.sh with improved integration Signed-off-by: Keiven Chang <[email protected]>
- Update all documentation to use build.sh --dev-image instead of build_local_dev.sh - Simplify container README section for local development image builder - Remove references to deleted build_local_dev.sh script - Update example commands to use consolidated build.sh interface Signed-off-by: Keiven Chang <[email protected]>
Signed-off-by: Keiven Chang <[email protected]>
…ks (#3099) Signed-off-by: Keiven Chang <[email protected]>
Overview:
Refactor container build system to enable local-dev images for all frameworks while simplifying Dockerfile.vllm.
Details:
Where should the reviewer start?
container/Dockerfile.local_dev and container/build_local_dev.sh for new infrastructure, then container/Dockerfile.vllm for removed complexity.
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
/coderabbit disable