chore: llama4 multimodal disagg support migration #4213
Conversation
Walkthrough
A shell script was modified to migrate from direct Python script invocations to a unified `dynamo.vllm` CLI interface.
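For context, a minimal before/after sketch of what such a migration looks like (the pre-migration script path and the `python -m dynamo.vllm` entrypoint are assumptions; only the subcommand and template flags are confirmed by the review below):

```bash
# Before: each component launched via its own example script (path hypothetical)
python examples/multimodal/processor.py --mm-prompt-template "$TEMPLATE" &

# After: the same component launched through the unified dynamo.vllm CLI
python -m dynamo.vllm --multimodal-processor --mm-prompt-template "$TEMPLATE" &
```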
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
Pre-merge checks
❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
examples/backends/vllm/launch/disagg_multimodal_llama.sh (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-07-03T10:14:30.570Z
Learnt from: fsaady
Repo: ai-dynamo/dynamo PR: 1730
File: examples/sglang/slurm_jobs/scripts/worker_setup.py:230-244
Timestamp: 2025-07-03T10:14:30.570Z
Learning: In examples/sglang/slurm_jobs/scripts/worker_setup.py, background processes (like nats-server, etcd) are intentionally left running even if later processes fail. This design choice allows users to manually connect to nodes and debug issues without having to restart the entire SLURM job from scratch, providing operational flexibility for troubleshooting in cluster environments.
Applied to files:
examples/backends/vllm/launch/disagg_multimodal_llama.sh
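A minimal sketch of the design choice described in that learning, assuming a bash setup script (the `setup_worker` step is hypothetical; nats-server and etcd come from the note itself):

```bash
# Start shared infrastructure in the background with no cleanup trap:
# if a later step fails, nats-server and etcd deliberately stay up so the
# node can be inspected without resubmitting the whole SLURM job.
nats-server &
etcd &

if ! setup_worker; then   # hypothetical later step that may fail
    echo "worker setup failed; infra left running for debugging" >&2
    exit 1
fi
```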
🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/4213/merge) by ayushag-nv.
examples/backends/vllm/launch/disagg_multimodal_llama.sh
[error] 1-1: Command: pre-commit run --show-diff-on-failure --color=always --all-files failed. Trailing whitespace was detected and the hook modified the file: examples/backends/vllm/launch/disagg_multimodal_llama.sh. Rerun the pre-commit checks locally (e.g., 'pre-commit run --all-files') to verify all issues are resolved.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (3)
examples/backends/vllm/launch/disagg_multimodal_llama.sh (3)
18-31: Improved help documentation. The expanded help message with usage examples and clearer structure is good for user guidance. The examples clearly show how to use the script on head and worker nodes, as in the sketch below.
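A sketch of the structure being praised here (option names and wording are hypothetical, not copied from the script):

```bash
usage() {
    cat <<'EOF'
Usage: disagg_multimodal_llama.sh [head|worker HEAD_IP]

Examples:
  # On the head node: start the processor and encode/prefill worker
  ./disagg_multimodal_llama.sh head

  # On a worker node: start the decode worker, pointing at the head node
  ./disagg_multimodal_llama.sh worker 10.0.0.1
EOF
}
```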
62-63: Wait strategy allows debugging flexibility. The script uses `wait` at the end to allow all background processes to complete. This design (combined with the trap/kill cleanup handler) enables debugging by keeping processes running even if one fails, which is operationally useful in cluster environments. This aligns well with the migration goal and best practices for disaggregated serving.
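A minimal sketch of the trap-plus-wait pattern being described (worker names hypothetical):

```bash
PIDS=()

cleanup() {
    # On exit or Ctrl-C, kill whatever is still running
    kill "${PIDS[@]}" 2>/dev/null || true
}
trap cleanup EXIT INT TERM

launch_worker_a &   # hypothetical background worker
PIDS+=($!)
launch_worker_b &   # hypothetical background worker
PIDS+=($!)

# Block until every background process exits; if one worker dies, the
# others keep running, which leaves the system inspectable for debugging.
wait
```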
51-51: Verification complete: CLI migration to the unified `dynamo.vllm` interface is correctly implemented. All three vLLM subcommands (`--multimodal-processor`, `--multimodal-encode-prefill-worker`, `--multimodal-decode-worker`) are properly defined in components/src/dynamo/vllm/args.py, with correct flag mappings (`--mm-prompt-template`, `--is-prefill-worker`, `--tensor-parallel-size`, `--max-model-len`, `--gpu-memory-utilization`). The handler routing in main.py correctly dispatches each subcommand to its corresponding handler (ProcessorHandler, EncodeWorkerHandler, MultimodalDecodeWorkerHandler). The script usage at lines 51, 56, and 59 aligns with the implementation.
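Putting the confirmed flags together, the three launches in the script plausibly look like this (the `python -m dynamo.vllm` entrypoint and all argument values are assumptions; the flag names are the ones verified above):

```bash
# Processor: applies the multimodal prompt template
python -m dynamo.vllm --multimodal-processor \
    --mm-prompt-template "USER: <image>\n<prompt> ASSISTANT:" &

# Encode + prefill worker: vision encoding plus the prefill phase
python -m dynamo.vllm --multimodal-encode-prefill-worker \
    --is-prefill-worker \
    --tensor-parallel-size 4 \
    --max-model-len 8192 \
    --gpu-memory-utilization 0.90 &

# Decode worker: generates tokens from the transferred KV cache
python -m dynamo.vllm --multimodal-decode-worker \
    --tensor-parallel-size 4 &

wait
```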
Signed-off-by: ayushag <[email protected]>
Force-pushed from b28402a to e981930 (Compare)
/ok to test 0aba0ef
Signed-off-by: ayushag <[email protected]>
Signed-off-by: Daiyaan <[email protected]>
Overview:
This PR migrates llama4 multimodal disagg support from the examples directory to the vLLM components library.
Key Changes:
Details:
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit
- Documentation
- Refactor