chore: remove duplicate should_stop_processing check #6242
Conversation
Signed-off-by: junq <[email protected]>
Walkthrough: The update modifies three executor loop methods in `tensorrt_llm/_torch/pyexecutor/py_executor.py`, changing their loop conditions from a `should_stop_processing` test to `while True:` with an in-body break.
Actionable comments posted: 0
♻️ Duplicate comments (2)
tensorrt_llm/_torch/pyexecutor/py_executor.py (2)
814-821: Same late-exit pattern as above

The `_executor_loop` body repeats the "do work → then break" sequence. Refactor here as well to avoid the extra `_fetch_new_requests()` after shutdown.
958-965: Ditto for overlap loop

For consistency and to prevent any accidental blocking, apply the pre-check arrangement in `_executor_loop_overlap` too.
🧹 Nitpick comments (1)
tensorrt_llm/_torch/pyexecutor/py_executor.py (1)
656-663: Loop now runs one extra bookkeeping step after shutdown – double-check queue blocking behaviour

Switching to `while True:` makes the loop enter once even when `should_stop_processing` is already `True`. Normally this is harmless, but `_fetch_new_requests()` might:

- Touch CUDA / NCCL state (extra work), and
- Potentially block on the request queue (implementation-dependent).

If the queue's `fetch_new_requests()` performs a blocking `get()` with no timeout, the executor could hang on the final iteration. Consider guarding the expensive / blocking part first:

```diff
 while True:
-    profile_step()
-    if self.enable_iter_perf_stats:
-        iter_start_time = time.time()
-    new_requests = self._fetch_new_requests()
-    if self.should_stop_processing:
-        break
+    if self.should_stop_processing:
+        break
+    profile_step()
+    if self.enable_iter_perf_stats:
+        iter_start_time = time.time()
+    new_requests = self._fetch_new_requests()
```

Same pattern applies to the other executor loops.
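To make the reviewer's concern concrete, here is a minimal, self-contained sketch of the two loop arrangements. The class and method names (`MiniExecutor`, `loop_late_exit`, `loop_pre_check`) are illustrative stand-ins, not the real `PyExecutor` API; the real `_fetch_new_requests()` may behave differently.

```python
import queue


class MiniExecutor:
    """Toy stand-in for the executor loop; names mimic the reviewed
    code but are assumptions, not the real PyExecutor interface."""

    def __init__(self):
        self.request_queue = queue.Queue()
        self.should_stop_processing = False
        self.iterations = 0

    def _fetch_new_requests(self):
        # A blocking get() with no timeout here is what the review
        # warns about; a short timeout keeps this sketch from hanging.
        try:
            return [self.request_queue.get(timeout=0.05)]
        except queue.Empty:
            return []

    def loop_late_exit(self):
        # Arrangement in the PR: do the work first, check afterwards,
        # so even an already-stopped executor performs one more fetch.
        while True:
            self.iterations += 1
            self._fetch_new_requests()
            if self.should_stop_processing:
                break

    def loop_pre_check(self):
        # Reviewer-suggested arrangement: test the stop flag before
        # any bookkeeping or queue access.
        while True:
            if self.should_stop_processing:
                break
            self.iterations += 1
            self._fetch_new_requests()


ex = MiniExecutor()
ex.should_stop_processing = True  # shutdown already requested
ex.loop_late_exit()
print(ex.iterations)   # -> 1 (one extra fetch after shutdown)

ex2 = MiniExecutor()
ex2.should_stop_processing = True
ex2.loop_pre_check()
print(ex2.iterations)  # -> 0 (no work once stop is requested)
```

The extra iteration is cheap when the fetch is non-blocking, which is why the reviewer frames this as a behaviour to double-check rather than a definite bug.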
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tensorrt_llm/_torch/pyexecutor/py_executor.py (3 hunks)
🧠 Learnings (1)
tensorrt_llm/_torch/pyexecutor/py_executor.py (1)
Learnt from: amitz-nv
PR: #5616
File: tensorrt_llm/executor/worker.py:375-384
Timestamp: 2025-07-17T09:01:27.374Z
Learning: In tensorrt_llm/executor/worker.py, the LoRA adapter cache optimization logic that checks is_adapter_in_cpu_cache()
and conditionally passes None for weights/config has a known race condition issue that cannot be solved with simple error handling or verification checks. This is a known limitation that requires a more comprehensive solution.
/bot run
PR_Github #12510 [ run ] triggered by Bot
PR_Github #12510 [ run ] completed with state
/bot run
PR_Github #12522 [ run ] triggered by Bot
PR_Github #12522 [ run ] completed with state
/bot run
PR_Github #12556 [ run ] triggered by Bot
PR_Github #12556 [ run ] completed with state
/bot run
PR_Github #12569 [ run ] triggered by Bot
PR_Github #12569 [ run ] completed with state
Signed-off-by: junq <[email protected]> Signed-off-by: Shreyas Misra <[email protected]>
Signed-off-by: junq <[email protected]> Signed-off-by: Ransiki Zhang <[email protected]>
Signed-off-by: junq <[email protected]> Signed-off-by: Lanyu Liao <[email protected]>
There is already a check inside the while loop:
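The duplication the PR title refers to can be sketched as follows; this is a hypothetical reduction (illustrative names, not the real `py_executor.py` code), showing an outer loop condition and an in-body break both testing the same flag, with the PR keeping only the inner check.

```python
class Executor:
    """Toy model of the duplicate-check removal; names are assumptions."""

    def __init__(self, steps_until_stop):
        self.steps_until_stop = steps_until_stop
        self.should_stop_processing = False
        self.processed = 0

    def _fetch_new_requests(self):
        self.processed += 1
        if self.processed >= self.steps_until_stop:
            self.should_stop_processing = True

    def run(self):
        # Before the PR, the flag was tested twice per iteration:
        #   while not self.should_stop_processing:   <- outer check
        #       self._fetch_new_requests()
        #       if self.should_stop_processing:      <- inner check
        #           break
        # After the PR, only the inner check remains:
        while True:
            self._fetch_new_requests()
            if self.should_stop_processing:
                break


ex = Executor(steps_until_stop=3)
ex.run()
print(ex.processed)  # -> 3
```

Because the inner check runs after the work of each iteration, removing the outer condition does not change how many requests are processed once the loop is running; the review comments above concern only the very first iteration after a shutdown request.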