Commit 899b74c

[None][doc] Fix blog4 typo (#6612)
Signed-off-by: Enwei Zhu <[email protected]>
1 parent 6a3a921 commit 899b74c

1 file changed: +1 -1 lines changed

docs/source/blogs/tech_blog/blog4_Scaling_Expert_Parallelism_in_TensorRT-LLM.md

Lines changed: 1 addition & 1 deletion
@@ -503,7 +503,7 @@ Let's use some representative workloads to illustrate the performance impact wit
 </div>
 <p align="center"><sub><em>Figure 24: EP impact over MoE Group GEMM and EP communication</em></sub></p>
 In Figure 24, it can be observed that by increasing the EP size from 4 to 72, the MoE Group GEMM computation time gets reduced, while the EP communication time (for EP4/EP8 Reduce/Scatter is used, while for EP>8 All2All is used) stays almost constant.
-When the EP size increases from 18 to 32, the speed-up diminishes. We are working on optimizing it.
+When the EP size increases from 18 to 72, the speed-up diminishes. We are working on optimizing it.
 
 Next, let's use some representative workloads to understand the performance impact with EPLB.
 <div align="center">
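
Reviewer note: the corrected sentence (EP 18 to 72) is consistent with the paragraph before it, which says GEMM time shrinks with EP size while communication time stays almost constant. A minimal toy model of that reasoning is sketched below. Everything in it is an assumption for illustration only: the function names, the idea that GEMM time scales as 1/EP, and the time values are hypothetical and are not taken from Figure 24 or from TensorRT-LLM.

```python
# Toy model (hypothetical numbers, not measurements from the blog):
# GEMM time is assumed to shrink as 1/EP, communication time is assumed constant.

def step_time(ep_size, gemm_time_at_ep4=8.0, comm_time=1.0):
    """Modeled time of one MoE step at a given EP size (arbitrary units)."""
    gemm_time = gemm_time_at_ep4 * 4 / ep_size  # assumed ideal 1/EP scaling
    return gemm_time + comm_time                # constant communication term


def scaling_efficiency(ep_from, ep_to):
    """Achieved speed-up divided by the ideal speed-up (ep_to / ep_from)."""
    achieved = step_time(ep_from) / step_time(ep_to)
    ideal = ep_to / ep_from
    return achieved / ideal


if __name__ == "__main__":
    for ep_from, ep_to in [(4, 18), (18, 72)]:
        eff = scaling_efficiency(ep_from, ep_to)
        print(f"EP{ep_from} -> EP{ep_to}: scaling efficiency {eff:.2f}")
```

Under these assumptions the efficiency of the 18 to 72 step is lower than that of the 4 to 18 step, because the constant communication term takes a growing share of the step time as the GEMM term shrinks.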
