[CI][DCP][Perf] reduce DCP CI execution time #29858

pisceskkk · 2025-12-02T09:08:49Z

Based on #29487 (review), I've re-optimized the DCP unit tests (UT) to reduce CI execution time. Following test_dbo.py, the tests now compare DCP against a reference accuracy instead of against a TP run. I also removed tests that seemed unnecessary (I'm not entirely sure about this; they can be added back if needed).
Each test now takes less than two minutes (about 102 s on average):
4 passed, 4 warnings in 409.86s (0:06:49)
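The switch from "DCP vs. TP" to "DCP vs. reference accuracy" described above can be sketched roughly as follows. This is only an illustration of the idea, not the code in this PR: the helper names (`exact_match_accuracy`, `accuracy_within_tolerance`), the tolerance, and the sample numbers are all hypothetical.

```python
# Hypothetical sketch: helper names, tolerance, and numbers are illustrative
# assumptions and are not taken from the PR or from vLLM's test suite.


def exact_match_accuracy(predictions: list[str], answers: list[str]) -> float:
    """Fraction of predictions that exactly match the gold answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p.strip() == a.strip() for p, a in zip(predictions, answers))
    return correct / len(answers)


def accuracy_within_tolerance(
    measured: float, reference: float, rtol: float = 0.05
) -> bool:
    """True if `measured` is at most `rtol` (relative) below `reference`.

    The check is one-sided: scoring above the reference is not a failure.
    """
    return measured >= reference * (1.0 - rtol)


if __name__ == "__main__":
    # Toy data standing in for model outputs on a GSM8K-style benchmark.
    measured = exact_match_accuracy(["18", "3", "70"], ["18", "3", "72"])
    print(f"measured accuracy: {measured:.3f}")
    print("passes:", accuracy_within_tolerance(measured, reference=0.65))
```

With a check of this shape, each test only needs to run the DCP configuration once and compare a single number against a stored reference, rather than also launching a TP baseline to compare outputs against, which is presumably where most of the time saving comes from.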

Additionally, I'm not entirely clear about the model naming conventions in tests/models/registry.py, and I'm unsure what the key for the newly added Qwen/Qwen2.5-1.5B-Instruct model should be.

CC @LucasWilkinson

Signed-off-by: QiuChunshuo <[email protected]>

gemini-code-assist

Code Review

This pull request optimizes the Decode Context Parallel (DCP) CI tests by refactoring them to validate against a reference accuracy score from a GSM8K evaluation, instead of comparing outputs with a Tensor Parallelism (TP) run. This is a significant and effective change to reduce CI execution time. The changes include removing the "dummy" model loading path and streamlining test configurations, which simplifies the test suite. My review found one area for improvement related to cleaning up unused parameters from the refactoring. Regarding your question on tests/models/registry.py, the key "2.5-1.5B" you've chosen for Qwen/Qwen2.5-1.5B-Instruct is descriptive and consistent with conventions in the registry, so it looks appropriate.

[CI][Perf] compare DCP's acc with ref

b31062b

Signed-off-by: QiuChunshuo <[email protected]>

pisceskkk requested review from DarkLight1337 and ywang96 as code owners December 2, 2025 09:08

pisceskkk changed the title ~~[CI][DCP][Optim] reduce DCP CI execution time~~ [CI][DCP][Perf] reduce DCP CI execution time Dec 2, 2025

gemini-code-assist bot reviewed Dec 2, 2025

remove unnecessary tests

8ab0063

Signed-off-by: QiuChunshuo <[email protected]>

pisceskkk force-pushed the dcp-ut-optim branch from 9456bfb to c14f78f Compare December 2, 2025 10:53

[lint]

2964b9e

Signed-off-by: QiuChunshuo <[email protected]>

pisceskkk force-pushed the dcp-ut-optim branch from c14f78f to 2964b9e Compare December 2, 2025 12:25
