
Conversation

Contributor
@qw86972190 commented Nov 27, 2025

Motivation

This PR adds Logprobs support for XPU (Kunlun chip) to the FastDeploy LLM inference engine.

Previously, Logprobs functionality was restricted to CUDA platforms, which prevented users from leveraging advanced sampling features on XPU devices.

Modifications

This PR involves changes across the configuration, the worker logic, and the custom XPU operators.

Usage or Command

python -m fastdeploy.entrypoints.openai.api_server \
    --model /work/PaddlePaddle/ERNIE-4.5-0.3B-Paddle \
    --port 8188 \
    --tensor-parallel-size 1 \
    --max-model-len 32768 \
    --max-num-seqs 128 \
    --quantization "wint8" \
    --gpu-memory-utilization 0.9 \
    --enable-logprob
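Once the server is up, logprobs can be requested through the OpenAI-compatible endpoint. Below is a minimal sketch of such a request payload; the field names (`logprobs`, `top_logprobs`) follow the OpenAI chat-completions convention that the api_server mirrors, and the model name and port are taken from the command above — treat the exact schema as an assumption to verify against the FastDeploy docs.

```python
import json

# Sketch of an OpenAI-style chat completions request asking for logprobs.
# Assumes the server started above listens on localhost:8188 and accepts
# the standard "logprobs"/"top_logprobs" fields (an assumption to verify
# against the FastDeploy OpenAI-compatibility documentation).
payload = {
    "model": "ERNIE-4.5-0.3B-Paddle",
    "messages": [{"role": "user", "content": "Hello"}],
    "logprobs": True,     # return log-probabilities for each sampled token
    "top_logprobs": 5,    # also return the top-5 alternatives per position
}
body = json.dumps(payload)

# To actually send it (requires the running server), something like:
#   curl http://localhost:8188/v1/chat/completions \
#        -H "Content-Type: application/json" -d "$BODY"
print(body)
```

The response then carries a per-token logprobs structure alongside the generated text, which is the output shape this PR wires up for XPU.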

Accuracy Tests

This change affects the Logprobs output structure and platform support; it does not alter the core inference results.

Checklist

  • Add at least one tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code and run pre-commit before committing.
  • Add unit tests. If no unit tests are added, please explain why in this PR.
  • Provide accuracy results.
  • If this PR targets a release branch, make sure it has first been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.


paddle-bot bot commented Nov 27, 2025

Thanks for your contribution!


CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that all contributors sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ qw86972190
❌ root


root does not appear to be a GitHub user. You need a GitHub account to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.


codecov-commenter commented Nov 27, 2025

Codecov Report

❌ Patch coverage is 21.42857% with 11 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@68533eb). Learn more about missing BASE report.

Files with missing lines                              | Patch % | Lines
...tdeploy/model_executor/xpu_pre_and_post_process.py |   0.00% | 7 Missing ⚠️
fastdeploy/output/token_processor.py                  |  40.00% | 2 Missing, 1 Partial ⚠️
fastdeploy/engine/args_utils.py                       |   0.00% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #5279   +/-   ##
==========================================
  Coverage           ?   59.60%           
==========================================
  Files              ?      324           
  Lines              ?    39711           
  Branches           ?     5976           
==========================================
  Hits               ?    23669           
  Misses             ?    14158           
  Partials           ?     1884           
Flag | Coverage Δ
GPU  | 59.60% <21.42%> (?)

Flags with carried forward coverage won't be shown.


DDDivano previously approved these changes Nov 28, 2025
gongshaotian previously approved these changes Nov 28, 2025

Collaborator

@gongshaotian left a comment

LGTM


PD_BUILD_STATIC_OP(get_output_topk)
.Inputs({"x", "scores", "ranks"})
.Attrs({"k: int", "rank_id: int64_t", "wait_flag: bool"})
Collaborator

Is the maximum capped at the K = 5 defined above?

Collaborator

The earlier internal XPU version of this operator defaulted to K=5 and BS=128, and it was migrated here as-is, so its configuration is not fully aligned with the GPU. We can modify the custom operator to align these settings with the GPU and see whether any issues come up.
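For intuition, the per-step top-K logprob selection that this operator family feeds can be modeled in a few lines of plain Python. This is an illustrative sketch of log-softmax plus top-k, not the actual XPU kernel; k=5 mirrors the internal default discussed above.

```python
import math

def topk_logprobs(logits, k=5):
    """Illustrative model of per-step top-k logprob extraction.

    Not the actual XPU kernel: a numerically stable log-softmax over one
    row of logits, then the k highest-probability (token_id, logprob) pairs
    sorted by descending probability.
    """
    m = max(logits)                                   # shift for stability
    z = sum(math.exp(x - m) for x in logits)          # softmax normalizer
    logprobs = [x - m - math.log(z) for x in logits]  # log-softmax
    ranked = sorted(range(len(logits)), key=lambda i: -logprobs[i])[:k]
    return [(i, logprobs[i]) for i in ranked]

# Tiny 4-token vocabulary; token 3 has the highest logit, so it ranks first.
print(topk_logprobs([2.0, 1.0, 0.5, 3.0], k=2))
```

Aligning the operator's K and batch-size defaults with the GPU path would only change the shapes of the buffers this selection writes into, not the selection logic itself.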

@qw86972190 dismissed stale reviews from gongshaotian and DDDivano via 5628413 December 1, 2025 06:16
@jeff41404 left a comment

These are pre-existing XPU custom operators being migrated, so we will grant a short-term exemption for now.

Collaborator

@hong19860320 left a comment

LGTM

Collaborator

@qingqing01 left a comment

ZMQ support needs to be added as a follow-up.

@EmmonsCurse merged commit 6048ea3 into PaddlePaddle:develop Dec 2, 2025
21 of 26 checks passed
10 participants