@zhuohan123 zhuohan123 commented Dec 2, 2025

Purpose

Remove the redundant engine argument tokens_only, which does exactly the same thing as skip_tokenizer_init.

Test Plan

All existing tests should pass.


@zhuohan123 zhuohan123 requested a review from NickLucche December 2, 2025 01:28
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request removes the redundant tokens_only argument from EngineArgs. This argument's functionality was already covered by skip_tokenizer_init. The changes correctly remove the field definition and its usage, which simplifies the code and improves maintainability. The change is straightforward and looks good to merge.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Comment on lines 571 to 573
kv_offloading_backend: KVOffloadingBackend | None = (
CacheConfig.kv_offloading_backend
)

[P1] Tokens-only flag no longer skips tokenizer init

Removing the tokens_only field from EngineArgs means AsyncEngineArgs.from_cli_args no longer forwards the --tokens-only flag that the OpenAI server still exposes (see vllm/entrypoints/openai/cli_args.py:192-196). The engine now leaves skip_tokenizer_init at its default and will try to initialize a tokenizer even when running in tokens-only mode. In Disaggregated Everything deployments where tokenizer files are intentionally absent, the server will now fail during startup because tokenizer initialization runs; the flag should still map to skip_tokenizer_init.
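
The mapping the review asks for can be sketched as follows. This is a minimal illustration, not the vLLM code: SketchEngineArgs, make_parser, and engine_args_from_cli are hypothetical names invented for this example, and only the --tokens-only flag and the skip_tokenizer_init field come from the discussion above. The idea is that the server can keep exposing --tokens-only without the engine carrying a separate field for it, as long as the flag is folded into skip_tokenizer_init when the engine arguments are built.

```python
import argparse
from dataclasses import dataclass


@dataclass
class SketchEngineArgs:
    """Hypothetical stand-in for the engine args; not the vLLM class."""

    model: str
    skip_tokenizer_init: bool = False


def make_parser() -> argparse.ArgumentParser:
    """Build a toy CLI that exposes both flags (assumed layout)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", required=True)
    parser.add_argument("--skip-tokenizer-init", action="store_true")
    # Server-level flag: kept for users, but it has no engine field of its own.
    parser.add_argument("--tokens-only", action="store_true")
    return parser


def engine_args_from_cli(ns: argparse.Namespace) -> SketchEngineArgs:
    """Fold --tokens-only into skip_tokenizer_init when building engine args."""
    return SketchEngineArgs(
        model=ns.model,
        # Either flag disables tokenizer initialization.
        skip_tokenizer_init=ns.skip_tokenizer_init or ns.tokens_only,
    )
```

With this shape, removing the redundant field does not change startup behavior for deployments that pass only --tokens-only, because the tokenizer is still skipped.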

@zhuohan123

Codex is right. Closing this PR.

@zhuohan123 zhuohan123 closed this Dec 2, 2025