
Conversation


@shanjiaz shanjiaz commented Sep 29, 2025

Changes:

  • Added support for the optional arguments `eagle_aux_hidden_state_layer_ids` and `inference_type`.
  • Added more robust logic for `target_vocab_size`. We default to using the length of `t2d`; if that is not available, we load the verifier model's config file and recursively search the dict for `vocab_size`. The recursion is needed for nested configs, e.g. `target_config_dict["text_config"]["vocab_size"]` (see the sketch below).
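For reference, a minimal sketch of the recursive `vocab_size` lookup described above. The helper name `find_vocab_size` and the example values are illustrative assumptions, not the exact implementation in this PR:

```python
from typing import Any, Optional


def find_vocab_size(config: dict[str, Any]) -> Optional[int]:
    """Recursively search a (possibly nested) config dict for a 'vocab_size' key."""
    if "vocab_size" in config:
        return config["vocab_size"]
    for value in config.values():
        if isinstance(value, dict):
            found = find_vocab_size(value)
            if found is not None:
                return found
    return None


# Example: some verifier configs nest the value under "text_config".
target_config_dict = {"text_config": {"vocab_size": 202048}}  # illustrative value
assert find_vocab_size(target_config_dict) == 202048
```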

Command used:

speculators convert nvidia/Llama-4-Maverick-17B-128E-Eagle3 \
  --algorithm eagle3 \
  --verifier RedHatAI/Llama-4-Maverick-17B-128E-Instruct-quantized.w4a16 \
  --output-path Llama4-Maverick-Eagle3-Speculators \
  --validate-device cuda:0 \
  --algorithm-kwargs '{"eagle_aux_hidden_state_layer_ids": [1,23,44], "inference_type": "text"}'

Converted checkpoint:

shanjiaz/Llama4-Maverick-Eagle3-Speculators-converted


github-actions bot commented Sep 29, 2025

📦 Build Artifacts Available
The build artifacts (`.whl` and `.tar.gz`) have been successfully generated and are available for download: https://github.com/vllm-project/speculators/actions/runs/18228590055/artifacts/4177299103.
They will be retained for up to 30 days.
Commit: c4cfbb6

@shanjiaz shanjiaz marked this pull request as ready for review October 1, 2025 01:28
Signed-off-by: shanjiaz <[email protected]>
@shanjiaz shanjiaz requested a review from rahul-tuli October 3, 2025 17:01