Open
Description
System Info
num_assistant_tokens and other parameters intended for the assistant model are not properly passed down, leaving the assistant model with default values.
outputs = model.generate(**inputs, assistant_model=assistant_model, num_assistant_tokens=5)
print(f'num_assistant_tokens: {assistant_model.generation_config.num_assistant_tokens}')
This behavior is present in both the official 4.56.0 release and main (currently 4.57.0.dev0/bb45d36).
Fix in PR #40740
- transformers version: 4.57.0.dev0
- Platform: macOS-15.6.1-arm64-arm-64bit
- Python version: 3.12.11
- Huggingface_hub version: 0.34.4
- Safetensors version: 0.6.2
- Accelerate version: 1.10.1
- Accelerate config: not found
- DeepSpeed version: not installed
- PyTorch version (accelerator?): 2.8.0 (NA)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?:?
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
outputs = model.generate(**inputs, assistant_model=assistant_model, num_assistant_tokens=5)
print(f'num_assistant_tokens: {assistant_model.generation_config.num_assistant_tokens}')
See also this Colab: https://colab.research.google.com/drive/1BIY6yklrsarrPmXWrciV1chAqV5I9-wo?usp=sharing
Expected behavior
num_assistant_tokens
should be 5
not the default 20
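To make the expectation concrete without loading any model, here is a minimal sketch using stand-in classes. Everything in it (StubGenerationConfig, StubModel, ASSISTANT_KEYS, generate_reported, generate_expected) is hypothetical and written only for illustration. It is not the transformers implementation and not the change in PR #40740. It only contrasts the behavior described in this report with the behavior I would expect.

```python
from dataclasses import dataclass, field, replace

# Hypothetical set of options that are meant for the assistant model.
ASSISTANT_KEYS = {"num_assistant_tokens"}


@dataclass
class StubGenerationConfig:
    # Default of 20, as stated in this report.
    num_assistant_tokens: int = 20


@dataclass
class StubModel:
    generation_config: StubGenerationConfig = field(
        default_factory=StubGenerationConfig
    )


def generate_reported(model, assistant_model, **kwargs):
    """Stand-in for the reported behavior: the override stays on the main model."""
    main_config = replace(model.generation_config, **kwargs)
    # The assistant's config is never touched, so it keeps its default.
    return (
        main_config.num_assistant_tokens,
        assistant_model.generation_config.num_assistant_tokens,
    )


def generate_expected(model, assistant_model, **kwargs):
    """Stand-in for the expected behavior: assistant options are forwarded."""
    main_config = replace(model.generation_config, **kwargs)
    forwarded = {k: v for k, v in kwargs.items() if k in ASSISTANT_KEYS}
    assistant_model.generation_config = replace(
        assistant_model.generation_config, **forwarded
    )
    return (
        main_config.num_assistant_tokens,
        assistant_model.generation_config.num_assistant_tokens,
    )


if __name__ == "__main__":
    model, assistant = StubModel(), StubModel()
    print(generate_reported(model, assistant, num_assistant_tokens=5))
    print(generate_expected(model, assistant, num_assistant_tokens=5))
```

In each returned pair, the first value is what the main model sees and the second is what the assistant ends up with. The second value is the one the reproduction above prints.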