
Conversation

@GAD-cell (Contributor) commented Sep 2, 2025

Small patch to support LFM2 with vLLM.
Since LFM2 doesn’t support prefix caching with vLLM, I had to add enable_prefix_caching to both VLLMModelConfig and VLLMModel._create_auto_model to make it work.
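A minimal sketch of what that change looks like, assuming a simplified version of lighteval's vLLM backend (the class layout and field names other than enable_prefix_caching are illustrative, not copied from the repository):

```python
from dataclasses import dataclass

from vllm import LLM


@dataclass
class VLLMModelConfig:
    model_name: str
    # New flag: LFM2 does not support prefix caching in vLLM, so it must be
    # possible to disable it from the model config.
    enable_prefix_caching: bool = False


class VLLMModel:
    def __init__(self, config: VLLMModelConfig):
        self.config = config
        self.model = self._create_auto_model(config)

    def _create_auto_model(self, config: VLLMModelConfig) -> LLM:
        # enable_prefix_caching is an existing vLLM engine argument; the patch
        # simply forwards it from the config to the LLM constructor.
        return LLM(
            model=config.model_name,
            enable_prefix_caching=config.enable_prefix_caching,
        )
```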

@HuggingFaceDocBuilderDev (Collaborator) commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@GAD-cell (Contributor, Author) commented Sep 8, 2025

@NathanHB it looks like a CUDA compilation error, do you have any clue why?

@NathanHB (Member) commented Sep 9, 2025

Hmm, not sure why. I relaunched it, but if it still fails, can you try setting the default value of enable_prefix_caching to None?

@GAD-cell (Contributor, Author) commented

> Hmm, not sure why. I relaunched it, but if it still fails, can you try setting the default value of enable_prefix_caching to None?

OK, I've changed the default value to None.
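A sketch of that follow-up, under the same illustrative assumptions as above: the field defaults to None and is only forwarded to vLLM when the user sets it, so vLLM keeps its own default behaviour otherwise (create_llm is a hypothetical helper standing in for VLLMModel._create_auto_model):

```python
from dataclasses import dataclass
from typing import Optional

from vllm import LLM


@dataclass
class VLLMModelConfig:
    model_name: str
    # Default is now None: vLLM keeps its own behaviour unless the user
    # explicitly enables or disables prefix caching.
    enable_prefix_caching: Optional[bool] = None


def create_llm(config: VLLMModelConfig) -> LLM:
    # Only pass the flag through when it was explicitly set.
    kwargs = {}
    if config.enable_prefix_caching is not None:
        kwargs["enable_prefix_caching"] = config.enable_prefix_caching
    return LLM(model=config.model_name, **kwargs)
```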

@NathanHB changed the title from "LFM2 support" to "adds enable_prefix_caching option to VLLMModelConfig" on Sep 15, 2025
@NathanHB merged commit 16c2630 into huggingface:main on Sep 23, 2025
4 checks passed