Conversation

@danielealbano commented Oct 6, 2025

What does this PR do?

This PR adds support for the Blackwell architecture, related to issue #652.

As I wanted to run TEI on my 5090 I went through a few iterations and got it working, tested with Qwen3-Embedding-0.6B.

Before submitting

Yes

  • Was this discussed/approved via a GitHub issue or the forum?

Not discussed or approved, but it's a known issue and a related issue is already open at #652.

Documentation updated to mention the new compute cap.

  • Did you write any new necessary tests? If applicable, did you include or update the insta snapshots?

I have updated the only test already in place to validate the compute cap.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Member

@alvarobartt alvarobartt left a comment


Thanks for the PR, I'm just a bit concerned about bumping CUDA from 12.2 to 12.9 just to support Blackwell 🤔

@@ -1,4 +1,4 @@
-FROM nvidia/cuda:12.2.0-devel-ubuntu22.04 AS base-builder
+FROM nvidia/cuda:12.9.0-devel-ubuntu22.04 AS base-builder
Member

@alvarobartt Oct 7, 2025


Is bumping CUDA required? It might end up being a breaking change for instances running older NVIDIA CUDA versions such as 12.2, 12.4, and 12.6; besides that, everything LGTM.

Author

@danielealbano Oct 7, 2025


@alvarobartt CUDA 12.8 is required to support GPUs like the 5080 and 5090. We could potentially downgrade to 12.8 and it should still work (I can test), but I don't think it would help much.

I understand that it might be a problem, but CUDA 12.2 is two years old (July 2023) and would need to be upgraded at some point anyway.

What if CUDA 12.9 were published under a :129-1.x Docker image tag? It doesn't feel like the right solution, but it wouldn't break backward compatibility.

Member

@alvarobartt Oct 7, 2025


Hmm, fair enough. Then I think we could just create a Dockerfile-cuda-blackwell with CUDA 12.8 in the meantime, keeping the rest of the changes, adding it to the CI, and making sure we build with a different CUDA version for Blackwell; eventually, for TEI v1.9.0, we can think about bumping CUDA from 12.2 to 12.6.

In any case, given how recent Blackwell is, I guess it makes sense to keep it isolated for the moment so nothing breaks, but ideally all of those should live under the same Dockerfile in the future.
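A minimal sketch of what such a Dockerfile-cuda-blackwell could look like, pinned to CUDA 12.8 as proposed. The base-builder stage name comes from the diff above; the CUDA_COMPUTE_CAP variable and the 12.8.0 image tag are assumptions based on TEI's other CUDA Dockerfiles and the nvidia/cuda tags on Docker Hub, not the actual file:

```
# Hypothetical Dockerfile-cuda-blackwell (illustrative sketch only):
# same layout as the existing Dockerfile-cuda, but pinned to CUDA 12.8,
# the first toolkit release that knows about Blackwell GPUs.
FROM nvidia/cuda:12.8.0-devel-ubuntu22.04 AS base-builder

# Compute capability 120 = consumer Blackwell (e.g. RTX 5080/5090);
# kernels are compiled only for this architecture, so the existing
# CUDA 12.2 images stay untouched.
ENV CUDA_COMPUTE_CAP=120

# ... remaining build stages unchanged from Dockerfile-cuda ...
```

Keeping this as a separate file means CI can publish it as an experimental image while the main Dockerfile-cuda keeps its current base.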

Author

@danielealbano Oct 7, 2025


I will try to test with CUDA 12.8 to be certain there are no odd surprises; I'll need to figure out which packages to swap to downgrade the CUDA version on my test hardware.

Member


Awesome, thanks for the contribution @danielealbano! I'll try to test on my end too, and add it to the CI to make sure the Dockerfile-cuda-blackwell image is built as experimental; later on we can consider bumping CUDA in the Dockerfile-cuda and Dockerfile-cuda-all images to make sure they support all of today's architectures.
