Upgrade Keda and GPU components versions #1575
base: main
Pull Request Overview
Upgrades GPU components and Keda versions to provide better CUDA support in k3d with latest working versions.
- Updates K3s version from v1.28.8-k3s1 to v1.31.7-k3s1
- Upgrades CUDA base image from 12.4.1-base-ubuntu22.04 to 12.8.1-base-ubuntu24.04
- Updates NVIDIA device plugin from v0.15.0-rc.2 to v0.17.1
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| install_gpu_drivers_server.sh | New script for installing GPU drivers on Ubuntu 24.04 server |
| device-plugin-daemonset.yaml | Updates NVIDIA device plugin image version |
| build.sh | Enhanced build script with improved error handling and updated versions |
| Dockerfile | Multi-stage Dockerfile improvements with updated CUDA/K3s versions |
| cuda.md | Documentation updates reflecting new versions and server preparation |
# Add NVIDIA GPG key
curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | \
sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

# Add repo using fake distribution
curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

sudo nvidia-ctk runtime configure --runtime=docker

sudo systemctl restart docker
This code block duplicates the NVIDIA toolkit installation steps that already appear earlier in the script (lines 17-33). The duplicated code should be removed to avoid confusion and potential conflicts.
Suggested change:
- # Add NVIDIA GPG key
- curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | \
- sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
- # Add repo using fake distribution
- curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
- sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
- sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
- sudo apt-get update
- sudo apt-get install -y nvidia-container-toolkit
- sudo nvidia-ctk runtime configure --runtime=docker
- sudo systemctl restart docker
Copilot uses AI. Check for mistakes.
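If removing the repeated block outright is not practical, one alternative is to make each install step safe to repeat. The sketch below is only an illustration: `run_unless_installed` is a hypothetical helper name that does not appear in the PR, and it assumes that finding the tool on `PATH` is a good enough signal that the step has already run.

```shell
# Hypothetical helper (not part of the PR): run a command only when the
# named tool is not yet available on PATH.
# Usage: run_unless_installed <tool> <command> [args...]
run_unless_installed() {
  tool="$1"
  shift
  if command -v "$tool" >/dev/null 2>&1; then
    # Tool is already present, so skip the install command.
    echo "$tool already installed, skipping"
    return 0
  fi
  # Tool is missing, so run the remaining arguments as the install command.
  "$@"
}
```

In the script under review this could wrap the toolkit installation, for example `run_unless_installed nvidia-ctk sudo apt-get install -y nvidia-container-toolkit`, so a second pass over the same steps becomes a no-op.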
The K3s container itself also needs to run with this runtime.
If you are using Docker you can install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html).

## Preparing Server to run Keda
The heading mentions 'Keda' but should be 'CUDA' based on the context. This appears to be a typo as the section is about preparing a server for GPU/CUDA workloads, not Keda (Kubernetes Event-driven Autoscaling).
Suggested change:
- ## Preparing Server to run Keda
+ ## Preparing Server to run CUDA

## Preparing Server to run Keda

To Create a Server that has all the drivers installed you can use exmaple script prepared for Ubuntu 24.04
There's a typo in 'exmaple' which should be 'example'.
Suggested change:
- To Create a Server that has all the drivers installed you can use exmaple script prepared for Ubuntu 24.04
+ To Create a Server that has all the drivers installed you can use example script prepared for Ubuntu 24.04
ca-certificates \
software-properties-common \
lsb-release \
less \
The 'less' package is listed twice in the same RUN command (lines 26 and 30), creating unnecessary duplication.
Suggested change:
- less \
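Duplicates like this are easy to miss in a long package list. As a rough sketch, a small filter can flag any package name that appears more than once; `find_duplicate_packages` is a hypothetical helper name, and it assumes the list is passed on stdin as whitespace-separated names with optional backslash line continuations.

```shell
# Hypothetical helper (not part of the PR): print each package name that
# occurs more than once in a backslash-continued list read from stdin.
find_duplicate_packages() {
  tr -d '\\' |                  # drop line-continuation backslashes
    tr -s '[:space:]' '\n' |    # put one package name on each line
    sed '/^$/d' |               # discard empty lines
    sort |
    uniq -d                     # keep only names seen at least twice
}

# Example with a list shaped like the one under review; prints: less
printf '%s\n' 'ca-certificates \' 'lsb-release \' 'less \' 'less \' |
  find_duplicate_packages
```

Sorting first is what lets `uniq -d` catch repeats that are several lines apart, as in the Dockerfile comment above.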
What
Upgrade Keda and GPU components versions
Why
To have the latest working version of k3d with GPU support
Implications