Make sure you have installed the NVIDIA Container Toolkit:
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html
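The install guide covers the package repository setup in detail; on Ubuntu, the final steps look roughly like this (a sketch based on that guide, verify the exact commands there):

```shell
# Install the toolkit (after adding NVIDIA's apt repository per the guide)
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Let the toolkit configure Docker's runtime, then restart Docker
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```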
Check that everything is configured:
sudo docker run --gpus all nvidia/cuda:12.8.0-base-ubuntu22.04 nvidia-smi

Troubleshooting:
If the check above fails with this message:
Failed to initialize NVML: Unknown Error
On Ubuntu 22.04, after following the steps above, I still needed to change a setting in /etc/nvidia-container-runtime/config.toml.
Change no-cgroups to false
no-cgroups = false
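The edit above can also be scripted; this is just a sketch, check the file manually before and after:

```shell
# Flip no-cgroups from true to false in the runtime config
sudo sed -i 's/^no-cgroups = true/no-cgroups = false/' \
  /etc/nvidia-container-runtime/config.toml

# Restart Docker so the change takes effect
sudo systemctl restart docker
```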
Open a terminal and run the following command to start the Ollama container:
cd ~/workspace/ai/ollama
docker run -it --rm --gpus=all \
--name ollama \
-v ./data:/root/.ollama \
-v ./shared:/root/shared \
-p 11434:11434 \
ollama/ollama

Now you can run Ollama commands inside the container. For example, to list downloaded models:
docker exec -it ollama ollama list

To run a model, use the following command:
docker exec -it ollama ollama run llama3

If the model is not downloaded yet, Ollama will automatically download it for you.
See the catalog: https://ollama.com/search
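Besides the interactive CLI, the container also exposes Ollama's REST API on the published port 11434, so prompts can be scripted with curl (the example assumes llama3 is already downloaded):

```shell
# Quick health check: returns the server version as JSON
curl http://localhost:11434/api/version

# One-shot generation ("stream": false returns a single JSON response)
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```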
You can set a system prompt for the model using the following command:
/set system "You are a helpful assistant."

To prompt a model with a file, use the following command:
docker exec -it ollama ollama run llava:7b \
'What is the Motor Nr of this image: ' \
< ./shared/motor-info.png

nvidia-smi is a CLI tool reporting GPU use, VRAM, temperature, power draw, and memory.
Real-time refresh with:
nvidia-smi --loop=1

Check https://github.com/open-webui/open-webui for more information on how to run the web UI.
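As a sketch, Open WebUI can be started alongside the Ollama container above with something like the following (the image name, ports, and OLLAMA_BASE_URL variable are taken from that repository's README; double-check there for the current command):

```shell
# Run Open WebUI on http://localhost:3000, pointing it at the
# Ollama API published on the host at port 11434
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```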