Changes from all commits
286 commits
484a99e
docs: add LocalLLM app to community integrations (#8953)
qusaismael Feb 8, 2025
1f766c3
ci: use windows-2022 to sign and bundle (#8941)
mxyng Feb 8, 2025
38117fb
readme: add Lunary to observability community integrations (#8975)
hughcrt Feb 10, 2025
f4711da
ml/backend/ggml: fix crash on dlopen for non-AVX systems (#8976)
jmorganca Feb 10, 2025
0189bdd
readme: add Abso SDK to community integrations (#8973)
hughcrt Feb 11, 2025
49df03d
fix: harden backend loading (#9024)
mxyng Feb 11, 2025
afa55bc
doc: fix link for Abso (#9043)
hughcrt Feb 12, 2025
378d6e1
docs: fix nix package link (#9045)
bloominstrong Feb 12, 2025
82658c3
readme: add Homebrew to package managers section (#9052)
pygeek Feb 12, 2025
a4f69a0
build: add -DGGML_CUDA_NO_PEER_COPY=ON for rocm builds on windows (#9…
jmorganca Feb 13, 2025
10d59d5
openai: finish_reason as tool_calls for streaming with tools (#7963)
anuraaga Feb 13, 2025
3a4449e
docs: add H200 as supported device. (#9076)
rick-github Feb 13, 2025
8cf1606
docs: add ollamazing to the README.md (#9075)
buiducnhat Feb 13, 2025
5824541
next ollama runner (#7913)
mxyng Feb 14, 2025
7e13f56
backend: Don't return an error on Close
jessegross Feb 5, 2025
0e38297
backend: Consistently use int (vs. int64) for tensor shapes
jessegross Feb 4, 2025
4d4463b
backend: Support graph computation that does not return an output
jessegross Feb 4, 2025
d773b7d
backend: API to support full precision matmul
jessegross Feb 13, 2025
01d9a46
ggml-backend: Let GGML allocate context memory
jessegross Jan 31, 2025
6083069
ggml-backend: Ensure data is available after async computation
jessegross Feb 5, 2025
d223f3b
ggml-backend: Close on nil should be a no-op
jessegross Feb 10, 2025
d650ad3
model: Load tensors behind an interface
jessegross Jan 15, 2025
7916f55
vocab: Use int32 for special tokens
jessegross Feb 4, 2025
6945617
models: Move model into their own directory
jessegross Feb 5, 2025
ed443a0
Runner for Ollama engine
jessegross Dec 18, 2024
6600bd7
ml/backend/ggml: stable sort devices by score (#9081)
jmorganca Feb 14, 2025
f05774b
llm: do not evaluate symlink for exe path lookup (#9088)
jmorganca Feb 14, 2025
5296f48
llm: attempt to evaluate symlinks, but do not fail (#9089)
jmorganca Feb 14, 2025
010313b
llamarunner: Init GGML before printing system info
jessegross Feb 14, 2025
df2680b
Wire up system info log for new engine (#9123)
dhiltgen Feb 14, 2025
d006e1e
model: document high-level model interface (#9122)
BruceMacD Feb 15, 2025
0667bad
docs: fix incorrect shortcut key in windows.md (#9098)
James-William-Kincaid-III Feb 15, 2025
faf67db
cmd: fix progress bar flickering
jeremyschlatter Feb 17, 2025
5930aae
cmd: fix cursor flickering in progress bar
jeremyschlatter Feb 17, 2025
f9c7ead
cmd: eliminate flickering with synchronized output
jeremyschlatter Feb 18, 2025
3b4424f
readme: add LLM Telegram Bot to community integrations (#9150)
innightwolfsleep Feb 18, 2025
716e365
test: add test cases for HumanNumber (#9108)
ismdeep Feb 18, 2025
33ad61b
Add OpenDeepResearcher-via-searxng to Community Integrations (#9138)
benhaotang Feb 18, 2025
7b5d916
ci: set owner/group in tarball
mxyng Feb 15, 2025
08a299e
cmake: avoid building intel backends on linux
mxyng Feb 18, 2025
5f8c031
build: remove backend build for sapphirerapids
mxyng Feb 18, 2025
78f403f
address code review comments
jeremyschlatter Feb 18, 2025
e13e7c8
Merge pull request #9079 from jeremyschlatter/main
mxyng Feb 18, 2025
d2eb226
llama: add patch to fix ggml backend reg on Linux with utf-8 characte…
jmorganca Feb 19, 2025
3c874df
docs: Add MaxKB to Community Integrations (#9212)
maninhill Feb 19, 2025
778603a
docs: Add AntSK to Community Integrations (#9214)
xuzeyu91 Feb 19, 2025
d721a02
test: add test cases for ListHandler (#9146)
yuiseki Feb 19, 2025
1e438b2
Merge pull request #9203 from ollama/mxyng/sapphirerapids
mxyng Feb 19, 2025
bda4ef6
reorder patches
mxyng Feb 19, 2025
351a85d
openai: add 'timeout' to allowable x-stainless headers (#9237)
lucasthahn Feb 20, 2025
3d4cc78
docs: Add yla to community integrations
danielekp Feb 20, 2025
7c168b0
server: add missing function parens to debug log (#9255)
rick-github Feb 20, 2025
ba9ec3d
ci: use clang for windows cpu builds
mxyng Feb 20, 2025
14b5a9a
api: document client stream behavior with a test (#8996)
BruceMacD Feb 20, 2025
bd6a7d5
ollamarunner: Pass runner performance parameters to backends
jessegross Feb 20, 2025
e5bcc51
ggml-backend: Don't recreate the scheduler for each context
jessegross Feb 19, 2025
5c5535c
models: Prune unused outputs earlier in the forward pass
jessegross Feb 19, 2025
5d81c1a
docs: add `RockChinQ/LangBot` to integrations list (#9272)
RockChinQ Feb 21, 2025
2192a28
ml/backend/ggml: fix rms norm
mxyng Feb 21, 2025
f53f419
ml: Abstract attention out of model definitions
jessegross Feb 15, 2025
68bac1e
server: group routes by category and purpose (#9270)
bmizerany Feb 22, 2025
7cfd4ae
docs: add additional ROCm docs for building (#9066)
jmorganca Feb 22, 2025
8c13cfa
ml/backend/ggml: fix crash on windows paths with wide characters (#9305)
jmorganca Feb 24, 2025
4604b10
go.mod: bump to go1.24 (#9242)
bmizerany Feb 24, 2025
314573b
config: allow setting context length through env var (#8938)
ParthSareen Feb 24, 2025
0b7e167
sample: add sampling package for new engine (#8410)
ParthSareen Feb 25, 2025
348b3e0
server/internal: copy bmizerany/ollama-go to internal package (#9294)
bmizerany Feb 25, 2025
4df98f3
Move cgroups fix out of AMD section. (#9072)
rick-github Feb 25, 2025
a499390
build: support Compute Capability 5.0, 5.2 and 5.3 for CUDA 12.x (#8567)
prusnak Feb 25, 2025
b16367b
fix: add back bf16 support
mxyng Feb 25, 2025
8888556
docs: rocm install link (#9346)
ChuanhuiLiu Feb 25, 2025
6ecd7f6
docker: upgrade rocm to 6.3.3 (#8211)
Pekkari Feb 25, 2025
e91ae3d
Update ROCm (6.3 linux, 6.2 windows) and CUDA v12.8 (#9304)
dhiltgen Feb 25, 2025
0d69479
.github: always run tests, and other helpful fixes (#9348)
bmizerany Feb 25, 2025
3ad4bc8
llama: removed unused 'vendoring' file (#9351)
jmorganca Feb 25, 2025
e12af46
Add cuda Blackwell architecture for v12 (#9350)
dhiltgen Feb 26, 2025
2db96c1
readme: add Nichey to community integrations (#9370)
gkamer8 Feb 26, 2025
d7d7e99
llama: update llama.cpp vendor code to commit d7cfe1ff (#9356)
jmorganca Feb 27, 2025
a527213
ml/backend/ggml: follow on fixes after updating vendored code (#9388)
jmorganca Feb 27, 2025
76e903c
.github/workflows: swap order of go test and golangci-lint (#9389)
bmizerany Feb 27, 2025
688925a
Windows ARM build (#9120)
dhiltgen Feb 27, 2025
a59f665
ml/backend/ggml: fix debug logging
mxyng Feb 27, 2025
d6af13e
runner: simplify tensor split parsing
mxyng Feb 26, 2025
dc13813
server: allow vscode-file origins (#9313)
eriestrisnadi Feb 27, 2025
be2ac1e
docs: fix api examples link (#9360)
stevenh Feb 27, 2025
2412adf
server/internal: replace model delete API with new registry handler. …
bmizerany Feb 27, 2025
e185c08
go.mod: Use full version for go 1.24.0
jessegross Feb 27, 2025
53d2990
model: add bos token if configured
mxyng Feb 26, 2025
41dc280
server/internal/registry: implement CloseNotify and Flush (for now) (…
bmizerany Feb 27, 2025
3e8b8a1
ml: update Context.Forward interface
mxyng Feb 21, 2025
8b194b7
kvcache: update tests
mxyng Feb 26, 2025
c245b04
sample: remove transforms from greedy sampling (#9377)
ParthSareen Feb 27, 2025
0c1041a
runner: default to greedy sampler for performance (#9407)
BruceMacD Feb 28, 2025
2099e2d
CONTRIBUTING: provide clarity on good commit messages, and bad (#9405)
bmizerany Feb 28, 2025
98d44fa
llama: add phi4 mini support (#9403)
jmorganca Feb 28, 2025
25885e5
docs: Add 1Panel to Community Integrations (#9312)
wanghe-fit2cloud Feb 28, 2025
b42aba4
cuda: enable flash attention
mxyng Feb 28, 2025
eed11de
server/.../safetensors: fix offsets and include all model parts (#9427)
bmizerany Feb 28, 2025
a149128
build: add compute capability 12.0 to CUDA 12 preset (#9426)
jmorganca Feb 28, 2025
657685e
fix: replace deprecated functions
mxyng Feb 28, 2025
31e472b
runner: defer context cancel
mxyng Feb 28, 2025
bebb682
server: validate local path on safetensor create (#9379)
BruceMacD Mar 1, 2025
cda6f5c
server/internal/internal/names: validate names (#9400)
bmizerany Mar 1, 2025
e75c612
build: set GGML_CUDA_NO_VMM for ggml-hip target (#9449)
jmorganca Mar 1, 2025
96a97ad
build: use correct GGML_HIP_NO_VMM compiler definition for ggml-hip (…
jmorganca Mar 2, 2025
854a919
attention: Remove unnecessary contiguous operations
jessegross Feb 23, 2025
55e5776
ggml-backend: Store parent backend as part of tensor
jessegross Feb 27, 2025
ee141cc
ml: Empty tensor constructor for tensors
jessegross Mar 1, 2025
21aa666
ml: Enable support for flash attention
jessegross Feb 26, 2025
af68d60
readme: add AstrBot to community integrations (#9442)
Soulter Mar 2, 2025
ee048b7
server/internal/client/ollama: handle extended names in client/ollama…
bmizerany Mar 2, 2025
e41c4cb
build: install ccache manually in Dockerfile (#9464)
jmorganca Mar 3, 2025
3519dd1
server/internal/client/ollama: hold DiskCache on Registry (#9463)
bmizerany Mar 3, 2025
1579c4f
build: install binutils alongside gcc in Dockerfile (#9475)
jmorganca Mar 3, 2025
3b1ddb2
docs: add reins to community integrations (#9411)
ibrahimcetin Mar 3, 2025
a6f0f90
docs: update phi3-mini to phi4-mini (#9424)
olumolu Mar 3, 2025
36dfb90
docs: don't use self-closing tag for anchor element (#9456)
remarkablemark Mar 3, 2025
d25efe3
cmd: add default err return for stop (#9458)
googs1025 Mar 3, 2025
ba7d312
fix: own lib/ollama directory
mxyng Mar 3, 2025
b428ddd
docker: use go version from go.mod
mxyng Mar 3, 2025
fefbf8f
docs: add Ollama Android Chat community integration
sunshine0523 Mar 4, 2025
55ab9f3
server/.../backoff,syncs: don't break builds without synctest (#9484)
bmizerany Mar 4, 2025
7a01ad7
server/internal/registry: reintroduce pruning on model deletion (#9489)
bmizerany Mar 4, 2025
1fdb351
New engine: vision models and auto-fallback (#9113)
dhiltgen Mar 4, 2025
8fe6f69
docs: add granite-3.2 to the readme
olumolu Mar 4, 2025
05a01fd
ml/backend/ggml: consolidate system info logging
mxyng Mar 1, 2025
cae5d4d
Win: doc new rocm zip file (#9367)
dhiltgen Mar 5, 2025
e2252d0
server/internal/registry: take over pulls from server package (#9485)
bmizerany Mar 5, 2025
b70fc4d
model: Don't unconditionally add special tokens
jessegross Mar 5, 2025
a7e63b8
ollamarunner: Improve multimodal input handling
jessegross Mar 5, 2025
25248f4
Better WantedBy declaration
dwt Mar 7, 2025
4289c74
llama: fix kv loading on snowflake-arctic-embed models (#9536)
jmorganca Mar 7, 2025
1f6986e
readme: add QwQ to the supported models list (#9565)
iBreaker Mar 7, 2025
0682dae
sample: improve ollama engine sampler performance (#9374)
ParthSareen Mar 7, 2025
bab6f34
ml/backend/ggml: update model loading for hybrid/multi backends
mxyng Feb 19, 2025
bfce55d
model: load non-repeated tensors into multiple backends
mxyng Feb 24, 2025
764e199
kvcache: create cache ctx per layer
mxyng Feb 25, 2025
7bae7fa
ml/backend/ggml: create tensor on specific backend
mxyng Feb 26, 2025
58b9ec1
kvcache: update tests
mxyng Feb 26, 2025
bf92088
ml/backend/ggml: set cpu n_threads
mxyng Feb 26, 2025
26c2e0b
ml/backend/ggml: handle user specified cpu offloading
mxyng Feb 26, 2025
b5312f3
ml/backend/ggml: handle tensor split
mxyng Feb 26, 2025
2dc60d4
ml/backend/ggml: offload vision to cpu
mxyng Feb 28, 2025
daaf42e
ml/backend/ggml: clean up
mxyng Feb 28, 2025
45df786
comments
mxyng Mar 4, 2025
b27e8f3
ml/backend/ggml: use backend buffer type
mxyng Mar 5, 2025
98272fb
additional review comments
jessegross Mar 7, 2025
0daaaef
ollamarunner: Quiet debug logging and panic on unimplemented features
jessegross Mar 7, 2025
6da8b6a
kvcache: Support non-causal attention
jessegross Mar 7, 2025
25f9b15
ggml-backend: Ensure allocation meet backend requirements
jessegross Mar 8, 2025
f52b261
kvcache: Set context for shift offsets
jessegross Mar 8, 2025
4100ed7
ml: Add support for quantized KV cache
jessegross Feb 22, 2025
747898d
Merge pull request #1 from ollama/main
grinco Mar 8, 2025
189cbb4
Updated dockerfile
grinco Mar 8, 2025
4614faf
ollamarunner: Don't panic for unimplemented features at runtime.
jessegross Mar 9, 2025
81465ca
Installing rocm library
grinco Mar 9, 2025
42bac5c
This version works well
grinco Mar 9, 2025
a1cda80
model: Update encoder cache to use multimodal input processing handler
jessegross Mar 8, 2025
e648126
Merge branch 'ollama_vanilla_stable' into ollama_vulkan_stable
grinco Mar 10, 2025
98f6997
Applied 00-fix-vulkan-building.patch
grinco Mar 10, 2025
cff62cc
Merge branch 'ollama_vulkan_stable' into grinco-vulkan
grinco Mar 10, 2025
b14dd68
Fixed the "detached head" issues
grinco Mar 10, 2025
31606b2
Merged in the right direction
grinco Mar 10, 2025
e093db9
sample: temporarily use grammars for constrained generation in new en…
jmorganca Mar 10, 2025
96ec8af
docs(tool): add mcp-llm (#9537)
sammcj Mar 10, 2025
757668c
docs: add SwiftChat (#9540)
zhu-xiaowei Mar 10, 2025
d8a5d96
docs: Add OLLAMA_CONTEXT_LENGTH to FAQ. (#9545)
rick-github Mar 10, 2025
fe77629
Merge pull request #9569 from dwt/patch-1
mxyng Mar 10, 2025
7e34f4f
sample: add numerical stability to temperature/softmax transform (#9631)
ParthSareen Mar 10, 2025
8585b7b
docs: add opik to observability integrations (#9626)
vincentkoc Mar 10, 2025
9926eae
fix: pad tensor item if ge zero
mxyng Mar 8, 2025
26a2699
Merge pull request #9590 from ollama/mxyng/dump-pad
mxyng Mar 10, 2025
6b1f84e
Merging the latest stable (#2)
grinco Mar 11, 2025
9cb4ad0
This is no longer needed
grinco Mar 11, 2025
4dcf801
Build release for windows with local script (#9636)
dhiltgen Mar 11, 2025
5f74d1f
gemma2 impl
pdevine Feb 7, 2025
4b037a9
add gemma vision encoder
mxyng Mar 6, 2025
4346c24
fix drift from main
jessegross Mar 7, 2025
631fecc
temporary work around for converting spm
pdevine Mar 7, 2025
0df1800
set non-causal attention
mxyng Mar 7, 2025
c62861f
fix conversion
pdevine Mar 7, 2025
0e88659
Fix tests and drift from main
jessegross Mar 8, 2025
8934324
use fast attention
mxyng Mar 8, 2025
46bb016
update model
mxyng Mar 8, 2025
9b54267
fix configs
pdevine Mar 9, 2025
d368c03
skip repacking vision tensors
mxyng Mar 9, 2025
6b0486c
duplicate token_embd to output
mxyng Mar 9, 2025
9e4642e
ollama debug tensor
mxyng Mar 9, 2025
f888912
fix vision encoder
mxyng Mar 9, 2025
c5cbe4f
fallback to cpu
mxyng Mar 10, 2025
6b32a2d
compat with upstream gguf
mxyng Mar 10, 2025
2e54d72
fix gemma3 1b conversion
pdevine Mar 10, 2025
9d2a20a
use non-causal mask for inputs with images
mxyng Mar 10, 2025
e952789
use non-causal mask only for image positions
mxyng Mar 10, 2025
2c40c4d
Fix follow up images and images split across batches
jessegross Mar 10, 2025
4750055
Restrict Gemma to a single image per request
jessegross Mar 10, 2025
a8e83a7
Disable causal attention based on batch index
jessegross Mar 11, 2025
06007c0
Allow models to force a new batch
jessegross Mar 11, 2025
65b0f32
Revert "Allow models to force a new batch"
jmorganca Mar 11, 2025
f63e62e
reduce kernel size, add TODO for loading from config
jmorganca Mar 11, 2025
11bfa62
add trailing \n\n after <end_of_image> to match reference implementation
jmorganca Mar 11, 2025
ab39e08
llm: auto detect models that require Ollama Engine (#1)
dhiltgen Mar 11, 2025
63a3940
use 2d pooling
mxyng Mar 11, 2025
20e3593
model: validate left and right pairs before merging them
jmorganca Mar 11, 2025
fb4664f
model: add more spm tokenizer tests
jmorganca Mar 11, 2025
c6b6938
kvcache: fix tests by adding AvgPool2D stub
jmorganca Mar 11, 2025
83f0ec8
all: address linter errors
jmorganca Mar 11, 2025
aee2850
Merge pull request #9661 from ollama/gemma
mxyng Mar 11, 2025
ad4e0bf
Adding Gemma 3 to readme (#9671)
mchiang0610 Mar 12, 2025
b3af953
cli: don't exit for invalid model during /load. (#9576)
rick-github Mar 12, 2025
d0afc67
Merge branch 'vulkan' into ollama_vanilla_stable
grinco Mar 12, 2025
85ab552
ollama-debug.c: correct mistype
Shane-XB-Qian Mar 12, 2025
6b45b1d
cli: adding support ctrl-n/p like general cli (#9136)
Shane-XB-Qian Mar 12, 2025
a70820d
models/gemma3: remove final logit softcap (#9692)
BruceMacD Mar 12, 2025
1b7433b
sample: use container/heap for top_k
ParthSareen Mar 12, 2025
3ba9163
sample: simplify top_k=0 sorting
ParthSareen Mar 12, 2025
4aeb67e
sample: do all sorting in topK
ParthSareen Mar 12, 2025
30d7a59
ollama-debug.c: change 'ld' to 'PRIi64'
Shane-XB-Qian Mar 13, 2025
5c0b663
sample: separate softmax and temperature transforms (#9732)
ParthSareen Mar 13, 2025
45a13b1
Merge pull request #9688 from Shane-XB-Qian/debug_mistype_lld
mxyng Mar 13, 2025
5e2e0b4
fix: error if image requested without vision model
mxyng Mar 13, 2025
ec46f32
engine: error on embeddings; not currently implemented
mxyng Mar 13, 2025
3e102b7
Update model/model.go
mxyng Mar 13, 2025
ccfd41c
Merge pull request #9742 from ollama/mxyng/engine-error-embeddings
mxyng Mar 13, 2025
80c7ce3
fix: change default context size for gemma3 (#9744)
pdevine Mar 13, 2025
4bed739
add verbose mode to the show command (#9640)
pdevine Mar 13, 2025
543240f
Merge pull request #9741 from ollama/mxyng/visionless
mxyng Mar 13, 2025
033cec2
count gemma3 vision tensors
mxyng Mar 12, 2025
d2ec223
count all vision tensors
mxyng Mar 12, 2025
a422ba3
roughly count gemma3 graph
mxyng Mar 13, 2025
65b88c5
fix divide by zero
mxyng Mar 13, 2025
74b44fd
docs: Add OLLAMA_ORIGINS for browser extension support (#9643)
13rac1 Mar 13, 2025
8d76fa2
count non-repeating vision layers
mxyng Mar 13, 2025
4ea4d2b
Merge pull request #9703 from ollama/mxyng/gemma3-memory
mxyng Mar 13, 2025
eb2b22b
server/internal/client: use chunksums for concurrent blob verificatio…
bmizerany Mar 14, 2025
4e320b8
server/internal/chunks: remove chunks package (#9755)
bmizerany Mar 14, 2025
3892c3a
llm: remove internal subprocess req and resp types (#9324)
BruceMacD Mar 14, 2025
9679f40
ml: Allow models to constrain inputs to a single batch
jessegross Mar 12, 2025
282bfaa
ollamarunner: Use a separate context per multimodal input
jessegross Mar 14, 2025
7bf793a
gemma3: Allow multiple image in a single input
jessegross Mar 12, 2025
2d2247e
Align versions for local builds (#9635)
dhiltgen Mar 14, 2025
ef378ad
gemma3 quantization (#9776)
pdevine Mar 15, 2025
8294676
server/internal/client/ollama: set User-Agent for registry client (#9…
bmizerany Mar 15, 2025
2c8b484
fix: correctly save in interactive mode (#9788)
pdevine Mar 15, 2025
d1939aa
Fixes SIGSEGV: segmentation violation running gemma3 models on ollama…
grinco Mar 15, 2025
f77b9b9
Merge branch 'ollama_vanilla_stable' into vulkan
grinco Mar 15, 2025
c2e4408
Applied 04-disable-mmap-vulkan.patch
grinco Mar 16, 2025
640f0bb
Pulled new upstream code for ggml-bulkan backend
grinco Mar 16, 2025
4aa7e5e
Merge ollama/ollama main into vulkan
grinco Mar 16, 2025
45dbd14
Merged latest ollama 0.6.2 and nasrally's Flash Attention patches (#5)
grinco Mar 23, 2025
4 changes: 4 additions & 0 deletions .gitattributes
@@ -15,6 +15,10 @@ ml/backend/**/*.cu linguist-vendored
ml/backend/**/*.cuh linguist-vendored
ml/backend/**/*.m linguist-vendored
ml/backend/**/*.metal linguist-vendored
ml/backend/**/CMakeLists.txt linguist-vendored

llama/build-info.cpp linguist-generated
ml/backend/ggml/ggml/src/ggml-metal/ggml-metal-embed.s linguist-generated

* text=auto
*.go text eol=lf
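
For context on the attributes added above: linguist-vendored excludes matching paths from the repository's language statistics and collapses them in pull-request diffs, while linguist-generated additionally hides those files in diffs by default. A quick way to confirm the new patterns resolve as intended is git check-attr, run from the repo root; the first path below is illustrative, the second comes from the diff itself:

git check-attr linguist-vendored -- ml/backend/ggml/ggml/src/CMakeLists.txt
# expected: ml/backend/ggml/ggml/src/CMakeLists.txt: linguist-vendored: set
git check-attr linguist-generated -- llama/build-info.cpp
# expected: llama/build-info.cpp: linguist-generated: set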
8 changes: 8 additions & 0 deletions .github/ISSUE_TEMPLATE/10_bug_report.yml
@@ -9,6 +9,14 @@ body:
description: What happened? What did you expect to happen?
validations:
required: true
- type: textarea
id: logs
attributes:
label: Relevant log output
description: Please copy and paste any relevant log output. See [Troubleshooting Guide](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) for details.
render: shell
validations:
required: false
- type: dropdown
id: os
attributes:
206 changes: 137 additions & 69 deletions .github/workflows/release.yaml
@@ -81,7 +81,7 @@ jobs:
path: dist/darwin-arm64
- run: |
export VERSION=${GITHUB_REF_NAME#v}
./scripts/build_darwin.sh macapp sign
./scripts/build_darwin.sh sign macapp
env:
APPLE_IDENTITY: ${{ secrets.APPLE_IDENTITY }}
APPLE_PASSWORD: ${{ secrets.APPLE_PASSWORD }}
@@ -111,13 +111,13 @@
- os: windows
arch: amd64
preset: 'CUDA 12'
install: https://developer.download.nvidia.com/compute/cuda/12.4.0/local_installers/cuda_12.4.0_551.61_windows.exe
cuda-version: '12.4'
install: https://developer.download.nvidia.com/compute/cuda/12.8.0/local_installers/cuda_12.8.0_571.96_windows.exe
cuda-version: '12.8'
- os: windows
arch: amd64
preset: 'ROCm 6'
install: https://download.amd.com/developer/eula/rocm-hub/AMD-Software-PRO-Edition-24.Q3-WinSvr2022-For-HIP.exe
rocm-version: '6.1'
install: https://download.amd.com/developer/eula/rocm-hub/AMD-Software-PRO-Edition-24.Q4-WinSvr2022-For-HIP.exe
rocm-version: '6.2'
runs-on: ${{ matrix.arch == 'arm64' && format('{0}-{1}', matrix.os, matrix.arch) || matrix.os }}
environment: release
env:
@@ -160,6 +160,10 @@
echo "$hipPath\bin" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
echo "CC=$hipPath\bin\clang.exe" | Out-File -FilePath $env:GITHUB_ENV -Append
echo "CXX=$hipPath\bin\clang++.exe" | Out-File -FilePath $env:GITHUB_ENV -Append
- if: matrix.preset == 'CPU'
run: |
echo "CC=clang.exe" | Out-File -FilePath $env:GITHUB_ENV -Append
echo "CXX=clang++.exe" | Out-File -FilePath $env:GITHUB_ENV -Append
- if: ${{ !cancelled() && steps.cache-install.outputs.cache-hit != 'true' }}
uses: actions/cache/save@v4
with:
@@ -197,33 +201,38 @@
env:
GOFLAGS: ${{ needs.setup-environment.outputs.GOFLAGS }}
steps:
- name: Install system dependencies
- name: Install AMD64 system dependencies
if: matrix.arch == 'amd64'
run: |
$ErrorActionPreference = "Stop"
if ("${{ matrix.arch }}" -eq 'amd64') {
Start-Process "C:\msys64\usr\bin\pacman.exe" -ArgumentList @("-S", "--noconfirm", "mingw-w64-clang-x86_64-gcc-compat", "mingw-w64-clang-x86_64-clang") -NoNewWindow -Wait
echo "C:\msys64\usr\bin" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
echo "C:\msys64\clang64\bin" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
} elseif ("${{ matrix.arch }}" -eq 'arm64') {
Set-ExecutionPolicy Bypass -Scope Process -Force
[System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072
iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))
echo "C:\ProgramData\chocolatey\bin" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
Start-Process "C:\msys64\usr\bin\pacman.exe" -ArgumentList @("-S", "--noconfirm", "mingw-w64-clang-x86_64-gcc-compat", "mingw-w64-clang-x86_64-clang") -NoNewWindow -Wait
echo "C:\msys64\usr\bin" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
echo "C:\msys64\clang64\bin" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
- name: Install ARM64 system dependencies
if: matrix.arch == 'arm64'
run: |
$ErrorActionPreference = "Stop"
Set-ExecutionPolicy Bypass -Scope Process -Force
[System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072
iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))
echo "C:\ProgramData\chocolatey\bin" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append

choco install -y --no-progress git gzip
echo "C:\Program Files\Git\cmd" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
choco install -y --no-progress git gzip
echo "C:\Program Files\Git\cmd" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append

Invoke-WebRequest -Uri "https://github.com/mstorsjo/llvm-mingw/releases/download/20240619/llvm-mingw-20240619-ucrt-aarch64.zip" -OutFile "${{ runner.temp }}\llvm-mingw-ucrt-aarch64.zip"
Expand-Archive -Path ${{ runner.temp }}\llvm-mingw-ucrt-aarch64.zip -DestinationPath "C:\Program Files\"
$installPath=(Resolve-Path -Path "C:\Program Files\llvm-mingw-*-ucrt-aarch64").path
echo $installPath\bin | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
}
Invoke-WebRequest -Uri "https://github.com/mstorsjo/llvm-mingw/releases/download/20240619/llvm-mingw-20240619-ucrt-aarch64.zip" -OutFile "${{ runner.temp }}\llvm-mingw-ucrt-aarch64.zip"
Expand-Archive -Path ${{ runner.temp }}\llvm-mingw-ucrt-aarch64.zip -DestinationPath "C:\Program Files\"
$installPath=(Resolve-Path -Path "C:\Program Files\llvm-mingw-*-ucrt-aarch64").path
echo $installPath\bin | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
- uses: actions/checkout@v4
- uses: actions/setup-go@v5
with:
go-version-file: go.mod
- run: |
go build -o dist/${{ matrix.os }}-${{ matrix.arch }}/ .
- if: matrix.arch == 'arm64'
run: |
Invoke-WebRequest -Uri "https://aka.ms/vs/17/release/vc_redist.arm64.exe" -OutFile "dist\windows-arm64\vc_redist.arm64.exe"
- run: |
$env:VERSION='${{ github.ref_name }}' -Replace "v(.*)", '$1'
& .\scripts\build_windows.ps1 buildApp
@@ -237,7 +246,7 @@
dist\${{ matrix.os }}-${{ matrix.arch }}-app.exe

windows-sign:
runs-on: windows
runs-on: windows-2022
environment: release
needs: [windows-depends, windows-build]
steps:
@@ -258,16 +267,18 @@
echo "${{ vars.OLLAMA_CERT }}" >ollama_inc.crt
- uses: actions/download-artifact@v4
with:
name: build-windows-*
pattern: build-windows-*
path: dist\
merge-multiple: true
- uses: actions/download-artifact@v4
with:
name: depends-windows-amd64-*
pattern: depends-windows-amd64-*
path: dist\windows-amd64\
merge-multiple: true
- run: |
& .\scripts\build_windows.ps1 gatherDependencies sign buildInstaller distZip
env:
KEY_CONTAINER: ${{ vars.KEY_CONTAINER }}
- uses: actions/upload-artifact@v4
with:
name: dist-windows
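
A note on the download-artifact changes in this hunk: actions/download-artifact@v4 treats name: as a literal artifact name, so a glob such as build-windows-* does not expand there; wildcard matching requires pattern:, with merge-multiple: true to unpack every match into the same directory. A minimal sketch of the corrected shape (artifact names are illustrative):

- uses: actions/download-artifact@v4
  with:
    pattern: build-windows-*   # glob across multiple uploaded artifacts
    path: dist\
    merge-multiple: true       # flatten all matches into one tree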
@@ -281,10 +292,13 @@
include:
- os: linux
arch: amd64
targets: 'archive rocm'
target: archive
- os: linux
arch: amd64
target: rocm
- os: linux
arch: arm64
targets: archive
target: archive
runs-on: ${{ matrix.arch == 'arm64' && format('{0}-{1}', matrix.os, matrix.arch) || matrix.os }}
environment: release
needs: setup-environment
@@ -293,67 +307,130 @@
steps:
- uses: actions/checkout@v4
- uses: docker/setup-buildx-action@v3
- uses: docker/build-push-action@v6
with:
context: .
platforms: ${{ matrix.os }}/${{ matrix.arch }}
target: ${{ matrix.target }}
build-args: |
GOFLAGS=${{ env.GOFLAGS }}
CGO_CFLAGS=${{ env.CGO_CFLAGS }}
CGO_CXXFLAGS=${{ env.CGO_CXXFLAGS }}
outputs: type=local,dest=dist/${{ matrix.os }}-${{ matrix.arch }}
cache-from: type=registry,ref=ollama/ollama:latest
cache-to: type=inline
- run: |
apt-get update && apt-get install pigz
for TARGET in ${{ matrix.targets }}; do docker buildx build --platform $PLATFORM --target $TARGET --build-arg GOFLAGS --build-arg CGO_CFLAGS --build-args CGO_CXXFLAGS --output type=local,dest=dist/$PLATFORM .; done
tar c -C dist/$PLATFORM . | pigz -9cv >dist/ollama-${PLATFORM//\//-}.tgz
env:
PLATFORM: ${{ matrix.os }}/${{ matrix.arch }}
for COMPONENT in bin/* lib/ollama/*; do
case "$COMPONENT" in
bin/ollama) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}.tar.in ;;
lib/ollama/*.so) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}.tar.in ;;
lib/ollama/cuda_v11) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}.tar.in ;;
lib/ollama/cuda_v12) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}.tar.in ;;
lib/ollama/cuda_jetpack5) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}-jetpack5.tar.in ;;
lib/ollama/cuda_jetpack6) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}-jetpack6.tar.in ;;
lib/ollama/rocm) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}-rocm.tar.in ;;
esac
done
working-directory: dist/${{ matrix.os }}-${{ matrix.arch }}
- run: |
for ARCHIVE in dist/${{ matrix.os }}-${{ matrix.arch }}/*.tar.in; do
tar c -C dist/${{ matrix.os }}-${{ matrix.arch }} -T $ARCHIVE --owner 0 --group 0 | pigz -9vc >$(basename ${ARCHIVE//.*/}.tgz);
done
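
Two mechanics in the step above are easy to miss: tar -T reads the list of paths to archive from a manifest file, and --owner 0 --group 0 normalizes ownership to root inside the tarball (the point of the "ci: set owner/group in tarball" commit). A standalone sketch of the same flow, with illustrative file names:

# Collect the components that belong in the base archive into a manifest...
echo bin/ollama > ollama-linux-amd64.tar.in
echo lib/ollama/libggml-base.so >> ollama-linux-amd64.tar.in  # hypothetical library name
# ...then archive exactly those paths, owned by root, and compress with pigz.
tar c -C dist/linux-amd64 -T ollama-linux-amd64.tar.in --owner 0 --group 0 | pigz -9c > ollama-linux-amd64.tgz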
- uses: actions/upload-artifact@v4
with:
name: dist-${{ matrix.os }}-${{ matrix.arch }}
name: dist-${{ matrix.os }}-${{ matrix.arch }}-${{ matrix.target }}
path: |
dist/ollama-${{ matrix.os }}-${{ matrix.arch }}.tgz
*.tgz

docker-build:
# Build each Docker variant (OS, arch, and flavor) separately. Using QEMU is unreliable and slower.
docker-build-push:
strategy:
matrix:
include:
- flavor: 'latest=false'
platforms: linux/amd64,linux/arm64
- os: linux
arch: arm64
build-args: |
CGO_CFLAGS
CGO_CXXFLAGS
GOFLAGS
- os: linux
arch: amd64
build-args: |
CGO_CFLAGS
CGO_CXXFLAGS
GOFLAGS
- flavor: 'latest=false,suffix=rocm'
platforms: linux/amd64
- os: linux
arch: amd64
suffix: '-rocm'
build-args: |
CGO_CFLAGS
CGO_CXXFLAGS
GOFLAGS
FLAVOR=rocm
runs-on: ${{ matrix.arch == 'arm64' && format('{0}-{1}', matrix.os, matrix.arch) || matrix.os }}
environment: release
needs: setup-environment
env:
GOFLAGS: ${{ needs.setup-environment.outputs.GOFLAGS }}
steps:
- uses: actions/checkout@v4
- uses: docker/setup-buildx-action@v3
- uses: docker/login-action@v3
with:
username: ${{ vars.DOCKER_USER }}
password: ${{ secrets.DOCKER_ACCESS_TOKEN }}
- id: build-push
uses: docker/build-push-action@v6
with:
context: .
platforms: ${{ matrix.os }}/${{ matrix.arch }}
build-args: ${{ matrix.build-args }}
outputs: type=image,name=ollama/ollama,push-by-digest=true,name-canonical=true,push=true
cache-from: type=registry,ref=ollama/ollama:latest
cache-to: type=inline
- run: |
mkdir -p ${{ matrix.os }}-${{ matrix.arch }}
echo "${{ steps.build-push.outputs.digest }}" >${{ matrix.os }}-${{ matrix.arch }}-${{ matrix.suffix }}.txt
working-directory: ${{ runner.temp }}
- uses: actions/upload-artifact@v4
with:
name: digest-${{ matrix.os }}-${{ matrix.arch }}-${{ matrix.suffix }}
path: |
${{ runner.temp }}/${{ matrix.os }}-${{ matrix.arch }}-${{ matrix.suffix }}.txt

# Merge Docker images for the same flavor into a single multi-arch manifest
docker-merge-push:
strategy:
matrix:
suffix: ['', '-rocm']
runs-on: linux
environment: release
needs: setup-environment
needs: [docker-build-push]
steps:
- uses: actions/checkout@v4
- uses: docker/setup-qemu-action@v2
- uses: docker/setup-buildx-action@v2
- uses: docker/login-action@v3
with:
username: ${{ vars.DOCKER_USER }}
password: ${{ secrets.DOCKER_ACCESS_TOKEN }}
- id: metadata
uses: docker/metadata-action@v4
with:
flavor: ${{ matrix.flavor }}
flavor: |
latest=false
suffix=${{ matrix.suffix }}
images: |
ollama/ollama
tags: |
type=ref,enable=true,priority=600,prefix=pr-,event=pr
type=semver,pattern={{version}}
- uses: docker/build-push-action@v6
- uses: actions/download-artifact@v4
with:
context: .
push: true
platforms: ${{ matrix.platforms }}
build-args: ${{ matrix.build-args }}
tags: ${{ steps.metadata.outputs.tags }}
labels: ${{ steps.metadata.outputs.labels }}
cache-from: type=registry,ref=ollama/ollama:latest
cache-to: type=inline
provenance: false
pattern: digest-*
path: ${{ runner.temp }}
merge-multiple: true
- run: |
docker buildx imagetools create $(echo '${{ steps.metadata.outputs.json }}' | jq -cr '.tags | map("-t", .) | join(" ")') $(cat *-${{ matrix.suffix }}.txt | xargs printf 'ollama/ollama@%s ')
docker buildx imagetools inspect ollama/ollama:${{ steps.metadata.outputs.version }}
working-directory: ${{ runner.temp }}
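
To unpack the imagetools command above: the jq expression turns the metadata-action JSON into repeated -t tag flags, and the xargs printf turns each stored digest into an ollama/ollama@sha256:... reference, so a single multi-arch manifest list covers every per-arch image pushed by digest. A worked example with assumed values:

# If steps.metadata.outputs.json were {"tags":["ollama/ollama:0.6.2"]} (tag value illustrative):
echo '{"tags":["ollama/ollama:0.6.2"]}' | jq -cr '.tags | map("-t", .) | join(" ")'
# prints: -t ollama/ollama:0.6.2

# If the downloaded digest files held sha256:aaa... and sha256:bbb..., the final command would effectively be:
# docker buildx imagetools create -t ollama/ollama:0.6.2 \
#   ollama/ollama@sha256:aaa... ollama/ollama@sha256:bbb...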

# Aggregate all the assets and ship a release
release:
@@ -366,33 +443,24 @@
GH_TOKEN: ${{ github.token }}
steps:
- uses: actions/checkout@v4
- name: Set Version
shell: bash
run: |
- uses: actions/download-artifact@v4
with:
name: dist-darwin
path: dist
pattern: dist-darwin
- uses: actions/download-artifact@v4
with:
name: dist-windows
path: dist
pattern: dist-windows
- uses: actions/download-artifact@v4
with:
path: dist
pattern: dist-linux-*
- uses: actions/download-artifact@v4
with:
path: dist
pattern: dist-windows
- run: |
ls -lh dist/
(cd dist; find . -type f | xargs sha256sum > ../sha256sum.txt)
mv sha256sum.txt dist/
cat dist/sha256sum.txt
merge-multiple: true
- run: find . -type f -not -name 'sha256sum.txt' | xargs sha256sum | tee sha256sum.txt
working-directory: dist
- name: Create or update Release
run: |
RELEASE_VERSION=$(echo ${GITHUB_REF_NAME} | cut -f1 -d-)"
RELEASE_VERSION="$(echo ${GITHUB_REF_NAME} | cut -f1 -d-)"

echo "Looking for existing release for ${RELEASE_VERSION}"
OLD_TAG=$(gh release ls --json name,tagName | jq -r ".[] | select(.name == \"${RELEASE_VERSION}\") | .tagName")
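
For clarity on the RELEASE_VERSION quoting fix above: cut -f1 -d- keeps everything before the first hyphen, so a prerelease ref resolves to its parent release. A quick check with an assumed tag name:

GITHUB_REF_NAME=v0.6.2-rc1
echo ${GITHUB_REF_NAME} | cut -f1 -d-   # prints v0.6.2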