Releases: ggml-org/llama.cpp

b7057

14 Nov 08:53
2606b0a

metal : make the FA extra sizes consistent (#17143)

b7054

13 Nov 23:27
becc481

ggml-cpu: handle 3d tensors in repack mat_mul (#17241)

* ggml-cpu: handle 3d tensors in repack mul_mat

* Removed unnecessary branch, removed need for <algorithm>

* Fixed dst_ptr pointer in chunk + clang_format

* GGML_ASSERT to check wdata within bounds

* Accidental ggml.h inclusion

* Improved GGML_ASSERT on wdata boundaries

* Address performance regression in Qwen and llama.cpp due to chunking

b7053

13 Nov 23:05
c4abcb2

server: fixing naming conflict res_error (#17243)

b7052

13 Nov 23:02
389ac78

ggml : add ops SOFTPLUS, EXPM1, TRI, SOLVE_TRI, CUMSUM (#17063)

* Add ops needed for new hybrid models: SOFTPLUS, EXPM1, TRI, SOLVE_TRI, CUMSUM

* Update ggml/include/ggml.h

Co-authored-by: Georgi Gerganov

* Update tests/test-backend-ops.cpp

Co-authored-by: Georgi Gerganov

* Code review

* Whitespace

* Update tests/test-backend-ops.cpp

Co-authored-by: Diego Devesa

* This is actually sigmoid, duh.

* Add CONST, remove TRI_KEEP, other changes from review

* Update tests/test-backend-ops.cpp

Co-authored-by: Georgi Gerganov

* Update ggml/src/ggml.c

Co-authored-by: Georgi Gerganov

* Update ggml/src/ggml.c

Co-authored-by: Georgi Gerganov

* Update ggml/src/ggml-cuda/unary.cu

Co-authored-by: Aman Gupta

* Remove extra script

* Update ggml/src/ggml.c

Co-authored-by: Diego Devesa

* Update tests/test-backend-ops.cpp

Co-authored-by: Diego Devesa

* moving changes from laptop [no ci]

* pre-rebase

* Update tests/test-backend-ops.cpp

Co-authored-by: Sigbjørn Skjæret

* Update tests/test-backend-ops.cpp

Co-authored-by: Sigbjørn Skjæret

* Refactor tests

* ggml : cleanup

* cont : fix ggml_fill srcs

* tests : add note

* ggml : add ggml_fill_inplace

* ggml : add asserts

* ggml : fix ggml_fill constant cast

* cont : ggml_tri minor

* Use TENSOR_LOCALS

* Fix regression from #14596, regenerate

* Don't make commits at night...

---------

Co-authored-by: Georgi Gerganov
Co-authored-by: Diego Devesa
Co-authored-by: Aman Gupta
Co-authored-by: Sigbjørn Skjæret
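
For readers unfamiliar with the ops named in this entry, the following is a minimal sketch of the math they conventionally denote. It is plain C++ and does not call ggml; every function name (`softplus_ref`, `expm1_ref`, `cumsum_ref`, `tril_ref`, `solve_lower_tri_ref`) is hypothetical. It also assumes an inclusive cumulative sum, a lower-triangular mask that keeps the diagonal, and a lower-triangular solve. The actual ggml ops may support other variants and work on multi-dimensional tensors.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// softplus(x) = log(1 + exp(x)). For large x the result equals x to float
// precision, so return x directly and avoid overflow in exp.
inline float softplus_ref(float x) {
    return x > 20.0f ? x : std::log1p(std::exp(x));
}

// expm1(x) = exp(x) - 1, computed accurately for x close to zero.
inline float expm1_ref(float x) {
    return std::expm1(x);
}

// Inclusive cumulative sum over one row: out[i] = x[0] + ... + x[i].
inline std::vector<float> cumsum_ref(const std::vector<float> & x) {
    std::vector<float> out(x.size());
    float acc = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i) {
        acc += x[i];
        out[i] = acc;
    }
    return out;
}

// Keep the lower triangle (diagonal included) of a row-major n x n matrix
// and zero everything above the diagonal.
inline std::vector<float> tril_ref(const std::vector<float> & a, std::size_t n) {
    assert(a.size() == n * n);
    std::vector<float> out(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j <= i; ++j) {
            out[i * n + j] = a[i * n + j];
        }
    }
    return out;
}

// Solve L * x = b by forward substitution, where L is a row-major
// lower-triangular n x n matrix with a non-zero diagonal.
inline std::vector<float> solve_lower_tri_ref(const std::vector<float> & l,
                                              const std::vector<float> & b,
                                              std::size_t n) {
    assert(l.size() == n * n && b.size() == n);
    std::vector<float> x(n);
    for (std::size_t i = 0; i < n; ++i) {
        float s = b[i];
        for (std::size_t j = 0; j < i; ++j) {
            s -= l[i * n + j] * x[j];
        }
        assert(l[i * n + i] != 0.0f);
        x[i] = s / l[i * n + i];
    }
    return x;
}
```

For example, with `L = [[2, 0], [1, 1]]` and `b = [4, 5]`, forward substitution gives `x = [2, 3]`.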

b7051

13 Nov 19:31
a19bd6f

vulkan: remove shell call from vulkan-shaders-gen tool, revert file c…

b7050

13 Nov 18:46
dd091e5

sched : fix reserve ignoring user tensor assignments (#17232)

b7049

13 Nov 17:50
1215dde

ggml-cpu : add RISC-V vector intrinsic support for silu and cvar oper…

b7048

13 Nov 17:34
0cfb191

metal: accelerated conv2d (#17175)

* metal: accelerated conv2d

* cont : cleanup

---------

Co-authored-by: bghira
Co-authored-by: Georgi Gerganov

b7047

13 Nov 16:11
2776db6

Revert "ggml-cpu: handle 3d tensors in repack mat_mul (#17030)" (#17233)

This reverts commit 1c398dc9eca9c366ce98deb0e6f3538e444ebc8a.

b7046

13 Nov 10:17
879dec3

ggml-cpu : use template for argsort (#17222)
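
The entry above does not say how the template is used. One common pattern, sketched here as an assumption and not as the code from the PR, is to make the sort order a template parameter so that the comparison direction is fixed at compile time. The names `sort_order` and `argsort_ref` are hypothetical and are not ggml identifiers.

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical enum for this sketch; not ggml's own type.
enum class sort_order { asc, desc };

// Return the indices that would sort x. The order is a template parameter,
// so the condition in the comparator is a compile-time constant and each
// instantiation carries only one comparison direction.
template <sort_order order>
std::vector<int32_t> argsort_ref(const std::vector<float> & x) {
    std::vector<int32_t> idx(x.size());
    std::iota(idx.begin(), idx.end(), 0);  // 0, 1, 2, ...
    // stable_sort keeps equal values in their original relative order.
    std::stable_sort(idx.begin(), idx.end(), [&x](int32_t a, int32_t b) {
        return order == sort_order::asc ? x[a] < x[b] : x[a] > x[b];
    });
    return idx;
}
```

For the input `{3, 1, 2}`, `argsort_ref<sort_order::asc>` yields `{1, 2, 0}` and `argsort_ref<sort_order::desc>` yields `{0, 2, 1}`.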