Releases · ggml-org/llama.cpp

14 Nov 08:53

2606b0a

b7057

metal : make the FA extra sizes consistent (#17143)

13 Nov 23:27

github-actions

b7054

becc481

ggml-cpu: handle 3d tensors in repack mat_mul (#17241)

* ggml-cpu: handle 3d tensors in repack mul_mat

* Removed unnecessary branch, removed need for <algorithm>

* Fixed dst_ptr pointer in chunk + clang_format

* GGML_ASSERT to check wdata within bounds

* Accidental ggml.h inclusion

* Improved GGML_ASSERT on wdata boundaries

* Address performance regression in Qwen and llama.cpp due to chunking

13 Nov 23:05

github-actions

b7053

c4abcb2

server: fixing naming conflict res_error (#17243)

13 Nov 23:02

github-actions

b7052

389ac78

ggml : add ops SOFTPLUS, EXPM1, TRI, SOLVE_TRI, CUMSUM (#17063)

* Add ops needed for new hybrid models: SOFTPLUS, EXPM1, TRI, SOLVE_TRI, CUMSUM

* Update ggml/include/ggml.h

Co-authored-by: Georgi Gerganov

* Update tests/test-backend-ops.cpp

Co-authored-by: Georgi Gerganov

* Code review

* Whitespace

* Update tests/test-backend-ops.cpp

Co-authored-by: Diego Devesa

* This is actually sigmoid, duh.

* Add CONST, remove TRI_KEEP, other changes from review

* Update tests/test-backend-ops.cpp

Co-authored-by: Georgi Gerganov

* Update ggml/src/ggml.c

Co-authored-by: Georgi Gerganov

* Update ggml/src/ggml.c

Co-authored-by: Georgi Gerganov

* Update ggml/src/ggml-cuda/unary.cu

Co-authored-by: Aman Gupta

* Remove extra script

* Update ggml/src/ggml.c

Co-authored-by: Diego Devesa

* Update tests/test-backend-ops.cpp

Co-authored-by: Diego Devesa

* moving changes from laptop [no ci]

* pre-rebase

* Update tests/test-backend-ops.cpp

Co-authored-by: Sigbjørn Skjæret

* Update tests/test-backend-ops.cpp

Co-authored-by: Sigbjørn Skjæret

* Refactor tests

* ggml : cleanup

* cont : fix ggml_fill srcs

* tests : add note

* ggml : add ggml_fill_inplace

* ggml : add asserts

* ggml : fix ggml_fill constant cast

* cont : ggml_tri minor

* Use TENSOR_LOCALS

* Fix regression from #14596, regenerate

* Don't make commits at night...

---------

Co-authored-by: Georgi Gerganov
Co-authored-by: Diego Devesa
Co-authored-by: Aman Gupta
Co-authored-by: Sigbjørn Skjæret
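
For readers unfamiliar with the op names in b7052, the following is a rough sketch of what such operations conventionally compute, written in plain Python. It is not ggml code and does not use the ggml API: the function names (`softplus`, `cumsum`, `tril`, `solve_lower_tri`) are hypothetical, and the choices made here (lower triangle with the diagonal kept, forward substitution on a lower-triangular system, a running sum over a single row) are assumptions inferred from the names alone. The release notes do not state which variants or axes the ggml ops support.

```python
import math
from itertools import accumulate


def softplus(x: float) -> float:
    """log(1 + e^x), rearranged so exp() never overflows for large |x|."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def cumsum(row: list[float]) -> list[float]:
    """Running (prefix) sum over one row."""
    return list(accumulate(row))


def tril(m: list[list[float]]) -> list[list[float]]:
    """Keep the lower triangle (including the diagonal), zero the rest."""
    return [[v if j <= i else 0.0 for j, v in enumerate(r)]
            for i, r in enumerate(m)]


def solve_lower_tri(l: list[list[float]], b: list[float]) -> list[float]:
    """Solve l @ x = b by forward substitution, assuming l is lower
    triangular with a non-zero diagonal."""
    x: list[float] = []
    for i, row in enumerate(l):
        s = sum(row[j] * x[j] for j in range(i))
        x.append((b[i] - s) / row[i])
    return x


if __name__ == "__main__":
    print(softplus(0.0))                 # log(2)
    print(math.expm1(1e-10))             # e^x - 1, accurate for tiny x
    print(cumsum([1.0, 2.0, 3.0]))
    print(tril([[1.0, 2.0], [3.0, 4.0]]))
    print(solve_lower_tri([[2.0, 0.0], [1.0, 1.0]], [4.0, 5.0]))
```

EXPM1 corresponds to the standard `expm1` function (here `math.expm1`), which exists because evaluating `exp(x) - 1` directly loses precision when `x` is close to zero.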

13 Nov 19:31

github-actions

b7051

a19bd6f

vulkan: remove shell call from vulkan-shaders-gen tool, revert file c…

13 Nov 18:46

github-actions

b7050

dd091e5

sched : fix reserve ignoring user tensor assignments (#17232)

13 Nov 17:50

github-actions

b7049

1215dde

ggml-cpu : add RISC-V vector intrinsic support for silu and cvar oper…

13 Nov 17:34

github-actions

b7048

0cfb191

metal: accelerated conv2d (#17175)

* metal: accelerated conv2d

* cont : cleanup

---------

Co-authored-by: bghira
Co-authored-by: Georgi Gerganov

13 Nov 16:11

github-actions

b7047

2776db6

Revert "ggml-cpu: handle 3d tensors in repack mat_mul (#17030)" (#17233)

This reverts commit 1c398dc9eca9c366ce98deb0e6f3538e444ebc8a.

13 Nov 10:17

github-actions

b7046

879dec3

ggml-cpu : use template for argsort (#17222)
