Releases: ggml-org/llama.cpp
Releases · ggml-org/llama.cpp
b7057
metal : make the FA extra sizes consistent (#17143)
b7054
ggml-cpu: handle 3d tensors in repack mat_mul (#17241) * ggml-cpu: handle 3d tensors in repack mul_mat * Removed unnecessary branch, removed need for <algorithm> * Fixed dst_ptr pointer in chunk + clang_format * GGML_ASSERT to check wdata within bounds * Accidental ggml.h inclusion * Improved GGML_ASSERT on wdata boundaries * Address performance regression in Qwen and llama.cpp due to chunking
b7053
server: fixing naming conflict res_error (#17243)
b7052
ggml : add ops SOFTPLUS, EXPM1, TRI, SOLVE_TRI, CUMSUM (#17063) * Add ops needed for new hybrid models: SOFTPLUS, EXPM1, TRI, SOLVE_TRI, CUMSUM * Update ggml/include/ggml.h Co-authored-by: Georgi Gerganov <[email protected]> * Update tests/test-backend-ops.cpp Co-authored-by: Georgi Gerganov <[email protected]> * Code review * Whitespace * Update tests/test-backend-ops.cpp Co-authored-by: Diego Devesa <[email protected]> * This is actually sigmoid, duh. * Add CONST, remove TRI_KEEP, other changes from review * Update tests/test-backend-ops.cpp Co-authored-by: Georgi Gerganov <[email protected]> * Update ggml/src/ggml.c Co-authored-by: Georgi Gerganov <[email protected]> * Update ggml/src/ggml.c Co-authored-by: Georgi Gerganov <[email protected]> * Update ggml/src/ggml-cuda/unary.cu Co-authored-by: Aman Gupta <[email protected]> * Remove extra script * Update ggml/src/ggml.c Co-authored-by: Diego Devesa <[email protected]> * Update tests/test-backend-ops.cpp Co-authored-by: Diego Devesa <[email protected]> * moving changes from laptop [no ci] * pre-rebase * Update tests/test-backend-ops.cpp Co-authored-by: Sigbjørn Skjæret <[email protected]> * Update tests/test-backend-ops.cpp Co-authored-by: Sigbjørn Skjæret <[email protected]> * Refactor tests * ggml : cleanup * cont : fix ggml_fill srcs * tests : add note * ggml : add ggml_fill_inplace * ggml : add asserts * ggml : fix ggml_fill constant cast * cont : ggml_tri minor * Use TENSOR_LOCALS * Fix regression from #14596, regenerate * Don't make commits at night... --------- Co-authored-by: Georgi Gerganov <[email protected]> Co-authored-by: Diego Devesa <[email protected]> Co-authored-by: Aman Gupta <[email protected]> Co-authored-by: Sigbjørn Skjæret <[email protected]>
b7051
vulkan: remove shell call from vulkan-shaders-gen tool, revert file c…
b7050
sched : fix reserve ignoring user tensor assignments (#17232)
b7049
ggml-cpu : add RISC-V vector intrinsic support for silu and cvar oper…
b7048
metal: accelerated conv2d (#17175) * metal: accelerated conv2d * cont : cleanup --------- Co-authored-by: bghira <[email protected]> Co-authored-by: Georgi Gerganov <[email protected]>
b7047
Revert "ggml-cpu: handle 3d tensors in repack mat_mul (#17030)" (#17233) This reverts commit 1c398dc9eca9c366ce98deb0e6f3538e444ebc8a.
b7046
ggml-cpu : use template for argsort (#17222)