Releases: createthis/llama.cpp

b6240

21 Aug 22:26
54a241f
sched : fix possible use of wrong ids tensor when offloading moe prom…

b6237

21 Aug 15:47
97ae596
vulkan : support conv_2d_dw with f16 weights (#15392)

b6230

21 Aug 13:29
30649ca
ci : continue file download with wget (#15471)

ggml-ci

b6134

11 Aug 17:29
be48528
CANN: Add broadcast for softmax and FA (#15208)

* refactor softmax

* fix fa

* fix mask shape

* format

* add comments

* Remove whitespace

b6097

06 Aug 02:26
9515c61
ggml: WebGPU disable SET_ROWS for now (#15078)

* Add parameter buffer pool, batching of submissions, refactor command building/submission

* Add header for linux builds

* Free staged parameter buffers at once

* Format with clang-format

* Fix thread-safe implementation

* Use device implicit synchronization

* Update workflow to use custom release

* Remove testing branch workflow

* Disable set_rows until it's implemented

* Fix potential issue around empty queue submission

* Try synchronous submission

* Try waiting on all futures explicitly

* Add debug

* Add more debug messages

* Work on getting ssh access for debugging

* Debug on failure

* Disable other tests

* Remove extra if

* Try more locking

* maybe passes?

* test

* Some cleanups

* Restore build file

* Remove extra testing branch ci

b6092

05 Aug 13:42
c81de6e
Fix `glm4moe` bug (#15088)

b6067

02 Aug 11:24
f738989
chat : fix multiple tool_calls on hermes-2-pro (#14962)

b6006

27 Jul 23:05
7f97599
quantize : update README.md (#14905)

* Update README.md

* Fix trailing whitespace

* Update README.md

Co-authored-by: Sigbjørn Skjæret

b5981

24 Jul 18:19
e4868d1
context : perform output reorder lazily upon access after sync (#14853)

* context : perform output reorder lazily upon access after sync

ggml-ci

* cont : add TODO

b5939

19 Jul 16:19
f0d4d17
Documentation: Update build.md's Vulkan section (#14736)

* Documentation: Rewrote and updated the "Without docker" portion of the Vulkan backend build documentation.

* Documentation: Reorganize build.md's Vulkan section.