Releases: createthis/llama.cpp
b6240
b6237
vulkan : support conv_2d_dw with f16 weights (#15392)
b6230
ci : continue file download with wget (#15471) ggml-ci
b6134
CANN: Add broadcast for softmax and FA (#15208)
* refactor softmax
* fix fa
* fix mask shape
* format
* add comments
* Remove whitespace
b6097
ggml: WebGPU disable SET_ROWS for now (#15078)
* Add parameter buffer pool, batching of submissions, refactor command building/submission
* Add header for linux builds
* Free staged parameter buffers at once
* Format with clang-format
* Fix thread-safe implementation
* Use device implicit synchronization
* Update workflow to use custom release
* Remove testing branch workflow
* Disable set_rows until it's implemented
* Fix potential issue around empty queue submission
* Try synchronous submission
* Try waiting on all futures explicitly
* Add debug
* Add more debug messages
* Work on getting ssh access for debugging
* Debug on failure
* Disable other tests
* Remove extra if
* Try more locking
* maybe passes?
* test
* Some cleanups
* Restore build file
* Remove extra testing branch ci
b6092
Fix `glm4moe` bug (#15088)
b6067
chat : fix multiple tool_calls on hermes-2-pro (#14962)
b6006
quantize : update README.md (#14905)
* Update README.md
* Fix trailing whitespace
* Update README.md
Co-authored-by: Sigbjørn Skjæret
b5981
context : perform output reorder lazily upon access after sync (#14853)
* context : perform output reorder lazily upon access after sync
  ggml-ci
* cont : add TODO
b5939
Documentation: Update build.md's Vulkan section (#14736)
* Documentation: Rewrote and updated the "Without docker" portion of the Vulkan backend build documentation.
* Documentation: Reorganize build.md's Vulkan section.