Releases: ggml-org/llama.cpp

b7102

19 Nov 10:23
6fd4f95

Fix too relaxed check on CUDA "fast copy" (can_be_transposed) conditi…

b7101

19 Nov 10:12
980b7cd

vulkan: force full subgroups for flash attention to fix intel subgrou…

b7100

19 Nov 07:39
c49daff

ggml-cpu: Don't pass -mpowerpc64 when -mcpu already implies it (#17308)

b7097

18 Nov 20:01
1920345

common : Generalized XML-style tool-call parsing with streaming suppo…

b7096

18 Nov 18:55
561a3e2

ci : change the openEuler-310p image to fix release (#17361)

b7091

18 Nov 07:51
da95bf2

vulkan: support noncontig i32 copy (#17328)

b7090

18 Nov 03:11
0de8878

server: split HTTP into its own interface (#17216)

* server: split HTTP into its own interface

* move server-http and httplib to its own file

* add the remaining endpoints

* fix exception/error handling

* renaming

* missing header

* fix missing windows header

* fix error responses from http layer

* fix slot save/restore handler

* fix case where only one stream chunk is returned

* add NOMINMAX

* do not call sink.write on empty data

* use safe_json_to_str for SSE

* clean up

* add some comments

* improve usage of next()

* bring back the "server is listening on" message

* more generic handler

* add req.headers

* move the chat template print to init()

* add req.path

* cont : minor

---------

Co-authored-by: Georgi Gerganov <[email protected]>
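
For readers unfamiliar with this kind of refactor, here is a minimal sketch of what a transport-agnostic handler interface can look like. It is not the code from #17216: every name below (`sketch_request`, `sketch_response`, `sketch_handler`, `sketch_drain`) is hypothetical and chosen only for illustration. The sketch assumes a request that carries `path`, `headers` and a body, and a response that is either a single buffer or a stream pulled through a `next()` callback, and it shows two behaviours the commit list mentions: a stream that ends after a single chunk, and empty chunks never reaching the writer.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical request type: only what a handler needs, no HTTP library types.
struct sketch_request {
    std::string path;
    std::map<std::string, std::string> headers;
    std::string body;
};

// Hypothetical response type: either a complete body in `data`,
// or a stream produced by repeated calls to `next`.
struct sketch_response {
    int status = 200;
    std::string content_type = "application/json";
    std::string data;
    // Fills `chunk` and returns true if more chunks may follow, false on the last one.
    std::function<bool(std::string & chunk)> next;
};

// Handlers depend only on the two structs above, so the HTTP layer can be swapped.
using sketch_handler = std::function<sketch_response(const sketch_request &)>;

// Stand-in for the transport layer: collects what would be written to the socket.
std::vector<std::string> sketch_drain(sketch_response & res) {
    std::vector<std::string> written;
    if (!res.next) {
        // Non-streaming response: write the body once, if there is one.
        if (!res.data.empty()) {
            written.push_back(res.data);
        }
        return written;
    }
    std::string chunk;
    while (true) {
        chunk.clear();
        const bool more = res.next(chunk);
        // Skip empty chunks instead of issuing a zero-length write.
        if (!chunk.empty()) {
            written.push_back(chunk);
        }
        // The final chunk is written before stopping, so a one-chunk stream is not lost.
        if (!more) {
            break;
        }
    }
    return written;
}
```

In this sketch a real server would replace `sketch_drain` with a loop that forwards each chunk to the HTTP library's writer; the handlers themselves would not change.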

b7089

18 Nov 03:03
38e2c1b

vulkan: add log RTE support to fix Nvidia CI (#17320)

* vulkan: add log RTE support to fix Nvidia CI

* actually use the rte shader

b7088

18 Nov 02:54
cb44fc8

cmake : fix ARM feature verification (#17170)

* cmake : fix ARM feature verification

Use check_cxx_source_compiles to prevent conflicts with
the existing GGML_NATIVE detection code.

Signed-off-by: Adrien Gallouët <[email protected]>

* cmake : unset __ARM_FEATURE when feature is disabled

Signed-off-by: Adrien Gallouët <[email protected]>

* cmake : fix scope, this is really a macro

Signed-off-by: Adrien Gallouët <[email protected]>

* arm_neon.h is useless

Signed-off-by: Adrien Gallouët <[email protected]>

---------

Signed-off-by: Adrien Gallouët <[email protected]>

b7087

17 Nov 15:47
cb623de

ggml : add missing AVX512 feature checks (#17270)

_mm512_cvtepu8_epi16        requires  __AVX512BW__
_mm512_srli_epi16           requires  __AVX512BW__
__builtin_ia32_inserti32x8  requires  __AVX512DQ__

Signed-off-by: Adrien Gallouët <[email protected]>
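
The requirement listed above can be illustrated with a small guard. This is a sketch under stated assumptions, not the change made in #17270: the function name `high_nibbles_u16` is hypothetical, and the guard relies on the compiler-defined macros `__AVX512F__` and `__AVX512BW__`. The vector path runs only when both are defined, because `_mm512_cvtepu8_epi16` and `_mm512_srli_epi16` are AVX512BW instructions; otherwise a scalar loop produces the same result.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__AVX512F__) && defined(__AVX512BW__)
#include <immintrin.h>
#endif

// Hypothetical helper: for 32 input bytes, store the high nibble of each byte
// as a 16-bit value.
void high_nibbles_u16(const uint8_t * src, uint16_t * dst) {
#if defined(__AVX512F__) && defined(__AVX512BW__)
    // AVX512BW path: widen 32 x u8 to 32 x u16, then shift each lane right by 4.
    const __m256i in   = _mm256_loadu_si256(reinterpret_cast<const __m256i *>(src));
    const __m512i wide = _mm512_cvtepu8_epi16(in);
    const __m512i hi   = _mm512_srli_epi16(wide, 4);
    _mm512_storeu_si512(reinterpret_cast<void *>(dst), hi);
#else
    // Portable fallback used when the compiler was not told AVX512BW is available.
    for (size_t i = 0; i < 32; ++i) {
        dst[i] = static_cast<uint16_t>(src[i] >> 4);
    }
#endif
}
```

Checking only `__AVX512F__` would not be enough here, since a target can provide AVX512F without AVX512BW. These macros describe what the compiler was allowed to emit, so the check happens at build time, not at run time.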