Releases: ggml-org/llama.cpp
b7102
b7101
vulkan: force full subgroups for flash attention to fix intel subgrou…
b7100
ggml-cpu: Don't pass -mpowerpc64 when -mcpu already implies it (#17308)
b7097
common : Generalized XML-style tool-call parsing with streaming suppo…
b7096
ci : change the openEuler-310p image to fix release (#17361)
b7091
vulkan: support noncontig i32 copy (#17328)
b7090
server: split HTTP into its own interface (#17216)
* server: split HTTP into its own interface
* move server-http and httplib to its own file
* add the remaining endpoints
* fix exception/error handling
* renaming
* missing header
* fix missing windows header
* fix error responses from http layer
* fix slot save/restore handler
* fix case where only one stream chunk is returned
* add NOMINMAX
* do not call sink.write on empty data
* use safe_json_to_str for SSE
* clean up
* add some comments
* improve usage of next()
* bring back the "server is listening on" message
* more generic handler
* add req.headers
* move the chat template print to init()
* add req.path
* cont : minor
---------
Co-authored-by: Georgi Gerganov
b7089
vulkan: add log RTE support to fix Nvidia CI (#17320)
* vulkan: add log RTE support to fix Nvidia CI
* actually use the rte shader
b7088
cmake : fix ARM feature verification (#17170)
* cmake : fix ARM feature verification
  Use check_cxx_source_compiles to prevent conflicts with the existing GGML_NATIVE detection code.
  Signed-off-by: Adrien Gallouët
* cmake : unset __ARM_FEATURE when feature is disabled
  Signed-off-by: Adrien Gallouët
* cmake : fix scope, this is really a macro
  Signed-off-by: Adrien Gallouët
* arm_neon.h is useless
  Signed-off-by: Adrien Gallouët
---------
Signed-off-by: Adrien Gallouët
b7087
ggml : add missing AVX512 feature checks (#17270)
_mm512_cvtepu8_epi16 requires __AVX512BW__
_mm512_srli_epi16 requires __AVX512BW__
__builtin_ia32_inserti32x8 requires __AVX512DQ__
Signed-off-by: Adrien Gallouët
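The feature requirements listed in the AVX512 entry above can be illustrated with a minimal sketch. This is not code from ggml: the helper name `widen_u8_shr1` is hypothetical, and the guard shown is only one plausible way to express the check. It demonstrates the general idea that `_mm512_cvtepu8_epi16` and `_mm512_srli_epi16` should be compiled only when `__AVX512BW__` is defined, with a scalar path used otherwise. The `__AVX512DQ__` case is not covered here.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__AVX512F__) && defined(__AVX512BW__)
#include <immintrin.h>
#endif

/* Hypothetical helper (not part of ggml): widen n bytes to 16-bit lanes
 * and shift each lane right by one bit. */
static void widen_u8_shr1(const uint8_t *src, uint16_t *dst, size_t n) {
    size_t i = 0;

#if defined(__AVX512F__) && defined(__AVX512BW__)
    /* Both intrinsics below belong to AVX512BW, so checking __AVX512F__
     * alone would not be enough to make this path safe to compile. */
    for (; i + 32 <= n; i += 32) {
        __m256i bytes = _mm256_loadu_si256((const __m256i *)(src + i));
        __m512i words = _mm512_cvtepu8_epi16(bytes); /* requires AVX512BW */
        words = _mm512_srli_epi16(words, 1);         /* requires AVX512BW */
        _mm512_storeu_si512((void *)(dst + i), words);
    }
#endif

    /* Scalar path: handles the tail, or the whole array when the
     * vector path is compiled out. */
    for (; i < n; ++i) {
        dst[i] = (uint16_t)(src[i] >> 1);
    }
}
```

Because the scalar loop always runs for the remaining elements, the function should produce the same result whether or not the compiler targets AVX512BW (for example with `-mavx512bw` on GCC or Clang).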