Releases · ggml-org/llama.cpp

24 Nov 04:27

923ae3c

b7139

hexagon: add support for ROPE_NEOX (#17458)

24 Nov 02:29

github-actions

b7138

01ad35e

CANN: Define `cann_graph_update_required` before macro (#17434)

**Description of the problem**

`cann_graph_update_required` is redundantly defined and
initialized as `false` inside two mutually exclusive macro branches.

**Proposed solution**

Define it once, right before the macro, so that it serves both
branches.
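The change described above can be sketched in isolation. This is a minimal illustration, not the actual CANN backend code: only `cann_graph_update_required` comes from the commit message, while the macro name `USE_GRAPH_CAPTURE` and the functions `graph_changed`, `compute_graph`, and `self_check` are hypothetical stand-ins.

```cpp
#include <cassert>

// Hypothetical build option standing in for the real macro.
// Uncomment to compile the graph-capture branch instead.
// #define USE_GRAPH_CAPTURE

// Hypothetical helper: treat the cached graph as stale when the node count differs.
inline bool graph_changed(int cached_nodes, int current_nodes) {
    return cached_nodes != current_nodes;
}

// Returns true when the cached graph would need to be updated.
inline bool compute_graph(int cached_nodes, int current_nodes) {
    // Defined and initialized once, before the conditional block,
    // so both mutually exclusive branches share the same variable.
    bool cann_graph_update_required = false;

#ifdef USE_GRAPH_CAPTURE
    // Graph-capture build: decide whether the cached graph must be rebuilt.
    cann_graph_update_required = graph_changed(cached_nodes, current_nodes);
#else
    // Eager build: nothing is cached, so the flag keeps its default value.
    (void) cached_nodes;
    (void) current_nodes;
#endif

    return cann_graph_update_required;
}

// Usage check for the default build, where USE_GRAPH_CAPTURE is not defined.
inline void self_check() {
    assert(!compute_graph(3, 4));
}
```

Hoisting the definition removes the duplicated `bool ... = false;` line from each branch, so the initial value only has to be maintained in one place.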

24 Nov 01:26

github-actions

b7137

fcb0138

ggml-hexagon: Initial Hexagon v68/v69 support (#17394)

* ggml-hexagon: fix build error with GCC

Add stdexcept include to fix GCC build errors

Signed-off-by: Mohamed Mediouni <[email protected]>

* ggml-hexagon: check VTCM acquire failures

Signed-off-by: Mohamed Mediouni <[email protected]>

* ggml-hexagon: disable destination bypass on older than v73

v68 errors out if bypass is enabled when the VTCM is the destination.

At least on v68 this made things actually work. It is not a proper fix though, so it needs another look later.

Signed-off-by: Mohamed Mediouni <[email protected]>

* ggml-hexagon: add initial v68/v69 support

v68 is the Hexagon revision notably used on the Snapdragon 8cx
Gen 3 and the QCM6490.

Also add support for v69.

8MB isn't a supported page size, so relax the requested page size constraint
for HAP_compute_res_attr_set_vtcm_param_v2 to optimal.

Signed-off-by: Mohamed Mediouni <[email protected]>

---------

Signed-off-by: Mohamed Mediouni <[email protected]>

24 Nov 00:18

github-actions

b7136

d5bc1ad

ggml-hexagon: add `hex_supported_buffer` for better buffer supported …

23 Nov 10:51

github-actions

b7134

96ac5a2

cuda : support non-contiguous i32 to i32 copy (#17326)

* support non-contiguous i32 to i32 copy

* add tests

* rename cpy_flt to cpy_scalar and reindent params
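To illustrate what a non-contiguous i32 to i32 copy involves, here is a minimal CPU-side sketch. It is not the CUDA kernel from this change: the `View2D` struct and the functions `copy_i32_to_contiguous` and `demo_transposed_copy` are hypothetical names, and the sketch assumes a 2D source addressed by byte strides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical 2D view: element (row, col) lives at
// base + row * row_stride_bytes + col * col_stride_bytes.
struct View2D {
    const unsigned char * base;
    std::size_t rows;
    std::size_t cols;
    std::size_t row_stride_bytes;
    std::size_t col_stride_bytes;
};

// Gather a possibly non-contiguous int32 view into a contiguous row-major buffer.
inline std::vector<int32_t> copy_i32_to_contiguous(const View2D & src) {
    std::vector<int32_t> dst(src.rows * src.cols);
    for (std::size_t r = 0; r < src.rows; ++r) {
        for (std::size_t c = 0; c < src.cols; ++c) {
            const unsigned char * p =
                src.base + r * src.row_stride_bytes + c * src.col_stride_bytes;
            // memcpy keeps the element read free of aliasing issues.
            std::memcpy(&dst[r * src.cols + c], p, sizeof(int32_t));
        }
    }
    return dst;
}

// Usage: view a 2x3 array as its 3x2 transpose by swapping the strides,
// which makes the source non-contiguous in row-major order.
inline std::vector<int32_t> demo_transposed_copy() {
    static const int32_t data[2][3] = {{1, 2, 3}, {4, 5, 6}};
    const View2D transposed = {
        reinterpret_cast<const unsigned char *>(data),
        3,                    // rows of the transposed view
        2,                    // cols of the transposed view
        sizeof(int32_t),      // stepping one row moves one element in memory
        3 * sizeof(int32_t),  // stepping one column moves one source row
    };
    return copy_i32_to_contiguous(transposed);
}
```

A contiguous copy could be a single `memcpy`; the strided version has to compute each source address separately, which is the extra work such a copy path has to handle.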

23 Nov 07:03

github-actions

b7132

54d83bb

vulkan: remove a couple unnecessary switches (#17419)

22 Nov 10:26

github-actions

b7130

3f3a4fb

Revive MUL_MAT_ID to perf testing (#17397)

22 Nov 00:38

github-actions

b7129

028f93e

HIP: RDNA4 tensor core support for MMF (#17077)

* mmf for rdna4

* align the padding for rdna4

* forbid mul_mat_f for rdna4

* fix as comment

* remove device kernels

* add constexpr for early return

* update based on review comment

* change based on the review comment

* pass compile error

* keep code consistency

---------

Co-authored-by: zhang hui <[email protected]>

21 Nov 23:16

github-actions

b7128

8e9ddba

opencl: refine condition for kqv mm (#17392)
