amd-vlarakic

Popular repositories

exllamav2 (Public)

Forked from turboderp-org/exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs

Python

GPTQModel (Public)

Forked from ModelCloud/GPTQModel

LLM model quantization (compression) toolkit with hw acceleration support for Nvidia CUDA, AMD ROCm, Intel XPU and Intel/AMD/Apple CPU via HF, vLLM, and SGLang.

Python