Popular repositories
-
exllamav2 (public, forked from turboderp-org/exllamav2)
A fast inference library for running LLMs locally on modern consumer-class GPUs
Python
-
GPTQModel (public, forked from ModelCloud/GPTQModel)
LLM model quantization (compression) toolkit with hw acceleration support for Nvidia CUDA, AMD ROCm, Intel XPU and Intel/AMD/Apple CPU via HF, vLLM, and SGLang.
Python