
Intel Neural Compressor Release 3.6

@thuang6 released this 24 Oct 02:59
· 22 commits to master since this release
v3.6
ffe0d73
  • Highlights
  • Features
  • Improvements
  • Validated Hardware
  • Validated Configurations

Highlights

  • Introduced experimental quantization support for new low-precision data types, including MXFP8 and MXFP4

Features

  • Support MXFP8 Post-Training Quantization (PTQ) on LLM models (experimental)
  • Support MXFP8 PTQ on diffusion models (experimental)
  • Support MXFP4 PTQ on LLM models (experimental)
  • Support Quantization-Aware Training (QAT) on LLM models (experimental)
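For context on the MX data types above, the snippet below is a minimal, illustrative sketch of the block-scaling idea behind MXFP8 as defined by the OCP Microscaling (MX) specification: values are grouped into blocks of 32 that share a single power-of-two scale, with each element stored in FP8 (E4M3, whose largest normal magnitude is 448). This is not the Intel Neural Compressor API; all function names here are hypothetical, and per-element FP8 rounding is omitted for brevity.

```python
# Hypothetical sketch of MX-style block quantization (illustrative only;
# not the Intel Neural Compressor API).
import math

BLOCK_SIZE = 32    # MX block size from the OCP Microscaling spec
E4M3_MAX = 448.0   # largest normal magnitude representable in FP8 E4M3

def quantize_block(block):
    """Return (shared_exponent, scaled_values) for one MX block."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return 0, [0.0] * len(block)
    # Shared power-of-two scale chosen so the largest element
    # lands within the E4M3 range.
    shared_exp = math.floor(math.log2(amax)) - math.floor(math.log2(E4M3_MAX))
    scale = 2.0 ** shared_exp
    # Scale, then clamp to the E4M3 range (element rounding omitted).
    scaled = [max(-E4M3_MAX, min(E4M3_MAX, v / scale)) for v in block]
    return shared_exp, scaled

def dequantize_block(shared_exp, scaled):
    scale = 2.0 ** shared_exp
    return [v * scale for v in scaled]

# One 32-element block; scaling by a power of two is lossless here,
# so the round trip recovers the inputs exactly.
values = [0.5, -3.25, 100.0, 0.01] + [0.0] * 28
exp, q = quantize_block(values)
restored = dequantize_block(exp, q)
```

MXFP4 follows the same block structure but stores elements in a 4-bit float format, trading further precision for a 2x smaller element footprint.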

Improvements

  • New LLM example (Llama 3 series) for MXFP4 / MXFP8 PTQ
  • New VLM example (Llama 4 Scout) for MXFP4 PTQ
  • New diffusion example (Flux) for MXFP8 PTQ
  • New LLM example (Llama 3) for MXFP8 QAT
  • Added a static safety check for the evaluation function in the 2.x API

Validated Hardware

  • Intel Gaudi AI Accelerators (Gaudi 2 and 3)
  • Intel Xeon Scalable Processors (4th, 5th, 6th Gen)
  • Intel Core Ultra Processors (Series 1 and 2)
  • Intel Data Center GPU Max Series (1550)
  • Intel® Arc™ B-Series Graphics GPU (B580)

Validated Configurations

  • Ubuntu 24.04 & Windows 11
  • Python 3.10, 3.11, 3.12
  • PyTorch/IPEX 2.7, 2.8