
Intel Neural Compressor Release 3.6

@thuang6 released this 24 Oct 02:59
· 22 commits to master since this release
v3.6
ffe0d73
  • Highlights
  • Features
  • Improvements
  • Validated Hardware
  • Validated Configurations

Highlights

  • Introduced experimental quantization support for new low-precision data types, including MXFP8 and MXFP4

Features

  • Support MXFP8 Post-Training Quantization (PTQ) on LLM models (experimental)
  • Support MXFP8 PTQ on diffusion models (experimental)
  • Support MXFP4 PTQ on LLM models (experimental)
  • Support Quantization-Aware Training (QAT) on LLM models (experimental)
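For context on the MX data types above, the snippet below is a minimal, illustrative sketch of the block-scaling idea behind MXFP8 as defined by the OCP Microscaling (MX) specification: values are grouped into blocks of 32 that share a single power-of-two scale, with each element stored in FP8 (E4M3, whose largest normal magnitude is 448). This is not the Intel Neural Compressor API; all function names here are hypothetical, and per-element FP8 rounding is omitted for brevity.

```python
# Hypothetical sketch of MX-style block quantization (illustrative only;
# not the Intel Neural Compressor API).
import math

BLOCK_SIZE = 32    # MX block size from the OCP Microscaling spec
E4M3_MAX = 448.0   # largest normal magnitude representable in FP8 E4M3

def quantize_block(block):
    """Return (shared_exponent, scaled_values) for one MX block."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return 0, [0.0] * len(block)
    # Shared power-of-two scale chosen so the largest element
    # lands within the E4M3 range.
    shared_exp = math.floor(math.log2(amax)) - math.floor(math.log2(E4M3_MAX))
    scale = 2.0 ** shared_exp
    # Scale, then clamp to the E4M3 range (element rounding omitted).
    scaled = [max(-E4M3_MAX, min(E4M3_MAX, v / scale)) for v in block]
    return shared_exp, scaled

def dequantize_block(shared_exp, scaled):
    scale = 2.0 ** shared_exp
    return [v * scale for v in scaled]

# One 32-element block; scaling by a power of two is lossless here,
# so the round trip recovers the inputs exactly.
values = [0.5, -3.25, 100.0, 0.01] + [0.0] * 28
exp, q = quantize_block(values)
restored = dequantize_block(exp, q)
```

MXFP4 follows the same block structure but stores elements in a 4-bit float format, trading further precision for a 2x smaller element footprint.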

Improvements

  • New LLM example (Llama 3 series) for MXFP4 / MXFP8 PTQ
  • New VLM example (Llama 4 Scout) for MXFP4 PTQ
  • New diffusion example (Flux) for MXFP8 PTQ
  • New LLM example (Llama 3) for MXFP8 QAT
  • Added a static safety check for the evaluation function in the 2.x API

Validated Hardware

  • Intel Gaudi AI Accelerators (Gaudi 2 and 3)
  • Intel Xeon Scalable Processors (4th, 5th, 6th Gen)
  • Intel Core Ultra Processors (Series 1 and 2)
  • Intel Data Center GPU Max Series (1550)
  • Intel® Arc™ B-Series Graphics GPU (B580)

Validated Configurations

  • Ubuntu 24.04 & Windows 11
  • Python 3.10, 3.11, 3.12
  • PyTorch/IPEX 2.7, 2.8