[mxfp8] Make mxfp8 dim1 cast kernel configurable #1401

Closed
wants to merge 2 commits into from
Conversation

danielvegamyhre
Contributor

@danielvegamyhre danielvegamyhre commented Jul 15, 2025

Summary

  • We recently developed a CUDA kernel in torchao to perform mxfp8 casting with scaling along dim1, which is ~1.4x faster than the previous Triton implementation. This yields an e2e training speedup of 1.5%-2.5% with torchtitan Llama3 8B with FSDP=4/8: Add CUDA kernel for MXFP8 dim1 casting ao#2513
  • The integration into torchao is finished (integration of new mxfp8 casting cuda kernel ao#2564), so we need to update torchtitan to make the kernel choice for the mxfp8 dim1 cast configurable as "triton", "cuda", or "torch".
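A minimal sketch of what the configurable kernel choice could look like on the torchtitan side. The enum and function names here are hypothetical illustrations, not the actual torchtitan/torchao API; the only facts taken from the PR are the config flag name `mxfp8_dim1_cast_kernel_choice` and its three valid values.

```python
from enum import Enum


class MXFP8Dim1CastKernelChoice(Enum):
    """Hypothetical enum for the three kernel choices named in this PR."""
    TRITON = "triton"
    CUDA = "cuda"
    TORCH = "torch"


def parse_kernel_choice(name: str) -> MXFP8Dim1CastKernelChoice:
    """Validate a --mx.mxfp8_dim1_cast_kernel_choice value from the CLI/TOML config.

    Illustrative helper only; torchtitan's real config parsing may differ.
    """
    try:
        return MXFP8Dim1CastKernelChoice(name.lower())
    except ValueError:
        valid = ", ".join(c.value for c in MXFP8Dim1CastKernelChoice)
        raise ValueError(
            f"Unknown mxfp8 dim1 cast kernel {name!r}; expected one of: {valid}"
        ) from None
```

The resulting enum value would then be forwarded to torchao's mxfp8 config when building the model converter, so an invalid string fails fast at startup rather than mid-training.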

Test plan

  • Triton: NGPU=8 CONFIG_FILE="./torchtitan/models/llama3/train_configs/llama3_8b.toml" ./run_train.sh --training.steps=100 --model.converters="mx" --mx.recipe_name="mxfp8" --training.compile --mx.mxfp8_dim1_cast_kernel_choice="triton"
  • Cuda: NGPU=8 CONFIG_FILE="./torchtitan/models/llama3/train_configs/llama3_8b.toml" ./run_train.sh --training.steps=100 --model.converters="mx" --mx.recipe_name="mxfp8" --training.compile --mx.mxfp8_dim1_cast_kernel_choice="cuda"

@danielvegamyhre
Contributor Author

cc @tianyu-l @vkuzo

Labels: CLA Signed

2 participants