
Regarding the Hadamard transform accuracy drop for nvfp4 with RTN in the paper #19

@TheTinyTeddy

Description


Many thanks for the great work!

I was wondering about the Hadamard transform accuracy drop for NVFP4 with RTN reported in the paper. Is it because the Hadamard block size is too small (16, vs. 32 for INT4 and MXFP4), so that the outlier-smoothing effect of the HT is weakened? Also, could randomness (i.e., a randomized Hadamard transform, RHT) improve the result?
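For intuition on the block-size question, here is a minimal NumPy sketch (the helper names `hadamard` and `blockwise_ht` are hypothetical, not from the paper's code) showing how a normalized Hadamard transform applied per block spreads a single outlier's energy across all coefficients of its block, which is the smoothing effect that a smaller block weakens:

```python
import numpy as np

def hadamard(n):
    # Sylvester construction; n must be a power of two.
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def blockwise_ht(x, block=16):
    # Apply a normalized Hadamard transform independently to each
    # contiguous block of `block` elements (orthogonal, norm-preserving).
    H = hadamard(block) / np.sqrt(block)
    return (x.reshape(-1, block) @ H.T).reshape(-1)

# Inject one large outlier: before the transform, the block containing it
# has a huge max/mean ratio (bad for blockwise RTN scaling); after the
# transform, the outlier's energy is shared among all `block` coefficients.
rng = np.random.default_rng(0)
x = rng.normal(size=64)
x[5] = 50.0  # outlier in the first block
for b in (16, 32):
    y = blockwise_ht(x, b)
    first = np.abs(y[:b])
    print(f"block={b}: max/mean after HT = {first.max() / first.mean():.2f}")
```

With a larger block, the outlier is spread over more coefficients (each scaled by 1/sqrt(block)), so the per-block dynamic range seen by the quantizer shrinks further; this is a sketch of the intuition, not the paper's implementation.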

Furthermore, would you consider decreasing the quantization block size / Hadamard transform size from 32 to 16 for INT4 as an additional option to verify this?

Kind regards
