
Conversation

@dsikka (Collaborator) commented Aug 29, 2025

Summary

  • Moves format inference functionality from llmcompressor to compressed-tensors ([QuantizationFormat] Remove code inferring format vllm-project/llm-compressor#1786)
  • Simplifies the logic to set the format based on the quant_scheme and sparsity scheme when a format is not already set by the user
  • Uses the per-module formats to determine the global format for the model
  • Adds docstrings
  • Adds loguru logging
  • Returns dense rather than None when no format can be inferred (see the sketch after this list)
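
The flow above can be pictured with a short sketch. This is a hedged illustration only: the helper names (`infer_per_module_format`, `infer_global_format`), the `WeightArgs` stand-in, and the bit-width rules are assumptions, not the exact compressed-tensors implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class CompressionFormat(str, Enum):
    # Enum values mirror compressed-tensors format names
    dense = "dense"
    int_quantized = "int-quantized"
    pack_quantized = "pack-quantized"
    float_quantized = "float-quantized"


@dataclass
class WeightArgs:
    """Hypothetical stand-in for a module's weight quantization args."""
    type: str = "int"
    num_bits: int = 8


def infer_per_module_format(
    weights: Optional[WeightArgs], user_format: Optional[str] = None
) -> CompressionFormat:
    """Pick a module's format from its quant args unless the user set one."""
    if user_format is not None:
        return CompressionFormat(user_format)
    if weights is None:
        # Nothing to infer: fall back to dense rather than returning None
        return CompressionFormat.dense
    if weights.type == "float":
        return CompressionFormat.float_quantized
    if weights.num_bits == 4:
        # Sub-byte weights get packed; rule simplified for illustration
        return CompressionFormat.pack_quantized
    return CompressionFormat.int_quantized


def infer_global_format(per_module: List[CompressionFormat]) -> CompressionFormat:
    """Collapse the per-module formats into one model-wide format."""
    unique = set(per_module)
    if len(unique) == 1:
        return unique.pop()
    # Empty or mixed formats: default to dense here; the real logic may differ
    return CompressionFormat.dense
```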

Next Steps

  • We can set the per-module format when the quantization scheme is initialized, but that requires some changes to how we handle the sparsity config/structure; simplifying this is the next step. For now, the format is set at compression time, at the end.

@brian-dellabetta (Contributor) left a comment

LGTM. A few code-related things; nothing pops out as wrong in the implementation, but deferring to others on that 😄

@dsikka dsikka requested a review from kylesayrs September 8, 2025 18:46
@shanjiaz (Contributor) left a comment

Looks good to me!

@kylesayrs (Contributor) left a comment

Can you add a comment on QuantizationScheme specifying that None means the value will be inferred by infer_and_set_per_module_quantization_format before compression?

Otherwise looks great, thanks for doing this
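
A hedged sketch of what that annotation could look like; the field name, default, and comment wording are assumptions, not the merged diff:

```python
from typing import Optional

from pydantic import BaseModel


class QuantizationScheme(BaseModel):
    """Quantization settings applied to a set of module targets."""

    targets: list[str] = []
    # None means the format will be inferred by
    # infer_and_set_per_module_quantization_format before compression
    format: Optional[str] = None
```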

@kylesayrs kylesayrs enabled auto-merge (squash) September 8, 2025 22:32
@kylesayrs kylesayrs merged commit 141cbba into main Sep 8, 2025
2 checks passed
@kylesayrs kylesayrs deleted the update_format branch September 8, 2025 22:32
@kylesayrs (Contributor) commented

#449

dsikka added a commit that referenced this pull request Sep 9, 2025
dsikka added a commit that referenced this pull request Sep 9, 2025
@dsikka dsikka restored the update_format branch September 9, 2025 00:01
Etelis added a commit to Etelis/compressed-tensors that referenced this pull request Sep 11, 2025

* add format infer code

* update

* update

* add loguru

* use dense not None
Etelis added a commit to Etelis/compressed-tensors that referenced this pull request Sep 11, 2025
kylesayrs pushed a commit that referenced this pull request Sep 11, 2025
* add format infer code

* update

* update

* add loguru

* use dense not None
kylesayrs pushed a commit that referenced this pull request Sep 11, 2025