
Conversation

@dsikka (Collaborator) commented Aug 29, 2025

Summary

  • Moves format inference functionality from llmcompressor to compressed-tensors ([QuantizationFormat] Remove code inferring format vllm-project/llm-compressor#1786)
  • Simplifies the logic to set the format based on the quant_scheme and sparsity scheme when a format is not already set by the user
  • Uses the per-module formats to determine the global format for the model
  • Adds docstrings
  • Adds loguru logging
  • Returns dense rather than None when no format can be inferred (see the sketch after this list)
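
The flow above can be pictured with a short sketch. This is a hedged illustration only: the helper names (`infer_per_module_format`, `infer_global_format`), the `WeightArgs` stand-in, and the bit-width rules are assumptions, not the exact compressed-tensors implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class CompressionFormat(str, Enum):
    # Enum values mirror compressed-tensors format names
    dense = "dense"
    int_quantized = "int-quantized"
    pack_quantized = "pack-quantized"
    float_quantized = "float-quantized"


@dataclass
class WeightArgs:
    """Hypothetical stand-in for a module's weight quantization args."""
    type: str = "int"
    num_bits: int = 8


def infer_per_module_format(
    weights: Optional[WeightArgs], user_format: Optional[str] = None
) -> CompressionFormat:
    """Pick a module's format from its quant args unless the user set one."""
    if user_format is not None:
        return CompressionFormat(user_format)
    if weights is None:
        # Nothing to infer: fall back to dense rather than returning None
        return CompressionFormat.dense
    if weights.type == "float":
        return CompressionFormat.float_quantized
    if weights.num_bits == 4:
        # Sub-byte weights get packed; rule simplified for illustration
        return CompressionFormat.pack_quantized
    return CompressionFormat.int_quantized


def infer_global_format(per_module: List[CompressionFormat]) -> CompressionFormat:
    """Collapse the per-module formats into one model-wide format."""
    unique = set(per_module)
    if len(unique) == 1:
        return unique.pop()
    # Empty or mixed formats: default to dense here; the real logic may differ
    return CompressionFormat.dense
```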

Next Steps

  • We can set the per-module format when the quantization scheme is initialized, but that requires some changes to how we handle the sparsity config/structure; simplifying this is the next step. For now, the format is set at compression time, at the end.

@brian-dellabetta (Contributor) left a comment

LGTM. A few code-related things; nothing pops out as wrong in the implementation, but deferring to others on that 😄

@dsikka dsikka requested a review from kylesayrs September 8, 2025 18:46
@shanjiaz (Contributor) left a comment

Looks good to me!

@kylesayrs (Contributor) left a comment

Can you add a comment on QuantizationScheme specifying that None means the value will be inferred by infer_and_set_per_module_quantization_format before compression?

Otherwise looks great, thanks for doing this
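
A hedged sketch of what that annotation could look like; the field name, default, and comment wording are assumptions, not the merged diff:

```python
from typing import Optional

from pydantic import BaseModel


class QuantizationScheme(BaseModel):
    """Quantization settings applied to a set of module targets."""

    targets: list[str] = []
    # None means the format will be inferred by
    # infer_and_set_per_module_quantization_format before compression
    format: Optional[str] = None
```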

@kylesayrs kylesayrs enabled auto-merge (squash) September 8, 2025 22:32
@kylesayrs kylesayrs merged commit 141cbba into main Sep 8, 2025
2 checks passed
@kylesayrs kylesayrs deleted the update_format branch September 8, 2025 22:32
@kylesayrs (Contributor) commented

#449

dsikka added a commit that referenced this pull request Sep 9, 2025
dsikka added a commit that referenced this pull request Sep 9, 2025
@dsikka dsikka restored the update_format branch September 9, 2025 00:01
Etelis added a commit to Etelis/compressed-tensors that referenced this pull request Sep 11, 2025

* add format infer code

* update

* update

* add loguru

* use dense not None
Etelis added a commit to Etelis/compressed-tensors that referenced this pull request Sep 11, 2025
kylesayrs pushed a commit that referenced this pull request Sep 11, 2025
* add format infer code

* update

* update

* add loguru

* use dense not None
kylesayrs pushed a commit that referenced this pull request Sep 11, 2025