kylesayrs
Contributor

Purpose

  • Refactor initialize to support passing the shapes of activations
    • While the full shapes of activations are only known at inference/calibration time, their last dimensions can be inferred from the weight
    • This information is required for group/head quantization, since these strategies need to know the activation shape at init time
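
As a rough illustration of the point above, the sketch below infers the last dimension of the input and output activations from a weight shape. It assumes the `torch.nn.Linear` layout, where the weight has shape `(out_features, in_features)`. The function name `infer_activation_last_dims` is hypothetical and is not taken from this PR.

```python
from typing import Tuple


def infer_activation_last_dims(weight_shape: Tuple[int, ...]) -> Tuple[int, int]:
    """Return (input_last_dim, output_last_dim) for a linear-style weight.

    Hypothetical helper: assumes the torch.nn.Linear layout, where the
    weight has shape (out_features, in_features).
    """
    if len(weight_shape) != 2:
        raise ValueError(f"expected a 2D weight shape, got {weight_shape}")
    out_features, in_features = weight_shape
    # Input activations end in in_features; outputs end in out_features.
    return in_features, out_features


# Leading (batch/sequence) dimensions stay unknown until calibration,
# but the last dimension is already fixed by the weight.
input_last_dim, output_last_dim = infer_activation_last_dims((4096, 1024))
```

With the last dimension known at init time, a group strategy could, for example, count the groups along that dimension before any activation has been observed.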

Changes

  • Use the module weight to infer activation and output quantization shapes
  • Rename _initialize_scale_zero_point to initialize_qparams
  • Introduce strategy_cdiv to share logic related to validating divisibility
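
For readers unfamiliar with the idea behind a shared divisibility check, here is a minimal sketch of ceiling division with validation. The name `checked_cdiv`, its parameters, and the `strict` flag are assumptions made for illustration; the actual signature and behavior of `strategy_cdiv` are not shown in this conversation.

```python
def checked_cdiv(value: int, divisor: int, strict: bool = True) -> int:
    """Ceiling-divide value by divisor, validating divisibility.

    Hypothetical helper, not the PR's strategy_cdiv. With strict=True a
    non-divisible pair raises; with strict=False the result rounds up.
    """
    if divisor <= 0:
        raise ValueError(f"divisor must be positive, got {divisor}")
    if strict and value % divisor != 0:
        raise ValueError(f"{value} is not evenly divisible by {divisor}")
    # Integer ceiling division without going through floats.
    return -(-value // divisor)


# Example: number of quantization groups along a dimension of size 1024.
num_groups = checked_cdiv(1024, 128)
```

Centralizing the check in one helper means every strategy that splits a dimension reports mismatched sizes in the same way.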

Signed-off-by: Kyle Sayers <[email protected]>
@kylesayrs kylesayrs marked this pull request as draft September 30, 2025 14:51
@kylesayrs
Contributor Author

Drafting for now before release

@kylesayrs
Contributor Author

I have no clue why tests/test_utils/test_helpers.py is reported to fail, nor why the CI seems to stall. All other tests pass.

Signed-off-by: Kyle Sayers <[email protected]>
@kylesayrs kylesayrs marked this pull request as ready for review October 1, 2025 15:31

@kylesayrs kylesayrs left a comment

