
Conversation

@horheynm commented on Mar 7, 2025

SUMMARY:
Initialize the Parameter for channel-wise output activation quantization on the Q/K/V attention projections. The O, Up, and Down projections are not quantized.

TEST PLAN:

  • Pass tests
  • Check shapes manually
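
For context, below is a minimal sketch of the kind of initialization this PR describes: registering per-output-channel scale/zero-point Parameters for output activation quantization on the Q/K/V projections while leaving the O (and Up/Down) projections untouched. The helper and module names (`init_channelwise_output_act_params`, `ToyAttention`) are illustrative only, not the actual compressed-tensors API.

```python
import torch
import torch.nn as nn


def init_channelwise_output_act_params(linear: nn.Linear) -> None:
    """Attach per-output-channel quantization Parameters for output activations.

    Hypothetical helper: one scale and one zero point per output channel,
    stored as non-trainable Parameters on the module.
    """
    num_channels = linear.out_features
    # shape (num_channels, 1): one value per output channel
    linear.output_scale = nn.Parameter(
        torch.ones(num_channels, 1), requires_grad=False
    )
    linear.output_zero_point = nn.Parameter(
        torch.zeros(num_channels, 1, dtype=torch.int8), requires_grad=False
    )


class ToyAttention(nn.Module):
    """Toy stand-in for an attention block, for illustration only."""

    def __init__(self, hidden: int = 64):
        super().__init__()
        self.q_proj = nn.Linear(hidden, hidden)
        self.k_proj = nn.Linear(hidden, hidden)
        self.v_proj = nn.Linear(hidden, hidden)
        self.o_proj = nn.Linear(hidden, hidden)  # left unquantized


attn = ToyAttention()
for name in ("q_proj", "k_proj", "v_proj"):  # O/Up/Down projections are skipped
    init_channelwise_output_act_params(getattr(attn, name))

# quick shape check, mirroring the "check shapes manually" test plan item
assert attn.q_proj.output_scale.shape == (64, 1)
assert not hasattr(attn.o_proj, "output_scale")
```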

@horheynm enabled auto-merge (squash) on March 7, 2025 04:35
@horheynm changed the title from "Attn quant" to "[Quantization] Channel wise quantization for output activation on QKV" on Mar 7, 2025
@horheynm changed the title from "[Quantization] Channel wise quantization for output activation on QKV" to "[Quantization] Channel wise quantization for output activation on QKV Attention layers" on Mar 7, 2025
@horheynm changed the title from "[Quantization] Channel wise quantization for output activation on QKV Attention layers" to "[Quantization] Channel wise output activation quantization for QKV Attention layers" on Mar 7, 2025
@dsikka marked this pull request as draft on March 7, 2025 14:27
auto-merge was automatically disabled March 7, 2025 14:27

Pull request was converted to draft

@kylesayrs (Collaborator) commented:
#436

@kylesayrs closed this on Oct 14, 2025
