[Compression] Remove legacy compression and decompression pathways #465
base: main
Conversation
Signed-off-by: Kyle Sayers <[email protected]>
I don’t think we want to remove these until the new methods can compress / decompress when starting with a checkpoint that isn’t already in memory.
Signed-off-by: Kyle Sayers <[email protected]>
@dsikka To be clear, are you referring to compressing / decompressing from disk? Is there a remaining use case for this?
Signed-off-by: Kyle Sayers <[email protected]>
Yeah, for anyone who wants to use compressed-tensors independent of the transformers pathway / is using ct as a standalone. I certainly think we can improve these functions, but "from disk" is something we should have some way to support.
@dsikka There has never been a "from disk" compression pathway. In order to load any compressed model, you must use transformers. In terms of the from-disk decompression pathway, there has also never been a "from disk" pathway that doesn't also rely on transformers.
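(For reference, the transformers-dependent loading pathway referred to here looks roughly like the sketch below; the checkpoint id is reused from the example in the next comment, and this snippet is illustrative rather than part of the original discussion.)

from transformers import AutoModelForCausalLM

MODEL_ID = "nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A8_tensor_weight_static_per_tensor_act-e2e"

# transformers reads the compression config from the checkpoint and
# decompresses the weights while loading the model
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")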
This is not true:

from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor.utils import dispatch_for_generation
MODEL_ID = "nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A8_tensor_weight_static_per_tensor_act-e2e"
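# from_pretrained only builds the model skeleton; the compressed weights
# are never read through the from_pretrained pathway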
model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
from compressed_tensors.compressors import ModelCompressor
compressor = ModelCompressor.from_pretrained(MODEL_ID)
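# decompress reads the compressed weights from the checkpoint on disk and
# loads them into the model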
compressor.decompress(MODEL_ID, model)
print("\n\n")
print("========== SAMPLE GENERATION ==============")
dispatch_for_generation(model)
input_ids = tokenizer("Hello my name is", return_tensors="pt").input_ids.to(
model.device
)
output = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(output[0]))
print("==========================================\n\n")

The compressed weights are decompressed after being read from disk. from_pretrained loads the skeleton for the model; however, the compressed weights are never read through the from_pretrained pathway. This enables decompression without relying on our transformers integration. This also gives us independent compression functionality, such as
While it makes sense for the default pathway to be in-memory compression / decompression, these are useful tools we should still maintain.
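(Illustrative sketch only, not part of the original comment: a standalone compression flow along these lines, assuming ModelCompressor.from_pretrained_model, compress, and update_config keep their current behavior; the save path is a placeholder and "model" is a quantized model already in memory.)

import os
from safetensors.torch import save_file
from compressed_tensors.compressors import ModelCompressor

# Build a compressor from the model's own quantization/sparsity config
compressor = ModelCompressor.from_pretrained_model(model)
# Compress the weights into a state dict without going through transformers
compressed_state_dict = compressor.compress(model)

save_dir = "./compressed-checkpoint"  # placeholder path
os.makedirs(save_dir, exist_ok=True)
save_file(compressed_state_dict, os.path.join(save_dir, "model.safetensors"))
compressor.update_config(save_dir)  # record the compression config in config.json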
I think part of the motivation for this is that
No description provided.