[WIP] [Design] LLMCompressor Class #1256


Draft · wants to merge 45 commits into main
Conversation

@kylesayrs (Collaborator) commented Mar 15, 2025

LLMCompressor Class

from llmcompressor.core.llmcompressor.llmcompressor import LLMCompressor
from llmcompressor.modifiers.quantization.gptq import GPTQModifier
from llmcompressor.modifiers.smoothquant.base import SmoothQuantModifier

model_id = "meta-llama/Llama-3.1-8B-Instruct"
recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"])
]

compressor = LLMCompressor(model_id, recipe)
compressor.set_calibration_dataset("ultrachat_200k", split="train_sft[:512]")
compressor.post_train(save_path="save_path")

Status

  • All core functionality has been implemented except recipe args
  • All functionality needs rigorous testing and regression evaluation

Purpose

The primary purpose of this design is to simplify the core logic of LLM Compressor.

Simplified Features

  • Separate functions for model loading, recipe loading, dataset loading, and oneshot/train
  • Variables are attached directly to the LLMCompressor class instance, rather than being hidden behind session functions
  • Recipes are handled by one function that returns a list of modifiers, rather than passing through multiple layers of recipe abstractions (see the sketch below)
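
As a rough illustration, a minimal sketch of what that single recipe-resolution function could look like; the function name, import path, and accepted input types are assumptions for illustration, not the PR's actual implementation.

from typing import Iterable, List, Union

from llmcompressor.modifiers import Modifier  # import path assumed

RecipeInput = Union[Modifier, Iterable[Modifier]]

def resolve_recipe(recipe: RecipeInput) -> List[Modifier]:
    """Flatten any accepted recipe input into a flat list of modifiers."""
    if isinstance(recipe, Modifier):
        return [recipe]
    # a string/YAML recipe would be parsed into modifier objects here
    return list(recipe)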

Maintained Functionality

  • PTQ and Training integration interface
    • Global access to the compressor through which the integrator triggers events
    • Global steps can either be passed explicitly by the integrator or handled by EventLifeCycle auto-stepping
  • Event lifecycle validation
    • Now handled by EventLifeCycle, which implements minimally invasive decorators
    • EventLifeCycle handles auto-stepping and event-order validation (see the sketch after this list)
  • Dataset processing
    • Calibration datasets and training datasets are decoupled
  • SFT Training pathway
    • Implemented through a training mixin
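
Below is a minimal sketch of how a decorator-based EventLifeCycle could enforce event ordering and auto-step a global step counter; all names and the exact validation rules are assumptions for illustration, not this PR's implementation.

import functools

class EventLifeCycle:
    """Sketch: validate batch_start/batch_end ordering and auto-step a counter."""

    def __init__(self):
        self.in_batch = False
        self.global_step = 0

    def event(self, name):
        """Decorator factory that validates ordering before calling the wrapped hook."""
        def decorator(fn):
            @functools.wraps(fn)
            def wrapper(*args, global_step=None, **kwargs):
                if name == "batch_start":
                    if self.in_batch:
                        raise RuntimeError("batch_start called twice without batch_end")
                    self.in_batch = True
                    # auto-step if the integrator did not pass a global step
                    self.global_step = global_step if global_step is not None else self.global_step + 1
                elif name == "batch_end":
                    if not self.in_batch:
                        raise RuntimeError("batch_end called before batch_start")
                    self.in_batch = False
                return fn(*args, **kwargs)
            return wrapper
        return decorator

In this sketch the lifecycle object wraps the hooks wherever the compressor constructs them (e.g. self.batch_start = lifecycle.event("batch_start")(self.batch_start)), so the hook bodies themselves stay free of validation logic.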

Removed Classes/Abstractions

  • Recipe classes
  • StageModifiers
  • Stage Runner
  • LifecycleCallbacks
  • Session/Session Globals
  • Lifecycle
  • Event class is greatly simplified
  • ModifiedState

Questions

  • Is there any recipe metadata outside of modifiers worth saving/recording?
  • What is required in order to support recipe args?
  • Is there any case where a modifier would want to start/stop on events besides batch_start/end?

Integration Examples

PTQ

compressor = LLMCompressor(model, recipe)

global_step = 0
compressor.initialize()
for batch in calibration_data:
    # trigger modifier hooks around each calibration forward pass
    compressor.batch_start(global_step=global_step)
    outputs = model(**batch)
    compressor.batch_end()
    global_step += 1

compressor.finalize()
model.save_pretrained(...)

Training

compressor = LLMCompressor(model, recipe)

compressor.initialize()
for epoch in range(num_epochs):
    for batch in training_data:
        compressor.batch_start(global_step=epoch)
        outputs = model(**batch)

        loss = loss_fn(labels, outputs)
        # modifiers may transform the loss (e.g. add a distillation term)
        loss = compressor.update_loss(loss)
        loss.backward()

        compressor.pre_optim()
        optimizer.step()
        compressor.post_optim()

        compressor.batch_end()

    if save_checkpoint:
        model.save_pretrained(...)

compressor.finalize()
model.save_pretrained(...)
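
For context on update_loss, here is a hedged sketch of the kind of hook a distillation-style modifier might expose for the compressor to dispatch to; the class name, hook signature, and extra arguments are assumptions, not part of this PR.

import torch

class DistillationLossHookSketch:
    """Illustrative only: blend the task loss with a KL term against teacher logits."""

    def __init__(self, teacher, alpha=0.5, temperature=2.0):
        self.teacher = teacher
        self.alpha = alpha
        self.temperature = temperature

    def update_loss(self, loss, batch, student_logits):
        # hypothetical extra arguments; the PR example only passes the loss itself
        with torch.no_grad():
            teacher_logits = self.teacher(**batch).logits
        t = self.temperature
        kd = torch.nn.functional.kl_div(
            torch.log_softmax(student_logits / t, dim=-1),
            torch.softmax(teacher_logits / t, dim=-1),
            reduction="batchmean",
        ) * (t * t)
        return (1 - self.alpha) * loss + self.alpha * kd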

Future Extensions

Teacher/Delayed-State Training Integration

compressor = LLMCompressor(model, recipe)

global_step = 0
compressor.initialize()
compressor.update_state(teacher=teacher)
for epoch in range(num_epochs):
    for batch in training_data:
        ...


Multi-round PTQ

Add a finalized_modifiers attribute: when modifiers finalize, move them from the modifiers list to the finalized_modifiers list (a minimal sketch follows the example below).

compressor = LLMCompressor(model, pruning_recipe)
# round 1: sparsification
compressor.set_calibration_dataset(dataset_one)
compressor.compress(calibration_pipeline="basic")

# round 2: quantization
compressor.append_recipe(quantization_recipe)
compressor.set_calibration_dataset(dataset_two)
compressor.compress(calibration_pipeline="sequential")

model.save_pretrained(...)
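
A minimal sketch of the finalized_modifiers bookkeeping described above, with assumed attribute and hook names; it only shows the list management, not the full compression flow.

class MultiRoundSketch:
    """Illustrative only: retire finalized modifiers so new recipes can be appended."""

    def __init__(self, modifiers):
        self.modifiers = list(modifiers)
        self.finalized_modifiers = []

    def append_recipe(self, recipe):
        # modifiers from a newly appended recipe join the active list
        self.modifiers.extend(recipe)

    def finalize(self, state):
        for modifier in self.modifiers:
            modifier.finalize(state)  # hook name assumed
        # move everything that just finalized out of the active list
        self.finalized_modifiers.extend(self.modifiers)
        self.modifiers = []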

Recipe-Tailored Custom Device Map

compressor = LLMCompressor(model_stub, recipe, device_map="auto")
compressor.set_calibration_dataset(dataset)
compressor.post_train()

compressor.model.save_pretrained(...)

Alternating Oneshot/SFT

This relies on the same finalized_modifiers mechanism sketched under Multi-round PTQ above: when modifiers finalize, they are moved from the modifiers list to the finalized_modifiers list.

compressor = LLMCompressor(model, training_recipe)
# round 1: training
compressor.set_train_dataset(dataset_one)
compressor.train(**training_kwargs)

# round 2: compression
compressor.append_recipe(quantization_recipe)
compressor.set_calibration_dataset(dataset_two)
compressor.post_train()

model.save_pretrained(...)
