Bump version for float8 dynamic quant and weight only quant configs
Summary:
This PR changes the default VERSION for Float8DynamicActivationFloat8WeightConfig and Float8WeightOnlyConfig from 1 to 2
and deprecates the VERSION 1 config and models quantized with it; see #2649 for more details.
It also extends config serialization to work with multiple config versions.
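As an illustration of version-aware config serialization, here is a minimal sketch. It is not the torchao implementation: `ExampleConfig`, `config_to_dict`, `config_from_dict`, and the payload keys are hypothetical names made up for this example. The idea is to store the version next to the config fields and warn on load when it differs from the current default.

```python
import warnings
from dataclasses import asdict, dataclass


# Hypothetical stand-in for a versioned config; not torchao's actual class.
@dataclass
class ExampleConfig:
    granularity: str = "per_row"
    version: int = 2  # current default version


def config_to_dict(config):
    """Serialize the config, storing its version next to the other fields."""
    data = asdict(config)
    version = data.pop("version")
    return {"_type": type(config).__name__, "_version": version, "_data": data}


def config_from_dict(payload):
    """Rebuild the config, warning if the stored version is not the current default."""
    stored_version = payload["_version"]
    current_version = ExampleConfig().version
    if stored_version != current_version:
        warnings.warn(
            "Stored version is not the same as current default version of the config: "
            f"stored_version={stored_version}, current_version={current_version}"
        )
    # Keep the stored version so downstream code can still handle old checkpoints.
    return ExampleConfig(version=stored_version, **payload["_data"])


# Round-trip a VERSION 1 config and capture the resulting warning.
old_payload = config_to_dict(ExampleConfig(version=1))
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    restored = config_from_dict(old_payload)
```

In this sketch the loader keeps the stored version rather than upgrading silently, so an old checkpoint still loads while the user is told to re-quantize.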
Deprecation Note:
```
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "torchao-testing/opt-125m-float8dq-row-v1-0.13-dev"
quantized_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="bfloat16",
    device_map="cuda",
)
/data/users/jerryzh/ao/torchao/core/config.py:249: UserWarning: Stored version is not the same as current default version of the config: stored_version=1, current_version=2, please check the deprecation warning
warnings.warn(
/data/users/jerryzh/ao/torchao/dtypes/floatx/float8_layout.py:113: UserWarning: Models quantized with VERSION 1 of Float8DynamicActivationFloat8WeightConfig is deprecated and will no longer be supported in a future release, please upgrade torchao and quantize again, or download a newer torchao checkpoint, see #2649 for more details
warnings.warn(
```
Suggestion: upgrade torchao to 0.13 or later and generate the checkpoint again:
```
from torchao.quantization import Float8DynamicActivationFloat8WeightConfig, PerRow, quantize_
quantize_(model, Float8DynamicActivationFloat8WeightConfig(granularity=PerRow()))
```
Or download the checkpoint again (please let us know if the checkpoint is not updated)
Test Plan:
Tested by serializing a model with a VERSION 1 config, loading it, and checking that the warnings are printed properly:
```
python test/integration/test_loading_deprecated_checkpoint.py
```
Reviewers:
Subscribers:
Tasks:
Tags:
stack-info: PR: #2650, branch: jerryzh168/stack/14