Bump version for float8 dynamic quant and weight only quant configs #2650
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2650
Note: Links to docs will display an error until the docs builds have been completed. This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-push history: e19cb46 → 5ae457c → 79e89ff → d375bbb → c464d5b → 456a77f → f6e4522 → 7cdfe0a, b06dafd → 5016603 → b2c8536 → 912f6e5
CI failures look real
Summary:

This PR changes the default VERSION for Float8DynamicActivationFloat8WeightConfig and Float8WeightOnlyConfig from 1 to 2, and deprecates the VERSION 1 configs and VERSION 1 quantized models; more details in #2649. It also extends the current config serialization to work with multiple config versions.

Deprecation Note:

```
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "torchao-testing/opt-125m-float8dq-row-v1-0.13-dev"
quantized_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="bfloat16",
    device_map="cuda",
)

/data/users/jerryzh/ao/torchao/core/config.py:249: UserWarning: Stored version is not the same as current default version of the config: stored_version=1, current_version=2, please check the deprecation warning
  warnings.warn(
/data/users/jerryzh/ao/torchao/dtypes/floatx/float8_layout.py:113: UserWarning: Models quantized with VERSION 1 of Float8DynamicActivationFloat8WeightConfig is deprecated and will no longer be supported in a future release, please upgrade torchao and quantize again, or download a newer torchao checkpoint, see #2649 for more details
  warnings.warn(
```

Suggestion: upgrade torchao to 0.13 or later and generate the checkpoint again:

```
quantize_(model, Float8DynamicActivationFloat8WeightConfig(granularity=PerRow()))
```

Or download the checkpoint again (please let us know if the checkpoint is not updated).

Test Plan:

Tested by serializing a model with a VERSION 1 config, loading it, and checking that the warnings are properly printed:

```
python test/integration/test_loading_deprecated_checkpoint.py
```

stack-info: PR: #2650, branch: jerryzh168/stack/14
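The versioned serialization described above can be sketched roughly as follows. This is a minimal sketch, not torchao's actual implementation: the `config_to_dict`/`config_from_dict` names mirror the real API, but `Float8WeightOnlyConfigSketch` and the function bodies are simplified assumptions.

```python
import warnings
from dataclasses import dataclass, asdict

_DEFAULT_VERSION = 2  # assumed: the new default after this PR


@dataclass
class Float8WeightOnlyConfigSketch:
    # Hypothetical stand-in for Float8WeightOnlyConfig.
    weight_dtype: str = "float8_e4m3fn"
    version: int = _DEFAULT_VERSION


def config_to_dict(config) -> dict:
    # Store the version alongside the payload so old checkpoints
    # can be recognized at load time.
    return {
        "_type": type(config).__name__,
        "_version": getattr(config, "version", _DEFAULT_VERSION),
        "_data": asdict(config),
    }


def config_from_dict(data: dict):
    # Warn on a version mismatch, but still deserialize; deprecated
    # versions may error later, when the config is actually applied.
    stored = data["_version"]
    if stored != _DEFAULT_VERSION:
        warnings.warn(
            "Stored version is not the same as current default version "
            f"of the config: stored_version={stored}, "
            f"current_version={_DEFAULT_VERSION}"
        )
    return Float8WeightOnlyConfigSketch(**data["_data"])
```

Round-tripping a version-1 config through this sketch reproduces the first `UserWarning` shown in the deprecation note above.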
Force-push history: 912f6e5 → 16b2c4b, d3ee6c2 → 356e477
```
@@ -1506,6 +1506,9 @@ class Float8WeightOnlyConfig(AOBaseConfig):

def _float8_weight_only_quant_tensor(weight, config):
    if config.VERSION == 1:
        warnings.warn(
```
I missed it in previous PRs, but why `VERSION` instead of `version`? If there is no good reason, can we change it before it's too late?
It's from here (line 61 in 3b4bc98):

```
VERSION: ClassVar[int] = _DEFAULT_VERSION
```

Should we change both, or ignore `VERSION` in TorchAOBaseConfig?
OK, updated it to an instance variable and renamed it to `version`.
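The difference the rename makes can be illustrated with a small sketch (class names here are hypothetical; the real change is in torchao's config dataclasses): a `ClassVar` is shared across all instances and is not a dataclass field, so `dataclasses.asdict` never serializes it, while a plain instance field can be set per config and round-trips through serialization.

```python
from dataclasses import dataclass, asdict
from typing import ClassVar


@dataclass
class OldStyleConfig:
    # ClassVar: shared by the class, NOT a dataclass field,
    # so asdict() does not include it.
    VERSION: ClassVar[int] = 1


@dataclass
class NewStyleConfig:
    # Plain instance field: settable per config, serialized normally.
    version: int = 2
```

With the old style, `asdict(OldStyleConfig())` comes back empty, which is why the serializer had to reach for the class attribute; with the new style the version travels with the instance.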
Force-push history: b844dbd → 5f1eb8a, a76e263 → b855b91
torchao/core/config.py (outdated)

```
        data_dict[f.name] = self.encode_value(getattr(o, f.name))

    return {
        # Only store the class name for dataclasses too
        "_type": o.__class__.__name__,
-       "_version": getattr(o.__class__, "VERSION", 1),
+       "_version": getattr(o, "version", 1),
```
Should the 3rd arg here be `_DEFAULT_VERSION`?
oh, makes sense
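The point of the exchange above: the `getattr` fallback should track `_DEFAULT_VERSION` rather than hard-coding `1`, so a config object without a `version` attribute serializes at the current default instead of being pinned to 1. A tiny illustration (values assumed for the example):

```python
_DEFAULT_VERSION = 2  # assumed current default


class LegacyConfig:
    """Hypothetical config object that has no `version` attribute."""


legacy = LegacyConfig()

# Hard-coded fallback pins attribute-less configs to version 1:
hardcoded = getattr(legacy, "version", 1)

# Using the shared default keeps them at the current version:
tracked = getattr(legacy, "version", _DEFAULT_VERSION)
```

If `_DEFAULT_VERSION` is later bumped, the second form follows it automatically while the first silently keeps emitting 1.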
```
@@ -253,10 +244,11 @@ def config_from_dict(data: Dict[str, Any]) -> AOBaseConfig:
            f"Failed to find class {type_path} in any of the allowed modules: {allowed_modules_str}"
        )

    # Check version - require exact match
```
I feel like in general version mismatches will error, right? Shouldn't we keep this path but just let instances figure out whether they should error at deserialization?
They can still error; see quant_api.py: they can error when `config.version` has a deprecated version number. But erroring out here would mean older versions are not supported at all.
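The division of responsibility being described, where the serialization layer only warns and the per-config code decides whether a stored version is fatal, can be sketched like this (names and version numbers are assumed for illustration, not torchao's actual values):

```python
import warnings

_DEFAULT_VERSION = 3
_REMOVED_VERSIONS = {1}  # assumed: versions no longer supported at all


def check_version_on_load(stored_version: int) -> None:
    # Serialization layer: warn on any mismatch, never raise,
    # so older-but-still-supported checkpoints keep loading.
    if stored_version != _DEFAULT_VERSION:
        warnings.warn(
            f"stored_version={stored_version}, "
            f"current_version={_DEFAULT_VERSION}"
        )


def apply_config(stored_version: int) -> str:
    # Per-config code: raise only for versions that were removed.
    if stored_version in _REMOVED_VERSIONS:
        raise ValueError(
            f"version {stored_version} is no longer supported"
        )
    return "quantized"
```

Under this split, a version-2 checkpoint loads with a warning and still quantizes, while a removed version-1 checkpoint fails only when the config is applied.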
Force-push history: b855b91 → e66af4f
Stacked PRs:
- `optional_tensor_names` in TorchAOBaseTensor (#2710)

Bump version for float8 dynamic quant and weight only quant configs