Bump version for float8 dynamic quant and weight only quant configs #2650

Merged: jerryzh168 merged 1 commit into main from jerryzh168/stack/14 on Aug 7, 2025

jerryzh168
Contributor

@jerryzh168 jerryzh168 commented Aug 1, 2025

Bump version for float8 dynamic quant and weight only quant configs

Summary:
This PR changes the default VERSION for Float8DynamicActivationFloat8WeightConfig and Float8WeightOnlyConfig from 1 to 2 and deprecates both the VERSION 1 configs and models quantized with them. More details in #2649.

It also extends the current config serialization to work with multiple config versions.

Deprecation Note:

```
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "torchao-testing/opt-125m-float8dq-row-v1-0.13-dev"
quantized_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="bfloat16",
    device_map="cuda",
)

/data/users/jerryzh/ao/torchao/core/config.py:249: UserWarning: Stored version is not the same as current default version of the config: stored_version=1, current_version=2, please check the deprecation warning
  warnings.warn(
/data/users/jerryzh/ao/torchao/dtypes/floatx/float8_layout.py:113: UserWarning: Models quantized with VERSION 1 of Float8DynamicActivationFloat8WeightConfig is deprecated and will no longer be supported in a future release, please upgrade torchao and quantize again, or download a newer torchao checkpoint, see https://github.com/pytorch/ao/issues/2649 for more details
  warnings.warn(
```

Suggestion: upgrade torchao to 0.13 or later and generate the checkpoint again:

```
from torchao.quantization import Float8DynamicActivationFloat8WeightConfig, PerRow, quantize_
quantize_(model, Float8DynamicActivationFloat8WeightConfig(granularity=PerRow()))
```

Or download the checkpoint again (please let us know if the checkpoint has not been updated).
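
For pipelines that should flag stale checkpoints automatically, the two `UserWarning`s shown above can be captured with the standard `warnings` module. This is a minimal sketch, assuming the warning text continues to contain the word "deprecated"; `load_checkpoint` is a hypothetical stand-in for the `from_pretrained` call and is not a torchao or transformers API.

```python
import warnings


def load_checkpoint():
    """Hypothetical stand-in for AutoModelForCausalLM.from_pretrained(...)."""
    warnings.warn(
        "Models quantized with VERSION 1 of "
        "Float8DynamicActivationFloat8WeightConfig is deprecated",
        UserWarning,
    )
    return "model"


def load_and_collect_deprecations(loader):
    """Run `loader` and return (result, deprecation messages it emitted)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones a default filter would suppress.
        warnings.simplefilter("always")
        result = loader()
    messages = [
        str(w.message)
        for w in caught
        if issubclass(w.category, UserWarning) and "deprecated" in str(w.message)
    ]
    return result, messages


model, deprecations = load_and_collect_deprecations(load_checkpoint)
if deprecations:
    print("checkpoint needs re-quantization:", len(deprecations), "warning(s)")
```

Matching on message text is brittle; it is used here only because the warnings in the log are plain `UserWarning`s rather than a dedicated category.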

Test Plan:
Serialized a model with a VERSION 1 config, loaded it, and checked that the warnings are printed properly:

python test/integration/test_loading_deprecated_checkpoint.py


pytorch-bot bot commented Aug 1, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2650

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 1, 2025
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/14 branch from e19cb46 to 5ae457c Compare August 1, 2025 00:53
@jerryzh168 jerryzh168 added the topic: bc-breaking Use this tag if this PR breaks backward compatibility label Aug 1, 2025
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/9 to main August 1, 2025 00:56
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/14 branch from 5ae457c to 79e89ff Compare August 1, 2025 00:56
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/9 August 1, 2025 00:56
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/9 to main August 1, 2025 03:38
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/14 branch from 79e89ff to d375bbb Compare August 1, 2025 03:38
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/9 August 1, 2025 03:38
@jerryzh168 jerryzh168 requested review from drisspg and vkuzo August 1, 2025 04:43
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/9 to main August 1, 2025 21:12
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/14 branch from d375bbb to c464d5b Compare August 1, 2025 21:13
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/9 August 1, 2025 21:13
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/9 to main August 2, 2025 01:31
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/9 August 2, 2025 01:31
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/9 to main August 4, 2025 17:30
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/14 branch from c464d5b to 456a77f Compare August 4, 2025 17:30
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/9 August 4, 2025 17:30
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/9 to main August 4, 2025 18:14
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/14 branch from 456a77f to f6e4522 Compare August 4, 2025 18:15
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/9 August 4, 2025 18:15
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/9 to main August 4, 2025 22:14
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/14 branch from f6e4522 to 7cdfe0a Compare August 4, 2025 22:14
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/9 August 4, 2025 22:15
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/9 to main August 4, 2025 23:51
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/9 to main August 5, 2025 18:39
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/14 branch from b06dafd to 5016603 Compare August 5, 2025 18:39
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/9 August 5, 2025 18:39
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/9 to main August 5, 2025 23:29
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/14 branch from 5016603 to b2c8536 Compare August 5, 2025 23:30
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/9 August 5, 2025 23:30
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/9 to main August 6, 2025 01:07
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/14 branch from b2c8536 to 912f6e5 Compare August 6, 2025 01:08
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/9 August 6, 2025 01:08
@drisspg
Contributor

drisspg commented Aug 6, 2025

CI failures look real.

jerryzh168 added a commit that referenced this pull request Aug 6, 2025
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/14 branch from 912f6e5 to 16b2c4b Compare August 6, 2025 17:37
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/9 to main August 6, 2025 17:37
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/14 branch 3 times, most recently from d3ee6c2 to 356e477 Compare August 6, 2025 19:12
@@ -1506,6 +1506,9 @@ class Float8WeightOnlyConfig(AOBaseConfig):

def _float8_weight_only_quant_tensor(weight, config):
    if config.VERSION == 1:
        warnings.warn(
Contributor

I missed it in previous PRs, but why VERSION instead of version? If there is no good reason, can we change it before it's too late?

Contributor Author

@jerryzh168 jerryzh168 Aug 6, 2025

it's from here:

VERSION: ClassVar[int] = _DEFAULT_VERSION

should we change both or ignore VERSION in TorchAOBaseConfig?

Contributor Author

OK, updated it to an instance variable and renamed it to version
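
The difference between the two spellings can be sketched with plain dataclasses. `OldStyleConfig` and `NewStyleConfig` are hypothetical names used only for illustration, not torchao classes: a `ClassVar` belongs to the class and is not a dataclass field, while an instance field is set per object and appears in `dataclasses.fields`.

```python
from dataclasses import dataclass, fields
from typing import ClassVar


@dataclass
class OldStyleConfig:
    # Class-level constant: not a dataclass field, shared by every instance.
    VERSION: ClassVar[int] = 1


@dataclass
class NewStyleConfig:
    # Instance field: each object carries its own version.
    version: int = 2


old = OldStyleConfig()
v1 = NewStyleConfig(version=1)  # an older config can coexist with the default
v2 = NewStyleConfig()

print([f.name for f in fields(OldStyleConfig)])  # the ClassVar is excluded
print([f.name for f in fields(NewStyleConfig)])
print(v1.version, v2.version)
```

With an instance field, a deserialized version 1 config and a freshly constructed version 2 config of the same class can exist side by side, which a single class-level constant cannot express.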

@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/14 branch 2 times, most recently from b844dbd to 5f1eb8a Compare August 6, 2025 22:10
@jerryzh168 jerryzh168 requested review from drisspg and vkuzo August 6, 2025 22:10
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/14 branch 2 times, most recently from a76e263 to b855b91 Compare August 6, 2025 23:27
            data_dict[f.name] = self.encode_value(getattr(o, f.name))

        return {
            # Only store the class name for dataclasses too
            "_type": o.__class__.__name__,
-            "_version": getattr(o.__class__, "VERSION", 1),
+            "_version": getattr(o, "version", 1),
Contributor

should the 3rd arg here be the _DEFAULT_VERSION?

Contributor Author

oh, makes sense
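
The suggestion amounts to replacing the literal fallback with the named constant. A rough sketch, not the torchao encoder: `encode_config`, `VersionedConfig`, `LegacyConfig`, and the `_data` key are hypothetical, and `_DEFAULT_VERSION = 1` is an assumed value chosen to match the literal `1` in the diff.

```python
from dataclasses import asdict, dataclass

_DEFAULT_VERSION = 1  # assumed value, matching the literal in the diff


@dataclass
class VersionedConfig:
    granularity: str = "per_row"
    version: int = 2


@dataclass
class LegacyConfig:
    # Predates versioning: has no `version` attribute at all.
    granularity: str = "per_tensor"


def encode_config(o):
    """Encode a dataclass config, tagging it with its type and version."""
    data = {k: v for k, v in asdict(o).items() if k != "version"}
    return {
        "_type": o.__class__.__name__,
        # Named constant instead of a magic number for the fallback.
        "_version": getattr(o, "version", _DEFAULT_VERSION),
        "_data": data,
    }


print(encode_config(VersionedConfig()))
print(encode_config(LegacyConfig()))
```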

@@ -253,10 +244,11 @@ def config_from_dict(data: Dict[str, Any]) -> AOBaseConfig:
            f"Failed to find class {type_path} in any of the allowed modules: {allowed_modules_str}"
        )

    # Check version - require exact match
Contributor

I feel like in general version mismatches will error, right? Shouldn't we keep this path but just let instances figure out whether they should error at deserialization?

Contributor Author

They can still error: see quant_api.py, where a config can raise when config.version is a deprecated version number.

But erroring out here would mean older versions are not supported at all.
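
The split described in this reply can be sketched as follows: deserialization only warns on a version mismatch, and the code that applies the config decides whether a given version is still usable. All names here (`DemoConfig`, `config_from_dict_sketch`, `apply_config`, `removed_versions`) are hypothetical and only illustrate the idea; they are not the torchao implementation.

```python
import warnings
from dataclasses import dataclass


@dataclass
class DemoConfig:
    """Hypothetical config whose current default version is 2."""
    version: int = 2


def config_from_dict_sketch(data):
    """Rebuild a config, warning (not raising) on a version mismatch."""
    stored = data.get("_version", 1)
    current = DemoConfig().version
    if stored != current:
        warnings.warn(
            "Stored version is not the same as current default version: "
            f"stored_version={stored}, current_version={current}"
        )
    return DemoConfig(version=stored)


def apply_config(config, removed_versions=frozenset()):
    """Quantization-time handler: the config decides whether to error."""
    if config.version in removed_versions:
        raise ValueError(f"version {config.version} is no longer supported")
    if config.version < DemoConfig().version:
        warnings.warn(f"version {config.version} is deprecated")
    return f"quantized with v{config.version}"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cfg = config_from_dict_sketch({"_type": "DemoConfig", "_version": 1})
    result = apply_config(cfg)

print(result, "with", len(caught), "warnings")
```

Raising inside `config_from_dict_sketch` instead would reject every older checkpoint up front, which is the outcome the reply argues against.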

@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/14 branch from b855b91 to e66af4f Compare August 7, 2025 02:57
@jerryzh168 jerryzh168 merged commit d2e791b into main Aug 7, 2025
11 of 18 checks passed
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. topic: bc-breaking Use this tag if this PR breaks backward compatibility
3 participants