
Conversation

@matsumotosan
Contributor

@matsumotosan matsumotosan commented Aug 14, 2025

What does this PR do?

Fixes #20450 #20058 #20643

Before submitting
  • Was this discussed/agreed via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or minor internal changes/refactors)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:

Reviewer checklist
  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

📚 Documentation preview 📚: https://pytorch-lightning--21072.org.readthedocs.build/en/21072/

@github-actions github-actions bot added the fabric lightning.fabric.Fabric label Aug 14, 2025
@codecov

codecov bot commented Aug 15, 2025

Codecov Report

❌ Patch coverage is 84.21053% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 87%. Comparing base (74b3fd5) to head (bfd8656).
⚠️ Report is 3 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #21072   +/-   ##
=======================================
  Coverage      87%      87%           
=======================================
  Files         269      269           
  Lines       23732    23744   +12     
=======================================
+ Hits        20557    20569   +12     
  Misses       3175     3175           

@matsumotosan matsumotosan marked this pull request as draft August 15, 2025 18:21
@github-actions github-actions bot added the pl Generic label for PyTorch Lightning package label Aug 15, 2025
@matsumotosan matsumotosan force-pushed the weights-only-compatibility branch from d7cb702 to 601e300 on August 15, 2025 22:20
@matsumotosan matsumotosan marked this pull request as ready for review August 16, 2025 15:37
@matsumotosan matsumotosan changed the title from "Compatibility for weights_only=True by default" to "Compatibility for weights_only=True by default for loading weights" Aug 16, 2025
@matsumotosan
Contributor Author

@Borda I wanted to get your opinion on something before moving forward.

I've added weights_only as an argument to LightningModule.load_from_checkpoint and all downstream functions to allow users to determine which option they want to use to load checkpoints.

My issue right now is with resuming training from a checkpoint with Trainer.fit. I see a few options right now:

  1. Add weights_only as an argument to Trainer.fit (would also have to modify args for validate, test, and predict). Set default value to True.
  2. Use weights_only=True everywhere, and print an error message advising the user to set the TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD environment variable if they want to force loading with weights_only=False.
  3. Add weights_only as an argument to Trainer initialization. Easy, but would not allow fine-grained control on loading models between different calls of fit, validate, etc.

I'm leaning towards option 1, but it involves changing the Trainer methods' signatures, which affects a lot of code, so I wanted to run this by you beforehand.
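The three options above can be sketched roughly as follows. This is an illustrative, simplified sketch only (the function names `fit` and `_load_checkpoint` stand in for the real Trainer internals, which this PR actually touches); it just shows the flag-threading that option 1 implies:

```python
def _load_checkpoint(path, weights_only=True):
    # In Lightning this would ultimately call
    # torch.load(path, weights_only=weights_only).
    return {"path": path, "weights_only": weights_only}


def fit(ckpt_path=None, weights_only=True):
    # Option 1: a Trainer.fit-style entry point forwards the flag
    # unchanged down to the checkpoint-loading call. The same argument
    # would also be added to validate, test, and predict.
    if ckpt_path is not None:
        return _load_checkpoint(ckpt_path, weights_only=weights_only)
    return None
```

The cost, as noted, is that every public entry point grows an extra argument.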

@Borda
Collaborator

Borda commented Aug 18, 2025

My issue right now is with resuming training from a checkpoint with Trainer.fit. I see a few options right now:

  1. Add weights_only as an argument to Trainer.fit (would also have to modify args for validate, test, and predict). Set default value to True.
  2. Use weights_only=True everywhere, and print an error message advising the user to set the TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD environment variable if they want to force loading with weights_only=False.
  3. Add weights_only as an argument to Trainer initialization. Easy, but would not allow fine-grained control on loading models between different calls of fit, validate, etc.

The cleanest way would probably be 1), but it brings so many new arguments for marginal use... so personally I would go with 2)
cc: @lantiga
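Option 2 amounts to something like the sketch below: always default to `weights_only=True` and let the environment variable be the only escape hatch. This is a hypothetical simplification (torch.load itself also honors TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD internally; the accepted truthy values are an assumption here):

```python
import os


def resolve_weights_only():
    # Option 2: weights_only=True everywhere; users opt out only by
    # explicitly setting the TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD
    # environment variable before loading.
    if os.getenv("TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD", "").lower() in {"1", "y", "yes", "true"}:
        return False
    return True
```

The upside is that no public signatures change; the downside is that opting out is global rather than per-call.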

@deependujha
Collaborator

Seems like an actual issue rather than flaky test behavior

@deependujha deependujha merged commit 29abe6e into Lightning-AI:master Nov 7, 2025
112 checks passed
@matsumotosan matsumotosan deleted the weights-only-compatibility branch November 7, 2025 13:34
nathanpainchaud added a commit to nathanpainchaud/lightning-hydra-template that referenced this pull request Nov 28, 2025
Set `weights_only=False` when loading ckpts, since Lightning now defers to torch's default (`True`)
* See PR on this change: Lightning-AI/pytorch-lightning#21072

---------

Co-authored-by: Nathan Painchaud <[email protected]>
cathalobrien added a commit to ecmwf/anemoi-core that referenced this pull request Nov 28, 2025
## Description
[this](Lightning-AI/pytorch-lightning#21072) change in PyTorch Lightning 2.6.0 means we have to explicitly specify `weights_only=False` when calling `BaseGraphModule.load_from_checkpoint` (nice spot, Ana!)

***As a contributor to the Anemoi framework, please ensure that your
changes include unit tests, updates to any affected dependencies and
documentation, and have been tested in a parallel setting (i.e., with
multiple GPUs). As a reviewer, you are also responsible for verifying
these aspects and requesting changes if they are not adequately
addressed. For guidelines about those please refer to
https://anemoi.readthedocs.io/en/latest/***

By opening this pull request, I affirm that all authors agree to the
[Contributor License
Agreement.](https://github.com/ecmwf/codex/blob/main/Legal/contributor_license_agreement.md)
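The downstream adjustment both of these referenced commits make looks roughly like this. It is a sketch under assumptions: `load_pretrained`, `model_cls`, and the checkpoint path are placeholders, not APIs from the projects above:

```python
def load_pretrained(model_cls, ckpt_path):
    # After this PR, Lightning defers to torch's default
    # (weights_only=True), so checkpoints that contain arbitrary
    # pickled objects (e.g. custom classes in hyperparameters) need
    # weights_only=False passed explicitly and deliberately.
    return model_cls.load_from_checkpoint(ckpt_path, weights_only=False)
```

Passing `weights_only=False` should only be done for checkpoints from a trusted source, since it restores full unpickling.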

Labels

ci (Continuous Integration), dependencies (Pull requests that update a dependency file), dockers, fabric (lightning.fabric.Fabric), package, pl (Generic label for PyTorch Lightning package)


Development

Successfully merging this pull request may close these issues.

Make sure the upcoming change in the default for weights_only from False to True is handled correctly
