[Accelerate] Support get_offloaded_device for models #364


Open · wants to merge 4 commits into base: main

Conversation

kylesayrs (Contributor) commented Jun 20, 2025

Purpose

  • Enable a util that may be useful when dealing with offloading of modules that are not leaf modules. For example, if we want to attach parameters to an attention module for attention quantization, we need to know the offload device of the attention module (which is not a leaf module)

Changes

  • Generalize get_offloaded_device to support nested modules
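The generalization can be pictured as a depth-first search over the module tree: a non-leaf module's offload device is taken from the first parameter found among its descendants, with a fallback when no parameters exist. The sketch below is illustrative only; `Module`, `Param`, and this `get_offloaded_device` are simplified stand-ins for the real torch/Accelerate objects and are not the actual implementation.

```python
# Simplified stand-ins (assumptions): the real code operates on
# torch.nn.Module and Accelerate's offloading machinery.

class Param:
    def __init__(self, device):
        self.device = device  # e.g. "cpu" when offloaded, "cuda:0" otherwise

class Module:
    def __init__(self, params=None, children=None):
        self._params = params or []
        self._children = children or []

    def parameters(self):
        # Yield this module's own params, then recurse into children
        # (depth-first), mirroring torch's Module.parameters().
        yield from self._params
        for child in self._children:
            yield from child.parameters()

def get_offloaded_device(module, default="cpu"):
    """Return the device of the first parameter anywhere in the module tree.

    Works for leaf modules (own params) and for non-leaf modules such as an
    attention block whose params live on submodules (e.g. q/k/v projections).
    """
    for param in module.parameters():
        return param.device
    return default  # no params anywhere in the tree: fall back

# A non-leaf "attention" module whose params sit only on child projections:
attn = Module(children=[
    Module(params=[Param("cpu")]),  # e.g. q_proj, offloaded to CPU
    Module(params=[Param("cpu")]),  # e.g. k_proj, offloaded to CPU
])
print(get_offloaded_device(attn))  # → cpu
```

With a helper like this, a new parameter attached to the attention module (say, a quantization scale) can be placed on the same offload device as the module's existing weights.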

Testing

  • Added additional tests; previous tests pass and previous behavior is preserved

Signed-off-by: Kyle Sayers <[email protected]>
@kylesayrs changed the title from "[Accelerate] Support inference of offload device for models" to "[Accelerate] Support get_offloaded_device for models" on Jun 20, 2025
@kylesayrs kylesayrs marked this pull request as ready for review July 31, 2025 14:52