Recovered skipped w8a8 compression related tests #1785
Conversation
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: this is required to complete the testing suite; add the label only once the PR is code complete and local testing has been performed.
These will fail until the next transformers release?
Yeah, they won't work until our changes are applied.
👍
Is moving to the CPU required to fix the memory leak? If so, that would be very strange. Perhaps torch.cuda.synchronize is having an effect? I never tested with it when I last looked at the memory leak issue.
Yeah, the same code doesn't work if I don't move the models to CPU first.
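For reference, a minimal sketch of the kind of cleanup being discussed, assuming a helper that runs between test instances (the function name and exact ordering are illustrative, not the PR's actual code):

```python
import gc

import torch


def release_model_memory(model: torch.nn.Module) -> None:
    """Free GPU memory held by a model between test instances.

    Illustrative only; the real test code may differ.
    """
    # Moving the weights to CPU before dropping references appears to be
    # necessary for the leak to go away; this frees the CUDA tensors
    # backing the parameters.
    model.to("cpu")
    # Wait for any in-flight CUDA work that might still hold memory.
    torch.cuda.synchronize()
    # Collect unreachable Python objects, then return cached blocks
    # from PyTorch's allocator to the driver.
    gc.collect()
    torch.cuda.empty_cache()
```

Callers would still need to drop their own references (e.g. `del model`) for the garbage collector to reclaim the module itself.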
SUMMARY:
Recovered the skipped w8a8 compression/decompression tests now that the transformers side of the change has been merged. Added memory cleanup between test instances.
TEST PLAN:
Tested locally on transformers 4.56.0.dev0; the recovered tests passed.