forward performance tuning for MI350 #4925
base: main
warp per row wg change
✅ Deploy Preview for pytorch-fbgemm-docs ready!
Hi @q10, sorry I missed your message. BTW, we discovered a numerical issue in 986cceb and reverted it in 85417b4. This unblocks merging the bwd optimization first. Thank you.
I think this commit only addresses the build step, where we need to link against tbb. At runtime, however, you might need to run a find in $CONDA_PREFIX from inside the container and manually update LD_LIBRARY_PATH, or create a symlink (something like FBGEMM/.github/scripts/utils_build.bash, line 383 in 0d49628).
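For illustration, the runtime workaround of searching $CONDA_PREFIX and extending LD_LIBRARY_PATH could look roughly like the sketch below. This is not the snippet from utils_build.bash; `find_tbb_dir` is a hypothetical helper name, and the sketch assumes the tbb shared object is named `libtbb.so*` and lives somewhere under the active Conda prefix.

```shell
# Hypothetical helper (not part of FBGEMM): print the directory holding the
# first libtbb shared object found under the given prefix.
# Returns non-zero when nothing is found.
find_tbb_dir() {
  prefix="$1"
  # -print -quit stops the search at the first match
  lib="$(find "${prefix}" -name 'libtbb.so*' -print -quit 2>/dev/null)"
  [ -n "${lib}" ] || return 1
  dirname "${lib}"
}

# Only touch the environment when a Conda prefix is active and libtbb exists.
if [ -n "${CONDA_PREFIX:-}" ] && tbb_dir="$(find_tbb_dir "${CONDA_PREFIX}")"; then
  # Prepend so this copy is searched before other entries on the path
  export LD_LIBRARY_PATH="${tbb_dir}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
  echo "libtbb found in ${tbb_dir}"
fi
```

This needs to run inside the container, in the same shell session that later launches the workload, since an exported variable does not propagate to a parent process. The symlink route avoids that constraint, but requires write access to a directory already on the loader's search path.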
bwd performance optimization for ROCm.
Fix numerical issues