NikhilAPatel
Contributor

Summary: Force the unrolling of three loops by using `range_constexpr`, to improve performance.
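
For readers unfamiliar with the idea, here is a minimal sketch of what forced unrolling means. It is a toy model in plain Python, not the CuTe DSL API and not the code changed in this PR: `emit_body`, `trace_unrolled`, and `trace_dynamic` are hypothetical names invented for illustration. The assumption is that a constexpr-style range is iterated while the kernel is being traced, so the body is emitted once per iteration with a constant index, whereas a dynamic loop is emitted once and iterated at run time.

```python
# Toy model only: this is NOT the CuTe DSL API. It just contrasts trace-time
# unrolling with a loop that survives into the generated code.

def emit_body(index):
    """Return the 'instructions' for one loop iteration (hypothetical body)."""
    return [f"acc += a[{index}] * b[{index}]"]

def trace_unrolled(trip_count):
    """Iterate at trace time: the body is emitted once per iteration,
    with the loop index baked in as a constant."""
    ops = []
    for i in range(trip_count):
        ops.extend(emit_body(i))
    return ops

def trace_dynamic(trip_count):
    """Emit a single loop construct whose body runs at run time,
    with a symbolic index."""
    ops = [f"for i in range({trip_count}):"]
    ops.extend("    " + op for op in emit_body("i"))
    return ops

unrolled = trace_unrolled(3)
dynamic = trace_dynamic(3)
print("\n".join(unrolled))
print("\n".join(dynamic))
```

In this sketch the unrolled trace contains three straight-line statements with constant indices, while the dynamic trace keeps one loop with a symbolic index. Unrolling only applies when the trip count is known at trace time.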

Differential Revision: D83513863

@facebook-github-bot
Contributor

@NikhilAPatel has exported this pull request. If you are a Meta employee, you can view the originating Diff in D83513863.

@NikhilAPatel changed the title from "Force unrolling of certain loops" to "[Grouped Gemm][CuTeDSL] Force unrolling of certain loops" on Sep 30, 2025
NikhilAPatel added a commit to NikhilAPatel/tritonbench that referenced this pull request on Sep 30, 2025

@facebook-github-bot merged commit a409d72 into meta-pytorch:main on Sep 30, 2025
8 checks passed
xuzhao9 pushed a commit that referenced this pull request on Oct 1, 2025
Differential Revision: D83513863

Pull Request resolved: #497