[PGO] Fix zeroed estimated trip count #167792
The premise of this patch is that an estimated trip count of 0 is always invalid. Before PR #152775, `llvm::getLoopEstimatedTripCount` never returned 0. PR #152775 changed that behavior but kept documentation saying it returns a positive count. Some passes continue to rely on the previous behavior, as reported in issue #164254. And yet some passes call `llvm::setLoopEstimatedTripCount` with a value of 0.

To understand why it seems like an estimated trip count can be 0 but cannot, consider the example of LoopPeel. Given a loop with an estimated trip count of 10, if LoopPeel peels 2 iterations, it seems reasonable that the remaining loop will have an estimated trip count of 8. However, what should the remaining loop's estimated trip count be when peeling 10 iterations? What about 50?

Naively, it seems like the answers are 0 and -40, respectively. But neither is valid. Recall that we are talking about estimates. That means, the probability is likely *low* but not 0 that execution will reach iteration 11, iteration 51, or the remaining loop. In the unlikely case that it does reach them, it executes them. In other words, if execution reaches the loop header, at least one iteration of the remaining loop executes, and the probability is likely low that more will execute. Thus, a pass like LoopPeel might naively calculate that the remaining loop's estimated trip count is 0, but it must be at least 1.

We could try to ensure that all passes never set the estimated trip count as 0. For now, this patch instead:

- Asserts that `llvm.loop.estimated_trip_count` never ends up as 0.
- If `EstimatedloopInvocationWeight` is not specified, adjusts `llvm::setLoopEstimatedTripCount` to convert 0 to 1.
- If `EstimatedloopInvocationWeight` is specified, adjusts `llvm::setLoopEstimatedTripCount` to set zeroed branch weights and remove any `llvm.loop.estimated_trip_count`. The effect is that `llvm::getLoopEstimatedTripCount` will return `std::nullopt`.

For passes that still use `EstimatedloopInvocationWeight`, this patch thus restores the behavior from before PR #152775. Eventually, no passes should use `EstimatedloopInvocationWeight`.
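
The LoopPeel argument in this description can be illustrated with a minimal sketch. `remainingEstimatedTripCount` is a hypothetical helper written only for this comment; it is not an LLVM function and not necessarily how any pass computes the value. It only shows the arithmetic: subtract without wrapping, then clamp to 1.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not part of LLVM): estimate the trip count of the loop
// that remains after peeling PeelCount iterations from a loop whose estimated
// trip count is EstimatedTripCount.
unsigned remainingEstimatedTripCount(unsigned EstimatedTripCount,
                                     unsigned PeelCount) {
  // Saturating subtraction: peeling 50 iterations of an estimated 10 must not
  // wrap around to a huge unsigned value.
  unsigned Naive =
      EstimatedTripCount > PeelCount ? EstimatedTripCount - PeelCount : 0;
  // If execution reaches the remaining loop's header at all, at least one
  // iteration executes, so the estimate is clamped to 1.
  unsigned Remaining = std::max(Naive, 1u);
  assert(Remaining >= 1 && "estimated trip count must be positive");
  return Remaining;
}
```

With this helper, peeling 2 of an estimated 10 iterations gives 8, while peeling 10 or 50 both give 1 rather than 0 or a wrapped negative value.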
I have put this in a draft state until I receive confirmation that it fixes #164254.
After thinking about it for another night, I believe that approach is wrong.
I think we should instead continue that approach, even when For example, LoopPeel sometimes calls Besides, it is not my goal right now to tweak estimated trip counts and their impact on passes. My goal is to move them out of the way without disturbing them so I can fix BFI. So I am thinking of changing this patch to:
Thoughts?
I thought the loop count metadata doesn't impact BFI, just branch weights do?
Can we use "unknown" instead of zeroed branch weights? I'm pondering whether to make all-zero branch weights invalid. The reason is that "0" is a value that can come out of unchecked math. "unknown" is not.
Right. The purpose of that metadata, introduced by PR #152775, is to move estimated trip counts out of branch weights so I can fix BFI issues. But PR #152775 broke a guarantee of
That makes sense to me, but not in the current PR. Branch weights are zeroed by Does that make sense?
Ah, existing behavior - got it, and makes sense - thanks!
I've rewritten the PR to do that, and I've rewritten the PR initial comment to reflect the change. The new version is significantly simpler.
Before PR #152775, `llvm::getLoopEstimatedTripCount` never returned 0. If `llvm::setLoopEstimatedTripCount` were called with 0, it would zero branch weights, causing `llvm::getLoopEstimatedTripCount` to return `std::nullopt`.

PR #152775 changed that behavior: if `llvm::setLoopEstimatedTripCount` is called with 0, it sets `llvm.loop.estimated_trip_count` to 0, causing `llvm::getLoopEstimatedTripCount` to return 0. However, it kept documentation saying `llvm::getLoopEstimatedTripCount` returns a positive count.

Some passes continue to assume `llvm::getLoopEstimatedTripCount` never returns 0 and crash if it does, as reported in issue #164254. To restore the behavior they expect, this patch changes `llvm::getLoopEstimatedTripCount` to return `std::nullopt` when `llvm.loop.estimated_trip_count` is 0.
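
As a rough illustration of the getter behavior described above, the following standalone sketch models the metadata value as a `std::optional<unsigned>`. Both functions are hypothetical stand-ins written for this comment, not the real LLVM implementation or its signatures; the caller is an assumed example of code that would misbehave on a 0 estimate.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for the getter: Metadata models the value stored in
// llvm.loop.estimated_trip_count, or std::nullopt when the metadata is absent.
std::optional<unsigned>
estimatedTripCountFromMetadata(std::optional<unsigned> Metadata) {
  // A stored 0 is reported as "no estimate" instead of being returned.
  if (!Metadata || *Metadata == 0)
    return std::nullopt;
  return Metadata;
}

// Hypothetical caller that divides by the estimate, so a 0 would be fatal.
std::optional<unsigned>
weightPerIteration(unsigned TotalWeight, std::optional<unsigned> Metadata) {
  std::optional<unsigned> TripCount = estimatedTripCountFromMetadata(Metadata);
  if (!TripCount)
    return std::nullopt; // No usable estimate: the caller must handle this.
  assert(*TripCount > 0 && "getter must never hand out a zero trip count");
  return TotalWeight / *TripCount;
}
```

The point of the sketch is that callers only need to handle the "no estimate" case, which they already had to handle before PR #152775, and never see a 0.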