-
Notifications
You must be signed in to change notification settings - Fork 14.5k
MC,AMDGPU: Don't pad .text with s_code_end if it would otherwise be empty #147980
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
We don't want that padding in a module that only contains data, not code.
@llvm/pr-subscribers-mc @llvm/pr-subscribers-backend-amdgpu Author: Tim Renouf (trenouf) ChangesWe don't want that padding in a module that only contains data, not code. Full diff: https://github.com/llvm/llvm-project/pull/147980.diff 1 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
index 749b9efc81378..3765c1d2b106c 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
@@ -486,10 +486,12 @@ bool AMDGPUAsmPrinter::doFinalization(Module &M) {
// Pad with s_code_end to help tools and guard against instruction prefetch
// causing stale data in caches. Arguably this should be done by the linker,
// which is why this isn't done for Mesa.
+ // Don't do it if there are no functions.
const MCSubtargetInfo &STI = *getGlobalSTI();
if ((AMDGPU::isGFX10Plus(STI) || AMDGPU::isGFX90A(STI)) &&
(STI.getTargetTriple().getOS() == Triple::AMDHSA ||
- STI.getTargetTriple().getOS() == Triple::AMDPAL)) {
+ STI.getTargetTriple().getOS() == Triple::AMDPAL) &&
+ !M.empty()) {
OutStreamer->switchSection(getObjFileLowering().getTextSection());
getTargetStreamer()->EmitCodeEnd(STI);
}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testcase?
const MCSubtargetInfo &STI = *getGlobalSTI(); | ||
if ((AMDGPU::isGFX10Plus(STI) || AMDGPU::isGFX90A(STI)) && | ||
(STI.getTargetTriple().getOS() == Triple::AMDHSA || | ||
STI.getTargetTriple().getOS() == Triple::AMDPAL)) { | ||
STI.getTargetTriple().getOS() == Triple::AMDPAL) && | ||
!M.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable but I wonder if there's a better way to test "did we emit any code?". E.g. what if the module contains some function declarations but no definitions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Plus this may misdetect available_externally functions which will be dropped
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed to using MCSection::hasInstructions(), which I had to fix to work with the asm streamer.
Also added test. The positive case (when it does add padding) already has a test.
We don't want that padding in a module that only contains data, not code.
Also fix MCSection::hasInstructions() so it works with the asm streamer too.