Commit dfb3f42

[AArch64][SME] Implement the SME ABI (ZA state management) in Machine IR
Short Summary

This patch adds a new pass `aarch64-machine-sme-abi` to handle the ABI for ZA state (e.g., lazy saves and agnostic ZA functions). It is not currently enabled by default (but the aim is for it to be by LLVM 22). The goal is for this new pass to place ZA saves/restores more optimally and to work with exception handling.

Long Description

This patch reimplements management of ZA state for functions with private and shared ZA state. Agnostic ZA functions will be handled in a later patch. For now, this is under the flag `-aarch64-new-sme-abi`; however, we intend for this to replace the current SelectionDAG implementation once complete.

The approach taken here is to mark instructions as needing ZA to be in a specific state ("ACTIVE" or "LOCAL_SAVED"). Machine instructions implicitly defining or using ZA registers (such as $zt0 or $zab0) require the "ACTIVE" state. Function calls may need the "LOCAL_SAVED" or "ACTIVE" state depending on whether the callee has shared or private ZA. We already add ZA register uses/definitions to machine instructions, so no extra work is needed to mark these. Calls need to be marked by gluing AArch64ISD::INOUT_ZA_USE or AArch64ISD::REQUIRES_ZA_SAVE to the CALLSEQ_START.

These markers are then used by the MachineSMEABIPass to find instructions where there is a transition between required ZA states. These are the points where we need to insert code to set up or restore a ZA save (or initialize ZA).

To handle control flow between blocks (which may have different ZA state requirements), we bundle the incoming and outgoing edges of blocks. Bundles are formed by assigning each block an incoming and an outgoing bundle (initially, all blocks have their own two bundles). Bundles are then combined by joining the outgoing bundle of a block with the incoming bundle of all its successors.

These bundles are then assigned a ZA state based on the blocks that participate in the bundle. Blocks whose incoming edges are in a bundle "vote" for a ZA state that matches the state required at the first instruction in the block; likewise, blocks whose outgoing edges are in a bundle vote for the ZA state that matches the last instruction in the block. The ZA state with the most votes is used, which aims to minimize the number of state transitions.

Change-Id: Iced4a3f329deab3ff8f3fd449a2337f7bbfa71ec
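The edge-bundling and voting scheme described above can be sketched in standalone C++. This is a minimal illustration, not the pass's actual data structures: the names `Bundles`, `Block`, `Assignment`, and `assignBundleStates`, and the two-state `ZAState` enum, are invented for this sketch, and tie-breaking between equally voted states is left arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// The two ZA states from the description above (names are illustrative).
enum class ZAState { Active, LocalSaved };

// Edge bundles, merged with a small union-find.
struct Bundles {
  std::vector<int> Parent;
  int add() {
    Parent.push_back((int)Parent.size());
    return (int)Parent.size() - 1;
  }
  int find(int X) { return Parent[X] == X ? X : Parent[X] = find(Parent[X]); }
  void join(int A, int B) { Parent[find(A)] = find(B); }
};

struct Block {
  ZAState EntryState; // state required by the block's first instruction
  ZAState ExitState;  // state required by the block's last instruction
  std::vector<int> Succs;
};

struct Assignment {
  std::vector<ZAState> Entry, Exit; // chosen state at each block boundary
};

Assignment assignBundleStates(const std::vector<Block> &Blocks) {
  Bundles B;
  // Initially, every block gets its own incoming and outgoing bundle.
  std::vector<int> In(Blocks.size()), Out(Blocks.size());
  for (size_t I = 0; I < Blocks.size(); ++I) {
    In[I] = B.add();
    Out[I] = B.add();
  }
  // Join the outgoing bundle of each block with the incoming bundle of all
  // of its successors, so every edge meeting at a CFG point shares a state.
  for (size_t I = 0; I < Blocks.size(); ++I)
    for (int S : Blocks[I].Succs)
      B.join(Out[I], In[S]);
  // Each block votes: its incoming bundle for its entry state, its outgoing
  // bundle for its exit state.
  std::map<int, std::map<ZAState, int>> Votes;
  for (size_t I = 0; I < Blocks.size(); ++I) {
    ++Votes[B.find(In[I])][Blocks[I].EntryState];
    ++Votes[B.find(Out[I])][Blocks[I].ExitState];
  }
  // The state with the most votes wins (ties resolved arbitrarily here).
  std::map<int, ZAState> BundleState;
  for (const auto &[Bundle, Counts] : Votes) {
    ZAState Best = ZAState::Active;
    int BestCount = -1;
    for (auto [State, Count] : Counts)
      if (Count > BestCount) {
        Best = State;
        BestCount = Count;
      }
    BundleState[Bundle] = Best;
  }
  Assignment A;
  for (size_t I = 0; I < Blocks.size(); ++I) {
    A.Entry.push_back(BundleState[B.find(In[I])]);
    A.Exit.push_back(BundleState[B.find(Out[I])]);
  }
  return A;
}
```

For example, if block 0 ends requiring "ACTIVE" but both of its successors begin requiring "LOCAL_SAVED", the three edges form one bundle and "LOCAL_SAVED" wins two votes to one, so only one transition (at the end of block 0) is needed instead of two.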
1 parent 653872f commit dfb3f42

23 files changed: +3884, −503 lines

llvm/lib/Target/AArch64/AArch64.h

Lines changed: 2 additions & 0 deletions

@@ -60,6 +60,7 @@ FunctionPass *createAArch64CleanupLocalDynamicTLSPass();
 FunctionPass *createAArch64CollectLOHPass();
 FunctionPass *createSMEABIPass();
 FunctionPass *createSMEPeepholeOptPass();
+FunctionPass *createMachineSMEABIPass();
 ModulePass *createSVEIntrinsicOptsPass();
 InstructionSelector *
 createAArch64InstructionSelector(const AArch64TargetMachine &,
@@ -111,6 +112,7 @@ void initializeFalkorMarkStridedAccessesLegacyPass(PassRegistry&);
 void initializeLDTLSCleanupPass(PassRegistry&);
 void initializeSMEABIPass(PassRegistry &);
 void initializeSMEPeepholeOptPass(PassRegistry &);
+void initializeMachineSMEABIPass(PassRegistry &);
 void initializeSVEIntrinsicOptsPass(PassRegistry &);
 void initializeAArch64Arm64ECCallLoweringPass(PassRegistry &);
 } // end namespace llvm

llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp

Lines changed: 26 additions & 13 deletions

@@ -92,8 +92,8 @@ class AArch64ExpandPseudo : public MachineFunctionPass {
   bool expandCALL_BTI(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI);
   bool expandStoreSwiftAsyncContext(MachineBasicBlock &MBB,
                                     MachineBasicBlock::iterator MBBI);
-  MachineBasicBlock *expandRestoreZA(MachineBasicBlock &MBB,
-                                     MachineBasicBlock::iterator MBBI);
+  MachineBasicBlock *expandCommitOrRestoreZA(MachineBasicBlock &MBB,
+                                             MachineBasicBlock::iterator MBBI);
   MachineBasicBlock *expandCondSMToggle(MachineBasicBlock &MBB,
                                         MachineBasicBlock::iterator MBBI);
 };
@@ -974,40 +974,50 @@ bool AArch64ExpandPseudo::expandStoreSwiftAsyncContext(
 }

 MachineBasicBlock *
-AArch64ExpandPseudo::expandRestoreZA(MachineBasicBlock &MBB,
-                                     MachineBasicBlock::iterator MBBI) {
+AArch64ExpandPseudo::expandCommitOrRestoreZA(MachineBasicBlock &MBB,
+                                             MachineBasicBlock::iterator MBBI) {
   MachineInstr &MI = *MBBI;
+  bool IsRestoreZA = MI.getOpcode() == AArch64::RestoreZAPseudo;
+  assert((MI.getOpcode() == AArch64::RestoreZAPseudo ||
+          MI.getOpcode() == AArch64::CommitZAPseudo) &&
+         "Expected ZA commit or restore");
   assert((std::next(MBBI) != MBB.end() ||
           MI.getParent()->successors().begin() !=
               MI.getParent()->successors().end()) &&
          "Unexpected unreachable in block that restores ZA");

   // Compare TPIDR2_EL0 value against 0.
   DebugLoc DL = MI.getDebugLoc();
-  MachineInstrBuilder Cbz = BuildMI(MBB, MBBI, DL, TII->get(AArch64::CBZX))
-                                .add(MI.getOperand(0));
+  MachineInstrBuilder Branch =
+      BuildMI(MBB, MBBI, DL,
+              TII->get(IsRestoreZA ? AArch64::CBZX : AArch64::CBNZX))
+          .add(MI.getOperand(0));

   // Split MBB and create two new blocks:
   //  - MBB now contains all instructions before RestoreZAPseudo.
-  //  - SMBB contains the RestoreZAPseudo instruction only.
-  //  - EndBB contains all instructions after RestoreZAPseudo.
+  //  - SMBB contains the [Commit|RestoreZA]Pseudo instruction only.
+  //  - EndBB contains all instructions after [Commit|RestoreZA]Pseudo.
   MachineInstr &PrevMI = *std::prev(MBBI);
   MachineBasicBlock *SMBB = MBB.splitAt(PrevMI, /*UpdateLiveIns*/ true);
   MachineBasicBlock *EndBB = std::next(MI.getIterator()) == SMBB->end()
                                  ? *SMBB->successors().begin()
                                  : SMBB->splitAt(MI, /*UpdateLiveIns*/ true);

-  // Add the SMBB label to the TB[N]Z instruction & create a branch to EndBB.
-  Cbz.addMBB(SMBB);
+  // Add the SMBB label to the CB[N]Z instruction & create a branch to EndBB.
+  Branch.addMBB(SMBB);
   BuildMI(&MBB, DL, TII->get(AArch64::B))
       .addMBB(EndBB);
   MBB.addSuccessor(EndBB);

   // Replace the pseudo with a call (BL).
   MachineInstrBuilder MIB =
       BuildMI(*SMBB, SMBB->end(), DL, TII->get(AArch64::BL));
-  MIB.addReg(MI.getOperand(1).getReg(), RegState::Implicit);
-  for (unsigned I = 2; I < MI.getNumOperands(); ++I)
+  unsigned FirstBLOperand = 1;
+  if (IsRestoreZA) {
+    MIB.addReg(MI.getOperand(1).getReg(), RegState::Implicit);
+    FirstBLOperand = 2;
+  }
+  for (unsigned I = FirstBLOperand; I < MI.getNumOperands(); ++I)
     MIB.add(MI.getOperand(I));
   BuildMI(SMBB, DL, TII->get(AArch64::B)).addMBB(EndBB);

@@ -1617,8 +1627,9 @@ bool AArch64ExpandPseudo::expandMI(MachineBasicBlock &MBB,
     return expandCALL_BTI(MBB, MBBI);
   case AArch64::StoreSwiftAsyncContext:
     return expandStoreSwiftAsyncContext(MBB, MBBI);
+  case AArch64::CommitZAPseudo:
   case AArch64::RestoreZAPseudo: {
-    auto *NewMBB = expandRestoreZA(MBB, MBBI);
+    auto *NewMBB = expandCommitOrRestoreZA(MBB, MBBI);
     if (NewMBB != &MBB)
       NextMBBI = MBB.end(); // The NextMBBI iterator is invalidated.
     return true;
@@ -1629,6 +1640,8 @@ bool AArch64ExpandPseudo::expandMI(MachineBasicBlock &MBB,
     NextMBBI = MBB.end(); // The NextMBBI iterator is invalidated.
     return true;
   }
+  case AArch64::InOutZAUsePseudo:
+  case AArch64::RequiresZASavePseudo:
   case AArch64::COALESCER_BARRIER_FPR16:
   case AArch64::COALESCER_BARRIER_FPR32:
   case AArch64::COALESCER_BARRIER_FPR64:

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

Lines changed: 88 additions & 59 deletions

@@ -8154,53 +8154,54 @@ SDValue AArch64TargetLowering::LowerFormalArguments(
   if (Subtarget->hasCustomCallingConv())
     Subtarget->getRegisterInfo()->UpdateCustomCalleeSavedRegs(MF);

-  // Create a 16 Byte TPIDR2 object. The dynamic buffer
-  // will be expanded and stored in the static object later using a pseudonode.
-  if (Attrs.hasZAState()) {
-    TPIDR2Object &TPIDR2 = FuncInfo->getTPIDR2Obj();
-    TPIDR2.FrameIndex = MFI.CreateStackObject(16, Align(16), false);
-    SDValue SVL = DAG.getNode(AArch64ISD::RDSVL, DL, MVT::i64,
-                              DAG.getConstant(1, DL, MVT::i32));
-
-    SDValue Buffer;
-    if (!Subtarget->isTargetWindows() && !hasInlineStackProbe(MF)) {
-      Buffer = DAG.getNode(AArch64ISD::ALLOCATE_ZA_BUFFER, DL,
-                           DAG.getVTList(MVT::i64, MVT::Other), {Chain, SVL});
-    } else {
-      SDValue Size = DAG.getNode(ISD::MUL, DL, MVT::i64, SVL, SVL);
-      Buffer = DAG.getNode(ISD::DYNAMIC_STACKALLOC, DL,
-                           DAG.getVTList(MVT::i64, MVT::Other),
-                           {Chain, Size, DAG.getConstant(1, DL, MVT::i64)});
-      MFI.CreateVariableSizedObject(Align(16), nullptr);
-    }
-    Chain = DAG.getNode(
-        AArch64ISD::INIT_TPIDR2OBJ, DL, DAG.getVTList(MVT::Other),
-        {/*Chain*/ Buffer.getValue(1), /*Buffer ptr*/ Buffer.getValue(0)});
-  } else if (Attrs.hasAgnosticZAInterface()) {
-    // Call __arm_sme_state_size().
-    SDValue BufferSize =
-        DAG.getNode(AArch64ISD::GET_SME_SAVE_SIZE, DL,
-                    DAG.getVTList(MVT::i64, MVT::Other), Chain);
-    Chain = BufferSize.getValue(1);
-
-    SDValue Buffer;
-    if (!Subtarget->isTargetWindows() && !hasInlineStackProbe(MF)) {
-      Buffer =
-          DAG.getNode(AArch64ISD::ALLOC_SME_SAVE_BUFFER, DL,
-                      DAG.getVTList(MVT::i64, MVT::Other), {Chain, BufferSize});
-    } else {
-      // Allocate space dynamically.
-      Buffer = DAG.getNode(
-          ISD::DYNAMIC_STACKALLOC, DL, DAG.getVTList(MVT::i64, MVT::Other),
-          {Chain, BufferSize, DAG.getConstant(1, DL, MVT::i64)});
-      MFI.CreateVariableSizedObject(Align(16), nullptr);
+  if (!Subtarget->useNewSMEABILowering() || Attrs.hasAgnosticZAInterface()) {
+    // Old SME ABI lowering (deprecated):
+    // Create a 16 Byte TPIDR2 object. The dynamic buffer
+    // will be expanded and stored in the static object later using a
+    // pseudonode.
+    if (Attrs.hasZAState()) {
+      TPIDR2Object &TPIDR2 = FuncInfo->getTPIDR2Obj();
+      TPIDR2.FrameIndex = MFI.CreateStackObject(16, Align(16), false);
+      SDValue SVL = DAG.getNode(AArch64ISD::RDSVL, DL, MVT::i64,
+                                DAG.getConstant(1, DL, MVT::i32));
+      SDValue Buffer;
+      if (!Subtarget->isTargetWindows() && !hasInlineStackProbe(MF)) {
+        Buffer = DAG.getNode(AArch64ISD::ALLOCATE_ZA_BUFFER, DL,
+                             DAG.getVTList(MVT::i64, MVT::Other), {Chain, SVL});
+      } else {
+        SDValue Size = DAG.getNode(ISD::MUL, DL, MVT::i64, SVL, SVL);
+        Buffer = DAG.getNode(ISD::DYNAMIC_STACKALLOC, DL,
+                             DAG.getVTList(MVT::i64, MVT::Other),
+                             {Chain, Size, DAG.getConstant(1, DL, MVT::i64)});
+        MFI.CreateVariableSizedObject(Align(16), nullptr);
+      }
+      Chain = DAG.getNode(
+          AArch64ISD::INIT_TPIDR2OBJ, DL, DAG.getVTList(MVT::Other),
+          {/*Chain*/ Buffer.getValue(1), /*Buffer ptr*/ Buffer.getValue(0)});
+    } else if (Attrs.hasAgnosticZAInterface()) {
+      // Call __arm_sme_state_size().
+      SDValue BufferSize =
+          DAG.getNode(AArch64ISD::GET_SME_SAVE_SIZE, DL,
+                      DAG.getVTList(MVT::i64, MVT::Other), Chain);
+      Chain = BufferSize.getValue(1);
+      SDValue Buffer;
+      if (!Subtarget->isTargetWindows() && !hasInlineStackProbe(MF)) {
+        Buffer = DAG.getNode(AArch64ISD::ALLOC_SME_SAVE_BUFFER, DL,
+                             DAG.getVTList(MVT::i64, MVT::Other),
+                             {Chain, BufferSize});
+      } else {
+        // Allocate space dynamically.
+        Buffer = DAG.getNode(
+            ISD::DYNAMIC_STACKALLOC, DL, DAG.getVTList(MVT::i64, MVT::Other),
+            {Chain, BufferSize, DAG.getConstant(1, DL, MVT::i64)});
+        MFI.CreateVariableSizedObject(Align(16), nullptr);
+      }
+      // Copy the value to a virtual register, and save that in FuncInfo.
+      Register BufferPtr =
+          MF.getRegInfo().createVirtualRegister(&AArch64::GPR64RegClass);
+      FuncInfo->setSMESaveBufferAddr(BufferPtr);
+      Chain = DAG.getCopyToReg(Chain, DL, BufferPtr, Buffer);
     }
-
-    // Copy the value to a virtual register, and save that in FuncInfo.
-    Register BufferPtr =
-        MF.getRegInfo().createVirtualRegister(&AArch64::GPR64RegClass);
-    FuncInfo->setSMESaveBufferAddr(BufferPtr);
-    Chain = DAG.getCopyToReg(Chain, DL, BufferPtr, Buffer);
   }

   if (CallConv == CallingConv::PreserveNone) {
@@ -8217,6 +8218,15 @@ SDValue AArch64TargetLowering::LowerFormalArguments(
     }
   }

+  if (Subtarget->useNewSMEABILowering()) {
+    // Clear new ZT0 state. TODO: Move this to the SME ABI pass.
+    if (Attrs.isNewZT0())
+      Chain = DAG.getNode(
+          ISD::INTRINSIC_VOID, DL, MVT::Other, Chain,
+          DAG.getConstant(Intrinsic::aarch64_sme_zero_zt, DL, MVT::i32),
+          DAG.getTargetConstant(0, DL, MVT::i32));
+  }
+
   return Chain;
 }

@@ -8781,14 +8791,12 @@ static SDValue emitSMEStateSaveRestore(const AArch64TargetLowering &TLI,
   MachineFunction &MF = DAG.getMachineFunction();
   AArch64FunctionInfo *FuncInfo = MF.getInfo<AArch64FunctionInfo>();
   FuncInfo->setSMESaveBufferUsed();
-
   TargetLowering::ArgListTy Args;
   TargetLowering::ArgListEntry Entry;
   Entry.Ty = PointerType::getUnqual(*DAG.getContext());
   Entry.Node =
       DAG.getCopyFromReg(Chain, DL, Info->getSMESaveBufferAddr(), MVT::i64);
   Args.push_back(Entry);
-
   SDValue Callee =
       DAG.getExternalSymbol(IsSave ? "__arm_sme_save" : "__arm_sme_restore",
                             TLI.getPointerTy(DAG.getDataLayout()));
@@ -8906,6 +8914,9 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
                  *DAG.getContext());
   RetCCInfo.AnalyzeCallResult(Ins, RetCC);

+  // Determine whether we need any streaming mode changes.
+  SMECallAttrs CallAttrs = getSMECallAttrs(MF.getFunction(), CLI);
+
   // Check callee args/returns for SVE registers and set calling convention
   // accordingly.
   if (CallConv == CallingConv::C || CallConv == CallingConv::Fast) {
@@ -8919,14 +8930,26 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
     CallConv = CallingConv::AArch64_SVE_VectorCall;
   }

+  bool UseNewSMEABILowering = Subtarget->useNewSMEABILowering();
+  bool IsAgnosticZAFunction = CallAttrs.caller().hasAgnosticZAInterface();
+  auto ZAMarkerNode = [&]() -> std::optional<unsigned> {
+    // TODO: Handle agnostic ZA functions.
+    if (!UseNewSMEABILowering || IsAgnosticZAFunction)
+      return std::nullopt;
+    if (!CallAttrs.caller().hasZAState() && !CallAttrs.caller().hasZT0State())
+      return std::nullopt;
+    return CallAttrs.requiresLazySave() ? AArch64ISD::REQUIRES_ZA_SAVE
+                                        : AArch64ISD::INOUT_ZA_USE;
+  }();
+
   if (IsTailCall) {
     // Check if it's really possible to do a tail call.
     IsTailCall = isEligibleForTailCallOptimization(CLI);

     // A sibling call is one where we're under the usual C ABI and not planning
     // to change that but can still do a tail call:
-    if (!TailCallOpt && IsTailCall && CallConv != CallingConv::Tail &&
-        CallConv != CallingConv::SwiftTail)
+    if (!ZAMarkerNode.has_value() && !TailCallOpt && IsTailCall &&
+        CallConv != CallingConv::Tail && CallConv != CallingConv::SwiftTail)
       IsSibCall = true;

     if (IsTailCall)
@@ -8978,9 +9001,6 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
     assert(FPDiff % 16 == 0 && "unaligned stack on tail call");
   }

-  // Determine whether we need any streaming mode changes.
-  SMECallAttrs CallAttrs = getSMECallAttrs(MF.getFunction(), CLI);
-
   auto DescribeCallsite =
       [&](OptimizationRemarkAnalysis &R) -> OptimizationRemarkAnalysis & {
     R << "call from '" << ore::NV("Caller", MF.getName()) << "' to '";
@@ -8994,7 +9014,7 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
     return R;
   };

-  bool RequiresLazySave = CallAttrs.requiresLazySave();
+  bool RequiresLazySave = !UseNewSMEABILowering && CallAttrs.requiresLazySave();
   bool RequiresSaveAllZA = CallAttrs.requiresPreservingAllZAState();
   if (RequiresLazySave) {
     const TPIDR2Object &TPIDR2 = FuncInfo->getTPIDR2Obj();
@@ -9076,10 +9096,21 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
         AArch64ISD::SMSTOP, DL, DAG.getVTList(MVT::Other, MVT::Glue), Chain,
         DAG.getTargetConstant((int32_t)(AArch64SVCR::SVCRZA), DL, MVT::i32));

-  // Adjust the stack pointer for the new arguments...
+  // Adjust the stack pointer for the new arguments... and mark ZA uses.
   // These operations are automatically eliminated by the prolog/epilog pass
-  if (!IsSibCall)
+  assert((!IsSibCall || !ZAMarkerNode.has_value()) &&
+         "ZA markers require CALLSEQ_START");
+  if (!IsSibCall) {
     Chain = DAG.getCALLSEQ_START(Chain, IsTailCall ? 0 : NumBytes, 0, DL);
+    if (ZAMarkerNode) {
+      // Note: We need the CALLSEQ_START to glue the ZAMarkerNode to, simply
+      // using a chain can result in incorrect scheduling. The markers refer
+      // to the position just before the CALLSEQ_START (though occur after as
+      // CALLSEQ_START lacks in-glue).
+      Chain = DAG.getNode(*ZAMarkerNode, DL, DAG.getVTList(MVT::Other),
+                          {Chain, Chain.getValue(1)});
+    }
+  }

   SDValue StackPtr = DAG.getCopyFromReg(Chain, DL, AArch64::SP,
                                         getPointerTy(DAG.getDataLayout()));
@@ -9551,7 +9582,7 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
     }
   }

-  if (CallAttrs.requiresEnablingZAAfterCall())
+  if (RequiresLazySave || CallAttrs.requiresEnablingZAAfterCall())
     // Unconditionally resume ZA.
     Result = DAG.getNode(
         AArch64ISD::SMSTART, DL, DAG.getVTList(MVT::Other, MVT::Glue), Result,
@@ -9572,7 +9603,6 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
   SDValue TPIDR2_EL0 = DAG.getNode(
       ISD::INTRINSIC_W_CHAIN, DL, MVT::i64, Result,
       DAG.getConstant(Intrinsic::aarch64_sme_get_tpidr2, DL, MVT::i32));
-
   // Copy the address of the TPIDR2 block into X0 before 'calling' the
   // RESTORE_ZA pseudo.
   SDValue Glue;
@@ -9584,7 +9614,6 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
       DAG.getNode(AArch64ISD::RESTORE_ZA, DL, MVT::Other,
                   {Result, TPIDR2_EL0, DAG.getRegister(AArch64::X0, MVT::i64),
                    RestoreRoutine, RegMask, Result.getValue(1)});
-
   // Finally reset the TPIDR2_EL0 register to 0.
   Result = DAG.getNode(
       ISD::INTRINSIC_VOID, DL, MVT::Other, Result,

llvm/lib/Target/AArch64/AArch64ISelLowering.h

Lines changed: 4 additions & 0 deletions

@@ -173,6 +173,10 @@ class AArch64TargetLowering : public TargetLowering {
   MachineBasicBlock *EmitZTInstr(MachineInstr &MI, MachineBasicBlock *BB,
                                  unsigned Opcode, bool Op0IsDef) const;
   MachineBasicBlock *EmitZero(MachineInstr &MI, MachineBasicBlock *BB) const;
+
+  // Note: The following group of functions are only used as part of the old
+  // SME ABI lowering. They will be removed once -aarch64-new-sme-abi=true is
+  // the default.
   MachineBasicBlock *EmitInitTPIDR2Object(MachineInstr &MI,
                                           MachineBasicBlock *BB) const;
   MachineBasicBlock *EmitAllocateZABuffer(MachineInstr &MI,
