[pull] main from llvm:main #602

pull · 2025-10-20T17:51:05Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

This patch pivots GPR32 and GPR64 zeroing into distinct branches to simplify the code and improve the lowering. Zeroing GPR moves are now handled differently than non-zeroing ones. The zero source registers WZR and XZR do not require the undef, implicit and kill register annotations. The non-zeroing path can no longer receive WZR as a source, which removes the ternary expression. This patch also moves the GPR64 logic right after GPR32 for better organization.

`UnqualPtrTy` didn't always match `llvm::PointerType::getUnqual`: sometimes it returned a pointer that is not in address space 0 (notably for SPIRV). Since `UnqualPtrTy` was used as the "generic" or "default" pointer type, this patch renames it to `DefaultPtrTy` to avoid confusion with LLVM's `PointerType::getUnqual`.

All the existing tests test code either in ConstantFolding or InstSimplify, so move them to use -passes=instsimplify instead of -passes=instcombine. This makes sure we keep InstSimplify coverage even if there are subsuming InstCombine folds. This requires writing some of the constant folding tests in a different way, as InstSimplify does not try to re-fold already existing constant expressions.

This PR fixes a crash in the `bf_getbuffer` implementation of `PyDenseElementsAttribute` that occurred when an element type was not supported, such as `bf16`. I believe that supporting `bf16` is not possible with that protocol, but that's out of the scope of this PR. Previously, the code raised an `std::exception` out of `bf_getbuffer` that nanobind does not catch (see also pybind/pybind11#3336). The PR makes the function catch all `std::exception`s and manually raise a Python exception instead. Signed-off-by: Ingo Müller <[email protected]>

OpenACC 3.4 includes the ability to add an 'if' to an atomic operation. From the change log:

`Added the if clause to the atomic construct to enable conditional atomic operations based on the parallelism strategy employed`

In 2.12, the C/C++ grammar is changed to say:

`#pragma acc atomic [ atomic-clause ] [ if( condition ) ] new-line`

With corresponding changes to the Fortran standard.

This patch adds support for this to the dialect, so that Clang can use it soon.

…es (#163972) The lowering of `!$acc loop` loops with an early exit currently ends up "duplicating" the control flow in the acc.loop and inside it as explicit control flow (as if each iteration executes each iteration until the early exit). Add a TODO for now.

…162993) Early if conversion can create instruction sequences such as

```
mov x1, #1
csel x0, x1, x2, eq
```

which could be simplified into the following instead

```
csinc x0, x2, xzr, ne
```

One notable example that generates code like this is `cmpxchg weak`. This is fixed by handling an immediate value of 1 as `add(wzr, 1)` so that the addition can be folded into CSEL by using CSINC instead.
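The equivalence of the two sequences can be checked with a small model of the instruction semantics. This is only a sketch: `csel`, `csinc`, `beforeFold` and `afterFold` are hypothetical helper names that model the AArch64 semantics (CSEL yields the first operand when the condition holds and the second otherwise; CSINC yields the first operand when the condition holds and the second operand plus one otherwise). They are not LLVM APIs.

```cpp
#include <cstdint>

// Model of CSEL: Rd = Cond ? Rn : Rm.
constexpr uint64_t csel(uint64_t Rn, uint64_t Rm, bool Cond) {
  return Cond ? Rn : Rm;
}

// Model of CSINC: Rd = Cond ? Rn : Rm + 1.
constexpr uint64_t csinc(uint64_t Rn, uint64_t Rm, bool Cond) {
  return Cond ? Rn : Rm + 1;
}

// mov x1, #1 ; csel x0, x1, x2, eq
constexpr uint64_t beforeFold(uint64_t X2, bool Eq) {
  return csel(/*x1=*/1, X2, Eq);
}

// csinc x0, x2, xzr, ne   (xzr reads as 0, and "ne" is the inverse of "eq")
constexpr uint64_t afterFold(uint64_t X2, bool Eq) {
  return csinc(X2, /*xzr=*/0, /*ne=*/!Eq);
}

// Both forms agree whether or not the "eq" condition holds.
static_assert(beforeFold(42, true) == afterFold(42, true), "eq case");
static_assert(beforeFold(42, false) == afterFold(42, false), "ne case");
```

The constant 1 is produced by the `xzr + 1` increment, which is why the `mov` is no longer needed once the condition is inverted.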

…ns (#164099) The `MLInlineAdvisor` currently skips over recursive cases, except that when we delegate to the default policy for non-cold functions, that policy could allow such inlining. The code updating internal state afterwards needs to handle that case. Fix for https://issues.chromium.org/issues/369637577#comment14

If there is a call inside a TEAMS construct, and that call contains a DISTRIBUTE construct, the DISTRIBUTE region is considered to be enclosed by the TEAMS region (based on the dynamic extent of the construct). Currently, Flang diagnoses this as an error, which is incorrect. For example:

```
subroutine f
!$omp distribute
do i = 1, 100
  ...
end do
end subroutine

subroutine g
!$omp teams
call f ! this call is ok, distribute enclosed by teams
!$omp end teams
end subroutine
```

This patch adjusts the nesting check for the OpenMP DISTRIBUTE directive. It retains the error for DISTRIBUTE directives that are incorrectly nested lexically but downgrades it to a warning for orphaned directives to allow dynamic nesting, such as when a subroutine with DISTRIBUTE is called from within a TEAMS region.

Co-authored-by: Chandra Ghale <[email protected]>

Created new OpenACC utilities library (MLIROpenACCUtils) containing helper functions for region analysis, value usage checking, default attribute lookup, and type categorization. Includes comprehensive unit tests and refactors existing getEnclosingComputeOp function into the new library.

Per-entry-point metrics are captured during the path-sensitive analysis time. For that reason, it is not trivial to add the syntax-only analysis time, as it runs in a separate stage. Luckily, syntax-only analysis is done before path-sensitive analysis. I use the function summary field to keep the syntax-only analysis time once syntax analysis is done, and then forward it to the per-EP metrics snapshot during the path-sensitive analysis. Note that some of the entry points that were analyzed by syntax-only rules may be missing in the CSV export if they were never analyzed by path-sensitive rules. Conversely, if a function is analyzed with path-sensitive analysis but not syntax-only analysis, its `SyntaxRunningTime` will be empty. -- CPP-7099

This test has a loop iterating past (`61`) the array boundaries (`58`). So far this didn't seem to matter, but recently with this change #155253 the constraint elimination in swift has been able to figure this out and is transforming the loop into an infinite one like this

```
*** IR Dump After ConstraintEliminationPass on test_known_trip_count ***
define void @test_known_trip_count() local_unnamed_addr {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
  %arrayidx = getelementptr inbounds nuw double, ptr @b, i64 %indvars.iv
  %0 = load double, ptr %arrayidx, align 8
  %arrayidx2 = getelementptr inbounds nuw double, ptr @c, i64 %indvars.iv
  %1 = load double, ptr %arrayidx2, align 8
  %add = fadd double %0, %1
  %arrayidx4 = getelementptr inbounds nuw double, ptr @A, i64 %indvars.iv
  store double %add, ptr %arrayidx4, align 8
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
  br i1 false, label %exit, label %for.body

exit:                                             ; preds = %for.body
  ret void
}
```

causing the test to fail. This is trying to address the root cause here.

Previously, the invalid offset was set to UINT64_MAX, which is not right for DWARF32 and leads to incorrect debug info in GSYM: the branch

```
if (StmtSeqVal != UINT64_MAX)
  StmtSeqOffset = StmtSeqVal;
```

will always be true. In this PR, [commit 1](b1983d6) sets up a test that demonstrates the problem, and [commit 2](0d58ce4) fixes it. [Diffing commit 1 and 2](0d58ce4#diff-019bdbc9922ad34fdfbcb524a9805f5af26c432540e76b87a6a5f73d9e0e853aL44) in this PR shows how, after the PR, the symbolicated line number changed from the function definition to the function body.
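Why such a comparison can never fail for a 32-bit offset is illustrated by the sketch below. It rests on an assumption not stated in the description: that the invalid marker in a DWARF32 field is a 4-byte all-ones value that is zero-extended when read into a 64-bit variable. `readOffset32` and `validOffset` are hypothetical helpers for illustration and are not the code from the fix.

```cpp
#include <cstdint>
#include <optional>

// Assumption: a 4-byte field is zero-extended when widened to 64 bits.
inline uint64_t readOffset32(uint32_t Raw) { return Raw; }

// Hypothetical helper: compare against the sentinel that matches the
// offset width, and report "no offset" when the marker is found.
inline std::optional<uint64_t> validOffset(uint64_t Val, bool IsDwarf64) {
  const uint64_t Invalid = IsDwarf64 ? UINT64_MAX : UINT32_MAX;
  if (Val == Invalid)
    return std::nullopt;
  return Val;
}
```

Under that assumption, `readOffset32(UINT32_MAX)` equals `0x00000000FFFFFFFF`, which differs from `UINT64_MAX`, so a check written against the 64-bit sentinel accepts the invalid marker as a real offset.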

Currently, when peeling the first iteration, any mention of the UB within the loop body is replaced with the new UB of the peeled-out first iteration. This introduces a bug in the following scenario: operations inside the loop that intentionally use the original UB are incorrectly updated.

…r in SPIRVUtils (#164248) There was some repeated code that was used to deduce the SPIRV::LinkageType from a GlobalVariable/Function. At several related parts of the code we also had functions taking 2 parameters: a 'hasLinkage' bool, and a 'LinkageType'. This is error-prone since the latter parameter's meaning depends on the first. This patch also merges these two options into a single `std::optional<SPIRV::LinkageType>`.
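The idea of folding a flag plus a value into one optional can be sketched as follows. Everything here is hypothetical and simplified for illustration: the `LinkageType` enum, the `Symbol` struct, `getLinkageType`, and the rule used to pick a value stand in for the real SPIR-V backend types and logic, which the description does not spell out.

```cpp
#include <optional>

// Hypothetical enum standing in for SPIRV::LinkageType.
enum class LinkageType { Export, Import };

// Hypothetical, simplified view of a global variable or function.
struct Symbol {
  bool IsLocal;       // internal or private symbol
  bool IsDeclaration; // declared here, defined elsewhere
};

// One return value carries both "has linkage?" and "which linkage?".
// An empty optional means there is no linkage to emit, so no caller
// can read a LinkageType that was never meaningful.
std::optional<LinkageType> getLinkageType(const Symbol &S) {
  if (S.IsLocal)
    return std::nullopt;
  return S.IsDeclaration ? LinkageType::Import : LinkageType::Export;
}
```

A caller then tests the optional before using it, for example `if (auto LT = getLinkageType(S)) { /* use *LT */ }`, instead of passing a bool and an enum side by side.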

This hardens the unwinding logic and data structures on systems that support pointer authentication. The approach taken to hardening is to harden the schemas of as many high-value fields in the myriad structs as possible, and then also explicitly qualify local variables referencing privileged or security-critical values. This does introduce ABI linkage between libcxx, libcxxabi, and libunwind, but those are in principle separate from the OS itself, so we've kept the schema definitions in the library-specific headers rather than ptrauth.h

Added support for ConditionalOperator, BinaryConditionalOperator and OpaqueValueExpr as lvalue. Implemented support for ternary operators with one branch being a throw expression. This required weakening the requirement that the true and false regions of the ternary operator must terminate with a `YieldOp`. Instead the true and false regions are now allowed to terminate with an `UnreachableOp` and no `YieldOp` gets emitted when the block throws.

This patch adds a new script, premerge_advisor_explain.py that requests test failure explanations from the premerge advisor. For now it just prints them out to STDOUT. This allows for testing of the entire system by looking at failure explanations in failed jobs before we do the rest of the wiring to enable the premerge advisor to write out comments.

… AVX/AVX512 subvector extraction intrinsics to be used in constexpr #157712 (#162836)

**This PR supersedes and replaces PR #158853**

The original branch diverged too far from the main branch, resulting in significant merge conflicts that were difficult to resolve cleanly. To provide a clean and reviewable history, this new PR was created by cherry-picking the necessary commits onto a fresh branch based on the latest `main`.

---

*(Original Description)*

This patch enables the use of AVX/AVX512 subvector extraction intrinsics within `constexpr` functions. This is achieved by implementing the evaluation logic for these intrinsics in `VectorExprEvaluator::VisitCallExpr` and `InterpretBuiltin`.

The original discussion and review comments can be found in the previous pull request for context: #158853

Fixes #157712

The primary purpose of this commit is to enable marking loads to LDS (global.load.lds, buffer.*.load.lds) volatile (using bit 31 of the aux as with normal buffer loads) and to ensure that their !nontemporal annotations translate to appropriate settings of the cache control bits.

However, in the process of implementing this feature, we also fixed

- Incorrect handling of buffer loads to LDS in GlobalISel
- Updating the handling of volatile on buffers in SIMemoryLegalizer: previously, the mapping of address spaces would cause volatile on buffer loads to be silently dropped on at least gfx10.

---------

Co-authored-by: Matt Arsenault <[email protected]>

davemgreen and others added 30 commits October 20, 2025 12:53

[AArch64][GlobalISel] Add rax1.ll test converage. NFC

324bd15

[mlir][docs] Add documentation for No-rollback Conversion Driver (#16…

565e9fa

…4071) Add documentation for the no-rollback conversion driver. Also improve the documentation of the old rollback driver. In particular: which modifications are performed immediately and which are delayed.

[InstSimplify] Support ptrtoaddr in simplifyCastInst()

ee50839

Handle ptrtoaddr the same way as ptrtoint. The fold already only operates on the index/address bits.

[SLP]Do not pack div-like copyable values

154138c

If the main instruction in the copyables is a div-like instruction, the compiler cannot pack duplicates by extending them with poisons: once vectorized, these instructions would result in undefined behavior. Fixes #164185

Revert "Reapply "[Clang] Enable lit internal shell by default""

32de3b9

This reverts commit 1943c9e. This took out quite a few buildbots. Some of the Z3 test cases are failing and enabling this is causing some LLVM tests to begin failing.

[flang][OpenMP] Frontend support for DEVICE_SAFESYNC (#163560)

3590a91

Add parsing and semantic checks for DEVICE_SAFESYNC clause. No lowering.

[SCEV] Add extra test coverage with URem & AddRec guards.

0731f18

Add test with urem guard with non-constant divisor and AddRec guards. Extra test coverage for #163021

[lldb] Remove a redundant call to std::unique_ptr<T>::get (NFC) (#164191

c7da79e

)

[SCEV] Move and clarify names of prev/next divisor helpers (NFC).

385ea0d

Move getPreviousSCEVDivisibleByDivisor from a lambda to a static function and clarify the name (DividesBy -> DivisibleBy). Split off refactoring from #163021.

[NFC][SPIRV] Remove useless static_cast (#164239)

fbc2d06

[SPIRV][NFC] Use DenseMap's lookup instead of find (#164237)

b9f9b3b

[lookup](https://llvm.org/doxygen/classllvm_1_1DenseMapBase.html#a0b2ca98dc28c61793ff5c90d23e5f14e) does a find and returns the default if no matching element was found.
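The find-then-default pattern that `lookup` replaces can be sketched without LLVM headers. In this sketch `std::unordered_map` stands in for `DenseMap`, and `lookupOrDefault` is a hypothetical helper name used only for illustration; it is not the LLVM implementation.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper mirroring the behavior described above: do a find,
// and return a value-initialized mapped value when the key is absent.
template <typename Map>
typename Map::mapped_type lookupOrDefault(const Map &M,
                                          const typename Map::key_type &Key) {
  auto It = M.find(Key);
  if (It == M.end())
    return typename Map::mapped_type(); // e.g. 0 for int, nullptr for pointers
  return It->second;
}
```

The result is returned by value, so the pattern fits best when the mapped type is cheap to copy and a default-constructed value is an acceptable "not found" answer.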

[GVNSink] Add support for ptrtoaddr

2ec549a

[lldb-dap][NFC] avoid copy in launch process (#164243)

4a5dbd5

[MLIR] Apply clang-tidy fixes for misc-use-internal-linkage in TypePa…

5a98392

…rser.cpp (NFC)

[SpeculativeExecution] Generate test checks (NFC)

cceca04

Also replace the undef values with function arguments.

[SpeculativeExecution] Add support for ptrtoaddr

80b311a

[clang][bytecode] Check param types against function prototype (#163920)

c332952

If the type of the ParmVarDecl and the parameter type from the FunctionProtoType don't match, we're in for trouble. Just reject those functions. Fixes #163568

[NFC] Use F->isDeclaration instead of (*F).isDeclaration (#164238)

77ade89

[NFC][SPIRV] Use hasLocalLinkage instead of hasInternalLinkage or has…

66b7d38

…PrivateLinkage (#164236)

jmmartinez and others added 27 commits October 20, 2025 17:55

[NFC][SPIRV] Use hasLocalLinage instead of manual comparison against …

4e88280

…InteralLinkage/PrivateLinkage (#164240) Same as #164236, but I found this one later.

[ADT] Prepare for deprecation of StringSwitch cases with 4+ args. NFC. (

d86da4e

#164173) Update `.Cases` and `.CasesLower` with 4+ args to use the `initializer_list` overload. The deprecation of these functions will come in a separate PR. For more context, see: #163405.

[Clang] Make Z3 Tests Work with Internal Shell

2d550b9

These tests were setting environment variables, which needs to be done explicitly with env when using the internal shell.

[X86] Remove USER_MSR from DMR (#164232)

8c82606

Per Intel Architecture Instruction Set Extensions Programming Reference rev. 59 (https://cdrdv2.intel.com/v1/dl/getContent/671368), table 1-2, DMR doesn't support USER_MSR (URDMSR and UWRMSR instructions)

[MLIR][Python] expose translate_module_to_llvmir (#163881)

5a112de

This PR exposes `translate_module_to_llvmir` in the Python bindings.

[clang] Updates for support for Ubuntu, Debian and RHEL (#162796)

7152d4e

Remove support for long unsupported Ubuntu, Debian and RHEL. Add support for RHEL 8, 9 and 10 and recognize Rocky and AlmaLinux as RHEL.

[AArch64] Remove trailing whitespace in IntrinsicsAArch64.td (NFC) (#…

e25e43a

…164267)

[LLVM][CodingStandard] Extend namespace qualifier rule to variables/c…

58dd7a6

…lasses (#163588) Extend CS rule to use namespace qualifiers to define previously declared functions to variables and classes as well.

[NFC][LLVM] Namespace cleanup in MSCVPaths (#163779)

2bcb42f

This adopts use of namespace qualifiers to define previously declared functions as per LLVM CS: https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions

[NFC][LLVM] Code cleanup in llvm-config.cpp (#163993)

6683f9b

- Fix function names to conform to LLVM CS and mark local function static.
- Use range for loops to simplify code.
- Use `interleave` instead of manual loops to print lists.

[NFC][LLVM] Code cleanup in opt (#164077)

39128b9

- Use namespace qualifiers to define variables declared in `llvm` namespace.
- Move file local `TimeTracerRAII` struct into anonymous namespace.
- Use explicit types in a few places.
- Convert the loop over `PassList` to a range for loop.

[NFC][LLVM] Code cleanup in llvm-xray (#164080)

6eb1ddf

- Use nested namespace definitions in header files.
- Mark file local function static and enclose file local structs in anonymous namespace.
- Drop some unnecessary namespace qualifiers.

[CIR] Upstream support ComplexType as return type (#164072)

61ba312

Upstream support ComplexType as a function return type Issue #141365

[CIR] Add Aggregate Expression LValue Visitors (#163410)

cd05383

This patch implements visitors for MemberExpr, UnaryDeref, StringLiteral and CompoundLiteralExpr inside aggregate expressions.

[CIR] Implement VisitCXXDefaultArgExpr for ComplexType (#164079)

aac8a0d

Implement CXXDefaultArgExpr support for ComplexType Issue #141365

[Clang] VectorExprEvaluator::VisitCallExpr / InterpretBuiltin - allow…

3afbda0

… SSE41 phminposuw intrinsic to be used in constexp (#163041) Fix #161336

[Headers][X86] Allow MMX/SSE/AVX MOVMSK intrinsics to be used in cons…

725a297

…texpr (#161914) Fix #154520

[LLVM][Docs] Remove Stray %T Substitution

737e116

These were all removed in #160028, but I apparently missed this one instance in the documentation. Remove it given that it no longer works.

pull bot locked and limited conversation to collaborators Oct 20, 2025

pull bot added the ⤵️ pull label Oct 20, 2025

pull bot merged commit d371417 into optimizecompile:main Oct 20, 2025
