[pull] main from llvm:main #585

pull · 2025-10-16T11:51:04Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

…163208)

Depends on:
* #163348
* #162632

With this patch Clang will start emitting `DW_AT_language_{name, version}` for C++/C/Objective-C/Objective-C++ when using `-gdwarf-6`. We adjust the `DISourceLanguageName` (which we pass to `DICompileUnit`) to hold a `DW_AT_language_name_` and version code when in DWARFv6. Otherwise we continue using the `DW_LANG_` version of `DISourceLanguageName`. We didn't back-port emitting `DW_AT_language_name`/`DW_AT_language_version` to DWARFv5 (unlike GCC, which emits both the new and old language attributes in DWARFv5) because there wasn't a compelling reason to do so (yet).
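The attribute selection described in this commit message can be sketched roughly as follows. This is an illustrative Python model only, not Clang's code: the function name `language_attributes` and the example code values passed to it are assumptions; only the attribute names and the DWARFv6 cutoff come from the text above.

```python
def language_attributes(dwarf_version, lang_code, lname_code, lname_version):
    """Return the compile-unit language attributes for a DWARF version.

    Hypothetical helper: for DWARFv6 and later, use the name/version pair;
    otherwise keep the single legacy DW_AT_language attribute.
    """
    if dwarf_version >= 6:
        return {
            "DW_AT_language_name": lname_code,
            "DW_AT_language_version": lname_version,
        }
    # No back-port to DWARFv5: only the old attribute is emitted.
    return {"DW_AT_language": lang_code}
```

For example, calling it with `dwarf_version=5` yields only `DW_AT_language`, matching the decision not to emit the new attributes before DWARFv6.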

For this specific case, when catching a pointer data type, by reference, Clang generates a special code pattern, which directly accesses the exception data by skipping past the `_Unwind_Exception` manually (rather than using the return value of `__cxa_begin_catch`). On most platforms, `_Unwind_Exception` is 32 bytes, but in some configurations it's different. (ARM EHABI is one preexisting case.) In the case of SEH, it's also different - it is 48 bytes in 32 bit mode and 64 bytes in 64 bit mode. (See the SEH ifdef in `_Unwind_Exception` in `clang/lib/Headers/unwind.h`.) Handle this case in `TargetCodeGenInfo::getSizeOfUnwindException`, fixing the code generation for catching pointers by reference. This fixes mstorsjo/llvm-mingw#522.
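As a rough model of the sizes quoted above, the sketch below maps a configuration to the `_Unwind_Exception` size. It is written in Python for illustration and is not the actual `TargetCodeGenInfo::getSizeOfUnwindException` code; the function name and parameters are assumptions, and the ARM EHABI case is deliberately left out because the text does not give its size.

```python
def size_of_unwind_exception(uses_seh: bool, pointer_bits: int) -> int:
    """Size in bytes of _Unwind_Exception, per the figures in the commit text.

    Hypothetical helper: only the SEH and default cases are modeled.
    """
    if uses_seh:
        # SEH: 48 bytes in 32-bit mode, 64 bytes in 64-bit mode.
        return 64 if pointer_bits == 64 else 48
    # Most platforms: 32 bytes.
    return 32
```

The catch-by-reference pattern skips this many bytes past the start of the exception header, which is why a wrong size leads to reading the wrong data.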

When working on privatization, it is easier to work with fir.box explicitly in memory; otherwise, there is no way to express that the fir.box will end up being a descriptor address in FIR, which makes it hard to deal with data management. However, introducing fir.ref<fir.box> early can pessimize early HLFIR optimization because the extra memory indirection makes it harder to reason about the aliasing of `fir.ref<fir.box>`. This patch introduces a pass that turns acc `!fir.box<T>` recipes into `!fir.ref<!fir.box<T>>` recipes and updates the related recipe usages to use `!fir.ref<!fir.box<T>>` (creating new alloca+store+load). It is added to flang and not OpenACC because it is specific to the `fir.box` type, so it makes little sense to make it an OpenACC generic pass and to create a new OpenACC dialect type interface for this use case.

…#162856) The `|| __bc == 0` case will never be relevant, since we know that `size() + 1` will always be exactly 1 if `__bc == 0` and `0 * max_load_factor()` will be zero, so the branch will already be taken due to the first condition.
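The redundancy argument can be checked with a small model. The sketch below is Python rather than libc++ code, and the function names are made up for illustration; it assumes, as the message states, that an empty bucket array (`bc == 0`) implies `size == 0`.

```python
def needs_rehash_old(size, bc, max_load_factor):
    # Condition with the explicit bucket-count check.
    return size + 1 > bc * max_load_factor or bc == 0


def needs_rehash_new(size, bc, max_load_factor):
    # Condition with the redundant check removed.
    return size + 1 > bc * max_load_factor


def conditions_agree():
    """Compare both conditions over a small range of states."""
    for mlf in (0.5, 1.0, 2.0):
        for bc in range(0, 8):
            # With no buckets the table cannot hold elements.
            sizes = [0] if bc == 0 else range(0, 20)
            for size in sizes:
                if needs_rehash_old(size, bc, mlf) != needs_rehash_new(size, bc, mlf):
                    return False
    return True
```

When `bc == 0`, the left-hand side is `1` and the right-hand side is `0`, so the first comparison is already true and the extra check never changes the outcome.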

…#163707) SimpleRemoteMemoryMapper is a MemoryMapper implementation that manages remote memory via EPC-calls to reserve, initialize, deinitialize, and release operations. It is compatible with the SimpleExecutorMemoryManager backend, and its introduction allows MapperJITLinkMemoryManager to use this backend. It is also intended to be compatible with the orc_rt::SimpleNativeMemoryMap backend.

* Added high-level section labels in linalg-ops-with-patterns.mlir.
* Moved tests for `memref.copy` to the bottom, after all Linalg ops.
* Removed duplicate `@test_vectorize_padded_pack_no_vector_sizes` tests - they differed only in tensor dimensions (both static).
* Updated comments and test names for `linalg.pack` to improve clarity and align with https://mlir.llvm.org/getting_started/TestingGuide/.
* Re-grouped tests for `linalg.pack`.

For a broader context, I plan to update the vectorization logic for `linalg.pack`. This clean-up will make the following PRs easier to review.

…e improved (#151332) Fix #65136

| Benchmark | Baseline | Candidate | Difference | % Difference |
| --- | --- | --- | --- | --- |
| BM_CmpEqual_int_int | 0.46 | 0.46 | -0.00 | -0.62 |
| BM_CmpEqual_int_schar | 0.45 | 0.45 | -0.00 | -0.40 |
| BM_CmpEqual_int_short | 0.45 | 0.45 | 0.00 | 0.34 |
| BM_CmpEqual_int_uchar | 0.78 | 0.44 | -0.34 | -43.18 |
| BM_CmpEqual_int_uint | 0.90 | 0.66 | -0.24 | -26.84 |
| BM_CmpEqual_int_ushort | 0.78 | 0.45 | -0.33 | -42.20 |
| BM_CmpEqual_schar_int | 0.45 | 0.45 | -0.00 | -0.77 |
| BM_CmpEqual_schar_schar | 0.54 | 0.57 | 0.03 | 5.64 |
| BM_CmpEqual_schar_short | 0.92 | 0.88 | -0.04 | -4.80 |
| BM_CmpEqual_schar_uchar | 1.84 | 0.66 | -1.18 | -64.16 |
| BM_CmpEqual_schar_uint | 0.78 | 0.66 | -0.12 | -15.18 |
| BM_CmpEqual_schar_ushort | 1.01 | 0.66 | -0.35 | -34.53 |
| BM_CmpEqual_short_int | 0.45 | 0.45 | 0.00 | 0.03 |
| BM_CmpEqual_short_schar | 0.89 | 0.88 | -0.01 | -0.80 |
| BM_CmpEqual_short_short | 0.47 | 0.46 | -0.01 | -1.28 |
| BM_CmpEqual_short_uchar | 1.11 | 0.66 | -0.45 | -40.63 |
| BM_CmpEqual_short_uint | 0.77 | 0.66 | -0.12 | -14.88 |
| BM_CmpEqual_short_ushort | 1.76 | 0.66 | -1.10 | -62.64 |
| BM_CmpEqual_uchar_int | 0.79 | 0.44 | -0.35 | -44.06 |
| BM_CmpEqual_uchar_schar | 1.76 | 0.66 | -1.11 | -62.68 |
| BM_CmpEqual_uchar_short | 1.11 | 0.66 | -0.45 | -40.33 |
| BM_CmpEqual_uchar_uchar | 0.57 | 0.51 | -0.06 | -10.61 |
| BM_CmpEqual_uchar_uint | 0.45 | 0.44 | -0.01 | -1.74 |
| BM_CmpEqual_uchar_ushort | 0.77 | 0.77 | -0.00 | -0.64 |
| BM_CmpEqual_uint_int | 0.88 | 0.66 | -0.23 | -25.69 |
| BM_CmpEqual_uint_schar | 0.77 | 0.66 | -0.11 | -14.85 |
| BM_CmpEqual_uint_short | 0.77 | 0.66 | -0.11 | -14.56 |
| BM_CmpEqual_uint_uchar | 0.44 | 0.44 | -0.00 | -0.57 |
| BM_CmpEqual_uint_uint | 0.47 | 0.51 | 0.04 | 8.62 |
| BM_CmpEqual_uint_ushort | 0.45 | 0.44 | -0.00 | -0.47 |
| BM_CmpEqual_ushort_int | 0.77 | 0.45 | -0.33 | -42.02 |
| BM_CmpEqual_ushort_schar | 1.02 | 0.66 | -0.36 | -35.30 |
| BM_CmpEqual_ushort_short | 1.76 | 0.66 | -1.10 | -62.60 |
| BM_CmpEqual_ushort_uchar | 0.78 | 0.77 | -0.01 | -1.84 |
| BM_CmpEqual_ushort_uint | 0.45 | 0.45 | 0.00 | 0.24 |
| BM_CmpEqual_ushort_ushort | 0.46 | 0.51 | 0.05 | 11.00 |
| BM_CmpLess_int_int | 0.67 | 0.66 | -0.01 | -0.99 |
| BM_CmpLess_int_schar | 0.66 | 0.66 | -0.01 | -0.86 |
| BM_CmpLess_int_short | 0.66 | 0.66 | -0.00 | -0.57 |
| BM_CmpLess_int_uchar | 0.88 | 0.66 | -0.23 | -25.48 |
| BM_CmpLess_int_uint | 1.76 | 0.66 | -1.11 | -62.68 |
| BM_CmpLess_int_ushort | 0.89 | 0.66 | -0.23 | -25.50 |
| BM_CmpLess_schar_int | 0.66 | 0.66 | -0.00 | -0.44 |
| BM_CmpLess_schar_schar | 0.66 | 0.66 | -0.00 | -0.40 |
| BM_CmpLess_schar_short | 0.88 | 0.88 | -0.00 | -0.50 |
| BM_CmpLess_schar_uchar | 1.10 | 0.71 | -0.39 | -35.24 |
| BM_CmpLess_schar_uint | 0.89 | 0.66 | -0.23 | -25.66 |
| BM_CmpLess_schar_ushort | 0.99 | 0.77 | -0.22 | -22.49 |
| BM_CmpLess_short_int | 0.66 | 0.66 | -0.00 | -0.35 |
| BM_CmpLess_short_schar | 0.89 | 0.88 | -0.00 | -0.48 |
| BM_CmpLess_short_short | 0.66 | 0.66 | -0.00 | -0.34 |
| BM_CmpLess_short_uchar | 1.10 | 0.71 | -0.39 | -35.36 |
| BM_CmpLess_short_uint | 0.88 | 0.66 | -0.22 | -25.39 |
| BM_CmpLess_short_ushort | 1.77 | 0.77 | -1.00 | -56.42 |
| BM_CmpLess_uchar_int | 0.97 | 0.66 | -0.31 | -31.95 |
| BM_CmpLess_uchar_schar | 1.11 | 0.66 | -0.44 | -40.17 |
| BM_CmpLess_uchar_short | 1.19 | 0.66 | -0.53 | -44.59 |
| BM_CmpLess_uchar_uchar | 0.66 | 0.66 | -0.00 | -0.67 |
| BM_CmpLess_uchar_uint | 0.67 | 0.66 | -0.01 | -1.19 |
| BM_CmpLess_uchar_ushort | 0.77 | 0.77 | -0.00 | -0.40 |
| BM_CmpLess_uint_int | 1.76 | 0.66 | -1.10 | -62.59 |
| BM_CmpLess_uint_schar | 0.89 | 0.66 | -0.23 | -25.99 |
| BM_CmpLess_uint_short | 0.88 | 0.66 | -0.22 | -25.41 |
| BM_CmpLess_uint_uchar | 0.66 | 0.66 | -0.01 | -0.81 |
| BM_CmpLess_uint_uint | 0.66 | 0.66 | -0.00 | -0.71 |
| BM_CmpLess_uint_ushort | 0.66 | 0.66 | -0.00 | -0.29 |
| BM_CmpLess_ushort_int | 0.98 | 0.66 | -0.32 | -33.00 |
| BM_CmpLess_ushort_schar | 1.29 | 0.77 | -0.52 | -40.56 |
| BM_CmpLess_ushort_short | 1.77 | 0.77 | -1.00 | -56.55 |
| BM_CmpLess_ushort_uchar | 0.77 | 0.77 | -0.01 | -0.72 |
| BM_CmpLess_ushort_uint | 0.66 | 0.66 | -0.00 | -0.46 |
| BM_CmpLess_ushort_ushort | 0.66 | 0.66 | -0.00 | -0.71 |

) This commit adds a new "specification_version" field to the TOSA target environment attribute. This allows a user to specify which version of the TOSA specification they would like to target during lowering. A leading example in the validation pass has also been added. This addition adds a version to each profile compliance entry to track which version of the specification the entry was added in. This allows a backwards compatibility check to be implemented between the target version and the profile compliance entry version. For now a default version of "1.0" is assumed. "1.1.draft" is added to denote an in-development version of the specification targeting the next release.
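One way to picture the backwards compatibility check is the Python sketch below. It is not the MLIR implementation: `parse_version` and `entry_is_available` are hypothetical names, and treating "1.1.draft" as ordering like 1.1 is an assumption made only for this illustration.

```python
def parse_version(text):
    """Turn "1.0" or "1.1.draft" into a comparable (major, minor) tuple.

    Assumption for this sketch: a ".draft" suffix is ignored for ordering.
    """
    parts = text.split(".")
    return int(parts[0]), int(parts[1])


def entry_is_available(target_version, entry_version):
    """A compliance entry is usable if it was added at or before the target."""
    return parse_version(entry_version) <= parse_version(target_version)
```

Under this model, an entry tagged "1.1.draft" is rejected when targeting "1.0", while entries tagged "1.0" remain valid for a "1.1.draft" target.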

PolyhedralInfo is tied to the legacy pass manager. With the eventual removal of the legacy pass manager it will not be useful anymore. PolyhedralInfo was an experiment to make Polly's analysis available to other passes. Its power is limited due to not being able to make assumptions for which regular Polly would emit a runtime condition/code versioning during optimization. When eventually porting such an API to the new pass manager, we will have to invent a new API.

InstCombine currently fails to call into InstSimplify for cast instructions. I noticed this because the transform from #98649 can be triggered via `-passes=instsimplify` but not `-passes=instcombine`, which is not supposed to happen.

2.x had ListType and StringTypes (https://docs.python.org/2.7/library/types.html), 3.x removed these (https://docs.python.org/3.0/library/types.html). We can use "str" and "list" directly as in 3.x all strings are just "str", and ListType was always an alias to "list".
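A minimal sketch of the replacement, assuming the old code used the `types` aliases in `isinstance` checks (the function `describe` is made up for illustration and does not come from the changed script):

```python
def describe(value):
    """Classify a value using the built-in types directly.

    Python 2 code might have written isinstance(value, types.StringTypes)
    or isinstance(value, types.ListType); in Python 3 the built-ins suffice.
    """
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "list"
    return "other"
```

Because every Python 3 string is a `str`, no tuple of string types is needed any more.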

…3324) We have `noinline` and `alwaysinline` present as first class function attributes. Add `inline_hint` to the list of function attributes as well. Update the module import and translation to support the new attribute. The verifier does not need to be changed as `inlinehint` does not conflict with `noinline` or `alwaysinline`. `inline_hint` is needed to support the `inline` C/C++ keyword in CIR.

The test will fail if libc++ starts to use a lambda in `<array>`. This will become the case because:
- libc++'s `array::fill` uses `std::fill_n`,
- `std::fill_n` is to be optimized for segment iterators, and
- the natural approach for such optimization uses lambdas.

Until ASTImport of `clang::LambdaExpr` nodes is properly fixed, this will need to be skipped.

This updates Python2 print statements to Python3 print functions, and makes lists out of some things that are iterators in Python3. We could have skipped the latter, as some code is fine with iterators, but it keeps the script behaving exactly as it did before in case anyone does try to use it. (It also makes clear that these were purely 2to3 changes, with no hand editing.)
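The two kinds of change that 2to3 makes here can be sketched as follows. The function `report` and its data are invented for illustration and are not taken from the script in question.

```python
import io


def report(counts, out):
    """Write "name count" lines in sorted order.

    Python 2 form (for comparison):
        names = counts.keys(); names.sort()
        print >>out, name, counts[name]
    """
    # dict.keys() is a view in Python 3; list() restores the Python 2
    # behaviour of getting a real list that can be sorted in place.
    names = list(counts.keys())
    names.sort()
    for name in names:
        # print is a function in Python 3; the stream goes in file=.
        print(name, counts[name], file=out)
```

Wrapping the view in `list()` is not always required, but it preserves in-place operations such as `sort()` that the original code may rely on.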

When building llvm from a subdirectory (like clspv does), `CMAKE_BINARY_DIR` is at the top of the build directory. When building runtimes (libclc for example), the build fails looking for clang (through `find_package` looking at `LLVM_BINARY_DIR` with `NO_DEFAULT_PATH` & `NO_CMAKE_FIND_ROOT_PATH`) because clang is not in `LLVM_BINARY_DIR`. Fix that issue by setting `clang_cmake_builddir` the same way we set `llvm_cmake_builddir` from `LLVM_BINARY_DIR`. For a default llvm build (using llvm as the main cmake project), this should not change anything. For a standalone clang build, keep the actual value, as libclc cannot be built that way.

REQUIRES clauses apply to the compilation unit, which the OpenMP spec defines as the program unit in Fortran. Don't set REQUIRES flags on all containing scopes, only on the containing program unit, where flags coming from different directives are gathered. If we wanted to set the flags on subprograms, we would need to first accumulate all of them, then propagate them down to all subprograms. That is not done as it is not necessary (the containing program unit is always available).
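The idea of gathering flags only on the containing program unit can be modeled with a toy scope tree. This Python sketch is not flang's semantics code; the `Scope` class, the `"global"` kind, and both helper functions are assumptions introduced for illustration.

```python
class Scope:
    """Toy scope node with a kind, a parent link, and gathered flags."""

    def __init__(self, kind, parent=None):
        self.kind = kind
        self.parent = parent
        self.requires = set()


def containing_program_unit(scope):
    """Walk outward until the scope sits directly under the global scope."""
    while scope.parent is not None and scope.parent.kind != "global":
        scope = scope.parent
    return scope


def add_requires(scope, flag):
    """Record a REQUIRES flag on the program unit only, not on inner scopes."""
    containing_program_unit(scope).requires.add(flag)
```

Any later query for the flags then looks them up on the program unit, so intermediate scopes never need their own copy.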

bgergely0 and others added 30 commits October 16, 2025 08:07

[BOLT][NFC] Add MCPlusBuilder unittests for PAuth helpers (#162251)

db2d8fc

PR #120064 added several MCPlusBuilder helpers for recognising instructions which sign or authenticate the link register. This patch adds MCPlusBuilder unittests for these helpers.

[WebAssembly] Partial SMLA with relaxed dot (#163529)

65363e6

Lower v16i8 to v4i32 partial_smla to relaxed_dot_add. I'm still unsure whether we could/should take advantage of the unknown signedness of the rhs, and also lower the partial_sumla operation too.

[clang-tidy][bazel][NFC] enable custom checks in bazel build (#160548)

4c4c028

[clang][bytecode] Diagnose out-of-bounds enum values in .... (#163530)

06cc20c

... non-constexpr variable initializers.

[clang][analyzer] Add checker 'core.NullPointerArithm' (#157129)

8570ba2

Revert "[libc] Enable intermediate computation in float for baremetal" (

d50423e

#163712) Reverts #163622, see #163711 for details.

[libc++][C++03] Don't run cstdalign.compile.pass.cpp (#163357)

bef39e6

`__alignas_is_defined` and `__alignof_is_defined` are a C++11 feature which we only recently added. I don't think it will break anybody if we don't provide these macros in C++03, so this simply disables the test instead.

[libc++][C++03] Fix alg.copy/copy.pass.cpp (#163365)

be5941e

[libc++][C++03] Make __libcpp_verbose_abort noexcept and fix the test…

35d8360

… for it (#163372)

[libc++][C++03] Fix support.dynamic/libcpp_deallocate.sh.cpp (#163378)

1f9a70f

This basically reverts the test changes in #118837.

[clang][bytecode] Fix null Descriptor dereference in ArrayElemPtrPop (#…

cf55dfb

…163386) Fixes #163127

[gn build] Port 9c456e5

a42546e

[MachinePipeliner] Add test missed in #154940 (NFC) (#163350)

71b001e

This PR adds a testcase where pipeliner bails out early because the number of the store instructions exceeds the threshold set by `pipeliner-max-num-stores`. The test should have been added in #154940, but it was missed.

[Clang] Handle ppc_fp128 in N3364 test (NFC)

910c868

On powerpc long double may be ppc_fp128, so add corresponding cases to the test.

[LV][NFC] Remove undef from function return values (#163578)

c48aa54

Split off from PR #163525, this standalone patch replaces `ret * undef` returns with `ret void` in order to reduce the likelihood of contributors hitting the `undef deprecator` warning in github.

[lldb][util] Use Python3 print function in example code

8160025

[Offload] XFAIL pgo tests until resolved (#163722)

f7e9968

While people look into it, xfail the tests.

[lldb][examples] Use "chr" in CFString.py

65c24e5

Python3 removed "unichr" when string encoding was changed, so this code tried to import that then defaulted to "chr" if it couldn't. Since LLVM requires >=3.8 we can use "chr" directly.
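A before/after sketch of this pattern, with a made-up helper name (`code_points_to_text`) rather than the actual code from CFString.py:

```python
# Compatibility shim of the kind being removed (shown for comparison only):
#
#     try:
#         unichr
#     except NameError:
#         unichr = chr


def code_points_to_text(code_points):
    """Build a string from integer code points using chr directly."""
    return "".join(chr(cp) for cp in code_points)
```

In Python 3, `chr` accepts the full Unicode range, so no separate `unichr` is needed.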

[Clang] VectorExprEvaluator::VisitCallExpr / InterpretBuiltin - Allow…

d2a8486

… AVX512 conflict intrinsics to be used in constexpr (#163293) Resolves #160524

paulwalker-arm and others added 17 commits October 16, 2025 11:03

[LLVM][CodeGen][SVE] Add lowering for ISD::VECREDUCE_MUL/FMUL. (#161842)

57b797f

We might be able to do better by using SVE2 and perhaps even NEON for the final stages, but this version works everywhere, so it seems like a good place to start. Fixes #155468

[AMDGPU] Add product names to processor table (#163717)

4773751

[X86] Relax vector element width constraint on SSE pmul/madd asm comm…

e3f9b4c

…ents (#163590) As noticed on #163567 - if the constant pool data wasn't the expected element size for the instruction, we weren't adding the asm comment at all

[AMDGPU] Preserve literal operands on disassembling. (#163376)

33503d0

Fixes round-tripping where literals used to be reassembled into inline constants. Also fix the %extract-encodings substitution in lit tests to emit each instruction code once and not twice. Eliminate the Literal64 field.

[llvm][utils] Run 2to3 on clang-parse-diagnostics-file

d342fa1

To update Python2 print statements to Python3 print function calls.

[SimpleLoopUnswitch] Regenerate UTC test. NFC

9393f23

To remove a confusing diff in #159522

[lld][utils] Remove Python2 compatible imports in benchmark.py

44c9692

These imports were moved around in Python 3.0 (https://docs.python.org/3/whatsnew/3.0.html#library-changes). LLVM requires Python >= 3.8 so we can expect the Python3 names to exist.
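As an illustration of this kind of cleanup, the sketch below shows a typical fallback import being replaced by the Python 3 name. The specific modules (`urllib.request`, `urllib.parse`) and the helper `build_request` are assumptions chosen as an example and are not necessarily the ones touched in benchmark.py.

```python
# Python 2/3 compatible form that is no longer needed (comparison only):
#
#     try:
#         from urllib2 import Request          # Python 2 location
#     except ImportError:
#         from urllib.request import Request   # Python 3 location

from urllib.parse import quote
from urllib.request import Request


def build_request(base_url, name):
    """Create a request object for a percent-encoded path under base_url."""
    return Request(base_url + "/" + quote(name))
```

With Python >= 3.8 guaranteed, the direct import cannot fail for lack of the module, so the `try`/`except` wrapper only adds noise.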

[X86] rem-seteq-illegal-types.ll - remove unnecessary X64 RUN (#163742)

41b1ff8

X64 triples include SSE2 by default, which we already test, and it was causing check prefix clash warnings in update_llc_test_checks.py

[X86] var-permute-128.ll - fix AVX512F/AVX512BW check prefix clashes (#…

4ae1233

…163745) Fix check prefix clash warnings in update_llc_test_checks.py by adding an additional prefix for AVX512F and AVX512BW capable targets

[llvm][utils] Remove Python2 comaptaible import in unicode-case-fold.py

c2eed93

These imports got moved around in Python 3.0 (https://docs.python.org/3/whatsnew/3.0.html#library-changes). LLVM requires Python >= 3.8 so we can assume the Python3 names are available.

[VPlan] Clarify legality check in licm (NFC) (#162486)

8f04f07

Recipes in licm are safe to hoist if the legality check passes, and the recipe is guaranteed to execute; the single successor of the vector preheader is the vector loop region. Clarify this in the code structure and comments.

[flang][OpenMP] Format check-omp-structure.cpp, NFC (#163750)

c8b8fa2

Only a couple of changes, including adding two empty comments to resolve differences between different versions of clang-format.

pull bot locked and limited conversation to collaborators Oct 16, 2025

pull bot added the ⤵️ pull label Oct 16, 2025

pull bot merged commit c8b8fa2 into optimizecompile:main Oct 16, 2025
13 checks passed

20 participants
