[pull] main from llvm:main #691

pull · 2025-11-12T11:51:05Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)

…c. (#167417) ctype_utils/wctype_utils were changed in 120689e and e7f7973, respectively, to operate on char/wchar_t. Now we can switch to the overloaded names (e.g. have both `isspace(char)` and `isspace(wchar_t)`) to simplify the templatized strtointeger implementation from 315dfe5 and make it easier to potentially add a templatized strtofloat implementation.
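To illustrate why overloaded predicate names help a templatized parser, here is a minimal sketch. The names `is_space`, `digit_value`, and `parse_decimal` are hypothetical stand-ins chosen for this example, not the actual libc internals, and the parser deliberately ignores signs, bases, and overflow.

```cpp
// Hypothetical stand-ins for the overloaded helpers; NOT the libc internals.
constexpr bool is_space(char c) {
  return c == ' ' || (c >= '\t' && c <= '\r');
}
constexpr bool is_space(wchar_t c) {
  return c == L' ' || (c >= L'\t' && c <= L'\r');
}

// Returns the decimal value of a digit, or -1 for any other character.
constexpr int digit_value(char c) {
  return (c >= '0' && c <= '9') ? c - '0' : -1;
}
constexpr int digit_value(wchar_t c) {
  return (c >= L'0' && c <= L'9') ? static_cast<int>(c - L'0') : -1;
}

// One template serves both character types: overload resolution picks the
// matching helper, so no per-type function name has to be passed around.
// Simplified on purpose: no sign, base, or overflow handling.
template <typename CharT>
constexpr long parse_decimal(const CharT *s) {
  while (is_space(*s))
    ++s;
  long value = 0;
  for (int d = digit_value(*s); d >= 0; d = digit_value(*++s))
    value = value * 10 + d;
  return value;
}
```

With differently named helpers per character type, the template would need a traits class or explicit specializations; with overloads, the same body compiles for `char` and `wchar_t` unchanged.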

…div/rem (#154072) Since div/rem operations don’t support a mask operand, the lanes of the divisor that are masked out are currently replaced with 1 using VPInstruction::Select before the predicated div/rem operation. This patch replaces

```
VPInstruction::Select(logical_and(header_mask, conditional_mask), LHS, RHS)
```

with

```
vp.merge(conditional_mask, LHS, RHS, EVL)
```

so that the header mask can be replaced by EVL in this usage scenario when tail folding with EVL.
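The equivalence between the two forms can be sketched with a lane-by-lane scalar model. This is only an illustration under stated assumptions: the header mask is modeled as "lane index < EVL", the vector width `kLanes` is arbitrary, and the function names are hypothetical, not VPlan APIs.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kLanes = 8;  // illustrative vector width (assumption)
using Vec = std::array<int, kLanes>;
using Mask = std::array<bool, kLanes>;

// Model of Select(logical_and(header_mask, cond), on_true, on_false), where
// the header mask is assumed to be "lane index < evl" under tail folding.
Vec select_with_header_mask(const Mask &cond, const Vec &on_true,
                            const Vec &on_false, std::size_t evl) {
  Vec out{};
  for (std::size_t i = 0; i < kLanes; ++i) {
    const bool header = i < evl;
    out[i] = (header && cond[i]) ? on_true[i] : on_false[i];
  }
  return out;
}

// Model of vp.merge(cond, on_true, on_false, evl): a lane takes on_true only
// if its condition is set and its index is below the pivot; otherwise it
// takes on_false.
Vec vp_merge(const Mask &cond, const Vec &on_true, const Vec &on_false,
             std::size_t evl) {
  Vec out{};
  for (std::size_t i = 0; i < kLanes; ++i)
    out[i] = (i < evl && cond[i]) ? on_true[i] : on_false[i];
  return out;
}
```

For the safe-divisor case, `on_true` would be the real divisor and `on_false` a vector of ones, so every lane that is masked out or past EVL divides by 1.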

…5573) Support for multi-image features has begun to be integrated into LLVM with the MIF dialect. In this PR, you will find lowering and operations related to the TEAM features (`SYNC TEAM`, `GET_TEAM`, `FORM TEAM`, `CHANGE TEAM`, `TEAM_NUMBER`). Note regarding the operation for `CHANGE TEAM`: this operation is partial. It does not support the associated list of coarrays, because the allocation of a coarray and the lowering of PRIF's `prif_alias_{create|destroy}` procedures are not yet supported in Flang. This will be integrated later. Any feedback is welcome.

…ement array to a 1-element array (#166950) When compiling with `-fembed-bitcode-marker`, Clang inserts a placeholder for the bitcode. This placeholder is a `[0 x i8]` array, which we cannot represent in SPIRV. For AMD-flavored SPIRV, we extend the `llvm.embedded.module` global to a `zeroinitializer [1 x i8]` array. To achieve this, this patch adds a new pass, `SPIRVPrepareGlobals`, that we can use to write global variables' _non-trivial-to-lower-IR_ -> _trivial-to-lower-IR_ mappings. This is a second attempt at #162082, but cleaner. In the translator, something similar has been done for every 0-element array since KhronosGroup/SPIRV-LLVM-Translator#2743. But I don't think we want to do this mapping for all cases.

…esses (#167649) Revert #166005 due to breaking x86 iOS sims. We're sometimes hitting an allocator assert when running x86 iOS sim tests. I don't believe this PR is at fault, but there's probably a memory safety / allocator issue somewhere which the allocation pattern here is exposing.

…7662) The new test fails on x86 and arm64 public macOS bots:

```
09:27:59 ======================================================================
09:27:59 FAIL: test_append_frames (TestScriptedFrameProvider.ScriptedFrameProviderTestCase)
09:27:59 Test that we can add frames after real stack.
09:27:59 ----------------------------------------------------------------------
09:27:59 Traceback (most recent call last):
09:27:59 File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/functionalities/scripted_frame_provider/TestScriptedFrameProvider.py", line 122, in test_append_frames
09:27:59 self.assertEqual(new_frame_count, original_frame_count + 1)
09:27:59 AssertionError: 5 != 6
09:27:59 Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang
09:27:59 ======================================================================
09:27:59 FAIL: test_applies_to_thread (TestScriptedFrameProvider.ScriptedFrameProviderTestCase)
09:27:59 Test that applies_to_thread filters which threads get the provider.
09:27:59 ----------------------------------------------------------------------
09:27:59 Traceback (most recent call last):
09:27:59 File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/functionalities/scripted_frame_provider/TestScriptedFrameProvider.py", line 218, in test_applies_to_thread
09:27:59 self.assertEqual(
09:27:59 AssertionError: 5 != 1 : Thread with ID 1 should have 1 synthetic frame
09:27:59 Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang
09:27:59 ======================================================================
09:27:59 FAIL: test_prepend_frames (TestScriptedFrameProvider.ScriptedFrameProviderTestCase)
09:27:59 Test that we can add frames before real stack.
09:27:59 ----------------------------------------------------------------------
09:27:59 Traceback (most recent call last):
09:27:59 File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/functionalities/scripted_frame_provider/TestScriptedFrameProvider.py", line 84, in test_prepend_frames
09:27:59 self.assertEqual(new_frame_count, original_frame_count + 2)
09:27:59 AssertionError: 5 != 7
09:27:59 Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang
09:27:59 ======================================================================
09:27:59 FAIL: test_remove_frame_provider_by_id (TestScriptedFrameProvider.ScriptedFrameProviderTestCase)
09:27:59 Test that RemoveScriptedFrameProvider removes a specific provider by ID.
09:27:59 ----------------------------------------------------------------------
09:27:59 Traceback (most recent call last):
09:27:59 File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/functionalities/scripted_frame_provider/TestScriptedFrameProvider.py", line 272, in test_remove_frame_provider_by_id
09:27:59 self.assertEqual(thread.GetNumFrames(), 3, "Should have 3 synthetic frames")
09:27:59 AssertionError: 5 != 3 : Should have 3 synthetic frames
09:27:59 Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang
09:27:59 ======================================================================
09:27:59 FAIL: test_replace_all_frames (TestScriptedFrameProvider.ScriptedFrameProviderTestCase)
09:27:59 Test that we can replace the entire stack.
09:27:59 ----------------------------------------------------------------------
09:27:59 Traceback (most recent call last):
09:27:59 File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/functionalities/scripted_frame_provider/TestScriptedFrameProvider.py", line 41, in test_replace_all_frames
09:27:59 self.assertEqual(thread.GetNumFrames(), 3, "Should have 3 synthetic frames")
09:27:59 AssertionError: 5 != 3 : Should have 3 synthetic frames
09:27:59 Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang
09:27:59 ======================================================================
09:27:59 FAIL: test_scripted_frame_objects (TestScriptedFrameProvider.ScriptedFrameProviderTestCase)
09:27:59 Test that provider can return ScriptedFrame objects.
09:27:59 ----------------------------------------------------------------------
09:27:59 Traceback (most recent call last):
09:27:59 File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/functionalities/scripted_frame_provider/TestScriptedFrameProvider.py", line 159, in test_scripted_frame_objects
09:27:59 self.assertEqual(frame0.GetFunctionName(), "custom_scripted_frame_0")
09:27:59 AssertionError: 'thread_func(int)' != 'custom_scripted_frame_0'
09:27:59 - thread_func(int)
09:27:59 + custom_scripted_frame_0
09:27:59
09:27:59 Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang
09:27:59 ----------------------------------------------------------------------
09:27:59 Ran 6 tests in 14.242s
09:27:59
09:27:59 FAILED (failures=6)
```

Reverts #161870

Fold or (fcmp uno %A, %A), (fcmp uno %B, %B), ... -> or (fcmp uno %A, %B), ...

This pattern is generated to check if any vector lane is NaN, and combining multiple compares is beneficial on architectures that have dedicated instructions.

Alive2 Proof: https://alive2.llvm.org/ce/z/vA_aoM

Combine suggested as part of #161735

PR: #167251
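The reasoning behind the fold can be checked on scalars: an unordered compare of a value with itself is true exactly when the value is NaN, and an unordered compare of two values is true when either is NaN. The sketch below uses `std::isunordered` from `<cmath>` as a stand-in for `fcmp uno`; the function names are hypothetical and this is a scalar model, not the DAG combine itself.

```cpp
#include <cmath>

// Unfolded form: one self-comparison per value, mirroring
// `or (fcmp uno %A, %A), (fcmp uno %B, %B)`.
inline bool any_nan_unfolded(float a, float b) {
  return std::isunordered(a, a) || std::isunordered(b, b);
}

// Folded form: a single unordered compare of both values, mirroring
// `fcmp uno %A, %B`.
inline bool any_nan_folded(float a, float b) {
  return std::isunordered(a, b);
}
```

Both functions agree for every combination of NaN and non-NaN inputs, which is the scalar intuition for why two compares can be merged into one.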

CHARACTER dummy arguments were treated as local variables in debug info. This happened because our method to get the argument number was not robust. It relied on `DeclareOp` having a direct reference to arguments which was not the case for character arguments. This is fixed by storing source-level argument positions in `DeclareOp`. Fixes #112886

This is a follow-up of PR #165558 and #165993. This patch updates the remaining two Ops to use the AnyTypeOf[] construct, completing the migration for the mbarrier family of Ops.

```
mbarrier.arrive.expect_tx
mbarrier.try_wait.parity
```

Signed-off-by: Durgadoss R <[email protected]>

…ultiversion resolvers (#167516)

- Fixes #163369
- Segmentation fault occurred because resolver was calling TSan instrumentation functions (__tsan_func_entry, __tsan_func_exit) but as the resolver is run by the dynamic linker at load time, TSan is not initialized yet so the current thread pointer is null.
- This PR adds the DisableSanitizerInstrumentation attribute to the multiversion function resolvers to avoid issues like this.
- Added regression test for TSan segfault.

…to arrays with UINT32_MAX elements (#166952) In HIP, dynamic LDS variables are represented using `0-element` global arrays in the `__shared__` language address-space.

```cpp
extern __shared__ int LDS[];
```

These are not representable in SPIRV directly. To represent them, for AMD, we use an array with `UINT32_MAX`-elements. These are reverse translated to 0-element arrays later in AMD's SPIRV runtime pipeline (in [SPIRVReader.cpp](https://github.com/ROCm/SPIRV-LLVM-Translator/blob/8cb74e264ddcde89f62354544803dc8cdbac148d/lib/SPIRV/SPIRVReader.cpp#L358)).

The LLVM-customized GTest has a dependency on LLVM to support `llvm::raw_ostream` and hence has to link to LLVMSupport. The runtimes use the LLVMSupport from the bootstrapping LLVM build. The problem is that the bootstrapping compiler and the runtimes target can diverge in their ABI, even in the runtimes default build. For instance, Clang is built using gcc which uses libstdc++, but the runtimes is built by Clang which can be configured to use libcxx by default. Although it does not use gcc, this issue has caused [flang-aarch64-libcxx](https://lab.llvm.org/buildbot/#/builders/89) to break, and is still (again?) broken.

This patch makes the runtimes' GTest independent from LLVMSupport so we do not link any runtimes component with LLVM components.

Runtime projects that use GTest unittests:

* flang-rt
* libc
* compiler-rt: Adds `gtest-all.cpp` with [GTEST_NO_LLVM_SUPPORT=1](https://github.com/llvm/llvm-project/blob/f801b6f67ea896d6e4d2de38bce9a79689ceb254/compiler-rt/CMakeLists.txt#L723) to each unittest without using `llvm_gtest`. Not touched by this PR.
* openmp: Handled by #159416. Not touched for now by this PR to avoid conflict.

The current state of this PR tries to reuse https://github.com/llvm/llvm-project/blob/main/third-party/unittest/CMakeLists.txt as much as possible, although personally I would prefer to make it use "modern CMake" style. third-party/unittest/CMakeLists.txt will detect whether it is used in runtimes build and adjust accordingly. It creates a different target for LLVM (`llvm_gtest`, NFCI) and another one for the runtimes (`runtimes_gtest`). It is not possible to reuse `llvm_gtest` for both since `llvm_gtest` is imported using `find_package(LLVM)` if configured using LLVM_INSTALL_GTEST. An alias `default_gtest` is used to select between the two. `default_gtest` could also be used for openmp which also supports standalone and [LLVM_ENABLE_PROJECTS](#152189) build mode.

#166561) This patch enables `aarch64-split-sve-objects` to handle hazard padding in functions that use the SVE CC even when there are no predicate spills/locals. This improves the codegen over the base hazard padding implementation, as rather than placing the padding in the callee-save area, it is placed at the start of the ZPR area.

E.g., Current lowering:

```
sub sp, sp, #1040
str x29, [sp, #1024] // 8-byte Folded Spill
addvl sp, sp, #-1
str z8, [sp] // 16-byte Folded Spill
sub sp, sp, #1040
```

New lowering:

```
str x29, [sp, #-16]! // 8-byte Folded Spill
sub sp, sp, #1024
addvl sp, sp, #-1
str z8, [sp] // 16-byte Folded Spill
sub sp, sp, #1040
```

This also re-enables paired stores for GPRs (as the offsets no longer include the hazard padding).

…ars with VALUE attribute (#166682) Scalars with VALUE attribute are likely passed in registers, so it's not clear what lowering should do with the array actual argument in this case. Fail this case with an error before getting to lowering.

…167505) On RISC-V narrowInterleaveGroups doesn't kick in because the wrong VectorRegWidth is passed to isConsecutiveInterleaveGroup. narrowInterleaveGroups is always passed the RGK_FixedWidthVector register size, but on RISC-V the RGK_ScalableVector size is twice as large because we want to use LMUL 2. This causes the `GroupSize == VectorRegWidth` check to fail. This fixes it by using the scalable register size whenever the VF is scalable and plumbing it through as a potentially scalable TypeSize. Note that this only makes a difference when tail folding is disabled, as narrowInterleaveGroups can't handle EVL based IVs yet.

…167340) We're likely to get better code from custom legalisation, where we can remove unpack instructions (plus SVE2p1 has BFMLSLB/T), but we get much of the benefit with these two small changes. NOTE: LLVM has no support for FEAT_AFP in terms of feature detection or ACLE builtins, so the compiler works under the assumption the feature is not enabled. The patch is also more aggressive when enabling bfloat fma construction because it removes unnecessary rounding, which is generally preferable regardless of whether BFMLALB is used or not.

topperc and others added 30 commits November 11, 2025 22:32

[RISCV] Remove implicit conversions of MCRegister to unsigned. NFC (#…

d04d291

…167588) Rename RegNum to Reg.

[PowerPC] Use MCRegister instead of unsigned. NFC (#167602)

b1eb7fa

I'm considering an operator>(MCRegister, unsigned) and operator<(MCRegister, unsigned), so I have not updated those lines. Such comparisons are common on MCRegister.
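As a rough sketch of what such comparison overloads could look like, the example below uses a hypothetical wrapper class `Reg` with an explicit constructor. It is not the real `llvm::MCRegister`, and the overloads shown are only one possible shape for the operators under consideration.

```cpp
// Hypothetical stand-in for a register wrapper; NOT the real llvm::MCRegister.
class Reg {
public:
  constexpr explicit Reg(unsigned id) : id_(id) {}
  constexpr unsigned id() const { return id_; }

private:
  unsigned id_;
};

// Mixed comparisons against a raw register number. With these overloads,
// range checks need no implicit conversion from Reg to unsigned.
constexpr bool operator<(Reg lhs, unsigned rhs) { return lhs.id() < rhs; }
constexpr bool operator>(Reg lhs, unsigned rhs) { return lhs.id() > rhs; }

// Typical use: test whether a register lies in an inclusive numeric range.
constexpr bool in_range(Reg r, unsigned first, unsigned last) {
  return !(r < first) && !(r > last);
}
```

Keeping the constructor explicit and providing only the comparisons that are actually needed preserves most of the type safety that removing the implicit conversion was meant to gain.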

workflows/release-binaries: Drop use of setup-windows action (#167440)

9dfd14a

We don't actually support Windows builds at this time, so this is not needed. I plan to add a different implementation once the release-binaries workflow supports Windows again.

[clang-tidy] Provide fix-its for downcasts in google-readability-cast…

da9015a

…ing (#165411)

[mlir][emitc] Unify API for deferred emission (#167532)

9be980c

This patch adds `printOperation()` functions for deferred emission ops in order to unify the API used for emitting operations. No functional change intended.

[AArch64] Prioritize regalloc hints over movprfx hints (#167480)

fe8865c

This is a follow-up from #166926 that ensures the hints are only added once, and ensures that hints inserted by the register allocator take priority over hints to reduce movprfx.

[libc++abi] Add a test to ensure the abi namespace alias is declared …

a8e058a

…correctly (#167485)

[libc++] Use variable templates in is_floating_point (#167141)

1590034

[libc++] Mark string functions as [[nodiscard]] (#166524)

36c1273

This applies `[[nodiscard]]` according to our coding guidelines to `basic_string`.
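For readers unfamiliar with the attribute, here is a minimal sketch of how `[[nodiscard]]` is applied to query-style member functions. `TinyBuffer` is a hypothetical miniature container invented for this example; it is not libc++'s `basic_string`, and the choice of which members to mark is only illustrative.

```cpp
#include <cstddef>

// Hypothetical miniature container; NOT libc++'s basic_string.
class TinyBuffer {
public:
  // Calling these only makes sense if the result is used, so the attribute
  // asks the compiler to diagnose calls whose value is discarded.
  [[nodiscard]] bool empty() const noexcept { return size_ == 0; }
  [[nodiscard]] std::size_t size() const noexcept { return size_; }

  // Mutating member: called for its side effect, so it is not marked.
  void push_back(char c) {
    if (size_ < kCapacity)
      data_[size_++] = c;
  }

private:
  static constexpr std::size_t kCapacity = 16;
  char data_[kCapacity] = {};
  std::size_t size_ = 0;
};
```

A statement such as `buf.empty();` on its own line would typically produce a warning with Clang or GCC, which catches the common mistake of writing `empty()` where `clear()` was intended.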

[AMDGPU] Fix missing S_WAIT_XCNT with multiple pending VMEMs (#166779)

5e4f177

[RISCV] Add short forward branch support for lui, qc.li, and `qc.…

b1343e3

…e.li` (#167481)

[mlir][tosa] Allow int64 tensors in tosa.intdiv (#167367)

3a66089

This commit extends the tosa.intdiv operand/result types to allow int64 tensors.

[MLIR][XeGPU][TransformOps] Add insert_prefetch op (#167356)

3c52f53

Adds `transform.xegpu.insert_prefetch` transform op that inserts `xegpu.prefetch_nd` ops for the given `Value` in an `scf.for` loop.

[X86] bitcnt-big-integer.ll - add zero_undef test coverage (#167663)

f48288a

[mlir][tosa] Fix validation support for argmax with int64 output (#16…

d5388c3

…7378) This commit fixes support for the argmax operation by allowing fp8/bf16 input operands with an int64 output type in the profile compliance such that it aligns with the spec.

[BLAZE] Add missing SCFUtil dep after #167356 (#167671)

57b2341

lukel97 and others added 2 commits November 12, 2025 11:14

pull bot locked and limited conversation to collaborators Nov 12, 2025

pull bot added the ⤵️ pull label Nov 12, 2025

pull bot merged commit 46e9d63 into optimizecompile:main Nov 12, 2025
20 of 21 checks passed

27 participants
