[pull] main from llvm:main #715

pull · 2025-11-18T17:51:05Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)

…is deduced (#164440)

Previously, the handling of the `cleanup` attribute had some checks based on the type, but we were deducing the type after handling the attribute. This PR fixes the type checks for the `cleanup` attribute by delaying them until after the type is deduced. The fix is also structured so that it can be adapted for other attributes that do type-based checks.

This is the list of C/C++ attributes that do type-based checks and will need to be fixed in additional PRs:

- CUDAShared
- MutualExclusions
- PassObjectSize
- InitPriority
- Sentinel
- AcquireCapability
- RequiresCapability
- LocksExcluded
- AcquireHandle

NB: Some attributes could have been missed in my shallow search.

Fixes #129631

This pass aims to narrow i64 types on TOSA operations to i32. It can be useful for legalizations from various frameworks. It comes with the following options:

- "aggressive-rewrite" - This option is typically able to narrow more values, but may impact numerical behaviour if not used carefully.
- "convert-function-boundaries" - If enabled, parameters/results to/from a function may be narrowed. Otherwise, casts are inserted to preserve the I/O of the function.

Currently the non-aggressive mode is very limited, targeting an argmax -> cast sequence that has been observed during legalization as well as some data layout operations that can always narrow. Support for more operations will be added in the future.

Signed-off-by: Luke Hutton <[email protected]>
Co-authored-by: Vitalii Shutov <[email protected]>
Co-authored-by: Shubham <[email protected]>
Co-authored-by: Declan Flavin <[email protected]>

…167519)

[andv, eorv, orv, s/uaddv, s/umaxv, s/uminv]
sve_reduce_##(none, ?) -> op's neutral value
sve_reduce_##(any, neutral) -> op's neutral value

[andv, orv, s/umaxv, s/uminv]
sve_reduce_##(all, splat(X)) -> X

[eorv]
sve_reduce_##(all, splat(X)) -> 0
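The fold rules above can be checked against a small reference model. The following is only a sketch under stated assumptions, not the patch's implementation: `sve_reduce`, `REDUCTIONS`, and the 8-bit element width are hypothetical names and choices made for illustration, and inactive lanes are modeled as contributing the operation's neutral value.

```python
from functools import reduce
import operator

BITS = 8  # hypothetical element width, chosen only for this illustration
UMAX = (1 << BITS) - 1
SMIN, SMAX = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1

# (binary op, neutral value) per reduction; keys mirror the SVE mnemonics.
REDUCTIONS = {
    "andv": (operator.and_, UMAX),
    "eorv": (operator.xor, 0),
    "orv": (operator.or_, 0),
    "saddv": (operator.add, 0),
    "uaddv": (operator.add, 0),
    "smaxv": (max, SMIN),
    "umaxv": (max, 0),
    "sminv": (min, SMAX),
    "uminv": (min, UMAX),
}


def sve_reduce(name, pred, lanes):
    """Reduce the active lanes; an all-false predicate yields the neutral value."""
    op, neutral = REDUCTIONS[name]
    active = [x for p, x in zip(pred, lanes) if p]
    return reduce(op, active, neutral)


NONE = [False] * 4
ANY = [True, False, True, True]
ALL = [True] * 4

for name, (_, neutral) in REDUCTIONS.items():
    # reduce(none, ?) and reduce(any, splat(neutral)) both give the neutral value.
    assert sve_reduce(name, NONE, [7] * 4) == neutral
    assert sve_reduce(name, ANY, [neutral] * 4) == neutral
```

With an all-true predicate and a splat of `X`, the idempotent operations return `X`, while `eorv` returns 0 here because the lane count is even and the copies of `X` cancel in pairs.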

Currently, there are no diagnostics issued when including a deprecated header, since the diagnostic is issued inside a system header. This patch fixes that by using `#warning` instead, which also simplifies the implementation of the deprecation warnings.

Update VPlan to populate VPIRFlags during VPInstruction construction and use it when creating widened recipes, instead of constructing VPIRFlags from the underlying IR instruction each time. The VPRecipeWithIRFlags constructor taking an underlying instruction and setting the flags based on it has been removed.

This centralizes initial VPIRFlags creation, ensures flags are consistently available throughout VPlan transformations, and makes sure we don't accidentally re-add flags from the underlying instruction that already got dropped during transformations.

Follow-up to #167253, which did the same for VPIRMetadata.

Should be NFC w.r.t. the generated IR.

PR: #168450

This patch makes all tsan tests work with the internal shell on Darwin. Tests were using various features not supported by the internal shell, mainly subshells and not using env to set environment variables. This patch also fixes one of the dynamiclib substitutions to not use a subshell.

Reviewers: ndrewh, DanBlackwell, fmayer, vitalybuka
Reviewed By: DanBlackwell
Pull Request: #168544

#162443) In some cases, such as when recommending the compiler option _FORTIFY_SOURCE, the current custom message format is clunky. Now, when the reason starts with `>`, the replacement string is omitted, so only the reason is shown.

`^function$,,has a custom message;`
- function 'function' has a custom message; it should not be used

`^function$,,>has a custom message and no replacement suggestion;`
- function 'function' has a custom message and no replacement suggestion

---------

Co-authored-by: Donát Nagy <[email protected]>
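To make the two examples above concrete, here is a minimal sketch of how such an entry could be turned into a diagnostic. It is not the check's actual code: `parse_entry` and `format_diagnostic` are hypothetical helpers, the entry layout `regex,replacement,reason;` is inferred from the examples, and only the empty-replacement case shown in the commit message is modeled.

```python
def parse_entry(entry):
    """Split one 'regex,replacement,reason;' entry into its three fields."""
    regex, replacement, reason = entry.rstrip(";").split(",", 2)
    return regex, replacement, reason


def format_diagnostic(function_name, entry):
    """Build the message for a matched function (empty replacement only)."""
    _regex, replacement, reason = parse_entry(entry)
    if replacement:
        # Assumption: the wording for a non-empty replacement is not given
        # in the commit message, so this sketch does not guess it.
        raise NotImplementedError("replacement wording not modeled here")
    if reason.startswith(">"):
        # A leading '>' suppresses the suffix, so only the reason is shown.
        return f"function '{function_name}' {reason[1:]}"
    return f"function '{function_name}' {reason}; it should not be used"


print(format_diagnostic("function", "^function$,,has a custom message;"))
print(format_diagnostic(
    "function",
    "^function$,,>has a custom message and no replacement suggestion;"))
```

The two `print` calls reproduce the two messages quoted in the commit description.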

…d-bundler and AMD SPIR-V. (#168521)

`clang-linker-wrapper` was incorrectly calling `clang-offload-bundler` for AMD SPIR-V. This resulted in a binary that couldn't be executed if built using the new driver. The runtime couldn't recognise the triple, triggering this error at execution time:

```
No compatible code objects found for: gfx90a:sramecc+:xnack-,
```

With this PR, this is solved:

```
Creating ISA for: gfx90a:sramecc+:xnack- from spirv
```

This is a simple translation of the current WORKSPACE file.

* External repos are replaced with `bazel_dep()`. The versions have been bumped to newer versions.
* `maybe()` doesn't seem to be a thing, so I just removed that.
* Existing repos where we define our own BUILD file in third_party_build have *not* been replaced due to compatibility issues. For example, `nanobind_bazel` could replace the `nanobind` config we have, but switching to that caused some build errors.
* For these existing repos, they have been specified as module extensions.

This should have no effect since `.bazelrc` defines `common --enable_bzlmod=false --enable_workspace`.

Tested locally: `bazel test --enable_bzlmod --noenable_workspace --config=generic_clang @llvm-project//... //...`

…he shadow map (#167772) The AddressSanitizer transform currently defaults to placing the shadow map in address space 0, but it is desirable for some targets (namely BPF) to select a different address space for the map. Add a compilation option for specifying the address space of the target.

This patch fixes most of the ASan tests that were failing on Darwin when running under the internal shell. There are still a couple left that are more interesting cases that I'll do in a follow-up patch. The tests that still need to be done:

```
TestCases/Darwin/duplicate_os_log_reports.cpp
TestCases/Darwin/dyld_insert_libraries_reexec.cpp
TestCases/Darwin/interface_symbols_darwin.cpp
```

Reviewers: thetruestblue, fhahn, vitalybuka, DanBlackwell, ndrewh
Reviewed By: DanBlackwell
Pull Request: #168545

…ific address spaces (#167770) For some backends, e.g., BPF, it is desirable to only sanitize memory belonging to specific address spaces. More specifically, it is sometimes desirable to only apply address sanitization for arena memory belonging to address space 1. However, AddressSanitizer currently does not support selectively sanitizing address spaces. Add a new option to select which address spaces to apply AddressSanitizer to. No functional change for existing targets (namely AMD GPU) that hardcode which address spaces to sanitize

This PR changes the LLDB codebase so that LLDB can print the values of integer registers wider than 64 bits (even if the number of bits is not equal to 128).

---------

Co-authored-by: Matej Košík <[email protected]>
Co-authored-by: Jonas Devlieghere <[email protected]>

…68165) and make (#165264) Truly recover Executor::getDefaultExecutor. The previous change missed std::unique_ptr, which is needed on a normal program exit: only with it is the ThreadPoolExecutor destructor called, and that destructor ensures the executor has been stopped and waits for worker threads to finish. The wait is important as it prevents intermittent crashes on Windows when the process is doing a full exit.

https://alive2.llvm.org/ce/z/YGT5SN https://alive2.llvm.org/ce/z/PVDxCw https://alive2.llvm.org/ce/z/8buR2N This is tricky because with positive numbers, we only go up, so we can in fact always hit the signed_max boundary. This is important because the intrinsic we use has the behavior of going the OTHER way, aka clamp to INT_MIN if it goes in that direction. And the range checking we do only works for positive numbers. Because of this issue, we can only do this for constants as well.

When building just the runtimes (e.g., when a patch only touches compiler-rt), we do not actually run any normal check targets. This ends up causing an empty ninja invocation, which builds more targets than necessary. Gate the ninja build for normal check-* targets under an if statement to fix this.

The AArch64 backend converts trees formed by conjunctions/disjunctions of comparisons into sequences of `CCMP` instructions. The implementation before this change checks whether a sub-tree must be processed first. If not, it processes the operations in the order they occur in the DAG. This may not be optimal if there is a corresponding `SUB` node for one of the comparisons. In this case, we should process this comparison first because we can then use the same instruction for the `SUB` node and the comparison.

To achieve this, this commit comprises the following changes:

- Extend `canEmitConjunction` with a new output parameter `PreferFirst`, which reports to the caller whether the sub-tree should preferably be processed first.
- Set `PreferFirst` to `true` if we can find a corresponding `SUB` node in the DAG.
- If we can process a sub-tree with `PreferFirst = true` first (i.e., we do not violate any `MustBeFirst` constraint by doing so), we swap the sub-trees.
- The already existing code for performing the common subexpression elimination takes care to use only a single instruction for the comparison and the `SUB` node if possible.

Closes #149685.

In general, "Flat instructions look at the per-workitem address and determine for each work item if the target memory address is in global, private or scratch memory." (RDNA2 ISA) That means that FLAT instructions need to be considered for VMEM hazards even without "specific segment". Also, LDS DMA should be considered for LDS hazard detection. See also #137148

This reverts commit b3d6264. This broke the workflow because the sync-labels flag was set to a zero-length string to work around an issue. The underlying issue has been fixed and the value is now required to be a boolean. We can just drop the value because we want the default behavior anyways. This should be the last remaining breaking change from v5 that we need to migrate.

grypp and others added 30 commits November 18, 2025 12:56

[MLIR][NVVM] Move the docs to markdown file (#168375)

76dac58

[AArch64][GlobalISel] Add better basic legalization for llround. (#16…

4ecfaa6

…8427) This adds handling for f16 and f128 lround/llround under LP64 targets, promoting the f16 where needed and using a libcall for f128. This codegen is now identical to the selection dag version.

[LLVM][CodeGen][SVE] Use DUPM for constantfp splats. (#168391)

59ed6df

This helps cases where the immediate range of FDUP is not sufficient.

[CMake] Declare all parts of *GenRegisterInfo.inc as outputs. (#168405)

0be4218

This tells the build system to check and regenerate the *GenRegisterInfo*.inc files, should any of them be missing for whatever reason. A follow-up from <#167700>.

[TableGen][NFCI] Change TableGenMain() to take function_ref. (#167888)

3c87119

It was switched from a function pointer to std::function in f675ec6 ("TableGen: Make 2nd arg MainFn of TableGenMain(argv0, MainFn) optional."), but there's no mention of any particular reason for that.

[ORC] Fix shlibs build: add Object to libLLVMOrcDebugging (#168343)

4c9020d

[X86] combineTruncate - trunc(srl(load(p),amt)) -> load(p+amt/8) - en…

52f4c36

…sure amt doesn't depend on original load chain (#168400) Relax fix for #165755 / #165850 - it doesn't matter if the amt is dependent on the original load value, just any users of the chain

[CGP]: Optimize mul.overflow. (#148343)

3d5d32c

- Detect cases where LHS & RHS values will not cause overflow (when the Hi halves are zero).
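The reasoning behind this fast path can be sketched for the unsigned case: if the high halves of both operands are zero, each operand is below 2^(bits/2), so the product is below 2^bits and cannot overflow. The snippet below is only an illustration of that arithmetic, not the CodeGenPrepare transform itself; `umul_with_overflow` is a hypothetical helper and the signed case is not modeled.

```python
def umul_with_overflow(lhs, rhs, bits=64):
    """Return (truncated product, overflow flag) for unsigned operands."""
    half = bits // 2
    mask = (1 << bits) - 1
    lhs &= mask
    rhs &= mask
    if (lhs >> half) == 0 and (rhs >> half) == 0:
        # Both operands are below 2**half, so the product is below 2**bits:
        # a plain multiply is enough and the overflow flag is known to be False.
        return lhs * rhs, False
    # General path: compute the wide product and check whether it fits.
    wide = lhs * rhs
    return wide & mask, wide > mask


# Largest operands that still take the fast path.
print(umul_with_overflow(0xFFFFFFFF, 0xFFFFFFFF))
# Smallest power-of-two operands that overflow 64 bits.
print(umul_with_overflow(1 << 32, 1 << 32))
```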

[RTSan] Fix tests under Internal Shell (#168470)

c771159

This patch fixes the only RTSan test that was broken by enabling lit's internal shell on Darwin. This patch rewrites the test to prefix env variables with `env` and to avoid the use of subshells.

[BAZEL] Fix BAZEL build issue (#168539)

e9f74df

[mlir][tosa] Fix shared build

38891ba

[OpenMP] Implement omp_get_uid_from_device() / omp_get_device_from_ui…

65c4a53

…d() (#164392) Use the implementation in libomptarget. If libomptarget is not available, always return the UID / device number of the host / the initial device.

[BAZEL] Fix OrcDebugging dep (#168540)

6fc2bc1

Revert "[OpenMP] Implement omp_get_uid_from_device() / omp_get_device…

9a0fd22

…_from_uid()" (#168547) Reverts #164392 due to fortran issues

[DWARFCFIChecker] Remove an unused local variable (NFC) (#168487)

1e18b48

Note that getCurrentUnwindRow does not change any state. Identified with unused-local-non-trivial-variable.

[Bitcode] Use a range-based for loop (NFC) (#168489)

4749cc4

Identified with modernize-loop-convert.

[AMDGPU] Remove const on a return type. (#168490)

00ef948

While I am at it, this patch switches to the constructor that takes a container instead of a pair of begin/end. Identified with readability-const-return-type.

[clang][CIR] Temporarily fix CIR codegen test on call. NFC

cc0c899

- MemoryEffectsAttr in MLIR LLVM dialect is out of sync with LLVM itself.

[ELF][AArch64] Fix copy/paste error in llvm_unreachable message

906f175

Fixes: e1979ae ("Implement gd to ie relaxation for aarch64.")

[mlir][amdgpu] Sink op creation in scaled conversion intrinsics (NFC) (…

1fcfd5c

…#168542) Where possible:

* notifyMatchFailure happens first
* then op.emitOpError
* finally assertions / op creation.

---------

Co-authored-by: Jakub Kuderski <[email protected]>

[HLSL] Implement ddx/ddy_coarse intrinsics (#164831)

ed60cd2

Closes #99097
Closes #99100

As ddx and ddy are near identical implementations I've combined them in this PR. This aims to unblock #161378

---------

Co-authored-by: Alexander Johnston <[email protected]>

ashermancinelli and others added 22 commits November 18, 2025 07:55

[MLIR][Python] Add arg_attrs and res_attrs to gpu func (#168475)

47d9d73

I missed these attributes when I added the wrapper for GPUFuncOp in fbdd98f.

[Clang][Driver] Create crash reproducers for IR inputs (#165572)

83d27f6

This patch makes Clang produce the crash reproducer shell script for IR inputs as well.

[llvm][AddressSanitizer][BPF] add default shadow mapping offset for B…

82a7832

…PF target (#167768) The AddressSanitizer transform does not have a default offset registered for the shadow map. Set the default shadow map offset for BPF to be dynamically set by the KASAN implementation.

[clang][BPF] Turn on AddressSanitizer pass (#167766)

1347b23

The BPF LLVM target currently doesn't support turning on the AddressSanitizer pass, either for userspace ASAN or KASAN. Enable the KASAN option for the BPF target in anticipation of a KASAN implementation for BPF.

[flang][NFC] Strip trailing whitespace from tests (6 of N)

38c1a58

Only the fortran source files in flang/test/Lower/PowerPC and some in flang/test/Lower have been modified. The other files in the directory will be cleaned up in subsequent commits

[AArch64][GISel] Don't crash in known-bits when copying from vectors …

93a8ca8

…to non-vectors (#168081) Updates the demanded elements before recursing through copies in case the type of the source register changes from a non-vector register to a vector register. Fixes #167842.

[lldb] update lldb-server platform help parsing (attempt 3) (#164904)

2675dcd

* original change #162730
* with windows fix #164843
* remove timeout that was pointed out in the comment above
* Remove test that starts and listens on a socket to avoid timeout issues

[APInt] Introduce carry-less multiply primitives (#168527)

727ee7e

In line with a std proposal to introduce std::clmul, and in preparation to introduce a clmul intrinsic, implement carry-less multiply primitives for APIntOps, clmul[rh]. Ref: https://isocpp.org/files/papers/P3642R3.html
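For readers unfamiliar with the operation, carry-less multiplication is ordinary shift-and-add multiplication with XOR in place of addition, i.e. polynomial multiplication over GF(2). The sketch below illustrates only the basic low-half product on Python integers; `clmul_wide` and `clmul` are hypothetical helpers, not the APIntOps signatures, and the reversed and high variants mentioned in the commit are not modeled.

```python
def clmul_wide(a, b):
    """Full-width carry-less product of two non-negative integers."""
    result = 0
    shift = 0
    while b:
        if b & 1:
            # XOR in the shifted multiplicand instead of adding it,
            # so no carries propagate between bit positions.
            result ^= a << shift
        b >>= 1
        shift += 1
    return result


def clmul(a, b, bits=64):
    """Low `bits` bits of the carry-less product, as for a fixed-width type."""
    mask = (1 << bits) - 1
    return clmul_wide(a & mask, b & mask) & mask


# (x^2 + 1) * (x + 1) = x^3 + x^2 + x + 1
print(bin(clmul(0b101, 0b11)))
# (x + 1)^2 = x^2 + 1 over GF(2), whereas integer 3 * 3 is 9.
print(bin(clmul(0b11, 0b11)))
```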

[AMDGPU][GlobalISel] Add RegBankLegalize support for G_IS_FPCLASS (#1…

cb58129

…67575)

[AsmParser] Use a range-based for loop (NFC) (#168488)

6d3971d

Identified with modernize-loop-convert.

[ARM] Pattern match Low Overhead Loops pseudos (NFC) (#168209)

3cf1f0c

Pull Request: #168209

[flang][OpenMP] Move two utilities from Semantics to Parser, NFC (#16…

c88ae6e

…8549) Move `GetInnermostExecPart` and `IsStrictlyStructuredBlock` from Semantics/openmp-utils.* to Parser/openmp-utils.*. These two only depend on the AST contents and properties.

pull bot locked and limited conversation to collaborators Nov 18, 2025

pull bot added the ⤵️ pull label Nov 18, 2025

pull bot merged commit bd8c941 into optimizecompile:main Nov 18, 2025


36 participants
