Skip to content

Conversation

@pull
Copy link

@pull pull bot commented Oct 29, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

clementval and others added 30 commits October 29, 2025 17:55
… in the AffineForEmptyLoopFolder (#164064)

Co-authored-by: Jakub Kuderski <[email protected]>
…/TESTP node just uses the ZF flag (#165601)

If we're just comparing against zero then move the constant to the RHS to reduce duplicated folds.

Noticed while triaging #156233
…65453)

Modify the python wrapper to return uint32_t,
which prevents incorrect child name-to-index mapping and avoids
performing redundant operations on non-existent SBValues.
This follows similar reasoning as 45ce887
(#159556):

LV does not preserve LCSSA, it constructs it just before processing a
loop to vectorize. Runtime check expressions are invariant to that loop,
so expanding them should not break LCSSA form for the loop we are about
to vectorize.

LV creates SCEV and memory runtime checks early on and then disconnects
the blocks temporarily. The patch fixes a mis-compile, where previously
LCSSA construction during SCEV expand may replace uses in currently
unreachable SCEV/memory check blocks.

Fixes #162512

PR: #165505
… operand in the AffineForEmptyLoopFolder" (#165607)

Reverts #164064

Broke Windows on mlir-s390x-linux buildbot build, needs investigations.
Previously, `hoistCommonCodeFromSuccessors` returned early if one of the
succ of BB has >1 predecessors.

However, if the succ is an unreachable BB, we can relax the condition to
perform `hoistCommonCodeFromSuccessors` based on the assumption of not
reaching UB.

See discussion dtcxzyw/llvm-opt-benchmark#2989
for details.

Alive2 proof: https://alive2.llvm.org/ce/z/OJOw0s
Promising optimization impact:
dtcxzyw/llvm-opt-benchmark#2995
After the default PDB plugin changed to the native one (#165363), this
test failed, because it uses the size of public symbols and the native
plugin sets the size to 0 (as PDB doesn't include this information
explicitly). A PDB was built because the final executable in that test
was linked with `-gdwarf`.
…nsert instruction

If the gather/buildvector node has the match and this matching node has
a scheduled copyable parent, and the parent node of the original node
has a last instruction, which is non-schedulable and is part of the
schedule copyable parent, such matching node should be excluded as
non-matching, since it produces wrong def-use chain.

Fixes #165435
Attaching using `core`, `gdbremote` or `attachInfo` may have an error.
fail early if it does.
This pr adds the equivalent validation of `llvm.loop` metadata that is
[done in
DXC](https://github.com/microsoft/DirectXShaderCompiler/blob/8f21027f2ad5dcfa63a275cbd278691f2c8fad33/lib/DxilValidation/DxilValidation.cpp#L3010).

This is done as follows:
- Add `llvm.loop` to the metadata allow-list in `DXILTranslateMetadata`
- Iterate through all `llvm.loop` metadata nodes and strip all
incompatible ones
- Raise an error for ill-formed nodes that are compatible with DXIL

Resolves: #137387
... which silently caused the wrong overload to be selected.
To my knowledge, NetBSD is mostly like other BSDs, but doesn't have
`xlocale.h`. I think c664a7f may have inadvertently broken this.

With this change, I was able to run
[zig-bootstrap](https://github.com/ziglang/zig-bootstrap) to completion
for `x86_64-netbsd10.1-none`.
…5611)

When we create a `SparseIterator`, we sometimes wrap it in a
`FilterIterator`, which delegates _some_ calls to the underlying
`SparseIterator`.

After construction, e.g. in `makeNonEmptySubSectIterator()`, we call
`setSparseEmitStrategy()`. This sets the strategy only in one of the
filters -- if we call `setSparseEmitStrategy()` immediately after
creating the `SparseIterator`, then the wrapped `SparseIterator` will
have the right strategy, and the `FilterIterator` strategy will be
unintialized; if we call `setSparseEmitStrategy()` after wrapping the
iterator in `FilterIterator`, then the opposite happens.

If we make `setSparseEmitStrategy()` a virtual method so that it's
included in the `FilterIterator` pattern, and then do all reads of
`emitStrategy` via a virtual method as well, it's pretty simple to
ensure that the value of `strategy` is being set consistently and
correctly.

Without this, the UB of strategy being uninitialized manifests as a
sporadic test failure in
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_strided_conv_2d_nhwc_hwcf.mlir,
when run downstream with the right flags (e.g. asan + assertions off).
The test sometimes fails with `ne_sub<trivial<dense[0,1]>>.begin' op
created with unregistered dialect`. It can also be directly observed w/
msan that this uninitialized read is the cause of that issue, but msan
causes other problems w/ this test.
This pr introduces an allow-list for module metadata, this encompasses
the llvm metadata nodes: `llvm.ident` and `llvm.module.flags`, as well
as, the generated `dx.` options.

Resolves: #164473.
…165496)

We currently use a background thread to read the DAP output. This means
the test thread and the background thread can race at times and we may
have inconsistent timing due to these races.

To improve the consistency I've removed the reader thread and instead
switched to using the `selectors` module that wraps `select` in a
platform independent way.
And reduce the number of getLLVMStyleWithColumnLimit calls.
Fix getShadowAddress computation by adding ShadowBase if it is not zero.

Co-authored-by: anoopkg6 <[email protected]>
This consists of marking the various strict opcodes as legal, and
adjusting instruction selection patterns so that 'op' is 'any_op'. The
changes are similar to those in D114946 for AArch64.

Custom lowering and promotion are set for some FP16 strict ops to work
correctly.

This PR is part of the work on adding strict FP support in ARM, which
was previously discussed in #137101.
The CRC optimization relies on stripping the auxiliary data completely,
and should hence be forbidden when it has a user in the exit-block.
Forbid this case, fixing a miscompile.

Fixes #165382.
CreateProcess fails with ERROR_INVALID_PARAMETER when duplicate HANDLEs
are passed via `PROC_THREAD_ATTRIBUTE_HANDLE_LIST`. This can happen, for
example, if stdout and stdin are the same device (e.g. a bidirectional
named pipe), or if stdout and stderr are the same device.

Fixes msys2/MINGW-packages#26030
…5628)

Extends OpenACCSupport utilities to include recipe name generation and
error reporting for unsupported features, providing foundation for
variable privatization handling.

Changes:
- Add RecipeKind enum (private, firstprivate, reduction) for APIs that
request a specific kind of recipe
- Add getRecipeName() API to OpenACCSupport and OpenACCUtils that
generates recipe names from types (e.g.,
"privatization_memref_5x10xf32_")
- Add emitNYI() API to OpenACCSupport for graceful handling of
not-yet-implemented cases
- Generalize MemRefPointerLikeModel template to support
UnrankedMemRefType
- Add unit tests and integration tests for new APIs
Identified with readability-duplicate-include.
@pull pull bot locked and limited conversation to collaborators Oct 29, 2025
@pull pull bot added the ⤵️ pull label Oct 29, 2025
@pull pull bot merged commit 03e66ae into optimizecompile:main Oct 29, 2025
10 checks passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.