@pull pull bot commented Oct 20, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

erichkeane and others added 30 commits October 20, 2025 18:02
These two are lowered as if they were the expressions
`LHS = (LHS < RHS) ? RHS : LHS;`
and
`LHS = (LHS < RHS) ? LHS : RHS;`

This patch generates these expressions and ensures they are properly
emitted into IR.

Note: this is dependent on
#163580
and cannot be merged until that one is (or the tests will fail).
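
As a rough illustration of the two expression forms quoted in this commit message, the standalone C++ sketch below applies each one to plain values. The helper names are hypothetical, and nothing here reflects the IR the patch actually emits; it only shows that the first form keeps the larger operand and the second keeps the smaller one.

```cpp
// Hypothetical helpers; the names are illustrative and do not come from the
// patch.

// First form: LHS = (LHS < RHS) ? RHS : LHS;  (keeps the larger value)
template <typename T> void combineKeepLarger(T &LHS, const T &RHS) {
  LHS = (LHS < RHS) ? RHS : LHS;
}

// Second form: LHS = (LHS < RHS) ? LHS : RHS;  (keeps the smaller value)
template <typename T> void combineKeepSmaller(T &LHS, const T &RHS) {
  LHS = (LHS < RHS) ? LHS : RHS;
}
```

Both helpers only need `operator<` on the operand type, which matches the shape of the quoted expressions.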
Replace with PatGprShiftMaskXLen/PatGprShiftMask32, or use the
ShiftMaskXLen/ShiftMask32 ComplexPattern directly in patterns.

This avoids various casts that were needed to make a ComplexPattern work
inside of a PatFrag.
A variant part, represented by `DW_TAG_variant_part`, is a structure with a
discriminant and several variants, of which only one can be active and
valid at a time. The discriminant is the main difference between variant
parts and unions, which are represented by `DW_TAG_union_type`.

Variant parts are used by Rust enums, which look like:

```rust
pub enum MyEnum {
    First { a: u32, b: i32 },
    Second(u32),
}
```

This type's debug info is the following `DICompositeType` with
`DW_TAG_structure_type` tag:

```llvm
!4 = !DICompositeType(tag: DW_TAG_structure_type, name: "MyEnum",
     scope: !2, file: !5, size: 96, align: 32, flags: DIFlagPublic,
     elements: !6, templateParams: !16,
     identifier: "faba668fd9f71e9b7cf3b9ac5e8b93cb")
```

With one element being also a `DICompositeType`, but with
`DW_TAG_variant_part` tag:

```llvm
!6 = !{!7}
!7 = !DICompositeType(tag: DW_TAG_variant_part, scope: !4, file: !5,
     size: 96, align: 32, elements: !8, templateParams: !16,
     identifier: "e4aee046fc86d111657622fdcb8c42f7", discriminator: !21)
```

Which has a discriminator:

```llvm
!21 = !DIDerivedType(tag: DW_TAG_member, scope: !4, file: !5,
      baseType: !13, size: 32, align: 32, flags: DIFlagArtificial)
```

Which then holds different variants as `DIDerivedType` elements with
`DW_TAG_member` tag:

```llvm
!8 = !{!9, !17}
!9 = !DIDerivedType(tag: DW_TAG_member, name: "First", scope: !7,
     file: !5, baseType: !10, size: 96, align: 32, extraData: i32 0)
!10 = !DICompositeType(tag: DW_TAG_structure_type, name: "First",
      scope: !4, file: !5, size: 96, align: 32, flags: DIFlagPublic,
      elements: !11, templateParams: !16,
      identifier: "cc7748c842e275452db4205b190c8ff7")
!11 = !{!12, !14}
!12 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !10,
      file: !5, baseType: !13, size: 32, align: 32, offset: 32,
      flags: DIFlagPublic)
!13 = !DIBasicType(name: "u32", size: 32, encoding: DW_ATE_unsigned)
!14 = !DIDerivedType(tag: DW_TAG_member, name: "b", scope: !10,
      file: !5, baseType: !15, size: 32, align: 32, offset: 64,
      flags: DIFlagPublic)
!15 = !DIBasicType(name: "i32", size: 32, encoding: DW_ATE_signed)
!16 = !{}
!17 = !DIDerivedType(tag: DW_TAG_member, name: "Second", scope: !7,
      file: !5, baseType: !18, size: 96, align: 32, extraData: i32 1)
!18 = !DICompositeType(tag: DW_TAG_structure_type, name: "Second",
      scope: !4, file: !5, size: 96, align: 32, flags: DIFlagPublic,
      elements: !19, templateParams: !16,
      identifier: "a2094b1381f3082d504fbd0903aa7c06")
!19 = !{!20}
!20 = !DIDerivedType(tag: DW_TAG_member, name: "__0", scope: !18,
      file: !5, baseType: !13, size: 32, align: 32, offset: 32,
      flags: DIFlagPublic)
```

The BPF backend assumed that all elements of any `DICompositeType`
have tag `DW_TAG_member` and are instances of `DIDerivedType`. However,
the single element of the outer composite type `!4` has tag
`DW_TAG_variant_part` and is an instance of `DICompositeType`. The
unconditional call of `cast<DIDerivedType>` on all elements caused
an assertion failure whenever Rust code with enums was compiled for the
BPF target.

Fix that by:

* Handling `DW_TAG_variant_part` in `visitStructType`.
* Replacing the unconditional call of `cast<DIDerivedType>` over
`DICompositeType` elements with a `switch` statement that handles both
`DW_TAG_member` and `DW_TAG_variant_part` and casts the element to the
appropriate type (`DIDerivedType` or `DICompositeType`).

Fixes: #155778
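
The dispatch-on-tag idea behind this fix can be sketched without LLVM at all. The types below (`Tag`, `DebugNode`, `collectMembers`) are hypothetical stand-ins rather than the real `DICompositeType`/`DIDerivedType` classes or the actual `visitStructType` code; the sketch only shows a visitor that switches on each element's tag and descends into variant parts instead of assuming every element is a plain member.

```cpp
#include <string>
#include <vector>

// Hypothetical mock of debug-info nodes; not the LLVM classes.
enum class Tag { Structure, Member, VariantPart };

struct DebugNode {
  Tag Kind;
  std::string Name;
  std::vector<DebugNode> Elements; // only meaningful for composite nodes
};

// Collect member names, handling each element according to its tag.
void collectMembers(const DebugNode &Composite,
                    std::vector<std::string> &Out) {
  for (const DebugNode &Elem : Composite.Elements) {
    switch (Elem.Kind) {
    case Tag::Member:
      Out.push_back(Elem.Name); // plain member: record it
      break;
    case Tag::VariantPart:
      collectMembers(Elem, Out); // composite element: visit its variants
      break;
    default:
      break; // other tags are ignored in this sketch
    }
  }
}
```

With a layout like the `MyEnum` example, the outer structure has one variant-part element, and the visitor reaches the `First` and `Second` members through it.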
Add `try_lock` to conform to Lockable, which is necessary to use it with
`std::scoped_lock`.
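
To illustrate why `try_lock` matters here: when `std::scoped_lock` is given more than one mutex, it acquires them through `std::lock`, which calls `try_lock` on its arguments. The sketch below uses a hypothetical `SpinLockSketch` class (the class this commit modifies is not named in the text above) and assumes C++17.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical lock type used only for illustration.
class SpinLockSketch {
  std::atomic<bool> Locked{false};

public:
  void lock() {
    // Spin until the previous value was "unlocked".
    while (Locked.exchange(true, std::memory_order_acquire)) {
    }
  }
  // Required by the Lockable requirements: acquire without blocking.
  bool try_lock() { return !Locked.exchange(true, std::memory_order_acquire); }
  void unlock() { Locked.store(false, std::memory_order_release); }
};

// Locks both mutexes and reports whether A is held inside the scope.
inline bool heldInsideScope(SpinLockSketch &A, SpinLockSketch &B) {
  std::scoped_lock Guard(A, B); // would not compile without try_lock
  return !A.try_lock();         // A is already held, so try_lock fails
}
```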
Having taken on a maintainer role for these dialects, make it official
with a CODEOWNERS entry.

---------

Co-authored-by: Jakub Kuderski <[email protected]>
Suggest the `initializer_list` overload instead.

4+ args is an arbitrary number that allows for incremental deprecation
without having to update too many call sites.

For more context, see #163117.
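
A minimal sketch of this kind of incremental deprecation follows. The `matchesAny` function is hypothetical (the function the commit actually deprecates is not named in the text above); the sketch only shows a fixed-arity overload marked `[[deprecated]]` while the `initializer_list` overload remains the suggested replacement.

```cpp
#include <initializer_list>

// Preferred form: any number of cases through an initializer_list.
inline bool matchesAny(int V, std::initializer_list<int> Cases) {
  for (int C : Cases)
    if (C == V)
      return true;
  return false;
}

// Small fixed-arity overloads stay available in this sketch.
inline bool matchesAny(int V, int A, int B) { return matchesAny(V, {A, B}); }

// Overloads with 4+ arguments are deprecated first, so only a limited
// number of call sites need updating at once.
[[deprecated("Pass the cases as an initializer_list instead")]]
inline bool matchesAny(int V, int A, int B, int C, int D) {
  return matchesAny(V, {A, B, C, D});
}
```

A call such as `matchesAny(X, {1, 2, 3, 4})` compiles without a warning, while `matchesAny(X, 1, 2, 3, 4)` triggers the deprecation message.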
This patch implements llvm::countr_zero_constexpr, a constexpr version
of llvm::countr_zero, in terms of llvm::popcount while making
llvm::popcount a constexpr function at the same time.

The new function is intended to serve as a marker.  When we switch to
C++20, we will most likely go through functions in llvm/ADT/bit.h and
replace them with their counterparts from <bit>.  With
llvm::countr_zero_constexpr, we can easily replace its use with
std::countr_zero.

This patch reimplements ConstantLog2 in terms of the new function.
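
One way to express a trailing-zero count in terms of a population count is sketched below. This is an assumption about the general technique, not a copy of the LLVM implementation: the function names are hypothetical, the sketch is fixed to 32-bit values, and it requires C++14 or later for the loop inside a `constexpr` function.

```cpp
#include <cstdint>

// Hypothetical constexpr popcount: clear the lowest set bit until none remain.
constexpr int popcountSketch(std::uint32_t V) {
  int Count = 0;
  while (V != 0) {
    V &= V - 1;
    ++Count;
  }
  return Count;
}

// ~V & (V - 1) has ones exactly in the bit positions below the lowest set
// bit of V, so its popcount equals the number of trailing zeros. For V == 0
// the mask is all ones and the result is the bit width (32).
constexpr int countrZeroSketch(std::uint32_t V) {
  return popcountSketch(~V & (V - 1));
}

static_assert(countrZeroSketch(8) == 3, "8 is 0b1000");
static_assert(countrZeroSketch(0) == 32, "zero yields the bit width");
```

For a power of two, the trailing-zero count is also its base-2 logarithm, which is presumably what makes such a function a natural building block for a compile-time log2.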
This rewrite does not preserve numerics: for example, we'd expect the
maximum fp value to yield Inf instead of identity.

`GL.Length` does not allow for fast math flags, so we need to remove
this. Special cases (constants) can be handled via a folder if someone
wants to implement one.
These two files were left behind during the upstreaming of the
corresponding feature.
Add builders on the Python side that match the builders on the C++ side, add tests for launching GPU kernels and regions, and correct some small documentation mistakes. This reflects the API decisions already made in the func dialect's Python bindings and makes the GPU dialect's bindings work more like the C++ interface.
…ns (#163863)

Before the patch, the added test case would indent the function, moving
its second line beyond the column limit.

Fixes #68122.
As the Cygwin platform requires $PATH to be set in order to run
unittests, do the same as for the regular Windows target.
…164039)

Two of the tests are currently asserting, and two are emitting
unexpected results.

The asserting tests will be fixed using the ATTACH-style codegen from
#153683.

The other two involve `use_device_addr` on byrefs and need more
follow-up codegen changes, which are noted in a FIXME comment.
…ion variable (#164147)

`@SHLIBDIR@` is replaced by CMake's configuration function, so it must
be in `lit.site.cfg.py.in` but not `lit.cfg.py`. `lit.cfg.py` must
reference variables in generated `lit.site.cfg.py`.

We didn't notice this problem because it only affects Windows builds
(including MinGW and Cygwin) that are configured with either
LLVM_LINK_LLVM_DYLIB=ON or BUILD_SHARED=ON.
Add OnDiskGraphDB and OnDiskKeyValueDB, which can be used to implement
ObjectStore and ActionCache respectively. These are on-disk persistent
stores that build upon OnDiskTrieHashMap and implement key functions
required by the LLVMCAS interfaces.

This abstraction layer defines how the objects are hashed and stored on
disk. OnDiskKeyValueDB is a basic OnDiskTrieHashMap while OnDiskGraphDB
also defines:
* How objects of various sizes are stored on disk and referenced by
  the trie nodes.
* How to store the references from one stored object to another object
  that it references.

In addition to the basic APIs for ObjectStore and ActionCache, other
advanced database configuration features can be implemented in this
layer without exposing them to users of the LLVMCAS interface. For
example, OnDiskGraphDB has a fault-in function that fetches data from an
upstream OnDiskGraphDB if the data is missing.
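
The fault-in behavior mentioned in this commit message can be sketched with an in-memory map. `GraphDBSketch` is a hypothetical class and bears no relation to the real OnDiskGraphDB API or its on-disk format; the sketch only shows a lookup that falls back to an upstream store on a miss and keeps a local copy of what it fetched.

```cpp
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical in-memory stand-in for a store with an optional upstream.
class GraphDBSketch {
  std::unordered_map<std::string, std::string> Local;
  GraphDBSketch *Upstream; // may be null

public:
  explicit GraphDBSketch(GraphDBSketch *Upstream = nullptr)
      : Upstream(Upstream) {}

  void store(const std::string &Key, std::string Data) {
    Local[Key] = std::move(Data);
  }

  bool containsLocally(const std::string &Key) const {
    return Local.count(Key) != 0;
  }

  // Look up locally first; on a miss, fault the data in from upstream.
  std::optional<std::string> load(const std::string &Key) {
    auto It = Local.find(Key);
    if (It != Local.end())
      return It->second;
    if (!Upstream)
      return std::nullopt;
    std::optional<std::string> Data = Upstream->load(Key);
    if (Data)
      Local[Key] = *Data; // keep a local copy for later lookups
    return Data;
  }
};
```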
Move the parse tree utility function
semantics::getDesignatorNameIfDataRef to Parser/tools.h and rename it to
comply with the local style.
This fixes a build error when building tensorflow on riscv64 linux.
This variable is only read from.
When the user sends a `thread return <expr>` command, the stack length
changes but the UI does not update.
Send a stack invalidated event to the client so that it updates the stack.
Co-authored-by: Rahul Utkoor <[email protected]>
Co-authored-by: Brendon Cahoon <[email protected]>
Co-authored-by: abhikran <[email protected]>
Co-authored-by: Sumanth Gundapaneni <[email protected]>
Co-authored-by: Ikhlas Ajbar <[email protected]>
Co-authored-by: Anirudh Sundar <[email protected]>
Co-authored-by: Yashas Andaluri <[email protected]>
Co-authored-by: quic-santdas <[email protected]>
That led us to overwrite the data of the last row with the geomean.
The mock was not accurate, absl defines in_place[_t] as an alias to
std::in_place[_t].
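
A mock that follows the aliasing described above might look like the sketch below. The `absl_mock` namespace and `acceptsStdTag` are hypothetical names chosen for illustration; the point is only that, with alias-style declarations, the mocked tag is the same type as `std::in_place_t`. C++17 is assumed.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical mock namespace: re-export the std tag instead of defining a
// distinct type.
namespace absl_mock {
using std::in_place;
using std::in_place_t;
} // namespace absl_mock

static_assert(std::is_same_v<absl_mock::in_place_t, std::in_place_t>,
              "the mock tag type is the std tag type");

// A function written against the std tag accepts the mocked tag unchanged.
constexpr bool acceptsStdTag(std::in_place_t) { return true; }
```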
…164317)

Bypass the declare op because it is rewritten in CUFOpConversion and
will only provide the device address. c_loc is expected to have the host
address of a device address for use in APIs like `cudaMemcpyToSymbol`,
so we need to provide the address of the op directly.
Set block size x and y to 1024 if the given value is higher. Set block z
to 64 if the given value is higher.
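
The clamping rule stated above is simple enough to sketch directly. `BlockDimSketch` and `clampBlockDim` are hypothetical names, not the code touched by the commit; the limits (1024 for x and y, 64 for z) are the ones given in the message.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical container for the three block dimensions.
struct BlockDimSketch {
  std::uint32_t X, Y, Z;
};

// Cap x and y at 1024 and z at 64; smaller values pass through unchanged.
constexpr BlockDimSketch clampBlockDim(BlockDimSketch D) {
  return {std::min<std::uint32_t>(D.X, 1024),
          std::min<std::uint32_t>(D.Y, 1024),
          std::min<std::uint32_t>(D.Z, 64)};
}
```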
TruncInst must truncate at most to their destination. Return false if
MinBWs contains a destination size > the trunc result type size.

Fixes #162688.
target("aarch64.svcount") is not properly supported by MSan, and will
lead to a crash:
```
fatal error: error in backend: Cannot implicitly convert a scalable size to a fixed-width size in `TypeSize::operator ScalarTy()`
```

This commit adds two test cases: a full test case for tracking any
future improvements to the instrumentation (and also showing the crash),
and a manually reduced test case to show the crash.

Forked from llvm/test/CodeGen/AArch64/sme-aarch64-svcount.ll
andykaylor and others added 8 commits October 20, 2025 15:05
This upstreams the implementation for handling binary assignment
involving aggregate types.
#164319)

This way, testing with --debug flag can correctly specify that it
requires assertions.

This is a fix for #164098
If the parent node is non-schedulable and it includes several copies of
the same instruction, its operands might be replaced by the copyable
nodes in multiple children nodes, and if the instruction is commutative,
they can be used in different operands. The compiler shall consider this
opportunity, taking into account that non-copyable children are
scheduled only once for the same parent instruction.

Fixes #164242
…#163027)

WaitingOnGraph tracks waiting-on relationships between nodes (intended
to represent symbols in an ORC program) in order to identify nodes that
are *Ready* (i.e. are not waiting on any other nodes) or have *Failed*
(are waiting on some node that cannot be produced).

WaitingOnGraph replaces ORC's baked-in data structures that were
tracking the same information (EmissionDepUnit, EmissionDepUnitInfo,
...). Isolating this information in a separate data structure simplifies
the code, allows us to unit test it, and simplifies performance testing.

The WaitingOnGraph uses several techniques to improve performance
relative to the old data structures, including symbol coalescing
("SuperNodes") and symbol keys that don't perform unnecessary reference
counting (NonOwningSymbolStringPtr).

This commit includes unit tests for common dependence-tracking issues
that have led to ORC bugs in the past.
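
The Ready/Failed classification described above can be illustrated with a much simpler structure than the real one. `WaitingSketch` below is a hypothetical toy: it has no SuperNodes, no symbol string pool, and is not the WaitingOnGraph API. It only shows how readiness and failure might propagate through waiting-on edges.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy graph: each node maps to the nodes it still waits on.
class WaitingSketch {
  std::map<std::string, std::set<std::string>> WaitingOn;

  // Worklist propagation: any waiting node for which Affected returns true
  // is removed, reported, and then propagated in turn.
  template <typename Pred>
  std::vector<std::string> propagate(const std::string &Start, Pred Affected) {
    std::vector<std::string> Result;
    std::vector<std::string> Work{Start};
    while (!Work.empty()) {
      std::string Cur = Work.back();
      Work.pop_back();
      for (auto It = WaitingOn.begin(); It != WaitingOn.end();) {
        if (Affected(It->second, Cur)) {
          Result.push_back(It->first);
          Work.push_back(It->first);
          It = WaitingOn.erase(It);
        } else {
          ++It;
        }
      }
    }
    return Result;
  }

public:
  void add(const std::string &Node, std::set<std::string> Deps) {
    WaitingOn[Node] = std::move(Deps);
  }

  // Node became ready: waiters drop it and become ready once their
  // waiting set is empty.
  std::vector<std::string> markReady(const std::string &Node) {
    return propagate(
        Node, [](std::set<std::string> &Deps, const std::string &Done) {
          return Deps.erase(Done) != 0 && Deps.empty();
        });
  }

  // Node cannot be produced: everything waiting on it fails transitively.
  std::vector<std::string> markFailed(const std::string &Node) {
    return propagate(
        Node, [](std::set<std::string> &Deps, const std::string &Bad) {
          return Deps.count(Bad) != 0;
        });
  }
};
```

Each call scans every waiting node, which is fine for a toy; the sketch makes no attempt at the performance techniques the commit message describes.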
StopInfoBreakpoint keeps a BreakpointLocationCollection for all the
breakpoint locations at the BreakpointSite that was hit. It also
lives for as long as a given thread is stopped, so there are plenty of
opportunities for one of the owning breakpoints to get deleted.

But BreakpointLocations don't keep their owner Breakpoints alive, so if
the BreakpointLocationCollection outlives the point where some code gets
a chance to delete an owner breakpoint, and you then ask that location
for some breakpoint information, it will access freed memory.

This wasn't a problem before PR #158128 because the StopInfoBreakpoint
just kept the BreakpointSite that was hit, and when you asked it
questions, it relooked up that list. That was not great, however,
because if you hit breakpoints 5 & 6, deleted 5 and then asked which
breakpoints got hit, you would just get 6. For that and other reasons,
that PR changed to storing a BreakpointLocationCollection of the
breakpoints that were hit. That's better from a UI perspective but
caused this potential problem.

I fix it by adding a variant of the BreakpointLocationCollection that
also holds onto a shared pointer to the Breakpoints that own the
locations that were hit, thus keeping them alive till the
StopInfoBreakpoint goes away.

This fixed the ASAN assertion. I also added a test that works harder to
cause trouble by deleting breakpoints during a stop.
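
The keep-alive idea in this commit can be sketched with standard smart pointers. All type names below are hypothetical and far simpler than the LLDB classes; the sketch only shows a collection that stores a shared pointer to each owner next to the non-owning location, so the owner outlives an external deletion.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-ins for a breakpoint and one of its locations.
struct BreakpointSketch {
  int ID;
};
struct LocationSketch {
  BreakpointSketch *Owner; // non-owning back reference
};

// Collection variant that also holds shared ownership of each owner.
class PreservingCollectionSketch {
  std::vector<LocationSketch> Locations;
  std::vector<std::shared_ptr<BreakpointSketch>> KeepAlive;

public:
  void add(const std::shared_ptr<BreakpointSketch> &BP) {
    Locations.push_back(LocationSketch{BP.get()});
    KeepAlive.push_back(BP); // owner stays alive as long as this collection
  }

  int ownerID(std::size_t I) const { return Locations[I].Owner->ID; }
};
```

Without the `KeepAlive` vector, resetting the last external `shared_ptr` would destroy the breakpoint and leave `Owner` dangling, which is the kind of use-after-free the message describes.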
@pull pull bot locked and limited conversation to collaborators Oct 20, 2025
@pull pull bot added the ⤵️ pull label Oct 20, 2025
@pull pull bot merged commit c9124a1 into optimizecompile:main Oct 20, 2025