forked from llvm/llvm-project
merge main into amd-staging #354
Merged
ronlieb
merged 47 commits into
amd-staging
from
amd/merge/upstream_merge_20251022171705
Oct 23, 2025
See https://discourse.llvm.org/t/psa-opty-create-now-with-100-more-tab-complete/87339. I plan to mark these as deprecated in llvm#164649.
…utions (llvm#164515) Some operations, like `gpu.func`, have arguments that need to stay in place while the signature is rewritten. This is the case for the workgroup and private attributions. Update the target rewrite pass to be aware of that when adding an argument at the end of the function signature. If any trailing arguments are present, the new argument is inserted just before them.
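The insertion rule described above can be sketched on a plain list of argument names. This is only an illustration: `insertBeforeTrailing` and the argument names are hypothetical and are not part of the MLIR API or of the pass itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not the MLIR API): add `newArg` at the end of the
// regular arguments, but before the `numTrailing` arguments that must stay
// last (for gpu.func, the workgroup and private attributions).
std::vector<std::string> insertBeforeTrailing(std::vector<std::string> args,
                                              const std::string &newArg,
                                              std::size_t numTrailing) {
  assert(numTrailing <= args.size() && "more trailing arguments than arguments");
  // Position of the first trailing argument. With no trailing arguments this
  // is args.end(), so the insertion degenerates to a plain append.
  auto pos = args.end() - static_cast<std::ptrdiff_t>(numTrailing);
  args.insert(pos, newArg);
  return args;
}
```

With two trailing attributions, calling `insertBeforeTrailing({"a", "b", "wg0", "priv0"}, "new", 2)` places `new` between `b` and `wg0`.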
Since llvm#158381 the `CompilerInstance` is aware of the VFS and co-owns it. To reduce scope of that PR, the VFS was being inherited from the `FileManager` during `setFileManager()` if it wasn't configured before. However, the implementation of that setter was buggy. This PR fixes the bug, and moves us closer to the long-term goal of `CompilerInstance` requiring the VFS to be configured explicitly and owned by the instance.
…get-tasks (llvm#155348) This PR adds support for translation of the private clause on deferred target tasks - that is `omp.target` operations with the `nowait` clause.
An offloading call for a deferred target-task is not blocking - the offloading (target-generating) host task continues its execution after issuing the offloading call. Therefore, the key problem we need to solve is to ensure that the data needed for private variables to be initialized in the target task persists even after the host task has completed. We do this in a new pass called `PrepareForOMPOffloadPrivatizationPass`.
For a privatized variable that needs its host counterpart for initialization (such as the shape of the data from the descriptor when an allocatable is privatized or the value of the data when an allocatable is firstprivatized),
- the pass allocates memory on the heap.
- it then initializes this memory by using the `init` and `copy` (for firstprivate) regions of the corresponding `omp::PrivateClauseOp`.
- Finally the memory allocated on the heap is freed using the `dealloc` region of the same `omp::PrivateClauseOp` instance. This step is not straightforward though, because we cannot simply free the memory that's going to be used by another thread without any synchronization. So, for deallocation, we create a `omp.task` after the `omp.target` and synchronize the two with a dummy dependency (using the `depend` clause). In this newly created `omp.task` we do the deallocation.
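The lifetime problem and the allocate, initialize, deallocate sequence can be sketched in plain C++. Everything here is an assumption made for illustration: `TaskQueue`, `hostTask`, and `runAll` are hypothetical, the queue is single-threaded, and its FIFO order merely stands in for the ordering that the pass obtains from the dummy `depend` dependency. The real pass operates on MLIR operations, not on C++ closures.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical single-threaded task queue standing in for the OpenMP runtime.
// Tasks run in FIFO order, which models "dealloc task runs after target task".
using TaskQueue = std::queue<std::function<void()>>;

// Models the host task: it enqueues work and returns before any of it runs.
void hostTask(TaskQueue &tasks, int &result) {
  std::vector<int> hostValue = {1, 2, 3}; // destroyed when hostTask returns

  // "init" + "copy": the private copy is heap-allocated so that it outlives
  // this stack frame.
  auto *priv = new std::vector<int>(hostValue);

  // Deferred "target task": reads only the heap copy, never hostValue.
  tasks.push([priv, &result] {
    result = 0;
    for (int v : *priv)
      result += v;
  });

  // Separate "dealloc task", ordered after the target task.
  tasks.push([priv] { delete priv; });
}

// Drains the queue and returns the number of tasks that were executed.
int runAll(TaskQueue &tasks) {
  int executed = 0;
  while (!tasks.empty()) {
    tasks.front()();
    tasks.pop();
    ++executed;
  }
  return executed;
}
```

Calling `hostTask` and only afterwards `runAll` shows the point of the design: the target task still sees valid data after the host frame is gone, and the memory is released only once the target task has finished.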
The comment here pointed out that RAUW would fall over given a constantexpr, but then proceeded to just do what RAUW does by hand, which falls over in the same way. Instead, convert constantexprs involving cbuffer globals to instructions before processing them. The test update just modifies the existing cbuffer test, since it implied it was trying to test this exact case anyways.
DXILResource was falling over trying to name a resource type that contained an array, such as `StructuredBuffer<float[3][2]>`. Handle this by walking through array types to gather the dimensions.
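Walking through nested array types to collect their dimensions might look like the sketch below. The `Ty` struct is a hypothetical minimal type model, not `llvm::Type`, and the resulting `float[3][2]` string is only an illustrative format, not necessarily the name that DXILResource produces.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal type model (not llvm::Type): a scalar has a name and
// no element type; an array has an element type and an element count.
struct Ty {
  std::string name;  // meaningful for scalars only
  const Ty *elem;    // non-null for arrays
  std::size_t count; // number of elements, arrays only
};

// Peel array layers from the outside in, recording each dimension, then
// append the dimensions to the name of the innermost scalar type.
std::string describe(const Ty &type) {
  std::vector<std::size_t> dims;
  const Ty *cur = &type;
  while (cur->elem != nullptr) {
    dims.push_back(cur->count);
    cur = cur->elem;
  }
  std::string out = cur->name;
  for (std::size_t d : dims)
    out += "[" + std::to_string(d) + "]";
  return out;
}
```

For an array of 3 elements whose element type is an array of 2 floats, `describe` yields `float[3][2]`.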
To make the CI happy again.
Handling opcodes in embedding computation.
- Revamped MIR Vocabulary with four sections: `Opcodes`, `Common Operands`, `Physical Registers`, and `Virtual Registers`.
- Operands broadly fall into three categories: the generic MO types that are common across architectures, physical register classes, and virtual register classes. We handle these categories separately in MIR2Vec. (Though we have the same classes for both physical and virtual registers, their embeddings vary.)
…y" (llvm#164670) Reverts llvm#164048 This led to a regression in clang-format where a space gets added in between the parameter type and `&`. For example, this
```
::test_anonymous::FunctionApplication& ::test_anonymous::FunctionApplication::operator=(const ::test_anonymous::FunctionApplication& other) noexcept {
```
becomes
```
::test_anonymous::FunctionApplication& ::test_anonymous::FunctionApplication::operator=(const ::test_anonymous::FunctionApplication & other) noexcept {
```
* Add the FILE type declaration, as it should be present in `<wchar.h>` as well as in `<stdio.h>`.
* Fix the argument type in the `wcsrtombs` / `wcsnrtombs` functions - it should be a restrict pointer to `mbstate_t`. Add the restrict qualifier to the internal implementation as well.
This brings us closer to being able to build libcxx with wide-character support against llvm-libc headers.
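For context, this is how the standard `wcsrtombs` is typically called, with the conversion state passed as a pointer to `mbstate_t` in the last parameter. The sketch uses the C++ `<cwchar>` binding (C++ has no `restrict` keyword, so the qualifier only appears in the C prototype); the `narrow` helper and the 64-byte buffer are assumptions made for the example and have nothing to do with the llvm-libc implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical helper: convert a wide string to a multibyte string using the
// current C locale. Input that does not fit into the buffer is truncated.
std::string narrow(const wchar_t *wide) {
  std::mbstate_t state = std::mbstate_t(); // initial conversion state
  const wchar_t *src = wide;               // wcsrtombs advances this pointer
  char buf[64];
  // Last parameter: pointer to the conversion state (mbstate_t).
  std::size_t n = std::wcsrtombs(buf, &src, sizeof buf, &state);
  if (n == static_cast<std::size_t>(-1))
    return std::string(); // a character could not be converted
  return std::string(buf, n); // n excludes the terminating null
}
```

Plain ASCII input such as `L"hello"` converts to the same characters in the default "C" locale.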
RFC https://discourse.llvm.org/t/rfc-bounds-checking-interfaces-for-llvm-libc/87685 Add internal support macros required by Annex K interface in LLVM libc.
See https://discourse.llvm.org/t/psa-opty-create-now-with-100-more-tab-complete/87339. I plan to make these deprecated in llvm#164649.
Relates to llvm#119281. Note: as this PR enables `-Werror` for `libc` tests, it's very likely that some downstream CIs will start failing, so this PR may need to be reverted and re-applied. P.S. I do not have merge permissions, so I will need one of the reviewers to merge it for me. Thank you!
Add a flag to the GlobalValueSummaryInfo indicating whether the associated SummaryList (all summaries with the same GUID) contains any summaries with local linkage. This flag is set when building the index, so it is associated with the original linkage type before internalization and promotion. Consumers should check the withInternalizeAndPromote() flag on the index before using it.
In most cases we expect a 1-1 mapping between a GUID and a summary with local linkage, because for locals the GUID is computed from the hash of "modulepath;name". However, there can be multiple locals with the same GUID if translation units are not compiled with enough path information. And in rare but theoretically possible cases, there can be hash collisions on the underlying MD5 computation. So to be safe when looking for local summaries, analyses currently look through all summaries in the list. These lists can be extremely long in the case of large binaries with template function defs in widely used headers (i.e. linkonce_odr).
A follow-on change will use this flag to reduce ThinLTO analysis time in WPD by 5-6% for a large target (details in PR164046, which will be reworked to use this flag).
Note that in the past we have tried to keep bits related to the GUID in the ValueInfo (which has a pointer to the associated GlobalValueSummaryInfo), via its PointerIntPair. However, we are out of bits there. This change does add a byte to every GlobalValueSummaryInfo instance, which I measured as a little under 0.90% overhead in a large target. However, it enables adding 7 bits of other per-GUID flags in the future without adding more overhead. Note that it was lower overhead to add this to the GlobalValueSummaryInfo than the ValueInfo, which tends to be copied into other maps.
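The idea of a per-list flag that lets consumers skip a long scan can be sketched as follows. This is a simplified model under stated assumptions: `Summary`, `SummaryInfo`, `hasLocal`, and `findLocal` are hypothetical names and do not match the LLVM classes or their interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical simplified model; the names do not match the LLVM classes.
struct Summary {
  bool isLocal; // true if the summary has local linkage
};

struct SummaryInfo {
  std::vector<Summary> list; // all summaries that share one GUID
  bool hasLocal = false;     // set while building, from the original linkage

  void add(const Summary &s) {
    list.push_back(s);
    hasLocal = hasLocal || s.isLocal;
  }
};

// Returns the first local summary, or nullptr. `inspected` reports how many
// entries were examined, to make the saved work visible.
const Summary *findLocal(const SummaryInfo &info, std::size_t &inspected) {
  inspected = 0;
  if (!info.hasLocal)
    return nullptr; // no local summary was ever added: skip the scan
  for (const Summary &s : info.list) {
    ++inspected;
    if (s.isLocal)
      return &s;
  }
  return nullptr;
}
```

For a list holding a thousand non-local entries, `findLocal` returns without touching a single element, which is the kind of saving the commit message describes for long linkonce_odr lists.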
…m#164678) The recently added ulimit_reset.txt section in shtest-ulimit.py was failing on some builders if the default file descriptor limit started with 50. This patch fixes that by explicitly checking that the file descriptor limit is equal to the default value.
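The failure mode can be illustrated with two ways of checking a reported limit. The scenario is an assumption: it supposes the test lowers the limit to 50, resets it, and then inspects the reported value. The function names are hypothetical and the real test is a lit test, not C++.

```cpp
#include <cassert>
#include <string>

// Fragile check (assumed shape of the old test): treat the limit as reset
// whenever the reported value does not start with "50".
bool fragileIsReset(const std::string &reported) {
  return reported.compare(0, 2, "50") != 0;
}

// Robust check: the limit is reset exactly when it equals the default value
// that was recorded before it was changed.
bool robustIsReset(const std::string &reported, const std::string &defaultLimit) {
  return reported == defaultLimit;
}
```

On a builder whose default limit is, say, 5000, the fragile check reports a failure even though the reset worked, while the equality check accepts it.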
…lback (llvm#164149) The last use of it was removed in 6163aa9.
…lvm#164649) These have been soft-deprecated since July: https://discourse.llvm.org/t/psa-opty-create-now-with-100-more-tab-complete/87339 Add a deprecation attribute to prevent new uses from creeping in.
The --source option was broken when using the --macho flag because DisassembleMachO() only initialized debug info when UseDbg was true, and would return early if no dSYM was found.
This patch provides an approximation of the memory locations touched by `llvm.matrix.column.major.load` and `llvm.matrix.column.major.store`, enabling dead store elimination and GVN to remove redundant loads and dead stores. PR: llvm#163368
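One plausible way to bound the memory touched by a strided column-major access is sketched below. This is an assumption for illustration and not necessarily the bound the patch computes: it supposes a constant stride (in elements) that is at least the number of rows, and `columnMajorFootprintBytes` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Conservative byte range covered by a column-major matrix access that starts
// at the base pointer: the last column begins (cols - 1) * stride elements
// in, and `rows` elements of it are accessed.
std::uint64_t columnMajorFootprintBytes(std::uint64_t rows, std::uint64_t cols,
                                        std::uint64_t stride,
                                        std::uint64_t elemBytes) {
  if (rows == 0 || cols == 0)
    return 0; // empty matrix touches nothing
  assert(stride >= rows && "columns are assumed not to overlap");
  return ((cols - 1) * stride + rows) * elemBytes;
}
```

For a 2x3 matrix of 4-byte elements with a stride of 4, the columns start at elements 0, 4, and 8, so elements 0 through 9 are covered, which is 40 bytes. A range like this is what allows an alias query to decide that two accesses cannot overlap.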
…m#164676) In 4368616 we accidentally moved uses of command-line args saved into a bump pointer allocator during response file expansion out of scope of the allocator. Also, the test that should have caught this (at least with asan) was not working correctly because clang-scan-deps was expanding response files itself during argument adjustment rather than the underlying scanner library. rdar://162720059
Reviewers: Pull Request: llvm#164693
We already have a matching constructor from ArrayRef, so add support for assigning from ArrayRef as well.
This testcase shows that adding a ubsan check and then removing it during the LowerAllowCheck pass does not entirely undo the effects of adding the check.
…izations. (llvm#149706)" This reverts commit 8d29d09. There have been reports of mis-compiles in llvm#149706. Revert while I investigate.
…lvm#164677) The OpenACC data clause operation `acc.copyin` used for mapping variables to device memory includes bookkeeping required by the OpenACC spec for updating present counters. However, for firstprivate variables, no counters should be updated since this clause creates a private copy on the device initialized with the original value from the host (as described in OpenACC 3.4 section 2.5.14: "the copy will be initialized with the value of that item on the local thread"). This PR introduces the `acc.firstprivate_map` operation to capture these mapping semantics without counter updates. A test is included demonstrating how this operation can be used to initialize a materialized private variable (represented by `memref.alloca` inside an `acc.parallel` region).
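The difference in bookkeeping can be sketched with a toy device environment. This is a heavily simplified model and an assumption throughout: `DeviceEnv`, `copyin`, and `firstprivateMap` are hypothetical C++ names, and the real operations are MLIR ops in the `acc` dialect with richer semantics.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy model of device-side bookkeeping.
struct DeviceEnv {
  std::map<std::string, int> presentCount; // present counter per variable
  std::map<std::string, int> deviceCopy;   // shared device copy per variable

  // Simplified copyin: create the device copy on first use and update the
  // present counter every time.
  void copyin(const std::string &var, int hostValue) {
    int &count = presentCount[var];
    if (count == 0)
      deviceCopy[var] = hostValue;
    ++count;
  }

  // Simplified firstprivate mapping: hand back the value used to initialize
  // a private copy. No counter and no shared device copy is touched.
  int firstprivateMap(int hostValue) const { return hostValue; }
};
```

After `copyin("a", 7)` the counter for `a` is 1, whereas `firstprivateMap(42)` returns 42 and leaves `presentCount` unchanged.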
Test update was missed in bfc322d due to a codegen test running loop-vectorize directly. The loop does not get vectorized any longer.
This PR refactors `ASTUnit::LoadFromASTFile()` to be easier to follow. Conceptually, it tries to read an AST file, adopt the serialized options, and set up `Sema` and `ASTContext` to deserialize the AST file contents on-demand. The implementation of this used to be spread across an `ASTReaderListener` and the function in question. Figuring out what listener method gets called when and how it's supposed to interact with the rest of the functionality was very unclear. The `FileManager`'s VFS was being swapped-out during deserialization, the options were being adopted by `Preprocessor` and others just-in-time to pass `ASTReader`'s validation checks, and the target was being initialized somewhere in between all of this. This led to very muddy semantics.
This PR splits `ASTUnit::LoadFromASTFile()` into three distinct steps:
1. Read out the options from the AST file.
2. Initialize objects from the VFS to the `ASTContext`.
3. Load the AST file and hook it up with the compiler objects.
This should be much easier to understand, and I've done my best to clearly document the remaining gotchas. (This was originally motivated by the desire to remove `FileManager::setVirtualFileSystem()` and make it impossible to swap out VFSs from underneath `FileManager` mid-compile.)
…63872) This allows comparison with these status codes.
Give the call graph section a dedicated type instead of the generic progbits type.
Collaborator
dpalermo
approved these changes
Oct 23, 2025
…vm#164323)" breaks build/test of comgr This reverts commit 866879f.
No description provided.