forked from llvm/llvm-project
[pull] main from llvm:main #602
Merged
This patch pivots GPR32 and GPR64 zeroing into distinct branches to simplify the code and improve the lowering. Zeroing GPR moves are now handled differently from non-zeroing ones. The zero source registers WZR and XZR do not require the undef, implicit and kill register annotations. The non-zeroing path no longer has to process WZR, which removes the ternary expression. This patch also moves the GPR64 logic right after GPR32 for better organization.
…4071) Add documentation for the no-rollback conversion driver. Also improve the documentation of the old rollback driver. In particular: which modifications are performed immediately and which are delayed.
Handle ptrtoaddr the same way as ptrtoint. The fold already only operates on the index/address bits.
If the main instruction in the copyables is a div-like instruction, the compiler cannot pack duplicates by extending with poisons: once vectorized, these instructions would result in undefined behavior. Fixes #164185
`UnqualPtrTy` didn't always match `llvm::PointerType::getUnqual`: sometimes it returned a pointer that is not in address space 0 (notably for SPIRV). Since `UnqualPtrTy` was used as the "generic" or "default" pointer type, this patch renames it to `DefaultPtrTy` to avoid confusion with LLVM's `PointerType::getUnqual`.
All the existing tests test code either in ConstantFolding or InstSimplify, so move them to use -passes=instsimplify instead of -passes=instcombine. This makes sure we keep InstSimplify coverage even if there are subsuming InstCombine folds. This requires writing some of the constant folding tests in a different way, as InstSimplify does not try to re-fold already existing constant expressions.
This reverts commit 1943c9e. This took out quite a few buildbots. Some of the Z3 test cases are failing and enabling this is causing some LLVM tests to begin failing.
Add parsing and semantic checks for DEVICE_SAFESYNC clause. No lowering.
This PR fixes a crash in the `bf_getbuffer` implementation of `PyDenseElementsAttribute` that occurred when an element type was not supported, such as `bf16`. I believe that supporting `bf16` is not possible with that protocol, but that is out of the scope of this PR. Previously, the code raised an `std::exception` out of `bf_getbuffer` that nanobind does not catch (see also pybind/pybind11#3336). The PR makes the function catch all `std::exception`s and manually raise a Python exception instead. Signed-off-by: Ingo Müller <[email protected]>
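The pattern described above (do not let a C++ exception escape a C-style callback, report an error state instead) can be sketched without any Python headers. Everything below is a hypothetical stand-in: `pendingError` plays the role of the interpreter's pending exception, and `fillBufferImpl` and `getBufferCallback` are illustrative names, not the functions touched by the PR.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the interpreter's "pending exception" state.
static std::string pendingError;

// Hypothetical implementation that throws for element types it cannot export.
static int fillBufferImpl(const std::string &elementType) {
  if (elementType == "bf16")
    throw std::invalid_argument("unsupported element type: " + elementType);
  return 0; // success
}

// C-style callback: catches every std::exception, records the message,
// and signals failure through the return value instead of unwinding
// into the caller.
extern "C" int getBufferCallback(const char *elementType) {
  try {
    return fillBufferImpl(elementType);
  } catch (const std::exception &e) {
    pendingError = e.what();
    return -1;
  }
}
```

In this sketch a caller checks the return value and, on `-1`, reads the recorded message rather than relying on the binding layer to translate the exception.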
Add test with urem guard with non-constant divisor and AddRec guards. Extra test coverage for #163021
OpenACC 3.4 includes the ability to add an 'if' to an atomic operation. From the change log: `Added the if clause to the atomic construct to enable conditional atomic operations based on the parallelism strategy employed` In 2.12, the C/C++ grammar is changed to say: `#pragma acc atomic [ atomic-clause ] [ if( condition ) ] new-line`, with corresponding changes to the Fortran standard. This patch adds support for this to the dialect, so that Clang can use it soon.
…es (#163972) The lowering of `!$acc loop` loops with an early exit currently ends up "duplicating" the control flow in the acc.loop and inside it as explicit control flow (as if each iteration executes each iteration until the early exit). Add a TODO for now.
Move getPreviousSCEVDivisibleByDivisor from a lambda to a static function and clarify the name (DividesBy -> DivisibleBy). Split off refactoring from #163021.
…162993) Early if conversion can create instruction sequences such as
```
mov x1, #1
csel x0, x1, x2, eq
```
which could be simplified into the following instead
```
csinc x0, x2, xzr, ne
```
One notable example that generates code like this is `cmpxchg weak`. This is fixed by handling an immediate value of 1 as `add(wzr, 1)` so that the addition can be folded into CSEL by using CSINC instead.
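To see why the two sequences agree, here is a small model of the instruction semantics in C++. It is only a sketch: `csel` and `csinc` below are simplified stand-ins that take the already-evaluated condition as a `bool`, and `before`/`after` are hypothetical names for the two sequences quoted above.

```cpp
#include <cassert>
#include <cstdint>

// Simplified models of the 64-bit instructions; `cond` is the evaluated
// condition code of the instruction.
constexpr uint64_t csel(uint64_t rn, uint64_t rm, bool cond) {
  return cond ? rn : rm;
}
constexpr uint64_t csinc(uint64_t rn, uint64_t rm, bool cond) {
  return cond ? rn : rm + 1;
}

// mov x1, #1 ; csel x0, x1, x2, eq
constexpr uint64_t before(uint64_t x2, bool eq) {
  const uint64_t x1 = 1;
  return csel(x1, x2, eq);
}

// csinc x0, x2, xzr, ne  (xzr reads as zero; `ne` is the inverse of `eq`)
constexpr uint64_t after(uint64_t x2, bool eq) {
  const uint64_t xzr = 0;
  return csinc(x2, xzr, !eq);
}

// Compare both sequences for a few inputs and both condition outcomes.
void checkEquivalence() {
  const uint64_t values[] = {0, 1, 42, UINT64_MAX};
  for (uint64_t x2 : values) {
    assert(before(x2, true) == after(x2, true));
    assert(before(x2, false) == after(x2, false));
  }
}
```

When `eq` holds, both forms produce 1 (`xzr + 1`); otherwise both produce `x2`, so the `mov` of the constant is no longer needed.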
[lookup](https://llvm.org/doxygen/classllvm_1_1DenseMapBase.html#a0b2ca98dc28c61793ff5c90d23e5f14e) does a find and returns the default if no matching element was found.
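As an illustration of the "find, else default" behaviour, the sketch below reimplements it over `std::unordered_map` so that it compiles without LLVM headers. `lookupOrDefault` is a hypothetical helper name, not an LLVM API.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Stand-in for the lookup behaviour: return the mapped value if the key is
// present, otherwise a default-constructed value. The map is not modified.
template <typename Map>
typename Map::mapped_type lookupOrDefault(const Map &map,
                                          const typename Map::key_type &key) {
  auto it = map.find(key);
  if (it == map.end())
    return typename Map::mapped_type();
  return it->second;
}
```

Unlike `operator[]`, this form never inserts a new entry for a missing key, which is why it can take the map by const reference.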
…ns (#164099) The `MLInlineAdvisor` currently skips over recursive cases, except that when we delegate to the default policy for non-cold functions, that policy could allow such inlining. The code updating internal state afterwards needs to handle that case. Fix for https://issues.chromium.org/issues/369637577#comment14
If there is a call inside a TEAMS construct, and that call contains a
DISTRIBUTE construct, the DISTRIBUTE region is considered to be enclosed
by the TEAMS region (based on the dynamic extent of the construct).
Currently, Flang diagnoses this as an error, which is incorrect.
For example:
```
subroutine f
!$omp distribute
do i = 1, 100
...
end do
end subroutine
subroutine g
!$omp teams
call f ! this call is ok, distribute enclosed by teams
!$omp end teams
end subroutine
```
This patch adjusts the nesting check for the OpenMP DISTRIBUTE
directive. It retains the error for DISTRIBUTE directives that are
incorrectly nested lexically but downgrades it to a warning for orphaned
directives to allow dynamic nesting, such as when a subroutine with
DISTRIBUTE is called from within a TEAMS region.
Co-authored-by: Chandra Ghale <[email protected]>
Also replace the undef values with function arguments.
If the type of the ParmVarDecl and the parameter type from the FunctionProtoType don't match, we're in for trouble. Just reject those functions. Fixes #163568
Created new OpenACC utilities library (MLIROpenACCUtils) containing helper functions for region analysis, value usage checking, default attribute lookup, and type categorization. Includes comprehensive unit tests and refactors existing getEnclosingComputeOp function into the new library.
Per-entry-point metrics are captured during the path-sensitive analysis time. For that reason, it is not trivial to add the syntax-only analysis time, as it runs in a separate stage. Luckily, syntax-only analysis is done before path-sensitive analysis. I use the function summary field to keep the syntax-only analysis time once syntax analysis is done, and then forward it to the per-EP metrics snapshot during the path-sensitive analysis. Note that some of the entry points that were analyzed by syntax-only rules may be missing in the CSV export if they were never analyzed by path-sensitive rules. Conversely, if a function is analyzed with path-sensitive analysis but not syntax-only analysis, its `SyntaxRunningTime` will be empty. -- CPP-7099
These tests were setting environment variables, which needs to be done explicitly with env when using the internal shell.
Per Intel Architecture Instruction Set Extensions Programming Reference rev. 59 (https://cdrdv2.intel.com/v1/dl/getContent/671368), table 1-2, DMR doesn't support USER_MSR (URDMSR and UWRMSR instructions)
This PR exposes `translate_module_to_llvmir` in the Python bindings.
This test has a loop iterating past (`61`) the array boundaries (`58`). So far this didn't seem to matter, but recently, with change #155253, the constraint elimination in swift has been able to figure this out and is transforming the loop into an infinite one like this
```
*** IR Dump After ConstraintEliminationPass on test_known_trip_count ***
define void @test_known_trip_count() local_unnamed_addr {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
  %arrayidx = getelementptr inbounds nuw double, ptr @b, i64 %indvars.iv
  %0 = load double, ptr %arrayidx, align 8
  %arrayidx2 = getelementptr inbounds nuw double, ptr @c, i64 %indvars.iv
  %1 = load double, ptr %arrayidx2, align 8
  %add = fadd double %0, %1
  %arrayidx4 = getelementptr inbounds nuw double, ptr @A, i64 %indvars.iv
  store double %add, ptr %arrayidx4, align 8
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
  br i1 false, label %exit, label %for.body

exit:                                             ; preds = %for.body
  ret void
}
```
causing the test to fail. This patch tries to address the root cause.
Previously, an invalid offset was represented as UINT64_MAX. This is not right for
DWARF32 and leads to incorrect debug info in GSYM, because the condition in the branch:
```
if (StmtSeqVal != UINT64_MAX)
  StmtSeqOffset = StmtSeqVal;
```
will always be true.
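A minimal sketch of the failure mode, under the assumption (not stated in the PR text) that the invalid marker in DWARF32 is the 32-bit all-ones value widened to 64 bits. `invalidOffsetMarker` and `validOffset` are hypothetical helpers for illustration and are not the code from the fix.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper: the all-ones marker for the offset width in use.
// Assumption: DWARF32 offsets are 32 bits wide, DWARF64 offsets are 64 bits.
constexpr uint64_t invalidOffsetMarker(bool isDwarf64) {
  return isDwarf64 ? UINT64_MAX : static_cast<uint64_t>(UINT32_MAX);
}

// Returns the offset only when it is not the invalid marker for its format.
std::optional<uint64_t> validOffset(uint64_t raw, bool isDwarf64) {
  if (raw == invalidOffsetMarker(isDwarf64))
    return std::nullopt;
  return raw;
}
```

Under that assumption, a 32-bit all-ones value stored in a `uint64_t` is `0x00000000FFFFFFFF`, which never equals `UINT64_MAX`, so a comparison against `UINT64_MAX` alone accepts it as a valid offset.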
In this PR, [commit 1](b1983d6) sets up a test that demonstrates the problem, and [commit 2](0d58ce4) fixes it.
[Diffing commit 1 and 2](0d58ce4#diff-019bdbc9922ad34fdfbcb524a9805f5af26c432540e76b87a6a5f73d9e0e853aL44)
in this PR shows how, after the PR, the symbolicated line number changed
from the function definition to the function body.
Currently when peeling the first iteration, any mentioning of UB within the loop body is replaced with the new UB in the peeled out first iteration. This introduces a bug in the following scenario: Operations inside of the loop that intentionally use the original UB are incorrectly updated.
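The scenario can be modelled with ordinary loops. This is only a sketch of the intended semantics, assuming a positive step; `runLoop` and `runPeeled` are hypothetical names, and the loop body deliberately reads the original upper bound.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Reference loop: the body intentionally uses the original upper bound `ub`.
std::vector<int> runLoop(int lb, int ub, int step) {
  std::vector<int> out;
  for (int i = lb; i < ub; i += step)
    out.push_back(ub - i);
  return out;
}

// Peeled form: the first iteration runs in its own loop with a new upper
// bound, but the body still reads the ORIGINAL `ub`. Substituting `peeledUb`
// inside the body would change the computed values.
std::vector<int> runPeeled(int lb, int ub, int step) {
  std::vector<int> out;
  const int peeledUb = std::min(lb + step, ub);
  for (int i = lb; i < peeledUb; i += step) // peeled first iteration
    out.push_back(ub - i);
  for (int i = peeledUb; i < ub; i += step) // remaining iterations
    out.push_back(ub - i);
  return out;
}
```

Only the bound that controls the peeled loop changes; uses of the bound inside the body are left alone, which keeps the peeled and original loops equivalent.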
Remove support for long unsupported Ubuntu, Debian and RHEL. Add support for RHEL 8, 9 and 10 and recognize Rocky and AlmaLinux as RHEL.
…r in SPIRVUtils (#164248) There was some repeated code that was used to deduce the SPIRV::LinkageType from a GlobalVariable/Function. At several related parts of the code we also had functions taking 2 parameters: a 'hasLinkage' bool, and a 'LinkageType'. This is error-prone since the latter parameter's meaning depends on the first. This patch also merges these two options into a single `std::optional<SPIRV::LinkageType>`.
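The difference between the two parameter shapes can be sketched as follows. The enum and both functions are hypothetical placeholders written for this example; they are not the declarations from SPIRVUtils.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for the linkage enum.
enum class LinkageType { Export, Import, LinkOnceODR };

// Old shape: `type` is only meaningful when `hasLinkage` is true, and the
// signature does not stop callers from passing an inconsistent pair.
int describeOld(bool hasLinkage, LinkageType type) {
  if (!hasLinkage)
    return -1;
  return static_cast<int>(type);
}

// New shape: presence and value travel together in one parameter.
int describeNew(std::optional<LinkageType> linkage) {
  if (!linkage)
    return -1;
  return static_cast<int>(*linkage);
}
```

With the optional, "no linkage" is spelled `std::nullopt`, so there is no second argument whose value has to be ignored.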
…lasses (#163588) Extend CS rule to use namespace qualifiers to define previously declared functions to variables and classes as well.
This adopts use of namespace qualifiers to define previously declared functions as per LLVM CS: https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions
- Fix function names to conform to LLVM CS and mark local functions static.
- Use range-based for loops to simplify code.
- Use `interleave` instead of manual loops to print lists.
Upstream support for ComplexType as a function return type. Issue #141365
This patch implements visitors for MemberExpr, UnaryDeref, StringLiteral and CompoundLiteralExpr inside aggregate expressions.
This hardens the unwinding logic and data structures on systems that support pointer authentication. The approach taken to hardening is to harden the schemas of as many high-value fields in the myriad structs as possible, and then also explicitly qualify local variables referencing privileged or security-critical values. This does introduce ABI linkage between libcxx, libcxxabi, and libunwind, but those are in principle separate from the OS itself, so we've kept the schema definitions in the library-specific headers rather than ptrauth.h.
Implement CXXDefaultArgExpr support for ComplexType. Issue #141365
Added support for ConditionalOperator, BinaryConditionalOperator and OpaqueValueExpr as lvalue. Implemented support for ternary operators with one branch being a throw expression. This required weakening the requirement that the true and false regions of the ternary operator must terminate with a `YieldOp`. Instead the true and false regions are now allowed to terminate with an `UnreachableOp` and no `YieldOp` gets emitted when the block throws.
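For reference, these are the kinds of source constructs the description refers to, written as plain C++. The function names are made up for this example, and the snippet shows input code only, not the lowering itself.

```cpp
#include <cassert>
#include <stdexcept>

// Ternary operator with one branch being a throw expression: the false
// branch never yields a value, so the result type comes from the other arm.
int checkedDivide(int num, int den) {
  return den != 0 ? num / den
                  : throw std::invalid_argument("division by zero");
}

// Conditional operator used as an lvalue: both arms are lvalues of the same
// type, so the whole expression can be bound to a reference and assigned to.
int &pickLvalue(bool first, int &a, int &b) { return first ? a : b; }
```

In the first function the throwing arm never produces a value, which is the situation where the description says a region may end in an `UnreachableOp` rather than a `YieldOp`.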
These were all removed in #160028, but I apparently missed this one instance in the documentation. Remove it given that it no longer works.
This patch adds a new script, premerge_advisor_explain.py that requests test failure explanations from the premerge advisor. For now it just prints them out to STDOUT. This allows for testing of the entire system by looking at failure explanations in failed jobs before we do the rest of the wiring to enable the premerge advisor to write out comments.
… AVX/AVX512 subvector extraction intrinsics to be used in constexpr #157712 (#162836)
**This PR supersedes and replaces PR #158853**
The original branch diverged too far from the main branch, resulting in significant merge conflicts that were difficult to resolve cleanly. To provide a clean and reviewable history, this new PR was created by cherry-picking the necessary commits onto a fresh branch based on the latest `main`.
---
*(Original Description)*
This patch enables the use of AVX/AVX512 subvector extraction intrinsics within `constexpr` functions. This is achieved by implementing the evaluation logic for these intrinsics in `VectorExprEvaluator::VisitCallExpr` and `InterpretBuiltin`.
The original discussion and review comments can be found in the previous pull request for context: #158853
Fixes #157712
The primary purpose of this commit is to enable marking loads to LDS (global.load.lds, buffer.*.load.lds) volatile (using bit 31 of the aux as with normal buffer loads) and to ensure that their !nontemporal annotations translate to appropriate settings of the cache control bits.
However, in the process of implementing this feature, we also fixed:
- Incorrect handling of buffer loads to LDS in GlobalISel
- The handling of volatile on buffers in SIMemoryLegalizer: previously, the mapping of address spaces would cause volatile on buffer loads to be silently dropped on at least gfx10.
---------
Co-authored-by: Matt Arsenault <[email protected]>
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )