-
Notifications
You must be signed in to change notification settings - Fork 14.6k
[clang-tidy] Avoid matching nodes in system headers #151035
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-clang-tools-extra @llvm/pr-subscribers-clang-tidy Author: Carlos Galvez (carlosgalvezp) ChangesThis commit is a re-do of e4a8969, which got reverted, with the same goal: dramatically speed-up clang-tidy by avoiding doing work in system headers (which is wasteful as warnings are later discarded). This proposal was already discussed here with favorable feedback: The novelty of this patch is:
As a benchmark, I ran clang-tidy with all checks activated, on a single .cpp file which #includes all the standard C++ headers, then measure the time as well as found warnings. On trunk:
With this patch:
With the original patch that got reverted:
A lot of warnings remain and the runtime is sligthly higher, but we still got a dramatic reduction with no change in functionality. Further investigation has shown that all of the remaining warnings are due to However this may not be as straightforward or wanted, it may even need to be done on a per-check basis (there's about 10 checks or so that would need to explicitly ignore system headers). I will leave that for another patch, it's low priority and does not improve the runtime much (it just prints better statistics). Fixes #52959 Full diff: https://github.com/llvm/llvm-project/pull/151035.diff 7 Files Affected:
diff --git a/clang-tools-extra/clang-tidy/ClangTidy.cpp b/clang-tools-extra/clang-tidy/ClangTidy.cpp
index e84be0461f280..eb91484363d6f 100644
--- a/clang-tools-extra/clang-tidy/ClangTidy.cpp
+++ b/clang-tools-extra/clang-tidy/ClangTidy.cpp
@@ -425,6 +425,10 @@ ClangTidyASTConsumerFactory::createASTConsumer(
FinderOptions.CheckProfiling.emplace(Profiling->Records);
}
+ // Avoid processing system headers, unless the user explicitly requests it
+ if (!Context.getOptions().SystemHeaders.value_or(false))
+ FinderOptions.IgnoreSystemHeaders = true;
+
std::unique_ptr<ast_matchers::MatchFinder> Finder(
new ast_matchers::MatchFinder(std::move(FinderOptions)));
diff --git a/clang-tools-extra/docs/ReleaseNotes.rst b/clang-tools-extra/docs/ReleaseNotes.rst
index 3ea1c5104316f..f761665677f20 100644
--- a/clang-tools-extra/docs/ReleaseNotes.rst
+++ b/clang-tools-extra/docs/ReleaseNotes.rst
@@ -97,6 +97,10 @@ Improvements to clang-tidy
now run checks in parallel by default using all available hardware threads.
Both scripts display the number of threads being used in their output.
+- :program:`clang-tidy` no longer attemps to match AST nodes from system headers
+ by default, greatly improving performance. This behavior is disabled if the
+ `SystemHeaders` option is enabled.
+
New checks
^^^^^^^^^^
diff --git a/clang-tools-extra/test/clang-tidy/infrastructure/file-filter.cpp b/clang-tools-extra/test/clang-tidy/infrastructure/file-filter.cpp
index 448ef9ddf166c..d9ec1049963b0 100644
--- a/clang-tools-extra/test/clang-tidy/infrastructure/file-filter.cpp
+++ b/clang-tools-extra/test/clang-tidy/infrastructure/file-filter.cpp
@@ -66,19 +66,14 @@ class A { A(int); };
// CHECK4-NOT: warning:
// CHECK4-QUIET-NOT: warning:
-// CHECK: Suppressed 3 warnings (3 in non-user code)
// CHECK: Use -header-filter=.* to display errors from all non-system headers.
// CHECK-QUIET-NOT: Suppressed
-// CHECK2: Suppressed 1 warnings (1 in non-user code)
-// CHECK2: Use -header-filter=.* {{.*}}
// CHECK2-QUIET-NOT: Suppressed
-// CHECK3: Suppressed 2 warnings (2 in non-user code)
// CHECK3: Use -header-filter=.* {{.*}}
// CHECK3-QUIET-NOT: Suppressed
// CHECK4-NOT: Suppressed {{.*}} warnings
// CHECK4-NOT: Use -header-filter=.* {{.*}}
// CHECK4-QUIET-NOT: Suppressed
-// CHECK6: Suppressed 2 warnings (2 in non-user code)
// CHECK6: Use -header-filter=.* {{.*}}
int x = 123;
diff --git a/clang-tools-extra/test/clang-tidy/infrastructure/system-headers.cpp b/clang-tools-extra/test/clang-tidy/infrastructure/system-headers.cpp
index 9fa990b6aac8c..a25480e9aa39c 100644
--- a/clang-tools-extra/test/clang-tidy/infrastructure/system-headers.cpp
+++ b/clang-tools-extra/test/clang-tidy/infrastructure/system-headers.cpp
@@ -11,9 +11,9 @@
// RUN: clang-tidy -help | FileCheck -check-prefix=CHECK-OPT-PRESENT %s
// RUN: clang-tidy -checks='-*,google-explicit-constructor' -header-filter='.*' -system-headers=true %s -- -isystem %S/Inputs/system-headers 2>&1 | FileCheck -check-prefix=CHECK-SYSTEM-HEADERS %s
-// RUN: clang-tidy -checks='-*,google-explicit-constructor' -header-filter='.*' -system-headers=false %s -- -isystem %S/Inputs/system-headers 2>&1 | FileCheck -check-prefix=CHECK-NO-SYSTEM-HEADERS %s
+// RUN: clang-tidy -checks='-*,google-explicit-constructor' -header-filter='.*' -system-headers=false %s -- -isystem %S/Inputs/system-headers 2>&1 | FileCheck -check-prefix=CHECK-NO-SYSTEM-HEADERS --allow-empty %s
// RUN: clang-tidy -checks='-*,google-explicit-constructor' -header-filter='.*' -config='SystemHeaders: true' %s -- -isystem %S/Inputs/system-headers 2>&1 | FileCheck -check-prefix=CHECK-SYSTEM-HEADERS %s
-// RUN: clang-tidy -checks='-*,google-explicit-constructor' -header-filter='.*' -config='SystemHeaders: false' %s -- -isystem %S/Inputs/system-headers 2>&1 | FileCheck -check-prefix=CHECK-NO-SYSTEM-HEADERS %s
+// RUN: clang-tidy -checks='-*,google-explicit-constructor' -header-filter='.*' -config='SystemHeaders: false' %s -- -isystem %S/Inputs/system-headers 2>&1 | FileCheck -check-prefix=CHECK-NO-SYSTEM-HEADERS --allow-empty %s
#include <system_header.h>
// CHECK-SYSTEM-HEADERS: system_header.h:1:13: warning: single-argument constructors must be marked explicit
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ec51ffddce1af..7a0b833dd7bce 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -207,6 +207,9 @@ AST Matchers
- Ensure ``hasBitWidth`` doesn't crash on bit widths that are dependent on template
parameters.
+- Add a boolean member ``IgnoreSystemHeaders`` to ``MatchFinderOptions``. This
+ allows it to ignore nodes in system headers when traversing the AST.
+
clang-format
------------
diff --git a/clang/include/clang/ASTMatchers/ASTMatchFinder.h b/clang/include/clang/ASTMatchers/ASTMatchFinder.h
index 73cbcf1f25025..2d36e8c4fae1c 100644
--- a/clang/include/clang/ASTMatchers/ASTMatchFinder.h
+++ b/clang/include/clang/ASTMatchers/ASTMatchFinder.h
@@ -135,10 +135,15 @@ class MatchFinder {
llvm::StringMap<llvm::TimeRecord> &Records;
};
+ MatchFinderOptions() {}
+
/// Enables per-check timers.
///
/// It prints a report after match.
std::optional<Profiling> CheckProfiling;
+
+ /// Avoids matching declarations in system headers.
+ bool IgnoreSystemHeaders{false};
};
MatchFinder(MatchFinderOptions Options = MatchFinderOptions());
diff --git a/clang/lib/ASTMatchers/ASTMatchFinder.cpp b/clang/lib/ASTMatchers/ASTMatchFinder.cpp
index 6d0ba0b7907a1..15d9183c9fddb 100644
--- a/clang/lib/ASTMatchers/ASTMatchFinder.cpp
+++ b/clang/lib/ASTMatchers/ASTMatchFinder.cpp
@@ -1336,6 +1336,45 @@ class MatchASTVisitor : public RecursiveASTVisitor<MatchASTVisitor>,
return false;
}
+ bool isInSystemHeader(const SourceLocation &Loc) {
+ const SourceManager &SM = getASTContext().getSourceManager();
+ return SM.isInSystemHeader(Loc);
+ }
+
+ template <typename T> SourceLocation getNodeLocation(T const &Node) {
+ return Node.getBeginLoc();
+ }
+
+ SourceLocation getNodeLocation(QualType const &Node) { return {}; }
+
+ SourceLocation getNodeLocation(NestedNameSpecifier const &Node) { return {}; }
+
+ SourceLocation getNodeLocation(CXXCtorInitializer const &Node) {
+ return Node.getSourceLocation();
+ }
+
+ SourceLocation getNodeLocation(TemplateArgumentLoc const &Node) {
+ return Node.getLocation();
+ }
+
+ SourceLocation getNodeLocation(Attr const &Node) {
+ return Node.getLocation();
+ }
+
+ template <typename T>
+ auto shouldSkipNode(T const &Node)
+ -> std::enable_if_t<std::is_pointer_v<T>, bool> {
+ return (Node == nullptr) || shouldSkipNode(*Node);
+ }
+
+ template <typename T>
+ auto shouldSkipNode(T const &Node)
+ -> std::enable_if_t<!std::is_pointer_v<T>, bool> {
+ if (Options.IgnoreSystemHeaders && isInSystemHeader(getNodeLocation(Node)))
+ return true;
+ return false;
+ }
+
/// Bucket to record map.
///
/// Used to get the appropriate bucket for each matcher.
@@ -1465,9 +1504,8 @@ bool MatchASTVisitor::objcClassIsDerivedFrom(
}
bool MatchASTVisitor::TraverseDecl(Decl *DeclNode) {
- if (!DeclNode) {
+ if (shouldSkipNode(DeclNode))
return true;
- }
bool ScopedTraversal =
TraversingASTNodeNotSpelledInSource || DeclNode->isImplicit();
@@ -1495,9 +1533,9 @@ bool MatchASTVisitor::TraverseDecl(Decl *DeclNode) {
}
bool MatchASTVisitor::TraverseStmt(Stmt *StmtNode, DataRecursionQueue *Queue) {
- if (!StmtNode) {
+ if (shouldSkipNode(StmtNode))
return true;
- }
+
bool ScopedTraversal = TraversingASTNodeNotSpelledInSource ||
TraversingASTChildrenNotSpelledInSource;
@@ -1507,11 +1545,17 @@ bool MatchASTVisitor::TraverseStmt(Stmt *StmtNode, DataRecursionQueue *Queue) {
}
bool MatchASTVisitor::TraverseType(QualType TypeNode) {
+ if (shouldSkipNode(TypeNode))
+ return true;
+
match(TypeNode);
return RecursiveASTVisitor<MatchASTVisitor>::TraverseType(TypeNode);
}
bool MatchASTVisitor::TraverseTypeLoc(TypeLoc TypeLocNode) {
+ if (shouldSkipNode(TypeLocNode))
+ return true;
+
// The RecursiveASTVisitor only visits types if they're not within TypeLocs.
// We still want to find those types via matchers, so we match them here. Note
// that the TypeLocs are structurally a shadow-hierarchy to the expressed
@@ -1523,6 +1567,9 @@ bool MatchASTVisitor::TraverseTypeLoc(TypeLoc TypeLocNode) {
}
bool MatchASTVisitor::TraverseNestedNameSpecifier(NestedNameSpecifier *NNS) {
+ if (shouldSkipNode(NNS))
+ return true;
+
match(*NNS);
return RecursiveASTVisitor<MatchASTVisitor>::TraverseNestedNameSpecifier(NNS);
}
@@ -1532,6 +1579,9 @@ bool MatchASTVisitor::TraverseNestedNameSpecifierLoc(
if (!NNS)
return true;
+ if (shouldSkipNode(NNS))
+ return true;
+
match(NNS);
// We only match the nested name specifier here (as opposed to traversing it)
@@ -1544,7 +1594,7 @@ bool MatchASTVisitor::TraverseNestedNameSpecifierLoc(
bool MatchASTVisitor::TraverseConstructorInitializer(
CXXCtorInitializer *CtorInit) {
- if (!CtorInit)
+ if (shouldSkipNode(CtorInit))
return true;
bool ScopedTraversal = TraversingASTNodeNotSpelledInSource ||
@@ -1562,11 +1612,17 @@ bool MatchASTVisitor::TraverseConstructorInitializer(
}
bool MatchASTVisitor::TraverseTemplateArgumentLoc(TemplateArgumentLoc Loc) {
+ if (shouldSkipNode(Loc))
+ return true;
+
match(Loc);
return RecursiveASTVisitor<MatchASTVisitor>::TraverseTemplateArgumentLoc(Loc);
}
bool MatchASTVisitor::TraverseAttr(Attr *AttrNode) {
+ if (shouldSkipNode(AttrNode))
+ return true;
+
match(*AttrNode);
return RecursiveASTVisitor<MatchASTVisitor>::TraverseAttr(AttrNode);
}
|
d52db8c
to
cc9a245
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on this!
Left a couple of NFC comments on style
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome to see this continuing. The changes also look quite clean and extendable to, e.g., modules, like you mentioned. The performance improvement is still incredible to see. Especially when considering library layering in user-code from external or internal libraries, giving this a potentially much stronger impact on analysis time. I'm not sure if we could recover the last ~2.5s compared to the previous change without breaking something.
It's probably an LGTM from me, but I will think about the change for a bit and try it out a bit in a few days (e.g., w.r.t. macros).
@@ -135,10 +135,15 @@ class MatchFinder { | |||
llvm::StringMap<llvm::TimeRecord> &Records; | |||
}; | |||
|
|||
MatchFinderOptions() {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MatchFinderOptions() = default;
(why add it in the first place? Having no ctor should be enough)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know, it bothers me a lot as well. I am not sure if this is a compiler bug or expected, but the moment I add a non-static member initializer (NSDMI) to the struct, I get the following compiler error:
https://godbolt.org/z/Mb3bnxjjM
Using = default
surprisingly does not work :(
The solutions are:
- Move
MatchFinderOptions
outside theMatchFinder
class (like I did on the original patch). I didn't like this because it introduced unrelated noise to the patch, but I'm happy to do it on a separate patch if we feel it's the cleanest. - Not use NSDMI, and instead create a constructor with member initializer list.
- Add a non-default user-defined constructor (this patch).
cc9a245
to
38fb173
Compare
This commit is a re-do of e4a8969, which got reverted, with the same goal: dramatically speed-up clang-tidy by avoiding doing work in system headers (which is wasteful as warnings are later discarded). This proposal was already discussed here with favorable feedback: llvm#132725 The novelty of this patch is: - It's less aggressive: it does not fiddle with AST traversal. This solves the issue with the previous patch, which impacted the ability to inspect parents of a given node. - Instead, what we optimize for is exitting early in each Traverse* function of MatchASTVisitor if the node is in a system header, thus avoiding calling the match() function with its corresponding callback (when there is a match). - It does not cause any failing tests. - It does not move MatchFinderOptions outside - instead we add a user-defined default constructor which solves the same problem. - It introduces a function "shouldSkipNode" which can be extended for adding more conditions. For example there's a PR open about skipping modules in clang-tidy where this could come handy: llvm#145630 As a benchmark, I ran clang-tidy with all checks activated, on a single .cpp file which #includes all the standard C++ headers, then measure the time as well as found warnings. On trunk: Suppressed 213314 warnings (213314 in non-user code). real 0m14.311s user 0m14.126s sys 0m0.185s With this patch: Suppressed 149399 warnings (149399 in non-user code). real 0m3.583s user 0m3.454s sys 0m0.128s With the original patch that got reverted: Suppressed 8050 warnings (8050 in non-user code). Runtime: around 1 second. A lot of warnings remain and the runtime is sligthly higher, but we still got a dramatic reduction with no change in functionality. Further investigation has shown that all of the remaining warnings are due to PPCallbacks - implementing a similar system-header exclusion mechanism there can lead to almost no warnings left in system headers. This does not bring the runtime down as much, though. However this may not be as straightforward or wanted, it may even need to be done on a per-check basis (there's about 10 checks or so that would need to explicitly ignore system headers). I will leave that for another patch, it's low priority and does not improve the runtime much (it just prints better statistics). Fixes llvm#52959
38fb173
to
220d450
Compare
I'm happy to split the little refactor of |
I think this patch is good and small enough as a whole, so not need to worry about further reduction. |
@5chmidti Here goes 2 more seconds :) With both patches, the result is very close to the reverted patch, just 0.5 seconds diff :) Maybe the PPCallbacks aren't worth touching after all.
|
This commit is a re-do of e4a8969, which got reverted, with the same goal: dramatically speed-up clang-tidy by avoiding doing work in system headers (which is wasteful as warnings are later discarded). This proposal was already discussed here with favorable feedback: #132725
The novelty of this patch is:
It's less aggressive: it does not fiddle with AST traversal. This solves the issue with the previous patch, which impacted the ability to inspect parents of a given node.
Instead, what we optimize for is exitting early in each
Traverse*
function ofMatchASTVisitor
if the node is in a system header, thus avoiding calling thematch()
function with its corresponding callback (when there is a match).It does not cause any failing tests.
It does not move
MatchFinderOptions
- instead we add a user-defined default constructor which solves the same problem.It introduces a function
shouldSkipNode
which can be extended for adding more conditions. For example there's a PR open about skipping modules in clang-tidy where this could come handy: [clang-tidy] [Modules] Skip checking decls in clang-tidy #145630As a benchmark, I ran clang-tidy with all checks activated, on a single .cpp file which #includes all the standard C++ headers, then measure the time as well as found warnings.
On trunk:
With this patch:
With the original patch that got reverted:
A lot of warnings remain and the runtime is sligthly higher, but we still got a dramatic reduction with no change in functionality.
Further investigation has shown that all of the remaining warnings are due to
PPCallbacks
- implementing a similar system-header exclusion mechanism there can lead to almost no warnings left in system headers. This does not bring the runtime down as much, though.However this may not be as straightforward or wanted, it may even need to be done on a per-check basis (there's about 10 checks or so that would need to explicitly ignore system headers). I will leave that for another patch, it's low priority and does not improve the runtime much (it just prints better statistics).
Fixes #52959