-
Notifications
You must be signed in to change notification settings - Fork 14.6k
[LLDB][NativePDB] Allow type lookup in namespaces #149876
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-lldb Author: nerix (Nerixyz) ChangesPreviously, This PR changes the lookup to go through all types and check their base name. As that could be a bit expensive, the names are first cached (similar to the function lookup in the DIA PDB plugin). Potential types are checked with To be able to handle anonymous namespaces, I changed This enables Full diff: https://github.com/llvm/llvm-project/pull/149876.diff 5 Files Affected:
diff --git a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
index 20d8c1acf9c42..5141632649dd5 100644
--- a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
+++ b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
@@ -1630,6 +1630,53 @@ size_t SymbolFileNativePDB::ParseSymbolArrayInScope(
return count;
}
+void SymbolFileNativePDB::CacheTypeNames() {
+ if (!m_type_base_names.IsEmpty())
+ return;
+
+ LazyRandomTypeCollection &types = m_index->tpi().typeCollection();
+ for (auto ti = types.getFirst(); ti; ti = types.getNext(*ti)) {
+ CVType cvt = types.getType(*ti);
+ llvm::StringRef name;
+ // We are only interested in records, unions, and enums.
+ // We aren't interested in forward references as we'll visit the actual
+ // type later anyway.
+ switch (cvt.kind()) {
+ case LF_STRUCTURE:
+ case LF_CLASS: {
+ ClassRecord cr;
+ llvm::cantFail(TypeDeserializer::deserializeAs<ClassRecord>(cvt, cr));
+ if (cr.isForwardRef())
+ continue;
+ name = cr.Name;
+ } break;
+ case LF_UNION: {
+ UnionRecord ur;
+ llvm::cantFail(TypeDeserializer::deserializeAs<UnionRecord>(cvt, ur));
+ if (ur.isForwardRef())
+ continue;
+ name = ur.Name;
+ } break;
+ case LF_ENUM: {
+ EnumRecord er;
+ llvm::cantFail(TypeDeserializer::deserializeAs<EnumRecord>(cvt, er));
+ if (er.isForwardRef())
+ continue;
+ name = er.Name;
+ } break;
+ default:
+ continue;
+ }
+ if (name.empty())
+ continue;
+
+ auto base_name = MSVCUndecoratedNameParser::DropScope(name);
+ m_type_base_names.Append(ConstString(base_name), ti->getIndex());
+ }
+
+ m_type_base_names.Sort();
+}
+
void SymbolFileNativePDB::DumpClangAST(Stream &s, llvm::StringRef filter) {
auto ts_or_err = GetTypeSystemForLanguage(eLanguageTypeC_plus_plus);
if (!ts_or_err)
@@ -1720,11 +1767,14 @@ void SymbolFileNativePDB::FindTypes(const lldb_private::TypeQuery &query,
std::lock_guard<std::recursive_mutex> guard(GetModuleMutex());
- std::vector<TypeIndex> matches =
- m_index->tpi().findRecordsByName(query.GetTypeBasename().GetStringRef());
+ // We can't query for the basename or full name because the type might reside
+ // in an anonymous namespace. Cache the basenames first.
+ CacheTypeNames();
+ std::vector<uint32_t> matches;
+ m_type_base_names.GetValues(query.GetTypeBasename(), matches);
- for (TypeIndex type_idx : matches) {
- TypeSP type_sp = GetOrCreateType(type_idx);
+ for (uint32_t match_idx : matches) {
+ TypeSP type_sp = GetOrCreateType(TypeIndex(match_idx));
if (!type_sp)
continue;
diff --git a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h
index 9891313f11d0b..457b301c4a486 100644
--- a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h
+++ b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h
@@ -258,6 +258,8 @@ class SymbolFileNativePDB : public SymbolFileCommon {
void ParseInlineSite(PdbCompilandSymId inline_site_id, Address func_addr);
+ void CacheTypeNames();
+
llvm::BumpPtrAllocator m_allocator;
lldb::addr_t m_obj_load_address = 0;
@@ -278,6 +280,8 @@ class SymbolFileNativePDB : public SymbolFileCommon {
llvm::DenseMap<lldb::user_id_t, std::shared_ptr<InlineSite>> m_inline_sites;
llvm::DenseMap<llvm::codeview::TypeIndex, llvm::codeview::TypeIndex>
m_parent_types;
+
+ lldb_private::UniqueCStringMap<uint32_t> m_type_base_names;
};
} // namespace npdb
diff --git a/lldb/source/Symbol/Type.cpp b/lldb/source/Symbol/Type.cpp
index 0a886e56100a1..ddb22d611140b 100644
--- a/lldb/source/Symbol/Type.cpp
+++ b/lldb/source/Symbol/Type.cpp
@@ -134,7 +134,9 @@ bool TypeQuery::ContextMatches(
if (ctx == ctx_end)
return false; // Pattern too long.
- if (ctx->kind == CompilerContextKind::Namespace && ctx->name.IsEmpty()) {
+ if ((ctx->kind & CompilerContextKind::Namespace) ==
+ CompilerContextKind::Namespace &&
+ ctx->name.IsEmpty()) {
// We're matching an anonymous namespace. These are optional, so we check
// if the pattern expects an anonymous namespace.
if (pat->name.IsEmpty() && (pat->kind & CompilerContextKind::Namespace) ==
@@ -164,7 +166,9 @@ bool TypeQuery::ContextMatches(
auto should_skip = [this](const CompilerContext &ctx) {
if (ctx.kind == CompilerContextKind::Module)
return GetIgnoreModules();
- if (ctx.kind == CompilerContextKind::Namespace && ctx.name.IsEmpty())
+ if ((ctx.kind & CompilerContextKind::Namespace) ==
+ CompilerContextKind::Namespace &&
+ ctx.name.IsEmpty())
return !GetStrictNamespaces();
return false;
};
diff --git a/lldb/test/Shell/SymbolFile/NativePDB/Inputs/namespace-access.lldbinit b/lldb/test/Shell/SymbolFile/NativePDB/Inputs/namespace-access.lldbinit
new file mode 100644
index 0000000000000..e61ed2e2f453e
--- /dev/null
+++ b/lldb/test/Shell/SymbolFile/NativePDB/Inputs/namespace-access.lldbinit
@@ -0,0 +1,18 @@
+b main
+r
+
+type lookup S
+type lookup ::S
+type lookup Outer::S
+type lookup Outer::Inner1::S
+type lookup Inner1::S
+type lookup Outer::Inner1::Inner2::S
+type lookup Inner2::S
+type lookup Outer::Inner2::S
+type lookup Outer::A
+type lookup A
+type lookup ::A
+expr sizeof(S)
+expr sizeof(A)
+
+quit
diff --git a/lldb/test/Shell/SymbolFile/NativePDB/namespace-access.cpp b/lldb/test/Shell/SymbolFile/NativePDB/namespace-access.cpp
new file mode 100644
index 0000000000000..8dbe062d8240f
--- /dev/null
+++ b/lldb/test/Shell/SymbolFile/NativePDB/namespace-access.cpp
@@ -0,0 +1,115 @@
+// clang-format off
+// REQUIRES: lld, x86
+
+// Test namespace lookup.
+// RUN: %clang_cl --target=x86_64-windows-msvc -Od -Z7 -GS- -c /Fo%t.obj -- %s
+// RUN: lld-link -debug:full -nodefaultlib -entry:main %t.obj -out:%t.exe -pdb:%t.pdb
+// RUN: %lldb -f %t.exe -s \
+// RUN: %p/Inputs/namespace-access.lldbinit 2>&1 | FileCheck %s
+
+struct S {
+ char a[1];
+};
+
+namespace Outer {
+
+ struct S {
+ char a[2];
+ };
+
+ namespace Inner1 {
+ struct S {
+ char a[3];
+ };
+
+ namespace Inner2 {
+ struct S {
+ char a[4];
+ };
+ } // namespace Inner2
+ } // namespace Inner1
+
+ namespace Inner2 {
+ struct S {
+ char a[5];
+ };
+ } // namespace Inner2
+
+ namespace {
+ struct A {
+ char a[6];
+ };
+ } // namespace
+
+} // namespace Outer
+
+namespace {
+ struct A {
+ char a[7];
+ };
+} // namespace
+
+int main(int argc, char **argv) {
+ S s;
+ Outer::S os;
+ Outer::Inner1::S oi1s;
+ Outer::Inner1::Inner2::S oi1i2s;
+ Outer::Inner2::S oi2s;
+ A a1;
+ Outer::A a2;
+ return sizeof(s) + sizeof(os) + sizeof(oi1s) + sizeof(oi1i2s) + sizeof(oi2s) + sizeof(a1) + sizeof(a2);
+}
+
+
+
+// CHECK: (lldb) type lookup S
+// CHECK: struct S {
+// CHECK: struct S {
+// CHECK: struct S {
+// CHECK: struct S {
+// CHECK: struct S {
+// CHECK: }
+// CHECK-NEXT: (lldb) type lookup ::S
+// CHECK-NEXT: struct S {
+// CHECK-NEXT: char a[1];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup Outer::S
+// CHECK-NEXT: struct S {
+// CHECK-NEXT: char a[2];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup Outer::Inner1::S
+// CHECK-NEXT: struct S {
+// CHECK-NEXT: char a[3];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup Inner1::S
+// CHECK-NEXT: struct S {
+// CHECK-NEXT: char a[3];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup Outer::Inner1::Inner2::S
+// CHECK-NEXT: struct S {
+// CHECK-NEXT: char a[4];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup Inner2::S
+// CHECK-NEXT: struct S {
+// CHECK: struct S {
+// CHECK: }
+// CHECK-NEXT: (lldb) type lookup Outer::Inner2::S
+// CHECK-NEXT: struct S {
+// CHECK-NEXT: char a[5];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup Outer::A
+// CHECK-NEXT: struct A {
+// CHECK-NEXT: char a[6];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup A
+// CHECK-NEXT: struct A {
+// CHECK: struct A {
+// CHECK: }
+// CHECK-NEXT: (lldb) type lookup ::A
+// CHECK-NEXT: struct A {
+// CHECK-NEXT: char a[7];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) expr sizeof(S)
+// CHECK-NEXT: (__size_t) $0 = 1
+// CHECK-NEXT: (lldb) expr sizeof(A)
+// CHECK-NEXT: (__size_t) $1 = 7
|
95e8fab
to
fa3c96b
Compare
@@ -1630,6 +1630,53 @@ size_t SymbolFileNativePDB::ParseSymbolArrayInScope( | |||
return count; | |||
} | |||
|
|||
void SymbolFileNativePDB::CacheTypeNames() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, SymbolFileNativePDB::BuildParentMap()
already does the tpi stream iteration and it's called at NativePDB plugin initial setup. We could just cache the those base names there instead of iterating it the second time.
expr sizeof(S) | ||
expr sizeof(A) | ||
|
||
quit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets put these commands into the test file itself. You can use split-file
for this. For example:
# RUN: split-file %s %t |
@@ -2291,6 +2298,8 @@ void SymbolFileNativePDB::BuildParentMap() { | |||
TypeIndex fwd = full_to_forward[full]; | |||
m_parent_types[fwd] = m_parent_types[full]; | |||
} | |||
|
|||
m_type_base_names.Sort(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this sort?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m_type_base_names
is a UniqueCStringMap
which holds a vector<(CString, T)>
. To act as a map (i.e. to be able to use binary search), it needs to be sorted. Append(element)
only translates to push_back
. Once we inserted all elements, the map is unsorted, so we have to sort it to be able to look up elements.
@@ -134,7 +134,9 @@ bool TypeQuery::ContextMatches( | |||
if (ctx == ctx_end) | |||
return false; // Pattern too long. | |||
|
|||
if (ctx->kind == CompilerContextKind::Namespace && ctx->name.IsEmpty()) { | |||
if ((ctx->kind & CompilerContextKind::Namespace) == |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why can't we do that? Is there some PDB limitation?
We could do that. It would probably replicate the TypeQuery
constructor, though (which arguably isn't that large).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, so the current assumption is that when matching types, the pattern can contain wildcards (you can say "i'm looking for a type or a namespace at a given position"), but the thing you're matching is always precise (it will say "i'm a type" or "I'm a namespace", but it can't say "I don't know"). For DWARF this isn't an issue because we always have that information. I don't know if that's the case for PDB, or if it's just a side-effect of how it performs the query (*). If it's not possible to do it the other way, then we may have to change the way we do matching, but I think it'd be better to not do that.
(*) Currently, the PDB code kind of cheats by taking the type name and parsing it as a type query (SymbolFileNativePDB.cpp:1736 in this patch), but then using the result of that parsing as the thing to match. This is bad for two reasons:
- if you're parsing from a name, it's clear that you won't be able to see whether enclosing scopes are a type or a namespace. The information just isn't there --
Foo::Bar
looks the same regardless of whetherFoo
is a namespace or not. - It's going to be slow because this approach requires you to fully materialize the type (and everything it depends on) before knowing if this is even the type you're looking for. This may not be as big of an issue for PDB as it is for DWARF because (so I hear) in doesn't have template information (which is the big source of fanout in this process) and it may have less of these searches internally (because it's already linked, in DWARF we need to do this just to search for a definition DIE), but it still means that a search for
foo::iterator
will materialize every iterator out there.
For this reason it would be better type context was created differently, without relying on a Type
object. For DWARF this happens in DWARFDIE::GetTypeLookupContext
, but I don't know if something like that works in PDB. The patch description seems to indicate that this information does not exist, but I don't understand how anything could work in that case. E.g. I don't know what kind of Clang AST would we created for these types. Are they properly nested in their expected namespace? If so how do you figure out that namespace?
Previously,
type lookup
for types in namespaces didn't work with the native PDB plugin, becauseFindTypes
would only look for types whose base name was equal to their full name. PDB/CodeView does not store the base names in the TPI stream, but the types have their full name (e.g.std::thread
instead ofthread
). SofindRecordsByName
would only return types in the top level namespace.This PR changes the lookup to go through all types and check their base name. As that could be a bit expensive, the names are first cached (similar to the function lookup in the DIA PDB plugin). Potential types are checked with
TypeQuery::ContextMatches
.To be able to handle anonymous namespaces, I changed
TypeQuery::ContextMatches
. TheTypeQuery
constructor inserts all name components asCompilerContextKind::AnyDeclContext
. To skip over anonymous namespaces,ContextMatches
checked if a component was empty and exactly of kindNamespace
. For our query, the last check was always false, so we never skipped anonymous namespaces. DWARF doesn't have this problem, as it constructs the context outside and has proper information about namespaces. I'm not fully sure if my change is correct and that it doesn't break other users ofTypeQuery
.This enables
type lookup <type>
to work on types in namespaces. However, expressions don't work with this yet, becauseFindNamespace
is unimplemented for native PDB.