Skip to content

[LLDB][NativePDB] Allow type lookup in namespaces #149876

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Nerixyz
Copy link
Contributor

@Nerixyz Nerixyz commented Jul 21, 2025

Previously, type lookup for types in namespaces didn't work with the native PDB plugin, because FindTypes would only look for types whose base name was equal to their full name. PDB/CodeView does not store the base names in the TPI stream, but the types have their full name (e.g. std::thread instead of thread). So findRecordsByName would only return types in the top level namespace.

This PR changes the lookup to go through all types and check their base name. As that could be a bit expensive, the names are first cached (similar to the function lookup in the DIA PDB plugin). Potential types are checked with TypeQuery::ContextMatches.

To be able to handle anonymous namespaces, I changed TypeQuery::ContextMatches. The TypeQuery constructor inserts all name components as CompilerContextKind::AnyDeclContext. To skip over anonymous namespaces, ContextMatches checked if a component was empty and exactly of kind Namespace. For our query, the last check was always false, so we never skipped anonymous namespaces. DWARF doesn't have this problem, as it constructs the context outside and has proper information about namespaces. I'm not fully sure if my change is correct and that it doesn't break other users of TypeQuery.

This enables type lookup <type> to work on types in namespaces. However, expressions don't work with this yet, because FindNamespace is unimplemented for native PDB.

@Nerixyz Nerixyz requested a review from JDevlieghere as a code owner July 21, 2025 19:13
@llvmbot llvmbot added the lldb label Jul 21, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 21, 2025

@llvm/pr-subscribers-lldb

Author: nerix (Nerixyz)

Changes

Previously, type lookup for types in namespaces didn't work with the native PDB plugin, because FindTypes would only look for types whose base name was equal to their full name. PDB/CodeView does not store the base names in the TPI stream, but the types have their full name (e.g. std::thread instead of thread). So findRecordsByName would only return types in the top level namespace.

This PR changes the lookup to go through all types and check their base name. As that could be a bit expensive, the names are first cached (similar to the function lookup in the DIA PDB plugin). Potential types are checked with TypeQuery::ContextMatches.

To be able to handle anonymous namespaces, I changed TypeQuery::ContextMatches. The TypeQuery constructor inserts all name components as CompilerContextKind::AnyDeclContext. To skip over anonymous namespaces, ContextMatches checked if a component was empty and exactly of kind Namespace. For our query, the last check was always false, so we never skipped anonymous namespaces. DWARF doesn't have this problem, as it constructs the context outside and has proper information about namespaces. I'm not fully sure if my change is correct and that it doesn't break other users of TypeQuery.

This enables type lookup &lt;type&gt; to work on types in namespaces. However, expressions don't work with this yet, because FindNamespace is unimplemented for native PDB.


Full diff: https://github.com/llvm/llvm-project/pull/149876.diff

5 Files Affected:

  • (modified) lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp (+54-4)
  • (modified) lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h (+4)
  • (modified) lldb/source/Symbol/Type.cpp (+6-2)
  • (added) lldb/test/Shell/SymbolFile/NativePDB/Inputs/namespace-access.lldbinit (+18)
  • (added) lldb/test/Shell/SymbolFile/NativePDB/namespace-access.cpp (+115)
diff --git a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
index 20d8c1acf9c42..5141632649dd5 100644
--- a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
+++ b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
@@ -1630,6 +1630,53 @@ size_t SymbolFileNativePDB::ParseSymbolArrayInScope(
   return count;
 }
 
+void SymbolFileNativePDB::CacheTypeNames() {
+  if (!m_type_base_names.IsEmpty())
+    return;
+
+  LazyRandomTypeCollection &types = m_index->tpi().typeCollection();
+  for (auto ti = types.getFirst(); ti; ti = types.getNext(*ti)) {
+    CVType cvt = types.getType(*ti);
+    llvm::StringRef name;
+    // We are only interested in records, unions, and enums.
+    // We aren't interested in forward references as we'll visit the actual
+    // type later anyway.
+    switch (cvt.kind()) {
+    case LF_STRUCTURE:
+    case LF_CLASS: {
+      ClassRecord cr;
+      llvm::cantFail(TypeDeserializer::deserializeAs<ClassRecord>(cvt, cr));
+      if (cr.isForwardRef())
+        continue;
+      name = cr.Name;
+    } break;
+    case LF_UNION: {
+      UnionRecord ur;
+      llvm::cantFail(TypeDeserializer::deserializeAs<UnionRecord>(cvt, ur));
+      if (ur.isForwardRef())
+        continue;
+      name = ur.Name;
+    } break;
+    case LF_ENUM: {
+      EnumRecord er;
+      llvm::cantFail(TypeDeserializer::deserializeAs<EnumRecord>(cvt, er));
+      if (er.isForwardRef())
+        continue;
+      name = er.Name;
+    } break;
+    default:
+      continue;
+    }
+    if (name.empty())
+      continue;
+
+    auto base_name = MSVCUndecoratedNameParser::DropScope(name);
+    m_type_base_names.Append(ConstString(base_name), ti->getIndex());
+  }
+
+  m_type_base_names.Sort();
+}
+
 void SymbolFileNativePDB::DumpClangAST(Stream &s, llvm::StringRef filter) {
   auto ts_or_err = GetTypeSystemForLanguage(eLanguageTypeC_plus_plus);
   if (!ts_or_err)
@@ -1720,11 +1767,14 @@ void SymbolFileNativePDB::FindTypes(const lldb_private::TypeQuery &query,
 
   std::lock_guard<std::recursive_mutex> guard(GetModuleMutex());
 
-  std::vector<TypeIndex> matches =
-      m_index->tpi().findRecordsByName(query.GetTypeBasename().GetStringRef());
+  // We can't query for the basename or full name because the type might reside
+  // in an anonymous namespace. Cache the basenames first.
+  CacheTypeNames();
+  std::vector<uint32_t> matches;
+  m_type_base_names.GetValues(query.GetTypeBasename(), matches);
 
-  for (TypeIndex type_idx : matches) {
-    TypeSP type_sp = GetOrCreateType(type_idx);
+  for (uint32_t match_idx : matches) {
+    TypeSP type_sp = GetOrCreateType(TypeIndex(match_idx));
     if (!type_sp)
       continue;
 
diff --git a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h
index 9891313f11d0b..457b301c4a486 100644
--- a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h
+++ b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h
@@ -258,6 +258,8 @@ class SymbolFileNativePDB : public SymbolFileCommon {
 
   void ParseInlineSite(PdbCompilandSymId inline_site_id, Address func_addr);
 
+  void CacheTypeNames();
+
   llvm::BumpPtrAllocator m_allocator;
 
   lldb::addr_t m_obj_load_address = 0;
@@ -278,6 +280,8 @@ class SymbolFileNativePDB : public SymbolFileCommon {
   llvm::DenseMap<lldb::user_id_t, std::shared_ptr<InlineSite>> m_inline_sites;
   llvm::DenseMap<llvm::codeview::TypeIndex, llvm::codeview::TypeIndex>
       m_parent_types;
+
+  lldb_private::UniqueCStringMap<uint32_t> m_type_base_names;
 };
 
 } // namespace npdb
diff --git a/lldb/source/Symbol/Type.cpp b/lldb/source/Symbol/Type.cpp
index 0a886e56100a1..ddb22d611140b 100644
--- a/lldb/source/Symbol/Type.cpp
+++ b/lldb/source/Symbol/Type.cpp
@@ -134,7 +134,9 @@ bool TypeQuery::ContextMatches(
     if (ctx == ctx_end)
       return false; // Pattern too long.
 
-    if (ctx->kind == CompilerContextKind::Namespace && ctx->name.IsEmpty()) {
+    if ((ctx->kind & CompilerContextKind::Namespace) ==
+            CompilerContextKind::Namespace &&
+        ctx->name.IsEmpty()) {
       // We're matching an anonymous namespace. These are optional, so we check
       // if the pattern expects an anonymous namespace.
       if (pat->name.IsEmpty() && (pat->kind & CompilerContextKind::Namespace) ==
@@ -164,7 +166,9 @@ bool TypeQuery::ContextMatches(
   auto should_skip = [this](const CompilerContext &ctx) {
     if (ctx.kind == CompilerContextKind::Module)
       return GetIgnoreModules();
-    if (ctx.kind == CompilerContextKind::Namespace && ctx.name.IsEmpty())
+    if ((ctx.kind & CompilerContextKind::Namespace) ==
+            CompilerContextKind::Namespace &&
+        ctx.name.IsEmpty())
       return !GetStrictNamespaces();
     return false;
   };
diff --git a/lldb/test/Shell/SymbolFile/NativePDB/Inputs/namespace-access.lldbinit b/lldb/test/Shell/SymbolFile/NativePDB/Inputs/namespace-access.lldbinit
new file mode 100644
index 0000000000000..e61ed2e2f453e
--- /dev/null
+++ b/lldb/test/Shell/SymbolFile/NativePDB/Inputs/namespace-access.lldbinit
@@ -0,0 +1,18 @@
+b main
+r
+
+type lookup S
+type lookup ::S
+type lookup Outer::S
+type lookup Outer::Inner1::S
+type lookup Inner1::S
+type lookup Outer::Inner1::Inner2::S
+type lookup Inner2::S
+type lookup Outer::Inner2::S
+type lookup Outer::A
+type lookup A
+type lookup ::A
+expr sizeof(S)
+expr sizeof(A)
+
+quit
diff --git a/lldb/test/Shell/SymbolFile/NativePDB/namespace-access.cpp b/lldb/test/Shell/SymbolFile/NativePDB/namespace-access.cpp
new file mode 100644
index 0000000000000..8dbe062d8240f
--- /dev/null
+++ b/lldb/test/Shell/SymbolFile/NativePDB/namespace-access.cpp
@@ -0,0 +1,115 @@
+// clang-format off
+// REQUIRES: lld, x86
+
+// Test namespace lookup.
+// RUN: %clang_cl --target=x86_64-windows-msvc -Od -Z7 -GS- -c /Fo%t.obj -- %s
+// RUN: lld-link -debug:full -nodefaultlib -entry:main %t.obj -out:%t.exe -pdb:%t.pdb
+// RUN: %lldb -f %t.exe -s \
+// RUN:     %p/Inputs/namespace-access.lldbinit 2>&1 | FileCheck %s
+
+struct S {
+  char a[1];
+};
+
+namespace Outer {
+
+  struct S {
+    char a[2];
+  };
+
+  namespace Inner1 {
+    struct S {
+      char a[3];
+    };
+
+    namespace Inner2 {
+      struct S {
+        char a[4];
+      };
+    } // namespace Inner2
+  } // namespace Inner1
+
+  namespace Inner2 {
+    struct S {
+      char a[5];
+    };
+  } // namespace Inner2
+
+  namespace {
+    struct A {
+      char a[6];
+    };
+  } // namespace
+
+} // namespace Outer
+
+namespace {
+  struct A {
+    char a[7];
+  };
+} // namespace
+
+int main(int argc, char **argv) {
+  S s;
+  Outer::S os;
+  Outer::Inner1::S oi1s;
+  Outer::Inner1::Inner2::S oi1i2s;
+  Outer::Inner2::S oi2s;
+  A a1;
+  Outer::A a2;
+  return sizeof(s) + sizeof(os) + sizeof(oi1s) + sizeof(oi1i2s) + sizeof(oi2s) + sizeof(a1) + sizeof(a2);
+}
+
+
+
+// CHECK:      (lldb) type lookup S  
+// CHECK:      struct S {
+// CHECK:      struct S {
+// CHECK:      struct S {
+// CHECK:      struct S {
+// CHECK:      struct S {
+// CHECK:      }
+// CHECK-NEXT: (lldb) type lookup ::S
+// CHECK-NEXT: struct S {
+// CHECK-NEXT:     char a[1];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup Outer::S
+// CHECK-NEXT: struct S {
+// CHECK-NEXT:     char a[2];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup Outer::Inner1::S
+// CHECK-NEXT: struct S {
+// CHECK-NEXT:     char a[3];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup Inner1::S
+// CHECK-NEXT: struct S {
+// CHECK-NEXT:     char a[3];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup Outer::Inner1::Inner2::S
+// CHECK-NEXT: struct S {
+// CHECK-NEXT:     char a[4];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup Inner2::S         
+// CHECK-NEXT: struct S {
+// CHECK:      struct S {
+// CHECK:      }
+// CHECK-NEXT: (lldb) type lookup Outer::Inner2::S        
+// CHECK-NEXT: struct S {
+// CHECK-NEXT:     char a[5];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup Outer::A        
+// CHECK-NEXT: struct A {
+// CHECK-NEXT:     char a[6];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup A       
+// CHECK-NEXT: struct A {
+// CHECK:      struct A {
+// CHECK:      }
+// CHECK-NEXT: (lldb) type lookup ::A
+// CHECK-NEXT: struct A {
+// CHECK-NEXT:     char a[7];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) expr sizeof(S) 
+// CHECK-NEXT: (__size_t) $0 = 1
+// CHECK-NEXT: (lldb) expr sizeof(A)
+// CHECK-NEXT: (__size_t) $1 = 7

@Nerixyz Nerixyz force-pushed the fix/lldb-npdb-type-anon-ns branch from 95e8fab to fa3c96b Compare July 21, 2025 19:28
@@ -1630,6 +1630,53 @@ size_t SymbolFileNativePDB::ParseSymbolArrayInScope(
return count;
}

void SymbolFileNativePDB::CacheTypeNames() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, SymbolFileNativePDB::BuildParentMap() already does the tpi stream iteration and it's called at NativePDB plugin initial setup. We could just cache the those base names there instead of iterating it the second time.

expr sizeof(S)
expr sizeof(A)

quit
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets put these commands into the test file itself. You can use split-file for this. For example:

@@ -2291,6 +2298,8 @@ void SymbolFileNativePDB::BuildParentMap() {
TypeIndex fwd = full_to_forward[full];
m_parent_types[fwd] = m_parent_types[full];
}

m_type_base_names.Sort();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this sort?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m_type_base_names is a UniqueCStringMap which holds a vector<(CString, T)>. To act as a map (i.e. to be able to use binary search), it needs to be sorted. Append(element) only translates to push_back. Once we inserted all elements, the map is unsorted, so we have to sort it to be able to look up elements.

@@ -134,7 +134,9 @@ bool TypeQuery::ContextMatches(
if (ctx == ctx_end)
return false; // Pattern too long.

if (ctx->kind == CompilerContextKind::Namespace && ctx->name.IsEmpty()) {
if ((ctx->kind & CompilerContextKind::Namespace) ==
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@labath since you looked at this some time ago, do you see a concern with this? I guess the alternative is to make sure we don't construct the TypeQuery with a AnyDeclContext in PDB-land. @Nerixyz why can't we do that? Is there some PDB limitation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why can't we do that? Is there some PDB limitation?

We could do that. It would probably replicate the TypeQuery constructor, though (which arguably isn't that large).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, so the current assumption is that when matching types, the pattern can contain wildcards (you can say "i'm looking for a type or a namespace at a given position"), but the thing you're matching is always precise (it will say "i'm a type" or "I'm a namespace", but it can't say "I don't know"). For DWARF this isn't an issue because we always have that information. I don't know if that's the case for PDB, or if it's just a side-effect of how it performs the query (*). If it's not possible to do it the other way, then we may have to change the way we do matching, but I think it'd be better to not do that.

(*) Currently, the PDB code kind of cheats by taking the type name and parsing it as a type query (SymbolFileNativePDB.cpp:1736 in this patch), but then using the result of that parsing as the thing to match. This is bad for two reasons:

  • if you're parsing from a name, it's clear that you won't be able to see whether enclosing scopes are a type or a namespace. The information just isn't there --Foo::Bar looks the same regardless of whether Foo is a namespace or not.
  • It's going to be slow because this approach requires you to fully materialize the type (and everything it depends on) before knowing if this is even the type you're looking for. This may not be as big of an issue for PDB as it is for DWARF because (so I hear) in doesn't have template information (which is the big source of fanout in this process) and it may have less of these searches internally (because it's already linked, in DWARF we need to do this just to search for a definition DIE), but it still means that a search for foo::iterator will materialize every iterator out there.

For this reason it would be better type context was created differently, without relying on a Type object. For DWARF this happens in DWARFDIE::GetTypeLookupContext, but I don't know if something like that works in PDB. The patch description seems to indicate that this information does not exist, but I don't understand how anything could work in that case. E.g. I don't know what kind of Clang AST would we created for these types. Are they properly nested in their expected namespace? If so how do you figure out that namespace?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants